| java.lang.Object com.ibm.icu.dev.demo.rbnf.RbnfSampleRuleSets
RbnfSampleRuleSets | public class RbnfSampleRuleSets (Code) | | A collection of example rule sets for use with RuleBasedNumberFormat.
These examples are intended to serve both as demonstrations of what can
be done with this framework, and as starting points for designing new
rule sets.
For those that claim to represent number-spellout rules for languages
other than U.S. English, we make no claims of either accuracy or
completeness. In fact, we know them to be incomplete, and suspect
most have mistakes in them. If you see something that you know is wrong,
please tell us!
author: Richard Gillam |
Field Summary | |
final public static String | abbEnglish | This example shows large numbers the way they often appear in newspapers: 1,200,000 is formatted as "1.2 million".
final public static String | arabicNumerals | Arabic digits.
final public static String | chinesePlaceValue | This example formats numbers using Chinese characters in the Arabic place-value method.
final public static String | closestFraction | Number with closest fraction.
final public static String | decimalAsFraction | This rule set shows the fractional part of the number as a fraction with a power of 10 as the denominator.
final public static String | dollarsAndCents | This example formats a number in one of the two styles often used on checks.
final public static String | dozens | This example formats a number in dozens and gross.
final public static String | durationInHours | This example formats a number of hours in sexagesimal notation (i.e., hours, minutes, and seconds).
final public static String | durationInSeconds | This example formats a number of seconds in sexagesimal notation (i.e., hours, minutes, and seconds).
final public static String | dutch | Spellout rules for Dutch.
final public static String | french | Spellout rules for French.
final public static String | german | Spellout rules for German.
final public static String | greek | Spellout rules for Greek.
final public static String | greekAlphabetic | Greek alphabetic numerals.
final public static String | hebrew | Spellout rules for Hebrew.
final public static String | hebrewAlphabetic | Hebrew alphabetic numerals.
final public static String | italian | Spellout rules for Italian.
final public static String | japanese | Spellout rules for Japanese.
final public static String | message1 | This is a simple message-formatting example.
final public static String | message2 | A more complicated message-formatting example.
final public static String | ordinal | This rule set adds an English ordinal abbreviation to the end of a number.
final public static String | poundsShillingsAndPence | This rule set formats a number of pounds as pounds, shillings, and pence in the old English system of currency.
final public static String | romanNumerals | Roman numerals.
final public static String | russian | Spellout rules for Russian.
final public static String[] | sampleRuleSetCommentary |
final public static Locale[] | sampleRuleSetLocales | The base locale for each of the sample rule sets.
final public static String[] | sampleRuleSetNames | The displayable names for all the sample rule sets, in the same order as the preceding array.
final public static String[] | sampleRuleSets | A list of all the sample rule sets, used by the demo program.
final public static String | spanish | Spellout rules for Spanish.
final public static String | stock | American stock-price formatting.
final public static String | swedish | Spellout rules for Swedish.
final public static String | swissFrench | Spellout rules for Swiss French.
final public static String | ukEnglish | Spellout rules for U.K. English.
final public static String | units | This example takes a number of meters and formats it in whatever unit will produce a number with from one to three digits before the decimal point.
final public static String | usEnglish | Spellout rules for U.S. English.
final public static String | wordsForDigits | Words for digits. |
abbEnglish | final public static String abbEnglish(Code) | | This example shows large numbers the way they often appear in newspapers:
1,200,000 is formatted as "1.2 million".
|
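The abbEnglish behavior can be approximated in plain Java without RuleBasedNumberFormat. This is only a sketch: the class and method names are hypothetical, the list of scale words and the choice to truncate to one decimal place are assumptions, and it is not the rule set itself.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class AbbreviatedNumberSketch {
    // Scale words are an assumption of this sketch, not taken from the rule set.
    private static final String[] NAMES = {"thousand", "million", "billion", "trillion"};

    /** Abbreviates a non-negative value to at most one decimal place plus a scale word. */
    static String abbreviate(long value) {
        if (value < 1000) {
            return Long.toString(value);
        }
        int index = -1;
        long divisor = 1;
        // Climb one power of 1,000 at a time until the quotient drops below 1,000.
        while (index < NAMES.length - 1 && value / divisor >= 1000) {
            divisor *= 1000;
            index++;
        }
        BigDecimal scaled = BigDecimal.valueOf(value)
                .divide(BigDecimal.valueOf(divisor), 1, RoundingMode.DOWN) // truncate, do not round up
                .stripTrailingZeros();
        return scaled.toPlainString() + " " + NAMES[index];
    }

    public static void main(String[] args) {
        System.out.println(abbreviate(1_200_000L)); // 1.2 million
    }
}
```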
arabicNumerals | final public static String arabicNumerals(Code) | | Arabic digits. This example formats numbers in Arabic numerals.
Normally, you'd do this with DecimalFormat, but this shows that
RuleBasedNumberFormat can handle it too.
|
chinesePlaceValue | final public static String chinesePlaceValue(Code) | | This example formats numbers using Chinese characters in the Arabic
place-value method. This was used historically in China for a while.
|
closestFraction | final public static String closestFraction(Code) | | Number with closest fraction. This example formats a value using
numerals, but shows the fractional part as a ratio (fraction) rather
than a decimal. The fraction always has a denominator between 2 and 10.
|
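One way to picture the closestFraction behavior is a brute-force search over the denominators 2 through 10. The sketch below is plain Java with hypothetical names; the output layout ("3 1/2") and the handling of ties are assumptions, and it only covers non-negative values.

```java
final class ClosestFractionSketch {
    /** Formats a non-negative value as a whole number plus the nearest fraction with denominator 2..10. */
    static String format(double value) {
        long whole = (long) Math.floor(value);
        double frac = value - whole;
        int bestNum = 0;
        int bestDen = 1;
        double bestErr = frac; // error if the fractional part is simply dropped
        for (int den = 2; den <= 10; den++) {
            int num = (int) Math.round(frac * den);
            double err = Math.abs(frac - (double) num / den);
            if (err < bestErr) { // strict comparison keeps the smallest denominator on ties
                bestErr = err;
                bestNum = num;
                bestDen = den;
            }
        }
        if (bestNum == 0) {
            return Long.toString(whole);
        }
        if (bestNum == bestDen) {
            return Long.toString(whole + 1); // fraction rounded up to a whole unit
        }
        return whole + " " + bestNum + "/" + bestDen;
    }

    public static void main(String[] args) {
        System.out.println(format(2.75)); // 2 3/4
    }
}
```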
decimalAsFraction | final public static String decimalAsFraction(Code) | | This rule set shows the fractional part of the number as a fraction
with a power of 10 as the denominator. Some languages don't spell
out the fractional part of a number as "point one two three," but
always render it as a fraction. If we still want to treat the fractional
part of the number as a decimal, then the fraction's denominator
is always a power of 10. This example does that: 23.125 is formatted
as "twenty-three and one hundred twenty-five thousandths" (as opposed
to "twenty-three point one two five" or "twenty-three and one eighth").
|
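The arithmetic behind decimalAsFraction is small enough to sketch: the numerator is the digits after the decimal point, and the denominator is 10 raised to the number of those digits. The plain-Java sketch below uses hypothetical names, keeps the result in numerals instead of spelling it out, and assumes a non-negative decimal string as input.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

final class DecimalAsFractionSketch {
    /** Renders the fractional part of a non-negative decimal string over a power of 10. */
    static String format(String decimal) {
        BigDecimal value = new BigDecimal(decimal).stripTrailingZeros();
        BigInteger whole = value.toBigInteger();
        BigDecimal frac = value.subtract(new BigDecimal(whole));
        if (frac.signum() == 0) {
            return whole.toString();
        }
        // The scale is the number of fractional digits, so 10^scale is the denominator.
        BigInteger numerator = frac.unscaledValue();
        BigInteger denominator = BigInteger.TEN.pow(frac.scale());
        return whole + " and " + numerator + "/" + denominator;
    }

    public static void main(String[] args) {
        System.out.println(format("23.125")); // 23 and 125/1000
    }
}
```

Spelling "125/1000" as "one hundred twenty-five thousandths" would still need a spellout step, which this sketch leaves out.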
dollarsAndCents | final public static String dollarsAndCents(Code) | | This example formats a number in one of the two styles often used
on checks. %dollars-and-hundredths formats cents as hundredths of
a dollar (23.40 comes out as "twenty-three and 40/100 dollars").
%dollars-and-cents formats in dollars and cents (23.40 comes out as
"twenty-three dollars and forty cents").
|
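The two check styles described for dollarsAndCents can be mimicked with a small plain-Java sketch. The names are hypothetical, the amount is passed in whole cents to avoid floating-point error, and the tiny spellout helper only covers 0 to 99, so this is an illustration of the two layouts and not the rule set's logic.

```java
import java.util.Locale;

final class CheckAmountSketch {
    private static final String[] ONES = {"zero", "one", "two", "three", "four", "five", "six",
        "seven", "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"};
    private static final String[] TENS = {"", "", "twenty", "thirty", "forty", "fifty", "sixty",
        "seventy", "eighty", "ninety"};

    /** Spells out 0..99 only; larger values are outside this sketch. */
    static String words(int n) {
        if (n < 20) {
            return ONES[n];
        }
        return n % 10 == 0 ? TENS[n / 10] : TENS[n / 10] + "-" + ONES[n % 10];
    }

    /** Style 1: cents shown as hundredths of a dollar. Expects 0..9999 cents. */
    static String dollarsAndHundredths(int totalCents) {
        return words(totalCents / 100) + " and "
            + String.format(Locale.ROOT, "%02d", totalCents % 100) + "/100 dollars";
    }

    /** Style 2: dollars and cents both spelled out. Singular forms are not handled. */
    static String dollarsAndCents(int totalCents) {
        return words(totalCents / 100) + " dollars and " + words(totalCents % 100) + " cents";
    }

    public static void main(String[] args) {
        System.out.println(dollarsAndHundredths(2340)); // twenty-three and 40/100 dollars
        System.out.println(dollarsAndCents(2340));      // twenty-three dollars and forty cents
    }
}
```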
dozens | final public static String dozens(Code) | | This example formats a number in dozens and gross. This is intended to
demonstrate how this rule set can be used to format numbers in systems
other than base 10. The "/12" after the rules' base values controls this.
Also notice that the base doesn't have to be consistent throughout the
whole rule set: we go back to base 10 for values over 1,000.
|
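The base-12 decomposition behind the dozens example can be sketched in plain Java. The names and the output wording ("1 gross and 6") are assumptions, since the rule set's actual text is not shown here; the count of gross is simply printed in decimal, which loosely mirrors the return to base 10 for large values.

```java
import java.util.ArrayList;
import java.util.List;

final class DozensSketch {
    /** Splits a non-negative count into gross (144), dozens (12), and leftover units. */
    static String format(int n) {
        int gross = n / 144;
        int dozens = (n % 144) / 12;
        int units = n % 12;
        List<String> parts = new ArrayList<>();
        if (gross > 0) {
            parts.add(gross + " gross");
        }
        if (dozens > 0) {
            parts.add(dozens + " dozen");
        }
        if (units > 0 || parts.isEmpty()) {
            parts.add(Integer.toString(units)); // also covers n == 0
        }
        return String.join(" and ", parts);
    }

    public static void main(String[] args) {
        System.out.println(format(100)); // 8 dozen and 4
    }
}
```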
durationInHours | final public static String durationInHours(Code) | | This example formats a number of hours in sexagesimal notation (i.e.,
hours, minutes, and seconds). %with-words formats the value using
words for the units, and %in-numerals formats the value using only
numerals.
|
durationInSeconds | final public static String durationInSeconds(Code) | | This example formats a number of seconds in sexagesimal notation
(i.e., hours, minutes, and seconds). %with-words formats it with
words (3740 is "1 hour, 2 minutes, 20 seconds") and %in-numerals
formats it entirely in numerals (3740 is "1:02:20").
|
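The two durationInSeconds outputs quoted above (3740 as "1 hour, 2 minutes, 20 seconds" and as "1:02:20") come from the same base-60 split. A plain-Java sketch with hypothetical names, assuming a non-negative number of seconds:

```java
import java.util.Locale;

final class DurationSketch {
    /** Numerals only, e.g. 3740 becomes 1:02:20. */
    static String inNumerals(long seconds) {
        return String.format(Locale.ROOT, "%d:%02d:%02d",
            seconds / 3600, (seconds % 3600) / 60, seconds % 60);
    }

    /** Words for the units, with a simple singular/plural choice. */
    static String withWords(long seconds) {
        return unit(seconds / 3600, "hour") + ", "
            + unit((seconds % 3600) / 60, "minute") + ", "
            + unit(seconds % 60, "second");
    }

    private static String unit(long n, String name) {
        return n + " " + name + (n == 1 ? "" : "s");
    }

    public static void main(String[] args) {
        System.out.println(withWords(3740));  // 1 hour, 2 minutes, 20 seconds
        System.out.println(inNumerals(3740)); // 1:02:20
    }
}
```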
dutch | final public static String dutch(Code) | | Spellout rules for Dutch. Notice that in Dutch, as in German,
the ones digit precedes the tens digit.
|
french | final public static String french(Code) | | Spellout rules for French. French adds some interesting quirks of its
own: 1) The word "et" is interposed between the tens and ones digits,
but only if the ones digit is 1: 20 is "vingt," and 22 is "vingt-deux,"
but 21 is "vingt-et-un." 2) There are no words for 70, 80, or 90.
"quatre-vingts" ("four twenties") is used for 80, and values proceed
by score from 60 to 99 (e.g., 73 is "soixante-treize" ["sixty-thirteen"]).
3) Numbers from 1,100 to 1,199 are rendered as hundreds rather than
thousands: 1,100 is "onze cents" ("eleven hundred"), rather than
"mille cent" ("one thousand one hundred").
|
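The first two French quirks can be illustrated for 0 to 99 in plain Java. This sketch uses hypothetical names, follows the hyphenation shown in the description ("vingt-et-un"), and assumes that 61 and 71 also take "et" while 81 and 91 do not; it does not reproduce the rule set.

```java
final class FrenchSpelloutSketch {
    private static final String[] SMALL = {"z\u00e9ro", "un", "deux", "trois", "quatre", "cinq",
        "six", "sept", "huit", "neuf", "dix", "onze", "douze", "treize", "quatorze", "quinze",
        "seize", "dix-sept", "dix-huit", "dix-neuf"};
    private static final String[] TENS = {"", "", "vingt", "trente", "quarante", "cinquante",
        "soixante"};

    /** Spells out 0..99 using the vigesimal pattern for 60..99. */
    static String spell(int n) {
        if (n < 20) {
            return SMALL[n];
        }
        if (n == 80) {
            return "quatre-vingts"; // plural "s" only on the bare 80
        }
        if (n > 80) {
            return "quatre-vingt-" + SMALL[n - 80]; // 81..99 count 1..19 past four twenties
        }
        int base = n >= 60 ? 60 : (n / 10) * 10; // 70..79 count 10..19 past sixty
        int rest = n - base;
        if (rest == 0) {
            return TENS[base / 10];
        }
        String joiner = (rest == 1 || rest == 11) ? "-et-" : "-";
        return TENS[base / 10] + joiner + SMALL[rest];
    }

    public static void main(String[] args) {
        System.out.println(spell(73)); // soixante-treize
    }
}
```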
german | final public static String german(Code) | | Spellout rules for German. German also adds some interesting
characteristics. For values below 1,000,000, numbers are customarily
written out as a single word. And the ones digit PRECEDES the tens
digit (e.g., 23 is "dreiundzwanzig," not "zwanzigunddrei").
|
greek | final public static String greek(Code) | | Spellout rules for Greek. Again in Greek we have to supply the words
for the multiples of 100 because they can't be derived algorithmically.
Also, the tens digit changes form when followed by a ones digit: an
accent mark disappears from the tens digit and moves to the ones digit.
Therefore, instead of using the [] notation, we actually have to use
two separate rules for each multiple of 10 to show the two forms of
the word.
|
greekAlphabetic | final public static String greekAlphabetic(Code) | | Greek alphabetic numerals. The Greeks, before adopting the Arabic numerals,
also used the letters of their alphabet as numerals. There are three now-
obsolete Greek letters that are used as numerals; many fonts don't have them.
Large numbers were handled many different ways; the way shown here divides
large numbers into groups of four letters (factors of 10,000), and separates
the groups with the capital letter mu (for myriad). Capital letters are used
for values below 10,000; small letters for higher numbers (to make the capital
mu stand out).
|
hebrew | final public static String hebrew(Code) | | Spellout rules for Hebrew. Hebrew actually has inflected forms for
most of the lower-order numbers. The masculine forms are shown
here.
|
hebrewAlphabetic | final public static String hebrewAlphabetic(Code) | | Hebrew alphabetic numerals. Before adoption of Arabic numerals, Hebrew speakers
used the letters of their alphabet as numerals. The first nine letters of
the alphabet represented the values from 1 to 9, the second nine letters the
multiples of 10, and the remaining letters the multiples of 100. Since they
ran out of letters at 400, the remaining multiples of 100 were represented
using combinations of the existing letters for the hundreds. Numbers were
distinguished from words in a number of different ways: the way shown here
uses a single mark after a number consisting of one letter, and a double
mark between the last two letters of a number consisting of two or more
letters. Two dots over a letter multiplied its value by 1,000. Also, since
the letter for 10 is the first letter of God's name and the letters for 5 and 6
are letters in God's name, which wasn't supposed to be written or spoken, 15 and
16 were usually written as 9 + 6 and 9 + 7 instead of 10 + 5 and 10 + 6.
|
italian | final public static String italian(Code) | | Spellout rules for Italian. Like German, most Italian numbers are
written as single words. What makes these rules complicated is the rule
that says that when a word ending in a vowel and a word beginning with
a vowel are combined into a compound, the vowel is dropped from the
end of the first word: 180 is "centottanta," not "centoottanta."
Most of the complexity of this rule set exists to produce this behavior.
|
japanese | final public static String japanese(Code) | | Spellout rules for Japanese. In Japanese, there really isn't any
distinction between a number written out in digits and a number
written out in words: the ideographic characters are both digits
and words. This rule set provides two variants: %traditional
uses the traditional CJK numerals (which are also used in China
and Korea). %financial uses alternate ideographs for many numbers
that are harder to alter than the traditional numerals (one could
fairly easily change a one to a three just by adding two strokes,
for example). This is also done in the other countries using Chinese
ideographs, but different ideographs are used in those places.
|
message1 | final public static String message1(Code) | | This is a simple message-formatting example. Normally one would
use ChoiceFormat and MessageFormat to do something this simple,
but this shows it could be done with RuleBasedNumberFormat too.
A message-formatting example that might work better with
RuleBasedNumberFormat appears later.
|
message2 | final public static String message2(Code) | | A more complicated message-formatting example. Here, in addition to
handling the singular and plural versions of the word, the value is
denominated in bytes, kilobytes, or megabytes depending on its magnitude.
Also notice that it correctly treats a kilobyte as 1,024 bytes (not 1,000),
and a megabyte as 1,024 kilobytes (not 1,000).
|
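The unit selection described for message2 can be sketched in plain Java. The names are hypothetical, and the exact message wording and the use of whole-number division are assumptions; the point is the 1,024-based thresholds and the singular/plural choice.

```java
final class ByteCountSketch {
    private static final long KILOBYTE = 1024L;
    private static final long MEGABYTE = 1024L * KILOBYTE;

    /** Picks bytes, kilobytes, or megabytes by magnitude (non-negative input). */
    static String describe(long bytes) {
        if (bytes >= MEGABYTE) {
            return count(bytes / MEGABYTE, "megabyte");
        }
        if (bytes >= KILOBYTE) {
            return count(bytes / KILOBYTE, "kilobyte");
        }
        return count(bytes, "byte");
    }

    private static String count(long n, String unit) {
        return n + " " + unit + (n == 1 ? "" : "s");
    }

    public static void main(String[] args) {
        System.out.println(describe(2048)); // 2 kilobytes
    }
}
```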
ordinal | final public static String ordinal(Code) | | This rule set adds an English ordinal abbreviation to the end of a
number. For example, 2 is formatted as "2nd". Parsing doesn't work with
this rule set. To parse, use DecimalFormat on the numeral.
|
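The suffix choice that the ordinal rule set encodes can be written directly in plain Java. This is a sketch with hypothetical names for non-negative values, using the usual English convention that 11, 12, and 13 take "th".

```java
final class OrdinalSuffixSketch {
    /** Appends st, nd, rd, or th to a non-negative number. */
    static String ordinal(long n) {
        long lastTwo = n % 100;
        long last = n % 10;
        String suffix;
        if (lastTwo >= 11 && lastTwo <= 13) {
            suffix = "th"; // 11th, 12th, 13th, 111th, ...
        } else if (last == 1) {
            suffix = "st";
        } else if (last == 2) {
            suffix = "nd";
        } else if (last == 3) {
            suffix = "rd";
        } else {
            suffix = "th";
        }
        return n + suffix;
    }

    public static void main(String[] args) {
        System.out.println(ordinal(2)); // 2nd
    }
}
```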
poundsShillingsAndPence | final public static String poundsShillingsAndPence(Code) | | This rule set formats a number of pounds as pounds, shillings, and
pence in the old English system of currency.
|
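For poundsShillingsAndPence, the underlying conversion uses 20 shillings to the pound and 12 pence to the shilling, so 240 pence to the pound. The plain-Java sketch below has hypothetical names and an assumed output wording; it rounds to the nearest penny and expects a non-negative amount.

```java
final class PreDecimalCurrencySketch {
    /** Splits a non-negative number of pounds into pounds, shillings, and pence. */
    static String format(double pounds) {
        long totalPence = Math.round(pounds * 240); // 240 pence per pound
        long wholePounds = totalPence / 240;
        long shillings = (totalPence % 240) / 12;   // 12 pence per shilling
        long pence = totalPence % 12;
        return unit(wholePounds, "pound", "pounds") + ", "
            + unit(shillings, "shilling", "shillings") + ", "
            + unit(pence, "penny", "pence");
    }

    private static String unit(long n, String singular, String plural) {
        return n + " " + (n == 1 ? singular : plural);
    }

    public static void main(String[] args) {
        System.out.println(format(2.75)); // 2 pounds, 15 shillings, 0 pence
    }
}
```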
romanNumerals | final public static String romanNumerals(Code) | | Roman numerals. This example has two variants: %modern shows how large
numbers are usually handled today; %historical uses the older symbols for
thousands.
|
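For comparison with the romanNumerals rule set, here is the common greedy conversion in plain Java. It is a sketch with hypothetical names that uses subtractive notation and is limited to 1 through 3999; it does not attempt the large-number handling of either the %modern or the %historical variant.

```java
final class RomanNumeralSketch {
    private static final int[] VALUES =
        {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
    private static final String[] SYMBOLS =
        {"M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I"};

    /** Converts 1..3999 by repeatedly taking the largest symbol that still fits. */
    static String toRoman(int n) {
        if (n < 1 || n > 3999) {
            throw new IllegalArgumentException("supported range is 1..3999: " + n);
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < VALUES.length; i++) {
            while (n >= VALUES[i]) {
                out.append(SYMBOLS[i]);
                n -= VALUES[i];
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toRoman(1994)); // MCMXCIV
    }
}
```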
russian | final public static String russian(Code) | | Spellout rules for Russian.
|
sampleRuleSetCommentary | final public static String[] sampleRuleSetCommentary(Code) | | |
sampleRuleSetLocales | final public static Locale[] sampleRuleSetLocales(Code) | | The base locale for each of the sample rule sets. The locale is used to
determine DecimalFormat behavior, lenient-parse behavior, and text-display
selection (we have a hack in here to allow display of non-Latin scripts).
Null means the locale setting is irrelevant and the default can be used.
|
sampleRuleSetNames | final public static String[] sampleRuleSetNames(Code) | | The displayable names for all the sample rule sets, in the same order as
the preceding array.
|
sampleRuleSets | final public static String[] sampleRuleSets(Code) | | A list of all the sample rule sets, used by the demo program.
|
spanish | final public static String spanish(Code) | | Spellout rules for Spanish. The Spanish rules are quite similar to
the English rules, but there are some important differences:
First, we have to provide separate rules for most of the twenties
because the ones digit frequently picks up an accent mark that it
doesn't have when standing alone. Second, each multiple of 100 has
to be specified separately because the multiplier on 100 very often
changes form in the contraction: 500 is "quinientos," not
"cincocientos." In addition, the word for 100 is "cien" when
standing alone, but changes to "ciento" when followed by more digits.
There are also some other differences.
|
stock | final public static String stock(Code) | | American stock-price formatting. Non-integral stock prices are still
generally shown in eighths or sixteenths of dollars instead of dollars
and cents. This example formats stock prices in this way if possible,
and in dollars and cents if not.
|
swedish | final public static String swedish(Code) | | Spellout rules for Swedish.
|
swissFrench | final public static String swissFrench(Code) | | Spellout rules for Swiss French. Swiss French differs from French French
in that it does have words for 70, 80, and 90. This rule set shows them,
and is simpler as a result.
|
ukEnglish | final public static String ukEnglish(Code) | | Spellout rules for U.K. English. U.K. English has one significant
difference from U.S. English: the names for values of 1,000,000,000
and higher. In American English, each successive "-illion" is 1,000
times greater than the preceding one: 1,000,000,000 is "one billion"
and 1,000,000,000,000 is "one trillion." In British English, each
successive "-illion" is one million times greater than the one before:
"one billion" is 1,000,000,000,000 (or what Americans would call a
"trillion"), and "one trillion" is 1,000,000,000,000,000,000.
1,000,000,000 in British English is "one thousand million." (This
value is sometimes called a "milliard," but this word seems to have
fallen into disuse.)
|
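The difference between the two naming systems described for ukEnglish reduces to two exponent formulas: in the American system the n-th "-illion" is 10^(3n+3), and in the British system described here it is 10^(6n). A plain-Java sketch with hypothetical names, where n = 1 is "million", n = 2 is "billion", and n = 3 is "trillion":

```java
import java.math.BigInteger;

final class IllionScaleSketch {
    /** American usage: each "-illion" is 1,000 times the previous one. */
    static BigInteger american(int n) {
        return BigInteger.TEN.pow(3 * n + 3);
    }

    /** British usage as described above: each "-illion" is 1,000,000 times the previous one. */
    static BigInteger british(int n) {
        return BigInteger.TEN.pow(6 * n);
    }

    public static void main(String[] args) {
        System.out.println("American billion: " + american(2));
        System.out.println("British billion:  " + british(2));
    }
}
```

Both formulas agree for n = 1, which is why "one million" means the same thing in both systems.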
units | final public static String units(Code) | | This example takes a number of meters and formats it in whatever unit
will produce a number with from one to three digits before the decimal
point. For example, 230,000 is formatted as "230 km".
|
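The unit selection described for the units example can be sketched in plain Java. The names are hypothetical and the sketch only knows millimeters, meters, and kilometers, which is an assumption; the rule set may cover more units, and values outside that span will show more than three digits here.

```java
import java.math.BigDecimal;

final class MetricLengthSketch {
    /** Chooses mm, m, or km so that typical values show one to three integer digits. */
    static String format(double meters) {
        BigDecimal value = BigDecimal.valueOf(meters);
        BigDecimal magnitude = value.abs();
        String unit;
        if (magnitude.compareTo(BigDecimal.valueOf(1000)) >= 0) {
            value = value.movePointLeft(3);  // meters to kilometers
            unit = "km";
        } else if (magnitude.signum() != 0 && magnitude.compareTo(BigDecimal.ONE) < 0) {
            value = value.movePointRight(3); // meters to millimeters
            unit = "mm";
        } else {
            unit = "m";
        }
        return value.stripTrailingZeros().toPlainString() + " " + unit;
    }

    public static void main(String[] args) {
        System.out.println(format(230000)); // 230 km
    }
}
```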
usEnglish | final public static String usEnglish(Code) | | Spellout rules for U.S. English. This demonstration version of the
U.S. English spellout rules has four variants: 1) %simplified is a
set of rules showing the simple method of spelling out numbers in
English: 289 is formatted as "two hundred eighty-nine". 2) %alt-teens
is the same as %simplified, except that values between 1,000 and 9,999
whose hundreds place isn't zero are formatted in hundreds. For example,
1,983 is formatted as "nineteen hundred eighty-three," and 2,183 is
formatted as "twenty-one hundred eighty-three," but 2,083 is still
formatted as "two thousand eighty-three." 3) %ordinal formats the
values as ordinal numbers in English (e.g., 289 is "two hundred eighty-
ninth"). 4) %default uses a more complicated algorithm to format
numbers in a more natural way: 289 is formatted as "two hundred AND
eighty-nine" and commas are inserted between the thousands groups for
values above 100,000.
|
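The %alt-teens decision described for usEnglish (format in hundreds only when the hundreds digit is nonzero) can be sketched in plain Java. The names are hypothetical, the helper only spells 0 to 99, and the method only accepts 1,000 to 9,999, so this illustrates the branching and nothing more.

```java
final class AltTeensSketch {
    private static final String[] ONES = {"zero", "one", "two", "three", "four", "five", "six",
        "seven", "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"};
    private static final String[] TENS = {"", "", "twenty", "thirty", "forty", "fifty", "sixty",
        "seventy", "eighty", "ninety"};

    /** Spells out 0..99 only. */
    static String words(int n) {
        if (n < 20) {
            return ONES[n];
        }
        return n % 10 == 0 ? TENS[n / 10] : TENS[n / 10] + "-" + ONES[n % 10];
    }

    /** Formats 1000..9999 in hundreds when the hundreds digit is nonzero, else in thousands. */
    static String altTeens(int n) {
        if (n < 1000 || n > 9999) {
            throw new IllegalArgumentException("supported range is 1000..9999: " + n);
        }
        boolean hundredsDigitNonZero = (n / 100) % 10 != 0;
        String head = hundredsDigitNonZero
            ? words(n / 100) + " hundred"    // 1983: "nineteen hundred"
            : words(n / 1000) + " thousand"; // 2083: "two thousand"
        int rest = n % 100;
        return rest == 0 ? head : head + " " + words(rest);
    }

    public static void main(String[] args) {
        System.out.println(altTeens(1983)); // nineteen hundred eighty-three
    }
}
```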
wordsForDigits | final public static String wordsForDigits(Code) | | Words for digits. Follows the same pattern as the Arabic-numerals
example above, but uses words for the various digits (e.g., 123 comes
out as "one two three").
|
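The wordsForDigits output is a digit-by-digit substitution, which a few lines of plain Java can imitate. The names are hypothetical and the sketch assumes a non-negative value.

```java
import java.util.StringJoiner;

final class WordsForDigitsSketch {
    private static final String[] WORDS = {"zero", "one", "two", "three", "four", "five", "six",
        "seven", "eight", "nine"};

    /** Replaces each decimal digit of a non-negative value with its English word. */
    static String spellDigits(long n) {
        StringJoiner joined = new StringJoiner(" ");
        for (char c : Long.toString(n).toCharArray()) {
            joined.add(WORDS[c - '0']);
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        System.out.println(spellDigits(123)); // one two three
    }
}
```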
|
|