UTF-8 (196, 129) ā
name: LATIN SMALL LETTER A WITH MACRON
old name:
Adobe glyph name: amacron
mnemonic name(s): <a->
category: Ll (Letter, Lowercase)
combining: 0
decomposition info: 0061 0304
comment:
found in charsets: 8859-10 (E0); 8859-13 (E2); 8859-4 (E0); CP1257 (E2); CP775 (83);
found in languages: hawa [Hawaiian]; livo [Livonian]; lv [Latvian]; mars [Marshallese]; mi [Maori];
used in romanization of: am_r [Amharic (ethiopic)]; ar_r [Arabic (perso-arabic)]; as_r [Assamese (assamese)]; bn_r [Bengali (bengali)]; fa_r [Persian (perso-arabic)]; gu_r [Gujarati]; hi_r [Hindi (devanagari)]; kn_r [Kannada]; ml_r [Malayalam]; or_r [Oriya]; pa_r [Punjabi]; ps_r [Pashto (perso-arabic)]; ta_r [Tamil (tamil)]; te_r [Telugu]; ur_r [Urdu (perso-arabic)]; zh_r [Chinese (sino-japanese)];
uppercase: 0100
Most languages also have at least some loanwords containing characters
not listed in the query results. While these characters clearly do not
belong under the 'required' category, one can argue about their
importance. So far, only Norwegian has comments of this kind.
Glyph
Standard disclaimer applies -- the glyph presented here is by no means
normative. U0101 is the Unicode value (= 101 hexadecimal).
Decimal
Decimal representation of the Unicode value that can be used
e.g. in HTML 4 as a numeric character reference (&#257; for U0101).
UTF-8
UTF-8 representation of the Unicode value that can be used
e.g. in HTML 4. You can see the result if you change the document
encoding to UTF-8. Netscape users: View - Character set - Unicode (UTF-8).
Font support is also needed, so don't be surprised if a hollow box
is displayed for many characters.
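As a rough illustration of the Decimal and UTF-8 fields, the following Python sketch derives both values for the example character from its Unicode value 0101. It uses only the standard library; the variable names are illustrative and are not part of the Letter Database.

```python
# Example character from the sample record: Unicode value 0101 (hexadecimal).
ch = chr(0x0101)

# Decimal representation of the Unicode value.
decimal_value = ord(ch)

# UTF-8 byte values, in the same form as the "UTF-8 (196, 129)" field.
utf8_bytes = tuple(ch.encode("utf-8"))

# Numeric character reference built from the decimal value, usable in HTML.
html_reference = f"&#{decimal_value};"

print(decimal_value)
print(utf8_bytes)
print(html_reference)
```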
Name
Character names as currently defined in the Unicode standard
(UnicodeData-Latest.txt).
Old name
Deprecated names as currently defined in the Unicode standard
(UnicodeData-Latest.txt).
Adobe glyph name
Glyph names as currently defined in the Adobe Glyph List. This
name is used in PostScript. Be sure to read
Unicode and Glyph Names.
Mnemonic name
Mnemonic representation as currently defined in
mnemonic,ds.
Mainly used in POSIX environments to define mapping tables, collating rules, etc.
Enclosing <angle brackets> are not part of the name.
Character sets
ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/
Mapping tables for various character sets can be found in subdirectories.
The bracketed hexadecimal number after the set corresponds to the character's
code position in this set.
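The code positions can be cross-checked with a short Python sketch. It assumes that Python's built-in codecs (iso8859_10, iso8859_13, iso8859_4, cp1257, cp775) agree with the mapping tables used here; the dictionary and the function name code_positions are hypothetical helpers introduced only for this example.

```python
# Charset labels as shown in the record, mapped to Python codec names.
# Assumption: Python's codec tables match the published mapping tables.
CODECS_BY_CHARSET = {
    "8859-10": "iso8859_10",
    "8859-13": "iso8859_13",
    "8859-4": "iso8859_4",
    "CP1257": "cp1257",
    "CP775": "cp775",
}


def code_positions(char):
    """Return {charset: hex code position} for every charset containing char."""
    result = {}
    for charset, codec in CODECS_BY_CHARSET.items():
        try:
            encoded = char.encode(codec)
        except UnicodeEncodeError:
            continue  # the character is not present in this charset
        result[charset] = format(encoded[0], "02X")
    return result


print(code_positions("\u0101"))
```

A character that is missing from a set is simply left out of the result, which mirrors how the "found in charsets" field lists only the sets that contain the character.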
Category, combining, decomposition info, comment, upper/lowercase
Various categories as currently defined in the Unicode standard.
Source: UnicodeData-Latest.txt.
Explanations about these fields can be found in
ReadMe-Latest.txt.
Decomposition info is shown only if present. In this example, LATIN SMALL LETTER A WITH MACRON can be decomposed into 0061 (LATIN SMALL LETTER A) and 0304 (COMBINING MACRON). Note that some parts may be decomposed further and the full decomposition requires a recursive algorithm.
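The recursive expansion mentioned above might look like the following Python sketch. It relies on unicodedata.decomposition from the standard library; the function name full_decomposition is a hypothetical helper, and the sketch expands canonical decompositions only, leaving compatibility decompositions (those tagged with a leading <...>) untouched.

```python
import unicodedata


def full_decomposition(char):
    """Recursively expand the canonical decomposition of a single character.

    Returns a list of four-digit hexadecimal code points.
    """
    info = unicodedata.decomposition(char)
    # An empty string means there is no decomposition. A leading "<tag>"
    # marks a compatibility decomposition, which this sketch does not expand.
    if not info or info.startswith("<"):
        return [format(ord(char), "04X")]
    parts = []
    for code in info.split():
        # Each part may itself be decomposable, hence the recursion.
        parts.extend(full_decomposition(chr(int(code, 16))))
    return parts


print(full_decomposition("\u0101"))  # the example character
print(full_decomposition("\u01DE"))  # decomposes in two steps
```

The sketch does not reorder combining marks; for a complete canonical decomposition, unicodedata.normalize("NFD", text) is the safer choice.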
In addition to the Unicode comment field some characters may be provided with an additional "note"-field.
The upper case equivalent for the LATIN SMALL LETTER A WITH MACRON is 0100 (LATIN CAPITAL LETTER A WITH MACRON). A few exceptions to the common upper-lowercase behaviour of the Latin and Cyrillic scripts are
Not all fields defined by Unicode are shown and used by Letter Database.
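Several of the fields described above can be reproduced with Python's standard unicodedata module. This is a minimal sketch, assuming that the Unicode data bundled with the Python interpreter agrees with UnicodeData-Latest.txt for this character; the record dictionary is illustrative and is not the Letter Database's own format.

```python
import unicodedata

ch = "\u0101"  # the example character

record = {
    "name": unicodedata.name(ch),
    "category": unicodedata.category(ch),
    "combining": unicodedata.combining(ch),
    "decomposition info": unicodedata.decomposition(ch),
    # Upper case equivalent, formatted as a four-digit hexadecimal value.
    "uppercase": format(ord(ch.upper()), "04X"),
}

for field, value in record.items():
    print(f"{field}: {value}")
```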