REPORT ON THE CURRENT STATUS OF
UNITED NATIONS ROMANIZATION SYSTEMS FOR GEOGRAPHICAL NAMES
Compiled by the UNGEGN Working Group on Romanization Systems
Version 2.2, January 2003

Khmer

The United Nations recommended system was approved in 1972 (resolution II/10), based on the system used by the BGN/PCGN (1972), this being a modified version of the Service Géographique Khmère (SGK) 1959 system. The table and notes on its usage were published in volume II of the conference report (Second United Nations Conference on the Standardization of Geographical Names. London, 10-31 May 1972. Vol. II. Technical papers. United Nations. New York 1974, pp. 163-164.).

The system is used in many international cartographic products. In 1994-1995 the Gazetteer of Cambodia was produced using this system with some proposed modifications. However, since 1995 the Geography Department of the Ministry of Land Management and Urban Planning of Cambodia has been developing a new romanization system, which was subsequently used in the second edition of the Gazetteer of Cambodia in 1996. This provisional system, which does not contain any diacritical marks, was further modified in 1997.

Khmer uses an alphasyllabic script in which each character represents a syllable rather than a single sound. Vowels and diphthongs are marked in two ways: as independent characters (used syllable-initially) and in an abbreviated form that denotes vowels after consonants. The romanization system is complicated by many additional rules. In Khmer writing, word division is not ordinarily indicated, and Khmer diacritical marks are often omitted. The romanization is generally not reversible to its original script form.

Romanization

I. Consonant characters

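1 ក kâ
2 ខ khâ
3 គ kô
4 ឃ khô
5 ង ngô
6 ច châ
7 ឆ chhâ
8 ជ chô
9 ឈ chhô
10 ញ nhôA
11 ដ dâ
12 ឋ thâ
13 ឌ dô
14 ឍ thô
15 ណ nâ
16 ត tâ
17 ថ thâ
18 ទ tô
19 ធ thô
20 ន nô
21 ប bâ B
22 ផ phâ
23 ព pô
24 ភ phô
25 ម mô
26 យ yô
27 រ rô
28 ល lô
29 វ vô
30 ស sâ
31 ហ hâ
32 ឡ lâ
33 អ ’âC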

II. Subscript consonant characters (ក stands for any consonant character, see also note 3)

Character numbers correspond to those in Table I.

1 ក្ក k
2 ក្ខ kh
3 ក្គ k
4 ក្ឃ kh
5 ក្ង ng
6 ក្ច ch
7 ក្ឆ chh
8 ក្ជ ch
9 ក្ឈ chh
10 ក្ញ nhA
11 ក្ដ d
12 ក្ឋ th
13 ក្ឌ d
14 ក្ឍ th
15 ក្ណ n
16 ក្ត t
17 ក្ថ th
18 ក្ទ t
19 ក្ធ th
20 ក្ន n
21 ក្ប b
22 ក្ផ ph
23 ក្ព p
24 ក្ភ ph
25 ក្ម m
26 ក្យ y
27 ក្រ r
28 ក្ល l
29 ក្វ v
30 ក្ស s
31 ក្ហ h
33 ក្អ ’
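
The two consonant tables can be read as a single lookup from character to base romanization and series. The following is a minimal Python sketch of that reading, not part of the UN system itself. The characters and base romanizations follow Table II (ឡ, which has no subscript form, is added from Table I); the series assigned to each consonant is the inherent vowel given in Table I. The names `CONSONANTS` and `name_form` are hypothetical and introduced only for illustration.

```python
A, O = "â", "ô"  # labels for the two consonant series (see note 1)

# consonant -> (base romanization as in Table II, series as in Table I)
CONSONANTS = {
    "ក": ("k", A), "ខ": ("kh", A), "គ": ("k", O), "ឃ": ("kh", O),
    "ង": ("ng", O), "ច": ("ch", A), "ឆ": ("chh", A), "ជ": ("ch", O),
    "ឈ": ("chh", O), "ញ": ("nh", O), "ដ": ("d", A), "ឋ": ("th", A),
    "ឌ": ("d", O), "ឍ": ("th", O), "ណ": ("n", A), "ត": ("t", A),
    "ថ": ("th", A), "ទ": ("t", O), "ធ": ("th", O), "ន": ("n", O),
    "ប": ("b", A), "ផ": ("ph", A), "ព": ("p", O), "ភ": ("ph", O),
    "ម": ("m", O), "យ": ("y", O), "រ": ("r", O), "ល": ("l", O),
    "វ": ("v", O), "ស": ("s", A), "ហ": ("h", A), "ឡ": ("l", A),
    "អ": ("’", A),
}


def name_form(consonant: str) -> str:
    """Romanize a bare consonant with its inherent vowel (base + â or ô)."""
    base, series = CONSONANTS[consonant]
    return base + series


if __name__ == "__main__":
    for ch in "ខឃង":
        print(ch, name_form(ch))
```

Keeping the series next to the base romanization is a deliberate choice here: most of the vowel rules in the later tables depend on the series rather than on the consonant letter.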

III. Independent vowel characters

1 ឥ ĕ
2 ឦ ei
3 ឧ ŏ, ŭA
4 ឪ âu
5 ឫ rœ̆
6 ឬ rœ
7 ឭ lœ̆
8 ឮ lœ
9 ឯ ê
10 ឰ ai
11 ឱ aô B
12 ឳ au

IV. Vocalic nuclei (ក stands for any consonant character)

Where two romanization variants are given separated by a dash, the first is to be used in the â-series and the second in the ô-series.

1 កា a–éaA
2 កិ ĕ–ĭ
3 កី ei–i
4 កឹ œ̆
5 កឺ œ
6 កុ ŏ–ŭ
7 កូ o–u
8 កួ
9 កើ aeu–eu
10 កឿ œă
11 កៀ
12 កេ é
13 កែ ê
14 កៃ ai–ey
15 កោ aô–oŭ
16 កៅ au–ŏu
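
The series-dependent choice in Table IV amounts to a two-column lookup. The sketch below is one possible way to express it in Python and is illustrative only. It assumes the series of the syllable is already known, identifies each vowel sign by its Unicode code point, and leaves out rows 8 and 11 because no romanization for them appears in the table above. The names `NUCLEI`, `nucleus` and `open_syllable` are hypothetical.

```python
A, O = "â", "ô"  # series labels

# Dependent vowel sign -> (â-series value, ô-series value), after Table IV.
# Signs are written as code-point escapes so that they display reliably.
NUCLEI = {
    "\u17b6": ("a", "éa"),     # row 1
    "\u17b7": ("ĕ", "ĭ"),      # row 2
    "\u17b8": ("ei", "i"),     # row 3
    "\u17b9": ("œ̆", "œ̆"),      # row 4 (same in both series)
    "\u17ba": ("œ", "œ"),      # row 5
    "\u17bb": ("ŏ", "ŭ"),      # row 6
    "\u17bc": ("o", "u"),      # row 7
    "\u17be": ("aeu", "eu"),   # row 9
    "\u17bf": ("œă", "œă"),    # row 10
    "\u17c1": ("é", "é"),      # row 12
    "\u17c2": ("ê", "ê"),      # row 13
    "\u17c3": ("ai", "ey"),    # row 14
    "\u17c4": ("aô", "oŭ"),    # row 15
    "\u17c5": ("au", "ŏu"),    # row 16
}


def nucleus(sign: str, series: str) -> str:
    """Pick the â-series or ô-series romanization of a vowel sign."""
    a_value, o_value = NUCLEI[sign]
    return a_value if series == A else o_value


def open_syllable(base: str, series: str, sign: str) -> str:
    """Romanize consonant + vowel sign, given the consonant's base and series."""
    return base + nucleus(sign, series)
```

For example, `open_syllable("k", A, "\u17c3")` gives `kai` and `open_syllable("k", O, "\u17c3")` gives `key`, which is the â-series and ô-series pair of row 14.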

V. Shortened syllables and vocalic nuclei with anuswara or visarga

Where romanization variants are given separated by a dash, the one before the dash is to be used in the â-series and the one(s) after the dash in the ô-series. (ក stands for any consonant character.)

1 កក់ á–ó
2 កាក់ ă–oă,eăA
3 ក័ក ă–oă,eăA
4 កំ âm–um
5 កុំ om–ŭm
6 កាំ ăm–ŏâm
7 កះ ăh–eăh
8 កុះ ŏh–ŭh
9 កេះ éh
10 កោះ aôh–ŏăh
11 កាំង ăng–eăng
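
Rows 2 and 3 of this table list two ô-series alternatives (oă, eă). Judging from the examples in note 6 (rôpeăk, veăngk against moăt, phoăpv), the choice depends on the consonant that follows. The Python sketch below encodes that reading as an assumption; the function name `short_a` and its parameters are hypothetical, and the following consonant is passed in already romanized.

```python
A, O = "â", "ô"  # series labels


def short_a(series: str, following: str) -> str:
    """Romanize the shortened vowel of Table V, rows 2 and 3.

    series    -- "â" or "ô", the series of the syllable
    following -- romanization of the consonant that follows the vowel
    """
    if series == A:
        return "ă"
    # ô-series: eă before k, ng or h; oă otherwise (reading of note 6).
    return "eă" if following in {"k", "ng", "h"} else "oă"


if __name__ == "__main__":
    # Rebuild the note 6 examples from their parts.
    print("ch" + short_a(A, "k") + "k")      # chăk
    print("m" + short_a(O, "t") + "t")       # moăt
    print("rôp" + short_a(O, "k") + "k")     # rôpeăk
```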

Notes

  1. Khmer consonants are divided into two series, the â-series and the ô-series, as indicated in the consonant table. With vocalic nuclei these consonants may produce different romanizations, as shown in the abbreviated vowel table: ក kâ, ក្រ krâ, គ kô, គ្រ krô. A Khmer consonant in syllable-final position, not accompanied by a vowel marker or by ៏, should generally be romanized without a vowel letter following: កក kâk, អង្គ ’ângk (exception: ពង្រ pôngrô, also written ពង្ររ pôngrôr and ពង្រហ pôngrôh).
  2. The Khmer diacritical mark ៊ or ៉ written above an â-series consonant (except ប and បា; see note 4) changes it to the ô-series: ហ៊ាង héang. The diacritical mark ៉ written above an ô-series consonant changes it to the â-series: ញ៉ង nhâng. When either of these marks would conflict with another symbol written above a character, the mark ុ or ូ may be written in its place: ហ៊ី hi, ដំរីូ dâmrei. (These marks are frequently omitted in Khmer writing, particularly in words of Indic provenance.)
  3. The second consonant of a Khmer graphic cluster is generally written below the base consonant in the special form called a "foot": ខ្នង khnâng. There is no foot for the character ឡ . The "feet" ្ដ and ្ធ usually represent the characters ដ and ធ respectively, rather than ត and ឋ: ក្ដី kdei, កន្ធាយ kânthéay, but កន្ត្រប់ kântráb.

    A "foot" determines the series of the following vocalic nucleus unless it is a nasal (ង ng, ញ nh, ណ n, ន n, ម m) or យ y, រ r, ល l, វ v, ស s. In that case, the base consonant determines the vocalic series: ខ្ពង khpông, ល្អ l’â, ថ្ម thmâ, ស្វាយ svay. Syllable-final យ and ង sometimes appear as "feet": ស្វាយ or ស្វា្យ svay, ទាំង or ទាំ្ង teăng. This practice appears to be optional and such irregular Khmer spellings are not reflected as such in romanization.

  4. The combination ប plus ា is written បា ba. The latter character is a graphic device designed to prevent confusion with ហ . The characters ប and បា with the diacritical mark ៉ are romanized p in the â-series, rather than as b in the ô-series: ប៉ង pâng, ប៉ាតៅ patau. The diacritical mark ុ or ូ is substituted where a conflict with another symbol written above a character would occur: ប៉ី pei. The characters ប and បា when accompanied by a "foot" are also romanized as p in the â-series, although the Khmer diacritical mark is generally omitted: ប្លែង plêng, ប្អ p’â, ប្រាប់ práb.
  5. The â-series consonant អ is romanized by means of an apostrophe (’): ក្អែក k’êk, ចង្អៀត châng’iet, រអិល rô’el, អ្វី ’vei, អាង ’ang. In word-initial position before a vowel, ’ may be omitted: អាង ang.
  6. The Khmer diacritical mark ់ appears only in two combinations (• exemplifies any consonant): •់ (examples: បត់ bát, ខ្ពស់ khpós) and ា់. The diacritical mark ័ appears only in the combination ័•. In the â-series both ា់ and ័• are romanized ă: ចាក់ and ច័ក are both romanized chăk. In the ô-series both ា់ and ័• are romanized eă when followed by k, ng or h; otherwise, they are romanized oă: រពាក់ rôpeăk, មាត់ moăt, វ័ង្ក veăngk, ភ័ព្វ phoăpv.
  7. The combination ៌ is romanized r before a consonant: ធម៌ thôrm. The combination ៌់ is romanized as r before a consonant preceded by a shortened vocalic nucleus: គារ៌់ koărr.
  8. The symbol ៏ in syllable-initial position is ignored in romanization: ស៏ sâ, ស៏សស sâsâs. In syllable-final position ៏ indicates that the consonant is vowelled, i.e. followed by â in the â-series, by ô in the ô-series: តំណ៏ tâmnâ, ពម៏ pômô.
  9. The diacritical mark ៍ (which appears above characters and/or vowel markers which are not vocalized) is ignored in romanization: បុណ្យ៍ bŏny, ពោធិ៍ poŭthĭ, ភូមិ៍ phumĭ.
  10. The independent character ឨ is romanized either ŏ or ŭ. A reference source should be consulted where doubt arises.

Other systems of romanization

The provisional romanization system of the Geography Department of the Ministry of Land Management and Urban Planning of Cambodia (1995, modified 1997; see Geographical Names of the Kingdom of Cambodia. Submitted by Cambodia. Eighth United Nations Conference on the Standardization of Geographical Names. Berlin, 27 August – 5 September 2002. Document E/CONF.94/INF.30) renders the consonants in the same way as described above, but the treatment of vowels is somewhat different. As a rule, the diacritical signs used in the UN system are omitted, and in addition the following equivalents differ. (Numbers refer to the tables and characters in the UN system. Some of the romanizations given have no explicit counterparts in the UN system.)

No. Char. UN system Provisional
III.3 ŏ, ŭ o
III.- (not given) ou
III.5 rœ̆ rue
III.6 rœ rueu
III.7 lœ̆ lue
III.8 lœ lueu
III.9 ê ae
III.12 au ov
IV.4 œ̆ oe–ue
IV.5 œ eu–ueu
IV.7 o–u ou–u
IV.10 œă oea
IV.13 ê ae–eae
IV.16 au–ŏu au–ov
V.2 ា់ ă–oă,eă (a–ea?)
V.(2) ាក់ (ăk–eăk) ak–eak
V.3 ័• ă–oă,eă oa(?)–oa
V.(3) ័យ (not given) ai–ey
V.10 ោះ aôh–ŏăh aoh–uoh
V.- (not given) ak–eak
V.- ិះ (not given) eh–is

Where romanization variants are given separated by a dash, the one before the dash is to be used in the â-series and the one(s) after the dash in the ô-series. Uncertain romanization equivalents are indicated by a question mark.
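
The general rule of the provisional system, omission of the diacritical signs of the UN system, can be approximated mechanically. The Python sketch below does only that: it decomposes the text and drops combining marks, using the standard `unicodedata` module. It does not apply any of the differing equivalents listed in the table above (several of which depend on the consonant series and cannot be recovered from the romanized string alone), so its output should be treated as a first approximation rather than as the provisional romanization. The function name `strip_diacritics` is hypothetical.

```python
import unicodedata


def strip_diacritics(romanized: str) -> str:
    """Drop combining marks (circumflex, acute, breve, ...) from a string.

    Letters such as œ and the apostrophe ’ have no decomposition and are
    left unchanged.
    """
    decomposed = unicodedata.normalize("NFD", romanized)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    for word in ["kânthéay", "khnâng", "phumĭ"]:
        print(word, "->", strip_diacritics(word))
```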

Before the system was last modified in 1997, the vowel a / ă in the combinations ា, ា់ and ័• (see UN system, Table IV, line 1, and Table V, lines 2 and 3) was romanized as aa, and the vowel é in the combination េ (Table IV, line 12) was romanized as ee.