Page Last Updated: 5 April 1998

Some Character Set Names and Descriptions. Most of these names are supported by current browsers.

Character Set Label   Description
us-ascii              US ASCII
iso-8859-1            ISO Latin-1
x-mac-roman           Similar to Latin-1 (Mac only)
iso-8859-2            Central/East European (Czech, Croatian, German, Hungarian, Polish, Romanian, Slovak, and Slovenian)
x-mac-ce              Central/East European (Mac only)
CP-1250, or Win 1250  Central/East European (Windows only)
iso-8859-3            Southern European (Esperanto, Galician, Maltese, and Turkish)
x-mac-cyrillic        Cyrillic (Mac only)
CP-1251, or Win 1251  Cyrillic (Windows only)
KOI8-R                Cyrillic (RFC 1489)
iso-8859-4            Northern European/Baltic (Estonian, Latvian, Lithuanian)
iso-8859-5            Cyrillic (Bulgarian, Byelorussian, Macedonian, Serbian, and Ukrainian)
iso-8859-6            Arabic
iso-8859-7            Modern Greek
iso-8859-8            Hebrew
x-mac-turkish         Turkish (Mac only)
iso-8859-9            Turkish
iso-8859-10           Greenlandic/Icelandic/Lapp
iso-2022-jis          Japanese
iso-2022-jp           Japanese (RFC 1468)
x-sjis, or ShiftJIS   Japanese Shift-JIS (Microsoft code set)
iso-2022-kr           Korean (RFC 1557)
x-euc-jp, or euc-jp   Extended UNIX Code for Japanese
x-euc-kr, or euc-kr   Extended UNIX Code for Korean (RFC 1557)
gb_2312-80            Simplified Chinese--People's Republic (RFC 1345)
x-euc-tw              Extended UNIX Code for Chinese--Taiwan
x-cns11643-1          Traditional Chinese--Taiwan
x-cns11643-2          Traditional Chinese--Taiwan
Big5                  Traditional Chinese--Taiwan--multi-byte set
UCS-2                 Unicode, two-byte encoding (same as UCS portion of ISO 10646)
UTF-8                 Unicode, variable-length encoding (one or more 8-bit bytes per character)--UCS transformation format (same as UCS portion of ISO 10646)
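
The difference between the single-byte, multi-byte, and Unicode entries in the table can be seen by encoding the same character under several of them. The following is a minimal Python sketch, not part of the original appendix. The helper name byte_lengths is hypothetical, and the codec names are the ones Python's standard library accepts, which may differ from the labels a browser recognizes. Python has no codec named UCS-2, so "utf-16-be" is used here as a stand-in on the assumption that the characters tested lie in the Basic Multilingual Plane, where the two encodings produce the same bytes.

```python
def byte_lengths(text, codec_names):
    """Map each codec name to the number of bytes needed to encode text.

    A value of None means the character set cannot represent the text.
    byte_lengths is a hypothetical helper written for this illustration.
    """
    lengths = {}
    for name in codec_names:
        try:
            lengths[name] = len(text.encode(name))
        except UnicodeEncodeError:
            lengths[name] = None
    return lengths


if __name__ == "__main__":
    # Python codec names, assumed to correspond to labels in the table above.
    codec_names = ["us-ascii", "iso-8859-1", "koi8-r", "euc-jp",
                   "utf-16-be", "utf-8"]
    samples = [
        "A",        # Latin capital letter A
        "\u0416",   # Cyrillic capital letter ZHE
        "\u65e5",   # CJK ideograph meaning "sun" or "day"
    ]
    for text in samples:
        print(repr(text), byte_lengths(text, codec_names))
```

Under these assumptions, the letter A takes one byte in US ASCII and UTF-8 but two in the UCS-2 stand-in, while the CJK ideograph takes two bytes in EUC-JP and three in UTF-8 and cannot be represented in ISO Latin-1 at all.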