UTF-8 and ISO-8859-1 Overview

The ISO-8859-1 encoding standard provides a single-byte representation of the characters used by languages in the Latin-1 group. The Latin-1 language group includes:

  • Afrikaans
  • Basque
  • Catalan
  • Danish
  • Dutch
  • English
  • Faeroese
  • Finnish
  • French
  • Galician
  • German
  • Icelandic
  • Irish
  • Italian
  • Norwegian
  • Portuguese
  • Scottish
  • Spanish
  • Swedish

For a complete list of the characters in the ISO-8859-1 character set, refer to: http://unicode.org/Public/MAPPINGS/ISO8859/8859-1.TXT

The following table lists examples of ISO-8859-1 (single byte) character encodings.

Hex Value   Display Character   Description
30          0                   The number zero
41          A                   Capital letter A
A5          ¥                   YEN SIGN
DF          ß                   LATIN SMALL LETTER SHARP S
BF          ¿                   INVERTED QUESTION MARK
E6          æ                   LATIN SMALL LIGATURE AE
FF          ÿ                   LATIN SMALL LETTER Y WITH DIAERESIS
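The hex values in the table above can be checked with a few lines of code. The following is a minimal sketch in Python, assuming the standard "iso-8859-1" codec; the helper name latin1_hex is hypothetical and is used here only for illustration.

```python
# Sample characters taken from the ISO-8859-1 table above.
samples = ["0", "A", "¥", "ß", "¿", "æ", "ÿ"]


def latin1_hex(ch: str) -> str:
    """Return the single ISO-8859-1 byte for one character as uppercase hex.

    latin1_hex is a hypothetical helper name, not part of any library.
    """
    return ch.encode("iso-8859-1").hex().upper()


for ch in samples:
    # Each character encodes to exactly one byte in ISO-8859-1.
    print(latin1_hex(ch), ch)
```

Every character in the list produces exactly one byte, which is the defining property of a single-byte encoding.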

The UTF-8 encoding standard provides a multi-byte representation of characters in all language groups (over 650 languages). Depending on the character, UTF-8 uses one, two, three, or four bytes.
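The variable byte length can be observed directly. The sketch below, in Python, counts the UTF-8 bytes for four sample characters. The euro sign and the emoji do not appear elsewhere in this overview; they are chosen here only as assumed examples of three-byte and four-byte characters, and utf8_length is a hypothetical helper name.

```python
def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 uses for one character.

    utf8_length is a hypothetical helper name used for illustration.
    """
    return len(ch.encode("utf-8"))


# One sample per byte length: A (1), ß (2), € (3), 😀 (4).
for ch in ["A", "ß", "€", "😀"]:
    print(f"U+{ord(ch):04X} {ch} uses {utf8_length(ch)} byte(s)")
```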

The following table shows examples of characters encoded in UTF-8. In this table:

  • The first two characters have encodings that are identical to their ISO-8859-1 encodings.
  • The next five characters are valid ISO-8859-1 characters, but UTF-8 encodes them with two bytes where ISO-8859-1 uses one byte.
  • The remaining four characters are encoded with two bytes in UTF-8 and are NOT part of the ISO-8859-1 character set.

For a complete list of UTF-8 character encodings refer to: http://unicode.org/.

Hex Value   Display Character   Description
30          0                   The number zero
41          A                   Capital letter A
C2 A5       ¥                   YEN SIGN
C2 BF       ¿                   INVERTED QUESTION MARK
C3 9F       ß                   LATIN SMALL LETTER SHARP S
C3 A6       æ                   LATIN SMALL LIGATURE AE
C3 BF       ÿ                   LATIN SMALL LETTER Y WITH DIAERESIS
D1 88       ш                   CYRILLIC SMALL LETTER SHA
D1 89       щ                   CYRILLIC SMALL LETTER SHCHA
D1 8A       ъ                   CYRILLIC SMALL LETTER HARD SIGN
D1 8B       ы                   CYRILLIC SMALL LETTER YERU
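The three groups of characters in the table above can be compared side by side. The following Python sketch encodes one character from each group in both encodings. It assumes Python 3.8 or later (for the separator argument of bytes.hex), and hex_bytes is a hypothetical helper name; the "-" placeholder for an unencodable character is likewise a convention chosen for this example.

```python
def hex_bytes(ch: str, encoding: str) -> str:
    """Return the encoded bytes as space-separated uppercase hex.

    Returns "-" when the character does not exist in the target encoding.
    hex_bytes is a hypothetical helper name used for illustration.
    """
    try:
        return ch.encode(encoding).hex(" ").upper()
    except UnicodeEncodeError:
        return "-"


# One character from each group: identical, two bytes vs. one, UTF-8 only.
for ch in ["A", "¥", "ш"]:
    print(ch, "UTF-8:", hex_bytes(ch, "utf-8"),
          "ISO-8859-1:", hex_bytes(ch, "iso-8859-1"))
```

The Cyrillic letter has no ISO-8859-1 byte at all, so the helper reports "-" for it rather than a hex value.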