Encoding of XML Files

Unicode files can begin with a byte order mark (BOM). The BOM occupies the first bytes of the file and identifies the Unicode encoding. The BOM codes are:

EF BB BF    UTF-8
FF FE       Little Endian
FE FF       Big Endian
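
As an illustration of the table above, the following Python sketch checks the first bytes of a file for a BOM. The function name detect_bom is hypothetical, and the sketch assumes that "Little Endian" and "Big Endian" refer to UTF-16, which the table itself does not state.

```python
# Minimal sketch: map the BOM bytes from the table to a Python codec name.
# Assumption: the two-byte BOMs denote UTF-16 (little and big endian).
BOMS = [
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xff\xfe", "utf-16-le"),
    (b"\xfe\xff", "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) if no BOM is present."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return encoding, len(bom)
    return None, 0
```

A caller would read the first few bytes of the file in binary mode and pass them to detect_bom; a result of (None, 0) corresponds to the "file without BOM" case.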

For files without a BOM, the XML parser automatically assumes UTF-8 encoding (no codepage specification). If the file uses a codepage-based encoding, it must begin with an XML declaration containing the codepage specification. The following code sample shows the XML declaration for the ISO 8859-1 (Western European) character set.

<?xml version="1.0" encoding="ISO-8859-1"?>

XML files with a BOM should not contain an encoding declaration, since the encoding is already defined by the BOM. Contradictory specifications in the BOM and the encoding declaration result in an error when the file is parsed.
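
The rules described in this section can be sketched in Python as a pre-check to run before a file is handed to the parser. This is not the parser's actual implementation: the function name resolve_encoding and the regular expression are hypothetical, the two-byte BOMs are assumed to mean UTF-16, and a declaration that agrees with the BOM is assumed to be tolerated.

```python
import codecs
import re

# Assumption: the two-byte BOMs denote UTF-16 (little and big endian).
_BOMS = [
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xff\xfe", "utf-16-le"),
    (b"\xfe\xff", "utf-16-be"),
]

# Hypothetical pattern for the encoding attribute of a leading XML declaration.
_DECL = re.compile(
    r'^<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def resolve_encoding(data: bytes) -> str:
    """Return the encoding to use, or raise ValueError on a contradiction."""
    bom_encoding, offset = None, 0
    for bom, name in _BOMS:
        if data.startswith(bom):
            bom_encoding, offset = name, len(bom)
            break

    # Decode only the start of the file to look for an XML declaration.
    # Without a BOM the declaration is ASCII-compatible, so latin-1 is safe.
    head = data[offset:offset + 200].decode(bom_encoding or "latin-1", "replace")
    match = _DECL.match(head)
    declared = match.group(1).lower() if match else None

    if bom_encoding is None:
        # No BOM: use the declared codepage, otherwise default to UTF-8.
        return declared or "utf-8"

    if declared is not None:
        try:
            normalized = codecs.lookup(declared).name
        except LookupError:
            normalized = declared
        # "utf-16" is accepted for either UTF-16 BOM; anything else must match.
        if not bom_encoding.startswith(normalized):
            raise ValueError(
                f"BOM indicates {bom_encoding}, declaration says {declared}"
            )
    return bom_encoding
```

Running such a check first lets a tool report a contradictory file with a clear message instead of a generic parse error.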