Managing the code page of published content

When a user publishes Component Presentations and Pages, the published items include code page values. These values are initially taken from the Default Code Page setting of the Publication Target. You can override this setting in your Template. You can also map the code page to a Java character set name.

Setting the Default Code Page
By default, the Default Code Page is set to System Default, which is the code page specified by the Windows operating system of the machine from which you are publishing. However, if you publish from a Content Manager that uses one code page to a Presentation Server whose Web server uses a different code page, characters may be lost during publishing. In such situations, set the Default Code Page to the code page of the Web server(s) to which you are publishing.
  1. Open Content Manager Explorer.
  2. Select Administration in the navigation pane.
  3. Expand the Publishing Management > Publication Targets node.
  4. Open a Publication Target.
  5. Set the Default Code Page to the code page of the Web server(s) to which you are publishing.
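The character loss described above can be reproduced outside the product. The following is a minimal Java sketch, not part of the Content Manager: the class name CodePageLossDemo and its roundTrip helper are hypothetical. It encodes text to a target character set and decodes it again, so that any character the target cannot represent becomes visible as a replacement character.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it illustrates character loss in general,
// not the behavior of any specific publishing component.
public class CodePageLossDemo {

    // Encode text to the target character set, then decode it again.
    // String.getBytes replaces characters that the character set cannot
    // represent with its replacement bytes ('?' for ISO-8859-1).
    static String roundTrip(String text, Charset target) {
        byte[] encoded = text.getBytes(target);
        return new String(encoded, target);
    }

    public static void main(String[] args) {
        // "cafe" with an accented e, a space, and a euro sign, written with
        // Unicode escapes so the source file encoding does not matter.
        String original = "caf\u00e9 \u20ac";

        // ISO-8859-1 contains the accented e but not the euro sign,
        // so the euro sign does not survive the round trip.
        String latin1 = roundTrip(original, StandardCharsets.ISO_8859_1);
        System.out.println("ISO-8859-1 preserved the text: " + latin1.equals(original));

        // UTF-8 can represent every Unicode character, so nothing is lost.
        String utf8 = roundTrip(original, StandardCharsets.UTF_8);
        System.out.println("UTF-8 preserved the text: " + utf8.equals(original));
    }
}
```

A check of this kind can help you decide whether the code page of the target Web server covers the characters in your content.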
Setting the code page of your publishable content
If you do not use the System Default, you must set the code page of your publishable content in your templates or globally in your web.config to ensure that Web browsers correctly interpret the rendered content.
  • To set the code page of your publishable content in your Page Templates and Component Templates, ensure that your templates produce the following line at the top of the publishable piece of content you generate (in Page Templates, before any <html> tags):
    <%@page pageEncoding="UTF-8" %>

    where UTF-8 sets the code page to UTF-8. Set the value of pageEncoding to the value you intend to use.

  • To set the code page of your publishable content globally in your web.config file, open the file in a text editor and in the <configuration><system.web> subsection configure:
    <globalization fileEncoding="UTF-8" requestEncoding="UTF-8" responseEncoding="UTF-8" />
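To see why the declared encoding must match the encoding in which the content was actually written, consider the following Java sketch. It is an illustration only: EncodingMismatchDemo and decodeAs are hypothetical names. It decodes UTF-8 bytes with a different character set, which is roughly what a browser does when it is told, or assumes, the wrong code page.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it shows the effect of an encoding mismatch.
public class EncodingMismatchDemo {

    // Encode the text as UTF-8, then decode the bytes using the
    // character set that the reader assumes.
    static String decodeAs(String text, Charset assumed) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, assumed);
    }

    public static void main(String[] args) {
        // A single accented e (U+00E9), which UTF-8 stores as two bytes.
        String original = "\u00e9";

        // Read as ISO-8859-1, each of the two bytes becomes its own character.
        String misread = decodeAs(original, StandardCharsets.ISO_8859_1);
        System.out.println("Characters when misread: " + misread.length());

        // Read as UTF-8, the original single character is recovered.
        String correct = decodeAs(original, StandardCharsets.UTF_8);
        System.out.println("Characters when read correctly: " + correct.length());
    }
}
```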
Mapping the code page of published content to a Java character set name
When the Content Deployer retrieves the published Component Presentations and Pages, it translates the character encodings it finds into Java character set names using a mapping file that is located in the Content Broker JAR file. This mapping file contains the most commonly used code pages. If this mapping file does not include the code page you are using, or if it maps your code page to the wrong Java character set name, you can add a custom mapping file.
To override or add mappings:
  1. Create or open the Java properties file codepage_encoding.properties.
  2. In codepage_encoding.properties, specify name=value pairs where:
    • name is the code page number
    • value is the Java character set name
    The following example maps Microsoft code page 1200 (which represents UTF-16) to the Java equivalent (which is the character set UTF-16):
    1200=UTF-16
    Start a new line for each mapping. To add a comment, start the line with "#". Note that the mapping example above is for illustration purposes only and is already part of the default mappings.
  3. Put codepage_encoding.properties on the classpath of the Content Deployer, for example in the same location as the configuration files.
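How a mapping of this kind resolves to a Java character set can be sketched as follows. This is an illustration under stated assumptions, not the Content Deployer's actual lookup code: the class CodePageMappingDemo and its load and resolve methods are hypothetical, and the properties text is read from a string rather than from the classpath to keep the example self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.util.Properties;

// Hypothetical demo class; the real lookup is internal to the product.
public class CodePageMappingDemo {

    // Parse properties text in the same name=value format as
    // codepage_encoding.properties.
    static Properties load(String text) {
        Properties mappings = new Properties();
        try {
            mappings.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return mappings;
    }

    // Look up a code page number and convert the mapped name to a Charset.
    // Returns null when the code page has no mapping.
    static Charset resolve(Properties mappings, int codePage) {
        String name = mappings.getProperty(Integer.toString(codePage));
        return name == null ? null : Charset.forName(name.trim());
    }

    public static void main(String[] args) {
        String fileContent =
                "# code page number = Java character set name\n"
              + "1200=UTF-16\n";
        Properties mappings = load(fileContent);

        System.out.println(resolve(mappings, 1200));  // prints UTF-16
        System.out.println(resolve(mappings, 99999)); // prints null (no mapping)
    }
}
```

Charset.forName throws an exception if the name on the right-hand side is not a character set that the JVM supports, so running a check like this against your own file is one way to catch a mistyped character set name before deployment.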