XML v2 custom settings

The XML 2 custom file type has the *.xml extension.

Detection

When configuring the Parser section of the XML 2 custom file type, you can edit all the Parser rules manually from scratch, or adjust the Parser rules from an uploaded file and preview the results.

Setting | Instruction
Root element names | Type an element name and select the plus sign.
xsi:schemaLocation URIs | Type a declaration.
DOCTYPE declaration element names | Type a declaration.
Namespace declaration (xmlns) URIs | Type a namespace declaration and select the plus sign.
XPath rules | Type an XPath rule and select the plus sign.

Note that if you upload an *.xml file using the Using rules generation with a dynamic preview option, the fields are automatically populated with information from the *.xml file, but you can still add or delete values.
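
As a rough illustration of what these detection settings describe, the following Python sketch checks a document against a root element name, a namespace URI, and an XPath rule. It is not the product's detection logic: the function name `matches_detection` and the sample values are hypothetical, only the root element's namespace is inspected, and the standard library's ElementTree supports a limited XPath subset.

```python
import xml.etree.ElementTree as ET

def matches_detection(xml_text, root_names=(), namespace_uris=(), xpath_rules=()):
    """Return True if the document meets any of the given (hypothetical) criteria."""
    root = ET.fromstring(xml_text)
    # ElementTree stores a namespaced tag as "{uri}local"
    if root.tag.startswith("{"):
        uri, local = root.tag[1:].split("}", 1)
    else:
        uri, local = "", root.tag
    if local in root_names:
        return True
    # Simplification: only the namespace of the root element is checked
    if uri and uri in namespace_uris:
        return True
    # XPath rules are evaluated relative to the root element
    return any(root.find(rule) is not None for rule in xpath_rules)

sample = '<catalog xmlns="http://example.com/catalog"><item>Lamp</item></catalog>'
print(matches_detection(sample, root_names=("catalog",)))
```

A document that matches none of the criteria returns False, which in this sketch stands for "not recognized as this file type".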

Parser

Operation | Instruction
Add a rule manually | Parsing rules define how elements are handled. To add a rule:
  1. Select Add New Rule.
  2. Under Rule, select either XPath rules or Element rules, and then select By manually defining the rules. If you selected XPath rules, enter a value in the XPath field. If you selected Element rules, enter an element and its attribute.
  3. Under Properties, select Basic Settings and edit the following:
    • Translate: Yes (default), No, Inherit.
    • Tag Type: Structure (default), Inline.
    • Whitespace: Inherit from Parent (default), Always preserve, Always normalize, Normalize unless xml:space='preserve'.
  4. Under Properties, select Advanced Settings and edit the following:
    • SID XPATH - Leave the field empty, as this option is only used by WorldServer. For more information about SID, consult the WorldServer documentation.
    • Segmentation Hint - This option is available only if you work with inline tags and only if you specified the extraction rule (segmentation hint) in the Embedded Content section of your file type. Select one of the available options: May Exclude (default), Include, Include With Text, Exclude.
    • Length restrictions - Specify a minimum length and a maximum length.
  5. This option is available only if you work with inline tags. Under Formatting, specify: the size, the color, the position (Inherit, Normal, Superscript, Subscript), and the style (Bold, Italic, Strikethrough, Underline).
  6. This option is available only if you work with structure tags. Under Structure Information:
    1. Select Add new.
    2. Under Properties, select one of the available structure elements from the Name list, and then specify a code, an identifier, a description, and a color.
    3. Under Formatting, specify: the size, the color, the position (Inherit, Normal, Superscript, Subscript), and the style (Bold, Italic, Strikethrough, Underline).
  7. Select Save.
Add a rule based on an uploaded *.xml file (maximum 1 MB) | Parsing rules define how elements are handled. To add a rule:
  1. Select Add New Rule.
  2. Under Rule, select XPath rules, and then select Using rules generation with a dynamic preview. Save the file type and consult the Preview tab, which is populated with the information and default rules from your uploaded *.xml file. In the file preview, you can perform several operations:
    • Hover over a start tag to have the matching end tag highlighted automatically. The rule opens on the right-hand side of the Preview tab, with the default rules pre-populated.
    • Hover over a start tag to view its most important information in an inline text box.
    • Consult the translatable text displayed in bold and the untranslatable text displayed in gray.
    • Continue adding, editing, or deleting the rules, as instructed below.
  3. Under Properties, select and edit the basic settings:
    • Translate: Yes (default), No, Inherit.
    • Tag Type: Structure (default), Inline.
    • Whitespace: Inherit from Parent (default), Always preserve, Always normalize, Normalize unless xml:space='preserve'.
  4. Under Properties, select Advanced Settings and edit the following:
    • SID XPATH - Leave the field empty, as this option is only used by WorldServer. For more information about SID, consult the WorldServer documentation.
    • Segmentation Hint - This option is available only if you work with inline tags and only if you specified the extraction rule (segmentation hint) in the Embedded Content section of your file type. Select one of the available options: May Exclude (default), Include, Include With Text, Exclude.
    • Length restrictions - Specify a minimum length and a maximum length.
  5. This option is available only if you work with inline tags. Under Formatting, specify: the size, the color, the position (Inherit, Normal, Superscript, Subscript), and the style (Bold, Italic, Strikethrough, Underline).
  6. This option is available only if you work with structure tags. Under Structure Information:
    1. Select Add new.
    2. Under Properties, select one of the available structure elements from the Name list, and then specify a code, an identifier, a description, and a color.
    3. Under Formatting, specify: the size, the color, the position (Inherit, Normal, Superscript, Subscript), and the style (Bold, Italic, Strikethrough, Underline).
  7. When finished, select Done and then select Save.
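
To make the Translate setting more concrete, here is a minimal Python sketch of how XPath rules with Yes, No, and Inherit values could be resolved for each element. It is an illustration under stated assumptions, not the product's parser: the function `translatable_elements` is hypothetical, elements without a rule are treated as Inherit, later rules override earlier ones, and an element with nothing to inherit from falls back to Yes.

```python
import xml.etree.ElementTree as ET

def translatable_elements(xml_text, rules):
    """rules: list of (xpath, translate), where translate is "Yes", "No" or "Inherit"."""
    root = ET.fromstring(xml_text)
    # ElementTree has no parent pointers, so build a child -> parent map
    parent_of = {child: parent for parent in root.iter() for child in parent}

    setting = {}
    for xpath, translate in rules:
        for element in root.findall(xpath):
            setting[element] = translate  # assumption: later rules win

    def resolve(element):
        value = setting.get(element, "Inherit")  # assumption: no rule means Inherit
        if value != "Inherit":
            return value == "Yes"
        parent = parent_of.get(element)
        # Assumption: with no parent to inherit from, use the default "Yes"
        return True if parent is None else resolve(parent)

    # Tags of all elements whose resolved Translate value is Yes, in document order
    return [element.tag for element in root.iter() if resolve(element)]

document = "<doc><meta><id>42</id></meta><body><p>Hello</p></body></doc>"
rules = [("./meta", "No"), ("./body/p", "Yes")]
print(translatable_elements(document, rules))
```

In this sample, `id` has no rule of its own, so it inherits No from `meta` and is excluded.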

Writer settings

Setting | Instruction
Unicode UTF-8 byte order mark (BOM) | From the list, choose an option to determine how the BOM is handled during translation:
  • Preserve; don't add if not originally present
  • Preserve; add if not originally present
  • Remove if present
Values of xml:lang and lang attribute | From the list, choose an option to determine how 'lang' is handled during translation:
  • Change matching source language to target language
  • Always change to target language
  • Do not change
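
The three BOM options can be pictured with a short Python sketch. The option keys and the function `apply_bom_option` are hypothetical, and the mapping from each option to its behavior is one reading of the option names rather than documented behavior.

```python
import codecs

def apply_bom_option(data: bytes, option: str) -> bytes:
    """Return the bytes to write for a UTF-8 file under the chosen BOM option."""
    had_bom = data.startswith(codecs.BOM_UTF8)
    body = data[len(codecs.BOM_UTF8):] if had_bom else data
    if option == "preserve_no_add":   # keep a BOM only if the source had one
        return data
    if option == "preserve_add":      # always write a BOM
        return codecs.BOM_UTF8 + body
    if option == "remove":            # never write a BOM
        return body
    raise ValueError(f"unknown option: {option}")

with_bom = codecs.BOM_UTF8 + b"<doc/>"
without_bom = b"<doc/>"
print(apply_bom_option(without_bom, "preserve_add"))
```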

Whitespace settings

Setting | Instruction
Whitespace in content | Decide how whitespace is treated during translation: Normalize unless xml:space='preserve'; Always preserve; Always normalize.
Normalize whitespace in tags | Select the check box to ensure that whitespace is normalized in tags.
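
The difference between the three whitespace options can be sketched in Python as follows. This is only an illustration: the mode keys and the function `treat_whitespace` are hypothetical, and "normalize" is assumed here to mean collapsing each run of whitespace to a single space and trimming the ends.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:space attribute under the XML namespace URI
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"

def normalize(text):
    # Assumption: normalizing collapses whitespace runs and trims the ends
    return re.sub(r"\s+", " ", text).strip()

def treat_whitespace(xml_text, mode):
    """Return {tag: text} after applying the chosen (hypothetical) mode key."""
    result = {}

    def walk(element, preserve):
        # xml:space applies to an element and its descendants until overridden
        declared = element.get(XML_SPACE)
        if declared is not None:
            preserve = declared == "preserve"
        text = element.text or ""
        keep = mode == "always_preserve" or (mode == "normalize_unless_preserve" and preserve)
        result[element.tag] = text if keep else normalize(text)
        for child in element:
            walk(child, preserve)

    walk(ET.fromstring(xml_text), False)
    return result

document = '<doc><p>  Hello   world </p><pre xml:space="preserve">  a  b </pre></doc>'
print(treat_whitespace(document, "normalize_unless_preserve"))
```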

Namespace settings

Setting | Instruction
Namespace resolution | Decide how you want to use namespaces: Use namespaces when declared in document, Always use prefix even when namespace is declared.
Namespaces | Add a prefix and a URI for your namespace, and then select the plus sign.
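
A prefix and URI pair is what lets an XPath rule refer to namespaced elements. The Python sketch below shows the general idea with the standard library: the prefix used in the path is resolved through the mapping, so it does not have to match the prefix used inside the document. The mapping, the function `find_texts`, and the sample URI are hypothetical, and how the product itself resolves prefixes is not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical prefix-to-URI mapping, comparable to one row of the Namespaces setting
NAMESPACES = {"bk": "http://example.com/books"}

def find_texts(xml_text, path, namespaces=NAMESPACES):
    """Return the text of every element matched by a namespace-aware path."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.findall(path, namespaces)]

# The document uses the prefix "b", while the path uses "bk"; both map to the same URI
document = (
    '<lib xmlns:b="http://example.com/books">'
    "<b:title>Dune</b:title><title>Untitled</title>"
    "</lib>"
)
print(find_texts(document, "./bk:title"))
```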

Validation

Setting | Instruction
Perform schema and DTD validation during file detection | Decide how schemas and DTD files are validated:
  • Treat all validation warnings as file parsing errors
  • Report warning if no DTD/schema can be found
Perform schema validation when verifying translation
Manually specify schema | Enable this option to specify schemas manually. Decide how the schemas are used:
  • Use for all XML documents
  • Use only for documents which do not specify DTD/schema
Master Schemas | Keep the default values or add more files.
Dependency Schemas and DTDs | Keep the default values or add more files.

Entity settings

Setting | Instruction
Enable entity conversion | Entity settings specify whether special characters are converted to their corresponding HTML entities. Select or clear the check box depending on whether you want entity conversion enabled.
Convert numeric entity references to inline placeholder tags | Select or clear the check box depending on whether you want numeric entity conversion enabled.
Skip conversion inside locked content | Select or clear the check box depending on whether you want conversion skipped for locked content.
Add an entity | Select the plus sign, add the character, and decide whether you need the following check boxes enabled:
  • READ AS CHARACTER - When enabled, this setting specifies which entities are converted to their respective characters during parsing.
  • WRITE AS ENTITY - When enabled, this setting specifies which characters are converted to their respective entities during writing.
Edit an entity | Adjust the given values.
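
The two check boxes act in opposite directions, which the following Python sketch tries to show. The entity table, the flag names, and the functions `read_text` and `write_text` are hypothetical and only model the idea that one flag applies when parsing and the other when writing.

```python
# Hypothetical entity table; each row mirrors one entity with its two check boxes
ENTITIES = [
    {"name": "copy", "char": "\u00a9", "read_as_character": True, "write_as_entity": True},
    {"name": "nbsp", "char": "\u00a0", "read_as_character": True, "write_as_entity": False},
]

def read_text(text):
    """Parsing direction: replace flagged entities with their characters."""
    for entity in ENTITIES:
        if entity["read_as_character"]:
            text = text.replace(f"&{entity['name']};", entity["char"])
    return text

def write_text(text):
    """Writing direction: replace flagged characters with their entities."""
    for entity in ENTITIES:
        if entity["write_as_entity"]:
            text = text.replace(entity["char"], f"&{entity['name']};")
    return text

parsed = read_text("&copy; 2024&nbsp;Acme")
written = write_text(parsed)
print(repr(parsed), repr(written))
```

In this sample the non-breaking space is read as a character but stays a character on output, because its WRITE AS ENTITY flag is off.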

Embedded Content

Setting | Instruction
Process embedded content | Select the check box to enable the processing of embedded content. Then, specify one processing method: Inside CDATA element with; Defined by parser rules; Defined by document structure information.
Inside CDATA element with | CDATA stands for character data and refers to a portion of element content that is marked to be interpreted literally, as textual data, rather than as marked-up content. If you enable this option, the following embedded content processor interprets the element: Embedded Content Plain Text v 1.0.0.0.
Defined by parser rules | To add a parser rule:
  1. From the PARSER RULE and EMBEDDED PROCESSOR ID lists, select one of the available values.
  2. Select the plus sign.
Defined by document structure information - Document Structure Information | From the Document Structure Information list, select one of the available values and then select the plus sign.
Defined by document structure information - Tag definition rules | If you choose the tag element from the Document Structure Information list above, add a tag definition rule:
  1. Select Add New Rule.
  2. From the Tag Type list, choose a value: Placeholder or Tag Pair.
  3. In the Regular expression field, enter the regular expression.
  4. Select the Ignore Case check box to ignore the case of the identified content. Otherwise, keep the default value (check box cleared).
  5. From the Segmentation Hint list, choose a value to determine how segmentation is performed: May Exclude (default), Include, Include With Text, Exclude. The value you specify here is available when you configure a parser rule under Parser > Add New Rule > Properties > Advanced Settings > Segmentation Hint. Check this topic to learn what each segmentation hint does.
  6. Select Save.
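
To show what a tag definition rule with a regular expression and the Ignore Case option could match, here is a small Python sketch. The function `find_placeholders` and the sample pattern are hypothetical, Python's `re` syntax may differ from the regular expression flavor the product uses, and only the Placeholder tag type is modeled.

```python
import re

def find_placeholders(text, pattern, ignore_case=False):
    """Return every substring that the rule's regular expression would mark as a placeholder."""
    flags = re.IGNORECASE if ignore_case else 0
    return [match.group(0) for match in re.finditer(pattern, text, flags)]

embedded = "Hello {NAME}, see item {id}."
pattern = r"\{[a-z]+\}"  # hypothetical rule: lowercase letters inside curly braces

print(find_placeholders(embedded, pattern))
print(find_placeholders(embedded, pattern, ignore_case=True))
```

With Ignore Case cleared, only `{id}` matches this pattern. With it selected, `{NAME}` matches as well.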