Configuring the legacy embedded content processor
The Embedded content (Legacy) page is available for file types that still use the legacy embedded content processor. This is a generic processor that relies on the file type parser to extract embedded content and does not differentiate between types of embedded content. As a result, you cannot specify custom extraction and display settings for different types of embedded content.
The legacy embedded content processor is available for the following file types:
Microsoft Excel (all versions)
Java Resources
XML: Any XML
New XML (Legacy Embedded Content) file types
| Enable embedded content processing | Select to enable processing of embedded content. |
Document structure information
Use the Add... button to create extraction rules based on document structure information. Make sure that the document structure information you specify here is covered by a parser rule on the Parser page of your file type. Studio can only extract embedded content that is recognized by the file type parser.
Tag definition rules
| Tag Type |
| Start Tag Expression (Placeholder) | This is a regular expression that identifies embedded content and converts each occurrence to a placeholder tag. For example, to convert all HTML |
| Start Tag Expression and End Tag Expression (Tag Pair) | These are regular expressions that identify embedded content by its start and end tags. The start and end tags may enclose some content or none. The processor tries to match the tag pair before it tries to match each tag expression individually. That is, it first looks for any section of text that starts with the Start Tag Expression and ends with the End Tag Expression, and only then tries to match individual start and end tags.
For example, to identify all HTML <tr>...</tr> (table row) tag pairs, enter: |
| Ignore case | Select this option to ignore the letter case of your defined tags when embedded content is identified. |
| Translate | Specifies whether the text within a tag pair is translatable or not translatable. Not translatable means that the content between the tag pair is displayed to the translator as locked content. Placeholder tags are Not translatable. |
| Formatting | You can edit how the embedded content will be displayed in the Editor view. |
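The matching order described for tag pairs can be illustrated with a minimal Python sketch. This is not Studio's implementation: the function name `find_embedded_tags`, its parameters, and the sample expressions are assumptions made for illustration, and Studio's own regular expression engine may differ in syntax details. The sketch looks for complete pairs first and only then scans the remaining text for individual start and end tags, with an optional ignore-case switch.

```python
import re

def find_embedded_tags(text, start_expr, end_expr, ignore_case=False):
    """Return (kind, matched_text) tuples, trying tag pairs before single tags.

    Hypothetical helper for illustration only; not part of any Studio API.
    """
    flags = re.IGNORECASE if ignore_case else 0
    # A pair is the start expression, the shortest possible content, then the end expression.
    pair = re.compile(f"(?:{start_expr})(.*?)(?:{end_expr})", flags | re.DOTALL)
    # An individual tag is anything matching either expression on its own.
    single = re.compile(f"(?:{start_expr})|(?:{end_expr})", flags)

    results = []
    pos = 0
    for m in pair.finditer(text):
        # Text before this pair is scanned for unpaired start or end tags.
        results += [("single", s.group(0)) for s in single.finditer(text, pos, m.start())]
        results.append(("pair", m.group(0)))
        pos = m.end()
    # Text after the last pair is scanned for unpaired tags as well.
    results += [("single", s.group(0)) for s in single.finditer(text, pos)]
    return results

sample = "<TR><td>A</td></TR> and a stray </tr>"
print(find_embedded_tags(sample, r"<tr>", r"</tr>", ignore_case=True))
# [('pair', '<TR><td>A</td></TR>'), ('single', '</tr>')]
print(find_embedded_tags(sample, r"<tr>", r"</tr>"))
# [('single', '</tr>')]
```

With case ignored, the uppercase row is recognized as one tag pair and the trailing `</tr>` is left as an individual tag. With case respected, only the lowercase `</tr>` matches.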
Advanced Settings
| Inside text the tag acts as a word end | This option changes the behavior of cursor placement in the Editor window. When selected, the editor treats the tag as a word for the purposes of navigation. For example, in the editor, pressing Ctrl+Left Arrow will move the cursor to the beginning of the tag and Ctrl+Right Arrow will move the cursor to the end of the tag. |
| Text lines can be wrapped after the tag | Selecting this option indicates that a line break after this tag does not indicate the end of a segment. For example: Gather ye rosebuds while ye may, Old Time is still a-flying: And this same flower that smiles to-day To-morrow will be dying. |
| Tags represent formatting only and can be hidden in the editor | When this option is selected, text is formatted correctly and the standard formatting tags (for example, bold, italic, and font type) are not displayed. Selecting this option does not mean that the tag is always hidden; the user can change the editor settings to force the tag to be displayed. |
| Tags represents the text | Placeholder (standalone) tags only. A tag can have a text equivalent. For example, the entity tag |
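The effect of the "Text lines can be wrapped after the tag" option can be pictured with a toy Python model. Everything here is an assumption for illustration: the function `split_segments`, the `wrap_tag_expr` parameter, and the rule that every line break otherwise ends a segment. Segmentation in Studio is governed by its own segmentation rules, which this sketch does not reproduce.

```python
import re

def split_segments(text, wrap_tag_expr=None):
    """Split text into segments at line breaks, except after a wrap-enabled tag.

    Hypothetical toy model; not Studio's segmentation logic.
    """
    segments, current = [], ""
    for line in text.splitlines():
        current += line
        # Assumed rule: a line ending in a wrap-enabled tag continues on the next line.
        if wrap_tag_expr and re.search(f"(?:{wrap_tag_expr})\\s*$", line):
            current += " "
            continue
        segments.append(current)
        current = ""
    if current:
        # The text ended on a wrap-enabled tag; keep what was collected.
        segments.append(current.rstrip())
    return segments

verse = "Gather ye rosebuds while ye may,<br/>\nOld Time is still a-flying:\nAnd this same flower"
print(len(split_segments(verse)))                  # 3
print(len(split_segments(verse, r"<br\s*/?>")))    # 2
```

In the second call the line break after `<br/>` no longer ends the segment, so the first two lines are kept together.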
| Segmentation Hint | A segmentation hint is a property of a tag that helps the software to segment the file better when converting the file to a translatable format: whether to position the tag within a segment or outside of the segment, or to force a segmentation break. Choose one of the following options. |