Configuring the legacy embedded content processor

The Embedded content (Legacy) page is available for file types that still use the legacy embedded content processor. This is a generic processor that relies on the file type parser to extract embedded content and does not differentiate between types of embedded content. As a result, you cannot specify custom extraction and display settings for different types of embedded content.

The legacy embedded content processor is available for the following file types:

Microsoft Excel (all versions)

Java Resources

XML: Any XML

New XML (Legacy Embedded Content) file types

To configure the legacy embedded content processor, use the following options on the Embedded Content page of one of the file types above. These options control how SDL Trados Studio converts the extracted content into tags and whether these tags show up as translatable or non-translatable in the Editor view.
Enable embedded content processing

Select to enable processing of embedded content.

Document structure information

Use the Add... button to create extraction rules based on document structure information. Make sure that the document structure information you specify here is covered by a parser rule on the Parser page of your file type. Studio can only extract embedded content that is recognized by the file type parser.

Tag definition rules

Add tag definition rules to specify how to treat the embedded content defined in the Document structure information box.
Tag Type
Placeholder

Converts embedded content to standalone (placeholder) tags.

Tag Pair

Identifies tag pairs (a start tag and an end tag) in the embedded content.

Start Tag Expression (Placeholder)

This is a regular expression that identifies embedded content, and converts each occurrence to a placeholder tag. For example, to convert all HTML <br> (line break) tags to placeholder tags, enter <br.*?>
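The following is a minimal sketch of what the example expression matches. It uses Python's re module purely for illustration; Studio applies its own regular expression engine, so edge-case behavior may differ. The function name and sample string are hypothetical.

```python
import re

# Pattern taken from the example above.
PLACEHOLDER_PATTERN = re.compile(r"<br.*?>")

def find_placeholders(text):
    """Return every substring that the placeholder expression matches."""
    return [m.group(0) for m in PLACEHOLDER_PATTERN.finditer(text)]

# Hypothetical embedded content containing three line-break variants.
sample = 'First line<br>Second line<br />Third line<br class="x">End'
print(find_placeholders(sample))
```

Note that in this sketch the expression also matches any tag whose name merely begins with "br" (for example, <brand>), so a stricter expression may be needed if such tags can occur in your content.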

Start Tag Expression and End Tag Expression (Tag Pair)

These are regular expressions that identify embedded content by start and end tags. The start and end tags may enclose some content or none.

The processor tries to match the complete tag pair before it tries to match each tag expression on its own. That is, it first looks for any section of text that starts with the Start Tag expression and ends with the End Tag expression, and only then tries to match individual start and end tags.

For example, to identify all HTML <tr>...</tr> (table row) tag pairs, enter:
  • Start Tag: <tr.*?>
  • End Tag: </tr>
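The matching order described above can be sketched as follows. This is an illustration in Python under stated assumptions, not Studio's actual implementation: the classify function, the way the two expressions are combined, and the sample string are all hypothetical.

```python
import re

# Expressions taken from the example above.
START = r"<tr.*?>"
END = r"</tr>"

def classify(text):
    """Return (pairs, lone_tags).

    Whole start...end spans are matched first; individual start or
    end tags are then looked for in whatever text is left over.
    """
    pair_re = re.compile(f"({START})(.*?)({END})", re.DOTALL)
    pairs = [m.group(0) for m in pair_re.finditer(text)]
    remainder = pair_re.sub("", text)
    lone_re = re.compile(f"{START}|{END}")
    lone_tags = [m.group(0) for m in lone_re.finditer(remainder)]
    return pairs, lone_tags

# Two complete rows (one empty) followed by a stray end tag.
sample = "<tr><td>A</td></tr><tr class='x'></tr></tr>"
pairs, lone_tags = classify(sample)
print(pairs)
print(lone_tags)
```

The second row in the sample shows that a pair can enclose no content at all, and the trailing </tr> shows a tag that is only matched individually because it has no partner.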
Ignore case

Select this check box to ignore the letter case of your defined tags when embedded content is identified.
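As a rough illustration of the effect, the sketch below uses Python's re.IGNORECASE flag as a stand-in for the Ignore case option. The is_match helper is hypothetical, and Studio's own engine may differ in detail.

```python
import re

def is_match(pattern, text, ignore_case):
    """Report whether the pattern occurs in the text, optionally ignoring case."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, text, flags) is not None

# An upper-case tag is only found when case is ignored.
print(is_match(r"<br.*?>", "Line one<BR>Line two", ignore_case=False))
print(is_match(r"<br.*?>", "Line one<BR>Line two", ignore_case=True))
```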

Translate

Not translatable means that the content between the tag pairs is displayed to the translator as locked content.

Text within tag pairs can be translatable or non-translatable. Placeholder tags are Not translatable.

Formatting

You can edit how the embedded content will be displayed in the Editor view.

Advanced Settings

The Advanced Settings specify how tags are displayed.
Inside text the tag acts as a word end

This option changes the behavior of cursor placement in the Editor window.

When selected, the editor treats the tag as a word for the purposes of navigation. For example, in the editor, pressing Ctrl+Left Arrow will move the cursor to the beginning of the tag and Ctrl+Right Arrow will move the cursor to the end of the tag.

Text lines can be wrapped after the tag

Selecting this option indicates that a line break after this tag does not indicate the end of a segment. For example:

Gather ye rosebuds while ye may,<br>

Old Time is still a-flying: <br>

And this same flower that smiles to-day <br>

To-morrow will be dying.

Tags represent formatting only and can be hidden in the editor

When this option is selected, text is formatted correctly and the standard formatting tags (for example, bold, italic, and font type) are not displayed.

Selecting this option does not mean that the tag is always hidden; the user can change the editor settings to force the tag to be displayed.

Tags represents the text

Placeholder (standalone) tags only.

A tag can have a text equivalent. For example, the entity tag &quot; has the text equivalent ".
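The idea of a text equivalent can be illustrated with Python's standard html.unescape function, which resolves an HTML entity to the character it stands for. This is only an analogy for the concept; the text_equivalent helper is hypothetical and says nothing about how Studio stores or resolves tag text.

```python
import html

def text_equivalent(entity_tag):
    """Return the literal text that an HTML entity stands for."""
    return html.unescape(entity_tag)

print(text_equivalent("&quot;"))
print(text_equivalent("&amp;"))
```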

Segmentation Hint
A segmentation hint is a property of a tag that helps the software to segment the file better when converting the file to a translatable format: whether to position the tag within a segment or outside of the segment, or to force a segmentation break. Choose one of the following options.
Include

If selected, the tag is displayed in the editor, even if it has no associated text. You would rarely select this option.
Include with text

If selected, when the tag has associated text, the tag is displayed in the editor.

Example: the tag specifies a footnote marker. Where this is the case, the translator needs the ability to move the marker to another word in the same sentence, so the tag should be included with the text.

Exclude

If selected, the software will, where possible, use the tag or tag pair to segment the text. For example, if an XML document includes embedded HTML code and the <p>...</p> and <br> tags are marked Exclude, the software uses those HTML tags to segment the document. This segmentation is in addition to the segmentation that is already applied to the embedding XML code.
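The effect of marking tags as Exclude can be sketched as a split on those tags. The Python example below is a simplified illustration, not Studio's segmentation logic: the expression, the split_on_excluded function, and the sample string are assumptions made for the example.

```python
import re

# Hypothetical expression for tags marked Exclude: <p>, </p> and <br>.
EXCLUDED = re.compile(r"</?p\b[^>]*>|<br\b[^>]*>")

def split_on_excluded(text):
    """Split embedded content at excluded tags and drop empty pieces."""
    parts = EXCLUDED.split(text)
    return [part.strip() for part in parts if part.strip()]

# <b> is not excluded, so it stays inside its segment.
sample = "<p>Keep <b>this</b> together.<br>Next line.</p><p>Last.</p>"
print(split_on_excluded(sample))
```

In this sketch the excluded tags fall outside the resulting segments, while the <b>...</b> pair, which is not excluded, remains inline with its text.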

May exclude, Undefined

These two are effectively the same. The editor determines whether the tag is part of the text.
