Language processing rules

Speed up the process of creating TMs that are similar or identical in terms of the language processing rule they use. By using language processing rules, you can work consistently across your TMs.

How can I use language processing rules?

Language processing rules are language resources which consist of multiple groups containing the rules that can be set up for:
  • Recognizable elements (for example, variables which must not be translated)
  • Segmentation rules, which control how the source content is split into segments and how to treat character sequences that function as a single word (for example, words joined by a hyphen or a dash)
  • Customized language resources (for example, a custom date-time format that you want recognized for a specific target language)

Customized language resources are configured while creating or editing language processing rules.

You select language processing rules when setting up TMs and translation engines, and they must be set at both TM level and translation-engine level. TMs and translation engines use language processing rules for different purposes:
  • In TMs, language processing rules have an impact on how segment matches are identified and retrieved.
  • In translation engines, language processing rules impact reporting (for example, how words are counted) and other language processing.
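As a rough illustration of why recognized elements affect TM matching, the sketch below replaces variables and numbers with placeholders before comparing two segments. It is a simplified model and not how Trados implements matching: the `VARIABLES` list, the `normalize` function, and the `{VAR}` and `{NUM}` placeholders are all assumptions made for this example.

```python
import re

# Hypothetical variable list: terms that should be treated as
# recognizable elements rather than as ordinary translatable text.
VARIABLES = ["Trados Studio", "Trados Online Editor"]


def normalize(segment, variables=VARIABLES):
    """Replace variables and numbers with placeholders (illustrative only)."""
    # Replace longer names first so a shorter name never clips a longer one.
    for name in sorted(variables, key=len, reverse=True):
        segment = segment.replace(name, "{VAR}")
    # Treat digit groups with optional . or , separators as one number.
    return re.sub(r"\d+(?:[.,]\d+)*", "{NUM}", segment)


first = normalize("Open Trados Studio and import 25 files.")
second = normalize("Open Trados Online Editor and import 1,200 files.")

# Both segments reduce to the same pattern, so a TM that recognizes
# these elements could treat them as matching apart from the placeholders.
print(first == second)
```

If the two TMs being compared recognized different variables or number formats, the normalized forms would differ and the same pair of segments would no longer line up.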
The language processing rule selected as part of a translation engine is also used for other functionality in project processing, such as:
  • Segmentation of the source files
  • Word counting
  • Editing content in Trados Studio and Trados Online Editor
RWS strongly recommends that you observe both requirements below:
  1. Use the same language processing rule for all the TMs you include in a translation engine.
  2. Create translation engines which have the same language processing rule as the TMs included in the translation engine.

Failing to do so can cause conflicts in how tokens are recognized and can lead to inconsistent results.
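The two recommendations above amount to a single consistency check: every TM in a translation engine should use the same language processing rule as the engine itself. The sketch below models that check with plain Python data classes. It does not call any Trados API; the `TranslationMemory` and `TranslationEngine` classes, the `find_rule_conflicts` function, and all names in the sample data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TranslationMemory:
    name: str
    rule: str  # name of the language processing rule assigned to this TM


@dataclass
class TranslationEngine:
    name: str
    rule: str  # name of the language processing rule assigned to the engine
    tms: list = field(default_factory=list)


def find_rule_conflicts(engine):
    """Return the names of TMs whose rule differs from the engine's rule."""
    return [tm.name for tm in engine.tms if tm.rule != engine.rule]


# Hypothetical sample data for illustration.
engine = TranslationEngine(
    name="Marketing EN-DE",
    rule="Default rules",
    tms=[
        TranslationMemory("Web copy", "Default rules"),
        TranslationMemory("Legacy manuals", "Custom hyphenation rules"),
    ],
)

conflicts = find_rule_conflicts(engine)
print(conflicts)
```

If every TM matches the engine's rule, the TMs necessarily match each other as well, so an empty result means both recommendations are met.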

What types of language processing rules are there?

When you set up language processing rules, you configure the following elements (language resources):
  • Abbreviations are standardized short strings that replace full words or expressions.
  • Ordinal followers are the symbols used in numbers, figures or dates.
  • Segmentation rules are the rules which define how content is parsed.
  • Variables are words or phrases which are used to capture the names of entities which usually do not require a translation. Examples include product names, institution names, and concepts.
  • Dates & Times refer to the date/time format used in the source text. The format is retained during translation.
  • Numbers refer to the decimal or digit group symbols used when expressing numbers.
  • Measurements refer to the standard measurement units which do not get translated.
  • Currency refers to the standard currency symbols which do not get translated.
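To show how two of these resources can interact, the sketch below splits text into segments at sentence-ending punctuation but skips the break when the preceding token is a listed abbreviation. It is a deliberately simplified model, not the segmentation engine Trados uses: the `ABBREVIATIONS` set and the `segment` function are assumptions made for this example.

```python
import re

# Hypothetical abbreviation list; a real rule set would be language-specific.
ABBREVIATIONS = {"Dr.", "e.g.", "etc."}


def segment(text, abbreviations=ABBREVIATIONS):
    """Split after ., ! or ? followed by whitespace, except after abbreviations."""
    segments = []
    start = 0
    for match in re.finditer(r"[.!?]\s+", text):
        candidate = text[start:match.end()].strip()
        last_token = candidate.split()[-1]
        if last_token in abbreviations:
            # The period belongs to an abbreviation, so do not break here.
            continue
        segments.append(candidate)
        start = match.end()
    # Whatever remains after the last break is the final segment.
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments


print(segment("Dr. Smith arrived. He was late."))
print(segment("Dr. Smith arrived. He was late.", abbreviations=set()))
```

With the abbreviation list in place, "Dr." stays attached to the sentence it introduces; with an empty list, the same text breaks after "Dr.", which changes both the segments and any counts derived from them.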
