Importing termbase entries

When you import termbase data, you populate your termbase with terminology. Cross-references created in MultiTerm are preserved when you import them into Trados Studio cloud capabilities.

Before you begin

The termbase import is successful only if you meet the following requirements:
  • The import file has one of the accepted formats: *.tbx, MultiTerm *.xml, *.xlsx, *.csv.
  • The *.xlsx or *.csv import files are structured correctly against the target termbase. Familiarize yourself with the rules for structuring *.xlsx or *.csv files before the import.
  • The import file and the termbase into which you import have the same mandatory fields, the same format/structure, and at least one common language pair.

Rules for structuring *.xlsx or *.csv files correctly

If you want to import an *.xlsx or *.csv file into your termbase, make sure you follow all the guidelines below.
1. Structure and languages

The import file and the termbase into which you import must have the same mandatory fields, the same format/structure, and at least one common language pair.

What happens if the import file has more languages than the termbase?

Entries for languages that are not specified in the termbase are not imported.

2. Headers and columns - introduction and example

This is the most sensitive part of structuring your import file. In this part, we will look at a simple structure and an example.

Before we begin, we must remember that:
  • Any termbase is a collection of concepts. The equivalent of the term "concept" in Trados Studio cloud capabilities is "entry" in MultiTerm.
  • Each concept is a multi-level entity consisting of an entry which:
    • Is translated in several languages
    • Has a term for each language
    • Can have various fields (attributes) at entry level, at language level, or at term level
  • Any *.xlsx or *.csv import file consists of several columns (fields) and several rows (values). The columns have headers, and these correspond to the levels discussed earlier: entry, language, term, and the attributes which can be present at any of the three levels. When you name your column headers, every language-level column must have a prefix in its name. The default language prefix (which the system detects automatically upon import) is >>L<<.
To clarify all these elements, let's consider the most common header columns and the things to be careful about:
  • ID - You can have a column where each entry has a unique ID.
  • Entry-level fields - You can have an entry-level field like "Definition". Do not add the language prefix (">>L<<") in the header name. You can have an optional custom prefix of your choice, but it is not necessary.
  • Language-level fields - You can have language-level fields like ">>L<<[language_name]" (mandatory) or ">>L<<[language_attribute]" (optional). You can have as many language-level fields as you have required languages. All language-level fields must have the >>L<< prefix in the header name.
  • Term-level fields - You can have term-level fields defined for various purposes: "Synonyms", "Special notes" etc. Do not add the language prefix (">>L<<") in the header name. You can have an optional custom prefix of your choice, but it is not necessary.
When you import a file you can specify prefixes for each header level (as can be seen below under Advanced settings), which means that you instruct the system to accurately match the headers in your import file with the correct level. However, note that the only mandatory prefix is the language-level one. If your language-level prefix is not >>L<<, use the Advanced settings section to change it to match the one you use in your import file.

Let's look at an example.

You have a termbase (en-fr) for which you defined the following fields and structure:
  • A Definition field (text) at entry level
  • A Part of Speech field (picklist) at language level
  • A Used field (Boolean) at term level. Here, you want to mark whether a specific term should be used (TRUE) or not (FALSE)
You want to import an *.xlsx file into your termbase to populate it with content. Your *.xlsx file must match your termbase fields and structure. Here is how you could organize your *.xlsx file:
  • ID
  • Definition - This is an entry-level field. Since this is an entry-level field, it does not need a prefix.
  • >>L<<English - This is the English language-level field and it is mandatory. The language prefix (">>L<<") is also mandatory. Any field (language-related or term-related) that comes anywhere between >>L<<English and >>L<<French refers to English entries (columns C, D, E).
    • >>L<<Part of Speech - This is the Part of Speech field for any English entries. Since this is a language-level field, it must have the >>L<< prefix.
    • Used - This is the Used field for any English terms. Since this is a term-level field, it does not need any prefix.
  • >>L<<French - This is the French language-level field and it is mandatory. Any field (language-related or term-related) that comes after >>L<<French refers to French entries (columns F, G, H).
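The column layout described above can be sketched in Python as a *.csv file with the same header order. This is only an illustration: the sample row, the picklist value "noun", and the helper name build_csv are hypothetical and do not come from the product.

```python
import csv
import io

# Header row for the en-fr example. Letters in the comments are spreadsheet columns.
HEADERS = [
    "ID",                   # A: optional concept ID
    "Definition",           # B: entry-level field, no prefix needed
    ">>L<<English",         # C: English language column (prefix is mandatory)
    ">>L<<Part of Speech",  # D: language-level field for English
    "Used",                 # E: term-level field for English terms
    ">>L<<French",          # F: French language column
    ">>L<<Part of Speech",  # G: language-level field for French
    "Used",                 # H: term-level field for French terms
]

# Hypothetical sample content, for illustration only.
ROWS = [
    ["1", "A hand-held pointing device", "mouse", "noun", "TRUE",
     "souris", "noun", "TRUE"],
]


def build_csv(headers, rows):
    """Serialize the header and rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(headers)
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = build_csv(HEADERS, ROWS)
print(csv_text)
```

The fields for each language simply follow that language's column, which is why the same header names (">>L<<Part of Speech", "Used") appear twice.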
This is what the entries look like in the termbase after the import:
3. Headers and columns - guidelines
In this section, we will list all the guidelines about how the import file must be structured at header and column level.
  1. Any ID column is interpreted and matched as the concept ID.
  2. Any column that can be matched to a language or a sub-language will be considered a language for that concept.
  3. There must be at least one language column.
  4. All language-level columns must have a prefix. The >>L<< prefix is recognized by the system by default. If you used another prefix in your import file, you can change the default prefix during import to match the one in your file.
  5. Entry/concept-level and term-level columns do not need a prefix.
  6. All language-level attributes (namely, all attributes for a given language) must sit between 2 language columns.
  7. All term-level attributes (namely, all attributes for a term in a given language) must sit between 2 language columns.
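As a rough sketch of these guidelines, the Python function below groups header names by level. It is not the product's actual matching logic. It assumes a hypothetical set of known language names, and it assumes that unprefixed columns before the first language column are entry-level while unprefixed columns after a language column are term-level fields of that language.

```python
LANGUAGE_PREFIX = ">>L<<"  # default language-level prefix


def group_columns(headers, known_languages, prefix=LANGUAGE_PREFIX):
    """Group header names into ID, entry-level fields, and per-language fields."""
    id_column = None
    entry_fields = []
    languages = []
    for index, header in enumerate(headers):
        name = header.strip()
        bare = name[len(prefix):] if name.startswith(prefix) else None
        if name.upper() == "ID":
            id_column = index
        elif bare is not None and bare in known_languages:
            # A prefixed column that matches a language starts a new language group.
            languages.append(
                {"language": bare, "language_fields": [], "term_fields": []}
            )
        elif bare is not None:
            if not languages:
                raise ValueError(f"language-level field before any language: {name}")
            languages[-1]["language_fields"].append(bare)
        elif languages:
            # Assumption: unprefixed columns after a language column are term-level.
            languages[-1]["term_fields"].append(name)
        else:
            # Assumption: unprefixed columns before the first language are entry-level.
            entry_fields.append(name)
    if not languages:
        raise ValueError("there must be at least one language column")
    return id_column, entry_fields, languages


headers = ["ID", "Definition", ">>L<<English", ">>L<<Part of Speech", "Used",
           ">>L<<French", ">>L<<Part of Speech", "Used"]
id_column, entry_fields, languages = group_columns(headers, {"English", "French"})
print(id_column, entry_fields, languages)
```

If your file uses a different language prefix, pass it as the prefix argument, in the same way that you would change the default prefix during import.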
4. Field values
A field can support one data type at a time. The following data types are available:
  • Number
  • Boolean: true/false/0/1
  • Date/Time: yyyy-MM-dd HH:mm:ss
  • Text: max. 1024 characters
  • Picklist
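To check values before an import, the data types above could be validated with a small Python helper like the one below. This is a sketch under assumptions: the function name is hypothetical, Boolean matching is assumed to be case-insensitive, and picklist values are assumed to be supplied by you from the termbase definition.

```python
from datetime import datetime

MAX_TEXT_LENGTH = 1024
BOOLEAN_VALUES = {"true", "false", "0", "1"}
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"  # Python equivalent of yyyy-MM-dd HH:mm:ss


def is_valid_value(value, data_type, picklist=None):
    """Return True if the cell text fits the given field data type."""
    if data_type == "Number":
        try:
            float(value)
        except ValueError:
            return False
        return True
    if data_type == "Boolean":
        # Assumption: case is ignored, so TRUE and true are both accepted.
        return value.strip().lower() in BOOLEAN_VALUES
    if data_type == "Date/Time":
        try:
            # Note: strptime also accepts values without zero padding.
            datetime.strptime(value, DATE_FORMAT)
        except ValueError:
            return False
        return True
    if data_type == "Text":
        return len(value) <= MAX_TEXT_LENGTH
    if data_type == "Picklist":
        return picklist is not None and value in picklist
    raise ValueError(f"unknown data type: {data_type}")


print(is_valid_value("TRUE", "Boolean"))
print(is_valid_value("2024-01-31 13:45:00", "Date/Time"))
print(is_valid_value("31/01/2024", "Date/Time"))
```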
5. Separators for synonyms and multiple values
The default separators (detected automatically by the system during import) are | for synonyms and / for multiple values. If you used different separators in your import file, during the import process, you can change the default separators to match the ones you used.
6. Handling synonyms
If you have synonyms for a term, then:
  1. Add the synonyms in the same cell as the term.
  2. Separate the synonyms by | , which is the default separator the system recognizes automatically. If your import file uses another separator, then, during import, change the default separator to match the one in your import file.
  3. Make sure the mandatory attributes of your term are also defined for each synonym.
For example, you have a term (EngTerm) which has 2 mandatory attributes:
  • Register: formal, informal
  • Used: true, false

Your term has 2 synonyms: Syn1 (formal, used) and Syn2 (informal, used).

How do you represent this correctly in your import file?

What happens if we add a third synonym (Syn3 which is informal and not used), but do not add its mandatory attributes? The third synonym will not be imported.

This is the incorrect way of entering synonym 3:
This is the correct way of entering synonym 3:
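The rule that a synonym without its mandatory attributes is not imported can be sketched in Python as follows. This sketch assumes that attribute cells list one value per synonym, in the same order as the term cell and with the same | separator; that layout is an assumption and is not confirmed by this page. The attribute values given for EngTerm itself are also hypothetical.

```python
SYNONYM_SEPARATOR = "|"  # default separator for synonyms


def split_synonyms(term_cell, attribute_cells, mandatory,
                   separator=SYNONYM_SEPARATOR):
    """Pair each term or synonym with its attributes and report skipped ones.

    Assumption: attribute cells are aligned by position with the term cell.
    """
    terms = [term.strip() for term in term_cell.split(separator)]
    imported, skipped = [], []
    for position, term in enumerate(terms):
        attributes = {}
        for name, cell in attribute_cells.items():
            values = [value.strip() for value in cell.split(separator)]
            if position < len(values) and values[position]:
                attributes[name] = values[position]
        if all(name in attributes for name in mandatory):
            imported.append((term, attributes))
        else:
            skipped.append(term)
    return imported, skipped


mandatory = ["Register", "Used"]

# Syn3 has no Register or Used value, so it is skipped.
incomplete = {"Register": "formal|formal|informal", "Used": "true|true|true"}
imported, skipped = split_synonyms("EngTerm|Syn1|Syn2|Syn3", incomplete, mandatory)
print([term for term, _ in imported], skipped)

# Every synonym has both mandatory attributes, so nothing is skipped.
complete = {"Register": "formal|formal|informal|informal",
            "Used": "true|true|true|false"}
imported_ok, skipped_ok = split_synonyms("EngTerm|Syn1|Syn2|Syn3", complete, mandatory)
print([term for term, _ in imported_ok], skipped_ok)
```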
7. Handling multiple values
If you have multiple values for a field, then:
  1. Add the multiple values in the same cell.
  2. Separate the multiple values by / , which is the default separator the system recognizes automatically. If your import file uses another separator, then, during import, change the default separator to match the one in your import file.

What happens if your import file contains multiple values for a given field, but the target termbase field does not allow multiple values?

If the field does not allow multiple values, the cell value is split and only the first value is saved, as a single value.
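The behavior described above can be illustrated with a short Python sketch. The function name and the sample values are hypothetical; only the / default separator and the "first value wins" rule come from this page.

```python
MULTI_VALUE_SEPARATOR = "/"  # default separator for multiple values


def parse_field_value(cell, allows_multiple, separator=MULTI_VALUE_SEPARATOR):
    """Split a cell into values; keep only the first if the field is single-valued."""
    values = [value.strip() for value in cell.split(separator) if value.strip()]
    if allows_multiple:
        return values
    return values[:1]


print(parse_field_value("legal/finance/medical", allows_multiple=True))
print(parse_field_value("legal/finance/medical", allows_multiple=False))
```

If your import file uses another separator, pass it through the separator argument, which mirrors changing the default separator during import.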

8. Example - Importing terms and system status
Imagine that you want to import the system status automatically from an *.xlsx file. Consult the general guidelines for importing termbase content, and then structure the *.xlsx header as follows:
  • First column: ID (if available)
  • Second column: >>L<<[Language]
  • Third column: >>TS<<SYSTEM TermStatus
  • Fourth column: >>E<<Term type
  • Fifth column: >>E<<Status
You must add the >>TS<< prefix in front of the SYSTEM TermStatus column name, as this ensures that the term status is imported correctly. Place the >>TS<<SYSTEM TermStatus column before any existing columns dedicated to term attributes.
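A minimal Python sketch of this header layout is shown below. Both function names are hypothetical, and the position check is a simplification: it only verifies that the >>TS<< column directly follows a language column, which matches the layout described here but is not necessarily how the system validates the file.

```python
STATUS_COLUMN = ">>TS<<SYSTEM TermStatus"


def build_status_header(language, include_id=True):
    """Return the header row for importing terms with their system status."""
    header = ["ID"] if include_id else []
    header += [f">>L<<{language}", STATUS_COLUMN, ">>E<<Term type", ">>E<<Status"]
    return header


def status_follows_language(header):
    """Check that the status column comes directly after a language column."""
    if STATUS_COLUMN not in header:
        return False
    position = header.index(STATUS_COLUMN)
    return position > 0 and header[position - 1].startswith(">>L<<")


header = build_status_header("English")
print(header)
print(status_follows_language(header))
```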

Procedure

  1. Log in to Trados Studio cloud capabilities (by clicking this link: http://languagecloud.sdl.com/fr/lc).
  2. Go to the Resources view and select Linguistic Resources > Termbases, OR select the Terminology view from the main menu and then select Termbases.
  3. Do one of the following:
    • Select the check box corresponding to the termbase to which you want to import data.
    • Click inside the termbase row to open the termbase.
  4. Select Import.
  5. In the Upload file dialog, do the following:
    1. Drag and drop the import file or browse for it. The available import formats are: *.tbx, MultiTerm *.xml, *.xlsx, *.csv. If you import content from *.xlsx or *.csv files, system fields are also imported.
    2. Choose one of the options:
      • If Strict import is enabled, the termbase must include exactly the same language (language code) as the one from which the import is performed.
      • If Strict import is disabled, the termbase must include the same language as the one from which the import is performed or one of its sub-languages (language flavors).
    3. Select an option for duplicate entries:
      • Ignore - If there is an existing entry which has the same ID as the new entry which is about to be imported, the new entry is ignored.
      • Merge - If there is an existing entry which has the same ID as the new entry which is about to be imported, the content of the existing entry will be merged with the content of the new entry. If the import file does not include any ID, the matching is performed based on the term text.
      • Overwrite - If there is an existing entry which has the same ID as the new entry which is about to be imported, the new entry will replace the existing entry.
      A duplicate termbase entry is an entry which has the same ID or the same text (if no ID is present) as another entry.
    4. Select Import.
  6. A window opens asking you to match the fields in your import file against the fields specified in the termbase structure:
    1. In the STRUCTURE column, find the language fields and check that the language values are correct. If there is no language value (NO MATCH FOUND), select the language field and choose a value from the dropdown.
    2. In the IMPORT? column, find the check boxes marked by a warning sign. Select all the check boxes corresponding to the levels you want to import (entry level, language level, term level).
    3. If you are importing an *.xlsx or *.csv file and your separators (for synonyms or for multiple values) differ from the ones the system can interpret, select Settings and change the separator values.
    4. Select Finish.
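The Strict import option and the three duplicate-handling options from step 5 can be modeled with the Python sketch below. It is an illustration only, not the product's implementation. It assumes hyphenated language codes such as en and en-US, an in-memory termbase stored as a dictionary keyed by entry ID, and a merge that adds any new terms to the existing ones per language; how the system actually merges content is not specified on this page.

```python
def language_accepted(termbase_language, import_language, strict):
    """Model the Strict import option for one termbase/import language pair."""
    termbase_code = termbase_language.lower()
    import_code = import_language.lower()
    if strict:
        return termbase_code == import_code
    # Non-strict: the same language or one of its sub-languages is accepted.
    return termbase_code == import_code or termbase_code.startswith(import_code + "-")


def import_entry(termbase, entry_id, new_terms, mode):
    """Apply Ignore, Merge, or Overwrite to an entry with a matching ID.

    termbase maps entry ID -> {language: [terms]} (hypothetical structure).
    """
    if mode not in ("ignore", "merge", "overwrite"):
        raise ValueError(f"unknown duplicate option: {mode}")
    if entry_id not in termbase or mode == "overwrite":
        termbase[entry_id] = {lang: list(terms) for lang, terms in new_terms.items()}
    elif mode == "merge":
        existing = termbase[entry_id]
        for lang, terms in new_terms.items():
            merged = existing.setdefault(lang, [])
            for term in terms:
                if term not in merged:
                    merged.append(term)
    # mode == "ignore": the existing entry is left unchanged.
    return termbase


termbase = {"1": {"en": ["mouse"], "fr": ["souris"]}}
import_entry(termbase, "1", {"en": ["pointing device"]}, "ignore")
after_ignore = {lang: list(terms) for lang, terms in termbase["1"].items()}
import_entry(termbase, "1", {"en": ["pointing device"]}, "merge")
after_merge = {lang: list(terms) for lang, terms in termbase["1"].items()}
import_entry(termbase, "1", {"en": ["pointer"]}, "overwrite")
after_overwrite = {lang: list(terms) for lang, terms in termbase["1"].items()}
print(after_ignore, after_merge, after_overwrite)
```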