Determining file types for segmentation

File types process the data in project assets to expose the translatable content and to hide the non-translatable content. Translatable content is presented in segments; the action that file types perform on assets is called segmentation. File type configurations allow you to customize how file types process the information.

Most of the time, the Segment Asset automatic action segments assets in a project workflow. Apart from this automatic action, assets are segmented when you perform one of the following actions:
  • Opening an asset in the Browser Workbench
  • Generating a scoping report for an asset
  • Exporting an asset to a translation kit
Whether WorldServer performs segmentation through the automatic action or through one of these ad-hoc operations, the process is the same:
  • WorldServer inspects the MIME type information for each asset to obtain an appropriate file type instance.
  • WorldServer determines which file type configuration to use.
  • WorldServer segments the asset using the specified configuration.

Associating file types with MIME types

Segmentation is performed by a file type that is appropriate to the format of the asset. For example, if an asset has the .XML extension, WorldServer uses the file type associated with the text/xml MIME type in the MIME type table (see Management > Administration > Customization > Custom component type: MIME Types). The default MIME type for the .XML file extension is text/xml, and files with that MIME type use the Any XML File Type file type by default.

If the MIME type does not have an associated file type, the asset is considered not segmentable (that is, not translatable).
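The lookup described above can be pictured as two tables: one from file extension to MIME type, and one from MIME type to file type. The following is a minimal sketch, not WorldServer code. The table names and the function names are hypothetical, only the .xml row comes from this topic, and the .png row is an invented example of a MIME type with no associated file type.

```python
from pathlib import PurePath
from typing import Optional

# Hypothetical stand-ins for the MIME type table. Only the .xml entry is
# taken from this topic; the .png entry is an invented example.
EXTENSION_TO_MIME = {".xml": "text/xml", ".png": "image/png"}
MIME_TO_FILE_TYPE = {"text/xml": "Any XML File Type"}


def file_type_for(asset_name: str) -> Optional[str]:
    """Return the file type name for an asset, or None if there is none."""
    extension = PurePath(asset_name).suffix.lower()  # ".XML" -> ".xml"
    mime_type = EXTENSION_TO_MIME.get(extension)
    if mime_type is None:
        # Assumption: an extension with no MIME type is treated the same
        # way as a MIME type with no file type.
        return None
    return MIME_TO_FILE_TYPE.get(mime_type)


def is_segmentable(asset_name: str) -> bool:
    """An asset is segmentable only if its MIME type has a file type."""
    return file_type_for(asset_name) is not None
```

In this sketch, is_segmentable("guide.XML") is true because text/xml maps to a file type, while is_segmentable("logo.png") is false because image/png has no entry in the second table.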

Choosing file type configurations

If the default configuration is the only configuration available for a file type, the default configuration is used. If there are multiple configurations, WorldServer consults either the project type or the AIS property to determine which configuration to use, depending on how the project was created:
  • Upload Files and Create Projects on the Legacy Home page

    The file type configuration is the one specified in the project type of the project.

  • Create New Project on the Assignments > Project page

    As in the first case, the file type configuration is the one specified in the project type of the project.

  • Project > Create Project or ad hoc in WorldServer Explorer

    You can assign a specific file type configuration for a target asset or folder in the Change Properties window. Go to Management > Asset Interface System > View and Change Properties. (You can also get to this window by going to Explorer > Asset > Properties.)

You must create at least one file type configuration (in Linguistic Tool Setup > File Types > <File Type>: Add) for the File Type list to be displayed in the Change Properties window.

When WorldServer consults the File Type property for a project, MIME type, or AIS folder, it checks whether the assigned configuration is associated with the file type of the asset. If so, that configuration is used. If the file formats do not match, the default configuration of the file type is used. See the "File type groups" topic for more information about applying multiple file type configurations to a folder in AIS.
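The selection rule can be summarized as "use the assigned configuration if it belongs to the asset's file type, otherwise fall back to the default". The sketch below only illustrates that rule. FileTypeConfig and choose_config are hypothetical names, not part of any WorldServer API, and the assigned argument stands for whatever configuration the project type or the AIS property supplies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileTypeConfig:
    """Hypothetical model of a file type configuration."""
    name: str
    file_type: str  # the file type this configuration was created for


def choose_config(
    file_type: str,
    default: FileTypeConfig,
    assigned: Optional[FileTypeConfig],
) -> FileTypeConfig:
    """Pick the configuration used to segment an asset of `file_type`.

    `assigned` is the configuration found on the project type or the AIS
    property, or None when nothing was assigned.
    """
    if assigned is not None and assigned.file_type == file_type:
        return assigned
    # Nothing assigned, or the assigned configuration belongs to a
    # different file type: use the default configuration.
    return default
```

For example, an asset handled by Any XML File Type with an assigned configuration created for a different file type would be segmented with the XML default.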

Asset re-segmentation

WorldServer re-segments assets when it detects that the file type configuration has changed. A file type configuration can depend on many factors, however; for example, changing sentence-breaking rules affects segmentation. In some cases a change is not picked up automatically. Unless you are sure that an asset has been re-segmented after a configuration change, force re-segmentation by modifying the source file so that its timestamp changes.
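If the source asset is stored on a file system that you can write to, one way to change its timestamp is to update the modification time. The sketch below makes two assumptions: that the asset is reachable as an ordinary file path, and that updating the modification time alone is enough for WorldServer to treat the file as modified. This topic does not confirm the second point, so if re-segmentation does not occur, edit and save the file instead. The function name touch_source is hypothetical.

```python
import os
from pathlib import Path


def touch_source(path: str) -> float:
    """Set the modification time of an existing file to the current time.

    The file content is left unchanged. Returns the new modification
    time as seconds since the epoch.
    """
    source = Path(path)
    if not source.is_file():
        # Do not create the file: only existing assets should be touched.
        raise FileNotFoundError(path)
    os.utime(source, None)  # None sets access and modification time to now
    return source.stat().st_mtime
```

The function refuses to create missing files, so a mistyped path raises an error instead of adding an empty asset.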