
Filter Determination for Segmentation

Content filters process the data in project assets to expose translatable content and hide non-translatable content. Translatable content is presented in segments; the process that content filters perform on assets is called segmentation. Filter configurations allow you to customize how the filters process the data.

Most of the time, segmentation is done by the Segment Asset automatic action in a project workflow. If the automatic action has not yet run, the asset is segmented when you perform any of the following operations:
  • Opening an asset in the Browser Workbench
  • Generating a scoping report for an asset
  • Exporting an asset to a translation kit
Whether WorldServer performs segmentation via the automatic action or via one of these ad hoc operations, the process is the same:
  • WorldServer inspects the MIME type data for each asset to obtain a filter instance of the appropriate type.
  • WorldServer determines which filter configuration to use.
  • WorldServer segments the asset using the specified configuration.

Filter association with MIME type

Segmentation is performed by a filter appropriate to the file type of the asset. For example, if the asset has an extension of .xml, WorldServer by default uses the filter associated with the MIME type of text/xml in the MIME type table (see Management > Administration > Customization > Custom component type: MIME Types). The default MIME type for the .xml file extension is text/xml, and those files by default use the Any XML File Type filter.

If the MIME type does not have an associated filter, the asset is considered not segmentable (that is, not translatable).
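
The lookup described in this section can be pictured as two table lookups: file extension to MIME type, then MIME type to filter. The following is a minimal sketch of that idea, not WorldServer code. The dictionaries and the function name find_filter are hypothetical; only the .xml, text/xml, and Any XML File Type values come from this topic, and the .bin entry is an invented example of a MIME type with no associated filter.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical stand-ins for the MIME type table.
# Only the .xml -> text/xml -> "Any XML File Type" entries come from this topic.
EXTENSION_TO_MIME = {
    ".xml": "text/xml",
    ".bin": "application/octet-stream",  # invented example with no filter
}
MIME_TO_FILTER = {
    "text/xml": "Any XML File Type",
}


def find_filter(asset_path: str) -> Optional[str]:
    """Return the filter name for an asset, or None if it is not segmentable."""
    extension = PurePosixPath(asset_path).suffix.lower()
    mime_type = EXTENSION_TO_MIME.get(extension)
    if mime_type is None:
        return None
    # A MIME type without an associated filter means "not segmentable".
    return MIME_TO_FILTER.get(mime_type)


print(find_filter("docs/guide.xml"))
print(find_filter("data/blob.bin"))
```

In this sketch, a result of None corresponds to an asset that is not segmentable and therefore not translatable.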

Filter configuration selection

If the default filter configuration is the only configuration for the filter that was found, that default configuration is used. If multiple configurations exist, WorldServer consults either the project type or an AIS property to determine which configuration to use, depending on how the project was created:
  • Upload Files and Create Projects on the Home page

    The filter configuration is that specified in the project type for the project.

  • Create New Project on the Assignments > Project page

    As in the first case, the filter configuration is that specified in the project type for the project.

  • Project > Create Project or ad hoc in the WorldServer Explorer

    An AIS property can be assigned to specify a filter configuration for a target asset or directory, in the Change Properties dialog accessed via Management > Asset Interface System > View and Change Properties. (This dialog is also accessible from Explorer > Asset > Properties.)

You must create at least one filter configuration (via Linguistic Tool Setup > Filter Configuration > <Filter>: Add) for the Filter Configurations drop-down list to appear in the Change Properties dialog.

When WorldServer consults the Filter Configuration property for a project, MIME type, or AIS folder, it checks whether the assigned configuration is associated with the filter for the asset's file type. If so, that configuration is used. If the assigned configuration belongs to a different filter, the filter’s default configuration is used instead. See the "Filter Groups" topic for information on applying more than one filter configuration to a directory in AIS.
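
The fallback rule for the Filter Configuration property can be sketched as a single comparison. This is an illustrative model only, not the WorldServer implementation. The FilterConfig class, the choose_config function, and the configuration names are hypothetical, as is the "HTML" filter name used to show a mismatch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FilterConfig:
    name: str
    filter_name: str  # the filter this configuration is associated with


def choose_config(
    filter_name: str,
    default: FilterConfig,
    assigned: Optional[FilterConfig],
) -> FilterConfig:
    """Use the assigned configuration only if it belongs to the asset's filter."""
    if assigned is not None and assigned.filter_name == filter_name:
        return assigned
    # No assignment, or the assignment belongs to another filter:
    # fall back to the filter's default configuration.
    return default


# Hypothetical configurations for illustration.
xml_default = FilterConfig("Default", "Any XML File Type")
xml_custom = FilterConfig("Custom XML", "Any XML File Type")
html_custom = FilterConfig("Custom HTML", "HTML")

print(choose_config("Any XML File Type", xml_default, xml_custom).name)
print(choose_config("Any XML File Type", xml_default, html_custom).name)
```

The first call selects the assigned configuration because it belongs to the same filter. The second call falls back to the default because the assigned configuration is associated with a different filter.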

Asset Resegmentation

WorldServer resegments assets when it detects that the filter configuration has changed. However, it does not detect every change, and in some cases you must touch the file or clear the cache yourself. A filter configuration can depend on many factors. For example, changing sentence-breaking rules affects segmentation. Unless you are certain that an asset has been resegmented after a configuration change, force resegmentation by touching the source file to change its timestamp.
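
One way to touch a source file is to reset its modification time from a script. The sketch below assumes that the source file is on a file system you can reach directly; the function name touch_source is hypothetical, and this is a general-purpose file operation rather than a WorldServer feature.

```python
import os
import time
from pathlib import Path


def touch_source(path: str) -> float:
    """Set the file's access and modification times to now; return the new mtime."""
    source = Path(path)
    if not source.is_file():
        # Refuse to create a new, empty file by accident.
        raise FileNotFoundError(path)
    now = time.time()
    os.utime(source, (now, now))
    return source.stat().st_mtime
```

Calling touch_source with the path to the source asset updates only the timestamp and leaves the file content unchanged. On systems with a command shell, the standard touch command has the same effect.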