
Overview of TM fuzzy match scoring

The scoring process for fuzzy matching supports the following match penalties:
  • configurable penalties for punctuation and white spaces
  • configurable penalties for capitalization
  • configurable penalties for non-matching placeholders
These settings let the implementation generate scores that better reflect the actual effort involved in transforming one segment into another. To improve overall scoring, the weights used to compute the final scores can also be customized, and the current implementation supports variable weighting of segment elements. The WorldServer Translation Memory (TM) recognizes these segment elements:
  • codes or placeholders
  • words
  • numbers
In the current implementation, these elements can carry different weights. For example, you may want a placeholder code and a word to be weighted differently. Similarly, translating a number may not represent the same effort as translating a word, so you may want to assign a smaller or larger weight to numbers. The implementation also supports differentiating between inner and outer placeholders.
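To make the idea of element weights and match penalties concrete, the following is a minimal sketch of a weighted score. It is not the WorldServer algorithm: the names ELEMENT_WEIGHTS, PENALTIES, Element, and fuzzy_score, the numeric values, and the formula itself are assumptions chosen only for illustration. The sketch also does not model the difference between inner and outer placeholders.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical weights per element type (illustrative values, not product defaults).
ELEMENT_WEIGHTS = {"word": 1.0, "number": 0.5, "placeholder": 0.5}

# Hypothetical penalties in percentage points (illustrative values, not product defaults).
PENALTIES = {"punctuation": 1.0, "capitalization": 1.0, "placeholder": 2.0}


@dataclass(frozen=True)
class Element:
    kind: str  # "word", "number", or "placeholder"
    text: str


def fuzzy_score(source, candidate, differences=()):
    """Return an illustrative score from 0 to 100 for a candidate TM segment.

    Each source element contributes its weight to the total. The weight counts
    as matched when an identical element is still available in the candidate.
    Penalties named in `differences` are then subtracted from the base score.
    """
    remaining = Counter(candidate)
    total = 0.0
    matched = 0.0
    for element in source:
        weight = ELEMENT_WEIGHTS[element.kind]
        total += weight
        if remaining[element] > 0:
            remaining[element] -= 1
            matched += weight
    base = 100.0 * matched / total if total else 0.0
    penalty = sum(PENALTIES[name] for name in differences)
    return max(0.0, base - penalty)


source = [
    Element("word", "Print"),
    Element("word", "page"),
    Element("number", "3"),
    Element("placeholder", "{0}"),
    Element("word", "now"),
]
# Same segment, but with a different number.
number_changed = [Element("number", "4") if e.kind == "number" else e for e in source]
# Same segment, but with a different word.
word_changed = [Element("word", "sheet") if e.text == "page" else e for e in source]

print(fuzzy_score(source, number_changed))                      # 87.5
print(fuzzy_score(source, word_changed))                        # 75.0
print(fuzzy_score(source, number_changed, ["capitalization"]))  # 86.5
```

Under these assumed weights, a changed number costs less than a changed word because numbers weigh half as much, which matches the intent described above. Raising or lowering the number weight shifts that balance, and each penalty lowers the score by a fixed amount.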

The remainder of this chapter describes how to perform these changes and the impact of the changes on scoring. This chapter concludes with a table illustrating how the new scoring algorithm applies to different segments.