Calculating repetitions
WorldServer can identify all the duplicated segments (excluding all translated segments) within an asset and across the project. The following points describe the process:
- Segmentation and leverage occur for each asset. The leverage process determines whether 100% or ICE matches are available for each segment. If so, they are applied and the segments are assigned a 100% score (and an ICE status if appropriate). If there are no ICE or 100% matches, the best fuzzy score for each segment (0–99%) is recorded for scoping purposes.
- The repetition calculation process must then be triggered explicitly. It must run after segmentation and leverage, and it does not happen by default. The repetition counter goes through each asset and looks for duplicates among all segments that do not have a 100% score (which is why only fuzzy-matched segments are said to be considered for duplication). The first occurrence is designated as the repeated segment. Occurrences after the repeated segment are considered the repetition segments. The text used to check for duplicates comes from the asset segment, not from the fuzzy matches in the TM.
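The process described above can be sketched in a few lines of Python. This is only an illustrative model, not WorldServer or SDK code: the `Segment` class, the `role` field, and the `mark_repetitions` function are hypothetical names, and the sketch assumes that duplicates are detected by exact comparison of the asset segment text across one project-wide list of segments.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    asset: str                  # asset the segment belongs to
    text: str                   # source text taken from the asset segment
    score: int                  # best leverage score: 100 for 100%/ICE, else 0-99
    role: Optional[str] = None  # "repeated", "repetition", or None


def mark_repetitions(segments: List[Segment]) -> List[Segment]:
    """Mark duplicates among segments that do not have a 100% score."""
    first_seen = {}
    for seg in segments:
        if seg.score >= 100:
            # Segments with 100% or ICE matches are excluded from the check.
            continue
        first = first_seen.get(seg.text)
        if first is None:
            # First occurrence so far; it only becomes "repeated" once a
            # later duplicate is found.
            first_seen[seg.text] = seg
        else:
            first.role = "repeated"
            seg.role = "repetition"
    return segments


segments = [
    Segment("a.html", "Click OK.", 85),
    Segment("a.html", "Click OK.", 85),
    Segment("b.html", "Click OK.", 85),
    Segment("b.html", "Save", 100),
    Segment("b.html", "Save", 100),
    Segment("b.html", "Unique text", 40),
]
mark_repetitions(segments)
```

In this example the first "Click OK." segment is marked as the repeated segment and the next two as repetition segments, while the two "Save" segments are ignored because they already have a 100% score.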
For repetitions to be calculated, the repetition process must be triggered, using one of these methods:
- Following the Segment automatic action, use the Calculate Repetition automatic action in a workflow. This requires all project tasks to be included in the repetition calculation process.
- Use SDK customizations to perform repetition calculations. (See the SDK documentation for more information.)
When repetitions are calculated within a project, the repetition information persists for all assets involved. Recalculating repetitions adds to any existing stored repetition data. Before recalculating repetitions for an asset or set of assets, the previous repetition results must therefore be flushed. Currently, this requires forcing the assets to be resegmented (for example, by using the Clear Cache automatic action). The same applies to calculating repetitions via the SDK; however, the SDK provides explicit control over whether the repetition data is saved or discarded.
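The following toy model illustrates why stale results need to be flushed before recalculating. It is a sketch under stated assumptions, not the WorldServer storage model: `RepetitionStore`, `calculate`, and `flush` are hypothetical names, and the store simply keeps a per-asset count of repetition segments that each calculation adds to.

```python
from collections import Counter


class RepetitionStore:
    """Toy persistent store: each calculation adds to the stored counts."""

    def __init__(self):
        self.counts = Counter()  # asset name -> stored repetition count

    def calculate(self, asset, segment_texts):
        # Count every occurrence after the first as a repetition and add
        # the result to whatever is already stored for the asset.
        seen = set()
        for text in segment_texts:
            if text in seen:
                self.counts[asset] += 1
            else:
                seen.add(text)

    def flush(self, asset):
        # Stands in for discarding earlier results, e.g. by resegmenting.
        self.counts.pop(asset, None)


store = RepetitionStore()
texts = ["Click OK.", "Click OK.", "Save"]

store.calculate("a.html", texts)
first_run = store.counts["a.html"]

store.calculate("a.html", texts)   # recalculated without flushing
stale_run = store.counts["a.html"]

store.flush("a.html")
store.calculate("a.html", texts)   # recalculated after flushing
clean_run = store.counts["a.html"]
```

In this model the second, unflushed run doubles the stored count, while flushing first returns the count to that of a single calculation.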