Scoping test results
SDL performed a test to compare the word counts and repetition counts generated by TRADOS and WorldServer. Both products scoped the same set of HTML files: a total of 11,000 files containing 60 MB of HTML data.
The results of the word counting test (total number of words counted) are shown in the following table:
| TRADOS | WorldServer Default | WorldServer Default + Number Counting |
|---|---|---|
| 4,436,009 | 4,395,495 (-0.91%) | 4,445,387 (+0.21%) |
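The percentages are relative to the TRADOS total. By default, the WorldServer total is 0.91% lower than the TRADOS total. When WorldServer uses the TRADOS number-counting scheme, its total becomes 0.21% higher than the TRADOS total. This remaining difference is attributed to the differences in word breaking described in this section.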
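The percentages in the table can be reproduced from the raw totals. The following minimal Python sketch shows the arithmetic, taking the TRADOS total as the baseline. The function and constant names are illustrative and are not part of either product.

```python
def relative_difference(count: int, baseline: int) -> float:
    """Return the signed difference of count from baseline, in percent."""
    return (count - baseline) / baseline * 100


# Totals taken from the word counting table above.
TRADOS_TOTAL = 4_436_009
WORLDSERVER_DEFAULT = 4_395_495
WORLDSERVER_NUMBER_COUNTING = 4_445_387

for label, total in [
    ("WorldServer Default", WORLDSERVER_DEFAULT),
    ("WorldServer Default + Number Counting", WORLDSERVER_NUMBER_COUNTING),
]:
    # The "+" format flag prints an explicit sign for both directions.
    print(f"{label}: {relative_difference(total, TRADOS_TOTAL):+.2f}%")
# prints:
# WorldServer Default: -0.91%
# WorldServer Default + Number Counting: +0.21%
```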
The results of the repetition counting test (repeated words as a percentage of the total number of words) are shown in the following table:
| TRADOS | WorldServer | WorldServer + sentence breaking on colon |
|---|---|---|
| 13.5% | 11.9% | 13.9% |
Matching the WorldServer sentence breaking rule to the TRADOS rule (breaking sentences on a colon) raises the percentage of detected repetitions from 11.9% to 13.9%, slightly above the 13.5% reported by TRADOS. For information on WorldServer sentence breaking, see Sentence breaking.
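The following toy Python sketch illustrates why the sentence breaking rule can change the repetition percentage. It is not the algorithm used by WorldServer or TRADOS. It assumes that a segment counts as repeated when it exactly matches an earlier segment (ignoring case), that words are separated by whitespace, and that segments end at `.`, `!`, or `?` (and optionally `:`). The helper names and the sample text are hypothetical.

```python
import re


def split_segments(text: str, break_on_colon: bool = False) -> list[str]:
    """Split text into segments after sentence-ending punctuation.

    Assumption: a simplified rule, not either product's real segmenter.
    """
    terminators = ".!?:" if break_on_colon else ".!?"
    # Split on whitespace that follows one of the terminator characters.
    parts = re.split(rf"(?<=[{re.escape(terminators)}])\s+", text)
    return [part.strip() for part in parts if part.strip()]


def repetition_percentage(segments: list[str]) -> float:
    """Return the words in repeated segments as a percentage of all words."""
    seen: set[str] = set()
    total_words = 0
    repeated_words = 0
    for segment in segments:
        word_count = len(segment.split())
        total_words += word_count
        key = segment.lower()
        if key in seen:
            repeated_words += word_count  # exact match of an earlier segment
        else:
            seen.add(key)
    return 100 * repeated_words / total_words if total_words else 0.0


SAMPLE = "Note: Save the file. Warning: Save the file."

print(repetition_percentage(split_segments(SAMPLE)))
# prints 0.0
print(repetition_percentage(split_segments(SAMPLE, break_on_colon=True)))
# prints 37.5
```

Without breaking on the colon, the two sentences differ in their first word, so no repetition is detected. Breaking on the colon isolates "Save the file." as its own segment, and its second occurrence then counts as a repetition.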