Project archiving allows you to write reports on historical data. Archiving is the process of copying historical data into new database tables, specifically so that you can write reports and get answers to your questions.
The archive data store contains information about projects, tasks, and scoping, which is maintained indefinitely. You can write complex, dynamic reports answering questions such as:
- How many projects were created in each of the last 12 months?
- What was the average time for a translation project?
- Which vendors had the fastest turnaround time?
- How many active projects and tasks were there on each day over the last two months?
During the life of an active task, the workflow engine sends a message to initiate copying information related to the task to an archive table. Until the general cleaner deletes the original task information, it will appear in two places – in its original table and in the archive table. This also means that an active project with at least one completed task appears in the archive. Each project is updated as its tasks complete, so that the final version of the project which appears in the archive is its state when its final task was completed. The archive tables are mostly a mirror copy of the existing tables. The archive tables have some of the same columns as the main WorldServer tables, but some client-related data, such as
Client ID and
Name, are not available in the reporting schema.
Note: By default, WorldServer garbage collection starts running every night at 12:45 AM. You can modify cleaner schedules from the page. If you adjust the recurrence, WorldServer must be restarted to notice the change.
Note the following when working with tasks and archiving:
- When a task is canceled, it remains in the tasks table until garbage collection runs.
- When a task is completed, it also remains in the tasks table for at least 30 days (default). After 30 days, it can be removed during garbage collection.
It does not matter if a task is canceled or completed; it always remains in the tasks table. The general cleaner removes it from the tasks table after a configured number of days. The TaskDeleteAfterDays property configures the default amount of time a task remains in the tasks table.
- A task is always archived after it is completed successfully, even if the project with which the task is associated is not completed. Therefore, a completed task can coexist in the tasks table and the archiveTasks table for a period of time, until garbage collection runs.
- Archive tables are never included in garbage collection.
Use the spDeleteArchive stored procedure to clean up archive tables manually.
In summary, the archive tables provide historical information about work that has been performed in WorldServer. Initial project information is added to these tables when projects are created; task information is added on the successful completion of each task. Information is never removed from the archive tables automatically. Unless end users choose to remove old information from these tables, they will store historical information forever.
archiveProjectGroups
Historical information about WorldServer Project Groups.
| Column | Type | Description |
|---|
| projectGroupId | integer | The internal ID of this Project Group. Primary key | | name | string | The name of this Project Group. | | description | string | The description of this Project Group (optional). | | workgroupId | integer | The internal ID of the Workgroup with which this Project Group is associated. | | workgroupName | string | The name of the Workgroup with which this Project Group is associated. | | creationDate | date | When this Project Group was created. | | dueDate | date | When this Project Group is due.
Note: This field is currently set only for projects created through TransPort. Reports should probably not depend on this, and should use the due dates of individual projects instead.
| | completionDate | date | When this Project Group was completed. Not set if the Project Group is still active. | | creatorId | integer | The internal ID of the user who created this Project Group. | | creatorName | string | The username of the user who created this Project Group. | | type | integer | Ignored. | | sourceLocaleId | integer | The internal ID of the source Locale for this Project Group. | | sourceLocale | string | The name of the source Locale for this Project Group. |
|
archiveProjects
Historical information about WorldServer Projects.
| Column | Type | Description |
|---|
| projectId | integer | The internal ID of this Project. Primary key. | | name | string | The name of this Project. | | description | string | The description of this Project (optional). | | localeId | integer | The internal ID of the target Locale for this Project. | | localeName | string | The name of the target Locale for this Project. | | workgroupId | integer | The internal ID of the Workgroup with which this Project is associated. | | workgroupName | string | The name of the Workgroup with which this Project is associated. | | creationDate | date | When this Project was created. | | dueDate | date | When this Project is due (optional). | | completionDate | date | When this Project was completed. Not set if the Project is still active. | | creatorId | integer | The internal ID of the user who created this Project. | | creatorName | string | The username of the user who created this Project. | | type | integer | Ignored. | | lastTaskNum | integer | The task number given to the last task in the project. This is ordinarily the same as the number of tasks in the project. | | projectGroupId | integer | A foreign key into the archiveProjectGroups table, identifying the Project Group that this Project is part of. | | activeTasks | integer | The number of active Tasks in this Project. | | completedTasks | integer | The number of completed Tasks in this Project. | | canceledTasks | integer | The number of canceled Tasks in this Project. | | totalTasks | integer | The total number of Tasks in this Project, regardless of state. | | rfqId | integer | The internal ID of the Quote for this Project. | | rfqName | string | The Quote name for this Project. | | clientId | integer | The internal ID of the Client for this Project. | | clientName | string | The Client name for this Project. | | projectTypeId | integer | The internal ID of the Project Type for this Project. | | projectTypeName | string | The Project Type name for this Project. | | cost | double | The total cost for the project, as calculated by applying the associated Cost Model. Not set until the project is completed. | | adjustment | double | A manual adjustment to the project cost (optional). | | vendorNotes | string | Vendor notes related to the cost adjustment (optional). | | customerNotes | string | Customer notes related to the cost adjustment (optional). | | costModelArchiveId | integer | A foreign key into the archiveCostModels table, indicating the Cost Model associated with this project. | | reviewModelArchiveId | integer | A foreign key the reviewModelArchive table, indicating the Review Model associated with this project. |
|
archiveCostModels
Historical information about the translation costs associated with WorldServer Projects.
| Column | Type | Description |
|---|
| costModelArchiveId | integer | The internal ID of this Cost Model. Primary key. | | name | integer | The name of this Cost Model. | | icePrice | double | The price of translating each ICE matched word. | | repeititionPrice | double | The price of translating each repetition word.
Note: "repeititionPrice" is the way this is spelled in the database.
| | perfectPrice | double | The price of translating each word with a perfect (100%) TM match | | range1Price | double | The price of translating the highest fuzzy match range (for example, 90-100%). | | range1Min | integer | The lower boundary of the highest fuzzy match range (for example, 90). | | range2Price | double | The price of translating the second fuzzy match range (for example, 80-90%). Not set if there are less than two fuzzy match ranges in the associated scoping configuration. | | range2Min | integer | The lower boundary of the second fuzzy match range (for example, 80). Not set if there are less than two fuzzy match ranges in the associated scoping configuration. | | range3Price | double | The price of translating the third fuzzy match range (for example, 70-80%). Not set if there are less than three fuzzy match ranges in the associated scoping configuration. | | range3Min | integer | The lower boundary of the third fuzzy match range (for example, 80). Not set if there are less than three fuzzy match ranges in the associated scoping configuration. | | range4Price | double | The price of translating the fourth fuzzy match range (for example, 60-70%). Not set if there are less than four fuzzy match ranges in the associated scoping configuration. | | range4Min | integer | The lower boundary of the fourth fuzzy match range (for example, 60). Not set if there are less than four fuzzy match ranges in the associated scoping configuration. | | range5Price | double | The price of translating the fifth fuzzy match range (for example, 0-60%). Not set if there are less than five fuzzy match ranges in the associated scoping configuration. | | range5Min | integer | The lower boundary of the fifth fuzzy match range (for example, 0). Not set if there are less than five fuzzy match ranges in the associated scoping configuration. | | vendorId | integer | The internal ID of the vendor associated with this Cost Model (optional). | | vendorName | string | The name of the vendor associated with this Cost Model (optional). | | scopingConfigId | integer | The internal ID of the scoping configuration associated with this Cost Model. | | scopingConfigName | string | The name of the scoping configuration associated with this Cost Model. | | currency | string | The three-letter currency code for the currency that the prices in this cost model are listed in (for example, USD). |
|
archiveLineItems
Historical information about special charges or discounts on Projects, as calculated by Cost Model Rules.
| Column | Type | Description |
|---|
| lineItemId | integer | The internal ID of this Line Item. Primary key. | | costModelArchiveId | integer | A foreign key into the archiveCostModels table, indicating the Cost Model this line item applies to. | | name | string | The name of this Line Item. | | description | string | The description of this Line Item. | | cost | double | The additional charge (or discount, if negative) associated with this Line Item. |
|
archiveTasks
Historical information about successfully completed WorldServer Tasks.
| Column | Type | Description |
|---|
| taskId | integer | The internal ID of this Task. Primary key. | | projectId | integer | A foreign key into the archiveProjects table, indicating the Project this Task is part of. | | workflowId | integer | The internal ID of the workflow that this Task is flowing through. | | workflowName | string | The name of the workflow that this Task is flowing through. | | stepCount | integer | The ordinal step number of the currently active Task Step. | | parentId | integer | If this task is part of a sub-workflow, this is the taskId of the task in the parent workflow that spawned this one. Otherwise, not set. | | parentStepCount | integer | If this task is part of a sub-workflow, this is the step number of the currently active Task Step in the parent workflow. Otherwise, 0. | | status | integer | Indicates the last event of interest that happened with this Task. One of the following values:
- 0 - Task was not started
- 1 - Task is still in progress with success
- 2 - Task is still in progress with failure
- 3 - Task was completed with success
- 4 - Task was completed with failure
- 5 - Task was canceled.
| | taskNumInProject | integer | The ordinal number of this Task in the associated Project. | | creationDate | date | When this Task was created. | | dueDate | date | When this Task is due (optional). | | expectedDuration | integer | The number of hours the Task is expected to take to complete (optional). | | completionDate | date | When the Task was completed. | | currentTaskStepId | integer | A foreign key into the archiveTaskSteps table indicating which Task Step is currently active | | priority | integer | The priority of this Task. | | includeInCost | boolean | Should this Task’s cost be included in the total cost for the Project? | | sourceAsset | string | The AIS Path of the source asset. | | targetAsset | string | The AIS Path of the target asset. | | type | integer | Ignored. | | cost | double | The cost for this Task, based on applying the associated Cost Model. Not set if there was no associated Cost Model, or if the task never caused an asset to be segmented. | | mtWordCount | integer | The number of words processed by the Task with an MT match. | | totalWordCount | integer | The total number of words processed by this Task. Not set if the task never caused an asset to be segmented. | | iceWordCount | integer | The number of words processed by this Task with a TM ICE match. Not set if the task never caused an asset to be segmented. | | perfectWordCount | integer | The number of words processed by this Task with a 100% TM match. Not set if the task never caused an asset to be segmented. | | repetitionWordCount | integer | The number of words processed by this Task with a TM repetition. Not set if the task never caused an asset to be segmented. | | range1WordCount | integer | The number of words in the highest fuzzy match range (for example, 90-100%). Not set if the task never caused an asset to be segmented. | | range1Min | integer | The lower boundary of the highest fuzzy match range (for example, 90). | | range2WordCount | integer | The number of words in the second fuzzy match range (for example, 80-90%). Not set if the task never caused an asset to be segmented. | | range2Min | integer | The lower boundary of the second fuzzy match range (for example, 80). Not set if there are less than two fuzzy match ranges in the associated scoping configuration. | | range3WordCount | integer | The number of words in the third fuzzy match range (for example, 70-80%). Not set if the task never caused an asset to be segmented. | | range3Min | integer | The lower boundary of the third fuzzy match range (for example, 70). Not set if there are less than three fuzzy match ranges in the associated scoping configuration. | | range4WordCount | integer | The number of words in the fourth fuzzy match range (for example, 60-70%). Not set if the task never caused an asset to be segmented. | | range4Min | integer | The lower boundary of the fourth fuzzy match range (for example, 60). Not set if there are less than four fuzzy match ranges in the associated scoping configuration. | | range5WordCount | integer | The number of words in the fifth fuzzy match range (for example, 0-60%). Not set if the task never caused an asset to be segmented. | | range5Min | integer | The lower boundary of the fifth fuzzy match range (for example, 0). Not set if there are less than five fuzzy match ranges in the associated scoping configuration. | | costModelId | integer | A foreign key into the archiveCostModels table. Indicates the Cost Model associated with this Task. This is equivalent to archiveProjects.costModelArchiveId. | | reviewModelId | integer | The internal ID of the Review Model for this Task. | | creatorName | string | The username of the user who created this task. | | scopingConfigId | integer | The internal ID of the associated scoping configuration. | | scopingConfigName | string | The name of the associated scoping configuration. |
|
archiveTaskSteps
Historical information about each step that was completed during Task processing.
| Column | Type | Description |
|---|
| taskStepId | integer | The internal ID of this TaskStep. Primary key. | | taskId | integer | A foreign key into the archiveTasks table, indicating the Task this Task Step is part of. | | workflowStepId | integer | The internal ID of the workflow step that this Task Step corresponds to. | | claimaintId | integer | The internal ID of the user who has claimed this Task Step. | | claimantName | string | The username of the user who has claimed this Task Step. | | status | integer |
- 1: This Task Step has not yet started.
- 2: This Task Step is in progress.
- 3: This Task Step has been completed.
- 4: This Task Step has project scope and is waiting to execute.
- 5: This Task Step has project scope and is in progress.
| | compTransitionId | integer | The internal ID of the transition taken from this step when it completed. | | transitionName | string | The name of the transition taken from this step when it completed. | | creationDate | date | When this Task Step was created. | | dueDate | date | When this Task Step is due (optional). | | expectedDuration | integer | The number of hours the Task Step is expected to take to complete (optional). | | startDate | date | When the Task Step began executing. | | completionDate | date | When the Task Step was completed. | | vendorId | integer | The internal ID of the Vendor who this step is assigned to. | | vendorName | string | The name of the Vendor who this step is assigned to. | | stepName | string | The name of this step (for example, “Translate”). | | stepType | string | The type of this step (one of Start, Finish, Human, Automatic, Sub-Workflow, Parallel Review, Automatic Error, or Cancel). |
|
archiveTaskHistoryEntries
A record of interesting events with timestamps for historical WorldServer Tasks.
| Column | Type | Description |
|---|
| taskHistEntryId | integer | The internal ID of this History Entry. Primary key. | | taskId | integer | A foreign key into the archiveTasks table, indicating the Task this History Entry applies to. | | theDate | Date | The date of a historical event. | | eventType | integer | The event that occurred, one of the following possible values.
- 1: Task started
- 2: Task completed
- 3: Task canceled
- 4: Task Step claimed
- 5: Task Step executed
- 6: Task Step completed
- 7: Task Step un-claimed
- 8: Sub-task started
- 9: Sub-task completed
- 10: Task assignees modified
- 11: Task assets exported
- 12: Task assets imported
- 13: Task branched
- 14: Sub-task canceled
| | userId | integer | The unique ID of the user who performed the event, if relevant. | | theComment | string | A user-provided comment for the event (optional). |
|
archiveCustomProjectAttributes
Custom attributes and values for historical WorldServer Projects. Attributes with multiple values will have multiple rows in this table for the same attribute and Project.
| Column | Type | Description |
|---|
| projectId | integer | A foreign key into the archiveProjects table, indicating the Project whose custom attribute this is. | | name | string | The name of the custom attribute. | | uiName | string | The UI display name of the custom attribute. | | valueString | string | The string value of the attribute. | | valueDate | date | The attribute value as a date. Not set for non-date-valued attributes. | | valueNumber | integer | The numeric value of the attribute. Not set for non-numeric attributes. | | valueClob | CLOB | The attribute value, for attributes that require large amounts of storage (Large Text Area, Comment, HTML attributes). |
|
archiveCustomTaskAttributes
Custom attributes and values for historical WorldServer Tasks. Attributes with multiple values will have multiple rows in this table for the same attribute and Task.
| Column | Type | Description |
|---|
| taskId | integer | A foreign key into the archiveTasks table, indicating the Task whose custom attribute this is. | | name | string | The name of the custom attribute. | | uiName | string. | The UI display name of the custom attribute | | valueString | string | The string value of the attribute. | | valueDate | date | The attribute value as a date. Not set for non-date-valued attributes. | | valueNumber | integer | The numeric value of the attribute. Not set for non-numeric attributes. | | valueClob | CLOB | The attribute value, for attributes that require large amounts of storage (Large Text Area, Comment, HTML attributes). |
|
archiveCustomCostModelAttributes
Custom attributes and values for historical WorldServer Cost Models. Attributes with multiple values will have multiple rows in this table for the same attribute and Cost Model.
| Column | Type | Description |
|---|
| costModelArchiveId | integer | A foreign key into the archiveCostModels table, indicating the Cost Model whose custom attribute this is. | | name | string | The name of the custom attribute. | | uiName | string | The UI display name of the custom attribute. | | valueString | string | The string value of the attribute. | | valueDate | date | The attribute value as a date. Not set for non-date-valued attributes. | | valueNumber | integer | The numeric value of the attribute. Not set for non-numeric attributes. | | valueClob | CLOB | The attribute value, for attributes that require large amounts of storage (Large Text Area, Comment, HTML attributes). |
|
archiveCustomVendorAttributes
Custom attributes and values for historical WorldServer Vendors. Attributes with multiple values will have multiple rows in this table for the same attribute and Vendor.
| Column | Type | Description |
|---|
| vendorId | integer | A foreign key matching the vendorId column in the archiveCostModels table, indicating the Vendor whose custom attribute this is. | | archiveVendorAttributeId | integer | A unique ID. | | name | string | The name of the custom attribute. | | uiName | string | The UI display name of the custom attribute. | | valueString | string | The string value of the attribute. | | valueDate | date | The attribute value as a date. Not set for non-date-valued attributes. | | valueNumber | integer | The numeric value of the attribute. Not set for non-numeric attributes. | | valueClob | CLOB | The attribute value, for attributes that require large amounts of storage (Large Text Area, Comment, HTML attributes). |
|
periodicWorkStatus
A record of WorldServer activity over time. This table is populated by the special PWS_TASK recurrence, which records periodic snapshots of activity on a schedule.
| Column | Type | Description |
|---|
| time | date | The timestamp when the snapshot was taken. | | activeProjects | integer | The number of Projects active when the snapshot was taken. | | activeTasks | integer | The number of Tasks active when the snapshot was taken. | | wordsInActiveTasks | integer | The number of words in assets associated with active Tasks when the snapshot was taken. |
|
archiveReviewModels
| Column | Type | Description |
|---|
| reviewModelArchiveId | integer | Primary key. | | reviewModelId | integer | The internal ID of the Review Model. | | name | string | The name of the Review Model. | | description | string | A description for the Review Model. |
|
archiveReviewModelData
| Column | Type | Description |
|---|
| reviewModelArchiveId | integer | Primary key. | | dataType | integer | The data type for the Review Model. | | data | string | The data for the Review Model. | | displayOrder | integer | Order in display. |
|
archiveTaskReviewErrors
| Column | Type | Description |
|---|
| archivedTaskId | integer | Primary key and Foreign key. | | errorSeverity | string | Primary key. Severity of error. | | errorType | string | Primary key. The data for the Review Model. | | errorCount | integer | Number of errors. |
|
Archive Reporting Schema Diagram
The following figure shows the relationships between the archived reporting tables.
Figure 1. Archive Reporting Tables
Sample Queries
The following examples demonstrate queries you can write against the archive tables.
How many projects were created in each of the last 12 months?
Oracle
SELECT * FROM
(SELECT to_Char(creationDate, 'MM/YYYY') Month, count(*) Total
FROM archiveProjects
GROUP BY to_Char(creationDate, 'MM/YYYY')
ORDER BY to_Char(creationDate, 'MM/YYYY') desc)
WHERE rownum<=12;
SQL Server
SELECT top 12 Month, count(*) as Total FROM
(SELECT SUBSTRING(CONVERT(char(10), creationDate, 103), 4, 7) as
Month
FROM archiveProjects) projects
GROUP BY Month
ORDER BY Month desc
What was the average time in hours for a translation project?
Oracle
SELECT 24*sum(completionDate - creationDate)/count(*) as average
FROM archiveProjects
WHERE completionDate is not null;
SQL Server
SELECT sum(DATEDIFF(second, creationDate, completionDate))/3600.0/
count(*) as average
FROM archiveProjects
WHERE completionDate is not null;
How many active projects and tasks were there on each day over the last two months?
Oracle
SELECT to_Char(time, 'MM/DD') Day,
sum(activeProjects) as activeProjects,
sum(activeTasks) as activeTasks
FROM periodicWorkStatus
WHERE time > sysdate - interval '61' day
GROUP BY to_Char(time, 'MM/DD')
ORDER BY to_Char(time, 'MM/DD') desc;
SQL Server
SELECT Day,
sum(activeProjects) as activeProjects,
sum(activeTasks) as activeTasks
FROM
(SELECT (CONVERT(char(2), Month(time)) + '/' + CONVERT(char(2),
Day(time))) as Day,
activeProjects, activeTasks
FROM periodicWorkStatus
WHERE DATEDIFF(day, time, CURRENT_TIMESTAMP) <= 61) work
GROUP BY Day
ORDER BY Day desc
The archiveCustomProjectAttributes, archiveCustomTaskAttributes, and archiveCustomVendorAttributes tables can be used to generate reports that show values of custom attributes associated with projects, tasks, and vendors, respectively. These tables contain a row for each custom attribute, which means that the information about all the custom attributes for a particular project consists of several rows.
It is typical to want to display this information in columns, rather than rows. For instance, if you have custom project attributes called
PO Status and
Color, you may want a report that shows these attributes alongside other standard project fields (like
due date):
| Project | Due_Date | PO_Status | Color |
|---|
| Translate Brochures | 1/2/2006 | processed | Green | | Update Web site | 2/3/2006 | pending approval | Yellow | | New User Guide | 3/4/2006 | received | Red |
|
To do this, you would need to construct a complex SQL query that gets the desired information from the
archiveProjects table, then joins the
archiveCustomProjectAttributes table once for each desired attribute. For instance, the following query could be used to generate the table above:
SELECT
projects.name as Project,
projects.dueDate as Due_Date,
po.valuestring as PO_Status,
color.valuestring as Color
FROM
projectsArchive projects
LEFT OUTER JOIN
(SELECT valueString, projectId
FROM customProjectAttributesArchive
WHERE name = 'PO Status') po
ON projects.projectId = po.projectId
LEFT OUTER JOIN
(SELECT valueString, projectId
FROM customProjectAttributesArchive
WHERE name = 'Color') color
ON projects.projectId = color.projectId
If you need to make a large number of reports like this, you may find it simpler to create a view in your database that joins in all the attribute values of interest as columns; you can then use very simple queries based on this view in your reports.