Configuring Archive Manager overview

You configure Archive Manager in the cd_archivemanager_conf.xml configuration file. You need to configure the Web site Publications you want to archive, the content you want to include or exclude, and the personalized Web content you want to capture.

<Publications>
Basic configuration means specifying, for each Web site Publication you want to archive, its Publication ID and its base URL. The base URL is the starting string of a URL; all resources whose URL contains this string are archived. For more information, see Archiving Web site Publications.
<InclusionRule>
You can include specific URLs for archiving when certain conditions are met. Inclusion rules give you finer-grained control over the archiving process by defining which resources are archived and when. You need to define inclusion rules for each Web site Publication you are archiving by adding <InclusionRule> elements as children of a <Publication> element. For more information, see Including URLs.
<Capture>
In the <Capture> element you can exclude URLs, define identities, and add protocol providers:
<Exclude>
You can also exclude specific URLs from archiving when certain conditions are met. You define the URLs to exclude using regular expressions.
For more information, see Excluding URLs.
<DefaultIdentity>
The <DefaultIdentity> element archives content for a default user profile, capturing an anonymous view of a page.
<Identity>
You can make Archive Manager capture personalized Web content by defining "identities".
An <Identity> allows you to archive pages that are personalized for specific users.
For more information on defining <Identity> elements, see Configuring personalized content (Identities).
<ProtocolProvider>
Archive Manager provides protocol registration so that you can configure the protocols you want to allow in the captured URLs. You can configure protocol providers by adding a <ProtocolProvider> element in the cd_archivemanager_conf.xml configuration file. For more information, see Protocol providers.
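
The interplay between the base URL and the exclusion expressions described above can be pictured with a small Python sketch. This is only an illustration of the selection idea, not Archive Manager's actual implementation: the function name should_archive, the sample URLs, and the sample patterns are all hypothetical, inclusion rules are left out, and the choice to let an exclusion pattern match anywhere in the URL is an assumption.

```python
import re


def should_archive(url, base_url, exclude_patterns):
    """Return True if url contains base_url and matches no exclusion pattern."""
    # The base URL is described as a string that archived resources contain.
    if base_url not in url:
        return False
    # Assumption: an exclusion pattern applies if it matches anywhere in the URL.
    return not any(re.search(pattern, url) for pattern in exclude_patterns)


# Hypothetical values, for illustration only.
base = "http://www.example.com/"
excludes = [r"\.pdf$", r"/private/"]

for candidate in (
    "http://www.example.com/news/index.html",
    "http://www.example.com/docs/manual.pdf",
    "http://other.example.org/index.html",
):
    print(candidate, should_archive(candidate, base, excludes))
```

In this sketch only the first URL is selected: the second matches an exclusion pattern and the third does not contain the base URL.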

Basic cd_archivemanager_conf.xml example

The following example shows the main XML elements you can configure in the cd_archivemanager_conf.xml configuration file:

<Publications>
    <Publication>
        <InclusionRule/>
    </Publication>
</Publications>

<Capture>
    <DefaultIdentity>
        <Proxy/>
    </DefaultIdentity>
    <Identity>
        <HttpHeader/>
        <Proxy/>
    </Identity>
    <Exclude/>
    <ProtocolProvider/>
</Capture>
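
Because a stray or unclosed tag makes the whole file unreadable, it can help to check the file before deploying it. The following Python sketch, using only the standard library, parses a configuration and reports which of the elements named in this topic are absent. The <Configuration> root element, the function name missing_elements, and the list of paths are assumptions made for illustration; the source does not state which elements are mandatory.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper: the <Configuration> root is an assumption, since the
# example in this topic shows only the <Publications> and <Capture> sections.
CONFIG = """
<Configuration>
    <Publications>
        <Publication>
            <InclusionRule/>
        </Publication>
    </Publications>
    <Capture>
        <DefaultIdentity>
            <Proxy/>
        </DefaultIdentity>
        <Identity>
            <HttpHeader/>
            <Proxy/>
        </Identity>
        <Exclude/>
        <ProtocolProvider/>
    </Capture>
</Configuration>
"""

# Element paths taken from this topic; treating them as expected is illustrative.
EXPECTED = [
    "Publications/Publication",
    "Publications/Publication/InclusionRule",
    "Capture/DefaultIdentity",
    "Capture/Identity",
    "Capture/Exclude",
    "Capture/ProtocolProvider",
]


def missing_elements(xml_text, paths):
    """Parse xml_text and return the paths that are not present under the root."""
    # ET.fromstring raises xml.etree.ElementTree.ParseError on malformed XML.
    root = ET.fromstring(xml_text)
    return [path for path in paths if root.find(path) is None]


print(missing_elements(CONFIG, EXPECTED))
```

An empty list means every listed element was found; a ParseError means the file is not well-formed XML, for example because a closing tag has no matching opening tag.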