
Capturing

Capturing determines which content the Archive Manager archives and how it collects that content. The Archive Manager captures and archives Uniform Resource Locators (URLs).

These URLs include HTML, JSP, ASP, and ASP.NET Pages, and all other files ("artifacts") used to compose Web pages. The artifacts include, for example, CSS, JS, XML, and XSL files, as well as all MIME types supported by the Content Manager, such as GIFs, JPEGs, and MPEGs.

The capture process retrieves a list of all affected URLs whenever content is published, unpublished, or republished. The affected URLs are all resources that have changed as a result of the publishing action. Capturing also covers the following cases:

  • Archiving Pages when a Component is published: all Pages on which the Component is displayed are archived. This includes Dynamic Component Presentations in which the Components are statically included on the Page.
  • Archiving Pages regardless of deployment order: if a Page is captured and archived before one or more of its artifacts has been published, the Page is rearchived when the linked artifact or resource is eventually published (but only if the resource changes the Page).
  • Archiving Pages if a dynamic link changes: for example, if the priority or directory location changes, or if the Component links to a Component that has not yet been published (so that only the link text is displayed).
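
The first two cases above can be pictured as a dependency lookup: each Page records the Components and artifacts it uses, and publishing any of them yields the Pages to capture. The following is a minimal sketch of that idea, not the Archive Manager's actual implementation. The class `DependencyIndex`, its methods, and the identifiers such as `component:42` are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class DependencyIndex:
    """Hypothetical map from a published item to the Pages that use it."""

    def __init__(self):
        # item identifier -> set of Page URLs that display or link to it
        self._pages_using = defaultdict(set)

    def add_page(self, page_url, depends_on):
        """Record that page_url uses each item in depends_on."""
        for item in depends_on:
            self._pages_using[item].add(page_url)

    def affected_urls(self, published_item):
        """Return the sorted URLs to capture when published_item is published."""
        urls = set(self._pages_using.get(published_item, ()))
        # Assumption: identifiers starting with "/" are themselves URLs
        # (Pages or artifacts), so the item itself is a changed resource.
        if published_item.startswith("/"):
            urls.add(published_item)
        return sorted(urls)


index = DependencyIndex()
index.add_page("/news/index.html", ["component:42", "/css/site.css"])
index.add_page("/about.html", ["component:42"])

# Publishing the Component captures every Page that displays it.
print(index.affected_urls("component:42"))
# Publishing an artifact later recaptures the Page that links to it.
print(index.affected_urls("/css/site.css"))
```

The sketch recaptures a Page whenever one of its artifacts is published. It does not model the condition that rearchiving happens only if the resource actually changes the Page, nor the dynamic-link case.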

Supported MIME types

The Archive Manager supports archiving all resources of a Web site that is built with the major Web technologies, for example HTML, JSP, ASP.NET, and Flash. The following table lists the supported binary file types:

Name                     MIME type                      Possible extensions
Access Database          application/octet-stream       mdb
Bitmap Image             application/octet-stream       bmp
Excel Sheet              application/ms-excel           xls
Executable               application/octet-stream       exe
Flash File               application/x-shockwave-flash  swf
Gif Image                image/gif                      gif
Jpeg Image               image/jpeg                     jpg, jpeg, jpe
MP3 Music                audio/x-mpeg                   mp3
Mpeg Video               video/mpeg                     mpg
PDF Document             application/pdf                pdf
Plain Text               text/plain                     txt
Png Image                image/png                      png
PowerPoint Presentation  application/ms-powerpoint      ppt
QuickTime Movie          video/quicktime                mov, qt
Real Player              video/vnd.rn-realmedia         rm, ram, ra, rv
Rich Text                text/rtf                       rtf
Sound File               application/x-wav              wav
Word Document            application/msword             doc
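
To show how the table can be used, the sketch below resolves the MIME type of a captured URL from its file extension. The mapping is copied from the table above; the names `SUPPORTED_BINARY_TYPES` and `mime_type_for` are hypothetical, and the Archive Manager may determine MIME types in a different way.

```python
import posixpath
from urllib.parse import urlparse

# Extension -> MIME type, transcribed from the table of supported binary types.
SUPPORTED_BINARY_TYPES = {
    "mdb": "application/octet-stream",
    "bmp": "application/octet-stream",
    "xls": "application/ms-excel",
    "exe": "application/octet-stream",
    "swf": "application/x-shockwave-flash",
    "gif": "image/gif",
    "jpg": "image/jpeg",
    "jpeg": "image/jpeg",
    "jpe": "image/jpeg",
    "mp3": "audio/x-mpeg",
    "mpg": "video/mpeg",
    "pdf": "application/pdf",
    "txt": "text/plain",
    "png": "image/png",
    "ppt": "application/ms-powerpoint",
    "mov": "video/quicktime",
    "qt": "video/quicktime",
    "rm": "video/vnd.rn-realmedia",
    "ram": "video/vnd.rn-realmedia",
    "ra": "video/vnd.rn-realmedia",
    "rv": "video/vnd.rn-realmedia",
    "rtf": "text/rtf",
    "wav": "application/x-wav",
    "doc": "application/msword",
}


def mime_type_for(url):
    """Return the table's MIME type for url, or None if the extension is not listed."""
    path = urlparse(url).path  # drops any query string or fragment
    extension = posixpath.splitext(path)[1].lstrip(".").lower()
    return SUPPORTED_BINARY_TYPES.get(extension)


print(mime_type_for("http://example.com/img/logo.JPG?v=2"))
print(mime_type_for("/downloads/brochure.pdf"))
print(mime_type_for("/docs/readme"))
```

The lookup is case-insensitive and ignores query strings. A URL whose extension is not in the table returns `None` rather than a guessed type.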