Troubleshooting Archive Manager capture process
This topic describes limitations in the Archive Manager capture process.
- URLs used in JavaScript files or script segments of an HTML page
- URLs used in JavaScript files or script segments of an HTML page are not automatically included in the capture process.
- Redirection performed client-side using JavaScript code is not captured.
-
If a page contains a client-side redirect, the original page obtained from the Web server is the one archived, not the redirect target. For example, if redirect.jsp contains the following, redirect.jsp is archived and not the Google home page:
<SCRIPT LANGUAGE="JavaScript"> window.location="http://www.google.com/"; </SCRIPT>
- URL redirects
- URLs that are redirected using an HTTP status code are archived using the redirected URL rather than the original one.
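Because client-side redirects are not followed by the capture process, it can help to find pages that contain one before relying on the archive. The following Python sketch is a heuristic illustration only and is not part of Archive Manager; the function name `find_client_side_redirects` and the regular expression are assumptions made for this example. It detects only simple string assignments to `location`, such as the one in redirect.jsp above.

```python
import re
from typing import List

# Heuristic (an assumption for this sketch): match simple assignments such as
#   window.location="..."   or   location.href = '...'
# Redirects built from variables or issued through function calls are missed.
_REDIRECT_RE = re.compile(
    r"""(?:\b(?:window|document|self|top)\.)?\blocation(?:\.href)?\s*=\s*["']([^"']+)["']""",
    re.IGNORECASE,
)


def find_client_side_redirects(html: str) -> List[str]:
    """Return the target URLs of simple JavaScript redirects found in html."""
    return _REDIRECT_RE.findall(html)


sample = '<SCRIPT LANGUAGE="JavaScript"> window.location="http://www.google.com/"; </SCRIPT>'
print(find_client_side_redirects(sample))
```

A page for which this function returns a non-empty list is archived as served, so the archive shows the redirecting page rather than its target.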
- Nested resources such as Flash, Word, and PDF
-
When a document with a "text/html" or "text/css" content type is captured, the process automatically recaptures the embedded or nested resources used by the document, such as CSS imports, JavaScript files, and images, when these pages are republished. Other document formats require additional work to archive these resources.
- Nested resources for documents of content types "text/html" or "text/css"
-
When capturing nested resources for documents of content types "text/html" or "text/css", the process includes only embedded resources and not linked resources (hyperlinks).
- Web applications
- Parts of a Web site that run as a Web application, such as site search and form data, are not supported.
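The distinction drawn earlier between embedded and linked resources in a "text/html" document can be made concrete with a small sketch. This Python example is illustrative only and does not reproduce the capture process; the class name `ResourceClassifier`, the function `classify_resources`, and the choice of which tags count as embedded (`img`, `script`, and stylesheet `link` elements) are assumptions made for this example.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class ResourceClassifier(HTMLParser):
    """Collect embedded resources and hyperlinks from an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.embedded: List[str] = []  # resources loaded as part of the page
        self.linked: List[str] = []    # hyperlinks to other pages

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)  # HTMLParser lowercases tag and attribute names
        if tag in ("img", "script") and attr.get("src"):
            self.embedded.append(attr["src"])
        elif tag == "link" and attr.get("href"):
            # Assumption: only stylesheet links are treated as embedded.
            if "stylesheet" in (attr.get("rel") or "").lower().split():
                self.embedded.append(attr["href"])
        elif tag == "a" and attr.get("href"):
            self.linked.append(attr["href"])


def classify_resources(html: str) -> Tuple[List[str], List[str]]:
    """Return (embedded, linked) resource URLs in document order."""
    parser = ResourceClassifier()
    parser.feed(html)
    parser.close()
    return parser.embedded, parser.linked


page = (
    '<html><head><link rel="stylesheet" href="/css/site.css">'
    '<script src="/js/menu.js"></script></head>'
    '<body><img src="/img/logo.png"/>'
    '<a href="/products/index.html">Products</a></body></html>'
)
embedded, linked = classify_resources(page)
print("embedded:", embedded)
print("linked:", linked)
```

In this sketch, only the entries in `embedded` correspond to the kind of resource that is captured along with the page; the entry in `linked` is a hyperlink and would not be.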
- Initial client page load data
- Only data that is available during the initial client load of the page is captured.
- Application Server caching
- Application Server caching functionality affects the correct archiving of undeployment actions. Specifically, it affects the availability of a given URL.
- Only the rendered output for dynamic pages and not the original file is archived.
-
Although the capture process supports archiving dynamic pages that result from code execution (such as .jsp, .aspx, and .asp pages), only the rendered output for these pages is archived, not the original file.
- Web sites that use more than one protocol require additional configuration.
- Solution: Web sites are archived using a BaseUrl. If your Web site uses more than one protocol, for example because parts of your site are secured using HTTPS, you need to archive that section separately by defining a separate BaseUrl. For example:
<Publications BaseUrl="http://www.mycompany.com:8080">
  <Publication Id="1"/>
  <Publication Id="1" BaseUrl="https://www.mycompany.com/CustomerSupport:8081"/>
</Publications>
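To decide how many separate BaseUrl entries a site might need, one approach is to group the site's page URLs by protocol, host, and port. The Python sketch below is a rough aid under stated assumptions and is not part of Archive Manager: the function name `base_urls_needed` and the sample page URLs are hypothetical, and the sketch treats a base URL as scheme, host, and port only, without any path segment.

```python
from typing import Iterable, List
from urllib.parse import urlsplit


def base_urls_needed(urls: Iterable[str]) -> List[str]:
    """Return the distinct scheme://host:port prefixes, in first-seen order."""
    bases: List[str] = []
    for url in urls:
        parts = urlsplit(url)
        base = f"{parts.scheme}://{parts.netloc}"  # netloc includes the port
        if base not in bases:
            bases.append(base)
    return bases


# Hypothetical page URLs for a site that mixes HTTP and HTTPS.
site_urls = [
    "http://www.mycompany.com:8080/index.html",
    "http://www.mycompany.com:8080/products/a.html",
    "https://www.mycompany.com:8081/CustomerSupport/login.html",
]
print(base_urls_needed(site_urls))
```

Each distinct prefix returned is a candidate for its own BaseUrl definition; a site served entirely over one protocol, host, and port yields a single entry.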