By default, the IQ Index Service indexes PDF documents, Microsoft Word documents, Microsoft Excel documents, Microsoft PowerPoint documents, and OpenDocument Text documents. You can change which binary files with textual content are indexed by editing deployer-conf.xml, the configuration file of your combined Content Deployer or Content Deployer worker. In this file, you can specify configuration strings either as hardcoded values or as parameters.
Procedure
- On your Content Delivery server environment, access the configuration location of your Content Deployer worker or, if your worker is combined with the endpoint, the configuration location of your combined Content Deployer.
- Depending on your preference, do one of the following:
  - If you prefer to configure using environment variables, open deployer-conf.xml for viewing.
  - If you prefer to configure by editing the configuration file itself, open deployer-conf.xml for editing.
- Find the Step element that handles content indexing. Its Id attribute is set to IshSearchIndexDeployStep.
- Within this element, find the child element called BinaryIndexing. Its extensions attribute specifies the list of file extensions for the binary files it indexes. The default list is:
| File extension | File type |
|---|---|
| pdf | Adobe PDF (Portable Document Format) document |
| doc | Microsoft Word document (before the 2007 version) |
| docx | Microsoft Word document (from the 2007 version onward) |
| xls | Microsoft Excel document (before the 2007 version) |
| xlsx | Microsoft Excel document (from the 2007 version onward) |
| ppt | Microsoft PowerPoint document (before the 2007 version) |
| pptx | Microsoft PowerPoint document (from the 2007 version onward) |
- To change the list of file extensions, decide which types of files you want to index, and identify the file extensions associated with those file types. This indexing functionality uses Apache Tika, a toolkit that can work with over a thousand different file types. For more information about supported file types, refer to the Apache Tika Supported Document Formats webpage.
- Ensure that the value of the extensions attribute in deployer-conf.xml is set to your own comma-separated list of file extensions.
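
As a rough illustration of the end result, the fragment below sketches how the relevant part of deployer-conf.xml might look after the change. It is a minimal sketch, not a copy of the shipped file: only the Step element, its Id attribute, the BinaryIndexing child, and the extensions attribute come from the procedure above. Any other attributes or child elements of Step are left out, and the added rtf and epub extensions are example choices only.

```xml
<!-- Illustrative fragment only. Other attributes and child elements
     of Step that exist in your actual file are omitted here. -->
<Step Id="IshSearchIndexDeployStep">
    <!-- Hardcoded value: the seven default extensions from the table,
         followed by rtf and epub as example additions (assumed choices). -->
    <BinaryIndexing extensions="pdf,doc,docx,xls,xlsx,ppt,pptx,rtf,epub"/>
</Step>
```

If the attribute in your file is written as a parameter rather than a hardcoded string, leave the file unchanged and set the corresponding parameter in the environment instead. The parameter name and placeholder syntax are not shown here because they depend on your installation, so check the existing attribute value in your own deployer-conf.xml.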