Content and Property Sheet Data in XML
We have included limited XML indexing and search capabilities in the installed collection named Contenta_home\tools\solr\server\solr\collectionSDL_Contenta_Sample.
This collection includes a schema.xml file that defines a special field, XMLCONTENT, which is used to index a Contenta object’s XML content, if it has any. The object’s content, stripped of XML markup using Tika, is indexed in a separate field defined in schema.xml, CONTENT_EN (where EN is the language code).
The XMLCONTENT field uses XML tokenizers, which are implemented in Java and delivered as a .jar file in the sample collection. The tokenizers are used for both indexing and querying XMLCONTENT. They are delivered in XmlAwareTokenizer.jar, which is located in
solr_home\server\solr\collectionSDL_Contenta_Sample\lib.
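As a rough illustration of how the two content fields might be searched, the sketch below builds standard Solr /select request URLs, one against XMLCONTENT and one against CONTENT_EN. The host, port, and core name are assumptions made for the example and are not taken from the product configuration. The search term is also hypothetical, and the sketch does not escape Solr query special characters or show any query syntax specific to the XML tokenizers.

```python
from urllib.parse import urlencode

# Assumptions for illustration only: adjust to the actual Solr installation.
SOLR_BASE = "http://localhost:8983/solr"   # assumed host and port
CORE = "SDL_Contenta_Sample"               # assumed core/collection name


def build_select_url(field: str, term: str, rows: int = 10) -> str:
    """Return a Solr /select URL that searches one field for one simple term."""
    params = {"q": f"{field}:{term}", "rows": rows, "wt": "json"}
    return f"{SOLR_BASE}/{CORE}/select?{urlencode(params)}"


# Search the XML-aware field and the plain-text field for the same term.
xml_url = build_select_url("XMLCONTENT", "torque")
text_url = build_select_url("CONTENT_EN", "torque")
print(xml_url)
print(text_url)
```

The only difference between the two requests is the field name. Which one is appropriate depends on whether the search should take XML structure into account or should run against the text extracted by Tika.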
S1000D property data is indexed in fields named after the property data field, with letters converted to uppercase and each space replaced with an underscore:
- APPLIC_CROSSREF_TABLE_REF
- BUSINESS_RULES_REFERENCE
- COUNTRY_ISO_CODE
- DATA_MODULE_CODE
- DISASSEMBLY_CODE
- DISASSEMBLY_CODE_VARIANT
- DOCUMENT_TYPE
- INFORMATION_CODE
- INFORMATION_CODE_VARIANT
- INFORMATION_NAME
- ISSUE_DATE
- ISSUE_INWORK_NUMBER
- ISSUE_NUMBER
- ISSUE_TYPE
- ITEM_LOCATION_CODE
- LANGUAGE_ISO_CODE
- LEARN_CODE
- LEARN_EVENT_CODE
- MIME_TYPE
- MODEL_IDENTIFICATION_CODE
- NAME
- ORIGINATOR_CODE
- ORIGINATOR_NAME
- PUBLICATION_MODULE_CODE
- PUBLICATION_MODULE_ISSUER
- PUBLICATION_MODULE_NUMBER
- PUBLICATION_MODULE_TITLE
- PUBLICATION_MODULE_VOLUME
- PUB_ID_NUMBER
- QUALITY_ASSURANCE_VERIFICATION
- RESPONSIBLE_PARTNER_CO_CODE
- RESPONSIBLE_PARTNER_CO_NAME
- S1000D_VERSION
- SECURITY_CAVEAT
- SECURITY_CLASSIFICATION
- SECURITY_COMMERCIAL_CLASS
- SHORT_PUBLICATION_MODULE_TITLE
- SKILL_LEVEL_CODE
- SUBSUBSYSTEM_CODE
- SUBSYSTEM_CODE
- SYSTEM_CODE
- SYSTEM_DIFFERENCE_CODE
- TECHNICAL_NAME
- UNIT_OR_ASSEMBLY_CODE
- VERIFICATION_TYPE
All other property data is stored in dynamic fields. These fields are named using the same rules as the S1000D property data listed above, except that _S is appended to the field name. For example, the property data named Access Level is indexed in a field named ACCESS_LEVEL_S.
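The naming rules above can be sketched as a small helper that maps a property data name to the field it would be indexed in. This is a minimal sketch, not the product's implementation: the function name solr_field_name is hypothetical, and the input "Data Module Code" is an assumed display name, obtained by reversing the rule for the listed field DATA_MODULE_CODE.

```python
# Field names copied from the S1000D list above.
S1000D_FIELDS = frozenset("""
APPLIC_CROSSREF_TABLE_REF BUSINESS_RULES_REFERENCE COUNTRY_ISO_CODE
DATA_MODULE_CODE DISASSEMBLY_CODE DISASSEMBLY_CODE_VARIANT DOCUMENT_TYPE
INFORMATION_CODE INFORMATION_CODE_VARIANT INFORMATION_NAME ISSUE_DATE
ISSUE_INWORK_NUMBER ISSUE_NUMBER ISSUE_TYPE ITEM_LOCATION_CODE
LANGUAGE_ISO_CODE LEARN_CODE LEARN_EVENT_CODE MIME_TYPE
MODEL_IDENTIFICATION_CODE NAME ORIGINATOR_CODE ORIGINATOR_NAME
PUBLICATION_MODULE_CODE PUBLICATION_MODULE_ISSUER PUBLICATION_MODULE_NUMBER
PUBLICATION_MODULE_TITLE PUBLICATION_MODULE_VOLUME PUB_ID_NUMBER
QUALITY_ASSURANCE_VERIFICATION RESPONSIBLE_PARTNER_CO_CODE
RESPONSIBLE_PARTNER_CO_NAME S1000D_VERSION SECURITY_CAVEAT
SECURITY_CLASSIFICATION SECURITY_COMMERCIAL_CLASS
SHORT_PUBLICATION_MODULE_TITLE SKILL_LEVEL_CODE SUBSUBSYSTEM_CODE
SUBSYSTEM_CODE SYSTEM_CODE SYSTEM_DIFFERENCE_CODE TECHNICAL_NAME
UNIT_OR_ASSEMBLY_CODE VERIFICATION_TYPE
""".split())


def solr_field_name(property_name: str) -> str:
    """Map a property data name to its index field name (hypothetical helper).

    Letters are uppercased and spaces become underscores. Names outside the
    S1000D list are treated as dynamic fields and get an _S suffix.
    """
    base = property_name.strip().upper().replace(" ", "_")
    return base if base in S1000D_FIELDS else base + "_S"


print(solr_field_name("Access Level"))      # ACCESS_LEVEL_S
print(solr_field_name("Data Module Code"))  # DATA_MODULE_CODE
```

A helper of this kind could be used when building queries, so that the caller works with property names and does not need to know which properties are indexed in dynamic fields.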