Stop Words list
Stop words are words that are ignored in content during a search. A set of stop word files is installed with Solr in the collectionSDL_Contenta_Sample/conf/lang directory. Stop words are supported in many languages.
Each file name ends in a language code, with no country code, and the files are customizable. For example, stopwords_fr.txt is the stop word file for French.
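The naming convention can be sketched as a small helper. This is only an illustration: the function name `stop_words_file_name` is hypothetical, and the handling of locale strings such as `fr-CA` (dropping the country part) is an assumption based on the statement that the file names carry no country code.

```python
import re


def stop_words_file_name(locale):
    """Return the stop word file name for a language code or locale string.

    Hypothetical helper: drops any country part ("fr-CA" -> "fr") because
    the file names contain only a language code.
    """
    # Keep only the part before the first "-" or "_" separator.
    code = re.split(r"[-_]", locale.strip(), maxsplit=1)[0].lower()
    # Assumption: language codes are two lowercase letters, as in the table.
    if not re.fullmatch(r"[a-z]{2}", code):
        raise ValueError(f"not a two-letter language code: {locale!r}")
    return f"stopwords_{code}.txt"


print(stop_words_file_name("fr"))     # stopwords_fr.txt
print(stop_words_file_name("fr-CA"))  # stopwords_fr.txt
```

The helper only builds a name; whether a file with that name exists depends on what is installed in the lang directory.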
Each file is a list of words. In stopwords_en.txt, the words include "a", "an", "and", "the", "there", "their", and many more. For example, if you search for only "and" in the object content fields, no hits are returned. Stop words are useful when you enter a search such as "the good the bad and the ugly" without putting quotes around the string: the search is limited to the words "good", "bad", and "ugly". Otherwise, nearly every object that has content would be returned in the search results, because each probably contains "and" or "the".
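The effect of stop words on an unquoted search can be sketched in a few lines. This is a simplified illustration, not how Solr implements its filtering: the function names are hypothetical, the sample list is only the subset of words quoted above, and the file layout (one word per line, with `#` starting a comment) is an assumption about the stop word files.

```python
def load_stop_words(text):
    """Parse stop word file contents into a set of lowercase words.

    Assumed layout: one word per line; '#' starts a comment.
    """
    words = set()
    for line in text.splitlines():
        word = line.split("#", 1)[0].strip().lower()
        if word:
            words.add(word)
    return words


def remove_stop_words(query, stop_words):
    """Return the query terms that remain after stop words are dropped."""
    return [term for term in query.lower().split() if term not in stop_words]


# Illustrative subset only; the real stopwords_en.txt contains many more words.
SAMPLE_FILE = """# sample stop words
a
an
and
the
there
their
"""

stop_words = load_stop_words(SAMPLE_FILE)
print(remove_stop_words("the good the bad and the ugly", stop_words))
# ['good', 'bad', 'ugly']
print(remove_stop_words("and", stop_words))
# []
```

The second call shows why a search for "and" alone returns no hits: once the stop word is removed, no terms are left to match.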
The following language codes are supported:
Supported Language Codes
| Language | Code |
|---|---|
| Arabic | ar |
| Armenian | hy |
| Basque | eu |
| Bulgarian | bg |
| Catalan | ca |
| Chinese | zt |
| Czech | cs |
| Danish | da |
| Dutch | nl |
| Finnish | fi |
| French | fr |
| Galician | gl |
| German | de |
| Greek | el |
| Hindi | hi |
| Hungarian | hu |
| Indonesian | id |
| Irish | ga |
| Italian | it |
| Japanese | ja |
| Korean | ko |
| Latvian | lv |
| Norwegian | nn |
| Persian | fa |
| Portuguese | pt |
| Romanian | ro |
| Russian | ru |
| Spanish | es |
| Swedish | sv |
| Thai | th |
| Turkish | tr |