...
Key | Data Type | Description |
---|---|---|
Matching | | |
tokenizer | String | To process the text input, the text is split into individual tokens. Please refer to KEE Tokenizers and Filters for details on the supported tokenizers. |
filters | List | Filters applied together with the tokenizer. Please refer to KEE Tokenizers and Filters for details on the available filters. That section also explains how to create custom filters. |
min_score | Float | The minimum score required for a token to match. 1.0 is a perfect match, 0.0 is no match at all. Use KEE Testing to find the right balance for each use case. Verbose logging or tracing can help with this. Default: 0.9 |
spellfix | Boolean | Allow small spelling mistakes. This allows at most one letter swap, so for example "Apple" and "Appel" will match each other. Default: false |
blacklist | List | A list of entity names to ignore. |
suffix_list | String | The suffix list used to remove common suffixes from the entity names. See the Suffix list section below for details. |
geo_strategy | String | How to deal with geographic names in entity names. |
Keywords | | |
keywords | List | Defines which keywords are added to a Squirro item based on any matching entity. Each entry in the list contains the input file column to write and the keyword name in which to store it. The keyword name can use simple template substitution to build keyword names from the data of the matching row: a field name surrounded by curly brackets is replaced with that field's value. |
parent_keywords | List | The same setting as keywords, applied to parent entities. This recursively processes all parent entities (if any) and sets keywords on the item based on these rules. |
clean_keywords | List | A list of keywords that should be removed from the items before applying the KEE tagging. This is useful when re-running KEE tagging to ensure that old keywords are removed. |
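
The template substitution described for the keywords setting can be illustrated with a small sketch. This is not KEE's own code: the helper names `render_keyword_name` and `build_keywords`, the `(column, keyword name)` tuple layout for the rules, and the sample row are all assumptions made for illustration. It only shows the general idea of replacing `{field}` placeholders with values from the matching row.

```python
import re

# Hypothetical helpers, not part of KEE. They only illustrate the idea of
# "{field}" placeholders being filled from the matching input row.
_FIELD = re.compile(r"\{(\w+)\}")


def render_keyword_name(template, row):
    """Replace each {field} in the template with the row's value for it."""
    def substitute(match):
        field = match.group(1)
        # Unknown fields are left untouched so mistakes stay visible.
        return str(row.get(field, match.group(0)))
    return _FIELD.sub(substitute, template)


def build_keywords(rules, row):
    """Apply (input column, keyword name template) rules to one matching row."""
    keywords = {}
    for column, name_template in rules:
        value = row.get(column)
        if value in (None, ""):
            continue  # nothing to write for this column
        name = render_keyword_name(name_template, row)
        keywords.setdefault(name, []).append(value)
    return keywords


# Assumed sample data: one row of the entity input file.
row = {"name": "Apple Inc.", "type": "company", "country": "US"}
# Assumed rule layout: (input file column, keyword name).
rules = [("name", "{type}"), ("country", "country")]

print(build_keywords(rules, row))
# {'company': ['Apple Inc.'], 'country': ['US']}
```

In this sketch the first rule stores the entity name under a keyword whose name comes from the row's `type` column, so different entity types end up in different keywords without listing each one separately.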
...