Excerpt |
---|
Built-in enrichments are configured through the processing config. It can be used to enable or disable individual enrichments and to supply additional configuration for a step. |
There are two places this configuration can be specified:
...
To set up a processing configuration, specify the processing field in a source's config. The value of that field is itself a dictionary, with the enrichment names as keys.
Enrichments that can be specified include:
Processing Step | Documentation Link |
---|---|
unshorten-link | |
deduplication | Duplicate Detection |
content-augmentation | |
content-conversion | Content Conversion |
language-detection | Language Detection |
boilerplate-removal | Boilerplate Removal |
nearduplicate-detection | Near-Duplicate Detection |
webshot | Webshot |
filtering | Filtering |
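Because the processing field is keyed by enrichment name, a misspelled key is easy to miss. The following is a minimal sketch, not part of the product, that checks a source config (represented here as a plain Python dictionary) against the step names listed in the table above. The helper name `unlisted_steps`, the set `LISTED_STEPS`, and the per-step option `enabled` are assumptions made for illustration; the actual option names and whether the table is exhaustive are not stated here.

```python
# Step names copied from the table above. The table says these "include" the
# available enrichments, so this set may not be exhaustive.
LISTED_STEPS = {
    "unshorten-link",
    "deduplication",
    "content-augmentation",
    "content-conversion",
    "language-detection",
    "boilerplate-removal",
    "nearduplicate-detection",
    "webshot",
    "filtering",
}


def unlisted_steps(source_config: dict) -> list:
    """Return the keys of the `processing` field that are not in LISTED_STEPS."""
    processing = source_config.get("processing", {})
    if not isinstance(processing, dict):
        raise TypeError("'processing' must be a dictionary keyed by enrichment name")
    return sorted(name for name in processing if name not in LISTED_STEPS)


# Hypothetical source config. The option name "enabled" is an assumption,
# used only to give the step a value; it is not taken from the documentation.
example_source = {
    "processing": {
        "deduplication": {"enabled": False},
        "langauge-detection": {},  # deliberate typo to show the check
    }
}

print(unlisted_steps(example_source))  # ['langauge-detection']
```

A check like this only flags keys that do not appear in the table; it does not validate the options inside each step.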
For example, to set up a Twitter source with duplicate detection disabled, use the following configuration:
...