Processing Config

Built-in enrichments are configured through the processing config. It can be used to enable or disable enrichments and to pass additional configuration to an individual step. This configuration can be specified in two places:

Per source / subscription: when creating a new subscription, the processing instructions can be passed in to fine-tune the behavior for that one source.
Per project: a project also has a processing config, which applies to all items coming in for a project.

Source processing config

To set up a processing configuration, specify the processing field in a source's config. The value of that field is itself a dictionary, with the enrichment names as keys and each enrichment's options as values.

Enrichments which can be specified include:

Processing Step	Documentation Link
unshorten-link	Unshorten Link
deduplication	Duplicate Detection
content-augmentation	Content Augmentation
content-conversion	Content Conversion
language-detection	Language Detection
boilerplate-removal	Boilerplate Removal
nearduplicate-detection	Near-Duplicate Detection
webshot	Webshot
filtering	Filtering
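Because the step names are plain dictionary keys, a typo in a name is easy to miss. The following is a minimal sketch of a check that could be run before a config is sent. It is not part of the Squirro SDK: `KNOWN_STEPS` is simply copied from the table above (which may not be exhaustive), and `unknown_steps` is a hypothetical helper name.

```python
# Step names copied from the table above; the real list may be longer.
KNOWN_STEPS = {
    "unshorten-link",
    "deduplication",
    "content-augmentation",
    "content-conversion",
    "language-detection",
    "boilerplate-removal",
    "nearduplicate-detection",
    "webshot",
    "filtering",
}


def unknown_steps(processing):
    """Return the sorted keys of `processing` that are not known step names."""
    return sorted(set(processing) - KNOWN_STEPS)


processing = {
    "deduplication": {"enabled": False},
    "dedupe": {"enabled": False},  # deliberate typo for illustration
}
print(unknown_steps(processing))  # prints ['dedupe']
```

An empty list means that every key matches one of the step names listed above.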

For example, to set up a Twitter source with duplicate detection disabled, use the following configuration:

{
    "query": "Squirro",
    "processing": {
        "deduplication": {
            "enabled": false
        }
    }
}

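With the Python SDK, a subscription with this configuration can be created using the following code snippet:

from squirro_client import SquirroClient

# Connect to the cluster and authenticate with a refresh token.
client = SquirroClient(None, None, cluster='https://next.squirro.net/')
client.authenticate(refresh_token='293d…a13b')

# project_id must hold the ID of the project the source is added to.
# The source config carries the query and the processing instructions.
client.new_subscription(project_id, object_id='default', provider='twitter',
    config={
        'query': 'Squirro',
        'processing': {
            'deduplication': {
                'enabled': False
            }
        }
    })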

The enabled property is available for every built-in enrichment and can be set to true or false. Some enrichments have additional configuration options, which are described on the corresponding documentation page.
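When several enrichments need to be switched on or off, the nested dictionary can be built programmatically. The sketch below assumes only the config layout described on this page. `with_enabled` is a hypothetical helper, not an SDK function, and it keeps any other options already set for a step.

```python
import copy
import json


def with_enabled(config, step, enabled):
    """Return a copy of `config` with `step` switched on or off.

    Other options already set for the step are kept unchanged.
    """
    result = copy.deepcopy(config)
    step_options = result.setdefault("processing", {}).setdefault(step, {})
    step_options["enabled"] = bool(enabled)
    return result


base = {"query": "Squirro"}
config = with_enabled(base, "deduplication", False)
config = with_enabled(config, "webshot", False)

# Show the resulting source config as JSON.
print(json.dumps(config, indent=4))
```

Returning a copy instead of modifying the input leaves `base` untouched, so the same starting config can be reused for several sources.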

Project processing config

Please contact the Squirro team if you want to use project processing configs.