class squirro_client.item_uploader.ItemUploader
class squirro_client.item_uploader.ItemUploader(token=None, project_id=None, project_title=None, object_id=None, source_name=None, source_ext_id=None, cluster=None, client_cls=None, batch_size=None, config_file=None, config_section=None, processing_config=None, steps_config=None, source_id=None, source_secret=None, pipeline_workflow_name=None, pipeline_workflow_id='default', timeout_secs=None, **kwargs)
Item uploader class. Defaults are loaded from the .squirrorc file in the current user’s home directory.
Parameters:
- token – User refresh token.
- project_id – Identifier of the project. Optional, but one of project_id or project_title has to be passed in.
- project_title – Title of the project. The first project found with the given title is used. If two projects with the same title exist, the project being used is not predictable.
- object_id – Identifier of the object.
- source_name – Name of the source.
- source_ext_id – External identifier of the source. Defaults to source_name if not provided.
- cluster – Cluster to connect to. This only needs to be changed for on-premise installations.
- batch_size – Number of items to send in one request. This should be lower than 100, depending on your setup. If set to -1, the optimal batch size is calculated from the items. Defaults to -1.
- bulk_index – If set to True, the cluster is instructed to index data in bulk.
- bulk_index_add_batch_identifier – If set to True, the cluster is instructed to add a batch identifier to each item during bulk indexing.
- bulk_index_add_summary_from_body – If set to True, the cluster is instructed to add the summary from the body during bulk indexing.
- config_file – Configuration file to use. Defaults to ~/.squirrorc.
- config_section – Section of the .ini file to use. Defaults to squirro.
- processing_config – A dictionary of specific instructions used while processing items for the source. Overridden by pipeline workflow name/ID.
- source_id – Source which should be used. If passed in together with source_secret, no source is created.
- source_secret – Source secret to be used with source_id. If passed in together with source_id, no source is created.
- pipeline_workflow_name – Pipeline workflow name. Either the name or the ID needs to be set, otherwise processing_config is used.
- pipeline_workflow_id – Pipeline workflow ID.
Typical usage:
>>> from squirro_client import ItemUploader
>>> uploader = ItemUploader(project_title='My Project',
... token='<your token>')
>>> items = [{'id': 'squirro-item1',
... 'title': 'Items arrived in Squirro!'}]
>>> uploader.upload(items)
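The effect of batch_size can be pictured with a small sketch. The helper below is purely illustrative, not ItemUploader's internal implementation: it shows how a fixed batch_size splits an item list into request-sized chunks.

```python
# Illustrative only: a sketch of how a fixed batch_size splits items
# into separate upload requests. This is NOT ItemUploader's internal code.

def split_into_batches(items, batch_size):
    """Yield successive chunks of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

items = [{'id': 'item-%d' % i, 'title': 'Item %d' % i} for i in range(5)]
batches = list(split_into_batches(items, 2))
# With 5 items and batch_size=2, three requests would be sent: 2 + 2 + 1 items.
```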
Project selection:
The ItemUploader creates a source in your project. The project must exist before the ItemUploader is instantiated.
Source selection:
The source is created or re-used; the parameters above define how the source is named.
Configuration:
The ItemUploader can load its settings from a configuration file. The default section is squirro and may be overridden with the config_section parameter to allow for multiple sources/projects.
Example configuration:
[squirro]
project_id = 2sic33jZTi-ifflvQAVcfw
token = 9c2d1a9002a8a152395d74880528fbe4acadc5a1
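As a sketch of what reading such a file amounts to, the snippet below parses an equivalent configuration with Python's standard configparser. This is an illustration only, not ItemUploader's actual loading code, and the values are the placeholders from the example above.

```python
# Illustrative only: parse a squirro-style .ini section with configparser.
# ItemUploader handles this internally; this just shows the file's shape.
import configparser

config_text = """
[squirro]
project_id = 2sic33jZTi-ifflvQAVcfw
token = 9c2d1a9002a8a152395d74880528fbe4acadc5a1
"""

parser = configparser.ConfigParser()
parser.read_string(config_text)

section = parser['squirro']
project_id = section['project_id']
token = section['token']
```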
upload
upload(items)
Sends items to Squirro.
Parameters:
- items – A list of items. See api_reference_sink_data_format for the item format.
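A minimal sketch of preparing items for upload, using only the id and title fields that appear in the typical-usage example above; any further fields are governed by the sink data format referenced above and are not assumed here.

```python
# Build a list of items in the shape used by the typical-usage example.
# Only 'id' and 'title' appear in this document; other fields follow
# the sink data format and are not assumed here.
items = [
    {'id': 'squirro-item1', 'title': 'Items arrived in Squirro!'},
    {'id': 'squirro-item2', 'title': 'A second item'},
]

# With a configured uploader this would send the items
# (commented out so the sketch stays self-contained):
# uploader.upload(items)
```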