Squirro also distributes some connectors as data loader plugins for maximum flexibility. Please see the section /wiki/spaces/DOWN/pages/54624262 in the downloads section (license required). Some data loader plugins can also be installed from Squirro's repository. See Plugin Repository for more.

Usage

Data loader plugins are created as a Python class (see the DataSource Class reference for details). A plugin can be used in Squirro's user interface, or on the command line with the Data Loader.

When using a data loader plugin in the user interface, it behaves exactly the same way as the built-in connectors. You can configure some options, select the mappings, and save the source.

On the command line, the plugin can be loaded by specifying the --source-script option. For example:

Code Block
squirro_data_load ^
    -v ^
    --cluster %CLUSTER% ^
    --project-id %PROJECT_ID% ^
    --token %TOKEN% ^
    --source-script medline.py ^
    --map-id PMID ^
    --map-title TI

Tutorial

The Data Loader Tutorial section Custom Data Source shows how to create a custom data loader plugin.

Class Reference

See the DataSource Class.

Custom Packages

Data loader plugins sometimes need custom Python packages. These can be provided in the pkg folder, relative to where the data loader is invoked.

The recommendation is to include a requirements.txt file in the folder where the data loader plugin is located. That file lists all the required packages, with one package dependency per line. For example:

Code Block
dateutils
requests

To download these packages into the pkg folder, execute the following command (the pkg folder must exist before calling this command):

Code Block
pip install -r requirements.txt --root pkg

Earlier data loaders need the following download command instead. This may not work in all cases, specifically if files of the type .whl are downloaded. Please contact support if you encounter any issues in this regard.

Code Block
pip install -r requirements.txt --download pkg
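
The preparation steps for the pkg folder can also be scripted. The following is a minimal sketch, assuming the `--root` variant of the command shown above; the helper names `read_requirements` and `build_install_command` are hypothetical and not part of the data loader. It reads the one-package-per-line file, makes sure the pkg folder exists, and returns the pip command as an argument list without running it.

```python
import sys
from pathlib import Path


def read_requirements(path):
    """Return the package entries of a requirements file, one per line."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Skip blank lines and comment lines.
    return [ln.strip() for ln in lines
            if ln.strip() and not ln.strip().startswith("#")]


def build_install_command(requirements="requirements.txt", pkg_dir="pkg"):
    """Create the pkg folder if needed and return the pip command as a list."""
    # The pkg folder must exist before pip is called.
    Path(pkg_dir).mkdir(parents=True, exist_ok=True)
    return [sys.executable, "-m", "pip", "install",
            "-r", str(requirements), "--root", str(pkg_dir)]
```

Passing the returned list to `subprocess.run(..., check=True)` would then perform the download from the folder where the data loader is invoked.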

Key Value Store/Cache

Info

Key-value entries stored in the self.key_value_cache instance have a default TTL and will expire after some time. Use this instance for caching purposes only, not for persistent key-value pairs that you do not want to lose.

Starting with release 2.6.0, data loader plugins have access to a key value store and a key value cache. They are available as self.key_value_store and self.key_value_cache respectively in the data loader plugin class. Further documentation on using the key value store and cache is available under Data loader API for Caching and Custom State Management.

On the command line, --source-script is used in place of the --source-type option. Apart from that, the behaviour is identical: mapping, facet configuration, etc. all work the same way as with the built-in connectors.
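
Because entries in self.key_value_cache expire, plugin code that relies on the cache should treat every lookup as a possible miss and be able to recompute the value. The following is a minimal sketch of that pattern. `ExpiringCache` is a hypothetical stand-in written for this illustration and is not Squirro's implementation; the actual interface of self.key_value_cache is described in the API page referenced in this section. `fetch_with_cache` and its parameters are likewise illustrative.

```python
import time


class ExpiringCache:
    """Hypothetical stand-in for a cache whose entries have a TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = {}

    def set(self, key, value):
        # Store the value together with its expiry time.
        self._data[key] = (value, self._clock() + self._ttl)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries behave exactly like missing ones.
            del self._data[key]
            return default
        return value


def fetch_with_cache(cache, key, compute):
    """Return the cached value, recomputing it when missing or expired."""
    value = cache.get(key)
    if value is None:
        value = compute(key)
        cache.set(key, value)
    return value
```

State that must survive, such as a marker for the last processed record, belongs in the key value store rather than in a cache of this kind.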

Writing plugins

Data loader plugins are implemented as Python classes. The topic is covered in Writing a custom data loader.
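
As a rough illustration of what such a class can look like, here is a minimal sketch. The import path and the method names (`connect`, `disconnect`, `getDataBatch`, `getSchema`, `getJobId`, `getArguments`) are assumptions about the DataSource interface and should be checked against the DataSource Class reference. `ExampleSource` and its hard-coded rows are hypothetical; the field names PMID and TI are borrowed from the mapping options in the command line example.

```python
try:
    # Assumed import path; verify against the DataSource Class reference.
    from squirro.dataloader.data_source import DataSource
except ImportError:
    class DataSource:
        """Stand-in base class so the sketch runs without Squirro installed."""


class ExampleSource(DataSource):
    """Hypothetical plugin that serves a few hard-coded rows."""

    def __init__(self):
        self._rows = []

    def connect(self, inc_column=None, max_inc_value=None):
        # A real plugin would open a file, database, or API connection here.
        self._rows = [
            {"PMID": "1", "TI": "First title"},
            {"PMID": "2", "TI": "Second title"},
            {"PMID": "3", "TI": "Third title"},
        ]

    def disconnect(self):
        # Release whatever connect() acquired.
        self._rows = []

    def getDataBatch(self, batch_size):
        # Yield the rows in lists of at most batch_size items.
        for start in range(0, len(self._rows), batch_size):
            yield self._rows[start:start + batch_size]

    def getSchema(self):
        # Field names that can be used in the mapping options.
        return ["PMID", "TI"]

    def getJobId(self):
        # Identifier for this load job (assumed to distinguish job state).
        return "example-source"

    def getArguments(self):
        # This sketch defines no extra command line arguments.
        return []
```

Saved as a file next to the invocation, a class like this would be passed to the data loader with the --source-script option.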