Squirro also distributes some connectors as data loader plugins for maximum flexibility. See /wiki/spaces/DOWN/pages/54624262 in the downloads section (license required).
Introduction
A data loader plugin is implemented as a Python class; the DataSource Class reference describes the required structure.
The plugin is then loaded into the data loader with the --source-script parameter. For example, on the Windows command prompt:
squirro_data_load ^
    -v ^
    --cluster %CLUSTER% ^
    --project-id %PROJECT_ID% ^
    --token %TOKEN% ^
    --source-script medline.py ^
    --map-id PMID ^
    --map-title TI
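To make the idea of "a plugin is a Python class" concrete, the following is a minimal, self-contained sketch of the general shape such a class might take. It is not the Squirro API: the base class `DataSourceStandIn` and the method names `connect`, `disconnect`, `get_batches`, and `get_schema` are hypothetical placeholders, and the authoritative method names and signatures are those in the DataSource Class reference. The sample records use the field names `PMID` and `TI` only because the command above maps those fields to the item id and title.

```python
class DataSourceStandIn:
    """Stand-in base class (assumption); a real plugin extends Squirro's DataSource."""


class ListSource(DataSourceStandIn):
    """Hypothetical plugin that serves records from an in-memory list."""

    def __init__(self, records):
        self._records = list(records)
        self._connected = False

    def connect(self):
        # A real plugin would open a file or network connection here.
        self._connected = True

    def disconnect(self):
        # A real plugin would release its resources here.
        self._connected = False

    def get_batches(self, batch_size):
        """Yield lists of at most batch_size records."""
        if not self._connected:
            raise RuntimeError("connect() must be called first")
        for start in range(0, len(self._records), batch_size):
            yield self._records[start:start + batch_size]

    def get_schema(self):
        """Return the sorted field names seen across all records."""
        fields = set()
        for record in self._records:
            fields.update(record)
        return sorted(fields)
```

The point of the sketch is the division of labour: the plugin only produces records as dictionaries of fields, while options such as --map-id and --map-title tell the data loader which of those fields to use.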
Tutorial
The Data Loader Tutorial section Custom Data Source shows how to create a custom data loader plugin.
Class Reference
See the DataSource Class.
Custom Packages
Data loader plugins sometimes need custom Python packages. These can be provided in the pkg folder, relative to where the data loader is invoked.
The recommendation is to include a requirements.txt file in the folder where the data loader plugin is located. That file lists all the required packages, with one package dependency per line. For example:
dateutils
requests
To download these packages into the pkg folder, execute the following command (the pkg folder must exist before calling this command):
pip install -r requirements.txt --root pkg
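Because the pkg folder must exist before pip is called, the two steps can be wrapped in a small helper. The following is only a sketch: the function name `build_pip_command` is hypothetical, and it invokes pip as `python -m pip` (an assumption made so that the packages are installed with the same interpreter that runs the script) rather than through the bare `pip` executable shown above.

```python
import sys
from pathlib import Path


def build_pip_command(requirements="requirements.txt", pkg_dir="pkg"):
    """Create pkg_dir if it is missing and return the pip command as a list."""
    # The pkg folder must exist before pip is invoked.
    Path(pkg_dir).mkdir(parents=True, exist_ok=True)
    return [
        sys.executable, "-m", "pip", "install",
        "-r", str(requirements),
        "--root", str(pkg_dir),
    ]
```

Passing the returned list to `subprocess.run(..., check=True)` from the folder that contains requirements.txt would then run the installation and raise an error if pip fails.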
Earlier versions of the data loader need the following download command instead. Unfortunately, that may not work in all cases, specifically if files of the type .whl are downloaded. Please contact support if you encounter any issues in this regard.
pip install -r requirements.txt --download pkg
Key Value Store/Cache
Key-value entries stored in the self.key_value_cache instance have a default TTL and will expire after some time. Use this instance for caching purposes only, not for persistent key-value pairs that you do not want to lose.
Starting with release 2.6.0, data loader plugins have access to a key-value store and a key-value cache. They are available as self.key_value_store and self.key_value_cache respectively in the data loader plugin class. More documentation on using the key-value store and cache is available under Data loader API for Caching and Custom State Management.
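The practical consequence of the TTL is that plugin code must always be prepared for a cache miss. The following sketch illustrates this with a stand-in class: `ExpiringCache` and `cached_lookup` are hypothetical names written for this example, not Squirro's implementation, and the actual interface of self.key_value_cache is described under Data loader API for Caching and Custom State Management.

```python
import time


class ExpiringCache:
    """Illustrative TTL cache; a stand-in, not Squirro's key_value_cache."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be simulated
        self._entries = {}   # key -> (value, expiry time)

    def set(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)

    def get(self, key, default=None):
        entry = self._entries.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            # The entry has outlived its TTL, so treat it as missing.
            del self._entries[key]
            return default
        return value


def cached_lookup(cache, key, compute):
    """Return the cached value for key, recomputing it on a miss or expiry."""
    value = cache.get(key)
    if value is None:
        value = compute(key)
        cache.set(key, value)
    return value
```

Anything that cannot be recomputed this way, such as the position reached in an incremental load, belongs in the persistent store rather than in the cache.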