...

Dependency    Description
cache         Non-persisted cache.
log           A logging.Logger instance from Python's standard logging framework.
requests      Python requests library for executing HTTP requests.

Development Workflow

For developing pipelets, Squirro provides the pipelet command line tool as part of the Toolbox.

Develop

The first step is to create the pipelet. The following examples assume that the pipelet has been written to a file called pipelet.py in the current directory.

Code Block
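title: pipelet.py
from squirro.sdk import PipeletV1

class ModifyTitlePipelet(PipeletV1):
    """Appends a greeting to the title of every item."""

    def consume(self, item):
        # Fall back to an empty string when the item has no title.
        item['title'] = item.get('title', '') + ' - Hello, World!'
        return item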

Validate

On the command line, run the pipelet validate command to check the pipelet code for errors. For example, validation ensures that the pipelet does not import any modules that are disallowed in pipelets. See the section on Dependencies for more information.

Code Block
language: text
pipelet validate pipelet.py
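
To give a feel for the kind of import check mentioned above, the following sketch scans Python source for imports outside an allow-list, using the standard ast module. It is only an illustration: the ALLOWED_MODULES set and the find_disallowed_imports helper are hypothetical, and this is not how pipelet validate is implemented. The actual rules are described in the section on Dependencies.

```python
import ast

# Hypothetical allow-list, for illustration only. The real rules are
# described in the Dependencies section.
ALLOWED_MODULES = {"squirro", "json", "re", "datetime"}


def find_disallowed_imports(source, allowed=ALLOWED_MODULES):
    """Return the sorted top-level module names imported outside `allowed`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            # Not an absolute import statement; nothing to check.
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level not in allowed:
                found.add(top_level)
    return sorted(found)


SAMPLE = "from squirro.sdk import PipeletV1\nimport os\nimport subprocess\n"
print(find_disallowed_imports(SAMPLE))  # ['os', 'subprocess']
```

A check like this only inspects the source text and never executes it, which is why it can be run safely before a pipelet is uploaded.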

Test

The pipelet consume command simulates running a pipelet. For this purpose, the test items need to be present as JSON text files on disk. In the following example, the current directory contains an item.json file with the following content:

Code Block
language: js
title: item.json
{
    "title": "Sample",
    "id": "first_item"
}

To test the pipelet with this test file, use:

Code Block
language: text
pipelet consume pipelet.py -i item.json

This command outputs the items returned by the pipelet:

Code Block
language: text
Loading items...
Loading item.json ...
Loaded.
Consuming item first_item
yielded item
{u'id': u'first_item', u'title': u'Sample - Hello, World!'} 

On top of these manual tests, automated tests can be implemented easily using the usual Python tools such as Nose.
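
As a minimal sketch of such an automated test, the functions below call consume directly and check the result with plain assert statements. Several parts are assumptions made so the sketch is self-contained: the stand-in PipeletV1 base class is only used when the Squirro SDK is not installed, the pipelet is instantiated without arguments, and the class is repeated here although a real test module would import it from pipelet.py.

```python
import json

try:
    from squirro.sdk import PipeletV1
except ImportError:
    # Stand-in base class (an assumption) so the sketch runs without the SDK.
    class PipeletV1(object):
        pass


# Repeated from pipelet.py to keep the sketch self-contained.
class ModifyTitlePipelet(PipeletV1):
    def consume(self, item):
        item['title'] = item.get('title', '') + ' - Hello, World!'
        return item


def test_title_is_extended():
    # Same content as the item.json example file.
    item = json.loads('{"title": "Sample", "id": "first_item"}')
    result = ModifyTitlePipelet().consume(item)
    assert result == {"id": "first_item", "title": "Sample - Hello, World!"}


def test_missing_title_defaults_to_empty():
    # An item without a title still gets the suffix appended.
    result = ModifyTitlePipelet().consume({"id": "no_title"})
    assert result["title"] == " - Hello, World!"
```

Saved as, for example, test_pipelet.py (a hypothetical file name), these functions can be picked up by a test runner that collects test_* functions, such as Nose.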

Deploy

Once the pipelet is ready, it can be uploaded to the Squirro server. The pipelet upload command achieves that:

Code Block
language: text
pipelet upload --token <your_token> --cluster <cluster> pipelet.py "Hello World"

This makes the pipelet available under the name "Hello World". To update the pipelet code on the server, re-run this command at any time.

To use the pipelet in a project, open the Enrichments tab in the Squirro user interface and press "Add Enrichment". In the resulting dialog, select the pipelet from the drop-down menu.

Processing old items

Pipelets are only run for items that are processed in the system after the enrichment has been configured. For information on how to process old items with a pipelet, see Rerunning a Pipelet.