
Pipelets are plugins to the Squirro pipeline, used to customize the data processing.


...

As its name says, it does nothing but return the item unchanged.

The item is a Python dict and can be modified before it is returned. The available item fields are documented in the Item Format reference. The following example illustrates modifying an item:

Code Block
languagepy
from squirro.sdk import PipeletV1
 
class ModifyTitlePipelet(PipeletV1):
    def consume(self, item):
        item['title'] = item.get('title', '') + ' - Hello, World!'
        return item

This pipelet modifies each item it processes, appending the string " - Hello, World!" to the title. All of the item's fields can be modified in the same way.
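Because the item is a plain dict, the pipelet's behaviour can be checked locally by calling consume directly. The following is a minimal sketch, not part of the SDK: it assumes a stand-in PipeletV1 base class so that it runs without the Squirro SDK installed, and the sample items are hypothetical.

```python
class PipeletV1:
    """Stand-in for squirro.sdk.PipeletV1 (assumption, for local runs only)."""


class ModifyTitlePipelet(PipeletV1):
    def consume(self, item):
        # item.get() supplies an empty title when the field is missing,
        # so the pipelet does not raise a KeyError on such items.
        item['title'] = item.get('title', '') + ' - Hello, World!'
        return item


pipelet = ModifyTitlePipelet()

# Hypothetical sample items: one with a title, one without.
with_title = pipelet.consume({'title': 'Quarterly report'})
without_title = pipelet.consume({'body': 'No title here'})

print(with_title['title'])
print(without_title['title'])
```

Note that the dict is modified in place and the same object is returned, so a caller that still holds the original item sees the changed title as well.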

Returning multiple items

The pipelet is always called for each item individually. In some use cases, however, the pipelet should return multiple items rather than just one. In those cases, use the Python yield statement to return each individual item. For example:
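As an illustration of the yield approach, the sketch below splits one item into several, one per paragraph of its body. It is a sketch under stated assumptions rather than a reference implementation: PipeletV1 is a local stand-in for the SDK class, SplitParagraphsPipelet is a hypothetical name, and the item is assumed to carry its text in a body field with paragraphs separated by blank lines.

```python
class PipeletV1:
    """Stand-in for squirro.sdk.PipeletV1 (assumption, for local runs only)."""


class SplitParagraphsPipelet(PipeletV1):
    """Hypothetical pipelet that yields one item per paragraph of the body."""

    def consume(self, item):
        # Assumption: paragraphs are separated by a blank line.
        raw = item.get('body', '').split('\n\n')
        paragraphs = [p.strip() for p in raw if p.strip()]

        for index, paragraph in enumerate(paragraphs, start=1):
            # Shallow copy, so each yielded item is an independent dict.
            new_item = dict(item)
            new_item['title'] = '{0} ({1}/{2})'.format(
                item.get('title', ''), index, len(paragraphs))
            new_item['body'] = paragraph
            yield new_item


# Hypothetical sample item with two paragraphs.
source = {'title': 'Notes', 'body': 'First paragraph.\n\nSecond paragraph.'}
results = list(SplitParagraphsPipelet().consume(source))

for result in results:
    print(result['title'], '|', result['body'])
```

Because consume contains a yield statement, calling it returns a generator. In this sketch an item with an empty body yields nothing at all, so a real pipelet would need to decide whether such items should be passed through instead.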

...