Each Ingester Service process forwards items, together with their associated pipelet-configuration, to the Plumber Service, where the Pipelets are executed.

  • pipelet-configuration: the Pipelet to be run and its configuration

    • Pipelet Configuration is stored within the pipeline_workflow

Service Configuration

Ingester Service

  • The Ingester can spawn multiple processes

  • Each Ingester process splits a batch into N mini-batches to allow parallelisation and increase throughput.
    These mini-batches are sent concurrently to the Plumber Service by a ThreadPool that maintains step_plumber_mini_batch_threads threads.

Code Block
$ /etc/squirro/ingester.ini
[ingester]
processors = 2

[pipeline]
step_plumber_mini_batch_threads = 2
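
The mini-batch handling described above can be pictured with the following minimal Python sketch. It is not the Ingester's actual implementation: the function names (split_into_mini_batches, send_to_plumber, process_batch) are hypothetical, the way a batch is split is an assumption, and send_to_plumber is only a stand-in for the real request to the Plumber Service.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_mini_batches(batch, n):
    """Split a batch into at most n contiguous mini-batches of near-equal size.

    Assumption: the real Ingester may split batches differently.
    """
    n = max(1, min(n, len(batch)))
    size, rest = divmod(len(batch), n)
    mini_batches, start = [], 0
    for i in range(n):
        # The first `rest` mini-batches take one extra item each.
        end = start + size + (1 if i < rest else 0)
        mini_batches.append(batch[start:end])
        start = end
    return mini_batches


def send_to_plumber(mini_batch):
    """Hypothetical stand-in for the call that sends a mini-batch to the Plumber."""
    return [f"processed:{item}" for item in mini_batch]


def process_batch(batch, n_mini_batches, threads):
    """Send all mini-batches concurrently using a pool of `threads` threads."""
    mini_batches = split_into_mini_batches(batch, n_mini_batches)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map returns results in the order the mini-batches were submitted.
        results = list(pool.map(send_to_plumber, mini_batches))
    return [item for mini_batch in results for item in mini_batch]
```

In this sketch, the threads argument plays the role of step_plumber_mini_batch_threads: it caps how many mini-batches a single Ingester process has in flight at the same time.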

Plumber Service

  • With the example configuration above, the Plumber Service should spawn 4 workers, so that enough workers are always ready to handle the mini-batches sent by the Ingester processes:
    ingester.processors x ingester.pipeline.step_plumber_mini_batch_threads = 2 x 2 = 4 = plumber.server.max_spare

Code Block
$ /etc/squirro/plumber.ini
[server]
fork = true
max_spare = 4 
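
The sizing rule above can be checked with a short Python sketch. It assumes the ingester.ini layout shown earlier on this page; the helper name required_plumber_workers is hypothetical and not part of Squirro, and the configuration is read from an inline string rather than from /etc/squirro/ingester.ini so that the example is self-contained.

```python
import configparser

# Inline copy of the example ingester.ini from this page.
INGESTER_INI = """
[ingester]
processors = 2

[pipeline]
step_plumber_mini_batch_threads = 2
"""


def required_plumber_workers(ini_text):
    """Return processors x step_plumber_mini_batch_threads (hypothetical helper)."""
    cfg = configparser.ConfigParser()
    cfg.read_string(ini_text)
    processors = cfg.getint("ingester", "processors")
    threads = cfg.getint("pipeline", "step_plumber_mini_batch_threads")
    return processors * threads


print(required_plumber_workers(INGESTER_INI))  # prints 4
```

The result is the value to set as max_spare in the [server] section of plumber.ini.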

PDF: IngesterProcess_Plumber.pdf


This page can now be found at Scaling Pipelet Execution on the Squirro Docs site.