Each Ingester Service process forwards items and their associated pipelet-configuration
to the Plumber Service, where the Pipelets are executed.
pipelet-configuration
: The Pipelet to be run and its configuration. The Pipelet configuration is stored within the pipeline_workflow.
Service Configuration
Ingester Service
The Ingester can spawn multiple processes.
Each Ingester process splits a batch into N mini-batches to allow parallelisation and increase throughput.
Those mini-batches are handled and sent concurrently to the Plumber Service, using a ThreadPool that maintains step_plumber_mini_batch_threads threads.
Code Block |
---|
$ /etc/squirro/ingester.ini
[ingester]
processors = 2
[pipeline]
step_plumber_mini_batch_threads = 2
|
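The mini-batching described above can be sketched roughly as follows. This is only an illustration, not Squirro's implementation: the function names `split_into_minibatches`, `send_to_plumber`, and `process_batch` are hypothetical, the call to the Plumber Service is replaced by a local stub, and Python's standard `ThreadPoolExecutor` is assumed as the thread pool.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_minibatches(items, n):
    """Split items into at most n roughly equal, contiguous mini-batches."""
    if n < 1:
        raise ValueError("n must be >= 1")
    size, remainder = divmod(len(items), n)
    minibatches = []
    start = 0
    for i in range(n):
        # The first `remainder` mini-batches receive one extra item.
        end = start + size + (1 if i < remainder else 0)
        if end > start:  # skip empty mini-batches when len(items) < n
            minibatches.append(items[start:end])
        start = end
    return minibatches


def send_to_plumber(minibatch):
    """Hypothetical stand-in for the request sent to the Plumber Service."""
    return [f"processed:{item}" for item in minibatch]


def process_batch(items, n_minibatches, threads):
    """Send mini-batches concurrently using a pool of `threads` threads."""
    minibatches = split_into_minibatches(items, n_minibatches)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # map() returns results in the order of the submitted mini-batches.
        results = list(pool.map(send_to_plumber, minibatches))
    return [item for result in results for item in result]
```

In this sketch, `threads` plays the role of step_plumber_mini_batch_threads: it caps how many mini-batches a single Ingester process has in flight at once.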
Plumber Service
With the example configuration above, the Plumber Service should spawn 4 workers so that enough resources are always available to handle the mini-batches sent by the Ingester processes at any time
(ingester.processors x ingester.pipeline.step_plumber_mini_batch_threads = plumber.server.max_spare, here 2 x 2 = 4).
Code Block |
---|
$ /etc/squirro/plumber.ini
[server]
fork = true
max_spare = 4
|
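As a quick sanity check of the sizing rule, the worker count can be derived from the Ingester configuration. The sketch below assumes the ini layout shown in the example (sections `[ingester]` and `[pipeline]`); the helper name `required_plumber_workers` is hypothetical, and the configuration is inlined as a string rather than read from /etc/squirro/ingester.ini.

```python
import configparser

# Inline copy of the example ingester.ini from this page.
INGESTER_INI = """
[ingester]
processors = 2

[pipeline]
step_plumber_mini_batch_threads = 2
"""


def required_plumber_workers(ingester_ini_text):
    """Return processors x step_plumber_mini_batch_threads."""
    parser = configparser.ConfigParser()
    parser.read_string(ingester_ini_text)
    processors = parser.getint("ingester", "processors")
    threads = parser.getint("pipeline", "step_plumber_mini_batch_threads")
    return processors * threads


print(required_plumber_workers(INGESTER_INI))  # prints 4
```

The result is the value to use for max_spare in the [server] section of plumber.ini.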
...
This page can now be found at Scaling Pipelet Execution on the Squirro Docs site.