...
New Ingestion Pipeline Editor
...
The pipeline editor has been completely recreated. The new editor is more visual, and provides a much easier overview of the various pipelines in a project.
In addition, we have laid much of the groundwork for re-running pipeline workflows (for data sources running in the frontend only), which makes experimentation during the project setup process easier. For the more technical audience, this is enabled through the underlying configuration options described below. Frontend support for this will follow in upcoming releases.
We have added a new built-in pipeline step “Transform Input” which does the item fields and facets mapping. This was previously done in the dataloader itself but can now be handled in the pipeline. This step is controlled using the configuration option item_transformation_in_pipeline. It is disabled by default and should be considered a beta feature for this release.
We have introduced a new processed directory in the Ingester, where the input data is stored before the pipeline steps are performed. This enables us to keep a copy of the raw data in the processed directory after executing the pipeline, so that the pipeline can be re-run without fetching the data from the original source. This behavior is controlled by the configuration option keep_processed_data, which is also disabled by default.
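As a rough illustration, the two flags described above could be read as shown here. This is only a sketch under stated assumptions: the [ingester] section name, the inline INI text and the read_flags helper are hypothetical and chosen for the example. Only the option names item_transformation_in_pipeline and keep_processed_data come from these notes.

```python
import configparser

# Hypothetical configuration snippet. The section name "ingester" is an
# assumption for illustration; only the two option names are from the notes.
EXAMPLE_INI = """
[ingester]
item_transformation_in_pipeline = true
keep_processed_data = true
"""

FLAG_NAMES = ("item_transformation_in_pipeline", "keep_processed_data")


def read_flags(ini_text, section="ingester"):
    """Return both flags, falling back to False (disabled) when absent."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return {
        name: parser.getboolean(section, name, fallback=False)
        for name in FLAG_NAMES
    }


flags = read_flags(EXAMPLE_INI)
default_flags = read_flags("[ingester]\n")  # nothing set: both stay disabled
```

The False fallback mirrors the statement that both options are disabled by default.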
Offer three Pipeline Workflow presets, a set of pre-made Pipeline Workflows with steps covering various use cases.
Built-in steps can use config_options to render step-specific settings.
Error when configuring source with pipeline steps that come from a DL plugin
Some steps which are already part of a workflow are missing values for certain properties
Pipelet implementing getArguments() is not rendered as 1st class widget -- throws error instead
Pipeline Editor: Displaying issue with the config options of the Near-Duplicate Detection
Pipeline editor: On using scroll arrows in the top left of the editor, the pipelines should scroll instead of jump
Pipeline Editor: Cannot delete workflow which was recently created and saved without reentering the editor
[Pipeline Editor] Step facet properties not being saved after changing value
Pipeline editor - The link under Related in the left panel "Add new Relate steps in the AI Studio" links a user to the 'dashboards' space
[Pipeline] Pipeline edit middle draggable section frontend implementation
Unable to rename step in pipeline workflow
[backend] Enable modification of pipeline step names
Ability to rerun whole pipeline for already processed data
Pre-populate pipeline with default steps
Enable triggering the rerunning of a pipeline workflow on the already processed raw data
Data loader frontend config - booleans break frontend when mappings.json provided
Data loader frontend config - Defaults don't show up
We have also extended the Ingester to automatically remove the input data after a certain time period or disk space threshold to avoid filling up the disk. This is controlled by the configuration options days_to_retain_processed_batches and hours_to_retain_processed_batches. This mechanism kicks in when keep_processed_data is enabled.
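To make the time-based retention idea concrete, here is a minimal sketch of how such a check might look. It is not the Ingester's actual implementation: the batch_expired function is hypothetical, and treating the days and hours options as adding up to a single retention window is an assumption. The disk space threshold is left out because these notes do not name an option for it.

```python
from datetime import datetime, timedelta


def batch_expired(processed_at, now, days_to_retain=0, hours_to_retain=0,
                  keep_processed_data=True):
    """Decide whether a processed batch is older than the retention window.

    Assumption: days_to_retain_processed_batches and
    hours_to_retain_processed_batches combine into one window.
    """
    if not keep_processed_data:
        # The cleanup mechanism only applies when keep_processed_data is on.
        return False
    window = timedelta(days=days_to_retain, hours=hours_to_retain)
    return now - processed_at > window


processed_at = datetime(2024, 1, 1, 0, 0)
now = datetime(2024, 1, 3, 0, 0)  # the batch is 48 hours old

expired_short = batch_expired(processed_at, now, days_to_retain=1, hours_to_retain=12)
expired_long = batch_expired(processed_at, now, days_to_retain=2, hours_to_retain=1)
```

With a 36-hour window the 48-hour-old batch counts as expired, while with a 49-hour window it is still retained.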
In addition, we now offer three different Pipeline Workflow presets designed for various use-cases.
...
We have also added the ability to rename any of the steps in a pipeline workflow to your liking.
Pipelets which hint in their names that they perform Known Entity Extraction are now by default categorized in the “Relate” section of the Pipeline Editor.
...
Project names are visible again in the project sidebar.
Adapt the SQ Dataloader behavior to not kill jobs automatically every 30 mins as long as we can fetch new data.
Fixed an issue where users could not change their avatar picture because the save button was missing.
Project selector (name) is missing if there's only one dashboard
Item detail: x button is misaligned.
[Data Sources] "Add feed source URL" button has a weird circle animation.
Add animation for “Add feed sources” button on the feed dataloader plugin.
Fixed an issue where startup of the topic service would fail to install the saml2 plugin because the pysaml2 dependency could not be installed.
Multiple visual bug fixes around the Cards widget and the item detail view.
Improved exception handling in the feed plugin.
The scan endpoint is no longer limited to 1000 entities per call/iteration.
Fixed the creation of favorites on the dashboard.
Salesforce SDK
Fonts are now downloaded from Salesforce instead of Squirro.
Styles no longer bleed from Squirro to Salesforce.
Fixed an issue where we were unable to open items.
Breaking Changes
With the introduction of the new pipeline editor, the navigation structure in the Setup space has changed. To give the new pipeline editor the full width, all the options that were previously under the Enrich tab have moved to the new AI Studio tab. Rerunning of enrichments will soon be removed from there and moved into the new pipeline editor. As a result, any custom studio plugins under these sections have to be re-uploaded either to one of the existing sections or to the new section called “AI Studio”. This can be achieved by specifying "location": "dss" in the studio_plugin.json file.
Pipeline steps which are part of existing Pipeline workflows should continue to function normally, provided that the included migration script was executed successfully. If something is not correct with an existing step of a workflow (e.g., a Pipelet has no configuration options when it should), please remove the faulty step and add it to the workflow again. It should then work as expected.
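For the studio_plugin.json change mentioned above, a small script along these lines could set the location before re-uploading a plugin. This is a sketch only: the set_plugin_location helper and the "title" key in the sample are hypothetical, and the only detail taken from these notes is the "location": "dss" entry.

```python
import json


def set_plugin_location(plugin_json_text, location="dss"):
    """Return the plugin definition with its "location" key set."""
    config = json.loads(plugin_json_text)
    config["location"] = location
    return json.dumps(config, indent=2)


# Hypothetical minimal plugin definition; a real studio_plugin.json will
# contain other keys, which are preserved unchanged.
example = '{"title": "My custom plugin"}'
updated = set_plugin_location(example)
```

Existing keys are kept as they are, so the same helper can be applied to a plugin definition that already has a different location.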
Fresh Installation Instructions
...