...
You can now install and run Docker on production instances.
A link to the Squirro Monitoring space was added to the Spaces Menu, giving administrative users quick access to view their project's activity and data ingestion logs.
The PDF-OCR step now includes optional Confidence scoring, which can be enabled at some cost to performance.
Added a Data Loader option to delete nargs entries (multi-value fields).
Added a new libNLP step to call spaCy running as a Squirro NLP service.
Created a Binary Documents pipeline for new projects.
Added a HFQuestionAnswering processor that can run Hugging Face question-answering pipelines for inference.
...
Improved the performance of PDF document ingestion for pipeline workflows which include ML models.
Productized the search service. For now, it exposes endpoints that help debug the output of the squirro-query-syntax parsing pipeline.
Added information about sender and recipients to the attachments items in the Exchange data loader plugin.
Added bulk_labeling functions to handle this type of operation.
React widgets will now fetch data only if they are in a visible dashboard section layer.
Moved all steps calling external services (mlflow_maas, endpoint) to the steps subfolder external. For compatibility reasons, the top-level registration was kept, allowing steps to still be addressed via their step name.
Added an endpoint for creating bulk labeling.
Upgraded React to v18.
Clicking on a community typeahead suggestion will now redirect the user to the selected community.
Implemented an NLP step for bulk labeling.
Added config handling for the bulk labeling step.
The previous “PDF Cannot Be Displayed” error message has been replaced with the more generic “This document cannot be displayed” message, and Squirro now makes a better attempt to display a working file link.
Increased the clickable area for the subscribe button in the Communities List widget.
Added a creation date for proximity rules.
Added a link to the Squirro Monitoring space to the Squirro Spaces popover, with the current project selected in the dashboard filter.
Squirro now uses tika-pdf-sentences by default for speedier PDF Sentences Tokenization.
Analyzed query tokens that might contain valid sub-tokens are now additionally rewritten to perform exact phrase matching during query processing. For example: NewYork => ("NewYork"~0 OR NewYork). This enables sub-word matching on New and York individually, relying on the configured SearchAnalyzer (subword-delimiter), while also matching the exact phrase NewYork.
Implemented an ML endpoint for creating an ML job that automatically creates labels.
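
To illustrate the query-token rewrite noted above (NewYork => ("NewYork"~0 OR NewYork)), here is a minimal Python sketch. It is not Squirro's implementation: the function names and the regular expression used to guess whether a token "might contain valid sub-tokens" are hypothetical, and the real behavior depends on the configured SearchAnalyzer.

```python
import re

# Hypothetical heuristic (an assumption, not Squirro's actual rule): a token
# may contain sub-tokens if it has a lower-to-upper case change, a
# letter/digit boundary, or an inner delimiter such as "-", "_" or ".".
_SUBWORD_BOUNDARY = re.compile(
    r"[a-z][A-Z]|[A-Za-z][0-9]|[0-9][A-Za-z]|\w[-_.]\w"
)


def rewrite_token(token: str) -> str:
    """Add an exact-phrase clause to tokens that look like compounds."""
    if _SUBWORD_BOUNDARY.search(token):
        # Keep the original token (so sub-word matching still applies)
        # and OR it with a zero-slop phrase match on the whole token.
        return f'("{token}"~0 OR {token})'
    return token


def rewrite_query(query: str) -> str:
    """Rewrite each whitespace-separated token of a simple query string."""
    return " ".join(rewrite_token(t) for t in query.split())
```

Under these assumptions, rewrite_query("hotels NewYork") returns hotels ("NewYork"~0 OR NewYork), while plain tokens such as hotels pass through unchanged.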
...