The beta is currently not installable while we work out a few kinks related to the Elasticsearch upgrade in the upcoming version.
Changes in the beta repository are shown here as soon as a new version is available.
2018-01-29 13:52:14.554794
Support hjson in assets, e.g. data loader, studio plugins, widgets main configuration files
2018-02-02 14:28:40.233311
No user visible changes
2018-02-09 07:28:07.646470
Always include the Squirro cluster service in RedHat 7 and CentOS 7 deployments including single-node installations
Fixing an issue with proper ordering of pipelets within Pipeline 2.0
Plugin repository Studio plugin - install Squirro plugins such as data loader plugins or pipelets from the UI
2018-02-16 07:04:51.931427
Bugfix: do not stop redis when redis cache is stopped or restarted.
2018-02-16 07:33:27.944698
No user visible changes
2018-02-23 07:07:27.254662
Extend Dataloader API to provide access to a key-value store as well as a key-value cache to the Dataloader plugins.
Including optional nginx monitoring module nginx-module-vts
Fixes to walk-through try.squirro.cloud wizard
Support for Self-Service demo role and walkthrough wizard
2018-03-09 07:09:16.563789
Performance improvements to item deduplication.
Make Topic service start up more robust in the event of failure of packaged studio/dataloader plugins.
Fix for our new server monitoring infrastructure
Be more correct in cleaning empty inputstream directories.
Ensure that pipeline 2.0 speaks the same language as pipeline 1.0
More robust error handling in Pipeline 2.0
Add support for consume_multiple pipelets in the plumber service.
Remove bulk indexing support from Python client. From now on, the new pipeline is used by default; it is as flexible as the old bulk indexing system.
Obsolete the pipeline and the processor service.
Changed plumber API to allow for bulk operations for pipelet execution, also added bulking for cache cleaning step.
Removed pipeline 1.0; pipeline 2.0 is the only choice now.
Improved disk usage by logrotating the /var/log/squirro/*/nginx*.log files
Unit test only fix
Really fix redis server restarting.
Support in the Pipeline 2.0 for granular item fault handling within batched enrichments and ability for batched steps to modify items
2018-03-16 07:07:30.557932
Be more robust when the deduplication step fails because of a restarted redis.
dataloader and SquirroAPI subscription processing-config backward compatibility with pipeline workflows
Enrichment API compatibility on top of pipeline workflows
Fixed the ability to import items with "almost empty" html documents, e.g. those containing only whitespace, comments, or html "processing instructions"
Bug fix to ensure sources relate properly to pipeline workflows upon upgrade
Pipeline workflows
Support for managing built-in enrichments in addition to custom enrichments in the User Interface and the Squirro API via "Pipeline Workflows"
Slow backoff for failing retries in the pipeline.
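As an illustration of the slow backoff for failing retries, the following minimal sketch produces retry delays that grow geometrically up to a cap. The function name and the `base`, `factor`, `cap`, and `attempts` parameters are assumptions made for this example; the pipeline's actual retry schedule is not documented here.

```python
def backoff_delays(base=1.0, factor=2.0, cap=300.0, attempts=6):
    """Yield wait times in seconds that grow geometrically up to a cap.

    All parameter names and defaults are hypothetical, chosen for illustration.
    """
    delay = base
    for _ in range(attempts):
        # Never wait longer than the cap, however often the retry has failed
        yield min(delay, cap)
        delay *= factor


# Example: five retries, starting at 1 second, capped at 10 seconds
delays = list(backoff_delays(base=1.0, factor=2.0, cap=10.0, attempts=5))
```

A caller would sleep for each yielded delay before retrying the failed batch.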
2018-03-23 07:17:23.047508
Improved handling of empty files
Improved Pipeline 2.0 performance of Language Detection and various logging fixes.
Remove pyrebloom package.
Fix plumber handling of item IDs in case of multiple items per source item or pipelet failures.
Simplify deduplication step to use bulk deletion and only support the replace policy based on ID.
Add a machine learning service for running Squirro's machine learning models, initially to enable features such as auto-clustering and recommendations. Currently in alpha.
Upgrade numpy to version 1.14.2
Prevent pipeline 2.0 ingester from generating artificial sq:doc field on retries.
Cleaning up of old and legacy files in Pipeline 2.0 ingester inputstream directories.
Improved parallelism in keyword tagging and the filtering service by having the pipeline 2.0 ingester run multiple batches in parallel
Add action and status endpoints for the ingester, to be used by a Studio plugin in the future.
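The simplified deduplication step in this release (bulk deletion, replace policy based on ID) might be pictured as in the sketch below. The function `plan_replace_by_id` and the item layout (dicts with an `id` key) are hypothetical and only illustrate the idea: stored items whose ID reappears in an incoming batch are removed in one bulk operation, and the incoming items then take their place.

```python
def plan_replace_by_id(batch, existing_ids):
    """Return (ids_to_delete, items_to_index) under a replace-by-ID policy.

    `batch` is a list of dicts with an "id" key; `existing_ids` is the set of
    IDs already stored. Both shapes are assumptions made for this sketch.
    """
    incoming_ids = {item["id"] for item in batch}
    # One bulk deletion covers every stored item that the batch replaces
    ids_to_delete = sorted(incoming_ids & set(existing_ids))
    # Every incoming item is indexed; older copies are gone after the deletion
    return ids_to_delete, list(batch)


to_delete, to_index = plan_replace_by_id(
    [{"id": "a"}, {"id": "c"}], existing_ids={"a", "b"}
)
```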
2018-03-23 07:38:42.898908
No user visible changes
2018-03-29 02:19:56.074580
Subscription API compatibility with old version of Squirro SDK tools
Studio plugins can now use the cache instance of redis by invoking `get_injected('redis_studio_cache')`.
Slightly break the status api of the ingester (returning dicts instead of lists now). Improve robustness of status api, add failed batches truncation.
Enabled pipeline 2.0 failed file reaping at hourly granularity
Accept source processing config alongside workflow_id if processing config contains no changes.
Fix file importer.
Fix to allow enabling Prometheus based monitoring of Squirro
2018-04-06 07:40:23.286320
Fix machine learning install for centos 7
Robustness improvements to install and upgrade
Tolerate services no longer running before removal.
2018-04-13 07:10:19.301226
Fix keyword validation within steps within pipeline 2.0 and workflow for keywords with special characters such as spaces whose type is not basestring.
Remove squirro.service.fileimport
Remove squirro.tools.replay
Handle source-based legacy inputstream files when sources no longer exist. Also improve logging in case there are unforeseen future uncaught exceptions and don't let the file reaper thread die because of them.
Don't auto-opt-in to Noise Removal step on upgrades to Squirro version 2.6.0 and when Pipeline Workflow is not specified, as in the legacy Squirro API
Remove explicit dependencies on JRE. Replaced by checks for JAVA availability in the pre-install and pre-upgrade scripts of squirro-cluster-node and squirro-storage-node.
2018-04-20 11:15:00.070551
Pipelet name handling was fixed; this unbreaks various code dealing with Pipelets.
Remove squirro.api.bulk
Improved logging of pipeline 2.0 ingester and avoiding persisting sq:blocks and sq:bodyhash in retried batches as these are internal fields.
Human readable representation of time-formats returned in the Machine learning Jobs status.
More robust failure handling, with instructions for next steps, for install/upgrades when java is missing.
2018-04-27 07:37:38.529513
Add a forking mode to pipelet (plumber) and filtering services for optional better CPU-bound scaling.
Do not write file contents to disk in the pipeline. This fixes livelocks of the ingester for some failing pipeline workflows.
Optimize performance of the pipeline in case the boilerplate removal is being used (less memory usage).
Provide a simple load balancing scheme in the nginx config.
Add `desyncFromDesktop` to the topic.dashboards database, to be used for supporting mobile dashboards.
Fix reset action of ingester.
2018-05-04 07:13:12.245403
Libnlp shim fixup.
Bug fix for resolving new facet keys.
Add errors_grouped section to the ingester status endpoint output.
Handle special characters in weighted keywords.
2018-05-11 07:13:20.425084
<internal build change>
Add packaging for schema library.
2018-06-11 14:05:03.958925
Fixed query parser for incorrect curly brace usage.
Fix an issue with quoted values for weighted keywords search
Fixes an issue with weighted keywords aggregation combined with non-weighted keywords.
Fix typeahead for weighted keywords.
Fix for error handling for ML service.
Abstract out runner into libnlp
Updating ply to version 3.11 and fixing an issue with complex query parsing under heavy load.
Extend ML jobs api to expose the last run log of the job.
Bump spacy version down to 2.0.11
When batches of some valid and other non-valid items are sent to Squirro, we now process the valid ones instead of rejecting them along with the non-valid items.
Avoid log noise in topic service when redis-server-cache cannot be reached
Less verbose low-value logging
Update version of spacy to fix working of spacy on systems where the CPU instruction set is missing the avx instructions.
Machine learning datasets are broken into `train`, `test`, and `infer`. Added `runtime`, `status`, and `last_result` for machine learning jobs.
Revert python-flup dependency to version 1.0.3.dev-20110405
Custom widgets now require author in addition to description
Provide kill API for running machine learning jobs.
Better default config for ML jobs queuing.
Update hdf5 dependency to a newer version 1.8.20
Ensure that subscription creation, deletion and workflow reassignment refreshes workflow subscription counts.
Fixes recommendations when used in combination with a filter query.
Harden widget migration script to tolerate files where directories are expected
Include subscription count in pipeline workflow API
Fix for typeahead for weighted facets
Clarifying output messages in squirro_asset and squirro_widget tools
Widgets now require a description entry in their configuration to specify their purpose.
Current activity monitoring endpoint for pipeline 2.0 ingester processors
Remove dependency of Machine Learning service on R-core. R-core is still needed by Trend service though.
Adding the ability to do aggregations on weighted keywords. Returns document counts of weighted keyword values independent of probabilities.
Also support reset of email templates.
An optional id parameter can now be supplied for the creation of projects, subscriptions and dashboards. Useful for migrating projects across servers while keeping the same ids
Add hdf5 dependency.
Wait in the ingester until the topicproxy has started.
Fix typo
Centralized validation of Machine learning workflow configs.
Change emailsender/templates from mako files to database backed Jinja2.
Optionally join entities to items after libnlp run
Validate against libnlp schema on machine learning workflow creation.
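The change to mixed batches in this release (valid items are processed even when other items in the same batch are non-valid) can be sketched as a simple partition. The function `split_batch` and the validity rule used in the example (an item needs a non-empty `title`) are hypothetical; the actual validation rules are not described in these notes.

```python
def split_batch(items, is_valid):
    """Partition a batch into (valid, rejected) lists, preserving order."""
    valid, rejected = [], []
    for item in items:
        # Only the non-valid items are rejected; the rest continue on
        (valid if is_valid(item) else rejected).append(item)
    return valid, rejected


def has_title(item):
    # Hypothetical validity rule, used only for this example
    return bool(item.get("title"))


batch = [{"title": "Report"}, {"title": ""}, {"title": "Memo"}]
valid, rejected = split_batch(batch, has_title)
```

Previously, the single item with an empty title would have caused the whole batch to be rejected.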
2018-06-15 07:13:18.185040
Ensure phrases are properly highlighted
Allow for highlighted abstract sizes smaller than 18 characters.
Facet name length restriction is checked before creating a new facet with the topic API.
Fixing an issue with query parsing where a facet value contains an equal sign.
Do not allow changing the email address to the one of another user.
2018-06-22 07:12:38.672949
Support for loading pretrained glove embeddings
For the ingester, log Python warnings into the rotated log files instead of the stderr log.
Add pretrained glove embeddings. Can be installed with `yum install squirro-glove`.
Update minor version of certain dependencies to update Scrapy to the latest version.
SQ-9463: Proper removal of entities upon deletion.
Recommendation explore page aggregates available input features for display chips.
Ensure reserved facets cannot be modified via the client.
Escaping trailing \ and ^ characters in query tokens.
Control install of the dataloader and studio plugins on each restart of the topic service using new flags `install_dataloader_plugins`, `install_studio_plugins`
This fixes an issue where smartfilters could not be created without the squirro_v9 main index.
Correctly show version number in toolbox tools if they are installed as RPM.
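The escaping of trailing \ and ^ characters in query tokens could look roughly like the sketch below. It assumes that a backslash is the escape character and that only the final character of a token is affected; both points, and the function name `escape_trailing`, are assumptions rather than a description of the actual query parser.

```python
def escape_trailing(token):
    """Backslash-escape a trailing backslash or caret in a query token.

    Sketch only: it does not check whether the final character
    has already been escaped.
    """
    if token.endswith(("\\", "^")):
        # Insert a backslash in front of the final character
        return token[:-1] + "\\" + token[-1]
    return token


escaped = escape_trailing("boost^")
```

Tokens that end in any other character are returned unchanged.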
2018-07-06 07:47:22.697975
Remove the deprecated packages during upgrade
Ensure aggregation fields are converted to a list if sent as string.
Only create a new subscription if a non-default pipeline workflow was selected or the subscription does not exist.
Update zookeeper minor version from 3.4.11 to 3.4.12
Fix a bug where Zookeeper did not come up on centos7.
For new installations, disallow duplicate email addresses per tenant also on the DB level.
Allow setting fields = ['*'] to get all fields back in the query api.
Fixing the returned matching sub items in case a query is combined with a facet query.
Fix sorting issue for facets without any values
LD_LIBRARY_PATH values are now resolved automatically after a service restart.
Add pathspec dependency
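The conversion of aggregation fields from a string to a list, mentioned in this release, amounts to a small normalization step. The sketch below assumes that a string stands for a single field name; the function name `as_field_list` and the field names in the example are hypothetical.

```python
def as_field_list(fields):
    """Normalize an aggregation `fields` argument to a list of field names."""
    if fields is None:
        return []
    if isinstance(fields, str):
        # A bare string is treated as one field name (assumption)
        return [fields]
    return list(fields)


single = as_field_list("language")
several = as_field_list(("language", "source"))
```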
2018-07-13 07:12:25.968819
No user visible changes
2018-07-27 07:12:53.879279
Fix Not found error when killing a Machine learning job
Monit files for mysql are not included on centos7
2018-07-27 07:38:25.731915
No user visible changes
2018-08-03 07:15:16.791852
Fix the ingester backlog status computation.
Pretrained glove embeddings for the wikipedia dataset are now also available in 100- and 200-dimensional vectors (in addition to the already available 50-dimensional vectors).
2018-08-03 07:41:35.871019
No user visible changes
2018-08-17 07:15:01.066536
Add selinux policy to allow nginx to read log files.
Delete dataloader sources if there are no subscriptions referencing them anymore.
2018-08-17 07:42:36.946322
No user visible changes
2018-09-07 09:50:15.157821
Simplify daemon/service startup scripts.
Add support for custom splash screen
Optimize running larger direct inference workloads by leveraging the jobs manager.
Expose `studio_plugin` upload option in the help of `squirro_asset` command.
Add support for TF on centos7/rhel7
2018-09-14 07:06:21.951993
No user visible changes
2018-09-14 07:28:36.040712
No user visible changes
2018-09-21 07:05:08.215956
Fix running the topic service in case Java needed an LD_LIBRARY_PATH.
2018-09-21 07:25:29.025881
No user visible changes
2018-09-28 07:05:31.312589
Allow sorting on weighted keywords in the following way: sort:weighted_keyword_fieldname.weighted_keyword_field_value[:asc|desc]
Configuration option to enable the de-duplication of binary files during ingestion of binary data. A potential drawback is that if two different Squirro items reference the same binary file, deleting one item also deletes the binary file for the second item.
Allow boosting (the ^ character) to be used in the query language.
Allow the user query to be arbitrarily transformed by a query transformer class. The transformation happens before query templating in the topic API.
Allow the full Elasticsearch sort dict to be passed in with the Squirro query syntax.
Make content conversion more robust by avoiding intra-step retries.
Also deduplicate intra-batch duplicates.
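To make the weighted keyword sort syntax from this release concrete, the sketch below assembles a clause of the form sort:weighted_keyword_fieldname.weighted_keyword_field_value[:asc|desc]. The helper `weighted_keyword_sort` and the field and value names in the example are hypothetical; only the clause format is taken from the entry above.

```python
def weighted_keyword_sort(field, value, order=None):
    """Build a sort clause for a weighted keyword field and value.

    `order` may be None (omit the direction), "asc", or "desc".
    """
    if order not in (None, "asc", "desc"):
        raise ValueError("order must be 'asc', 'desc', or None")
    clause = "sort:{}.{}".format(field, value)
    if order:
        # The direction suffix is optional in the documented syntax
        clause += ":" + order
    return clause


clause = weighted_keyword_sort("topics", "finance", "desc")
```

The resulting string would be appended to the rest of the query text.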
2018-09-28 07:26:34.657753
No user visible changes
2018-10-12 07:07:22.198174
Reduce the number of ingester workers and processors.
Allow upgrades from installations with broken permissions/packages.
Fix topic migration script 024 that was failing in the past weeks.
Adjust Elasticsearch logging to compress log files and delete them if the total log size exceeds 2GB.
Avoid error message about /etc/squirro/selinux/squirro-nginx-log.te on install/upgrade on CentOS/RHEL 6.
Enforce a longer client-side limit for bulk operations. This keeps the number of batch index operations that time out to a minimum.