Excerpt

We're excited to announce Squirro 2.6.0, released on April 3rd, 2018, based on Elasticsearch 6.2.2 and packed with new features and improvements.

What's in the release?

...

  • Updated to Elasticsearch 6.2.2: see https://www.elastic.co/guide/en/elasticsearch/reference/current/release-notes-6.2.2.html
  • Latest Frontend Dependencies: jQuery 3.3.1, Highcharts 6 and D3 5.0
  • Faster Pipeline: in Squirro 2.5.1 - Birch - Release Notes we introduced Pipeline 2.0. With this release we've made it the default and only data processing Pipeline in Squirro, and we've significantly improved its performance and stability since 2.5.1
  • Faster Tagging: we improved parallelism in keyword tagging and the filtering service by having the pipeline 2.0 ingester run multiple batches in parallel
  • Multiple Pipelets in Parallel: added support for consume_multiple pipelets in the plumber service
  • Faster De-duplication: we improved the performance of item de-duplication, increasing overall pipeline performance where this step is used
  • Pyrebloom Gone: while updating the deduplication service, we removed the pyrebloom package which was no longer required
  • Bulk Indexing Change: we removed bulk indexing support from the Python client. This is now handled automatically by the new pipeline.
  • Improved Self Service Dataloaders: in Squirro 2.5.3 we introduced Self Service dataloaders. We've continued to improve them in this release, adding key / value storage for your dataloaders, help texts, boolean fields, password-protected fields, automatic field-based mapping and quick facet creation pre-filled with the field name.
  • Easier to Configure: all asset configuration files, such as custom data loaders, Studio plugins and widgets now support Hjson, the human friendlier form of JSON.
  • Easier CSV and Excel Uploading: we automatically detect character encoding now, making uploading these types of files pain-free
  • Weighted Keywords (beta): in Squirro 2.5.3 we introduced weighted keywords. These can now be displayed in the UI and Squirro dashboards.
  • Monitoring for nginx: we included an optional nginx monitoring module nginx-module-vts
  • More Robust Topic Server: we made Topic service startup more robust in the event of failure of packaged Studio / dataloader plugins
  • Smarter Disk Usage: we switched to logrotating the /var/log/squirro/*/nginx*.log files
  • Monitoring with Prometheus: it's now possible to monitor Squirro installations with Prometheus.
  • Cluster Service always available: in RedHat 7 and CentOS 7 deployments including single-node installations
  • Copy visibility condition: Copy and paste the visibility condition across different dashboard layers

Bug Fixes

  • Redis is no longer stopped when the Redis cache is reset.
  • Fixed the ability to import items with "almost empty" files and html documents, e.g. those containing only whitespace, comments, or html "processing instructions"
  • Fixed issue with deleting projects on CentOS 6
  • Fixed issue where bulk operations failed silently in case of Elasticsearch errors
  • Fixed issue with language-detection step: AttributeError: 'dict' object has no attribute 'xhtml_utf8'
  • Fixed issue with cleanup step: AttributeError: 'dict' object has no attribute 'clean'
  • Fixed issue where the lower role was still applied when a user group had a higher role than the user itself
  • Fixed issue with inconsistent highlighting color for abstract, item detail and smartfilter explain
  • Fixed issue where dragging a widget to the page boundary did not initiate page scroll
  • And many more small fixes and improvements.
  • Fixed keyword tagging with Elasticsearch 6.2 (added on April 10, 2018 with release 2.6.0-102)
  • Fixed the pipeline workflow over-eagerly enabling the Noise Removal step and fixed pipeline 2.0 file cleanup to handle data files whose sources have been deleted (added on April 10, 2018 with release 2.6.0-102)
  • Logging improvements to pipeline 2.0
  • Added mariadb-server rpm packages and their dependencies to the CentOS 7 mirror for offline install (added on April 17, 2018, Release patch 10)

Installing and Upgrading

...

Fresh Installation Instructions

Please follow the regular installation steps.

Upgrade Instructions

Warning

Please ensure that your current version is the latest patch of 2.5.3. If you are on a version older than 2.5.3, please contact support.
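
The upgrade scripts below check this precondition with the shell's `\<` operator, which compares strings character by character, not as version numbers. It can therefore misjudge versions whose numeric parts differ in length (for example, a build number of 10000 sorts before 4109). A minimal sketch of a version-aware comparison follows; it assumes `sort -V` from GNU coreutils is available. The helper name `version_lt` and the installed version shown are hypothetical, chosen only for illustration.

```bash
# Return success (0) if version $1 is strictly lower than version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Hypothetical installed version, used only for illustration.
INSTALLED="2.5.3-10000"
# Minimum storage node version, taken from the upgrade scripts below.
REQUIRED="2.5.3-4109"

if version_lt "$INSTALLED" "$REQUIRED"; then
    echo "version $INSTALLED is too low"
else
    echo "version $INSTALLED is OK"
fi
```

With a plain string comparison the same pair would be reported as too low, because the character "1" sorts before "4".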

Upgrading to Squirro 2.6.0 involves two major changes that require additional caution and time:

  1. The Pipeline 1.0 service is being fully replaced by Pipeline 2.0. Before the upgrade we recommend that you pause all sources, wait for incoming data to stop arriving, ensure that all keyword taggings have been applied, and only then proceed with the actual upgrade.
  2. Additionally, the upgrade will reindex the v8 Elasticsearch indexes to new indexes with template version v9. This can take hours if your Squirro installation contains a large amount of data.


Tip
Planned Downtime for Re-indexing

Estimating Your Upgrade Window: we roughly estimate that each 100K documents in a Squirro index take about 1 minute to migrate to the latest version. From the command line, you can find out how many documents are in each index by issuing commands like these:

Code Block
$ # Example of output from Elasticsearch on index status
$ curl http://localhost:9200/_cat/indices
green open squirro_v8_w6vldmrdt4qq6pvfphoeaa        SUKcwIRAQVuG_2sunFpbTQ 3 1   87790    336    3.3gb    1.6gb
green open squirro_v8_sdmv1va9qxw4xovqnl-pxw        hI_5ttd6Rcelc_29vB9JwQ 3 1   25241      0      1gb    547mb
green open squirro_v8_ignlqdyqsta1dqutozh0xg        B6NrCemiTceoRxAM5JpJmg 3 1  528962     66   16.4gb    8.2gb

$ # The command below sums the 7th column from above (docs.count) to give the total number of documents across all indexes
$ curl http://localhost:9200/_cat/indices | awk '{sum += $7} END {print sum}'

So estimate the downtime as: downtime >= (total number of documents / 100000) * 60 seconds
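
As a worked example, the sketch below applies this rule of thumb to the three sample indexes from the listing above. The function names `sum_v8_docs` and `estimate_downtime_seconds` are hypothetical helpers, and restricting the sum to indexes whose name starts with `squirro_v8_` is an assumption based on the sample output.

```bash
# Sum the docs.count column (7th field) of the v8 indexes
# found in `_cat/indices` output read from stdin.
sum_v8_docs() {
    awk '$3 ~ /^squirro_v8_/ {sum += $7} END {print sum + 0}'
}

# Apply the rule of thumb: about 60 seconds per 100000 documents, rounded up.
estimate_downtime_seconds() {
    echo $(( ($1 * 60 + 99999) / 100000 ))
}

# Sample data copied from the listing above (no Elasticsearch call is made here).
total=$(sum_v8_docs <<'EOF'
green open squirro_v8_w6vldmrdt4qq6pvfphoeaa SUKcwIRAQVuG_2sunFpbTQ 3 1  87790 336  3.3gb 1.6gb
green open squirro_v8_sdmv1va9qxw4xovqnl-pxw hI_5ttd6Rcelc_29vB9JwQ 3 1  25241   0    1gb 547mb
green open squirro_v8_ignlqdyqsta1dqutozh0xg B6NrCemiTceoRxAM5JpJmg 3 1 528962  66 16.4gb 8.2gb
EOF
)

echo "documents: $total"
echo "estimated minimum downtime: $(estimate_downtime_seconds "$total") seconds"
```

For the sample data this reports 641993 documents and an estimated minimum downtime of 386 seconds, or roughly six and a half minutes. On a live system the same functions could be fed from `curl -s http://localhost:9200/_cat/indices | sum_v8_docs`.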


Note

If you are using Squirro in a Box, additional steps are involved. In this case we also ask you to contact support.


Expand
1. Upgrade Storage Nodes and Cluster Nodes collocated on the same machine/VM


CentOS 6 / RHEL 6


Code Block
# Pause all sources in the user interface

# Ensure the latest 2.5.3 patch release has been applied
STORAGE_NODE_VERSION=$(yum list installed squirro-storage-node | grep squirro-storage-node | sed -e "s/[^ ]* \+//" -e "s/ \+[^ ]*//")
if [ "$STORAGE_NODE_VERSION" \< "2.5.3-4109" ]; then
    echo "SQUIRRO-STORAGE-NODE PACKAGE VERSION $STORAGE_NODE_VERSION TOO LOW - PLEASE UPGRADE TO THE LATEST SQUIRRO 2.5.3 PATCH RELEASE FIRST" 1>&2
    exit 1
fi

CLUSTER_NODE_VERSION=$(yum list installed squirro-cluster-node | grep squirro-cluster-node | sed -e "s/[^ ]* \+//" -e "s/ \+[^ ]*//")
if [ "$CLUSTER_NODE_VERSION" \< "2.5.3-4113" ]; then
    echo "SQUIRRO-CLUSTER-NODE PACKAGE VERSION $CLUSTER_NODE_VERSION TOO LOW - PLEASE UPGRADE TO THE LATEST SQUIRRO 2.5.3 PATCH RELEASE FIRST" 1>&2
    exit 1
fi

for service in $(ls /etc/monit.d/sq*d | sed -e "s|^.*/||" | grep -v "sqclusterd" | grep -v "sqtopicproxyd"); do monit stop $service; done
# wait for `monit summary` to indicate that all but 6 services are stopped
yum update squirro-storage-node-users
yum update elasticsearch
# the following may take a while, so please wait until all index migrations are done
yum update squirro-storage-node
yum update squirro-cluster-node-users
yum update squirro-*
monit monitor all

# Resume the sources paused in the beginning


CentOS 7


Code Block
# Pause all sources in the user interface

# Ensure the latest 2.5.3 patch release has been applied
STORAGE_NODE_VERSION=$(yum list installed squirro-storage-node | grep squirro-storage-node | sed -e "s/[^ ]* \+//" -e "s/ \+[^ ]*//")
if [ "$STORAGE_NODE_VERSION" \< "2.5.3-4109" ]; then
    echo "SQUIRRO-STORAGE-NODE PACKAGE VERSION $STORAGE_NODE_VERSION TOO LOW - PLEASE UPGRADE TO THE LATEST SQUIRRO 2.5.3 PATCH RELEASE FIRST" 1>&2
    exit 1
fi

CLUSTER_NODE_VERSION=$(yum list installed squirro-cluster-node | grep squirro-cluster-node | sed -e "s/[^ ]* \+//" -e "s/ \+[^ ]*//")
if [ "$CLUSTER_NODE_VERSION" \< "2.5.3-4113" ]; then
    echo "SQUIRRO-CLUSTER-NODE PACKAGE VERSION $CLUSTER_NODE_VERSION TOO LOW - PLEASE UPGRADE TO THE LATEST SQUIRRO 2.5.3 PATCH RELEASE FIRST" 1>&2
    exit 1
fi

for service in $(ls /lib/systemd/system/sq*d.service | sed -e "s|^.*/||" | grep -v "sqclusterd" | grep -v "sqtopicproxyd"); do echo "Stopping $service"; systemctl stop $service; done
# the output of the following statement should indicate that all sq*d services but sqclusterd and sqtopicproxyd are stopped:
for service in $(ls /lib/systemd/system/sq*d.service | sed -e "s|^.*/||"); do echo "Status of $service"; systemctl status $service; done

yum update squirro-storage-node-users
yum update elasticsearch
# the following may take a while, so please wait until all index migrations are done
yum update squirro-storage-node
systemctl daemon-reload
yum update squirro-cluster-node-users
yum update squirro-*
for service in $(ls /lib/systemd/system/sq*d.service | sed -e "s|^.*/||"); do echo "Starting $service"; systemctl start $service; done

# Resume the sources paused in the beginning




Expand
2. Upgrade Storage and Cluster Nodes when they are on different servers (and there is only one storage node and one cluster node)

On the one cluster node, shut down most of the Squirro services like so:

CentOS 6 / RHEL 6


Code Block
CLUSTER_NODE_VERSION=$(yum list installed squirro-cluster-node | grep squirro-cluster-node | sed -e "s/[^ ]* \+//" -e "s/ \+[^ ]*//")
if [ "$CLUSTER_NODE_VERSION" \< "2.5.3-4113" ]; then
    echo "SQUIRRO-CLUSTER-NODE PACKAGE VERSION $CLUSTER_NODE_VERSION TOO LOW - PLEASE UPGRADE TO THE LATEST SQUIRRO 2.5.3 PATCH RELEASE FIRST" 1>&2
    exit 1
fi

# Pause all sources in the user interface

for service in $(ls /etc/monit.d/sq*d | sed -e "s|^.*/||" | grep -v "sqclusterd" | grep -v "sqtopicproxyd"); do monit stop $service; done
# wait for `monit summary` to indicate that all but 6 services are stopped


CentOS 7


Code Block
CLUSTER_NODE_VERSION=$(yum list installed squirro-cluster-node | grep squirro-cluster-node | sed -e "s/[^ ]* \+//" -e "s/ \+[^ ]*//")
if [ "$CLUSTER_NODE_VERSION" \< "2.5.3-4113" ]; then
    echo "SQUIRRO-CLUSTER-NODE PACKAGE VERSION $CLUSTER_NODE_VERSION TOO LOW - PLEASE UPGRADE TO THE LATEST SQUIRRO 2.5.3 PATCH RELEASE FIRST" 1>&2
    exit 1
fi

# Pause all sources in the user interface

for service in $(ls /lib/systemd/system/sq*d.service | sed -e "s|^.*/||" | grep -v "sqclusterd" | grep -v "sqtopicproxyd"); do echo "Stopping $service"; systemctl stop $service; done
# the output of the following statement should indicate that all sq*d services but sqclusterd and sqtopicproxyd are stopped:
for service in $(ls /lib/systemd/system/sq*d.service | sed -e "s|^.*/||"); do echo "Status of $service"; systemctl status $service; done
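
Reading the full `systemctl status` output for every service can be tedious. As a possible shortcut on CentOS 7, the sketch below prints only the units that systemd still reports as active, using `systemctl is-active --quiet`. The function name `still_active` is a hypothetical helper, not part of Squirro.

```bash
# Print every unit from the argument list that systemd still reports as active.
still_active() {
    local unit
    for unit in "$@"; do
        if systemctl is-active --quiet "$unit"; then
            echo "$unit"
        fi
    done
}

# Possible usage on the cluster node (shown as comments, not executed here):
#   units=$(ls /lib/systemd/system/sq*d.service | sed -e "s|^.*/||" \
#             | grep -v "sqclusterd" | grep -v "sqtopicproxyd")
#   while [ -n "$(still_active $units)" ]; do sleep 5; done
```

An empty result means every service passed to the function has stopped, so it should be safe to continue with the storage node upgrade.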


Upgrade the one storage node by running:

CentOS 6 / RHEL 6


Code Block
# Ensure the latest 2.5.3 patch release has been applied
STORAGE_NODE_VERSION=$(yum list installed squirro-storage-node | grep squirro-storage-node | sed -e "s/[^ ]* \+//" -e "s/ \+[^ ]*//")
if [ "$STORAGE_NODE_VERSION" \< "2.5.3-4109" ]; then
    echo "SQUIRRO-STORAGE-NODE PACKAGE VERSION $STORAGE_NODE_VERSION TOO LOW - PLEASE UPGRADE TO THE LATEST SQUIRRO 2.5.3 PATCH RELEASE FIRST" 1>&2
    exit 1
fi

yum update squirro-storage-node-users
yum update elasticsearch
yum update squirro-storage-node
# this may take a while, wait until all index migrations are done


CentOS 7


Code Block
# Ensure the latest 2.5.3 patch release has been applied
STORAGE_NODE_VERSION=$(yum list installed squirro-storage-node | grep squirro-storage-node | sed -e "s/[^ ]* \+//" -e "s/ \+[^ ]*//")
if [ "$STORAGE_NODE_VERSION" \< "2.5.3-4109" ]; then
    echo "SQUIRRO-STORAGE-NODE PACKAGE VERSION $STORAGE_NODE_VERSION TOO LOW - PLEASE UPGRADE TO THE LATEST SQUIRRO 2.5.3 PATCH RELEASE FIRST" 1>&2
    exit 1
fi

yum update squirro-storage-node-users
yum update elasticsearch
yum update squirro-storage-node
systemctl daemon-reload
# this may take a while, wait until all index migrations are done
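
To get a rough picture of how far the index migration has progressed, the document counts per index generation can be compared. The sketch below summarizes `_cat/indices` output by the version prefix in the index name. It assumes that migrated indexes are named `squirro_v9_...`, by analogy with the `squirro_v8_...` names shown earlier; the source does not state the new naming, nor whether old indexes are removed afterwards, so treat the output as a hint only. The function name `summarize_generations` is hypothetical.

```bash
# Summarize `_cat/indices` output (read from stdin) per index generation,
# e.g. one line for v8 and one for v9, with index and document counts.
summarize_generations() {
    awk '$3 ~ /^squirro_v[0-9]+_/ {
             split($3, parts, "_")      # parts[2] is "v8", "v9", ...
             indexes[parts[2]]++
             docs[parts[2]] += $7       # 7th field is docs.count
         }
         END {
             for (gen in indexes)
                 printf "%s indexes=%d docs=%d\n", gen, indexes[gen], docs[gen]
         }' | sort
}

# Possible usage on the storage node (shown as a comment, not executed here):
#   curl -s http://localhost:9200/_cat/indices | summarize_generations
```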


Upgrade the one cluster node by running:

CentOS 6 / RHEL 6


Code Block
yum update squirro-cluster-node-users
yum update squirro-*
monit monitor all

# Resume the sources paused in the beginning


CentOS 7


Code Block
yum update squirro-cluster-node-users
yum update squirro-*
for service in $(ls /lib/systemd/system/sq*d.service | sed -e "s|^.*/||"); do echo "Starting $service"; systemctl start $service; done
# wait for the following statement to indicate that all sq*d services are started
for service in $(ls /lib/systemd/system/sq*d.service | sed -e "s|^.*/||"); do echo "Status of $service"; systemctl status $service; done

# Resume the sources paused in the beginning




...