Squirro 2.5.1 - Birch - Release Notes

Released 15th September 2017

Introducing Squirro 2.5.1 - Birch

We're happy to release Squirro version 2.5.1, bringing you all the features we already published as a beta in Squirro 2.5.0 in a stable release, as well as two more exciting new elements: Squirro Studio and Pipeline 2.0...


New Features

A Fresh Design

The new design is finally ready for production use. Since Squirro 2.5.0 we've been working hard to polish the new design to perfection: squashing bugs, fine-tuning the navigation and responding to feedback we received on the Squirro 2.5.0 beta release.

Power Tools for Business Analysts: Squirro Studio

We're excited to introduce Squirro Studio, a new channel that allows us to publish some of the best-of-breed tools we've found useful working on countless Squirro projects. Normally these tools are invoked via the command line, but Squirro Studio allows us to expose them to you directly from the Squirro UI. The tools you'll find in this release include:

  • From the Properties menu:
    • Project Export and Import: easily transfer everything you've set up and configured in one Squirro project to another, making it possible to have "development", "staging" and "live" versions of a project, for example.
    • Project Configuration: create and edit your own project configuration settings. This can be helpful in conjunction with the Dataloader Provider, for example, to pass in values that configure a connection.
  • From the Enrichments menu:
    • Known Entity Extraction: update your KEE configuration directly from the UI.
    • Rerun Enrichments: painlessly re-run enrichments as you develop them, across all the items in a Squirro project or just those matching a specific query.
  • From the Load menu:
    • Reset Project: reset all the dashboards, facets and the index of a project, giving you a clean slate.
  • From the Server menu:
    • Scheduled Tasks: define your own tasks for the Squirro scheduler to execute, using a CRON-like syntax.

...and this is just the beginning. Look out for more tools to make your development life easier in future releases.

Bigger Data: Pipeline 2.0

Internally codenamed "The Ingester", Pipeline 2.0 represents a complete overhaul of our data pipeline, based on all we've learned over the last 4 years processing massive data sets for our customers.

  • Pipeline 2.0 brings Squirro forward by an order of magnitude in terms of the volume of data we can process, from Gigabytes to Terabytes and even Petabytes with the right cluster design. By completely re-architecting how we handle incoming data, we've been able to massively increase performance and throughput.
  • Pipeline 2.0 also affords much greater flexibility in terms of how you organise your data processing steps, allowing you to define custom flows per data source or even dynamically, depending on the data.
  • What's more, Pipeline 2.0 gives you much greater transparency into how your data is being processed. Wondering how many items are remaining in a large batch you're processing? Pipeline 2.0 can tell you.

In this release we've provided Pipeline 2.0 as an optional alternative to older pipeline versions, to give you a chance to test and evaluate it. For more details please consult the Pipeline 2.0 documentation.

And much much more...

Make sure to read the Squirro 2.5.0 release notes. Version 2.5.0 was released as a beta in June 2017, and all of its functionality is now part of the stable 2.5.1 release.

Improvements

  • Easier Trend Management: create, configure and delete your trends and corresponding e-mail alerts directly from the "Trends" menu in the Squirro UI. Trend Detection has also been updated to allow managing trends programmatically.
  • Deeper Insight into Elasticsearch and Zookeeper: we improved the logging from the Elasticsearch client and added Elasticsearch trace logging. We also made the kazoo logs of Zookeeper state more useful.
  • Flexible Loading of Data: we continued to improve the new approach to using the Dataloader Provider. Watch out for exciting announcements there in an upcoming release.


Bug Fixes

  • Numerous bug fixes and improvements to the UI based on feedback on the Squirro 2.5.0 beta release
  • Fixed an issue with cleaning up orphaned data sources where valid sources would also be deleted under some conditions
  • Allow Redis to bind to unreserved ports as well when SELinux is in use
  • Pipeline 1.0 returns the size of the backlog via {'content_backlog': number_of_items_in_the_queue} (added with build 2.5.1-24 on September 19, 2017)
  • Ease-of-upgrade fix ensuring that *.ini.rpmnew files are assigned to the corresponding Squirro service system user (added with build 2.5.1-25 on September 22, 2017)
  • Dashboard user interface fixes based on your early feedback. Thank you, early adopters, for your feedback. (added with build 2.5.1-28 on September 28, 2017)
  • The response to our new User Interface based on Material Design has been phenomenally positive. We added another set of fixes based on your continued feedback. Thank you again. (added with build 2.5.1-30 and -31 on October 5, 2017)
  • Added missing dependency packages for Redhat 6 offline installation (added on December 12, 2017)
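
For the *.ini.rpmnew fix listed above, a quick way to review the result after an upgrade is to list each such file together with its owner. The following is a minimal sketch only: the function name is made up for illustration, and /etc/squirro is an assumed configuration directory that you may need to adjust.

```shell
#!/usr/bin/env bash
# Print "owner path" for every *.ini.rpmnew file below directory $1.
# (list_rpmnew_owners is a hypothetical helper, not part of Squirro.)
list_rpmnew_owners() {
    find "$1" -type f -name '*.ini.rpmnew' -exec stat -c '%U %n' {} +
}

# ASSUMPTION: /etc/squirro is where the service configuration lives.
# Override CONFIG_DIR if your installation differs.
CONFIG_DIR="${CONFIG_DIR:-/etc/squirro}"
if [ -d "$CONFIG_DIR" ]; then
    list_rpmnew_owners "$CONFIG_DIR"
fi
```

Each output line starts with the owning user, so a file still owned by root stands out at a glance.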

Fresh Installation Instructions

Please follow the regular installation steps, but note that these steps contain a special Elasticsearch step specifically for installing Squirro version 2.5.1.

There is an extra step needed for a fresh installation of a storage node for version 2.5.1:

yum install squirro-storage-node-users
yum install elasticsearch
yum install squirro-storage-node

If in doubt, please contact support.

Upgrade Instructions

To upgrade to version 2.4.6 of Squirro, please ensure that your current version is 2.4.3 or higher. This is because the Squirro RPM version number changed from "0.1" to "2.4.4". If you are on a version older than 2.4.3, please contact support.
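
The minimum-version check described above can be scripted. This is a rough sketch, not an official Squirro tool: `version_ge` is a helper defined here, it relies on GNU `sort -V`, and the package name in the comment is an assumption.

```shell
#!/usr/bin/env bash
# Succeeds (exit 0) when version $1 is greater than or equal to version $2.
# Relies on the -V (version sort) option of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# INSTALLED would normally come from the package manager, for example:
#   INSTALLED=$(rpm -q --queryformat '%{VERSION}' squirro-cluster-node)
# ASSUMPTION: the package name above is illustrative; adjust it to your setup.
INSTALLED="${INSTALLED:-2.4.6}"
if version_ge "$INSTALLED" "2.4.3"; then
    echo "OK: $INSTALLED is at least 2.4.3"
else
    echo "Too old: $INSTALLED, please contact support"
fi
```

Version sort is used instead of a plain string comparison so that multi-digit components are ordered numerically.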


Make sure the yum repo file /etc/yum.repos.d/squirro.repo points to version 2.5.1 (not latest).
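
One way to check the repo file is to look for the version string in its baseurl lines. The sketch below assumes that the version appears in the baseurl path, which may not match how your repo file is laid out. `repo_uses_version` is a helper defined here, not a yum command.

```shell
#!/usr/bin/env bash
# Succeeds when a baseurl line in repo file $1 contains the string $2.
# ASSUMPTION: the Squirro version is encoded in the baseurl path.
repo_uses_version() {
    grep -E '^[[:space:]]*baseurl[[:space:]]*=' "$1" | grep -qF -- "$2"
}

REPO_FILE="${REPO_FILE:-/etc/yum.repos.d/squirro.repo}"
if [ -r "$REPO_FILE" ]; then
    if repo_uses_version "$REPO_FILE" "2.5.1"; then
        echo "$REPO_FILE points to 2.5.1"
    else
        echo "$REPO_FILE does not mention 2.5.1 in baseurl; check for 'latest'"
    fi
fi
```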

If you are using Squirro in a Box, further steps are involved. In this case we also ask you to contact support.

From version 2.4.6

1. Upgrade Storage Nodes and Cluster Nodes collocated on the same machine/VM

CentOS 6 / RHEL 6

yum update python27*
yum update squirro-storage-node-users
yum update squirro-storage-node
yum update squirro-cluster-node-users
yum update squirro-*

CentOS 7

cd /lib/systemd/system
for service in $(ls sq*d.service); do echo "Stopping $service"; systemctl stop $service; done
# ensure no python process is running anymore (ps aux | grep python)
yum update python27*

VIRTUALENV_DIR=/opt/squirro/virtualenv
rm -f  ${VIRTUALENV_DIR}/.Python
rm -f  ${VIRTUALENV_DIR}/bin/pip{,2,2.7}
rm -f  ${VIRTUALENV_DIR}/bin/python{,2,2.7}
rm -fr ${VIRTUALENV_DIR}/include/python2.7
rm -f  ${VIRTUALENV_DIR}/lib/python2.7/*
# ignore warnings:
# rm: cannot remove ‘/opt/squirro/virtualenv/lib/python2.7/distutils’: Is a directory
# rm: cannot remove ‘/opt/squirro/virtualenv/lib/python2.7/site-packages’: Is a directory
rm -fr ${VIRTUALENV_DIR}/lib/python2.7/distutils
rm -f  ${VIRTUALENV_DIR}/lib/python2.7/site-packages/easy_install.*
rm -fr ${VIRTUALENV_DIR}/lib/python2.7/site-packages/pip
rm -fr ${VIRTUALENV_DIR}/lib/python2.7/site-packages/pip-*.dist-info
rm -fr ${VIRTUALENV_DIR}/lib/python2.7/site-packages/setuptools
rm -fr ${VIRTUALENV_DIR}/lib/python2.7/site-packages/setuptools-*.dist-info
yum update squirro-python-virtualenv

yum update squirro-storage-node-users
yum update squirro-storage-node
yum update squirro-cluster-node-users
yum update squirro-*
# ensure all python processes are running:
for service in $(ls sq*d.service); do echo "Starting $service"; systemctl start $service; done
# verify that all python processes are indeed running:
for service in $(ls sq*d.service); do echo "Checking $service"; systemctl status $service; done


2. Upgrade Storage Nodes (separate from Cluster Nodes)

Upgrade all storage nodes one at a time by running:

yum update squirro-storage-node-users
yum update squirro-storage-node

3. Upgrade Cluster Nodes (separate from Storage Nodes)

CentOS 6 / RHEL 6

yum update python27*
yum update squirro-cluster-node-users
yum update squirro-*

CentOS 7

cd /lib/systemd/system
for service in $(ls sq*d.service); do echo "Stopping $service"; systemctl stop $service; done
# ensure no python process is running anymore (ps aux | grep python)
yum update python27*

VIRTUALENV_DIR=/opt/squirro/virtualenv
rm -f  ${VIRTUALENV_DIR}/.Python
rm -f  ${VIRTUALENV_DIR}/bin/pip{,2,2.7}
rm -f  ${VIRTUALENV_DIR}/bin/python{,2,2.7}
rm -fr ${VIRTUALENV_DIR}/include/python2.7
rm -f  ${VIRTUALENV_DIR}/lib/python2.7/*
# ignore warnings:
# rm: cannot remove ‘/opt/squirro/virtualenv/lib/python2.7/distutils’: Is a directory
# rm: cannot remove ‘/opt/squirro/virtualenv/lib/python2.7/site-packages’: Is a directory
rm -fr ${VIRTUALENV_DIR}/lib/python2.7/distutils
rm -f  ${VIRTUALENV_DIR}/lib/python2.7/site-packages/easy_install.*
rm -fr ${VIRTUALENV_DIR}/lib/python2.7/site-packages/pip
rm -fr ${VIRTUALENV_DIR}/lib/python2.7/site-packages/pip-*.dist-info
rm -fr ${VIRTUALENV_DIR}/lib/python2.7/site-packages/setuptools
rm -fr ${VIRTUALENV_DIR}/lib/python2.7/site-packages/setuptools-*.dist-info
yum update squirro-python-virtualenv

yum update squirro-cluster-node-users
yum update squirro-*
# ensure all python processes are running:
for service in $(ls sq*d.service); do echo "Starting $service"; systemctl start $service; done
# verify that all python processes are indeed running:
for service in $(ls sq*d.service); do echo "Checking $service"; systemctl status $service; done


Also make sure that all cluster nodes can talk to each other on port 6380 (the port used for the new Redis cache since 2.4.6).
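
The port check above can be run from each cluster node against the others. This is a minimal sketch that uses bash's built-in /dev/tcp redirection together with the coreutils `timeout` command. The `check_port` helper, the script name and the host names in the comment are hypothetical.

```shell
#!/usr/bin/env bash
# Succeeds when a TCP connection to host $1 on port $2 opens within 3 seconds.
# Uses bash's /dev/tcp redirection, so no extra network tools are required.
check_port() {
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Pass the other cluster nodes as arguments, for example:
#   ./check_redis_port.sh node2.example.com node3.example.com
# (script name and host names are placeholders)
for node in "$@"; do
    if check_port "$node" 6380; then
        echo "$node: port 6380 reachable"
    else
        echo "$node: port 6380 NOT reachable"
    fi
done
```

A node reported as not reachable usually points at a firewall rule or at the Redis cache service not listening on that interface.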