
The Squirro Toolbox is a collection of command line utilities to work with a Squirro cluster. This page shows the installation into a Python environment.

The Squirro Toolbox for Python is in beta. It may not yet work on every operating system or distribution and some of the tools may not yet work fully. If you encounter any problems, please contact Squirro Support.

Introduction

The Squirro Toolbox package for Python will eventually replace the operating-system-specific versions for Windows, Mac and Linux. Those packages set up their own Python environment, which is independent of anything else on the system. As a consequence, it can be very difficult to install additional packages that pipelets or data loader plugins may require.

With the Python package described here, the Squirro Toolbox is installed into an existing Python environment, which can then make use of all the standard Python tooling, such as pip install for package installation.

Download

Download the Squirro Toolbox from the Downloads page at /wiki/spaces/DOWN/overview#Downloads-SquirroToolbox(CommandLineTools) (license required). Select the version for Python. The downloaded file has a name of the form squirro.toolbox-VERSION.compiled-py2-none-any.whl.

Set up Python

Make sure you have a Python 2.7 environment set up (it's important that you use 2.7 and not a 3.x version).
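One way to confirm that the active interpreter meets this requirement is a short check like the sketch below. The function name is_supported_python is made up for this illustration and is not part of the Squirro Toolbox.

```python
from __future__ import print_function

import sys


def is_supported_python(version_info):
    """Return True only for a Python 2.7 interpreter.

    Hypothetical helper for this guide; it compares just the
    major and minor version numbers.
    """
    return tuple(version_info[:2]) == (2, 7)


if is_supported_python(sys.version_info):
    print("OK: Python 2.7 detected")
else:
    print("Unsupported: Python %d.%d found, but 2.7 is required"
          % tuple(sys.version_info[:2]))
```

Run the script with the same python executable that pip will install into, so that the result reflects the right environment.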

It is recommended, but not required, to work with a virtual environment (virtualenv). Covering the details of setting up and working with virtualenv is beyond the scope of this guide. A suggested read is the Virtualenv documentation.

Installation

  1. Install the downloaded whl file using pip: pip install squirro.toolbox-VERSION.compiled-py2-none-any.whl

Dependencies (Troubleshooting)

Especially on Windows, you may encounter the following error when installing:

…
    copying Levenshtein\_levenshtein.h -> build\lib.win32-2.7\Levenshtein
    running build_ext
    building 'Levenshtein._levenshtein' extension
    error: Microsoft Visual C++ 9.0 is required. Get it from http://aka.ms/vcpython27

    ----------------------------------------
Command … failed with error code 1 in …\python-levenshtein\

This happens because the dependency is a binary package that has to be compiled. Christoph Gohlke maintains a great resource with pre-compiled packages, so that you do not have to go through the hassle of setting up the right Python compilation environment locally. For any dependency that fails (python-levenshtein in this example), follow this process:

  1. Go to Unofficial Windows Binaries for Python Extension Packages
  2. Find the package you are looking for (python-levenshtein in the example)
  3. Download the cp27 package with the correct architecture (win32 or amd64, depending on your Python installation) - this will download a whl file
  4. Install the downloaded file with pip: pip install ….whl
  5. Try the Squirro Toolbox installation again (see above)
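If you are unsure which architecture your Python installation has (step 3), the interpreter can report its own pointer size. The following is a minimal sketch. The helper name wheel_platform_tag is hypothetical, and the returned tags are only meaningful on Windows.

```python
from __future__ import print_function

import struct


def wheel_platform_tag():
    """Return 'amd64' for a 64-bit interpreter, otherwise 'win32'.

    Hypothetical helper: the size of a C pointer ("P") tells whether
    the running Python was built as a 32-bit or a 64-bit program.
    """
    pointer_bits = struct.calcsize("P") * 8
    return "amd64" if pointer_bits == 64 else "win32"


print("Download the cp27 wheel for:", wheel_platform_tag())
```

Note that this reports the architecture of the Python interpreter, not of the operating system. A 32-bit Python on 64-bit Windows still needs the win32 package.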

Usage

The Squirro Toolbox contains a number of command line applications that are used to work with a Squirro cluster. One example is the Data Loader, which is used to index data in Squirro.

From a command line, you can run the following command:

squirro_data_load --version

This command runs the executable squirro_data_load with the argument --version. If the toolbox is correctly installed, this outputs the version of the toolbox.

To get an overview of the command and what it is capable of, use the --help argument:

squirro_data_load --help

This help message describes how the executable is to be used. Arguments presented in square brackets are optional, whereas arguments presented without square brackets are mandatory.

For example:

[--verbose]     :: Optional parameter
--token TOKEN   :: Mandatory parameter requiring user input

The executable will not run unless all mandatory arguments are supplied; it displays an error message such as "squirro_data_load: error: too few arguments". An argument followed by a second word in block capitals requires a user-supplied value. In the case above, the user's API token is required. An example that supplies the token:

squirro_data_load --token "mytoken"
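The bracket convention described above can be reproduced with Python's standard argparse module. The sketch below is an illustration only: the program name demo_loader is hypothetical, and nothing here is taken from the toolbox's actual implementation.

```python
from __future__ import print_function

import argparse


def build_parser():
    """Build a toy parser with one optional and one mandatory argument.

    "demo_loader" is a made-up program name for this illustration.
    """
    parser = argparse.ArgumentParser(prog="demo_loader")
    # Optional flag: shown in square brackets in the usage line.
    parser.add_argument("--verbose", action="store_true",
                        help="print more detail while running")
    # Mandatory option that takes a value: shown as "--token TOKEN".
    parser.add_argument("--token", required=True,
                        help="API token of the user")
    return parser


parser = build_parser()
# Print the usage line that the --help output starts with.
print(parser.format_usage())

# Parse an explicit argument list instead of the real command line.
args = parser.parse_args(["--token", "mytoken"])
print("token:", args.token, "verbose:", args.verbose)
```

In the printed usage line, --verbose appears in square brackets while --token TOKEN does not, which mirrors the optional and mandatory cases explained above.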