Page Comparison

...

Excerpt
This page shows a number of examples for using the Data Loader.

...

Code Block

language	powershell

squirro_data_load -v ^
    --token %token% ^
    --cluster %cluster% ^
    --project-id %project_id% ^
    --source-name mapping ^
    --source-type csv ^
    --map-title InteractionSubject ^
    --map-id InteractionID ^
    --map-abstract Notes ^
    --map-body InteractionType InternalAttendees NoAtendees ^
    --map-created-at Date ^
    --source-file interaction.csv

Note that the lines have been wrapped with the circumflex (^) at the end of each line. On Mac and Linux you will need to use backslash (\) instead.

Simple usage and flow control

...

Code Block

language	powershell
title	SQL Example
collapse	true

squirro_data_load -v ^
    --token %token% ^
    --cluster %cluster% ^
    --project-id %project_id% ^
    --db-connection %db_connection% ^
    --source-name db_source ^
    --source-type database ^
    --map-title InteractionSubject ^
    --input-file interaction.sql

Note that the lines have been wrapped with the circumflex (^) at the end of each line. On Mac and Linux you will need to use backslash (\) instead.

Code Block

language	sql
title	interaction.sql (used for SQL example)
collapse	true

SELECT * 
FROM Interaction_test
WHERE InteractionSubject = 'Tesla'

...

Code Block

language	powershell

squirro_data_load -v ^
    --token %token% ^
    --cluster %cluster% ^
    --db-connection %db_connection% ^
    --project-id %project_id% ^
    --source-name db_source ^
    --source-type database ^
    --incremental-column Date ^
    --map-title InteractionSubject ^
    --input-file interaction.sql

Note that the lines have been wrapped with the circumflex (^) at the end of each line. On Mac and Linux you will need to use backslash (\) instead.

Another row was inserted in the meantime:

...

Code Block

language	powershell

squirro_data_load -v ^
    --token %token% ^
    --cluster %cluster% ^
    --db-connection %db_connection% ^
    --project-id %project_id% ^
    --source-name db_source ^
    --source-type database ^
    --incremental-column Date ^
    --reset ^
    --map-title InteractionSubject ^
    --input-file interaction.sql

Note that the lines have been wrapped with the circumflex (^) at the end of each line. On Mac and Linux you will need to use backslash (\) instead.

ItemUploader arguments: object-id, source-id, source-name, bulk-index, bulk-index-add-summary-from-body, batch-size

...

For this example we will use the CSV file again. In order to load the items into an existing source we should use one of the arguments: object-id, source-id or source-name. When using the object-id and source-id arguments, first we need to get these ids from the cluster, using get_user_objects() and get_object_subscriptions(). That is, if we don’t already know them. Or we can use --source-name which we can get in a more simple way, by checking the Data tab in Squirro.

Code Block

language	powershell

squirro_data_load -v ^
    --token %token% ^
    --cluster %cluster% ^
    --project-id %project_id% ^
    --object-id 3rd8UDYpQwS8vvFoT3JVWA ^
    --source-id yYPR8DA1SPyhTbEXIH0HYw ^
    --source-type csv ^
    --map-title InteractionSubject ^
    --source-file interaction.csv

Note that the lines have been wrapped with the circumflex (^) at the end of each line. On Mac and Linux you will need to use backslash (\) instead.

Usage for bulk-index, bulk-index-add-summary-from-body, batch-size

...

Code Block

language	powershell

squirro_data_load -v ^
    --token %token% ^
    --cluster %cluster% ^
    --project-id %project_id% ^
    --source-name bulk_indexing_source ^
    --source-type csv ^
    --bulk-index ^
    --bulk-index-add-summary-from-body ^
    --batch-size 70 ^
    --map-title InteractionSubject ^
    --map-body Notes ^
    --source-file interaction.csv

Note that the lines have been wrapped with the circumflex (^) at the end of each line. On Mac and Linux you will need to use backslash (\) instead.

Parallel executors and source batch size arguments: parallel-uploaders, source-batch-size

...

Code Block

language	powershell

squirro_data_load -v ^
    --token %token% ^
    --cluster %cluster% ^
    --project-id %project_id% ^
    --source-name csv_source ^
    --parallel-uploaders 5 ^
    --source-batch-size 50 ^
    --source-type csv ^
    --map-title InteractionSubject ^
    --source-file interaction.csv

Note that the lines have been wrapped with the circumflex (^) at the end of each line. On Mac and Linux you will need to use backslash (\) instead.

Metadata database related arguments

...

Code Block

language	powershell

squirro_data_load -v ^
    --token %token% ^
    --cluster %cluster% ^
    --project-id %project_id% ^
    --source-name csv_source ^
    --meta-db-dir %meta_dir% ^
    --meta-db-file %meta_file% ^
    --source-type csv ^
    --map-title InteractionSubject ^
    --source-file interaction.csv

Note that the lines have been wrapped with the circumflex (^) at the end of each line. On Mac and Linux you will need to use backslash (\) instead.

Facets

In this chapter we will describe the use and functionality of the facets configuration file. The facets/keywords are used for filtering data in Squirro.

...

Code Block

language	powershell

squirro_data_load -v ^
    --token %token% ^
    --cluster %cluster% ^
    --project-id %project_id% ^
    --source-name csv_interactions ^
    --source-type csv ^
    --map-title InteractionSubject ^
    --source-file interaction.csv ^
    --facets-file config/sample_facets.json

Note that the lines have been wrapped with the circumflex (^) at the end of each line. On Mac and Linux you will need to use backslash (\) instead.

You can find the loaded facets in the Data >> Facets tab. You can see below the Facets tab before loading and after:

...

Note that the "input_format_string" should have a Python date format and "format_instr" is just generic date formatting.

Image ModifiedOther uses

Other uses of the facets config file are:

...

Code Block

language	powershell

squirro_data_load -v ^
    --token %token% ^
    --cluster %cluster% ^
    --project-id %project_id% ^
    --source-name csv_interactions ^
    --source-type csv ^
    --map-title InteractionSubject ^
    --source-file interaction.csv ^
    --facets-file config/sample_facets.json ^
    --body-template-file template/template_body_interaction.html

Note that the lines have been wrapped with the circumflex (^) at the end of each line. On Mac and Linux you will need to use backslash (\) instead.

Note: the name that we use for the facet in the “name” attribute in the facets config file has to be identical with the name from item.keywords tag in the template file

...

Code Block

language	powershell

squirro_data_load -v ^
    --token %token% ^
    --cluster %cluster% ^
    --project-id %project_id% ^
    --source-name csv_interactions ^
    --source-type csv ^
    --map-title InteractionSubject ^
    --source-file interaction.csv ^
    --facets-file config/sample_facets.json ^
    --body-template-file template/template_body_interaction.html ^
    --pipelets-error-behavior error ^
    --pipelets-file config/sample_pipelets.json

Note that the lines have been wrapped with the circumflex (^) at the end of each line. On Mac and Linux you will need to use backslash (\) instead.

We will use the following pipelets config file:

...

Code Block

language	powershell

squirro_data_load -v ^
    --token %token% ^
    --cluster %cluster% ^
    --project-id %project_id% ^
    --db-connection %db_connection% ^
    --log-file log.txt ^
    --parallel-uploaders 2 ^
    --source-name db_interactions ^
    --input-file interaction.sql ^
    --source-type database ^
    --source-batch-size 500 ^
    --incremental-column DateModifiedOn ^
    --map-title InteractionSubject ^
    --map-id InteractionId ^
    --map-created-at DateModifiedOn ^
    --map-abstract InteractionType ^
    --facets-file config/sample_facets.json ^
    --body-template-file template/template_body_interaction.html ^
    --pipelets-error-behavior error ^
    --pipelets-file config/sample_pipelets.json ^
    --meta-db-dir meta ^
    --meta-db-file meta_interactions.db

Note that the lines have been wrapped with the circumflex (^) at the end of each line. On Mac and Linux you will need to use backslash (\) instead.

The contents of the interactions.sql:

...

Versions Compared

Old Version 47

New Version 48

Key

Simple usage and flow control

ItemUploader arguments: object-id, source-id, source-name, bulk-index, bulk-index-add-summary-from-body, batch-size

Usage for bulk-index, bulk-index-add-summary-from-body, batch-size

Parallel executors and source batch size arguments: parallel-uploaders, source-batch-size

Metadata database related arguments

Facets