This page

...

If facets are managed by the Squirro data loader, consult the data loader configuration documentation.


Data Modeling

A big part of a Squirro integration is to think about how to model the data in Squirro. This is covered in the separate section Data Modeling.

Creating Facets

There are a number of ways to create facets in a project:

  • From data: the easiest way is to simply specify a facet value when uploading data.

  • Enrichments: similar to loading from data, enrichments can just set a facet value where needed.

  • Manually: a facet can be configured up-front, before loading data. This is mandatory for more complex facets.

Note

The following names are reserved names and cannot be used for facets:

  • author: reserved for the item author field

  • is: reserved for state fields (read/starred)

  • language: reserved for the item language field

  • provider: reserved for the item provider field

  • smartfilter: reserved by query language for smartfilters

  • sort: reserved by query language for sorting

  • source: reserved for the item source field

  • time_increment: reserved by query language for histogram time bucketing

Note

Facet names cannot contain the $-sign.

Note

Facets cannot be removed from a project once created. Additionally, the name and type of a facet cannot be changed.
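
The naming rules above can be checked before any data is loaded. The following is a minimal sketch: `check_facet_name` and `RESERVED_FACET_NAMES` are hypothetical names introduced for illustration and are not part of any Squirro API. It assumes the reserved names are matched exactly as listed (case-sensitive).

```python
# Reserved names and the "$" restriction are taken from the notes above.
# The helper itself is hypothetical and not part of any Squirro API.
RESERVED_FACET_NAMES = frozenset({
    "author", "is", "language", "provider",
    "smartfilter", "sort", "source", "time_increment",
})


def check_facet_name(name):
    """Return a list of problems with a proposed facet name (empty if none)."""
    problems = []
    # Assumption: reserved names are compared exactly as listed.
    if name in RESERVED_FACET_NAMES:
        problems.append(f"'{name}' is a reserved name")
    if "$" in name:
        problems.append("facet names cannot contain the $-sign")
    return problems


print(check_facet_name("phone_number"))
print(check_facet_name("sort"))
```

Because facets cannot be renamed or removed later, running a check like this before the first upload is cheaper than resetting a project afterwards.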

Indexing Facets

Facets can be created from any additional information about an item that is included in the source. Many sources, such as database records, Excel spreadsheets, and email messages, contain useful information in addition to the title and body of the item.

The example below shows a database record which stores an email, along with which fields can be added as facets within the item.

Field       | Example value                                                                     | Used as
title       | Congratulations!                                                                  | Title
sender      | John Smith                                                                        | Added as facet
recipient   | David Green                                                                       | Added as facet
timestamp   | 2016-08-12T09:15:44                                                               | Added as facet
body        | Hey David, Just heard the good news! Congratulations on the promotion. Best, John | Body
attachments | none                                                                              | Added as facet

In general, the reason for adding these additional fields as facets is to allow a user to search for other documents that share similar attributes, such as (in this example):

  • The same sender or recipient

  • Messages sent around the same time

  • Messages which included the same attachment

Facets within an item

Within a Squirro item, facets are stored in the field 'keywords'. Within the keywords field, each facet is represented by a key-value pair. The name of the facet used in the Squirro index serves as the key, and the value is the list of values that the facet has. It is important to note that facet values are always stored as a list, even if only a single value is present.

A simple example:

Code Block
languagejs
{
	"title": "Squirro",
	"body": "The Insights Company",
	"keywords": {
		"Office": ["Zurich", "London", "Munich", "Barcelona", "New York"]
	}
}

In the example above, we have created an item with a single facet called "Office". This facet has the five values Zurich, London, Munich, Barcelona, and New York.

Any facets that do not exist yet are created automatically. However, this only works for string facets.
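
As an illustration, the email record from the earlier example could be mapped to an item as sketched below. `record_to_item` is a hypothetical helper, not a Squirro function. Only the string fields are added as facets here, because automatic creation works for string facets only; the timestamp would need a datetime facet configured up front.

```python
def record_to_item(record):
    """Map an email record to an item dict; extra string fields become facets."""
    item = {"title": record["title"], "body": record["body"]}
    keywords = {}
    # The timestamp is left out on purpose: a datetime facet has to be
    # created manually before data is loaded.
    for field in ("sender", "recipient", "attachments"):
        value = record.get(field)
        if value is None:
            continue
        # Facet values are always lists, even for a single value.
        keywords[field] = value if isinstance(value, list) else [value]
    item["keywords"] = keywords
    return item


record = {
    "title": "Congratulations!",
    "sender": "John Smith",
    "recipient": "David Green",
    "timestamp": "2016-08-12T09:15:44",
    "body": "Hey David, Just heard the good news! "
            "Congratulations on the promotion. Best, John",
    "attachments": None,  # the example record has no attachments
}
item = record_to_item(record)
print(item["keywords"])
```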

Creating Facets from an Enrichment

In addition to being added from a data source, facets can be added to an item during the enrichment process by one or more Pipelets. Pipelets add or change facets by directly modifying the data within the Squirro item. For example, shown below is a code snippet from a pipelet that adds a new facet "Multiple Offices" if an item lists more than one office location in the "Offices" facet.

Code Block
languagepy
...
def consume(item):
    # Protect against missing keywords
    kw = item.setdefault('keywords', {})

    # Facet values are always lists, even for a single value
    if len(kw.get('Offices', [])) > 1:
        kw['Multiple Offices'] = ['yes']
    else:
        kw['Multiple Offices'] = ['no']

    return item

When modifying facets from within a pipelet, it is important to remember that facets are stored as lists. Setting a facet equal to an individual value (string, number, datetime, etc.) will not produce a valid pipelet.

For example, the following will result in errors:

Code Block
# INVALID example
item['keywords']['Multiple Offices'] = 'no'
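
One way to avoid this mistake is to route every facet assignment through a small helper that wraps single values in a list. This is a sketch only: `set_facet` is a hypothetical name and not part of the pipelet API.

```python
def set_facet(item, name, value):
    """Set a facet on an item, always storing the value as a list."""
    # Protect against missing keywords
    kw = item.setdefault("keywords", {})
    if isinstance(value, (list, tuple, set)):
        kw[name] = list(value)
    else:
        # Wrap a single value (string, number, ...) in a list
        kw[name] = [value]
    return item


item = set_facet({"title": "Squirro"}, "Multiple Offices", "no")
print(item["keywords"])
```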

Configuring Facets Manually

Manually creating a facet is mandatory for any non-string data types. For some more advanced settings you should also create the facets up front.

Within a Squirro project, facets can be created and configured from the Facets page of the project. To get to the Facets page, click on "Data" in the top bar, then select "Facets" in the column on the left.

...

From this page, facets can be both created and configured manually. To create a new facet, select the blue "Add Facet" button in the top right of the facets page. On this page, the main properties of the new facet can be set. These properties include:

  • Title - The display name of the facet.

  • Name - The name of the facet within the Elasticsearch index and on the API level (Permanent once created).

  • Type - The data type of the facet {string, int, float, datetime} (Permanent once created).

  • Group - The group of which the facet is a member.

See "Facet Properties" below for a full list of configurable facet properties.

Additionally, existing facets can be modified by clicking the blue "Edit" button that appears on the right when hovering over a facet. As mentioned above, the type and name fields are greyed out when editing a facet, because these values cannot be changed for existing facets. For example, in order to change a string facet into a datetime facet, a new project must be created.

These capabilities are exposed in two additional ways:

Deleting Facets

In an existing project, individual facets cannot be deleted. This is due to the underlying index format, which has no ability to remove index fields, or to change their data type, once they are allocated.

When you need to delete a facet or change the data type, there are two ways of addressing this:

  1. Create a new facet and hide the old one. In this case re-use the display name (which does not have to be unique) and simply hide the old facet from users by using the Visible property.

  2. Reset the project. This is a feasible approach if you can easily recover the facet definition and data, e.g. by rerunning a data loader import job.

Facet Types

A given facet can store data in any one of the following formats:

Data Type | Example             | Notes
string    | "Squirro"           | The default data type for new facets. Can be used to store any sequence of characters.
int       | 37                  | Used for storing numeric values (integer and floating point). Both types can be used for performing comparison-based searches (value >= 10). Facets with int or float data types are often treated differently by widgets, and can enable new functionality such as aggregations.
float     | 12.955              | See int.
datetime  | 2016-08-12T11:31:50 | Date/time values. Follows Squirro's standard date and time format. Squirro assumes this to be in the UTC time zone.
geo_point | "47.37,8.54"        | Used to store geographic coordinates. The format of the field is "latitude,longitude".
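
The datetime and geo_point formats can be produced from Python values as sketched below. Both helpers are hypothetical names introduced for illustration. The sketch assumes that naive datetime objects are already in UTC and that the standard date format is the `YYYY-MM-DDTHH:MM:SS` shape shown in the example value.

```python
from datetime import datetime, timedelta, timezone


def to_facet_datetime(dt):
    """Format a datetime as YYYY-MM-DDTHH:MM:SS, converted to UTC if needed."""
    # Assumption: a naive datetime (no tzinfo) is already in UTC.
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc)
    return dt.strftime("%Y-%m-%dT%H:%M:%S")


def to_facet_geo_point(latitude, longitude):
    """Format coordinates as the "latitude,longitude" string."""
    return f"{latitude},{longitude}"


# A local time two hours ahead of UTC is converted before formatting.
local = datetime(2016, 8, 12, 13, 31, 50, tzinfo=timezone(timedelta(hours=2)))
keywords = {
    "created": [to_facet_datetime(local)],
    "location": [to_facet_geo_point(47.37, 8.54)],
}
print(keywords)
```

The facet names `created` and `location` are placeholders; as noted elsewhere on this page, non-string facets have to be created manually before such values are loaded.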

Note

Changing the Facet Type

The data type of a facet cannot be changed after the facet is created, for the same reason that created facets cannot be deleted. See Deleting Facets above for how to work around this.

Facet Properties

Visible

Whether a facet is visible can be toggled through the UI from the Data → Facets screen.

...

Unchecking the "Visible" box removes the facet from the search screen and makes it unavailable for use in widgets. Additional properties can be modified by clicking the edit button at the right side of each facet listed.

Display Name

In addition to the true facet name used in the index, a facet can have a display name, which is often a nicely formatted version of the facet name. The display name is used both on the search screen, and whenever the facet is used in a widget.

As a best practice, facet names should not include spaces, and facets with multiple word names should be separated by underscores. The nicely formatted name of the facet (with spaces in place of the underscores) can be used as the display name.

For example, a facet with the name "phone_number" can have the display name "Phone Number".
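
When many facets follow this naming convention, a display name can be derived from the facet name. The helper below is a hypothetical sketch, not a Squirro feature; the result still has to be entered as the facet's display name.

```python
def default_display_name(facet_name):
    """Turn an underscore-separated facet name into a title-cased label."""
    return facet_name.replace("_", " ").title()


print(default_display_name("phone_number"))  # prints "Phone Number"
```

Simple title-casing does not preserve acronyms (for example, `ceo_name` becomes "Ceo Name"), so the generated label is best treated as a starting point.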

The screenshots below show the difference between facets with no display names set, and the same facets with display names set.

...

Group

Facets can be placed into groups for organizational purposes. On the search screen, each facet within the group will appear nested within the group name.

...

Searchable

Making a facet searchable enables full text search of the facet values. 

For example, if an item has the value "United States of America" in a searchable facet, the item would be returned by a search for "America", even if the term America is not present in the body of the item.

Typeahead

The typeahead setting for a facet determines whether or not the values within that facet are available for typeahead completion within the Squirro search bar. If the typeahead setting is enabled, both the name of the facet itself and the values stored within the facet will be shown as options within the Squirro search bar. 

NOTE: Currently typeahead is only supported for string facets.

The screenshots below show the difference between having the typeahead setting enabled and disabled for a facet "Companies".

...

Enabling typeahead requires that the facet also be searchable.

Analyzed

A facet that is not analyzed:

  • can be used for a match

  • can be used for aggregations

  • cannot be used for sorting

  • can be used for keyword count

Facet Value Formatting

  • The values of numerical and date facets can be formatted for display in dashboard widgets.

  • To do so, use the facet formatting screen in the Facets management section of Squirro.

Numerical facet formatting (int, float)

...

The string defined in the 'Format' section is used to display the facet values in dashboards. Any string can be used for that purpose, and a preview of the formatting is visible to the right of the format input field.

The separate page Format Strings documents this Number format string.

"Date" Type facet formatting

...

  • To format facet values containing dates (and times), Squirro offers three formatting options, defined in the Facets configuration screen.

  • Each option is accompanied with a preview of the resulting formatted value.

The separate page Format Strings documents this Moment datetime format string.

This page can now be found at Labels on the Squirro Docs site.