The catalyst data model provides a sub-item model. When a significant event is detected, this lets us show exactly which sentence or phrase in a document triggered the catalyst, and it lets us build detailed relationships across documents.
Individual entities are stored in the 'entities' item field.
Definitions / Vocabulary
Term | Definition |
---|---|
Document | Original data as provided by the customer. |
Item | A modified version of Document as stored within Squirro. |
Facet | Metadata assigned to an Item in the form of a key/[list of values] pair. Stored as attribute keywords in the Item. |
Entity | A real-world or higher-level object of a pre-defined type, such as persons, locations, organizations, products, events, etc., that can be denoted with a proper name. An Entity has a list of Extracts with all its appearances within one Item. Optionally, it can maintain a list of instantiations of properties. Properties are pre-defined per Entity type and are simple values or references to other Entities. |
Extract | A single occurrence of a detected Entity within one Item. Keeps track of the location and the original text of the detection. |
Catalyst | A mapping between a Query and a set of Actions. |
Query | A string conforming to our query syntax. The query syntax is extended to allow searching for Entities. See Query Syntax below. |
Action | An action executed when a Catalyst matches, e.g., sending an email or calling a callback. |
Entity Profile | Pre-computed model for each value of an Entity. Used for ranking Recommendations. |
Recommendation | Ranked result list of Entities based on a Query (potentially containing Entities). |
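
To make the relationship between Item, Entity and Extract more concrete, the following is a minimal sketch of how these structures could be modelled. Only the 'entities' item field and the concepts (extract location, original text, properties) come from the definitions above. All class names, attribute names (`offset`, `length`, `item_field`) and the helper `find_extracts` are hypothetical and are not the actual Squirro schema.

```python
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Union

# Properties are simple values (string or numeric) in this sketch.
PropertyValue = Union[str, int, float]


@dataclass
class Extract:
    """One occurrence of a detected Entity within one Item."""
    text: str                 # original text of the detection
    offset: int               # hypothetical: character offset of the detection
    length: int               # hypothetical: length of the detected text
    item_field: str = "body"  # hypothetical: item field the text was found in


@dataclass
class Entity:
    """An Entity together with all of its Extracts within one Item."""
    type: str
    name: str
    extracts: List[Extract] = field(default_factory=list)
    properties: Dict[str, PropertyValue] = field(default_factory=dict)


def find_extracts(item_text: str, surface_form: str) -> List[Extract]:
    """Record every literal occurrence of surface_form in item_text."""
    extracts = []
    start = item_text.find(surface_form)
    while start != -1:
        extracts.append(Extract(surface_form, start, len(surface_form)))
        start = item_text.find(surface_form, start + len(surface_form))
    return extracts


body = "Acme acquired Widget Corp. Acme shares rose."
acme = Entity(type="organization", name="Acme",
              extracts=find_extracts(body, "Acme"))

# Entities are stored in the 'entities' field of the item.
item = {"id": "item-1", "body": body, "entities": [asdict(acme)]}
```

Keeping the offset and original text on each Extract is what would allow a user interface to highlight the exact phrase that triggered a Catalyst.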
Models
Model | Reference |
---|---|
Item | see Item Model |
Facet | see Facets API |
Entity | Note: Properties come in two types: string (default) or numeric. Numeric properties, e.g. of type float or int, are indexed in the field 'numeric_properties' in Elasticsearch and mapped back to 'properties' before being returned. This allows, for example, proper number comparison and range queries. Unlike for keywords, we do not maintain a DB to keep track of property types; the type is inferred only from the submitted value. |
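
The type inference described in the note can be illustrated with a short sketch. It assumes that an entity is a plain dictionary with a 'properties' mapping. The function names `split_properties`, `to_index_doc` and `from_index_doc` are hypothetical, and the handling of booleans is an assumption; only the field names 'properties' and 'numeric_properties' are taken from the note above.

```python
from typing import Any, Dict, Tuple


def split_properties(properties: Dict[str, Any]) -> Tuple[Dict[str, str], Dict[str, float]]:
    """Infer the type of each property from its submitted value."""
    strings: Dict[str, str] = {}
    numerics: Dict[str, float] = {}
    for key, value in properties.items():
        # bool is a subclass of int in Python; this sketch treats it as a
        # string value (an assumption, not documented behaviour).
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            numerics[key] = value
        else:
            strings[key] = str(value)
    return strings, numerics


def to_index_doc(entity: Dict[str, Any]) -> Dict[str, Any]:
    """Move numeric values to 'numeric_properties' before indexing."""
    strings, numerics = split_properties(entity.get("properties", {}))
    doc = {k: v for k, v in entity.items() if k != "properties"}
    doc["properties"] = strings
    doc["numeric_properties"] = numerics
    return doc


def from_index_doc(doc: Dict[str, Any]) -> Dict[str, Any]:
    """Map 'numeric_properties' back into 'properties' before returning."""
    entity = {k: v for k, v in doc.items()
              if k not in ("properties", "numeric_properties")}
    merged = dict(doc.get("properties", {}))
    merged.update(doc.get("numeric_properties", {}))
    entity["properties"] = merged
    return entity


entity = {
    "type": "deal",
    "name": "Acme deal",
    "properties": {"stage": "won", "amount": 1500000.0, "year": 2021},
}
doc = to_index_doc(entity)
restored = from_index_doc(doc)
```

Because no type registry is kept, a property submitted once as `"2021"` and once as `2021` would end up in different fields in a scheme like this one, so callers may want to submit values with consistent types.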
Query Syntax
Target | Syntax |
---|---|
Entities | entity:{< any query to match a single entity document >} |
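
As a small illustration of the syntax above, the sketch below wraps an inner query in the `entity:{...}` clause. The helper name `entity_query` is hypothetical, and the inner query strings used here (`type:person`, `name:Smith`) are placeholder examples only; the fields that can actually be queried on an entity document are not specified in this section.

```python
def entity_query(inner_query: str) -> str:
    """Wrap a query that should match a single entity document."""
    inner = inner_query.strip()
    if not inner:
        raise ValueError("the inner entity query must not be empty")
    # Doubled braces produce literal '{' and '}' in an f-string.
    return f"entity:{{{inner}}}"


# Placeholder inner queries; the field names are assumptions.
person_query = entity_query("type:person")
named_person_query = entity_query("type:person name:Smith")
```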