Excerpt |
---|
The Catalyst data model provides a sub-item model so that, when significant events are detected, we can show exactly which sentence or phrase in a document triggered the Catalyst and build detailed relationships across documents. |
...
Definitions / Vocabulary
Document | Original data as provided by the customer. |
---|---|
Item | A modified version of the Document as stored within Squirro. |
Facet | Metadata assigned to an Item in the form of a key/[list of values] pair. Stored as attribute keywords in the Item. |
Extract | A single occurrence of a detected Entity within one Item. Keeps track of the location and the original text of the detection. |
Entity | A real-world or higher-level object of a pre-defined type, such as a person, location, organization, product, or event, that can be denoted with a proper name. An Entity has a list of Extracts covering all its appearances within one Item. Optionally, it can maintain a list of property instantiations. Properties are pre-defined per Entity type and are either simple values or references to other Entities. |
Catalyst | A mapping between a Query and a set of Actions. |
Query | A string conforming to our query syntax. The query syntax is extended to allow searching for Entities. See Query Syntax below. |
Action | An action executed when a Catalyst matches, e.g. sending an email or calling a callback. |
Entity Profile | Pre-computed model for each value of an Entity. Used for ranking Recommendations. |
Recommendation | Ranked result list of Entities based on a Query (potentially containing Entities). |
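
To make the relationship between Item, Extract and Entity more concrete, here is a minimal Python sketch of the vocabulary above. It is an illustration only, not the stored schema: the class layout and all field names (`item_id`, `offset`, `length`, `entity_type`, `name`) are assumptions chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Extract:
    """One occurrence of a detected Entity within a single Item."""
    item_id: str  # hypothetical reference to the Item containing the detection
    offset: int   # hypothetical character offset of the detection in the Item text
    length: int   # hypothetical length of the detected span
    text: str     # original text of the detection


@dataclass
class Entity:
    """A typed real-world object with its Extracts and optional properties."""
    entity_type: str  # e.g. "person" or "organization"
    name: str
    extracts: List[Extract] = field(default_factory=list)
    # Properties are simple values or references to other Entities.
    properties: Dict[str, Union[str, int, float, "Entity"]] = field(default_factory=dict)


# Usage: record where in an Item's text an Entity was detected.
item_text = "Acme Corp announced a merger with Globex."
acme = Entity(entity_type="organization", name="Acme Corp")
acme.extracts.append(Extract(item_id="item-1", offset=0, length=9, text="Acme Corp"))

# The stored location can be used to highlight the triggering phrase.
first = acme.extracts[0]
highlighted = item_text[first.offset:first.offset + first.length]
```

Keeping the location and the original text on each Extract is what allows a match to be traced back to the exact phrase in the Item.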
Models
Item | see Item Model | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Facet | see Facets API | ||||||||||
Entity |
Note: Properties come in two types: string (default) or numeric. Numeric properties, e.g. of type float or int, are indexed in a field 'numeric_properties' in Elasticsearch and mapped back to 'properties' before being returned. This allows, for example, proper number comparisons and range queries. Unlike for keywords, we do not maintain a DB to keep track of property types; the type is inferred only from the submitted value. |
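
The type inference described in the note can be sketched roughly as follows. This is a hedged illustration rather than the actual implementation: the function names `split_properties` and `merge_properties` are hypothetical, and only the field names 'properties' and 'numeric_properties' come from the note above. Treating booleans as non-numeric is also an assumption of this sketch.

```python
from typing import Any, Dict


def is_numeric(value: Any) -> bool:
    """Infer the type from the submitted value: int or float counts as numeric."""
    # bool is a subclass of int in Python; this sketch assumes it is not numeric.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def split_properties(properties: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Hypothetical index-time step: route numeric values to 'numeric_properties'."""
    strings: Dict[str, Any] = {}
    numerics: Dict[str, Any] = {}
    for key, value in properties.items():
        if is_numeric(value):
            numerics[key] = value
        else:
            strings[key] = value
    return {"properties": strings, "numeric_properties": numerics}


def merge_properties(indexed: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical read-time step: map numeric values back into 'properties'."""
    merged = dict(indexed.get("properties", {}))
    merged.update(indexed.get("numeric_properties", {}))
    return merged


# Usage: a round trip preserves the submitted properties.
submitted = {"ticker": "ACME", "revenue": 12.5, "employees": 300}
indexed = split_properties(submitted)
returned = merge_properties(indexed)
```

Because no type registry is kept, the same property key could in principle be submitted as a string in one Entity and as a number in another; the sketch, like the note, decides per submitted value.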
...