Excerpt |
---|
The Catalyst data model provides a sub-item model so that, when significant events are detected, we can show exactly which sentence or phrase in a document triggered the catalyst and build detailed relationships across documents. |
Individual entities are stored in the 'entities' item field.
Definitions / Vocabulary
...
A mapping between a Query and a set of Actions.
...
Models
...
title | Example |
---|
Code Block | ||
---|---|---|
| ||
[{
"id": "1234", # unique entity id
"item_id": "123456", # reference to original item id
"type": "company", # type of the entity, e.g. company,
"name": "Thomson Reuters",
"confidence": 0.8, # aggregated confidence of all extracts [0-1]
"relevance": 0.9, # relevance of this entity for the item [0-1]
"extracts": [{
"text": "Thomson Reuters", # original representation
"field": "title", # on which Item field can this extract be found
"confidence": 0.9, # confidence level [0-1]
"offset": 14, # start offset of text within original item
"length": 15, # length of text within original item
}, {
"text": "TR", # original representation
"field": "body", # on which Item field can this extract be found
"confidence": 0.1, # confidence level [0-1]
"offset": 0, # start offset of text within original item
"length": 2, # length of text within original item
}],
"properties": {
"stock_symbol": "TR", # value based property
"parent_company_ref": "<id of company type entity>" # reference based property
},
}, {
"id": "1237", # unique entity id
"item_id": "123456", # original item id
"type": "deal", # type of the entity, e.g. deal,
"name": "Thomson Reuters bought Squirro for 1Mio in the US.",
"confidence": 0.3, # confidence level of this entity [0-1]
"extracts": [{
"text": "Thomson Reuters bought Squirro for 1Mio in the US.", # original representation
"field": "body", # on which Item field can this extract be found
"confidence": 0.3, # confidence level [0-1]
"offset": 114, # start offset of text within original item
"length": 52, # length of text within original item
}],
"properties": { # variable set of keys depending on the entity type
"region_ref": "<entity_id_1_of_type_geo>",
"size": 10000000,
"industry": null,
"acquirer": "<entity_id_3_of_type_company>",
"target": "<entity_id_3_of_type_company>",
}
},
...
] |
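To illustrate how the 'field', 'offset', and 'length' values of an extract point back into the original item, the following minimal Python sketch slices the referenced text out of an item. The `item` dictionary and the `locate_extract` helper are hypothetical and exist only for this illustration. The sketch also assumes that offsets are counted in characters from the start of the field, which the example above suggests but does not state.

```python
def locate_extract(item, extract):
    """Return the text an extract points to within the given item field."""
    field_text = item[extract["field"]]
    start = extract["offset"]
    end = start + extract["length"]
    return field_text[start:end]


# Hypothetical item; only the fields referenced by the extracts are shown.
item = {
    "title": "Company news: Thomson Reuters expands",
    "body": "TR announced the move on Monday.",
}

extracts = [
    {"text": "Thomson Reuters", "field": "title", "offset": 14, "length": 15},
    {"text": "TR", "field": "body", "offset": 0, "length": 2},
]

for extract in extracts:
    found = locate_extract(item, extract)
    # Under the character-offset assumption the slice equals the stored text.
    print(extract["field"], repr(found), found == extract["text"])
```

A check like `found == extract["text"]` is a cheap way to detect offsets that no longer line up with the item, for example after the item text has been modified.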
Note: Properties come in two types: string (default) or numeric. Numeric properties, e.g. of type float or int, are indexed in a field 'numeric_properties' in Elasticsearch and mapped back to 'properties' before being returned. This allows, for example, proper number comparisons and range queries. Unlike for keywords, we do not maintain a DB to keep track of property types; the type is inferred solely from the submitted value.
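The type inference described in the note can be sketched in Python as follows. This is an illustration only, not the actual implementation: the helper names `split_properties` and `merge_properties` are hypothetical, and the handling of booleans and `None` is an assumption of this sketch.

```python
from numbers import Real


def split_properties(properties):
    """Split properties into string-valued and numeric-valued dictionaries."""
    plain, numeric = {}, {}
    for key, value in properties.items():
        # bool is a subclass of int in Python; treating it as non-numeric
        # is an assumption of this sketch.
        if isinstance(value, Real) and not isinstance(value, bool):
            numeric[key] = value
        else:
            plain[key] = value
    return plain, numeric


def merge_properties(plain, numeric):
    """Map numeric properties back into a single 'properties' dictionary."""
    merged = dict(plain)
    merged.update(numeric)
    return merged


properties = {"stock_symbol": "TR", "size": 10000000, "industry": None}
plain, numeric = split_properties(properties)
print(plain)    # properties kept as-is
print(numeric)  # values that would be indexed under 'numeric_properties'
assert merge_properties(plain, numeric) == properties
```

Because the type is inferred from the submitted value, submitting `"100"` instead of `100` would leave the property in the string group, where range queries would not behave numerically.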
Query Syntax
...
No Format |
---|
entity:{< any query to match a single entity document >} |
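The phrase "a single entity document" matters: all conditions inside one entity:{...} clause have to hold for the same entity. The Python sketch below mimics that behaviour locally with plain predicates. It illustrates the semantics under that reading and is not how the query is executed; `item_matches` is a hypothetical helper, and exact string equality stands in for the text matching a real query would perform.

```python
def item_matches(entities, *clauses):
    """True if every clause is satisfied by at least one single entity."""
    return all(any(clause(e) for e in entities) for clause in clauses)


# Hypothetical entities attached to one item.
entities = [
    {"type": "company", "name": "Thomson Reuters", "confidence": 0.8},
    {"type": "deal", "name": "Squirro", "confidence": 0.3},
]


def is_tr_company(e):
    # Mimics entity:{type:company AND name:"Thomson Reuters"}
    return e["type"] == "company" and e["name"] == "Thomson Reuters"


def is_squirro_company(e):
    # Mimics entity:{type:company AND name:Squirro}
    return e["type"] == "company" and e["name"] == "Squirro"


print(item_matches(entities, is_tr_company))                      # True
print(item_matches(entities, is_tr_company, is_squirro_company))  # False
```

The second call returns False even though the values "company" and "Squirro" both occur somewhere in the item, because no single entity carries both.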
...
title | Examples |
---|
Search for Items containing a specific Entity of type company:
No Format |
---|
entity:{type:company AND name:"Thomson Reuters"} |
Search for Items containing at least one company-typed Entity "Thomson Reuters" and another Entity "Squirro":
No Format |
---|
entity:{type:company AND name:"Thomson Reuters"} AND entity:{type:company AND name:Squirro}
|
Search for Items containing a specific Entity of type company with a confidence higher than 80%:
No Format |
---|
entity:{type:company AND name:"Thomson Reuters" AND confidence > 0.8} |
Search for Items containing any Entity of type company with a confidence of at least 70%:
No Format |
---|
entity:{type:company AND NOT confidence < 0.7} |
Search for Items containing at least one Entity of type company with a confidence lower than 20%:
No Format |
---|
entity:{type:company AND confidence < 0.2}
|
Search for Items containing any Entity of type deal with a confidence higher than 70%:
No Format |
---|
entity:{type:deal AND confidence > 0.7} |
Search for Items containing a specific Entity of type deal:
No Format |
---|
entity:{type:deal AND properties.size:100 AND properties.region:US AND properties.industry:Tech AND properties.target:Whatsapp AND properties.acquirer:Facebook}
|
Search for Items containing one Entity with target Squirro and another Entity with target Whatsapp:
No Format |
---|
entity:{type:deal AND properties.target:Squirro AND properties.industry:Tech} AND entity:{type:deal AND properties.target:Whatsapp AND properties.industry:Tech} |
Search for Items containing an Entity of type deal with a property size greater than 100:
No Format |
---|
entity:{type:deal AND properties.size > 100} |
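When such queries are assembled programmatically, a small helper keeps the quoting and bracing consistent. The Python sketch below only builds query strings in the syntax shown above. `entity_clause` is a hypothetical helper, and its quoting rule (quote values that contain whitespace) is an assumption rather than a documented requirement.

```python
def entity_clause(**conditions):
    """Build one entity:{...} clause from field/value pairs joined with AND."""
    parts = []
    for field, value in conditions.items():
        # Keyword arguments cannot contain dots, so a double underscore
        # stands in for the dot in names such as properties.target.
        field = field.replace("__", ".")
        text = str(value)
        if any(ch.isspace() for ch in text):
            text = '"{}"'.format(text)
        parts.append("{}:{}".format(field, text))
    return "entity:{" + " AND ".join(parts) + "}"


query = " AND ".join([
    entity_clause(type="company", name="Thomson Reuters"),
    entity_clause(type="deal", properties__target="Squirro"),
])
print(query)
```

The helper covers only equality-style conditions; comparisons such as `confidence > 0.8` would need to be appended as raw strings.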
This page can now be found at Catalyst Data Model on the Squirro Docs site.