...
Code Block |
---|
{
  "sources": {
    "salespeople": {
      "dsn": "csv:///salespeople.csv",
      "field_id": "name",
      "field_matching": ["name", "email"],
      "hierarchy": "manager -> name"
    }
  }
} |
The code above creates a new source of known entities called "salespeople". For this source, the data source name ("dsn") points to the CSV file salespeople.csv, which is located in the same folder as the config.json file.
The field_id field identifies the "name" field as the unique identifier for each entity in the CSV file.
The field_matching field lists all the fields that we want to look for when identifying a known entity within a document. In this case, we want to find references to either the salesperson's name or their email address in the documents in our Squirro project, so we include both of those fields in the list.
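To make the roles of field_id and field_matching more concrete, here is a minimal Python sketch of the idea. It is not how KEE itself performs matching: the sample entities, the find_entities helper, and the naive case-insensitive substring comparison are all assumptions made for illustration.

```python
# Hypothetical entities, as they might be loaded from salespeople.csv.
# The actual contents of that file are not shown in this guide.
ENTITIES = [
    {"name": "Alice Smith", "email": "alice@example.com"},
    {"name": "Bob Jones", "email": "bob@example.com"},
]


def find_entities(text, entities, field_id, field_matching):
    """Return the field_id value of every entity mentioned in text.

    An entity counts as mentioned if the value of any field listed in
    field_matching occurs in the text (case-insensitive substring test).
    """
    lowered = text.lower()
    found = []
    for entity in entities:
        # Skip fields that are missing or empty for this entity.
        values = [entity[f] for f in field_matching if entity.get(f)]
        if any(value.lower() in lowered for value in values):
            found.append(entity[field_id])
    return found


doc = "Please forward the contract to bob@example.com before Alice Smith signs."
print(find_entities(doc, ENTITIES, "name", ["name", "email"]))
```

Here one salesperson is found through the name and the other through the email address, and both are reported by their field_id value.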
The hierarchy field indicates that there is a hierarchy among the entities in the CSV file, where the value in the "manager" field of one entity points to the name of that entity's parent entity (the person's manager).
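The parent lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows are invented, the ancestors helper is hypothetical, and reading "manager -> name" as "the manager field refers to another row's name field" follows the description in this guide rather than any documented parser.

```python
# Hypothetical rows; the real salespeople.csv is not shown in this guide.
ROWS = [
    {"name": "Alice Smith", "manager": ""},
    {"name": "Bob Jones", "manager": "Alice Smith"},
    {"name": "Dan Brown", "manager": "Bob Jones"},
]


def ancestors(rows, spec, entity_name):
    """Return the chain of parents for entity_name, nearest parent first.

    spec has the form "<reference field> -> <target field>", for example
    "manager -> name" (assumed interpretation).
    """
    ref_field, target_field = (part.strip() for part in spec.split("->"))
    by_target = {row[target_field]: row for row in rows}

    chain = []
    seen = {entity_name}
    current = by_target.get(entity_name)
    while current and current.get(ref_field):
        parent = current[ref_field]
        if parent in seen:
            break  # guard against circular references in the data
        chain.append(parent)
        seen.add(parent)
        current = by_target.get(parent)
    return chain


print(ancestors(ROWS, "manager -> name", "Dan Brown"))
```

An entity with an empty manager value, such as the first row here, is treated as the top of its hierarchy and has no ancestors.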
Creating a strategy
Once we have the KEE project pointed to the list of known entities, we want to create our first strategy for recognizing known entities within each document.
...