Introduction

Known Entity Extraction (commonly abbreviated KEE) is Squirro's technology for enriching unstructured data by linking it to company-specific structured information.

Examples of such structured information, and of how it can be linked to unstructured documents, include:

  • Company list - extracted from a CRM system such as Salesforce. 

    • The companies in the list can then be tagged within unstructured documents such as email conversations, news articles, call notes, etc.

  • Portfolio of securities held by a specific investor - extracted from an asset management system. 

    • Each security mentioned in a news article can be linked to a specific position held by an investor.

  • List of people - extracted from a user authentication system. 

    • Each person can then be tagged in emails and call notes where they are mentioned.

  • Product lists from internal databases. 

    • Each of a company's products can be tagged in social media content and public web news where it is referenced. These references can automatically be made visible to the right product team.
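The company list example above can be illustrated with a minimal sketch. This is not Squirro's implementation and does not use the KEE pipelet or the kee utility; it only shows the general idea of matching a structured list against free text. The names `COMPANIES`, `build_lookup` and `tag_item`, the record fields, and the `company` / `company_id` keyword names are all hypothetical, and the item is assumed to be a plain dictionary with `title`, `body` and `keywords` fields.

```python
import re

# Hypothetical structured data, for example exported from a CRM system.
COMPANIES = [
    {"id": "C-001", "name": "Acme Corp", "aliases": ["Acme"]},
    {"id": "C-002", "name": "Globex", "aliases": ["Globex Corporation"]},
]


def build_lookup(entities):
    """Map every lower-cased name and alias to its entity record."""
    lookup = {}
    for entity in entities:
        for label in [entity["name"], *entity["aliases"]]:
            lookup[label.lower()] = entity
    return lookup


def add_unique(values, value):
    """Append value to the list only if it is not already present."""
    if value not in values:
        values.append(value)


def tag_item(item, lookup):
    """Return a copy of the item with matched entities added as keywords."""
    text = " ".join([item.get("title", ""), item.get("body", "")]).lower()
    # Copy existing keywords so the input item is left unmodified.
    keywords = {k: list(v) for k, v in item.get("keywords", {}).items()}
    for label, entity in lookup.items():
        # Whole-word match, so "Acme" does not match inside "Acmetech".
        if re.search(r"\b" + re.escape(label) + r"\b", text):
            add_unique(keywords.setdefault("company", []), entity["name"])
            add_unique(keywords.setdefault("company_id", []), entity["id"])
    return {**item, "keywords": keywords}


if __name__ == "__main__":
    lookup = build_lookup(COMPANIES)
    note = {
        "title": "Meeting notes",
        "body": "Call with Acme about renewal. Globex Corporation was mentioned.",
    }
    print(tag_item(note, lookup)["keywords"])
```

A real lookup would also have to deal with ambiguous names, spelling variants and large entity lists, which this sketch ignores.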

This documentation explains how to create these links between structured and unstructured information using the Known Entity Extraction functionality. As this is a component of Squirro, make sure you are familiar with the core Squirro concepts, especially the Squirro Architecture and the Item Format.

Usage

As data is loaded into a project, Known Entity Extraction is performed using a plugin to the data enrichment pipeline (a pipelet) provided by Squirro.

The KEE pipelet relies on a lookup database. That database must be recompiled whenever the original data or the settings of the KEE project change. To create the lookup database, use the kee utility, which is installed as part of the Toolbox. The following pages document how to work with this utility:

Known Entity Extraction can also be set up directly in the Squirro user interface. That process is documented in:

For advanced use cases that the default behavior does not cover, the pipelet can be extended by subclassing it:

...

This page can now be found at Known Entity Extraction on the Squirro Docs site.