In the AI Studio, we provide a process to build binary and multi-class machine learning classifiers. When it comes to Proximity Filters (our rule-based approach), we provide a way to generate binary classifiers and multi-label classifiers.
You may wonder how the various types of classification differ. In short: a binary classifier decides whether a single label applies, a multi-class classifier assigns exactly one of several labels, and a multi-label classifier can assign several labels at once.
This document walks you through setting up a multi-label proximity filter classifier with the AI Studio:
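To make the distinction between the classification types concrete, here is a minimal Python sketch of what each kind of prediction could look like for the three labels used in this walkthrough. The variable and function names are purely illustrative assumptions and do not reflect the AI Studio's internal data format.

```python
# Illustrative output shapes only; not the AI Studio's actual data format.
LABELS = ["management change", "plant to invest", "IPO"]

# Binary: a single yes/no decision for one label.
binary_prediction = {"IPO": True}

# Multi-class: exactly one label out of several.
multi_class_prediction = "IPO"

# Multi-label: any subset of the labels, including none or several at once.
multi_label_prediction = {"management change", "IPO"}


def is_valid_multi_class(prediction: str) -> bool:
    """A multi-class prediction must be exactly one known label."""
    return prediction in LABELS


def is_valid_multi_label(prediction: set) -> bool:
    """A multi-label prediction may be any subset of the known labels."""
    return prediction <= set(LABELS)
```

The key point is the last case: a multi-label result is a set, so the same sentence can carry several labels or none.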
Step 1: Candidate Set
Create Candidate Sets to filter the data corpus down to a more relevant subset of documents. This will help you later in the process. In this example, we focus on management change, plant to invest, and IPO:
...
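Conceptually, a candidate set narrows the corpus before any labeling happens. The following sketch shows the idea with a simple keyword filter. It assumes a keyword-based selection, which is only one possible way to define a candidate set; the function name `build_candidate_set` and the sample keywords are hypothetical and not part of the AI Studio.

```python
from typing import Iterable, List


def build_candidate_set(documents: Iterable[str], keywords: Iterable[str]) -> List[str]:
    """Keep only documents that mention at least one keyword (case-insensitive)."""
    lowered = [keyword.lower() for keyword in keywords]
    return [
        doc for doc in documents
        if any(keyword in doc.lower() for keyword in lowered)
    ]


# Hypothetical mini corpus for illustration.
corpus = [
    "The company announced a new CEO on Monday.",
    "Quarterly weather report for the region.",
    "The firm is preparing for its IPO next year.",
]

candidates = build_candidate_set(corpus, ["ceo", "ipo", "invest"])
```

In this example, the weather report is dropped, so later labeling only has to deal with the two documents that are plausibly relevant.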
Step 2: Ground Truth (rule generation)
First, the Ground Truth needs to be defined. It is important that all the labels below are added and the proximity search checkbox is activated:
...
Next, you can start to generate rules for the different labels:
Either by labeling in the List view or the Focus view:
Or in the Rule overview tab:
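To illustrate what a rule-based, multi-label proximity classifier does in principle, here is a small sketch. It assumes a rule is a pair of terms that must appear within a maximum token distance of each other. This is a simplification for illustration, not the rule format the AI Studio generates, and the rules in `RULES` are invented examples.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical rule shape: (first term, second term, maximum token distance).
Rule = Tuple[str, str, int]

RULES: Dict[str, List[Rule]] = {
    "management change": [("appointed", "ceo", 5)],
    "plant to invest": [("invest", "plant", 6)],
    "IPO": [("initial", "offering", 2), ("files", "ipo", 4)],
}


def tokenize(sentence: str) -> List[str]:
    """Lowercase the sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def rule_matches(tokens: List[str], rule: Rule) -> bool:
    """True if both terms occur within max_distance tokens of each other."""
    first, second, max_distance = rule
    first_positions = [i for i, token in enumerate(tokens) if token == first]
    second_positions = [i for i, token in enumerate(tokens) if token == second]
    return any(
        abs(i - j) <= max_distance
        for i in first_positions
        for j in second_positions
    )


def classify(sentence: str) -> Set[str]:
    """Return every label that has at least one matching rule."""
    tokens = tokenize(sentence)
    return {
        label
        for label, rules in RULES.items()
        if any(rule_matches(tokens, rule) for rule in rules)
    }
```

Because each label is checked independently, a sentence such as "The board appointed a new CEO as the company files for an IPO." receives both the management change and the IPO label.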
Step 3: Model
After defining the rules, you can move forward to build a proximity filter classifier model:
...
To generate a multi-label proximity filter, you need to add all labels (comma-separated) to the Label Tags field in the Configure Template screen:
...
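As a rough illustration of the comma-separated format, the sketch below splits such a string into individual labels. How the Label Tags field actually treats whitespace or duplicates is not documented here, so the trimming and de-duplication in the hypothetical `parse_label_tags` helper are assumptions.

```python
from typing import List


def parse_label_tags(raw: str) -> List[str]:
    """Split a comma-separated string into labels, trimming spaces and skipping blanks and duplicates."""
    labels: List[str] = []
    for part in raw.split(","):
        label = part.strip()
        if label and label not in labels:
            labels.append(label)
    return labels


label_tags = parse_label_tags("management change, plant to invest, IPO")
```

For the labels in this example, the input string would be `management change, plant to invest, IPO`.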
The Proximity Filter is generated after clicking on Save and Train and can be viewed in the model overview:
...
Step 4: Validation
The validation of the proximity filter can be viewed by clicking on Validate:
...
Note: This screen is empty if there are no sentences labeled in the Ground Truth.
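The note above follows from how validation works in general: predictions can only be scored against labeled sentences. The sketch below computes per-label precision and recall for a multi-label classifier. Which metrics the Validate screen actually shows is not stated here, so the metric choice and the function name `per_label_scores` are assumptions.

```python
from typing import Dict, List, Set, Tuple

# Each example pairs the Ground Truth labels with the predicted labels.
Example = Tuple[Set[str], Set[str]]


def per_label_scores(examples: List[Example], labels: List[str]) -> Dict[str, Dict[str, float]]:
    """Compute precision and recall per label; return nothing without labeled data."""
    if not examples:
        return {}  # no labeled sentences, so there is nothing to validate
    scores: Dict[str, Dict[str, float]] = {}
    for label in labels:
        tp = sum(1 for truth, pred in examples if label in truth and label in pred)
        fp = sum(1 for truth, pred in examples if label not in truth and label in pred)
        fn = sum(1 for truth, pred in examples if label in truth and label not in pred)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores[label] = {"precision": precision, "recall": recall}
    return scores


# Hypothetical labeled sentences: (ground truth, prediction).
examples = [
    ({"IPO"}, {"IPO"}),
    ({"IPO", "management change"}, {"management change"}),
    (set(), {"IPO"}),
]
scores = per_label_scores(examples, ["IPO", "management change"])
```

Scoring each label separately is a common choice for multi-label models, because a sentence can be right for one label and wrong for another at the same time.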
Step 5: Publish
The multi-label proximity filter can be published and used in the data ingestion pipeline in the same way as every other AI Studio model. The only difference is that sentences can now potentially be classified with more than one of the labels.
This page can now be found at Multi-Label Proximity Filter on the Squirro Docs site.