In the AI Studio, we provide a process to build binary and multi-class machine learning classifiers. When it comes to Proximity Filters (our rule-based approach), we provide a way to generate binary classifiers and multi-label classifiers.

You may ask what the difference is between the various types of classification. Here you find a hands-on summary, or the types directly split up:
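As a rough illustration of the three classification types, the following Python sketch shows what a prediction looks like in each case. The label strings and example values are illustrative only and are not taken from an actual AI Studio project.

```python
# Illustrative only: label names and example predictions are made up.
LABELS = ["management change", "plan to invest", "IPO"]

# Binary: a single label, and the answer is yes or no.
binary_prediction = {"IPO": True}

# Multi-class: exactly one label out of several is assigned.
multi_class_prediction = "management change"

# Multi-label: any number of labels (including none) may be assigned.
multi_label_prediction = {"management change", "IPO"}


def is_valid_multi_class(prediction, labels):
    """A multi-class prediction is a single known label."""
    return prediction in labels


def is_valid_multi_label(prediction, labels):
    """A multi-label prediction is any subset of the known labels."""
    return set(prediction) <= set(labels)
```

The key point for the rest of this page is the last case: a multi-label classifier may attach several labels to the same sentence, or none at all.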

This document walks you through setting up a multi-label proximity filter classifier with the AI Studio:

Step 1: Candidate Set

Create Candidate Sets to filter the data corpus down to a more relevant subset of documents. This will help you later in the process. In this example, we focus on management change, plan to invest and IPO:

...

Step 2: Ground Truth (rule generation)

First, the Ground Truth needs to be defined. It is important that all the labels below are added and the proximity search checkbox is activated:

...

Next, you can start to generate rules for the different labels:

  1. Either by labeling in the List view or the Focus view:


  2. Or in the Rule overview tab:

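To make the idea of rule-based, multi-label classification more concrete, here is a minimal Python sketch. It is not the AI Studio implementation: the rule format (two terms and a maximum token distance), the `RULES` table, and the function names are all assumptions made for illustration.

```python
import re

# Hypothetical rules: each label maps to a list of
# (term_a, term_b, max_distance_in_tokens) tuples.
RULES = {
    "management change": [("ceo", "resigns", 5), ("appoints", "cfo", 5)],
    "IPO": [("files", "ipo", 6)],
}


def tokenize(sentence):
    """Lowercase the sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def rule_matches(tokens, term_a, term_b, max_distance):
    """True if both terms occur at most max_distance tokens apart."""
    positions_a = [i for i, t in enumerate(tokens) if t == term_a]
    positions_b = [i for i, t in enumerate(tokens) if t == term_b]
    return any(abs(a - b) <= max_distance for a in positions_a for b in positions_b)


def classify(sentence, rules=RULES):
    """Return every label that has at least one matching rule (multi-label)."""
    tokens = tokenize(sentence)
    return sorted(
        label
        for label, label_rules in rules.items()
        if any(rule_matches(tokens, *rule) for rule in label_rules)
    )


print(classify("The CEO resigns as the company files for an IPO."))
# prints ['IPO', 'management change']
```

Because each label's rules are evaluated independently, one sentence can end up with several labels, which is what distinguishes the multi-label case from the binary one.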

Step 3: Model

After defining the rules, you can move forward to build a proximity filter classifier model:

...

To generate a multi-label proximity filter, you need to add all labels (comma-separated) to the Label Tags field in the Configure Template screen:

...
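If the label list is maintained elsewhere, for example in a script, the comma-separated value can be assembled programmatically before pasting it into the field. This is only a convenience sketch: the helper name `to_label_tags` is hypothetical, and whether the field tolerates spaces after commas is not covered here, so the sketch emits none.

```python
def to_label_tags(labels):
    """Join label names into one comma-separated string.

    Surrounding whitespace is stripped, and blank or duplicate
    entries are dropped while the original order is kept.
    """
    seen = []
    for label in labels:
        cleaned = label.strip()
        if cleaned and cleaned not in seen:
            seen.append(cleaned)
    return ",".join(seen)


label_tags = to_label_tags(["management change", "plan to invest", "IPO", " IPO "])
print(label_tags)  # prints management change,plan to invest,IPO
```

The label names must match the labels defined in the Ground Truth exactly.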

The Proximity Filter is generated after clicking on Save and Train and can be viewed in the model overview:

...

Step 4: Validation

The validation of the proximity filter can be viewed by clicking on Validate:

...

Note: This screen is empty if there are no sentences labeled in the Ground Truth.
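The dependency on labeled sentences can be illustrated with a generic per-label evaluation. The sketch below is an assumption about how such a validation could be computed, using precision and recall per label; it does not describe which metrics the Validate screen actually shows, and the function name is hypothetical.

```python
def per_label_scores(ground_truth, predictions, labels):
    """Compute precision and recall for each label.

    ground_truth and predictions are lists of label sets, one per sentence.
    With no labeled sentences there is nothing to compare against,
    so the result is empty.
    """
    scores = {}
    if not ground_truth:
        return scores
    pairs = list(zip(ground_truth, predictions))
    for label in labels:
        tp = sum(1 for g, p in pairs if label in g and label in p)
        fp = sum(1 for g, p in pairs if label not in g and label in p)
        fn = sum(1 for g, p in pairs if label in g and label not in p)
        scores[label] = {
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0,
        }
    return scores


truth = [{"IPO"}, {"IPO", "management change"}, set()]
preds = [{"IPO"}, {"IPO"}, {"management change"}]
print(per_label_scores(truth, preds, ["IPO", "management change"]))
```

Each label is scored on its own, since in the multi-label setting a sentence can be correct for one label and wrong for another.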

Step 5: Publish

The multi-label proximity filter can be published and used in the data ingestion pipeline in the same way as any other AI Studio model. The only difference is that a sentence can now be classified with more than one of the labels.

This page can now be found at Multi-Label Proximity Filter on the Squirro Docs site.