Why

Tagging a dataset creates a set of labeled training examples for text classification problems.

...

Setting up a training and inference job


Getting and Installing the widget

The widget for tagging datasets can be found here: https://github.com/squirro/delivery/tree/master/dashboard/widgets/dataset-tagging

It is added to a project as a custom widget. Details on how to upload a custom widget can be found here: squirro_asset Command Line Reference#Dashboardwidgets

Configuring the Widget

With data loaded into the project, we can set up the widget and start tagging examples.

...

  • Facet Name - The Facet that stores the labels added by humans
  • Tag Facet Name - The Facet that stores the tags predicted by the model
  • Labels to use - The available classes, listed here if they have not already been predicted by a model.
    • For example, if you have the classes "pos" and "neg", set this config option to "pos,neg" so that the widget offers both.
  • Show bulk tagging controls - If selected, a black bar will appear at the top of the widget, with the option to label the top 10 examples shown with a single click.
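
To make the "Labels to use" format concrete, the sketch below shows one way a comma-separated value such as "pos,neg" could be turned into a list of class names. This is an illustration only: the function name parse_labels is hypothetical, and the whitespace trimming and duplicate handling are assumptions, not documented behavior of the widget.

```python
from typing import List


def parse_labels(raw: str) -> List[str]:
    """Split a comma-separated "Labels to use" value into class names.

    Assumption (not taken from the widget): surrounding whitespace is
    ignored, and empty or repeated entries are dropped.
    """
    labels: List[str] = []
    for part in raw.split(","):
        name = part.strip()
        if name and name not in labels:
            labels.append(name)
    return labels


if __name__ == "__main__":
    print(parse_labels("pos,neg"))
```

Entering the value without spaces, exactly as in the "pos,neg" example, avoids relying on any such trimming.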

If training and inference jobs are already set up, you will see the prediction strength of each class on each example in the project (the darker a class shows up, the more strongly the model predicts that label). At this point, you are ready to start labeling examples and training the model.
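
As a rough mental model of the shading, a prediction score can be mapped to a display opacity so that stronger predictions appear darker. The sketch below is an assumption for illustration: the function shade_for_score, the 0-to-1 score range, and the linear mapping are all hypothetical and may not match how the widget actually renders prediction strength.

```python
def shade_for_score(score: float, min_opacity: float = 0.1) -> float:
    """Map a prediction score to an opacity between min_opacity and 1.0.

    Assumption: scores lie in [0, 1]; values outside are clamped.
    A higher score yields a higher opacity, i.e. a darker label.
    """
    clamped = max(0.0, min(1.0, score))
    return min_opacity + (1.0 - min_opacity) * clamped


if __name__ == "__main__":
    # Hypothetical scores for a single example.
    for label, score in {"pos": 0.85, "neg": 0.15}.items():
        print(label, shade_for_score(score))
```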

Usage

It is typically easiest to start by using the search bar to find good examples to tag. Any type of search can be combined with the data labeling widget, so there are many ways to find good starting examples for each class, such as:

...