The balancer step evens out the number of elements per class in a data set. Attention: if the batch size is smaller than the size of the data set, the balancer only works within a single batch. Balancing is needed so that the ML algorithm learns to generalize instead of overfitting to the most populated class bucket.

Parameters

  • class_field: name of the key under which the class of each element is stored

  • classes: list of all classes which are used in the classification

  • not_class: boolean which states whether a "not" class should be instantiated

  • output_label_field: field in which the labels are stored (only relevant if not_class is True)

  • deviation (optional): maximum deviation between the smallest and the largest class bucket (1. = 100%, 0. = 0%)

  • seed (optional): Seed for the randomization process
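To illustrate the idea behind these parameters, here is a minimal Python sketch of balancing by downsampling within one batch. It is not the actual implementation of the step: the function name balance_batch is hypothetical, and the reading of deviation (the largest bucket may hold at most (1 + deviation) times as many elements as the smallest one) is an assumption. Handling of not_class and output_label_field is left out, and elements whose class is not listed in classes are simply dropped here.

```python
import random
from collections import defaultdict


def balance_batch(items, class_field, classes, deviation=0.0, seed=None):
    """Hypothetical helper: downsample class buckets within a single batch.

    Assumption: a bucket may hold at most (1 + deviation) times as many
    elements as the smallest non-empty bucket.
    """
    rng = random.Random(seed)  # a fixed seed makes the sampling reproducible

    # Group the elements of the batch by the value found under class_field.
    buckets = defaultdict(list)
    for item in items:
        label = item.get(class_field)
        if label in classes:
            buckets[label].append(item)

    if not buckets:
        return []

    # Upper bound for every bucket, derived from the smallest bucket.
    smallest = min(len(bucket) for bucket in buckets.values())
    limit = int(smallest * (1.0 + deviation))

    balanced = []
    for label in classes:
        bucket = buckets.get(label, [])
        if len(bucket) > limit:
            # Randomly drop surplus elements from over-populated buckets.
            bucket = rng.sample(bucket, limit)
        balanced.extend(bucket)
    return balanced


if __name__ == "__main__":
    batch = (
        [{"label": "a"} for _ in range(10)]
        + [{"label": "b"} for _ in range(4)]
        + [{"label": "c"} for _ in range(2)]
    )
    result = balance_batch(batch, "label", ["a", "b", "c"], deviation=1.0, seed=42)
    print(len(result))
```

With deviation set to 0. in this sketch, every bucket is cut down to the size of the smallest one; larger values allow the buckets to differ more.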

Example

...

This page can now be found at Models on the Squirro Docs site.