The balancer step evens out the distribution of the number of elements per class in a data set. (Attention: if the batch size is smaller than the size of the data set, the balancer works only within a batch.) Balancing is needed so that the ML algorithm learns more generally instead of overfitting to the most heavily populated class bucket.
Parameters
class_field: key name in which the classes are located
classes: list of all classes which are used in the classification
not_class: boolean which states whether a "not" class should be instantiated
output_label_field: field in which the labels are stored (only important if not_class is True)
deviation (optional): maximum deviation from the smallest class bucket to the largest bucket (1. = 100%, 0. = 0%)
seed (optional): seed for the randomization process
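To make the effect of deviation and seed more concrete, here is a minimal Python sketch of balancing by downsampling. It is not the Squirro implementation: the function name balance is hypothetical, and it assumes that deviation caps every class at the size of the smallest class times (1 + deviation). The not_class and output_label_field parameters are left out.

```python
import random
from collections import defaultdict


def balance(items, class_field, classes, deviation=0.0, seed=None):
    """Downsample so that no class exceeds smallest * (1 + deviation) items.

    Hypothetical helper; the cap formula is an assumption, not the
    documented behaviour of the balancer step.
    """
    rng = random.Random(seed)  # a fixed seed makes the sampling reproducible

    # Group the items into one bucket per known class.
    buckets = defaultdict(list)
    for item in items:
        label = item.get(class_field)
        if label in classes:
            buckets[label].append(item)
    if not buckets:
        return []

    # Assumed interpretation: deviation=0.0 forces equal bucket sizes,
    # deviation=1.0 lets a bucket grow to twice the smallest one.
    smallest = min(len(bucket) for bucket in buckets.values())
    cap = int(smallest * (1 + deviation))

    balanced = []
    for label in classes:
        bucket = buckets.get(label, [])
        if len(bucket) > cap:
            bucket = rng.sample(bucket, cap)  # random subset without replacement
        balanced.extend(bucket)
    return balanced


items = [{"label": "a"} for _ in range(8)] + [{"label": "b"} for _ in range(2)]
strict = balance(items, "label", ["a", "b"], deviation=0.0, seed=42)
loose = balance(items, "label", ["a", "b"], deviation=0.5, seed=42)
print(len(strict), len(loose))
```

With eight items of class a and two of class b, the strict call keeps two items per class, while deviation=0.5 allows up to three items of class a under this assumed cap.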
Example
...
This page can now be found at Models on the Squirro Docs site.