The balancer step evens out the number of elements per class in a data set. Note: if the batch size is smaller than the data set, the balancer only works within each batch. Balancing is needed so that the ML algorithm learns more general patterns instead of overfitting to the most populated class bucket.

Parameters

  • class_field: name of the key that holds the class of each element

  • classes: list of all classes which are used in the classification

  • not_class: boolean that states whether a not-class should be instantiated

  • output_label_field: field in which the labels are stored (only relevant if not_class is True)

  • deviation (optional): maximum deviation between the smallest and the largest class bucket (1.0 = 100%, 0.0 = 0%)

  • seed (optional): Seed for the randomization process
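
To illustrate how these parameters could interact, here is a minimal sketch of balancing by random undersampling within a single batch. It is not the actual implementation of the balancer step. The function name balance_batch is hypothetical, and the sketch assumes that deviation means "the largest bucket may hold at most (1 + deviation) times as many elements as the smallest non-empty bucket". The not_class and output_label_field parameters are left out because their exact behavior is not described here.

```python
import random
from collections import defaultdict


def balance_batch(batch, class_field, classes, deviation=0.0, seed=None):
    """Undersample a batch so that class buckets have similar sizes.

    Assumption (not confirmed by the balancer documentation): a bucket may
    hold at most (1 + deviation) times the size of the smallest non-empty
    bucket. Classes with no elements in the batch are ignored.
    """
    rng = random.Random(seed)  # a fixed seed makes the sampling reproducible

    # Group the elements of the batch by the value stored in class_field.
    buckets = defaultdict(list)
    for record in batch:
        label = record.get(class_field)
        if label in classes:
            buckets[label].append(record)
    if not buckets:
        return []

    smallest = min(len(items) for items in buckets.values())
    limit = int(smallest * (1.0 + deviation))

    # Randomly drop elements from every bucket that exceeds the limit.
    balanced = []
    for label in classes:
        items = buckets.get(label, [])
        if len(items) > limit:
            items = rng.sample(items, limit)
        balanced.extend(items)

    rng.shuffle(balanced)
    return balanced
```

Under this assumption, a deviation of 0.0 cuts every bucket down to the size of the smallest one, while a deviation of 1.0 lets a bucket grow to twice that size.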

Example

{
  "step": "balancer",
  "type": "balancer",
  "name": "balancer",
  "classes": ["A","B","C","D"],
  "class_field": "label",
  "not_class": false,
  "output_label_field": "balanced_label"
}
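
A configuration like this one can be checked before it is handed to the pipeline. The following is a small sketch under assumptions: check_balancer_config is a hypothetical helper, not part of the balancer, and it treats class_field, classes and not_class as required because they are not marked optional in the parameter list. It also requires output_label_field only when not_class is true.

```python
import json

# Parameters that are not marked optional in the parameter list (assumed required).
REQUIRED_KEYS = ("class_field", "classes", "not_class")


def check_balancer_config(config):
    """Return a list of problems found in a balancer step config (empty if none)."""
    problems = [
        f"missing required key: {key}" for key in REQUIRED_KEYS if key not in config
    ]
    # output_label_field only matters when a not-class is requested.
    if config.get("not_class") and "output_label_field" not in config:
        problems.append("output_label_field is required when not_class is true")
    return problems


config = json.loads("""
{
  "step": "balancer",
  "type": "balancer",
  "name": "balancer",
  "classes": ["A", "B", "C", "D"],
  "class_field": "label",
  "not_class": false,
  "output_label_field": "balanced_label"
}
""")
print(check_balancer_config(config))  # prints []
```

Parsing with json.loads also catches syntax errors such as a missing comma between two entries.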
