...

  1. Use the Jupyter notebook setup as described here to do your exploratory data analysis (EDA), develop your classifier and, if dedicated hardware is available, train your machine learning models.

  2. Create a lib/nlp step that either includes the classifier or can run the pre-trained model:

    1. To start, get access to the lib/nlp repository on GitHub, then clone and install it.

    2. In addition, consult the classifier page of the lib/nlp documentation (https://squirro.github.io/nlp/api/steps/classifiers/). To make things easier, inherit from the Classifier base class, which comes with some predefined parameters such as input_field, input_fields, label_field and output_field. Below is a template that you can fill in with the code for your classifier:

      Code block (Python):
      """Custom classifier class"""
      
      from squirro.lib.nlp.steps.classifiers.base import Classifier
      
      class CustomClassifier(Classifier):
          """Custom #Classifier. 
      
          # Parameters
          type (str): `my_custom_classifier`
          my_parameter (str): my parameter
          """
      
          def __init__(self, config):
              super(CustomClassifier, self).__init__(config)
      
          def process(self, docs):
              """Run inference on the incoming documents."""
              return docs
      
          def train(self, docs):
              """Train your model (may be a no-op for a pre-trained model)."""
              return self.process(docs)
      
    3. To make it work, we also need to look at the incoming data structure. Both functions, train and process, receive a list of Documents. Which fields are populated depends on the prior steps and their configuration.

    4. Note: The train function can be a pseudo function, especially if a pre-trained model is used. That is, it does not need to train anything when a trained model is provided and the step is only meant to run inference.
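
As a rough illustration of points 3 and 4, the sketch below shows an inference-only step whose train function simply delegates to process. It is self-contained so that it runs without lib/nlp installed: `Document` and `Classifier` here are simplified stand-ins, not the real lib/nlp classes. The `fields` attribute, the config keys, the `keywords` parameter and the `KeywordClassifier` name are all assumptions made for this example; check the lib/nlp documentation for the actual document structure and base class behaviour.

```python
"""Sketch of an inference-only classifier step, using stand-in classes."""

from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Document:
    """Stand-in document (assumption: an id plus a dict of named fields)."""
    id: str
    fields: Dict[str, Any] = field(default_factory=dict)


class Classifier:
    """Stand-in base class (assumption: parameters are read from a config dict)."""

    def __init__(self, config: Dict[str, Any]):
        self.config = config
        self.input_fields = config.get("input_fields", [])
        self.label_field = config.get("label_field")
        self.output_field = config.get("output_field")


class KeywordClassifier(Classifier):
    """Hypothetical step that wraps an already 'trained' keyword lookup."""

    def __init__(self, config: Dict[str, Any]):
        super().__init__(config)
        # Hypothetical parameter; stands in for loading a real pre-trained model.
        self.keywords = config.get("keywords", {})

    def process(self, docs: List[Document]) -> List[Document]:
        for doc in docs:
            # Use only the input fields that prior steps actually populated.
            text = " ".join(
                str(doc.fields[name])
                for name in self.input_fields
                if name in doc.fields
            ).lower()
            # First matching keyword wins; fall back to "unknown".
            label = next(
                (lbl for kw, lbl in self.keywords.items() if kw in text),
                "unknown",
            )
            doc.fields[self.output_field] = label
        return docs

    def train(self, docs: List[Document]) -> List[Document]:
        # Pseudo train function: nothing is fitted, the model already exists.
        return self.process(docs)


config = {
    "input_fields": ["title", "body"],
    "output_field": "prediction",
    "keywords": {"invoice": "finance", "contract": "legal"},
}
step = KeywordClassifier(config)
docs = [
    Document("1", {"body": "Invoice overdue"}),
    Document("2", {"title": "Hello"}),
]
result = step.train(docs)
```

Note that the second document has no `body` field at all. Because process skips missing fields instead of indexing them directly, the step still works when an earlier step is configured differently and leaves some fields unpopulated.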

...