Use the Jupyter notebook setup as described here to do your EDA, develop your classifier and, if dedicated hardware is available, train your machine learning models.

Create a lib/nlp step that either includes the classifier or can run the pre-trained model:

To start, you need access to the lib/nlp repository on GitHub. Clone it and install it.

In addition, consult the lib/nlp documentation (https://squirro.github.io/nlp/api/steps/classifiers/). To make things easier, inherit from the classifier base class, which comes with some pre-defined parameters such as input_field, input_fields, label_field and output_field. Below is a template that you can fill in with the code for your classifier:

"""Custom classifier class"""

from squirro.lib.nlp.steps.classifiers.base import Classifier

class CustomClassifier(Classifier):
    """Custom #Classifier. 

    # Parameters
    type (str): `my_custom_classifier`
    my_parameter (str): my parameter
    """

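    def __init__(self, config):
        # Initialise the base class, which comes with the pre-defined parameters.
        super(CustomClassifier, self).__init__(config)

    def process(self, docs):
        """Run inference on the incoming documents."""
        # Add your prediction logic here; docs is a list of Documents.
        return docs

    def train(self, docs):
        """Train your model (can be a no-op if a pre-trained model is used)."""
        return self.process(docs)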

To make it work, we also need to look at the incoming data structure. Both train and process receive a list of Documents. Which fields are populated depends on the prior steps and their configuration.
Note: The train function can be a pseudo function, especially if a pre-trained model is used. That is, it does not need to actually train a model if a trained model is provided and the step is only meant to run inference.
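
To illustrate the data flow described above, here is a minimal, self-contained sketch. It does not use the real lib/nlp classes: `Doc` and `KeywordClassifierSketch` are hypothetical stand-ins, and the assumption that a document carries a `fields` dictionary and that parameters arrive as a plain config dictionary should be checked against the lib/nlp documentation. The keyword rule merely stands in for a pre-trained model.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for a lib/nlp document (assumed: id + fields dict)."""
    id: str
    fields: Dict[str, Any] = field(default_factory=dict)


class KeywordClassifierSketch:
    """Stand-in classifier for illustration; NOT the real lib/nlp base class."""

    def __init__(self, config: Dict[str, Any]):
        # Parameter names mirror the pre-defined ones mentioned above.
        # How the real base class exposes them is an assumption.
        self.input_fields = config.get("input_fields", ["body"])
        self.output_field = config.get("output_field", "prediction")
        # my_parameter is used here as the keyword to look for.
        self.keyword = str(config.get("my_parameter", "invoice")).lower()

    def process(self, docs: List[Doc]) -> List[Doc]:
        """Run 'inference': write a label into the output field of each doc."""
        for doc in docs:
            # Only read fields that earlier steps actually populated.
            text = " ".join(
                str(doc.fields[name])
                for name in self.input_fields
                if name in doc.fields
            )
            hit = self.keyword in text.lower()
            doc.fields[self.output_field] = "match" if hit else "no_match"
        return docs

    def train(self, docs: List[Doc]) -> List[Doc]:
        """Pseudo train function: nothing to fit, so just run inference."""
        return self.process(docs)


docs = [
    Doc("1", {"body": "Invoice attached"}),
    Doc("2", {"title": "Hello"}),
]
step = KeywordClassifierSketch(
    {"input_fields": ["title", "body"], "my_parameter": "invoice"}
)
result = step.train(docs)
```

The second document has no body field, so the sketch simply skips it when building the text. A real step should handle missing fields in a similar defensive way, because the populated fields depend on the prior steps.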
