...
Use the Jupyter notebook setup as described here to do your EDA, develop your classifier and, if dedicated hardware is available, train your machine learning models.
Create a lib/nlp step which includes the classifier or is able to run the pre-trained model:
To start, you need access to the lib/nlp repository on GitHub. Clone it and install it.
In addition, consult the lib/nlp documentation (https://squirro.github.io/nlp/api/steps/classifiers/). To make things easier, we can inherit from the classifier base class, which comes with some pre-defined parameters such as `input_field`, `input_fields`, `label_field` and `output_field`. Below is a template that you can fill in with the code for your classifier:

```py
"""Custom classifier class"""
from squirro.lib.nlp.steps.classifiers.base import Classifier


class CustomClassifier(Classifier):
    """Custom #Classifier.

    # Parameters
    type (str): `my_custom_classifier`
    my_parameter (str): my parameter
    """

    def __init__(self, config):
        super(CustomClassifier, self).__init__(config)

    def process(self, docs):
        """ process/execute inference job on the incoming data """
        return docs

    def train(self, docs):
        """ train your model """
        return self.process(docs)
```
To make it work, we also need to look at the incoming data structure. Both functions, `train` and `process`, receive a list of Documents. Which fields are populated depends on the prior steps and their configuration.
Note: The `train` function can be a pseudo function, especially if a pre-trained model is used. That is, `train` does not need to actually train a model if a trained model is provided and the step is only meant for running inference.
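To illustrate the data flow and the pseudo `train` function, here is a minimal, self-contained sketch. It does not import lib/nlp: the `Document` and `Classifier` classes below are hypothetical stand-ins that only assume a document carries a dict of fields and that the base class exposes `input_fields`, `label_field` and `output_field` from the step configuration. The `keywords` parameter and the `KeywordClassifier` name are made up for the example and stand in for a pre-trained model. Check the lib/nlp documentation for the real class interfaces before adapting it.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Document:
    """Hypothetical stand-in for a lib/nlp document: an id plus a dict of fields."""
    id: str
    fields: Dict[str, Any] = field(default_factory=dict)


class Classifier:
    """Hypothetical stand-in for the lib/nlp classifier base class."""

    def __init__(self, config: Dict[str, Any]):
        # Assumption: the base class exposes these pre-defined parameters.
        self.input_fields = config.get("input_fields", [])
        self.label_field = config.get("label_field")
        self.output_field = config.get("output_field")


class KeywordClassifier(Classifier):
    """Toy classifier that labels a document by keyword lookup."""

    def __init__(self, config: Dict[str, Any]):
        super().__init__(config)
        # Stands in for loading a pre-trained model (made-up parameter).
        self.keywords = config.get("keywords", {})

    def process(self, docs: List[Document]) -> List[Document]:
        """Run inference and write the result into the output field."""
        for doc in docs:
            # Concatenate all configured input fields into one lowercase string.
            text = " ".join(
                str(doc.fields.get(name, "")) for name in self.input_fields
            ).lower()
            # First matching keyword wins; fall back to "unknown".
            doc.fields[self.output_field] = next(
                (label for word, label in self.keywords.items() if word in text),
                "unknown",
            )
        return docs

    def train(self, docs: List[Document]) -> List[Document]:
        """Pseudo train: the 'model' is already provided, so only run inference."""
        return self.process(docs)


config = {
    "input_fields": ["body"],
    "output_field": "prediction",
    "keywords": {"invoice": "finance"},
}
docs = [
    Document("1", {"body": "Invoice for March"}),
    Document("2", {"body": "Hello there"}),
]
result = KeywordClassifier(config).train(docs)
```

Because `train` simply delegates to `process`, the step behaves the same way in a training and in an inference workflow. In both cases the documents come back with the output field populated.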
...