Page Comparison

...

Language detection
Language-specific spaCy analysis, applied using the pre-trained spaCy language model for the detected language (see example). The analysis includes:
- Tokenization and lemmatization
- Part of Speech (POS) tagging
- Named Entity Recognition (NER)
Part of Speech Booster / Filter
- Assigns weight to tokens based on their POS tags
- Conjunctions and determiners are removed
Query Modifier
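
The Part of Speech Booster / Filter step can be pictured with a small, library-free sketch. It is not the ML Service implementation: the function name `boost_terms`, the `term^weight` output syntax, and the reading of `"-"` as "keep the term without a boost" are assumptions made for illustration. The weights are taken from the pre-configured pipeline, and the input is assumed to be already tagged with Universal POS tags.

```python
from typing import Dict, List, Tuple, Union

# Weights copied from the pre-configured pipeline.
# ASSUMPTION: "-" means "keep the term, but apply no boost".
POS_WEIGHT_MAP: Dict[str, Union[int, str]] = {
    "PROPN": 10, "NOUN": 10, "VERB": 2, "ADJ": 5,
    "X": "-", "NUM": "-", "SYM": "-",
}

# Universal POS tags for conjunctions and determiners, which the step removes.
REMOVED_POS = {"CCONJ", "SCONJ", "DET"}


def boost_terms(tagged: List[Tuple[str, str]]) -> List[str]:
    """Return query terms, using a hypothetical `term^weight` boost syntax."""
    result = []
    for text, pos in tagged:
        if pos in REMOVED_POS:
            continue  # conjunctions and determiners are filtered out
        weight = POS_WEIGHT_MAP.get(pos, "-")
        # Terms without a numeric weight are passed through unchanged.
        result.append(text if weight == "-" else f"{text}^{weight}")
    return result


# Hand-tagged example query; in the real workflow the tags come from spaCy.
tagged_query = [
    ("the", "DET"), ("latest", "ADJ"), ("Tesla", "PROPN"),
    ("and", "CCONJ"), ("earnings", "NOUN"), ("2023", "NUM"),
]
print(boost_terms(tagged_query))
# → ['latest^5', 'Tesla^10', 'earnings^10', '2023']
```

Proper nouns and nouns receive the highest weight in this map, so they dominate the ranking, while function words never reach the query.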

...

You can configure the available workflows under AI STUDIO > ML Workflows.

Every project comes with a default query-processing workflow. This default workflow is read-only and cannot be deleted or modified. It is managed by the Machine-Learning (ML) Service and is automatically updated to the latest version.

The default query-processing workflow is set as the ACTIVE QUERY PROCESSOR and is listed along with any other custom workflow.

...

Hover over a workflow and click SET ACTIVE to make it the ACTIVE QUERY PROCESSOR.

...


If you want to customise the behaviour of the default query-processing workflow, you can CLONE the workflow and edit its configuration.
Then hover over the newly created workflow and click SET ACTIVE to make the cloned workflow the ACTIVE QUERY PROCESSOR.
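
As an illustration of the kind of change you might make in a cloned workflow, the sketch below adds a further language to the `spacy_model_mapping` of the `custom_spacy_normalizer` step. It works on a plain Python dictionary rather than on the ML Service itself. The helper `with_extra_language` is hypothetical, the configuration is trimmed to a single step, and whether the `fr_core_news_sm` model is installed in your deployment is an assumption you would need to verify.

```python
import copy
import json

# Trimmed stand-in for the default workflow configuration (one step only).
default_config = {
    "component": "Query-Processing",
    "pipeline": [
        {
            "step": "custom",
            "name": "custom_spacy_normalizer",
            "type": "analysis",
            "spacy_model_mapping": {
                "en": "en_core_web_sm",
                "de": "de_core_news_sm",
            },
        }
    ],
}


def with_extra_language(config, lang, model):
    """Return a deep copy of `config` with one more spaCy model mapping."""
    cloned = copy.deepcopy(config)  # the default configuration stays untouched
    for step in cloned["pipeline"]:
        if step.get("name") == "custom_spacy_normalizer":
            step["spacy_model_mapping"][lang] = model
    return cloned


# ASSUMPTION: the French model is available wherever the workflow runs.
custom_config = with_extra_language(default_config, "fr", "fr_core_news_sm")
print(json.dumps(custom_config["pipeline"][0]["spacy_model_mapping"], indent=2))
```

The resulting JSON could then be pasted into the configuration of the cloned workflow.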

...

The default query processing workflow cannot be deleted, but it can be disabled. To disable query processing, navigate to SETTINGS > Project Configuration and remove the topic.search.query-workflow option by clicking the RESET button.

Info

During startup, the ML Service automatically adds the default query processing workflow to any project that doesn’t have it.

Because each project has its own default workflow, the default query processing workflow is not imported when a project is imported.

Query Processing Workflow Steps

...

Pre-configured query processing pipeline steps:

{
  "component": "Query-Processing",
  "cacheable": true,
  "dataset": {
    "items": []
  },
  "pipeline": [
    {
      "fields": [
        "query",
        "user_terms",
        "facet_filters"
      ],
      "step": "loader",
      "type": "squirro_item"
    },
    {
      "step": "custom",
      "type": "parse",
      "name": "syntax_parser"
    },
    {
      "step": "custom",
      "type": "analysis",
      "name": "lang_detection",
      "input_field": "user_terms_str"
    },
    {
      "step": "custom",
      "name": "custom_spacy_normalizer",
      "type": "analysis",
      "infix_split_hyphen": false,
      "infix_split_chars": ":<>=",
      "merge_entities": true,
      "merge_noun_chunks": false,
      "cacheable": true,
      "input_fields": [
        "user_terms_str"
      ],
      "output_fields": [
        "nlp"
      ],
      "exclude_spacy_pipes": [],
      "spacy_model_mapping": {
        "en": "en_core_web_sm",
        "de": "de_core_news_sm"
      }
    },
    {
      "step": "custom",
      "type": "enrich",
      "name": "pos_booster",
      "strict_filter": true,
      "analyzed_input_field": "nlp",
      "phrase_proximity_distance": 15,
      "pos_weight_map": {
        "PROPN": 10,
        "NOUN": 10,
        "VERB": 2,
        "ADJ": 5,
        "X": "-",
        "NUM": "-",
        "SYM": "-"
      }
    },
    {
      "step": "custom",
      "type": "enrich",
      "name": "query_modifier",
      "raw_input_field": "query",
      "term_mutations_metadata": ["term_expansion_mutations","pos_mutations"],
      "output_field": "enriched_query"
    },
    {
      "step": "debugger",
      "type": "log_fields",
      "fields": [
        "user_terms",
        "facet_filters",
        "pos_mutations",
        "term_expansion_mutations",
        "enriched_query"
      ],
      "log_level": "info"
    }
  ]
}
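
To get a quick overview of what a workflow configuration contains, it can help to list its steps programmatically. The following sketch parses an abbreviated copy of the configuration above (step-specific options are omitted) and prints one label per step. The helper `step_labels` is hypothetical, and the sketch assumes that steps run in the order in which they appear in the `pipeline` list.

```python
import json

# Abbreviated copy of the configuration above; step options are omitted.
config_text = """
{
  "component": "Query-Processing",
  "pipeline": [
    {"step": "loader", "type": "squirro_item"},
    {"step": "custom", "type": "parse", "name": "syntax_parser"},
    {"step": "custom", "type": "analysis", "name": "lang_detection"},
    {"step": "custom", "type": "analysis", "name": "custom_spacy_normalizer"},
    {"step": "custom", "type": "enrich", "name": "pos_booster"},
    {"step": "custom", "type": "enrich", "name": "query_modifier"},
    {"step": "debugger", "type": "log_fields"}
  ]
}
"""


def step_labels(config):
    """Return one label per pipeline step, in list order."""
    labels = []
    for step in config["pipeline"]:
        # Steps with a "name" use it; the others fall back to their "type".
        labels.append(step.get("name", step["type"]))
    return labels


labels = step_labels(json.loads(config_text))
for position, label in enumerate(labels, start=1):
    print(f"{position}. {label}")
```

Running the same helper against a cloned workflow makes it easy to confirm which steps were added, removed, or reordered.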

...
