...

Key phrases are stored in facet:nlp_tag__phrases.
The item’s Title is also added.

...

When tag_topics:True is configured, the pool of ranked key phrases is used to extract cleaned, deduplicated phrases, referred to as “topics” (stored in facet:nlp_tag__topics).

Concept

- Filter steps:
  - Remove terms with POS tags ["ADJ", "DET", "PUNCT"]
  - Remove terms consisting (almost) entirely of digits, like `33120x`
  - Deduplicate:
    - Skip phrases that are also detected under the NER tags ["PRODUCT", "EVENT", "PERSON"] (configurable)
    - Skip phrases that contain terms from already stored "topics"
- Select 20 phrases evenly across all ranks (as determined via TextRank)
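
The concept above can be sketched in Python roughly as follows. This is a minimal illustration, not the actual implementation: the function names, the input shape (each ranked phrase as a list of `(text, pos)` pairs, best rank first), the 0.8 digit-ratio threshold, and the order in which the steps run are all assumptions made for the example.

```python
# Hypothetical sketch of the topic extraction concept; all names are illustrative.
EXCLUDED_POS = {"ADJ", "DET", "PUNCT"}


def is_mostly_numeric(token, threshold=0.8):
    """True if at least `threshold` of the characters are digits (assumed cutoff)."""
    if not token:
        return False
    digits = sum(ch.isdigit() for ch in token)
    return digits / len(token) >= threshold


def clean_phrase(phrase):
    """Drop terms with an excluded POS tag and terms that are (almost) only digits."""
    kept = [
        text
        for text, pos in phrase
        if pos not in EXCLUDED_POS and not is_mostly_numeric(text)
    ]
    return " ".join(kept)


def select_evenly(items, k):
    """Pick k items spread evenly over the ranked list (all items if fewer than k)."""
    if len(items) <= k:
        return list(items)
    step = len(items) / k
    return [items[int(i * step)] for i in range(k)]


def extract_topics(ranked_phrases, ner_phrases=frozenset(), max_topics=20):
    """ranked_phrases: lists of (text, pos) pairs, ordered by TextRank rank.
    ner_phrases: lowercased phrases already detected under the configured NER tags.
    """
    topics = []
    seen_terms = set()
    for phrase in ranked_phrases:
        cleaned = clean_phrase(phrase)
        if not cleaned:
            continue
        lowered = cleaned.lower()
        if lowered in ner_phrases:
            continue  # already covered by a NER tag
        terms = set(lowered.split())
        if terms & seen_terms:
            continue  # shares a term with an already stored topic
        topics.append(cleaned)
        seen_terms |= terms
    return select_evenly(topics, max_topics)
```

For example, given the ranked phrases "the neural network", "network training", "33120x", "Berlin conference" (already a NER entity), and "data pipeline", this sketch would keep "network" and "data pipeline".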

...

Enrichment

  • Overall Sentiment Tagging Label
    facet:sentiment_pretrained
    One sentiment label (neutral, positive, negative) per document.

    • Sentiment analysis is applied per sentence

    • Sentences with neutral sentiment are skipped

  • Overall Sentiment Score
    facet:nlp_tag__sentiment_score
    Float value within [-1,+1]

  • Sentiment Assessment
    facet:positive_terms, facet:negative_terms
    A sentiment phrase consists of the valence term and its context.
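
How the per-sentence results are combined into the document-level label and score is not described here. The following Python sketch shows one plausible approach, purely as an assumption: it averages the scores of the non-neutral sentences and derives the label from the sign of that mean. The function name and the `neutral_band` threshold are hypothetical.

```python
# Hypothetical aggregation; the mean and the neutral_band cutoff are assumptions.
def aggregate_sentiment(sentence_scores, neutral_band=0.05):
    """Combine per-sentence scores in [-1, +1] into (label, score) for a document."""
    # Sentences with neutral sentiment are skipped.
    non_neutral = [s for s in sentence_scores if abs(s) > neutral_band]
    if not non_neutral:
        return "neutral", 0.0

    score = sum(non_neutral) / len(non_neutral)
    score = max(-1.0, min(1.0, score))  # keep the result within [-1, +1]

    if score > neutral_band:
        label = "positive"
    elif score < -neutral_band:
        label = "negative"
    else:
        label = "neutral"
    return label, score
```

Under these assumptions, a document whose sentences score 0.8, 0.0, -0.2, and 0.6 would be labeled positive, with the neutral sentence ignored in the average.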

...