  • Significant Terms provides visual information about terms that are especially significant in one dataset compared to another.
  • Significant Terms reveals the "uncommonly common": it shows which terms have a significantly different value distribution in a foreground dataset than in a background dataset. This requires a certain number of documents to work; the more terms a facet contains, the more documents are needed to get a meaningful answer. In Squirro, the background dataset is what an unmodified dashboard defines. The foreground dataset is constructed from the background dataset and includes the current selection. If there is no selection, so that the foreground and background datasets are equal, the term frequency is shown instead (except when the facet is body, title, or summary, where this operation is too costly).
  • Significant Terms works very well on facets with few values. When it is computed on a body, title, or summary field, many more documents are needed before a significant term shows up. One workaround for this restriction is to use phrase or term detection and index the detected phrases or terms in a separate facet field. This has been shown to improve the results vastly without requiring a large number of documents.
  • When configuring Significant Terms over a special content field (body, title, summary), you are also asked to provide the language and the maximum number of results returned.
  • The maximum number of results setting affects performance, so be cautious when increasing the limit.
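
The foreground/background comparison described above can be illustrated with a small sketch. This is not Squirro's implementation: the function name significant_terms, its parameters, and the scoring formula (the gain in document rate multiplied by the rate ratio, similar in spirit to the JLH heuristic) are assumptions chosen for illustration only. It takes each dataset as a list of documents, where every document is a list of facet values.

```python
from collections import Counter


def significant_terms(foreground, background, min_fg_count=2, top_n=5):
    """Rank terms that are over-represented in the foreground dataset.

    Hypothetical helper for illustration; the score is an assumed heuristic,
    not the formula used by any particular product.
    """
    fg_total, bg_total = len(foreground), len(background)
    if fg_total == 0 or bg_total == 0:
        return []

    # Count in how many documents each term occurs (document frequency).
    fg_counts = Counter(term for doc in foreground for term in set(doc))
    bg_counts = Counter(term for doc in background for term in set(doc))

    scored = []
    for term, fg_n in fg_counts.items():
        bg_n = bg_counts.get(term, 0)
        # Skip terms that are too rare to judge or absent from the background.
        if fg_n < min_fg_count or bg_n == 0:
            continue
        fg_rate = fg_n / fg_total
        bg_rate = bg_n / bg_total
        # Only terms that are more frequent in the foreground are significant.
        if fg_rate <= bg_rate:
            continue
        score = (fg_rate - bg_rate) * (fg_rate / bg_rate)
        scored.append((term, score))

    # Highest score first; ties are broken alphabetically.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]


# Background: every document on the (unmodified) dashboard.
background = [
    ["bank", "loan"], ["bank", "merger"], ["bank", "loan"], ["bank", "fraud"],
    ["bank", "fraud"], ["bank", "loan"], ["bank", "merger"], ["bank", "loan"],
]
# Foreground: the subset of the background matching the current selection.
foreground = [["bank", "fraud"], ["bank", "fraud"]]

result = significant_terms(foreground, background)
print(result)
```

In this toy data "bank" occurs in every document of both datasets, so it is common but not significant, while "fraud" is far more frequent in the selection than in the background. When the two datasets are identical, no term scores above the background and the sketch returns an empty list, which corresponds to the case where plain term frequency is shown instead.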
