The unshorten-link pipeline step resolves a shortened link and replaces it with its long version, so that items with short URLs are indexed under the long URL. This helps Duplicate Detection, which by default relies on a combination of title and link.

Enrichment name: unshorten-link
Stage: deduplication
Enabled by default: Yes, except for the bulk provider (affects items that are uploaded through the ItemUploader, DocumentUploader, File Importer, etc.)

During the unshorten-link step, the link field of each item is expanded by following any HTTP redirects. This ensures that tiny URLs, e.g. from Twitter posts, are expanded to their long version.
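To illustrate what following HTTP redirects involves, here is a minimal Python sketch. It is not the pipeline's actual implementation: the function names `unshorten` and `http_location`, the hop limit, and the timeout are all assumptions made for this example. The lookup function is passed in as a parameter so that the redirect-following logic can be exercised without network access.

```python
import http.client
from typing import Callable, Optional
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def http_location(url: str, timeout: float = 5.0) -> Optional[str]:
    """Send one HEAD request and return the Location header if the
    server answers with a redirect, else None. (Hypothetical helper.)"""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        if response.status in REDIRECT_CODES:
            return response.getheader("Location")
        return None
    finally:
        conn.close()


def unshorten(
    url: str,
    fetch_location: Callable[[str], Optional[str]] = http_location,
    max_hops: int = 10,  # assumed limit, guards against long redirect chains
) -> str:
    """Follow redirects hop by hop and return the last URL reached."""
    seen = {url}
    for _ in range(max_hops):
        location = fetch_location(url)
        if location is None:
            return url  # no further redirect: this is the long version
        next_url = urljoin(url, location)  # Location may be relative
        if next_url in seen:
            return url  # redirect loop: stop at the current URL
        seen.add(next_url)
        url = next_url
    return url


if __name__ == "__main__":
    # Offline demonstration with a made-up redirect table instead of HTTP.
    table = {
        "https://t.co/abc": "https://bit.ly/xyz",
        "https://bit.ly/xyz": "https://example.com/articles/42",
    }
    print(unshorten("https://t.co/abc", table.get))
```

Every hop costs one HTTP round trip, so an item whose link passes through several shorteners needs several requests before its final link is known.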

Because this step must send HTTP requests to the linked websites, it adds latency to pipeline processing. If your data source does not contain shortened URLs, you can disable this step in the processing config.

This enrichment has no configuration options other than the enabled property, which turns it on or off.
