The Thumbnail Extraction pipeline step finds a thumbnail image to represent the Squirro item.
Enrichment name | Thumbnail Extraction, internally referred to as "webshot" |
---|---|
Stage | processing |
Enabled by default | No |
Overview
There are two ways that Squirro can find the right thumbnail for the item:
- If the webshot_picture_hint field points to a valid image URL, that image is used as the thumbnail.
- Alternatively the web site is downloaded and analyzed to find the most prominent image.
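The choice between the two approaches can be sketched in Python. This is only an illustration of the decision described above, not Squirro's actual implementation: the helper names, the list of image extensions, and the use of the item's link field as the page to analyze are all assumptions, and the real service presumably validates the image by fetching it rather than by inspecting the URL.

```python
from urllib.parse import urlparse

# Assumption: file extensions treated as "image" for this sketch only.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def looks_like_image_url(url):
    """Cheap syntactic check; a real check would download and inspect the image."""
    if not isinstance(url, str):
        return False
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return False
    return parsed.path.lower().endswith(IMAGE_EXTENSIONS)


def choose_thumbnail_source(item):
    """Return (strategy, url) for a hypothetical item dictionary."""
    hint = item.get("webshot_picture_hint")
    if looks_like_image_url(hint):
        # First way: the hint is used directly as the thumbnail.
        return ("hint", hint)
    # Second way: the page itself is downloaded and analyzed.
    # Assumption: the page URL is stored in the item's "link" field.
    return ("page_analysis", item.get("link"))
```

For example, an item carrying `"webshot_picture_hint": "https://example.com/a.png"` would take the hint branch, while an item without the field would fall back to page analysis.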
Configuration
Thumbnail extraction relies on an Amazon Web Services S3 configuration to store images for thumbnails and to retrieve thumbnails for display. Configure the following files:
/etc/squirro/common.ini

    [services_external]
    thumbler = //thumbler-testing.squirro.net

    [thumbler_salt]
    thumb = <salt_1>

/etc/squirro/webshot.ini

    [aws]
    access_key = <key_1>
    secret_key = <key_2>
    s3_bucket = webshot.testing.squirro.net
    s3_base_url = http://webshot.testing.squirro.net.s3-website-eu-west-1.amazonaws.com/

    [webshot]
    use_thumbler = True
    thumbler_config = thumb
    thumbler_bucket = webshot
    thumbler_salt = <salt_1>

Then restart the sqwebshotd service.

/etc/squirro/thumbler.ini

    [bucket_webshot]
    is_s3 = True
    access_key = <key_1>
    secret_key = <key_2>
    s3_bucket = webshot.testing.squirro.net

    [config_thumb]
    operation = scale
    salt = <salt_1>

Then restart the sqthumblerd service.

URL and webserver configuration to forward

Example based on nginx, in /etc/nginx/conf.d/thumber.conf:

    upstream thumbler-testing {
        server ip-squirro-cluster-node:443;
    }

    server {
        listen 443 ssl;
        server_name thumbler-testing.squirro.net;
        ssl_certificate <ssl_certificate_1>;
        ssl_certificate_key <ssl_key_1>;

        location / {
            proxy_pass https://thumbler-testing/service/thumbler/;
            proxy_set_header Host $host;
            proxy_set_header Connection Close;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_redirect off;
            proxy_read_timeout 60;
        }

        # redirect server error pages to the static page /50x.html
        #
        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root /usr/share/nginx/html;
        }
    }

Then reload the nginx service or other web server you may be using.
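The examples above reuse the same placeholders (<salt_1>, <key_1>, <key_2>) across the three ini files, which suggests the values are expected to match. A small check along the following lines may help catch copy-and-paste mistakes before restarting the services. It is a sketch under assumptions: the function name is hypothetical, and the rule that thumbler_config = thumb refers to the [config_thumb] section and the thumb key in [thumbler_salt], and that thumbler_bucket = webshot refers to [bucket_webshot], is inferred from the naming in the examples rather than confirmed by the source.

```python
import configparser


def parse_ini(text):
    """Parse ini text; interpolation is disabled so '%' in secrets is kept literally."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


def check_thumbnail_config(common_text, webshot_text, thumbler_text):
    """Return a list of human-readable problems (an empty list means consistent)."""
    common = parse_ini(common_text)
    webshot = parse_ini(webshot_text)
    thumbler = parse_ini(thumbler_text)
    problems = []

    config_name = webshot.get("webshot", "thumbler_config", fallback=None)
    bucket_name = webshot.get("webshot", "thumbler_bucket", fallback=None)
    if config_name is None or bucket_name is None:
        return ["webshot.ini: [webshot] needs thumbler_config and thumbler_bucket"]

    # Assumed naming convention: config_<name> and bucket_<name> sections.
    config_section = f"config_{config_name}"
    bucket_section = f"bucket_{bucket_name}"

    salts = {
        "common.ini": common.get("thumbler_salt", config_name, fallback=None),
        "webshot.ini": webshot.get("webshot", "thumbler_salt", fallback=None),
        "thumbler.ini": thumbler.get(config_section, "salt", fallback=None),
    }
    if None in salts.values() or len(set(salts.values())) != 1:
        problems.append("salt is missing or differs between the three files")

    for key in ("access_key", "secret_key", "s3_bucket"):
        in_webshot = webshot.get("aws", key, fallback=None)
        in_thumbler = thumbler.get(bucket_section, key, fallback=None)
        if in_webshot is None or in_webshot != in_thumbler:
            problems.append(f"{key} is missing or differs between webshot.ini and thumbler.ini")

    return problems
```

The function takes the file contents as strings, so it can be fed with `open(path).read()` for each of the three files on a Squirro node, or with literal text when testing.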