The Thumbnail Extraction pipeline step finds a thumbnail image to represent the Squirro item.

...


Overview

Squirro finds the thumbnail for an item in one of two ways:

  • If the webshot_picture_hint field points to a valid image URL, that image is used as the thumbnail.
  • Otherwise, the website is downloaded and analyzed to find the most prominent image.
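
The selection order above can be sketched in a few lines of Python. This is an illustrative sketch only, not Squirro's implementation: the `link` item field, the `looks_like_image_url` check, and the `find_prominent_image` callback are hypothetical stand-ins, and the real pipeline step presumably validates the hint by fetching it rather than by inspecting the URL.

```python
from typing import Callable, Optional
from urllib.parse import urlparse

# Assumption: a hint counts as "valid" if it is an http(s) URL whose path
# ends in a common image extension. The real check is not documented here.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def looks_like_image_url(url: str) -> bool:
    """Cheap, offline stand-in for 'points to a valid image URL'."""
    parsed = urlparse(url)
    return (
        parsed.scheme in ("http", "https")
        and bool(parsed.netloc)
        and parsed.path.lower().endswith(IMAGE_EXTENSIONS)
    )


def pick_thumbnail(
    item: dict,
    find_prominent_image: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return a thumbnail URL for the item, or None if none is found.

    find_prominent_image is a hypothetical callback that downloads and
    analyzes the page behind the given link.
    """
    hint = item.get("webshot_picture_hint")
    if hint and looks_like_image_url(hint):
        # First choice: the explicit hint on the item.
        return hint

    link = item.get("link")  # hypothetical field name for the item's page URL
    if not link:
        return None
    # Fallback: analyze the page to find its most prominent image.
    return find_prominent_image(link)
```

Passing the page analysis in as a callback keeps the ordering logic separate from any network access, so it can be exercised without downloading anything.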

Configuration

Thumbnail extraction uses an Amazon Web Services (AWS) S3 bucket to store thumbnail images and to retrieve them for display. Configure the following files:

...

File /etc/squirro/common.ini:
[services_external]
thumbler = //thumbler-testing.squirro.net

[thumbler_salt]
thumb = <salt_1>

...

File /etc/squirro/webshot.ini:
[aws]
access_key = <key_1>
secret_key = <key_2>
s3_bucket = webshot.testing.squirro.net
s3_base_url = http://webshot.testing.squirro.net.s3-website-eu-west-1.amazonaws.com/

[webshot]
use_thumbler = True
thumbler_config = thumb
thumbler_bucket = webshot
thumbler_salt = <salt_1>

Then restart the sqwebshotd service.

...

File /etc/squirro/thumbler.ini:
[bucket_webshot]
is_s3 = True
access_key = <key_1>
secret_key = <key_2>
s3_bucket = webshot.testing.squirro.net

[config_thumb]
operation = scale
salt = <salt_1>

Then restart the sqthumblerd service.
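
The three files share several values: the same `<salt_1>` placeholder appears in each, `thumbler_config = thumb` appears to pair with the `[config_thumb]` section, and `thumbler_bucket = webshot` with `[bucket_webshot]`. Assuming these values are meant to match (an inference from the examples, not a documented rule), a small Python check like the following could catch mismatches before the services are restarted. The function name and the messages are illustrative only.

```python
import configparser


def check_thumbler_config(common_ini: str, webshot_ini: str, thumbler_ini: str) -> list:
    """Return a list of mismatches between the three INI texts (empty if none)."""

    def parse(text: str) -> configparser.ConfigParser:
        # Disable interpolation so '%' characters in keys or salts are literal.
        parser = configparser.ConfigParser(interpolation=None)
        parser.read_string(text)
        return parser

    common, webshot, thumbler = parse(common_ini), parse(webshot_ini), parse(thumbler_ini)

    config_name = webshot.get("webshot", "thumbler_config", fallback=None)
    bucket_name = webshot.get("webshot", "thumbler_bucket", fallback=None)
    salt = webshot.get("webshot", "thumbler_salt", fallback=None)
    if config_name is None or bucket_name is None or salt is None:
        return ["webshot.ini: [webshot] lacks thumbler_config, thumbler_bucket or thumbler_salt"]

    problems = []
    # Assumption: the salt must be identical in all three files.
    if common.get("thumbler_salt", config_name, fallback=None) != salt:
        problems.append(f"common.ini: [thumbler_salt] {config_name} differs from webshot.ini")
    if thumbler.get(f"config_{config_name}", "salt", fallback=None) != salt:
        problems.append(f"thumbler.ini: [config_{config_name}] salt differs from webshot.ini")

    # Assumption: both services must point at the same S3 bucket.
    bucket_section = f"bucket_{bucket_name}"
    if not thumbler.has_section(bucket_section):
        problems.append(f"thumbler.ini: missing section [{bucket_section}]")
    elif thumbler.get(bucket_section, "s3_bucket", fallback=None) != webshot.get(
        "aws", "s3_bucket", fallback=None
    ):
        problems.append(f"thumbler.ini: [{bucket_section}] s3_bucket differs from webshot.ini [aws]")
    return problems
```

To run it against a real installation, read each file's text (for example with `pathlib.Path(...).read_text()`) and pass the three strings in; an empty list means no mismatch was detected under the assumptions above.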

...

Example based on nginx: /etc/nginx/conf.d/thumber.conf
upstream thumbler-testing {
    server ip-squirro-cluster-node:443;
}

server {
    listen 443 ssl;
    server_name  thumbler-testing.squirro.net;

    ssl_certificate <ssl_certificate_1>;
    ssl_certificate_key <ssl_key_1>;

    location / {
        proxy_pass https://thumbler-testing/service/thumbler/;
        proxy_set_header Host $host;
        proxy_set_header Connection Close;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_redirect    off;
        proxy_read_timeout 60;
    }

    # redirect server error pages to the static page /50x.html
    #
    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /usr/share/nginx/html;
    }
}

...

This page can now be found at Thumbnail Extraction on the Squirro Docs site.