Twitter Provider
The Twitter provider follows people and search results on Twitter. By using the Gnip service, the Squirro Twitter provider delivers results in close to real time: the usual lag between publication on Twitter and indexing in Squirro is just a few seconds.
Provider name | twitter |
---|---|
Type | Push provider |
Configuration
Field | Description |
---|---|
screen_name | The handle of the account to follow on Twitter. Example: squirro . |
username | Same as screen_name but may optionally include the leading @ character. Example: @squirro . |
query | A search term to follow on Twitter. Refer to Gnip's documentation on PowerTrack rules for the query language. Note: The Twitter provider always adds a "-is:retweet" component to the query. This excludes all retweets, which would otherwise add many duplicates to the stream. For information about the query and data sampling, refer to the Knowledge Base story /wiki/spaces/KB/pages/2949432. |
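The fields above can be illustrated with a minimal Python sketch. The helper names `normalize_handle` and `effective_query` are hypothetical and not part of Squirro. How the provider composes the final PowerTrack rule internally is not documented here, so the simple concatenation below is an assumption for illustration only.

```python
# Three alternative configurations, one per documented field.
follow_by_screen_name = {"screen_name": "squirro"}
follow_by_username = {"username": "@squirro"}
track_query = {"query": "squirro"}


def normalize_handle(username):
    """Drop an optional leading '@' so that username and screen_name agree."""
    return username[1:] if username.startswith("@") else username


def effective_query(query):
    """Show the query with the retweet exclusion appended.

    Assumption: the component is appended with a space; the provider's
    actual rule construction may differ.
    """
    return f"{query} -is:retweet"
```

Under these assumptions, `normalize_handle("@squirro")` yields the same handle as the `screen_name` example, and `effective_query("squirro")` shows the rule with the retweet exclusion in place.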
Configuration Example
This is an example configuration for subscribing to BBC News:
{ "screen_name": "bbcnews" }
Using the Python SDK, this subscription could be created with the following code snippet:

    from squirro_client import SquirroClient

    # project_id must hold the ID of an existing Squirro project.
    client = SquirroClient(None, None, cluster='https://next.squirro.net/')
    client.authenticate(refresh_token='293d…a13b')
    client.new_subscription(
        project_id, object_id='default', provider='twitter',
        config={'screen_name': 'bbcnews'})
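A query-based subscription could be created in the same way. The sketch below wraps the `new_subscription` call from the snippet above in a hypothetical helper, `subscribe_to_query`. It assumes `client` is an already authenticated client object and that the keyword arguments match those used above.

```python
def subscribe_to_query(client, project_id, query, object_id="default"):
    """Create a Twitter subscription that tracks a search query.

    `client` is assumed to expose new_subscription() with the same
    arguments as in the screen_name example.
    """
    return client.new_subscription(
        project_id,
        object_id=object_id,
        provider="twitter",
        config={"query": query},
    )
```

For example, `subscribe_to_query(client, project_id, "squirro")` would track the term `squirro` instead of following a single account.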
Item Format
The following keywords are added to each item that the Twitter provider delivers to Squirro.
Keyword | Description | Example |
---|---|---|
from | The screen name of the tweet author. | ABBgroupnews |
hashtags | A list of hashtags used in the tweet. | power, ABB |
mentions | A list of screen names that are mentioned in the tweet. | ABB_CEO |
place | The name of the location where the tweet was published. Only available for geo-coded tweets. | Baden |
verb | The type of action taken by the user. Possible values are "post" for a normal tweet and "share" for a retweet. | post |
The Example column shows the values that would be delivered for a sample tweet from the ABBgroupnews account.
In addition, for every tweet that contains a link, the content of the first link is fetched from its original site. In this process the story is retrieved and any ads, side columns, etc. are removed (see Boilerplate Removal).
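The keywords from the table above can be inspected with a few lines of Python. This is a sketch only: it assumes that keywords arrive as a dictionary mapping each keyword name to a list of string values, and the helper names `has_hashtag` and `is_geocoded` are hypothetical.

```python
# Keyword values taken from the Example column of the table above.
# Assumption: each keyword maps to a list of strings.
example_keywords = {
    "from": ["ABBgroupnews"],
    "hashtags": ["power", "ABB"],
    "mentions": ["ABB_CEO"],
    "place": ["Baden"],
    "verb": ["post"],
}


def has_hashtag(keywords, tag):
    """Case-insensitive check for a hashtag, with or without a leading '#'."""
    wanted = tag.lstrip("#").lower()
    return any(value.lower() == wanted for value in keywords.get("hashtags", []))


def is_geocoded(keywords):
    """The place keyword is only present for geo-coded tweets."""
    return bool(keywords.get("place"))
```

Because `place` is only available for geo-coded tweets, code that reads it should tolerate its absence, as `is_geocoded` does.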
Filtering
The Twitter provider does not deliver tweets that:
- have matches only in a URL or in a Twitter username. For example, a search for squirro will not match "Go to http://squirro.com/..." or "Hey @Squirro did you see this fancy video?"
- are replies.
- are retweets.
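The filtering rules above can be approximated in Python as follows. This is not the provider's actual implementation: the function names, the regular expressions, and the simple substring match are all assumptions chosen to illustrate the rules. In particular, only URLs starting with `http://` or `https://` are recognized.

```python
import re

# Assumed patterns for URLs and @handles; the provider's real matching
# logic is not documented here.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")


def matches_outside_urls_and_handles(text, term):
    """Return True if term occurs in text once URLs and @handles are removed."""
    stripped = MENTION_RE.sub(" ", URL_RE.sub(" ", text))
    return term.lower() in stripped.lower()


def would_deliver(text, term, is_reply=False, is_retweet=False):
    """Apply the three filtering rules to a single tweet."""
    if is_reply or is_retweet:
        return False
    return matches_outside_urls_and_handles(text, term)
```

With this sketch, both example tweets from the list above are rejected for the term `squirro`, while a tweet that mentions the term in its plain text is accepted.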
Licensing
The Twitter provider needs to be licensed separately. Cost is driven mostly by the number of tweets coming in, but also by the number of users to follow and the number of queries to track.