Twitter Provider

The Twitter provider follows people and search results on Twitter. By using the Gnip service, the Squirro Twitter provider delivers results in close to real time. The usual lag between a tweet's publication on Twitter and its indexing in Squirro is just a few seconds.

Provider name | twitter
Type | Push provider

Configuration

Field | Description
screen_name | The handle of the account to follow on Twitter. Example: squirro.
username | Same as screen_name, but may optionally include the leading @ character. Example: @squirro.
query | A search term to follow on Twitter. Refer to Gnip's documentation on PowerTrack rules for the query language.

Note: The Twitter provider always adds a "-is:retweet" component to the query. This excludes all retweets, which would otherwise add many duplicates to the stream.

For information about the query and data sampling, refer to the Knowledge Base story /wiki/spaces/KB/pages/2949432.
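
The relationship between the configuration fields can be sketched in a few lines of Python. This is only an illustration of the descriptions above, not the provider's actual code: the helper names normalize_handle and effective_query are hypothetical, and appending the retweet exclusion with a single space is an assumption.

```python
def normalize_handle(username):
    """Turn a username value into a screen_name by dropping a leading '@'."""
    return username[1:] if username.startswith("@") else username


def effective_query(query):
    """Return the query with the retweet exclusion appended.

    Assumption: the component is added as a plain suffix. The provider's
    real query handling is not documented here.
    """
    return f"{query} -is:retweet"


# Two equivalent ways of following the same account.
config_by_screen_name = {"screen_name": "squirro"}
config_by_username = {"username": "@squirro"}

# A query configuration and the query that would effectively be tracked.
config_by_query = {"query": "squirro"}
tracked = effective_query(config_by_query["query"])
```

Under these assumptions, normalize_handle applied to the username value yields the same handle as the screen_name value, and the tracked query always ends with the retweet exclusion.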

Configuration Example

This is an example configuration for subscribing to BBC News:

{
    "screen_name": "bbcnews"
}

Using the Python SDK, a subscription for this could be created with the following code snippet:

from squirro_client import SquirroClient

# project_id must be set to the ID of the target project.
client = SquirroClient(None, None, cluster='https://next.squirro.net/')
client.authenticate(refresh_token='293d…a13b')
client.new_subscription(project_id, object_id='default', provider='twitter',
                        config={'screen_name': 'bbcnews'})

Item Format

The following keywords are added to each item that the Twitter provider delivers to Squirro.

Keyword | Description | Example
from | The screen name of the tweet author. | ABBgroupnews
hashtags | A list of hashtags used in the tweet. | power, ABB
mentions | A list of screen names that are mentioned in the tweet. | ABB_CEO
place | The name of the location where the tweet was published. Only available for geo-coded tweets. | Baden
verb | The type of action taken by the user. Possible values are "post" for a normal tweet and "share" for a retweet. | post

The Example column shows the values that would be delivered for a sample tweet.

In addition, for every tweet that contains a link, the content of the first link is fetched from its original site. In this process, the story is retrieved and any ads, side columns, etc. are removed (see Boilerplate Removal).
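
As a rough illustration, the keywords from the table above could be represented as a Python dictionary. The list-valued layout is an assumption about how keywords are stored, and has_hashtag is a hypothetical helper, not part of any Squirro API.

```python
# Keywords built from the example values listed in the table above.
# Assumption: each keyword maps to a list of string values.
keywords = {
    "from": ["ABBgroupnews"],
    "hashtags": ["power", "ABB"],
    "mentions": ["ABB_CEO"],
    "place": ["Baden"],
    "verb": ["post"],
}


def has_hashtag(item_keywords, tag):
    """Case-insensitive check for a hashtag (hypothetical helper)."""
    wanted = tag.lstrip("#").lower()
    return any(h.lower() == wanted for h in item_keywords.get("hashtags", []))
```

A check such as has_hashtag(keywords, "#abb") could then be used when post-processing delivered items.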

Filtering

The Twitter provider does not deliver tweets that:

  • match only in a URL or in a Twitter username. For example, a search for squirro will not match "Go to http://squirro.com/..." or "Hey @Squirro did you see this fancy video?"
  • are replies.
  • are retweets.
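
The three rules above can be approximated with a small predicate. This is a minimal sketch under stated assumptions, not the provider's implementation: the function name would_deliver, the regular expressions, and the case-insensitive substring match are all hypothetical.

```python
import re

# Hypothetical patterns; the provider's real matching logic is not documented here.
URL_RE = re.compile(r"https?://\S+")
USERNAME_RE = re.compile(r"@\w+")


def would_deliver(text, term, is_reply=False, is_retweet=False):
    """Return True if a tweet would pass the three filtering rules (sketch)."""
    # Replies and retweets are never delivered.
    if is_reply or is_retweet:
        return False
    # Remove URLs and @usernames so that matches inside them do not count.
    stripped = USERNAME_RE.sub(" ", URL_RE.sub(" ", text))
    # Assumption: matching is a case-insensitive substring test.
    return term.lower() in stripped.lower()
```

With this sketch, both example tweets from the list are rejected for the search term squirro, because the term only appears inside a URL or a username.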

Licensing

The Twitter provider needs to be licensed separately. Cost is driven mostly by the number of incoming tweets, and additionally by the number of users to follow and the number of queries to track.