Twitter Tasks

To access Twitter using MassMine you must first authenticate. This is a requirement of Twitter. Authentication is simple and only has to be completed once.

To authenticate with Twitter, run

massmine --task=twitter-auth

Check out the general usage examples to learn how to use MassMine. Below is a description of the tasks available for Twitter.

Task parameters marked * are required. For parameters marked with a +, choose only one.


➾ twitter-auth

Sets up MassMine to make data requests under your Twitter account privileges. This task must be run before using any other Twitter task, or an error will be returned.

Parameters

Example

massmine --task=twitter-auth

➾ twitter-followers

Returns information on each follower of a specified user.

Parameters

Example

massmine --task=twitter-followers --user=quinoa

➾ twitter-friends

Returns information on each friend of a specified user.

Parameters

Example

massmine --task=twitter-friends --user=quinoa

➾ twitter-locations

Returns a list of valid geo-locations as Yahoo Where On Earth Identifiers (WOEIDs) accepted by Twitter. These WOEIDs can be used with some Twitter tasks that accept a geo parameter.

Parameters

-none-

Example

massmine --task=twitter-locations

➾ twitter-rehydrate

Returns “rehydrated” tweets based on supplied tweet IDs. This is Twitter’s preferred method for reviving old tweets. It is also one of the few ways (possibly the only way) that Twitter allows researchers to share data with one another.

The process involves gathering tweets (using other MassMine Twitter tasks) and then sharing only the tweet ID field of each tweet. This is allowed under specific conditions, and up to a specified number of tweets per unit of time; see Twitter’s API terms and conditions for up-to-date details. The recipient can then “rehydrate” the shared tweet IDs using this task to retrieve each full tweet object. In this roundabout way, you can pass a curated data set from one researcher to another.

Note that there is no guarantee of consistency over time. This reflects Twitter’s attempt to allow users to control their data over time. For example, if a tweet is edited or deleted in the intervening time, you will receive the edited version as it exists at the time of rehydration, or nothing if the tweet has been deleted.

Parameters

Example

# Rehydrate a single tweet
massmine --task=twitter-rehydrate --query=595302290619260928

# Rehydrate multiple tweets with a comma-separated list
massmine --task=twitter-rehydrate --query=595302290619260928,595302291349118976
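
Shared tweet IDs often arrive as a file with one ID per line rather than as a comma-separated string. The sketch below shows one way to join them into the form used in the example above. The helper join_ids and the file name ids.txt are hypothetical and not part of MassMine. The massmine call is left commented out, since any limit on how many IDs can be passed at once is not covered on this page.

```shell
#!/bin/sh
# join_ids is a hypothetical helper, not a MassMine command.
# It reads a file with one tweet ID per line and prints the IDs
# as a single comma-separated string.
join_ids() {
  # Skip blank lines, then join the remaining lines with commas.
  grep -v '^[[:space:]]*$' "$1" | paste -s -d , -
}

# Assumed usage, with ids.txt as a placeholder file name:
# massmine --task=twitter-rehydrate --query="$(join_ids ids.txt)"
```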

➾ twitter-sample

Returns a random sample of tweets as they occur in real time. Up to 1% of Twitter’s actual volume is returned. Collection continues until the requested number of tweets has been returned OR a specified date/time is reached. Both “count” and “dur” can be specified, in which case the task finishes as soon as either target is reached.

Parameters

Example

# Request a specified number of tweets
massmine --task=twitter-sample --count=50

# Or, keep collecting until a time is reached
massmine --task=twitter-sample --dur='2015-10-11 14:30:00'

# This will finish whenever 50 tweets or the deadline is reached,
# whichever occurs first
massmine --task=twitter-sample --dur='2015-10-11 14:30:00' --count=50
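
The examples above use a fixed timestamp. For a collection that should run for a set length of time starting now, the deadline can be computed first. This is only a sketch under assumptions: deadline_in_hours is a hypothetical helper, the timestamp format is copied from the examples above, and it assumes MassMine interprets the deadline in the machine's local time, which this page does not state.

```shell
#!/bin/sh
# deadline_in_hours is a hypothetical helper, not part of MassMine.
# It prints a timestamp N hours from now as 'YYYY-MM-DD HH:MM:SS'.
deadline_in_hours() {
  # Try GNU date syntax first; fall back to the BSD/macOS form.
  date -d "+$1 hours" '+%Y-%m-%d %H:%M:%S' 2>/dev/null ||
    date -v "+${1}H" '+%Y-%m-%d %H:%M:%S'
}

# Assumed usage: collect for two hours or 50 tweets, whichever comes first.
# massmine --task=twitter-sample --dur="$(deadline_in_hours 2)" --count=50
```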

➾ twitter-search

Search for pre-existing tweets matching a given search phrase. Not all tweets are indexed and made available by Twitter’s search, and searchable tweets are indexed for the last 7 days only. For better search coverage, consider using the twitter-stream task to capture tweets as they occur in real time.

Parameters

Example

# Looking for love...
massmine --task=twitter-search --query=love --count=300

# ... in only certain places
massmine --task=twitter-search --query=love --count=300 --geo=37.781157,-122.398720,1mi

# ... in French
massmine --task=twitter-search --query=amour --count=300 --lang=fr

➾ twitter-search-30day

Search for pre-existing tweets matching a given search phrase using Twitter’s Premium service. Note that this endpoint requires a paid account with Twitter. Not all tweets are indexed and made available by Twitter’s search, and searchable tweets are indexed for the last 30 days only.

Important: The user is responsible for managing their requests-per-month rate limits. MassMine will adhere to Twitter’s per-second (10 requests per second) and per-minute (60 requests per minute) rate limits. However, each paid plan also has a requests-per-month limit that is determined by each user’s plan. Users can monitor their monthly rate limit status at Twitter’s developer dashboard.

Rate limits are determined by the number of requests to Twitter’s server. Each request can include up to 500 tweets, so requesting 100 tweets costs the same as requesting 500 tweets. MassMine therefore maximizes your return by providing results in 500-tweet chunks. Users should request tweets in multiples of 500 when using the count parameter (a count that is not a multiple of 500 is rounded up to the next multiple). For instance, requesting --count=1800 or --count=2000 will both return up to 2000 tweets (2000 exactly, unless Twitter returns fewer than 2000 matches).
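
The rounding rule is plain arithmetic, so it can be checked before spending part of a monthly quota. The following shell sketch is an illustration only: the function names are hypothetical, and it assumes the 500-tweets-per-request figure stated above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of MassMine.
TWEETS_PER_REQUEST=500

# Number of requests needed for a given --count (ceiling division).
requests_needed() {
  echo $(( ($1 + TWEETS_PER_REQUEST - 1) / TWEETS_PER_REQUEST ))
}

# The count after rounding up to the next multiple of 500.
effective_count() {
  echo $(( $(requests_needed "$1") * TWEETS_PER_REQUEST ))
}

requests_needed 1800   # prints 4
effective_count 1800   # prints 2000
effective_count 2000   # prints 2000
```

Comparing the request count against the remaining monthly allowance shown in Twitter’s developer dashboard gives a quick check on whether a planned collection fits the plan.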

Parameters

Example

massmine -t twitter-search-30day -q love -c 500 --date='2020-09-26-00-00:2020-10-22-00-00' -o lovetweets.ndjson

➾ twitter-stream

Returns tweets as they occur in real time, matching either a search phrase, a user name, or a location. Up to 1% of Twitter’s actual volume is returned. Collection continues until the requested number of tweets has been returned OR a specified date/time is reached. Both “count” and “dur” can be specified, in which case the task finishes as soon as either target is reached.

Parameters

Example

# Search by keyword, with a max count OR deadline
massmine --task=twitter-stream --query=love --count=300 --dur='2015-10-11 14:30:00'

# Track a user in real time (may only make sense for HIGHLY active accounts).
# Here we track multiple users
massmine --task=twitter-stream --user=nasa,wired --dur='2015-10-11 14:30:00'

# Or, simply grab tweets coming out of New York City
# (the bounding box reads west longitude, south latitude, east longitude, north latitude)
massmine --task=twitter-stream --geo=-74,40,-73,41 --count=300

➾ twitter-trends

Returns the top 50 trends for a given location.

Parameters

Example

# Current trends in Seattle, Washington
massmine --task=twitter-trends --geo=2490383

➾ twitter-trends-nohash

Returns the top 50 trends for a given location, with #hashtags excluded.

Parameters

Example

# Current trends in Seattle, Washington
massmine --task=twitter-trends-nohash --geo=2490383

➾ twitter-user

Returns one or more users’ timelines (i.e., their tweet history), in reverse chronological order.

Parameters

Example

# Let's get the last 10 tweets from NASA
massmine --task=twitter-user --user=nasa --count=10

# We can fetch 10 from both NASA and Wired in one shot:
massmine --task=twitter-user --user=nasa,wired --count=10