How To Use The Twitter API To Find Events

The Twitter Search API

Now that we've got the basic proof of concept for the "posting to twitter" part of our twitterbot, we can look at the "finding the parties" portion. There are a few different ways one could go about writing a script that finds SXSW parties: we could write something that culled info from event websites using a web scraper like PhantomJS, we could use the API of a social network like Reddit to find party leads, or we could query a search engine hoping to find the actual RSVP pages of some honest-to-goodness bashes.

Ultimately, though, the best solution is right in front of our faces: the Twitter search API. It gives us the neatest stack and the clearest separation of concerns, and it returns the results we want.

So let's look at the twit documentation and find some examples for searching tweets. This next code is lifted directly from the twit GitHub page and searches that hot topic on everybody's minds today: bananas.

twitter.get('search/tweets', { q: 'banana since:2011-11-11', count: 100 }, function(err, data, response) {
  console.log(data)
})

To make some sense of the structure by printing out all the top-level attributes, change console.log(data) to:

for (var attr in data) {
    console.log(attr);
}
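
For a quick look at the same structure without calling the API, the inspection can be tried on a hand-built object. The sketch below assumes a trimmed-down response with only statuses and search_metadata at the top level; the sampleData name and the sample tweets are made up for illustration, and a real response carries far more fields per tweet.

```javascript
// Hypothetical, trimmed-down stand-in for the `data` argument passed to the
// search callback; a real response carries many more fields per tweet.
var sampleData = {
  statuses: [
    { text: 'I love a good banana', user: { screen_name: 'fruitfan' } },
    { text: 'banana bread recipe', user: { screen_name: 'bakerbot' } }
  ],
  search_metadata: { count: 100, query: 'banana' }
};

// List the top-level attribute names without writing a for...in loop.
var topLevelKeys = Object.keys(sampleData);
console.log(topLevelKeys);

// Check that statuses is an array and peek at the fields of one tweet.
console.log(Array.isArray(sampleData.statuses));
console.log(Object.keys(sampleData.statuses[0]));
```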

The statuses attribute looks like what we want. Exploring that a little further, we can see that it consists of an array of tweet objects. Let's modify our code to print out the text of all those tweets.

var Twit = require('twit');
var twitInfo = require('./config.js');
var twitter = new Twit(twitInfo);

var tweets;

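twitter.get('search/tweets', { q: 'banana since:2011-11-11', count: 100 }, function(err, data, response) {
  if (err) {
    // Stop here if the request failed; data.statuses would not be usable.
    return console.error(err);
  }
  tweets = data.statuses;
  // statuses is an array, so iterate by numeric index rather than for...in.
  for (var index = 0; index < tweets.length; index++) {
    console.log(tweets[index].text);
  }
})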

That is a lot of banana talk.
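
Because the callback relies on the request having succeeded, it can help to pull the text extraction into a small function that tolerates a failed or empty response. This is only a sketch: extractTexts and the sample response are hypothetical names, not part of twit.

```javascript
// Hypothetical helper: returns the text of every tweet in a search response,
// or an empty array when the request failed or carried no statuses array.
function extractTexts(err, data) {
  if (err || !data || !Array.isArray(data.statuses)) {
    return [];
  }
  return data.statuses.map(function (tweet) {
    return tweet.text;
  });
}

// Made-up response standing in for what the search callback receives.
var okResponse = { statuses: [{ text: 'banana one' }, { text: 'banana two' }] };

console.log(extractTexts(null, okResponse));
console.log(extractTexts(new Error('request failed'), undefined));
```

Inside the real callback, the returned array could be logged line by line in place of the loop.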

OK, that's a nice first step toward getting a handle on the search API and how to return text, but let's apply it to the object of our attention.

The following code repeats the same search three times in a brute-force attempt to return some links to the types of parties we're after.

var Twit = require('twit');
var twitInfo = require('./config.js');

var twitter = new Twit(twitInfo);

var musicParties,
    interactiveParties,
    filmParties;

twitter.get('search/tweets', { q: 'SXSW music party ', count: 100 }, function(err, data, response) {
  musicParties = data.statuses;
  console.log("MUS " + musicParties[0].text);
})

twitter.get('search/tweets', { q: 'SXSW interactive party ', count: 100 }, function(err, data, response) {
  interactiveParties = data.statuses;
  console.log("INT " + interactiveParties[0].text);
})

twitter.get('search/tweets', { q: 'SXSW film party ', count: 100 }, function(err, data, response) {
  filmParties = data.statuses;
  console.log("FLM " + filmParties[0].text);
})

When we run the new index.js, it produces the following output:

FLM RT @NGeistofficial: Salem to host 2015 #SXSW Film Festival opening #party http://t.co/ytDINxCflW #SXSW2015 #SXSWFilm #Salem @SalemWGNA http…

INT RT @Skoop_Events: North of 41 - 5th Annual SxSW Interactive Party
March 14, 8-11:30pm
RSVP: http://t.co/DZWA9osAIG
/#sxsw15 #sxsw #sxsw2015 …

MUS Yuuuhhh "@austin360: Ghostface Killah headlines another SXSW party http://t.co/MiAO81rxVm" @sinhalesepolice in atx for sxsw?

Those look like things we want!
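
The three near-identical calls could also be driven from a list of categories. The sketch below is one possible shape for that, under stated assumptions: fakeTwitter is a made-up stand-in that imitates only the get(path, params, callback) call used above, so the example runs without credentials, and searchParties and categories are hypothetical names.

```javascript
// Stand-in for the twit client so the sketch runs without credentials.
// It mimics only the get(path, params, callback) call shape used above and
// answers with one made-up tweet that echoes the query.
var fakeTwitter = {
  get: function (path, params, callback) {
    callback(null, { statuses: [{ text: 'result for ' + params.q }] });
  }
};

// Labels and topics for the three searches performed above.
var categories = [
  { label: 'MUS', topic: 'music' },
  { label: 'INT', topic: 'interactive' },
  { label: 'FLM', topic: 'film' }
];

var found = [];

// Run one search per category instead of repeating the call three times.
function searchParties(client, onResult) {
  categories.forEach(function (category) {
    var query = 'SXSW ' + category.topic + ' party';
    client.get('search/tweets', { q: query, count: 100 }, function (err, data) {
      if (err || !data.statuses.length) {
        return; // nothing to report for this category
      }
      onResult(category.label + ' ' + data.statuses[0].text);
    });
  });
}

searchParties(fakeTwitter, function (line) {
  found.push(line);
  console.log(line);
});
```

With the real client, each callback fires whenever its response arrives, so the lines can print in any order, as they do in the output above.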

That was a sample of searches intended to aggregate a large database of tweets that we could then analyze and tweet to people asking for specific types of parties.

But what's the largest single database of tweets? Twitter itself.

Let's refactor our approach to rely on a one-for-one search as opposed to a data aggregation strategy, which can get messy and requires a sizable back end. Using this strategy, here's a more generalized Twitter search function:

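function search (query) {
  twitter.get('search/tweets', { q: query, count: 1 }, function(err, data, response) {
    // Bail out on a failed request or an empty result set.
    if (err || !data.statuses.length) {
      return console.error(err || 'No tweets found for: ' + query);
    }
    console.log(data.statuses[0].text);
  })
}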

Logging data.statuses[0].text is a stand-in right now for when we'll post the tweet later.

Testing this code out with search('SXSW music party') returns the same result as the music party tweet mentioned before.
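
One way to prepare for the later posting step is to hand the result to a callback instead of logging it inside the function. The sketch below is an assumption about how that could look, not the final bot code: the client parameter, the done callback, and the stubClient object are all hypothetical, added so the example runs without credentials.

```javascript
// Made-up client that imitates the get(path, params, callback) call shape.
// It returns one tweet for a known query and no tweets for anything else.
var stubClient = {
  get: function (path, params, callback) {
    var statuses = params.q === 'SXSW music party'
      ? [{ text: 'Ghostface Killah headlines another SXSW party' }]
      : [];
    callback(null, { statuses: statuses });
  }
};

// Variant of search() that reports its result through done(err, text).
// text is null when the search succeeded but matched nothing.
function search(client, query, done) {
  client.get('search/tweets', { q: query, count: 1 }, function (err, data) {
    if (err) {
      return done(err);
    }
    var first = data.statuses[0];
    done(null, first ? first.text : null);
  });
}

var results = [];

search(stubClient, 'SXSW music party', function (err, text) {
  results.push(text);
  console.log(text);
});

search(stubClient, 'SXSW knitting party', function (err, text) {
  results.push(text);
  console.log(text);
});
```

The caller then decides what to do with the text, whether that is logging it now or posting it later.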

That's all a good start. Now let's get to work on the responding-to-users part of the twitterbot, which will require the streaming API functionality of our twit client.

Making the Bot Interactive

Looking at the API, it seems simple enough to open a stream tracking mentions of a specific word. In our case, since we want to track people who are tweeting at the bot, it makes sense to track our handle, @SouthBotFunWest. Here's the code to open a stream that tracks our mentions and logs both the tweets and their posters.

var stream = twitter.stream('statuses/filter', { track: '@SouthBotFunWest' });

stream.on('tweet', function (tweet) {
  var asker = tweet.user.screen_name;
  var text = tweet.text;
  console.log(asker + " tweeted: " + text);
})

Log out of your bot's profile (if you're logged in of course), log in to another, tweet at your bot and check back at your terminal. You should see your name and tweet logged on the command line!
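
To exercise the handler without logging in to a second account, the tweet event can also be simulated locally. The sketch below uses Node's built-in EventEmitter as a stand-in for the stream object; the describeTweet helper and the sample tweet are hypothetical, and only the user.screen_name and text fields used above are assumed.

```javascript
var EventEmitter = require('events').EventEmitter;

// Stand-in for the object returned by twitter.stream(); it only needs to
// support on() and emit() for this local test.
var fakeStream = new EventEmitter();

// Hypothetical helper that builds the log line from a tweet object.
function describeTweet(tweet) {
  return tweet.user.screen_name + ' tweeted: ' + tweet.text;
}

var logged = [];

fakeStream.on('tweet', function (tweet) {
  var line = describeTweet(tweet);
  logged.push(line);
  console.log(line);
});

// Emit a made-up tweet the way the real stream would deliver one.
fakeStream.emit('tweet', {
  user: { screen_name: 'partyseeker' },
  text: '@SouthBotFunWest hi'
});
```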

Now we've got the ability to track tweets that mention our bot. The next logical step is putting in place a system to trigger different search queries depending on the content of those tweets.

The easiest way of doing this is with regular expressions. Regular expressions are a compact pattern language for matching text, such as particular words or phrases. Let's say we want the bot to respond whenever someone says "hi." That's a perfect use case for regular expressions. First, though, let's break the text of the tweet down word by word, so that the start-of-input (^) and end-of-input ($) anchors mark the beginning and end of a single word rather than of the whole tweet. We can do this using the natural module and its word tokenizer, which takes a body of text and turns it into an array of words. Let's install and save it.

npm install --save natural

And of course require() it in our index.js file, then create an instance of WordTokenizer.

var natural = require('natural'),
  tokenizer = new natural.WordTokenizer();

Then try it out. Commenting out the stream code to prevent the script from hanging, we can log the tokenized version of a typical tweet:

console.log(tokenizer.tokenize("hey @SouthBotFunWest, where's a cool party?"))

[ 'hey', 'SouthBotFunWest', 'where', 's', 'a', 'cool', 'party' ]

It comes out in a (nearly) perfect array! It's important to note how the tokenizer captures punctuation (namely, that it doesn't), but this is generally the response that we want. It's time to add the tokenizer to our stream code.
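
To see why the punctuation disappears, it can help to compare with a plain split on non-word characters. This is only a rough approximation written for illustration, not the natural module's implementation, and roughTokenize is a hypothetical name.

```javascript
// Rough stand-in for a word tokenizer: split on any run of characters that
// are not letters, digits, or underscores, then drop empty strings.
function roughTokenize(text) {
  return text.split(/[^A-Za-z0-9_]+/).filter(Boolean);
}

var words = roughTokenize("hey @SouthBotFunWest, where's a cool party?");
console.log(words);
```

In this sketch the apostrophe in "where's" counts as a separator, which is why 's' shows up as its own word and the @ sign never reaches the array.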

Underneath the console.log(asker + " tweeted: " + text); in your stream.on() callback, add:

  var wordArray = tokenizer.tokenize(text);
  var greetingRE = /^hi$/;

  for (var i = 0; i < wordArray.length; i++) {
    if (greetingRE.test(wordArray[i])) {
      console.log(wordArray[i]);
      console.log("Sup " + "@" + asker + ". So, I've heard about some cool South-by parties. You know, whatever [music, interactive, film, free, food, drink]");
    }
  }

This code tokenizes the incoming tweet into an array of words, checks each word to see if it matches "hi," and — if it does — prints the word and our response message.
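
The effect of the anchors is easiest to see by testing a few words directly. The short sketch below uses only built-in regular expression behavior; the variable names are made up for illustration.

```javascript
var greetingRE = /^hi$/;

// Anchored pattern: the whole word must be exactly "hi".
var matchesHi = greetingRE.test('hi');        // true
var matchesThis = greetingRE.test('this');    // false

// Without anchors, "hi" is also found inside other words.
var unanchoredThis = /hi/.test('this');       // true

// The pattern is case-sensitive unless the i flag is added.
var matchesCapital = greetingRE.test('Hi');   // false
var flaggedCapital = /^hi$/i.test('Hi');      // true

console.log(matchesHi, matchesThis, unanchoredThis, matchesCapital, flaggedCapital);
```

As written, the bot ignores a capitalized "Hi"; adding the i flag to the pattern is one way to accept it.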

Try tweeting at your bot from another account with "hi" somewhere in the post. You should see "hi" and our response logged to the terminal!
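
The same word-by-word check can be extended to pick a search query from the content of the tweet. The sketch below is one possible approach, not the finished bot: pickQuery and the keywords array are hypothetical, with the keywords borrowed from the response message above and the query format from the earlier searches. It uses an exact, lowercased word comparison rather than one regular expression per keyword.

```javascript
// Keywords the bot advertises in its greeting.
var keywords = ['music', 'interactive', 'film', 'free', 'food', 'drink'];

// Hypothetical helper: returns a search query for the first keyword found
// in the tokenized tweet, or null when no keyword is present.
function pickQuery(wordArray) {
  for (var i = 0; i < wordArray.length; i++) {
    var word = wordArray[i].toLowerCase();
    if (keywords.indexOf(word) !== -1) {
      return 'SXSW ' + word + ' party';
    }
  }
  return null;
}

console.log(pickQuery(['SouthBotFunWest', 'any', 'Film', 'parties', 'tonight']));
console.log(pickQuery(['SouthBotFunWest', 'hi']));
```

The returned query could then be passed to the search function from earlier in the article.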
