July 2015

Crawling Twitter content using R

Over time, mining tweets has become simple: the tooling is easy to use and requires only a handful of dependency packages.

Tweet data can be used in many ways, ranging from extracting consumer sentiment about an entity (a brand, say) to finding your next job. Although there is a set limit of 3000 tweets, workaround strategies are available, such as dynamically switching client auth credentials once the current client has exceeded the limit.
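The rotation workaround mentioned above can be sketched as follows. This is a hypothetical illustration, not part of twitteR: the credential values are placeholders, and `next_cred()` is a helper invented here to cycle through credential sets.

```r
# Keep several sets of app credentials; when the current one is
# rate-limited, re-authenticate with the next set. Values are placeholders.
creds <- list(
  list(key = "key1", secret = "sec1", token = "tok1", token_secret = "tsec1"),
  list(key = "key2", secret = "sec2", token = "tok2", token_secret = "tsec2")
)

# Cycle through indices 1..n (illustrative helper, not a twitteR function)
next_cred <- function(i, n) (i %% n) + 1

i <- 1                            # start with the first credential set
i <- next_cred(i, length(creds))  # switch after hitting the rate limit

# Re-authenticate with the new set:
# setup_twitter_oauth(creds[[i]]$key, creds[[i]]$secret,
#                     creds[[i]]$token, creds[[i]]$token_secret)
```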

I chose R because post-extraction analysis is easy, it is relatively simple to set up a database so extracted tweets can be stored in a table, and machine learning packages are readily available for any analysis. Python is also an excellent choice.
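As a minimal sketch of the database idea, assuming the DBI and RSQLite packages are installed: `tweets_df` below stands in for the data frame you would get from converting extracted tweets (e.g. with twitteR's `twListToDF()`), written to a local SQLite table.

```r
library(DBI)

# Placeholder data frame standing in for converted tweet data
tweets_df <- data.frame(
  text    = c("example tweet one", "example tweet two"),
  created = as.character(Sys.time()),
  stringsAsFactors = FALSE
)

# Open an in-memory SQLite database and store the tweets in a table
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "tweets", tweets_df, append = TRUE)
stored <- dbReadTable(con, "tweets")  # read back what was stored
dbDisconnect(con)
```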

To extract tweets, log in to https://dev.twitter.com/ and then click on "Create New App" (under managing your apps).

Follow the instructions and fill in the information to get yourself an API key, API secret, and access token. Execute the following code and you are all set to extract tweets!

install.packages(c("devtools", "rjson", "bit64", "httr", "twitteR"))

library(devtools)
library(twitteR)

# Replace the placeholders with the values from your app's page on dev.twitter.com
APIkey <- "xxxxxxxxxxxxxxxxxxxxxxx"
APIsecret <- "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
accesstoken <- "23423432-xxxxxxxxxxxxxxxxxxxxxxxxxx"
accesstokensecret <- "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

setup_twitter_oauth(APIkey, APIsecret, accesstoken, accesstokensecret)

[Screenshot from 2015-08-09 19:06:19]

Read the twitteR manual to learn about the various methods, such as searchTwitter(), and play around a bit. Here is what I did:

[Screenshot from 2015-08-09 19:11:21]
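For reference, a hedged sketch of what such a search might look like once `setup_twitter_oauth()` has succeeded; the query string and tweet count are placeholders, not the values from my session.

```r
query    <- "#rstats"  # placeholder search term
n_tweets <- 100        # placeholder number of tweets to request

# These calls need a live, authenticated Twitter session, so they are
# shown commented out:
# tweets <- searchTwitter(query, n = n_tweets, lang = "en")
# df <- twListToDF(tweets)  # convert the list of status objects to a data frame
# head(df$text)
```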
