This algorithm can tell if you're drunk tweeting

If you were tweeting and drinking between July 2013 and 2014, your tweets might have been used as part of an experiment by computer science students at the University of Rochester. (Perhaps you don't remember, which might be for the best.)

Nabil Hossain and colleagues trained a computer to identify alcohol-related tweets and used the data to monitor alcohol-related activity in a particular area. The research could help with understanding and responding to public health issues, according to the authors of the study.

The researchers collected more than 11,000 geotagged tweets from New York City and from Monroe County, in the western part of the state, where Rochester is located. They kept only the tweets that mentioned alcohol-related words such as beer, drunk, hangover, wasted or party (as well as variations such as "druuuuuunk"). Then they enlisted workers on Amazon's Mechanical Turk, a crowdsourcing marketplace, to determine which category each tweet fell into:

  • Tweets that mention alcohol
  • Tweets about the tweeter drinking alcohol
  • Tweets about the tweeter drinking alcohol at the time of tweeting (i.e., drunk tweets)
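The keyword filtering step described above, including stretched spellings such as "druuuuuunk", could be sketched roughly as follows. This is only an illustration: the function names are hypothetical, the word list contains just the five examples named in this article, and the study's actual filter is not described here.

```python
import re

# Only the example keywords named in the article; the study's full list is not given here.
KEYWORDS = ["beer", "drunk", "hangover", "wasted", "party"]


def elongated_pattern(word):
    """Build a regex fragment in which every letter may repeat (d+r+u+n+k+)."""
    return "".join(re.escape(ch) + "+" for ch in word)


# One case-insensitive pattern matching any keyword as a whole word.
ALCOHOL_RE = re.compile(
    r"\b(?:" + "|".join(elongated_pattern(w) for w in KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def is_alcohol_related(tweet):
    """Return True if the tweet contains a keyword or a stretched variant of one."""
    return ALCOHOL_RE.search(tweet) is not None


candidates = [
    "I am so druuuuuunk",
    "great weather today",
    "PARTYYY tonight",
]
filtered = [t for t in candidates if is_alcohol_related(t)]
```

A filter like this only selects candidate tweets. Deciding whether the tweeter was actually drinking is the harder step, which is why human labelers were brought in.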

The team also used keywords like sofa, TV, sleep and bath to determine where people were tweeting from: at home or at a bar, for example.

From this data, they trained a support vector machine (SVM), a machine-learning algorithm that learns from labeled examples to sort new items into categories, to accurately spot drunk tweets.
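To give a feel for what training such a classifier involves, here is a minimal sketch of a linear SVM. It is trained by subgradient descent on the hinge loss, using the words in each tweet as features. Everything in it is an assumption made for illustration: the four training tweets are invented, the settings are arbitrary, and the researchers' actual features and tools are not described in this article.

```python
from collections import defaultdict

# Toy, invented training examples (NOT from the study).
# Label +1 = "tweeter is drinking now", -1 = anything else.
TOY_EXAMPLES = [
    ("drinking beer at the bar right now", 1),
    ("so drunk tonight cheers", 1),
    ("that beer commercial was funny", -1),
    ("reading about wine history", -1),
]


def featurize(text):
    """Bag-of-words features: the set of lowercase tokens in the tweet."""
    return set(text.lower().split())


def train_linear_svm(examples, epochs=50, lr=0.1, lam=0.01):
    """Fit word weights and a bias by subgradient descent on the hinge loss."""
    w = defaultdict(float)
    b = 0.0
    for _ in range(epochs):
        for text, y in examples:
            x = featurize(text)
            margin = y * (sum(w[t] for t in x) + b)
            # L2 regularization: shrink every weight slightly at each step.
            for t in list(w):
                w[t] *= 1 - lr * lam
            # Hinge loss: update only when the example is inside the margin.
            if margin < 1:
                for t in x:
                    w[t] += lr * y
                b += lr * y
    return w, b


def predict(text, w, b):
    """Return +1 (drinking now) or -1 (not) for a tweet; unseen words count as 0."""
    score = sum(w.get(t, 0.0) for t in featurize(text)) + b
    return 1 if score > 0 else -1


weights, bias = train_linear_svm(TOY_EXAMPLES)
label = predict("drunk at the bar", weights, bias)
```

A real system would need thousands of labeled tweets, which is what the Mechanical Turk labeling provided.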

The team also used the data to create heat maps that show drinking and tweeting hot-spots in New York City and Monroe County.
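A heat map of this kind boils down to counting tweets per map cell. The sketch below shows one simple way to do that, assuming each tweet comes with a latitude and longitude. The cell size, the function name and the sample coordinates are illustrative choices, not details from the study.

```python
import math
from collections import Counter


def grid_counts(points, cell_deg=0.01):
    """Count points per grid cell; each cell is cell_deg degrees on a side."""
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts


# Made-up coordinates: two in Lower Manhattan, one in downtown Rochester.
sample_points = [
    (40.7128, -74.0060),
    (40.7130, -74.0061),
    (43.1566, -77.6088),
]
counts = grid_counts(sample_points)
hottest_cell, hottest_count = counts.most_common(1)[0]
```

Cells with the highest counts would be drawn in the warmest colors on the map.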

Among the revelations: New York City residents appear more likely to tweet while drinking at home or in their immediate neighborhood, while in Monroe County many drunk-tweeters still needed to find their way home.

"We see that NYC has a larger proportion of user-drinking-now tweets posted from home (within 100 meters from home) whereas in Monroe County a higher proportion of these tweets generated at driving distance (more than 1000 meters from home)," the authors write.
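The 100-meter and 1,000-meter thresholds in that quote can be applied with a standard great-circle (haversine) distance calculation, sketched below. It assumes the tweeter's home coordinates are already known, and the label for the in-between range is a placeholder. How the researchers inferred home locations and handled that middle range is not covered in this article.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def distance_category(home, tweet):
    """Bucket a tweet by its distance from home, using the quoted thresholds."""
    d = haversine_m(home[0], home[1], tweet[0], tweet[1])
    if d <= 100:
        return "home"
    if d > 1000:
        return "driving distance"
    return "in between"  # placeholder label, not a category from the study


home = (43.1566, -77.6088)  # made-up home location in Rochester
category = distance_category(home, (43.1766, -77.6088))  # roughly 2.2 km north
```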

Heat maps of user-drinking-now tweets. In NYC, the drinking hot-spots are Lower Manhattan and its surroundings, while in Monroe County they are downtown Rochester (center) and the city of Brockport (left).
Hossain et al.

"Social media is a new ubiquitous source of real-time community and individual public-health related behaviors," Hossain and colleagues said in the paper. "Our results demonstrate that tweets can provide powerful and fine-grained cues of activities going on in cities."

Past research has used Twitter data to track air pollution, the spread of HIV and other public health issues. The researchers say that their model could be used to create a tool for improving community health or to help those who may have a problem with alcohol.

"For instance, the peer social network 'Alcoholics Anonymous' is designed to develop social network connections to encourage abstinence among the members and establish helpful ties," the paper stated.

The authors said that future research will analyze how social interactions and peer pressure in social media influence the tendency to mention drinking. They also want to compare the rate of individuals traveling in and out of adjacent neighborhoods to drink.

"All these analyses will help us understand the merits of these methods for analyzing drinking behavior, via social media, at a large-scale with very little cost, which can lead to new ways of reducing alcohol consumption, a global public health concern," they write.