Boulder, Colorado-based Gnip Inc and DataSift Inc, based in the U.K. and San Francisco, are licensed by Twitter to analyze archived tweets and basic information about users, like geographic location. DataSift announced this week that it will release Twitter data in packages that will encompass the last two years of activity for its customers to mine, while Gnip can go back only 30 days.
"Harvesting what someone said a year or more ago is game-changing," said Paul Stephens, director of policy and advocacy for the Privacy Rights Clearinghouse in San Diego. As details emerge on the kind of information being mined, he and other privacy rights experts are concerned about the implications of user information being released to businesses waiting to go through it with a fine-tooth comb.
"As we see Twitter grow and social media evolve, this will become a bigger and bigger issue," said Graham Cluley, senior technology consultant for British-based Internet security company Sophos Ltd. "Online companies know which websites we click on, which adverts catch our eye, and what we buy ... increasingly, they're also learning what we're thinking. And that's quite a spooky thought."
Twitter opted not to comment on the sale and referred questions to DataSift. In 2010, Twitter agreed to share all of its tweets with the U.S. Library of Congress. Details of how that information will be shared publicly are still in development, but there are some stated restrictions, including a six-month delay and a prohibition against using the information for commercial purposes.
That's where DataSift comes in. More than 700 companies are on a waiting list to try out its offering, DataSift CEO Rob Bailey said in an interview with Reuters. Those who buy the data will be able to see tweets on specific topics and even isolate those views based on geography. Bailey, who is based in San Francisco, said the effect is something like holding a huge number of sporadic focus groups on brands or products.
For instance, Coca-Cola Co could look at what people in Massachusetts are saying about its Coke Zero, or Starbucks Corp could find out what people in Florida are saying about caramel lattes. Companies can also look at how they have responded to consumer complaints.
Gnip, which offers the short-term data package, said the information collected -- which involves real-time viewing -- can also be used during natural disasters to help rescuers, to monitor illnesses such as a flu outbreak and to analyze stock market sentiment.
No private conversations or deleted tweets can be accessed, Bailey said. Companies want aggregated data, not to try to figure out who said what to whom. "The only information that we make available is what's public," Bailey added. "We do not sell data for targeted advertising. I don't even know how that would work."
A digital analytics expert said the biggest impact will be for marketers. "The only privacy risk is marketers being able to do more with the data, faster," said Thomas Bosilevac, director of analytics for the digital marketing company Digitaria.
That doesn't mean everyone has to be happy about this. "It's frustrating, and telling, that now marketers have greater access to my old tweets than I do," said Rebecca Jeschke, digital rights analyst and spokeswoman for the non-profit Electronic Frontier Foundation. However, this is perfectly legal, if creepy. If you publish your tweets publicly, that allows all sorts of folks to do all sorts of things with them.
For people concerned that something they said will come back to haunt them, it's not too late to go back and delete old tweets. DataSift is required to regularly update its files to remove comments that have since been deleted. Unlike with someone else's tweets, users can always see their own simply by clicking on the word "tweets."
(Edited by Linda Stern, Beth Gladstone and Gerald E. McCormick)