Sunday, January 6, 2013

Library of Congress Twitter Collection: 170 Billion Tweets Strong - PC Magazine


When future generations look back to get a general sense of the zeitgeist from the past six years or so, they'll have plenty of digital information to look at – it'll just be limited to 140 characters or fewer.

At least, that's the premise put forth by the "Twitter collection" at the Library of Congress. If you've been curious about how much data the Library has amassed as part of its plan to archive all public tweets, first announced in April 2010, the Library has a pretty beefy number that it's now able to announce: 170 billion tweets.

Just to put that number into a bit of perspective, the Library received around 140 million tweets for archiving – each day – in Feb. 2011. The pipe expanded to approximately half a billion daily tweets by Oct. 2012, and it stands to reason that the Library will likely be processing even more tweets on a daily basis going forward.

"The Library's first objectives were to acquire and preserve the 2006-10 archive; to establish a secure, sustainable process for receiving and preserving a daily, ongoing stream of tweets through the present day; and to create a structure for organizing the entire archive by date," wrote director of communications Gayle Osterberg in a Friday blog post.

With those goals achieved, the Library now plans to tackle the equally large elephant in the room: how to process and display this volume of Twitter posts so they can be accessed by researchers "in a comprehensive, useful way." Interest in the Library's Twitter archives – ranging from research about citizen journalism and elected officials' tweets to stock market predictions – has generated approximately 400 inquiries from researchers thus far, and that's even before the Library has been able to grant any kind of access to its 170-billion-tweet archive.

"Twitter is a new kind of collection for the Library of Congress but an important one to its mission," Osterberg wrote. "As society turns to social media as a primary method of communication and creative expression, social media is supplementing, and in some cases supplanting, letters, journals, serial publications and other sources routinely collected by research libraries."

For more tech tidbits from David Murphy, follow him on Facebook or Twitter (@thedavidmurphy).
