April 14th, 2010 by rbanks
Really stunning news that the entire Twitter archive since 2006 is going to be handed over to, or I guess duplicated in, the Library of Congress. That gives us a sense of how this body of data can be seen as a mass record of the thoughts of a vast population.
I find it a little odd that quite a lot of the commentary seems to be about the scientific importance of this move, as if you couldn’t already analyze Twitter data directly on the site itself. For example, this comment:
I’m no Ph.D., but it boggles my mind to think what we might be able to learn about ourselves and the world around us from this wealth of data.
This is from Matt Raymond, who blogs directly for the Library, so I’m sure I’m missing something. Matt goes on to say the following, though, which I think is really the point. This move is about historical preservation, reminiscence, and a shared heritage:
Just a few examples of important tweets in the past few years include the first-ever tweet from Twitter co-founder Jack Dorsey (http://twitter.com/jack/status/20), President Obama’s tweet about winning the 2008 election (http://twitter.com/barackobama/status/992176676), and a set of two tweets from a photojournalist who was arrested in Egypt and then freed because of a series of events set into motion by his use of Twitter (http://twitter.com/jamesbuck/status/786571964) and (http://twitter.com/jamesbuck/status/787167620).
It’s not Twitter’s job to look after our data for the long term. I’m glad they’re handing some of it over to a body that really can handle that responsibility.
For related work (on a much smaller scale) see the Backup Box.