Here’s another story about data rescue and the preservation of government information, this time from PC Magazine UK. Though the last Data Refuge event was held in Denton, TX, in May, and the 2016 End of Term crawl has finished its collection work (its 200TB of data will soon be publicly accessible), there remains much interest, and not a little worry, about the collection and preservation of government information and data. And with stories continuing to come out about US government agencies scrubbing or significantly altering their websites (e.g., this one from the Guardian, “Another US agency deletes references to climate change on government website”), this issue will not be going away any time soon.
“Somewhere around 20 percent of government info is web-accessible,” said Jim [sic] Jacobs, the Federal Government Information Librarian at Stanford University Library. “That’s a fairly large chunk of stuff that’s not available. Though agencies have their own wikis and content management systems, the only time you find out about some of it is if someone FOIAs it.”
To be sure, a great deal of information was indeed captured and now resides on non-government servers. Between Data Refuge events and projects such as the 2016 End of Term crawl, over 200TB of government websites and data were archived. But rescue organizers began to realize that piecemeal efforts to make complete copies of terabytes of government agency science data could not realistically be sustained over the long term; it would be like bailing out the Titanic with a thimble.
So although Data Rescue Denton ended up being one of the final organized events of its kind, the collective effort has spurred a wider community to work in concert toward making more government data discoverable, understandable, and usable, Jacobs wrote in a blog post.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.