Here’s an interesting little GIF that Lazaro Gamio (@LazaroGamio) posted to Twitter recently. The visualization shows the historical Congressional district boundaries of Maryland’s 3rd district from 1789 to 2017. This district is one of the most gerrymandered in the country, and the last few years are particularly startling. As one commenter put it, the later district shape “looks like a Rorschach test!”
Playing around with historical congressional district boundaries: Maryland's 3rd district, from 1789-2017 -> pic.twitter.com/nGOU3vcn1W
— Lazaro Gamio (@LazaroGamio) February 23, 2017
Here’s an oddity. On September 6, 2016, the Department of Labor’s blog ran a post titled “What is the ‘Real’ Unemployment Rate?” that described the “huge array of measures, which together provide a comprehensive picture of the state of job opportunities” in the US. As you’ll see if you click on that link, the post is now “404 page not found.” You won’t find the post in the blog’s archive for September 2016 either. However, the post was archived by the Internet Archive on October 17, 2016, the last time that IA crawled the blog. So sometime between October 2016 and today (February 16, 2017), that post was scrubbed from the Department of Labor’s blog.
Stranger still, the archived site showed 26 posts in September 2016, but the live site’s archive for September 2016 shows only 10 posts. Unfortunately, IA didn’t crawl the monthly archive URLs, so there’s no way to know what the missing posts were about. There are also discrepancies for other months (e.g., the archived site shows 30 posts in August 2016, while the live site shows 17 posts!).
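For readers who want to run this kind of check themselves, here is a minimal sketch in Python that asks the Wayback Machine's availability API for the capture closest to a given date. It is not what we used for this post; the function names are made up for illustration, and the response layout (`archived_snapshots` → `closest`) is an assumption based on the API's documented JSON.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Wayback Machine availability endpoint.
WAYBACK_API = "https://archive.org/wayback/available"


def availability_query(page_url, timestamp=None):
    """Build the query URL; timestamp is an optional YYYYMMDD string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_API + "?" + urlencode(params)


def closest_snapshot(api_response):
    """Return (timestamp, snapshot_url) from the parsed JSON, or None."""
    closest = api_response.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]


def lookup(page_url, timestamp=None):
    """Query the API over the network and return the closest capture, if any."""
    with urlopen(availability_query(page_url, timestamp)) as resp:
        return closest_snapshot(json.load(resp))
```

Calling `lookup()` with the address of a vanished page and a date such as `"20161017"` should return the nearest capture, or `None` if the page was never crawled.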
There’s nothing that I can discern in this one recovered post that could be considered controversial. It’s not like the CRS report, suppressed in 2012, that found no correlation between top tax rates and economic growth, thereby destroying a key tenet of conservative economic theory. It was written by Dr. Heidi Shierholz, the department’s chief economist.
So what gives? Why is the Department of Labor selectively disappearing blog posts? We’ll let you know if we find out.
This is an amazing offer from Brewster Kahle and the Internet Archive. Kahle just wrote a letter to the House Judiciary Committee’s Subcommittee on Courts, Intellectual Property and the Internet stating unequivocally that the Internet Archive will “archive and host, for free, forever, and without restriction on access to the public, all records contained in PACER.” The “Public Access to Court Electronic Records” or PACER system is the supposedly publicly accessible system of federal court records that charges exorbitant fees for downloads, thus for all intents and purposes blocking meaningful access to federal court records. But with this letter, the whole system could become actually accessible, for free and in perpetuity!
By this submission, the Internet Archive would like to clearly state to the Judiciary Committee, as well as to the Administrative Office of the U.S. Courts and the Judicial Conference of the United States, that we would be delighted to archive and host, for free, forever, and without restriction on access to the public, all records contained in PACER…
In order to recognize the vision of universal free access to public court records, the Federal Judiciary would essentially have to do nothing. We are experts at “crawling” online databases in an efficient and careful fashion that does not burden those systems. We are already able to comprehensively crawl PACER from a technical perspective, but the resulting fees would be astronomical. The Federal Judiciary has a Memorandum of Understanding with both the Executive Office for U.S. Trustees and with the Government Printing Office that gives each entity no-fee access for the public benefit. The collection we would provide to the public would be far more comprehensive than the GPO’s current court opinion program, although I must laud that program for providing a digitally-authenticated collection of many opinions.
By making federal judicial dockets available in this manner, the Federal Judiciary would enable free and unlimited public access to all records that exist in PACER, finally living up to the name of the program. In today’s world, public access means access on the Internet. Public access also means that people can work with big data without having to pass a cash register for each document.
The OpenGov Foundation just released their “Statement on Internet Archive Offer to Deliver Free and Perpetual Public Access to PACER,” in which they said:
“The vital public information in PACER is the property of the American people. Public information, from laws to court records, should never be locked away behind paywalls, never be stashed behind arbitrary barriers and never be covered in artificial restrictions. Forcing Americans to pay hard-earned money to access public court records is no better than forcing them to pay a poll tax.
“The Internet Archive’s offer to archive and deliver unrestricted public access to PACER for free and forever is the best possible Valentine’s Day gift to the American people. The Internet Archive is proposing a cost-effective and innovative public-private partnership that will finally fix a clear injustice. There is no reason to do anything but accept this offer in a heartbeat.”
Last week, we posted that the USDA’s Animal and Plant Health Inspection Service had announced that it was removing from its Website “inspection reports, regulatory correspondence, research facility annual reports, and enforcement records that have not received final adjudication.”
Russ Kick of the MemoryHole blog has now published thousands of these reports, which he had downloaded last summer and deposited in the Internet Archive. These include Animal and Plant Health Inspection Service (APHIS) reports concerning animal welfare at zoos, circuses, aquariums, puppy mills, etc., as well as their archive of annual reports filed by facilities that experiment on animals.
There are still deleted APHIS files out there, including inspection reports and enforcement records. If you find any, please send them to Russ. More of the story is over at Motherboard and on this page, plus 2,600 individual annual report PDFs from a wide time period here and here.
On January 25, we blogged that the Department of Homeland Security (DHS) Daily Open Source Infrastructure Report had been summarily discontinued. This report provided a daily curated selection of links to and summaries of open source articles about various areas of U.S. critical infrastructure.
Effective January 18, 2017, the Office of Infrastructure Protection (IP) is discontinuing the DHS Daily Open Source Infrastructure Report. The discontinuation of this report is part of broader efforts to more efficiently focus resources towards the highest priority needs of the critical infrastructure security and resilience community. IP is committed to working closely with our public and private sector partners in identifying innovative approaches to exchanging information in a timely and actionable manner to further support risk mitigation activities.
One reader, Dr. Megan Squire, a CS professor at Elon University, took it upon herself to harvest the reports (2,151 PDF files!) and deposit them in the Internet Archive. These are now part of the Internet Archive’s growing Government Documents collection. Thanks, Megan, for this work! I hope our readers will take up the “rogue internet archivist” mantle and collect and preserve digital government information in all its guises and at all levels!
On January 18, 2017, the US Department of Homeland Security discontinued its Daily Open Source Infrastructure Report service, which it had run since October 2006. To enable researchers to study the content of these reports, I collected as many as I could find (2,151 PDF files) and released them to the Internet Archive. You can find them here: DHS Daily Open Source Infrastructure Reports 2006-2017
The PDF files came from the following URLs:
And when these yielded 404 errors (which they did for most pre-2013 files) I used the Internet Archive itself, with the following URL base:
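The try-the-live-site-first, fall-back-to-the-archive approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Megan's actual script: `LIVE_BASE` is a hypothetical placeholder (the real URL bases are not reproduced here), the function names are invented, and the sketch assumes that prefixing a URL with `https://web.archive.org/web/` leads to an archived capture of it.

```python
from urllib.error import HTTPError
from urllib.request import urlopen

# Hypothetical placeholder; the post's real URL bases are not reproduced here.
LIVE_BASE = "https://example.gov/reports/"
# Wayback Machine prefix; assumed to resolve to a capture of the URL that follows.
WAYBACK_PREFIX = "https://web.archive.org/web/"


def http_get(url):
    """Download a URL and return its raw bytes."""
    with urlopen(url) as resp:
        return resp.read()


def fetch_report(filename, get=http_get):
    """Return (content, source_url), trying the live site before the archive."""
    live_url = LIVE_BASE + filename
    try:
        return get(live_url), live_url
    except HTTPError as err:
        if err.code != 404:
            raise  # only a missing file triggers the archive fallback
    archived_url = WAYBACK_PREFIX + live_url
    return get(archived_url), archived_url
```

Returning the source URL alongside the content makes it easy to record afterwards which files came from the live site and which had to be recovered from the archive.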