Carl Malamud posted the following to BoingBoing today:
"This bill would provide that the full text of the California Code of Regulations shall bear an open access creative commons attribution license, allowing any individual, at no cost, to use, distribute, and create derivative works based on the material for either commercial or noncommercial purposes."
Public.Resource.Org has bulk data for the CCR and the public safety codes (known as Title 24) online, but this would all be way easier if we didn't have to double-key the building codes every 3 years and jump on the West CD-ROM every 2 months to extract the data. This move would lead to tremendous innovation, just like we've seen when the Federal Register went open source in bulk.
The bill sponsor, Assemblyman Nestande, has a long background in public policy and IP. He was campaign manager for Sonny Bono's successful 1994 congressional campaign.
This article reports on the importance of a bill that will enable Congress to provide bulk access to its legislative data. It also profiles one of the heroes of open access to Congressional data, Josh Tauberer. As the Post says, Josh has prodded Congress and the result may be the "raw material for an Angie's List or a Yelp for Congress, a way for modern users to evaluate lawmakers with the same kind of crowdsourced help that they use to evaluate lunch."
This is a lot like how Carl Malamud got the SEC to put the EDGAR database online. (SEC'S EDGAR On Net, What Happened And Why, TAP-INFO, 30 Nov 1993).
Congressional data may soon be easier to use online, by David A. Fahrenthold, Washington Post (June 8, 2012).
Online, searching for a bill in Congress feels a little like time travel: Go looking for legislation, and you wind up in the Internet of 1995.
At Congress's '90s-vintage archive site, there's no way to compare bills side by side. No tool to measure the success rate of a bill's sponsor. And there's certainly no way to leave a comment. Congress makes it hard for outside sites to do any of this, either, by refusing to give out bulk data on its bills in a user-friendly form.
On Friday, that might start to change.
What happens when federal agencies rely upon standards developed by standard-setting bodies and communities of practice and incorporate those standards into federal rules? In many cases agencies refer to the standards but do not include the full text of the standards in the Federal Register or the Code of Federal Regulations. As a result, those interested in commenting on a particular regulation may not have access to the relevant standard, particularly if it is copyrighted or only accessible for a fee.
The Electronic Frontier Foundation (EFF), the Association of Research Libraries, and OpenTheGovernment.org have sent comments to the Administrative Conference of the US recommending that "all material incorporated by reference -- regardless of the stage in the regulatory process, the subject matter of the regulation, or the identity of the regulated entity -- should be made freely available, with no purported copyright restrictions and downloadable on a government agency's website."
Public.Resource.Org submitted comments to the Office of Management and Budget on making standards that are incorporated by reference into federal regulations widely available to the public without charge. Public.Resource.Org also said that such standards should "be deemed in the public domain rather than subject to copyright restrictions."
- OpenTheGov and ARL Join EFF in Urging Government to Make all Parts of the Law Easily Available to Everyone (10/24/2011).
"copyrighted materials, once incorporated into law, should be available for free." The principles of transparency and accessibility to the law should animate agency decisions in this arena and materials incorporated by reference should be made freely available, online and off, at all times...
- Revised Draft Recommendations of the Administrative Conference of the US on "Incorporation by Reference in Federal Regulations" ACUS.gov (October 2011)
- Comments on "Incorporation by Reference in Federal Regulations" (October 21, 2011) To Committee on Administration and Management Administrative Conference of the United States Committee of Administration and Management from Corynne McSherry & Mark Rumold Electronic Frontier Foundation, Prue Adler, Association of Research Libraries, and Patrice McDermott, OpenTheGovernment.org
We urge ACUS to reject any suggestion that access to the law may be limited where the regulation in question happens to incorporate copyrighted materials. All material incorporated by reference - regardless of the stage in the regulatory process, the subject matter of the regulation, or the identity of the regulated entity - should be made freely available and downloadable on a government agency's website.
- Incorporation by Reference, A Proposed Rule by the Federal Register Office on 02/27/2012
On February 13, 2012, the Office of the Federal Register (OFR or we) received a petition to amend our regulations governing the approval of agency requests to incorporate material by reference into the Code of Federal Regulations. We've set out the petition in this document. We would like comments on the broad issues raised by this petition.
- Re: Request for Information 2012–7602, 77 FR 19357 submitted by Public.Resource.Org to the Office of Information and Regulatory Affairs of the Office of Management and Budget Washington (April 11, 2012).
See also: Liberating America's secret, for-pay laws.
Cory Doctorow says: "This morning, I found an enormous, 30Lb box waiting for me at my post-office box. Affixed to it was a sticker warning me that by accepting this box into my possession, I was making myself liable for nearly $11 million in damages. The box was full of paper, and printed on the paper were US laws -- laws that no one is allowed to publish or distribute without permission. Carl Malamud, Boing Boing's favorite rogue archivist, is the guy who sent me this glorious box of weird. I was expecting it, because he asked me in advance if I minded being one of the 25 entities who'd receive this law-bomb on deposit. I was only too glad to accept -- on the condition that Carl write us a guest editorial explaining what this was all about. He was true to his word."
Liberating America's secret, for-pay laws, By Carl Malamud, boingboing (Mar 19, 2012).
Boing Boing Official Guest Memorandum of Law To: The Standards People Cc: The Rest of Us People From: Carl Malamud, Public.Resource.Org In Re: Our Right to Replicate the Law Without a License
Readers of Slashdot asked Carl Malamud about his experiences and hopes in his project to prod the U.S. government into scanning archived documents. They asked questions about metadata, digitizing rare books, what he thinks about corporate partnerships in the process to get public data released, other projects like Ancestry.com and PACER, and even "Which government agency is the worst to get information from?"
Malamud's answers are posted at the link below "with a mix of heartening and disheartening information about how the vast project is progressing."
- Carl Malamud Answers: Goading the Government To Make Public Data Public, Slashdot, Your Rights Online section, Posted by timothy on Monday January 23, 2012.
Earlier this month, we posted about the "Open letter and petition to President Obama to create a federal scanning commission and digitize all .gov publications". The petition closed on 1/20 and now David Ferriero, the Archivist of the US at the National Archives, has given the official NARA response. I'd say this is a positive first step, but much discussion is still needed. Please join the conversation over at the NARA Blog. I think documents librarians will be invaluable to this effort going forward!
Digitizing Federal Public Records
By David Ferriero
Thank you for signing a petition asking the Obama Administration to digitize all public records.
The Obama Administration believes increasing access to our collections by digitizing our records is a great idea. Our most recent efforts to do this ourselves as part of our OpenGov initiative, include the Citizen Archivist project, a Wikipedian in Residence, Tag it Tuesdays, and Scanathons. We are also moving forward on implementing the President’s recent Memorandum on Managing Government Records, which focuses on the need to update policies and practices for the digital age.
But all those things aren’t enough. Your petition, and the Yes We Scan effort broadly, calls for a national strategy, and even a Federal Scanning Commission, to figure out what it would take to digitize the holdings of many federal entities, from the Library of Congress to the Government Printing Office to the Smithsonian Institution.
These ideas bring up a host of questions that still need to be answered: What should the National Archives’ priorities be? Do we focus on preserving deteriorating paper records, still bound with red ribbons from two centuries ago? Do we make digital copies of Vietnam Era film footage? Should we focus on preserving those older paper records while citizens volunteer to digitize more recent, and better preserved, records?
The National Archives – which houses the Nation’s permanent records – is looking for your input to help answer these important questions on how we move forward. What are your thoughts on how the National Archives and other agencies should proceed? What questions should we be asking ourselves?
You can add your thoughts over on the National Archives blog, and I’m looking forward to having a longer discussion with the creators and signers of this petition on this important issue in the coming weeks – more details on that will follow.
Thank you again for your interest in this important issue. I’m looking forward to your ideas on how we can proceed with digitizing federal public records.
David Ferriero is the Archivist of the United States
Watching Them Watching: Issa Touts Video Archive of Oversight Hearings, by Nick Judd, TechPresident (January 6, 2012).
As of today, the House Committee on Government Oversight under Rep. Darrell Issa has released 1,139 videos of hearings going back to the 103rd Congress of 1993-1994, committee staff announced today.
These videos, dusted off from the House committee's archives, join hundreds more going all the way back to 1987 on House.Resource.org, a repository for archived video and hearing transcripts gleaned from C-SPAN, the House and the Internet Archive as part of a collaboration between Carl Malamud's Public.Resource.org and House Speaker John Boehner. At the start of this Congress, Boehner asked Issa's Oversight committee — which had been recording its own video of hearings, doubling up on video already recorded by the House Broadcasting Studio, since the 2010-2011 session of Congress — to take on archiving and publicising video of committee hearings as a pilot project. The House this year also launched its own streaming of floor proceedings.
Open letter and petition to President Obama to create a federal scanning commission and digitize all .gov publications #FDLP
Submitted by jrjacobs on Tue, 2012-01-03 11:11.
John Podesta and Carl Malamud have written an open letter to President Obama (text below) asking for the creation of a Federal Scanning Commission and to greatly increase the pace of digitization of federal resources. They need 25,000 signatures on their petition by January 20, so your help would be greatly appreciated!
While I have some reservations about wholesale digitization that are glossed over in the letter (I worry, for example, about the process and how current digitization methods basically destroy documents, about how current OCR software is less than perfect, and about producing only a digital equivalent of a paper document rather than the ability to extract and re-use data and statistics; to read more, see "Achieving a collaborative FDLP future"), as Malamud says:
"Just imagine ... what if we could scan the contents of the FDLP, back issues of the CFR, the briefs before the Supreme Court? We'll never know if we can scan .gov unless we start asking the questions. Please help us get started!"
For that, I'm asking readers to sign the petition and forward to your friends. A national effort is just what is needed. Librarians must advocate for and participate in this process!
December 21, 2011
The White House
1600 Pennsylvania Avenue
Washington, D.C. 20500
Dear Mr. President:
Locked in our federal vaults is a tremendous storehouse of information that if digitized would form a core for our digital public libraries in America with huge benefit for our country: cutting costs in the Federal government, creating jobs throughout America, and revolutionizing how we educate our citizens, how we practice the law, and how we create news, art, and scholarly works.
Imagine if the riches contained in the National Archives, Library of Congress, Smithsonian Institution, Government Printing Office, National Library of Medicine, National Agricultural Library, National Technical Information Service, and scores of other federal organizations were made available, becoming the core of a national effort to make access to knowledge a right for all Americans. The dream is a big one, but if we do not begin the questions of what it would take to get there, we will never start down that road. Today, we don't know what it would take.
We are not necessarily suggesting that the federal government immediately undertake an ambitious effort to scan the holdings of .gov, but if we ever hope to begin even a small piece of making available our past for use by our future, we should at least begin to scope out the size of the problem. We believe it would require a decade-long commitment to digitization to make our nation's cultural, scientific, educational, and historical resources available, but we can't even begin that discussion unless we know how big the problem is. Such an effort is indeed ambitious to contemplate, but we can only ask if we were able to put a man on the moon, why can't we launch the Library of Congress into cyberspace?
Over the last year, a number of efforts have sprung up to create comprehensive digital libraries. The European Union has created Europeana with a goal to “make a large part of the world's cultural heritage available to a large part of the world's population.” In the United States, efforts have included Google Books, the Hathi Trust, the Internet Archive, and the recently announced Digital Public Library of America, a planning initiative with a goal of “creating a large-scale digital public library that will make the cultural and scientific record available to all.”
No matter what the eventual shape of these efforts, we know that the holdings of the U.S. government will play a crucial role, a central part of our public domain. While there have been many well-intentioned efforts to digitize federal holdings, those efforts have been preliminary and tentative. Our national cultural and scientific organizations have never worked together to develop a coherent digitization strategy to scan at scale.
The PCAST report on Designing a Digital Future hits the nail on the head on investing in Networking and Information Technology (NIT), but does not address squarely the question of what it would take to digitize the holdings of our national institutions. The Presidential Memorandum on Managing Government Records discusses how to make record-keeping move into the modern age in the future, but does not address how to rescue the past and make it useful for Americans.
One way to begin is to convene governmental and non-governmental experts, perhaps in the form of a Presidential Commission, Interagency Task Force, or other mechanism. The “Federal Scanning Commission” would be tasked to answer 6 questions and deliver a report within a year:
- What are the holdings of our national institutions? How many images, documents, videos, and other objects are there?
- How long would it take to digitize these materials?
- How much would it cost given current technology? Is there directed research or are there economies of scale that would bring those costs down?
- What is the strategy for digital preservation of these materials? How will we avoid digital obsolescence?
- What is the strategy for identifying restrictions on use of the material? How does one identify and safeguard materials that have copyright restrictions, contain personally identifiable information, or contain classified materials?
- What are the economic and non-economic benefits of such an effort?
  - What are the cost savings to government?
  - What are the economic benefits? Would this effort enable industries that build on top of scientific and technical information, spur innovation in the legal marketplace, or enable our creative industries to create more effectively?
  - What are the non-economic benefits? Will such an effort lead to better STEM and other educational efforts? Will it promote a more informed citizenry and better access to justice?
To date, thinking about digitization has been piecemeal. Individual agencies have thought about the problem in terms of prototypes and pilots. Only the White House can bring these efforts together under one roof and begin to think in terms of a national digitization strategy for our federal government.
Bringing government agencies together with outside experts to solve a common problem related to our federal holdings has a precedent. When R. D. W. Connor was appointed as the first Archivist of the United States, he faced a herculean task, getting all the agencies of government to come together with a common vision of “safeguarding and preserving the records of our Government.” The idea of safeguarding and preserving the records of government was a new one, and Archivist Connor found “records mingled higgledy-piggledy with empty whiskey bottles.”
Archivist Connor appealed for help to President Roosevelt, asking for his assistance in forging a common vision among the agencies and for their cooperation. President Roosevelt formed a National Archives Council and convened the first meeting in the Cabinet Room, asking Secretary of State Cordell Hull to serve as chairman. By bringing the agencies together in one room, President Roosevelt made the dream of archiving the records of government a shared vision, and then made that vision real.
When Thomas Jefferson donated his books to create the cornerstone of the Library of Congress, his library contained a wealth of useful information, from an extensive collection on the law to books on agriculture, chemistry, surgery, and medicine. With this contribution, Jefferson saw to it that the government of the United States would play a central role in the increase and diffusion of knowledge. It is time now for us to lay the cornerstone for our own era, to anchor our digital age with the vast holdings of our government so that we may promote the useful arts and the progress of science.
We ask your help to achieve this 21st century dream, making the vast resources of our federal government available to all on the global Internet, making access to knowledge a right for all Americans and a defining contribution for our future.
John D. Podesta, Chair
Center for American Progress
Carl Malamud, President
Happy 150th birthday US Government Printing Office! And to celebrate, here's a blast from the past: a 1979 report from the Public Interest Research Group entitled "The Peoples' Printer: A Report on the Government Printing Office" by Shawn Kelly. This little-known PIRG report was scanned and put online by Carl Malamud, who said in a tweet that "it was handed to me in a brown paper wrapper at an event I was speaking at. Remarkable 1979 independent analysis." Thanks Carl for scanning and tweeting about this!
The U.S. Government Printing Office (GPO) marks a milestone on March 4th when it celebrates 150 years of producing and delivering Government information for all three branches of the Federal Government and the public. GPO opened its doors on March 4, 1861, the same day President Abraham Lincoln took the oath of office. Throughout its history the agency has used constantly changing technologies to meet the needs of the Congress, Federal agencies, and the public. During GPO's early days, employees relied on ink and paper to set the text for The Emancipation Proclamation. Today, as another President from Illinois leads the Nation, GPO employees are using the latest digital technology to document the actions of our Government while carrying out its founding mission of Keeping America Informed.
While GPO's past has been about printing, its present and future are being defined by digital information technologies. In fact, GPO today is the product of more than a generation of investment in digital production and dissemination technologies, an investment that has yielded stunning improvements in productivity, capability, and savings for the taxpayers, savings of 66% on the cost of congressional printing alone. Employing just 2,200 staff, fewer than at any time in the past century, GPO now provides a range of products and activities that could only have been dreamed of 30 years ago: online databases of Federal documents with state-of-the-art search and retrieval capabilities available to the public without charge, Government publications available as e-Books, passports and smart cards with electronic chips carrying biometric data, print products on sustainable substrates using vegetable oil based inks, and a public presence not only on the Web but on Twitter, Facebook, and You Tube.
There is a good interview with Carl Malamud in Library Journal. Carl sees the intersection of legally public domain government information with orphan works, all legal information, and information we are losing because of lack of curation and control by the private sector.
- Public Information for All: An Interview with Carl Malamud, By Debbie Rabina, Library Journal (Nov 1, 2010).
Do you see a role for libraries?
The availability of materials and the parceling out in the public domain is a huge issue for libraries, not only for government documents generally and legal material specifically but everything from the letters of Ben Franklin to all this wonderful corpus of materials that has been issued but is no longer available, the problem of orphan and fallow works, for example, the problem of official vendors of materials that don’t allow knowledge to be spread. Librarians should and must be jumping up and down and pounding the table and saying, “This is a huge issue.”