John Podesta and Carl Malamud have written an open letter to President Obama (text below) asking him to create a Federal Scanning Commission and to greatly increase the pace of digitization of federal resources. They need 25,000 signatures on their petition by January 20, so your help would be greatly appreciated!
I have some reservations about wholesale digitization that are glossed over in the letter. I worry, for example, about the process and how current digitization methods basically destroy documents, about how current OCR software is less than perfect, and about producing only a digital equivalent of a paper document rather than the ability to extract and re-use data and statistics (to read more, see “Achieving a collaborative FDLP future”). Still, as Malamud says:
“Just imagine … what if we could scan the contents of the FDLP, back issues of the CFR, the briefs before the Supreme Court? We’ll never know if we can scan .gov unless we start asking the questions. Please help us get started!”
For that reason, I’m asking readers to sign the petition and forward it to their friends. A national effort is just what is needed. Librarians must advocate for and participate in this process!
December 21, 2011
The White House
1600 Pennsylvania Avenue
Washington, D.C. 20500
Dear Mr. President:
Locked in our federal vaults is a tremendous storehouse of information that if digitized would form a core for our digital public libraries in America with huge benefit for our country: cutting costs in the Federal government, creating jobs throughout America, and revolutionizing how we educate our citizens, how we practice the law, and how we create news, art, and scholarly works.
Imagine if the riches contained in the National Archives, Library of Congress, Smithsonian Institution, Government Printing Office, National Library of Medicine, National Agricultural Library, National Technical Information Service, and scores of other federal organizations were made available, becoming the core of a national effort to make access to knowledge a right for all Americans. The dream is a big one, but if we do not begin the questions of what it would take to get there, we will never start down that road. Today, we don’t know what it would take.
We are not necessarily suggesting that the federal government immediately undertake an ambitious effort to scan the holdings of .gov, but if we ever hope to begin even a small piece of making available our past for use by our future, we should at least begin to scope out the size of the problem. We believe it would require a decade-long commitment to digitization to make our nation’s cultural, scientific, educational, and historical resources available, but we can’t even begin that discussion unless we know how big the problem is. Such an effort is indeed ambitious to contemplate, but we can only ask if we were able to put a man on the moon, why can’t we launch the Library of Congress into cyberspace?
Over the last year, a number of efforts have sprung up to create comprehensive digital libraries. The European Union has created Europeana with a goal to “make a large part of the world’s cultural heritage available to a large part of the world’s population.” In the United States, efforts have included Google Books, the Hathi Trust, the Internet Archive, and the recently announced Digital Public Library of America, a planning initiative with a goal of “creating a large-scale digital public library that will make the cultural and scientific record available to all.”
No matter what the eventual shape of these efforts, we know that the holdings of the U.S. government will play a crucial role, a central part of our public domain. While there have been many well-intentioned efforts to digitize federal holdings, those efforts have been preliminary and tentative. Our national cultural and scientific organizations have never worked together to develop a coherent digitization strategy to scan at scale.
The PCAST report on Designing a Digital Future hits the nail on the head on investing in Networking and Information Technology (NIT), but does not address squarely the question of what it would take to digitize the holdings of our national institutions. The Presidential Memorandum on Managing Government Records discusses how to make record-keeping move into the modern age in the future, but does not address how to rescue the past and make it useful for Americans.
One way to begin is to convene governmental and non-governmental experts, perhaps in the form of a Presidential Commission, Interagency Task Force, or other mechanism. The “Federal Scanning Commission” would be tasked to answer 6 questions and deliver a report within a year:
- What are the holdings of our national institutions? How many images, documents, videos, and other objects are there?
- How long would it take to digitize these materials?
- How much would it cost given current technology? Is there directed research or are there economies of scale that would bring those costs down?
- What is the strategy for digital preservation of these materials? How will we avoid digital obsolescence?
- What is the strategy for identifying restrictions on use of the material? How does one identify and safeguard materials that have copyright restrictions, contain personally identifiable information, or contain classified materials?
- What are the economic and non-economic benefits of such an effort?
  - What are the cost savings to government?
  - What are the economic benefits? Would this effort enable industries that build on top of scientific and technical information, spur innovation in the legal marketplace, or enable our creative industries to create more effectively?
  - What are the non-economic benefits? Will such an effort lead to better STEM and other educational efforts? Will it promote a more informed citizenry and better access to justice?
To date, thinking about digitization has been piecemeal. Individual agencies have thought about the problem in terms of prototypes and pilots. Only the White House can bring these efforts together under one roof and begin to think in terms of a national digitization strategy for our federal government.
Bringing government agencies together with outside experts to solve a common problem related to our federal holdings has a precedent. When R. D. W. Connor was appointed as the first Archivist of the United States, he faced a herculean task, getting all the agencies of government to come together with a common vision of “safeguarding and preserving the records of our Government.” The idea of safeguarding and preserving the records of government was a new one, and Archivist Connor found “records mingled higgledy-piggledy with empty whiskey bottles.”
Archivist Connor appealed for help to President Roosevelt, asking for his assistance in forging a common vision among the agencies and for their cooperation. President Roosevelt formed a National Archives Council and convened the first meeting in the Cabinet Room, asking Secretary of State Cordell Hull to serve as chairman. By bringing the agencies together in one room, President Roosevelt made the dream of archiving the records of government a shared vision, and then made that vision real.
When Thomas Jefferson donated his books to create the cornerstone of the Library of Congress, his library contained a wealth of useful information, from an extensive collection on the law to books on agriculture, chemistry, surgery, and medicine. With this contribution, Jefferson saw to it that the government of the United States would play a central role in the increase and diffusion of knowledge. It is time now for us to lay the cornerstone for our own era, to anchor our digital age with the vast holdings of our government so that we may promote the useful arts and the progress of science.
We ask your help to achieve this 21st century dream, making the vast resources of our federal government available to all on the global Internet, making access to knowledge a right for all Americans and a defining contribution for our future.
John D. Podesta, Chair
Center for American Progress
Carl Malamud, President
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.