[UPDATE 6/27: COL's final report is now available online. We've added a link to it below along with the draft report]
We here at FGI are all for greater access to government information and have long supported and worked toward a fully digital FDLP. When discussing the future of the FDLP, we believe it is important to create policy based on thorough fact-based analysis, to learn from FDLP history, and not to repeat the mistakes that in the past led to benign neglect of documents collections -- many of which were born out of trying to handle government documents collections on the cheap. For example, the lack of adequate cataloging, one of our biggest problems today, is a direct result of libraries not investing sufficiently in describing FDLP collections.
I had the distinct honor to be invited to speak at the University of Washington Libraries on Thursday, January 24, 2013. I want to thank Cass Hartnett, the Northwest Government Information Network, the UW Information School, the UW Association of Library and Information Science Students (ALISS), and the University of Washington Libraries for allowing me the opportunity to talk publicly about the future of the Federal Depository Library Program (FDLP). The audio for my talk can be downloaded from the UW Library digital archive or streamed below from the Internet Archive.
that is all.
We’re at the very beginning of the digital era where tools, policies, best practices, etc. are all in flux. In many ways, we’re at the age of new metaphors needed to describe what it is that we as librarians do on a daily basis.
I'd like to talk about the underlying historical ideals of the FDLP, discuss how those ideals have been under fire from both within and without the library community, and argue that those ideals, applied to today's new information metaphors, give us the best chance at access to and long-term preservation and assurance of government information.
Then I’ll talk about some of the digital collection strategies that I’ve found to be successful and then conclude with a bit about collaboration and to-dos.
[UPDATE 3/25/12: for those of you on iPads/iPhones (which don't play well with Flash), I've attached the PDF version below. JRJ]
I just got back from Shreveport, LA (and boy are my arms tired :-)). I was honored to be invited as the keynote speaker at the Spring 2012 meeting of the Louisiana Government Documents Round Table (LA GODORT) during the 2012 LA Library Association annual conference. We had a great conversation about the future of government information and the FDLP -- and I reminded everyone to submit their FDLP library forecast survey!! I showed a few case studies describing ways that librarians could build digital govt documents collections (including Everyday Electronic Materials (EEMs), LOCKSS-USDOCS, and Archive-it). Thanks again to Miriam Childs, Stephanie Braunstein and the rest of the LA GODORTers for a wonderful time!
This story is making it around the interwebs/twitterverse today. Brainpicker, a wonderful blog, posted a story today about digital preservation in the film industry -- "The future for digital storage is constant migration." While the story focuses on independent filmmakers and nonprofit archives, it's worthwhile to do a find-and-replace of "film" with "government documents" and "filmmakers" with "government information (nee documents) librarians." Digital preservation takes collaboration and long-term vision. Period!
"Most of the filmmakers surveyed...were not aware of the perishable nature of digital content or how short its unmanaged lifespan is." After the Motion Picture Academy's release last month of "The Digital Dilemma 2," a warning aimed at independent filmmakers and nonprofit archives, cinematographer John Bailey talks with one of the report's authors about the perils of data migration ("It’s not unreasonable to say that the term "digital preservation" is an oxymoron") and the need to educate filmmakers who are so "enamored with the perceived benefits of digital image capture and workflow" that they fail to realize preservation concerns start to appear almost immediately after their work is completed. Film professor David Bordwell covers the report in a detailed post about preserving "born-digital" films, sixth in his "Pandora's Digital Box" series about the worldwide conversion to digital projection, with lots of good links at the bottom.
[HT to Brainpicker!]
Earlier this month, we posted about the "Open letter and petition to President Obama to create a federal scanning commission and digitize all .gov publications". The petition closed on 1/20 and now David Ferriero, the Archivist of the US at the National Archives, has given the official NARA response. I'd say this is a positive first step, but much discussion is still needed. Please join the conversation over at the NARA Blog. I think documents librarians will be invaluable to this effort going forward!
Digitizing Federal Public Records
By David Ferriero
Thank you for signing a petition asking the Obama Administration to digitize all public records.
The Obama Administration believes increasing access to our collections by digitizing our records is a great idea. Our most recent efforts to do this ourselves as part of our OpenGov initiative, include the Citizen Archivist project, a Wikipedian in Residence, Tag it Tuesdays, and Scanathons. We are also moving forward on implementing the President’s recent Memorandum on Managing Government Records, which focuses on the need to update policies and practices for the digital age.
But all those things aren’t enough. Your petition, and the Yes We Scan effort broadly, calls for a national strategy, and even a Federal Scanning Commission, to figure out what it would take to digitize the holdings of many federal entities, from the Library of Congress to the Government Printing Office to the Smithsonian Institution.
These ideas bring up a host of questions that still need to be answered: What should the National Archives’ priorities be? Do we focus on preserving deteriorating paper records, still bound with red ribbons from two centuries ago? Do we make digital copies of Vietnam Era film footage? Should we focus on preserving those older paper records while citizens volunteer to digitize more recent, and better preserved, records?
The National Archives – which houses the Nation’s permanent records – is looking for your input to help answer these important questions on how we move forward. What are your thoughts on how the National Archives and other agencies should proceed? What questions should we be asking ourselves?
You can add your thoughts over on the National Archives blog, and I’m looking forward to having a longer discussion with the creators and signers of this petition on this important issue in the coming weeks– more details on that will follow.
Thank you again for your interest in this important issue. I’m looking forward to your ideas on how we can proceed with digitizing federal public records.
David Ferriero is the Archivist of the United States
Open letter and petition to President Obama to create a federal scanning commission and digitize all .gov publications #FDLP
Submitted by jrjacobs on Tue, 2012-01-03 11:11.
John Podesta and Carl Malamud have written an open letter to President Obama (text below) asking for the creation of a Federal Scanning Commission and to greatly increase the pace of digitization of federal resources. They need 25,000 signatures on their petition by January 20, so your help would be greatly appreciated!
While I have some reservations about wholesale digitization that are glossed over in the letter -- I worry for example about the process and how current digitization methods basically destroy documents, how current OCR software is less than perfect, and about only making a digital equivalent to a paper document, NOT the ability to extract and re-use data and statistics etc. (to read more, see "Achieving a collaborative FDLP future") -- as Malamud says:
"Just imagine ... what if we could scan the contents of the FDLP, back issues of the CFR, the briefs before the Supreme Court? We'll never know if we can scan .gov unless we start asking the questions. Please help us get started!"
For that, I'm asking readers to sign the petition and forward to your friends. A national effort is just what is needed. Librarians must advocate for and participate in this process!
December 21, 2011
The White House
1600 Pennsylvania Avenue
Washington, D.C. 20500
Dear Mr. President:
Locked in our federal vaults is a tremendous storehouse of information that if digitized would form a core for our digital public libraries in America with huge benefit for our country: cutting costs in the Federal government, creating jobs throughout America, and revolutionizing how we educate our citizens, how we practice the law, and how we create news, art, and scholarly works.
Imagine if the riches contained in the National Archives, Library of Congress, Smithsonian Institution, Government Printing Office, National Library of Medicine, National Agricultural Library, National Technical Information Service, and scores of other federal organizations were made available, becoming the core of a national effort to make access to knowledge a right for all Americans. The dream is a big one, but if we do not begin the questions of what it would take to get there, we will never start down that road. Today, we don't know what it would take.
We are not necessarily suggesting that the federal government immediately undertake an ambitious effort to scan the holdings of .gov, but if we ever hope to begin even a small piece of making available our past for use by our future, we should at least begin to scope out the size of the problem. We believe it would require a decade-long commitment to digitization to make our nation's cultural, scientific, educational, and historical resources available, but we can't even begin that discussion unless we know how big the problem is. Such an effort is indeed ambitious to contemplate, but we can only ask if we were able to put a man on the moon, why can't we launch the Library of Congress into cyberspace?
Over the last year, a number of efforts have sprung up to create comprehensive digital libraries. The European Union has created Europeana with a goal to “make a large part of the world's cultural heritage available to a large part of the world's population.” In the United States, efforts have included Google Books, the Hathi Trust, the Internet Archive, and the recently announced Digital Public Library of America, a planning initiative with a goal of “creating a large-scale digital public library that will make the cultural and scientific record available to all.”
No matter what the eventual shape of these efforts, we know that the holdings of the U.S. government will play a crucial role, a central part of our public domain. While there have been many well-intentioned efforts to digitize federal holdings, those efforts have been preliminary and tentative. Our national cultural and scientific organizations have never worked together to develop a coherent digitization strategy to scan at scale.
The PCAST report on Designing a Digital Future hits the nail on the head on investing in Networking and Information Technology (NIT), but does not address squarely the question of what it would take to digitize the holdings of our national institutions. The Presidential Memorandum on Managing Government Records discusses how to make record-keeping move into the modern age in the future, but does not address how to rescue the past and make it useful for Americans.
One way to begin is to convene governmental and non-governmental experts, perhaps in the form of a Presidential Commission, Interagency Task Force, or other mechanism. The “Federal Scanning Commission” would be tasked to answer 6 questions and deliver a report within a year:
- What are the holdings of our national institutions? How many images, documents, videos, and other objects are there?
- How long would it take to digitize these materials?
- How much would it cost given current technology? Is there directed research or are there economies of scale that would bring those costs down?
- What is the strategy for digital preservation of these materials? How will we avoid digital obsolescence?
- What is the strategy for identifying restrictions on use of the material? How does one identify and safeguard materials that have copyright restrictions, contain personally identifiable information, or contain classified materials?
- What are the economic and non-economic benefits of such an effort?
  - What are the cost savings to government?
  - What are the economic benefits? Would this effort enable industries that build on top of scientific and technical information, spur innovation in the legal marketplace, or enable our creative industries to create more effectively?
  - What are the non-economic benefits? Will such an effort lead to better STEM and other educational efforts? Will it promote a more informed citizenry and better access to justice?
To date, thinking about digitization has been piecemeal. Individual agencies have thought about the problem in terms of prototypes and pilots. Only the White House can bring these efforts together under one roof and begin to think in terms of a national digitization strategy for our federal government.
Bringing government agencies together with outside experts to solve a common problem related to our federal holdings has a precedent. When R. D. W. Connor was appointed as the first Archivist of the United States, he faced a herculean task, getting all the agencies of government to come together with a common vision of “safeguarding and preserving the records of our Government.” The idea of safeguarding and preserving the records of government was a new one, and Archivist Connor found “records mingled higgledy-piggledy with empty whiskey bottles.”
Archivist Connor appealed for help to President Roosevelt, asking for his assistance in forging a common vision among the agencies and for their cooperation. President Roosevelt formed a National Archives Council and convened the first meeting in the Cabinet Room, asking Secretary of State Cordell Hull to serve as chairman. By bringing the agencies together in one room, President Roosevelt made the dream of archiving the records of government a shared vision, and then made that vision real.
When Thomas Jefferson donated his books to create the cornerstone of the Library of Congress, his library contained a wealth of useful information, from an extensive collection on the law to books on agriculture, chemistry, surgery, and medicine. With this contribution, Jefferson saw to it that the government of the United States would play a central role in the increase and diffusion of knowledge. It is time now for us to lay the cornerstone for our own era, to anchor our digital age with the vast holdings of our government so that we may promote the useful arts and the progress of science.
We ask your help to achieve this 21st century dream, making the vast resources of our federal government available to all on the global Internet, making access to knowledge a right for all Americans and a defining contribution for our future.
John D. Podesta, Chair
Center for American Progress
Carl Malamud, President
The Defining Moment
As noted here recently, the depository community is into yet another round of trying to redefine the Federal Depository Library Program. (See: GPO contracts Ithaka S+R to develop sustainable FDLP models.)
This new project will question and evaluate the role that FDLP libraries will play in the lifecycle of government information. The definition of this role will determine whether or not FDLP libraries will deserve or get support from their constituencies. If they serve a useful function, they will get support; if they do not, they will not.
This is a defining moment. Libraries will not flourish or even survive because librarians like them, or because older people have fond memories of them, or even because we have a vague belief that they "should" survive. They will survive only if they fulfill a role in society that no one else fulfills as well -- or at all -- and if society recognizes and values that role.
Make no mistake about it: what is at stake is the survival of FDLP libraries. By defining the role of FDLP libraries, this project will determine whether or not there will be Federal Depository Libraries at all.
Defining the role of libraries
Typically, the roles of libraries are defined either in general terms of who the library serves or by enumerating specific tasks. You could call these the "who you serve" approach and the "what you do" approach. Each has advantages because each one can help focus our thinking and give us criteria against which we can evaluate our actions.
Since 1993, GPO has effectively defined a limited -- and shrinking -- role for FDLP libraries. The FDLP has always been a cooperative venture in which different partners (GPO, regional depositories, selective depositories) played different roles. But, since 1993, when The Government Printing Office Electronic Information Access Enhancement Act (Public Law 103-40) was passed, GPO has arrogated to itself the role of permanent preservation of government information and essentially prevented FDLP libraries from undertaking that role by refusing to deposit digital materials with depository libraries.
But that is changing. Recent developments (for example, GPO collaborating with LOCKSS in the LOCKSS-USDOCS project) have demonstrated that GPO is open to sharing responsibility, is no longer committed to preventing libraries from participating in digital preservation, is open to "digital deposit," and is, in general, open to new roles for FDLP libraries. These "new" roles could look a lot more like the traditional roles of FDLP libraries. Yes, we need to change the FDLP for the digital age, but these should not be changes in our traditional roles (what we do: build collections and provide services for and stewardship of those collections) but changes in procedures (how we do these things using digital tools instead of paper tools). It is in those traditional roles that libraries have a unique value in society. The Ithaka/GPO project will define the role of FDLP libraries either in a way that will expand this new openness, allowing libraries to have the flexibility to participate actively, or in a limited and passive way in the mode of the 1993-2009 model.
1. Who do you serve?
In defining the general role of libraries (who a library serves), there is a popular tendency to focus on the parent institution. This is an easy role to justify and explain and it even lends itself to some degree of quantification and accountability. The recent report from the Association of College and Research Libraries, The value of academic libraries: A Comprehensive Research Review and Report (by Megan Oakleaf, 2010) is a prime example of this view.
An alternative view is that libraries fulfill a larger role in society as a whole and thus serve more than their parent institution. The Darien Statements on the Library and Librarians (by John Blyberg, Kathryn Greenhill, and Cindi Trainor, 2009) is one recent example of this broader view of the role of libraries in society. Barbara Fister, a librarian at Gustavus Adolphus College, has written eloquently about this. New initiatives such as the NSF's rules for sharing of data and creating explicit plans for managing data for the long-term are driving libraries to anticipate a larger role in fulfilling the societal need for long-term stewardship and curation of information of all kinds across the life-cycle of information. We can anticipate this as a coming major shift in focus and priorities from the local to the global role of libraries.
While these two different views need not exclude each other, in practice they sometimes do. It would be wiser to view these two ideas as complementary rather than incompatible. I would argue, in fact, that any definition of the role of libraries that excludes one or the other of these views will almost certainly be incomplete and fatally flawed.
The role of FDLP as a whole vs. the role of individual FDLP libraries
I would expect that most people in the depository community would assume that the role of FDLP must, by its very nature and purpose, reflect a society-as-a-whole mission and go beyond the missions of individual participants in the program. It will be hard for Ithaka S+R to suggest a model that avoids the big, societal role of the FDLP. But the details of how to fulfill that societal role are the issue. What specific tasks will Ithaka S+R define for depository libraries, on the one hand, and GPO and other (unspecified) "partners" (RFQ, page 5), on the other hand? These definitions will either open up the options for libraries or explicitly limit them.
Despite broad statements of the role of the FDLP as a whole, such as "create an informed citizenry and an improved quality of life" (RFQ, page 4), I think we may see considerable pressure to use a narrow definition of the role of the individual depository libraries. One reason for this is the recent history of GPO and the FDLP in which GPO took over complete control of long-term preservation and access to all digital information in the program and relegated libraries to a role of only providing customer service. Some will see this split of roles as a de facto standard that should be continued. I would argue that, while that split may have seemed appropriate 17 years ago, much has changed and libraries are now more tech-savvy and better equipped to take on digital challenges. Those who argue for the 1993 status quo are the true luddites. It is a time for change.
The pressure to limit the role of libraries will come, I think, from librarians who view the role of their own library as limited to fulfilling the mission of their own institution -- those who would like someone else to take on the big, societal mission. It will come from those who support the Oakleaf ACRL report and its institution-focus and emphasis on "return-on-investment, commodity production, [and] impact." It will come from library directors who, for legitimate but parochial reasons, want to weed depository materials from their collections and do not wish to invest in digital depository collections. It may come from Ithaka S+R itself whose Manager of Research, Roger Schonfeld, has already praised the ACRL report because it "clearly frames the purpose and value of the academic library in the context of the parent organization." And, it may even come from some at GPO, since it matches the model GPO developed in 1993.
But there are those who will be advocating a broader, more active role for FDLP libraries: those of us who believe that a network of Congressionally-mandated libraries can provide a better, more secure, more robust, more flexible, more responsive system than GPO or GPO and a few commercial "partners" could provide by themselves. One way to make that case is to focus on the specific tasks that those libraries might undertake.
2. What do you do? The specific tasks.
There are a number of options for the tasks and roles of libraries. Many of these have been well articulated and tried with varying degrees of success. For example:
- Libraries as "service centers." You might call this the "libraries without collections" or the "librarians without libraries" model. This is the model designed by GPO in 1993. It is the model that ITHAKA, the parent organization of Ithaka S+R, has used as its own business model for Portico and JSTOR. This model is favored by the Association of Research Libraries, by many library administrators who apparently believe that it would be better if someone else took the responsibility of preserving government information and ensuring its long-term accessibility and usability, and by many depository librarians who do not have the support of their institutions to build and manage digital collections.
- Libraries as "business centers." This model is an extension or complement to the above model because it, too, envisions libraries without collections. It sees the library as the agent that manages leases and licensing agreements with publishers and other producers (including government agencies). This model is advocated by ITHAKA, by commercial data providers who make a profit by limiting access to information to those who pay for it, by government agencies that have accepted that they must be self-supporting and thus sell their information, and by the information industry, which would like the role of libraries to be enforcers of rights-management. In addition, most libraries have already accepted this model for many classes of digital information by leasing access to databases or electronic journals instead of demanding their own digital copies.
- Libraries that provide both collections and services. This is the traditional library model in the paper world and it is increasingly the new model in the digital world. Examples of libraries doing this include libraries and projects of all shapes and sizes: The California Digital Library, the Hathi Trust, the Scholars Portal of the Ontario Council of University Libraries, the Legal Information Archive of the Chesapeake Project, the North Carolina Digital Repository, the Stanford University Freedom of Information and CRS and FRUS collections at Archive-It, the University of North Texas digital government collections, the LOCKSS-USDOCS Network, the Historical Publications of the United States Commission on Civil Rights at the Thurgood Marshall Law Library at the University of Maryland, the many U.S. Government Publication Digitization Projects, and many more. This is also the model that private sector government information service providers use when they obtain copies of digital information so that they can provide services for that information rather than trying to provide services for collections that they do not hold. This model also fits the OAIS preservation model which requires that an archive "obtain sufficient control" of information in order to ensure long-term preservation.
Defining our own future
Over the past two decades GPO has redefined the role that FDLP libraries play in the lifecycle of government information by reducing that role to one of providing service for collections held by GPO and others. FDLP Libraries have largely accepted this de facto redefinition of their role. Why? Because they have been caught in a perfect storm of inadequate budgets and staffing and training, demands from library administrators to reduce the size and footprint of paper collections, users who were quick to accept online access without demanding -- or even being cognizant of -- long-term preservation, and a GPO that promised it would single-handedly take on the pesky problem of managing and preserving a single digital collection for everyone while simultaneously providing "access."
That model is failing and GPO is now open to a new model that involves depository libraries as active participants again. We have an opportunity with the Ithaka/GPO project to take the lead in defining the future of our own libraries. In the last two decades, depository libraries have followed the lead of GPO and accepted a diminished role of providing services without collections. We have followed the lead of technologists and digital-pundits who like to call the Google Books project a "library" while simultaneously diminishing the importance of actual libraries that are accountable to their communities. In a chicken-and-egg situation, our diminished activities have made depository activities easy targets for budget cuts that further diminished our ability to provide adequate digital services. As the private sector steps into this gap, our services begin to pale in comparison. We have not adequately differentiated ourselves from the private sector and government digital services, preferring to piggy back on what they do, enforce their rights-restrictions and their privacy-encroaching policies. But that strategy has endangered the principles of the FDLP and reduced our ability to do what the private sector will not and the government cannot do alone. Now we have an opportunity to change all that.
Where GPO must have a one-size-fits-all collection and service model, FDLP libraries can each focus on their own communities of interest. Where private sector companies limit access to those who pay and GPO is specifically authorized in the 1993 law to "charge reasonable fees," FDLP libraries are dedicated to providing information without charging. While the private sector, by definition, provides only those services that generate a profit, FDLP libraries are funded to provide services for their communities by leveraging the resources of the community for the benefit of all. A GPO-centered model of preservation is fragile because it has only a single budgetary authority and a single, monolithic "community." In contrast, a preservation model based on FDLP libraries has as many as 1000 budgets, 1000 communities, 1000 locations, 1000 systems, and 1000 reasons to survive and flourish. It is 1000 times more secure.
FDLP libraries can build new digital collections that combine Title-44-materials with non-Title-44 materials. GPO cannot do that. We can support our own communities-of-interest that need no longer be geographically based. GPO has to serve everyone and does not have the resources to focus on every potential community of interest. We can build services and functionalities for our collections that focus on our communities of interest. GPO must provide generic services and generic APIs. We can guarantee that we will preserve the information in our collections for as long as those materials are of interest to our communities. GPO cannot guarantee that Congress will continue to fund preservation for everything forever. In fact, the RFQ does not contain the words "preservation" or "long-term." We can assure our communities that we will preserve their privacy and provide information that is usable and without fees. GPO cannot make those guarantees. We can, collectively, do a better, more secure job of preserving government information for the long-term and assuring that it will be available and usable without fees than any single institution or agency can. Together, we can build a twenty-first century FDLP that will do for digital materials what the FDLP has always done: ensure long-term, free, public access to government information.
It should be clear to us that returning to a model in which FDLP libraries take an active role in building and preserving collections and providing services will provide clear benefits to users of government information. But the benefits go beyond providing access and services. A strong FDLP also benefits the participating libraries, and other non-participating libraries. GPO and the Depository Library Council are working to create a comprehensive list of benefits for libraries participating in the FDLP and benefits afforded to the public by having access to these libraries.
While the Ithaka/GPO project provides an opportunity to turn the FDLP around and make it viable and useful in the twenty-first century, it is not at all certain that this will happen. As noted above, there are those who will argue for the status quo that limits the role of FDLP libraries to an unsustainable, unjustifiable service-without-collections model.
Ithaka S+R has already written a report with a model for the FDLP (Documents for a Digital Democracy: A Model for the Federal Depository Library Program in the 21st Century). In that report, it recommended that "GPO should develop formal partnerships with a small number of dedicated preservation entities -- such as organizations like HathiTrust or Portico or individual libraries -- to preserve a copy of its materials" (page 44, emphasis added). As noted above, Ithaka S+R is affiliated with Portico through their parent organization, ITHAKA. (For another take on a related Ithaka S+R study, see: Nyquist, Corinne(2010) 'An Academic Librarian's Response to the “ITHAKA Faculty Survey 2009: Key Strategic Insights for Libraries, Publishers, and Societies”', Journal of Interlibrary Loan, Document Delivery & Electronic Reserve, 20: 4, 275 -- 280.) Although Roger Schonfeld has assured us that this time Ithaka S+R recommendations "will not focus on specific brands, services, or products, including those provided by any part of our organization," it appears that we will have to convince him and Ithaka S+R that FDLP libraries, not just a "small number of dedicated preservation entities" like Portico, must play a significant role in the preservation of FDLP library materials.
Ithaka S+R has pledged that their work will include broad, vigorous community engagement and that it will rely on community input and advice to guide their research and define their recommendations. It is time for us to speak up. Here are some things you can do today:
- Participate in the discussion on the project web site.
- Participate in FDLP's work to create a comprehensive list of benefits for libraries participating in the FDLP.
- Work with your colleagues to create a list of benefits to your own library of being an FDLP library. Share this with your library management.
- Identify your own library's user-communities. Do you have subject collections that are used by people beyond your local geographic community? Would those collections be enhanced if they included government-produced information? Make the case to your communities and your library management.
- Do you already have digital collections? Do you have digital collection tools that you could apply to government information? Or would you like to develop digital collections and services with copyright-free, DRM-free digital materials? Develop a plan that will enhance your library's digital future by utilizing free government information content.
- Add your comments and ideas here to this post on FGI.
UPDATED 9/27. GPO staff kindly sent me the link to the original GPO RFQ, which I added below. JRJ
Many of our readers will have already seen yesterday's announcement that Ithaka S+R (the "strategy and research" arm of the Ithaka group, which also includes the JSTOR academic journal database and the Portico digital archive service) has been contracted by the Government Printing Office (GPO) for a project to "develop sustainable models for the FDLP in the 21st century" (see FedBizOpps award notice and GPO Request for Quotation (RFQ) here). Those interested may track the project on its Web site, fdlpmodeling.net. While we have had spirited discussions with Ithaka in the past regarding the future of the FDLP, we look forward to tracking this project, participating in the discussions, and analyzing the outcomes.
Because so many of us WILL participate in this project, I'd like to take this opportunity to point out a recent article by Barbara Fister in her Library Babel Fish column entitled "Assessing the (Enduring) Value of Libraries" (Inside Higher Ed, September 17, 2010). In the article, Fister points out that "library values - such as the preservation of knowledge and the protection of intellectual freedom - are bigger than any one library or any single community's local needs."
"If we focus so exclusively on how we contribute to the bottom line of a single institution, we may lose sight of the fact that libraries are cultural institutions that have something to contribute to society beyond our campuses and beyond this fiscal year. Somehow "return on investment" sounds like the kind of managerial thinking that shortchanges the future. And that worries me."
I hope that Ithaka and my government documents and library colleagues will have Fister's words in the front of their minds as they discuss and plan for the future of the FDLP. And I hope Ithaka will share their draft survey instrument, list of participants, and draft report(s) as they work through this FDLP modeling project.
On behalf of the Government Printing Office, Ithaka S+R is launching a project that will develop sustainable models for the Federal Depository Library Program in the 21st century. The Program plays a critical role in making federal government information available to the American public, preserving it, and providing services to help the public and specialized user communities to make effective use of government information. We’re currently reaching out to libraries of all types – public, government, academic, and law libraries, and both participants and non-participants in the Program – to alert you of the launch of this project. We hope that the FreeGovInfo community will choose to engage with us regularly during the project’s six-month duration, to ensure that your experience and perspective is incorporated to the greatest extent possible.
Engaging the community – including non-participating libraries that may rely on depository libraries in providing government information services to their constituents as well as members of the Program – is a priority for this project. We will rely on the input of the library community in our efforts to model a FDLP that meets the needs of depository libraries as well as of the broader library community and those they serve. Towards this end, we’ve just launched a website – fdlpmodeling.net – to serve as a venue for community engagement, providing updates on the status of the project, offering a variety of mechanisms for community input, and sharing drafts and interim deliverables for comment. We’d encourage you to visit the site, learn about the project, and share your thoughts with us at this early stage. While you’re there, please subscribe to our RSS feed or sign up for email updates so you can be alerted of new posts. And of course, please share this information with any colleagues or others who might be interested.
We hope to hear from you over the course of this project, either via fdlpmodeling.net or directly by email.
Roger Schonfeld & Ross Housewright
[UPDATE 9/23/11: It's come to our attention that Scribd, the site that hosts the document below, does not make it easy for users to download. In some instances it appears that the user has to subscribe to Scribd before they can download. So I've attached a copy of the document below for your free downloading pleasure. JRJ]
In early April, Michael Keller, Stanford University Librarian and my boss, had a phone conversation with Beth Simone Noveck, US Deputy Chief Technology Officer for Open Government, who leads President Obama's Open Government Initiative. Noveck requested a short report outlining how the digital FDLP would work.
Below is that report outlining a distributed ecosystem, or publications.gov, that "would incorporate collaborative cataloging/metadata creation, as well as shared or Peer-to-Peer (P2P) technical infrastructure in which data and technological redundancy and collective and proactive action reign." As many of you already know, some of the pieces for a digital FDLP ecosystem are already in place. However, as our recent post, "The State of FDsys and the Future of the FDLP", showed, some of those critical pieces are on shaky ground to say the least.
The report was forwarded to Bob Tapella and Mike Wash at GPO as well as Aneesh Chopra, Chief Technology Officer (CTO), Vivek Kundra, Chief Information Officer (CIO), and US Archivist David Ferriero.
FDLP issues are now front and center to the movers and shakers in the Obama administration. But we'll need more libraries and librarians willing to step up and pitch in to make the digital FDLP ecosystem a reality.
Digital FDLP Ecosystem
I am not an expert on Digital Object Identifiers (DOI) or Handles or other methods of creating permanent, persistent links to information on the web, so I pose this as a question. Could DOIs help solve three problems that, if solved, would provide better preservation, better access, and a better user experience?
The three challenges are:
1. The need for reliable, permanent, persistent links.
2. The need to provide a simple user interface to depository collections.
3. The need to guarantee authenticity of government information.
Here is why I think the answer is Yes.
Problem: Providing reliable, permanent, persistent links. Currently, GPO uses PURLs (Persistent Uniform Resource Locators) for creating permanent links. PURLs provide "persistent" links so that, when a page moves and its URL changes, that change need only be recorded once -- in the PURL database -- and all the hundreds or thousands of links to the PURL resolve to the new address automatically without being changed themselves. While this is an efficient way to deal with the dynamic nature of web addresses, and while the system works, it is fragile. We had a graphic demonstration of that fragility last August when the GPO PURL server crashed. For more than two weeks, no one anywhere in the world who relied on the 115,000 PURLs pointing to government information could reach that information through those links. This was not the fault of GPO (although restoration time could have been reduced with better disaster planning). Rather, the very nature of PURLs makes them fragile in this way and vulnerable to the crash of a single server.
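To make the fragility concrete, here is a minimal sketch in Python of PURL-style redirection. It is a toy model, not GPO's actual PURL software: the `PurlServer` class, the `purl.example` identifier, and the `example.gov` URLs are all made up for illustration.

```python
# Hypothetical sketch of PURL-style redirection; names and URLs are illustrative only.

class PurlServer:
    """A single server holding the PURL -> current URL table."""

    def __init__(self):
        self.table = {}
        self.online = True

    def register(self, purl, target_url):
        # Moving a document means changing exactly one entry here.
        self.table[purl] = target_url

    def resolve(self, purl):
        if not self.online:
            # Every link that depends on this server fails at once.
            raise ConnectionError("PURL server unavailable")
        return self.table[purl]


server = PurlServer()
server.register("purl.example/gpo/123", "https://old.example.gov/report.pdf")

# The document moves: one update, and every existing link follows it.
server.register("purl.example/gpo/123", "https://new.example.gov/report.pdf")
moved_to = server.resolve("purl.example/gpo/123")

# The server crashes: no PURL resolves, wherever the documents themselves live.
server.online = False
try:
    server.resolve("purl.example/gpo/123")
    outage_blocks_access = False
except ConnectionError:
    outage_blocks_access = True
```

The single lookup table is what makes updates cheap, and it is also the single point of failure: the documents are still online during the outage, but nothing that links through the PURL can find them.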
Solution: Persistence is a function of organizations, not of technology. DOIs address the fragility problem by building a social structure that guarantees persistence. As the DOI organization says, "Persistence is a function of organizations, not of technology; a persistent identifier system requires a persistent organization and defined processes. The International DOI Foundation (IDF) provides a federation of Registration Agencies (RA). Dependency on any one RA is removed." In other words, if one server crashes, others are available immediately. Rather than relying on a single organization (GPO) and a single server at that organization, DOIs rely on multiple Registration Agencies and multiple servers. DOIs are reliable because they use redundancy and have no single points of failure (Wikipedia).
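The redundancy idea can be sketched in a few lines of Python. This is only an illustration of "no single point of failure" under my own assumptions; it is not how the DOI system is actually implemented, and the `Resolver` class, server names, and the `10.99999` identifier are hypothetical.

```python
# Hypothetical sketch: the same identifier table replicated on several servers.

class Resolver:
    def __init__(self, name, table, online=True):
        self.name = name
        self.table = dict(table)  # each server keeps its own copy
        self.online = online

    def resolve(self, identifier):
        if not self.online:
            raise ConnectionError(f"{self.name} unavailable")
        return self.table[identifier]


def resolve_with_failover(identifier, resolvers):
    """Try each resolver in turn; fail only if all of them are down."""
    for resolver in resolvers:
        try:
            return resolver.resolve(identifier)
        except ConnectionError:
            continue
    raise ConnectionError("no resolver available")


table = {"10.99999/example-doc": "https://example.gov/doc.pdf"}
resolvers = [
    Resolver("server-a", table, online=False),  # simulated crash
    Resolver("server-b", table),
    Resolver("server-c", table),
]
url = resolve_with_failover("10.99999/example-doc", resolvers)
```

With one server down, the lookup still succeeds through the next one. The code only models the technical half of the argument; the organizational commitment to keep several servers running is the part that no code can supply.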
Problem: Providing a simple user interface. Imagine with me for a moment a depository system that deposits digital documents in FDLP libraries. Once such a system is in place, we will have the same document in multiple locations -- perhaps one copy in GPO's Federal Digital System (FDsys), one copy in each of a dozen or more FDLP libraries, perhaps an "original" copy at house.gov or senate.gov, and so forth. What is the user to do? Will libraries show dozens of links with an explanation after each as to what it is and hope users will have the patience to read the explanations, make an informed decision, and, if that particular link is down, go back and repeat the process? This sounds like a lousy user experience to me.
Solution: Multiple redirections. DOIs provide a way to resolve multiple URLs with a single DOI. (Resolution of Multiple URLs). This would mean that multiple copies of digital documents could be stored at many separate FDLP libraries and all could use a single, clickable link (a DOI) that would get users the copy of that document based on criteria the library defines. For example, one library might have the DOI point to the original first and the local library copy second; another library might point to the "network-closest" copy first and then other more distant copies; and so forth. DOIs do this by storing metadata with the DOI. Rather than storing only the current URL of a registered item, DOIs can record a list of locations with hints for how the resolving client should select a location, including an ordered set of selection methods.
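As a rough sketch of what "a list of locations with hints" could look like in practice, here is a small Python example. The record layout, the `kind` labels, and the `choose_copy` function are my own assumptions for illustration; they are not the actual DOI metadata schema or resolver behavior.

```python
# Hypothetical sketch of one DOI mapping to several copies.
# The record layout and the "kind" labels are assumptions, not the DOI metadata schema.

DOI_RECORD = {
    "doi": "10.99999/example-doc",
    "locations": [
        {"kind": "original", "url": "https://agency.example.gov/doc.pdf"},
        {"kind": "fdsys", "url": "https://fdsys.example.gov/doc.pdf"},
        {"kind": "depository", "url": "https://library-a.example.edu/doc.pdf"},
        {"kind": "depository", "url": "https://library-b.example.edu/doc.pdf"},
    ],
}


def choose_copy(record, preferred_kinds, is_available):
    """Return the first reachable URL, honoring the library's preference order."""
    for kind in preferred_kinds:
        for location in record["locations"]:
            if location["kind"] == kind and is_available(location["url"]):
                return location["url"]
    return None  # no copy could be reached


# Suppose the agency site is down. A library that prefers the original
# falls back to a depository copy without the user doing anything.
down = {"https://agency.example.gov/doc.pdf"}
chosen = choose_copy(
    DOI_RECORD,
    preferred_kinds=["original", "depository", "fdsys"],
    is_available=lambda url: url not in down,
)
```

Each library would supply its own `preferred_kinds` order, so the same DOI could lead to different copies at different libraries while the user still sees one link.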
[Illustration of multiple-URL resolution not reproduced here.]
This solution would have the added benefit of enabling and facilitating a true digital depository system in which digital information is deposited into FDLP libraries. FGI is a strong advocate of a depository system that does this for several reasons that we have described repeatedly here and in our writings and presentations. In brief, we believe that this would make it possible for individual FDLP libraries to build their own local digital collections focused on the needs of their own user communities; it would aid preservation by ensuring that multiple copies exist under different technical, financial, and administrative structures; and it would create a better user experience by providing a way to integrate digital FDLP/Title-44 documents with non-Title-44 federal documents, documents from state and local governments, and other non-government information. DOIs would, therefore, facilitate preservation as well as access.
Problem: Guarantee Authenticity. How does a user know that the document they just retrieved is "authentic," that it has not been altered, that it really is what it purports to be? Many people hope for a technological solution (e.g., PKI, time stamps, encryption, digital signatures, watermarks). We at FGI believe that these are techniques that people use and that the authenticity comes, not from the technique, but from users' trust in the people who set up the techniques.
No one explained this better than Abby Smith ("Digital Authenticity in Perspective," in "Authenticity in a Digital Environment," Council on Library and Information Resources, Publication 92, May 2000). She noted that, when technologists were asked about how to establish the authenticity of a digital object, they were skeptical of technological "solutions" and said that "there is no technological solution that does not itself involve the transfer of trust to a third party."
Solution: Trust is a social phenomenon, not a technical one. So, imagine how this might work. Imagine a document that is in FDsys, and in the digital collections of several FDLP libraries, and also at the New York Times, and at any number of other places on the web. There might be a dozen URLs for that one document. But, if GPO assigned a single DOI to it and made sure it pointed to FDsys and to "Official Depository Copies" at FDLs, that one DOI would, by definition, point to "authentic" copies -- the original and those officially deposited in Title-44-authorized Federal Depository Libraries. The "prefix" part of a DOI refers to the registering agency (in this case GPO) and would further help "brand" the DOI as authentic. Users wanting "authentic" government information would look for DOIs bearing the GPO prefix -- and they would find what they wanted with a single click, no matter where the particular copy they get is stored. (In addition, the DOI metadata can include authenticity information.)
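The "look for the GPO prefix" step is simple enough to sketch in Python. A DOI has a prefix beginning with "10." and a suffix, separated by a slash. The value of `GPO_PREFIX` below is invented for illustration and is not a prefix actually assigned to GPO; the function names are hypothetical too.

```python
# Hypothetical sketch: GPO_PREFIX is a made-up value, not a prefix actually assigned to GPO.

GPO_PREFIX = "10.99999"


def split_doi(doi):
    """Split a DOI into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a well-formed DOI: {doi!r}")
    return prefix, suffix


def bears_gpo_prefix(doi):
    """True when the DOI was registered under the (assumed) GPO prefix."""
    prefix, _ = split_doi(doi)
    return prefix == GPO_PREFIX


is_gpo = bears_gpo_prefix("10.99999/fdlp-doc-1")
is_other = bears_gpo_prefix("10.12345/some-other-doc")
```

Note what this check does and does not do. It tells the user who registered the identifier; it proves nothing about the bytes of the document. That is consistent with the argument above: the prefix is a brand, and the trust comes from the organization behind it.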
Precedents. GPO would not be alone in using DOIs. Who uses DOIs? ICPSR, OECD, the European Communities' EU publications office, CrossRef, and many others.
Barriers. The main barrier I can see to adopting DOIs is cost. I assume that implementing DOIs will cost more than implementing PURLs. But the two costs cannot be compared directly because they buy different things. Implementing PURLs gets us a fragile redirection system. Implementing DOIs gets us a redirection system of persistent identifiers, the ability to have multiple redirects to multiple copies, and a new way of thinking about authenticity.
I welcome comments and responses to my question and particularly hope that those with more knowledge in this area will fill in the gaps I have left.