Happy 2020! Now that we’re starting a new decade(!) — and GPO has set up a working group to study and consider digital deposit and Depository Library Council (DLC) will soon announce its PURL working group! — it is time for FGI to make its new year’s resolutions and envision a new agenda for a new Federal Depository Library Program (FDLP). This new digital FDLP will focus on the digital needs of users by building digital services based on digital collections. It will lead the way for libraries of all kinds, showing the value of digital libraries in the twenty-first century.
We recognize that there are challenges to moving the two-hundred-plus-year-old FDLP, made up of some 1100+ individual libraries and with its roots deeply embedded in the slow-changing world of paper and ink publications, into a dynamic, rapidly evolving digital era. But we believe it is possible to do this if we recognize the challenges and develop an agenda based on clear goals and objectives.
We believe there are two distinct but related challenges to address. The good news is that the FDLP is ideally situated to address both.
Preservation Gaps. The first challenge is that too much valuable government information is not being preserved. For a variety of reasons,1 GPO is not able to obtain most of the government information that the law authorizes it to distribute and provide online access to. GPO cannot preserve what it cannot obtain. This is made worse by the fact that even more of the information produced by the federal government is out of the scope of GPO and the FDLP. This means that, even if GPO were able to collect and preserve every document within its remit (and it is not able to do so), there would still be enormous volumes of valuable government information uncollected and unpreserved.
Hundreds of FDLP libraries can help with this, given the right agenda and tools.
User Needs. The second challenge is that the needs of users of government information are not being adequately met.
GPO’s govinfo.gov portal is valuable because it is designed as a preservation system — and has received ISO 16363 certification as a Trustworthy Digital Repository! — and because it reveals the context of individual government documents by organizing them by provenance.
While this approach is essential, the comprehensiveness and organization can be daunting for users. Each user brings their own needs and approaches to information discovery and use. Although most users will find govinfo.gov’s scope, organization, and user interface helpful (and even essential) at some point, few will find it a useful starting point. It is too much to expect GPO to develop user interfaces or sub-collections for every possible discipline, subject, and user interest.
Again, hundreds of FDLP libraries can help build collections and services to meet user needs, given the right agenda and tools.
Mission of the FDLP
We propose a new mission statement for the FDLP:
- FDLP will meet the long-term needs of its Designated Communities by ensuring the long-term preservation of and free access to curated collections of government information
The mission requires goals that focus on the challenges of user needs and gaps in preservation.
- Preserve the record of the legislative, executive, and judicial branches of the U.S. Government.
- Provide a complete inventory of preserved government information.
- Enhance access and usability of the information by providing collections and services for specific communities of users.
The Mission and Goals set the stage for an agenda of specific Objectives.
1. GPO will create Digital Object Identifiers (DOIs) for every physical and digital object within the scope of its collection.
We are recommending that GPO replace PURLs with DOIs and expand their use to cover non-digital information. There are several good reasons for making this change2, but perhaps the most important reason is that it will have a conspicuous advantage for users: a single DOI can point to multiple copies of an item.3 PURLs cannot do that. This “multiple resolution” feature of DOIs is perfect for a digital-deposit FDLP. One DOI could point to the original source of a document, the GPO copy, and every FDLP-library copy (including holdings of paper copies). No longer would links fail as they can with GPO’s current single-point-of-failure model.4 And, by pointing to authenticated/deposited paper and digital copies and by using the functionality that DOIs provide (functionality that PURLs do not have), this will enable GPO and FDLP to:
- Update the Catalog of Government Publications (CGP) to be an inventory of the Complete Collection (paper and digital, “historical” and current).
- Provide pointers to every digital copy within the FDLP.
- Provide access to non-digital copies by pointing to holdings records. (This will integrate paper and digital and make it easier for users to find the most appropriate copy for their needs.)
- Ensure access in the event of any GPO budget short-fall, policy changes, etc.5
- Enhance public access and usability (including a ‘virtual Content Delivery Network (CDN)’ allowing users to get a ‘local’ digital copy for faster access).
- Ensure the authenticity of every copy through deposit (and metadata6).
- Provide tools for dealing with editions, versions, and mementos (basically historic copies held in Web archives around the world).
- Make it easier for users to cite and use official (authentic) digital objects.
2. LOCKSS-USDOCS program will make its replicated GPO content publicly accessible.
Currently, the LOCKSS-USDOCS project essentially backs up the contents of GPO’s govinfo.gov into multiple “dark archives” (digital archives that are not publicly accessible until a trigger event such as catastrophic infrastructure failure). We are proposing that the LOCKSS-USDOCS community work to make as much of that replicated content publicly accessible as possible.
This would include adding the URLs for public LOCKSS-USDOCS copies to the DOIs for those items created by agenda item 1.
Note that making LOCKSS collections public will create "complete" digital collections that play a role analogous to that of the "complete" paper collections of Regional Depositories. This duplication will facilitate preservation and enhance access for user communities.
Implementation of this objective will probably roll out in phases. For example:
- A first phase might make a cache of the replicated GPO content publicly accessible in the Internet Archive and add the URLs of those items to the DOIs created by agenda item 1.
- A second phase might simply update the DOIs to include the public links of LOCKSS-USDOCS copies as individual LOCKSS libraries make them available.
- Another phase might have some LOCKSS libraries replicating the GPO govinfo.gov interface.
- Another phase might have other LOCKSS libraries creating alternative user-interfaces (i.e., different from govinfo.gov) to their collections. (E.g., enhanced searching; enhanced browsing by agency, subject, persons, places, etc.)
- Another phase might have LOCKSS libraries developing APIs that allow other non-LOCKSS libraries to easily link to subsets of the LOCKSS-USDOCS collections. This would enable non-LOCKSS libraries to create smaller "virtual" subject collections designed to enhance access for specific communities of users.
3. GPO will develop robust selection tools so that FDLP depositories can easily build digital collections that better address the needs of their specific communities of interest.
There is no longer a need for the cumbersome and imprecise “Item Numbers” of the paper deposit era. In the digital age, digital tools can precisely target the needs of a particular community using multiple criteria (e.g., format, subject, agency, etc.). This will enable depositories to create selective digital collections that are more easily used than comprehensive collections at GPO and LOCKSS. Such collections can provide easier browsing and searching by subject, person, region, discipline, data type (e.g., GIS, data, text), and use-case (e.g., reading individual docs, computational analysis, building personal collections, etc.).
Robust selection tools will enable FDLs to select and easily acquire just those digital objects that meet dynamic profiles that are as simple (e.g., all PDFs or all technical reports) or as complex (e.g., all GIS files with aquifers for the Southwest) as the depository wishes to create. And by having their own copies of digital objects, FDLs will be able to manage (e.g., divide, combine, arrange, tag, index, label, etc.) their selections into collections that meet the specific discovery and use-needs of the communities they serve.
4. GPO and FDLP libraries will work together to extend the reach, scope, and mandate of the FDLP.
With goals 1-3 in place, the FDLP will replicate the original FDLP model of providing comprehensive, long-term preservation and access along with user-focused collections to meet the fast changing needs of many different communities of interest. Goal 4 addresses the preservation gaps. This will empower and facilitate FDLP libraries to collect and preserve (without unnecessary duplication) government information that is currently being missed. This will include “fugitives” (information within GPO’s scope, but beyond GPO’s ability to collect), and digital information that is valuable but out of the scope of GPO’s legal remit (e.g., records and documents released through FOIA).7
- GPO will provide tools to help depositories identify and harvest relevant government information that GPO is not harvesting.
- GPO’s own contacts, tools, and staff can identify harvest targets
- GPO can upgrade AskGPO to facilitate sharing of harvest targets
- GPO can provide and coordinate original cataloging
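As a rough illustration of the "dynamic profiles" described under agenda item 3, the following sketch shows how a depository's selection profile might be matched against catalog records. Everything here is an assumption made for illustration: the `DigitalObject` and `SelectionProfile` classes, their field names, and the sample records are hypothetical and do not describe any existing GPO tool or metadata schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DigitalObject:
    """Hypothetical catalog record; the field names are illustrative only."""
    identifier: str
    file_format: str
    agency: str
    subjects: frozenset = frozenset()
    region: str = ""


@dataclass(frozen=True)
class SelectionProfile:
    """A profile matches an object when every criterion that is set matches."""
    name: str
    file_formats: frozenset = frozenset()  # empty means "any format"
    subjects: frozenset = frozenset()      # every listed subject must be present
    regions: frozenset = frozenset()       # empty means "any region"

    def matches(self, obj: DigitalObject) -> bool:
        if self.file_formats and obj.file_format not in self.file_formats:
            return False
        if not self.subjects <= obj.subjects:
            return False
        if self.regions and obj.region not in self.regions:
            return False
        return True


def select(profile, catalog):
    """Return the identifiers of all catalog objects the profile selects."""
    return [obj.identifier for obj in catalog if profile.matches(obj)]


# Made-up sample records, not real documents.
catalog = [
    DigitalObject("obj-1", "pdf", "USGS", frozenset({"aquifers"}), "Southwest"),
    DigitalObject("obj-2", "gis", "USGS", frozenset({"aquifers"}), "Southwest"),
    DigitalObject("obj-3", "gis", "USGS", frozenset({"aquifers"}), "Northeast"),
    DigitalObject("obj-4", "pdf", "GAO", frozenset({"budget"})),
]

# A simple profile ("all PDFs") and a more complex one
# ("all GIS files with aquifers for the Southwest").
all_pdfs = SelectionProfile("all PDFs", file_formats=frozenset({"pdf"}))
southwest_aquifer_gis = SelectionProfile(
    "Southwest aquifer GIS",
    file_formats=frozenset({"gis"}),
    subjects=frozenset({"aquifers"}),
    regions=frozenset({"Southwest"}),
)

print(select(all_pdfs, catalog))               # ['obj-1', 'obj-4']
print(select(southwest_aquifer_gis, catalog))  # ['obj-2']
```

The point of the sketch is only that a profile combines several independent criteria, which the single-axis “Item Numbers” of the paper era could not do.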
5. GPO will create an FDLP Digital Technology Committee* to develop, customize, and host digital FDLP tools.
The committee will work with the open source communities to provide tools customized for the FDLP. It will choose tools that enable FDLs to install, maintain, and customize the software for use in building digital collections and services. Scope includes tools for harvesting and ingesting, managing, and preserving digital objects. It also includes tools for building customized user-interfaces for collections, including building ‘virtual libraries’ (user-interfaces to digital objects stored by other libraries [e.g., GPO, LOCKSS, etc.] using metadata and DOIs).
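To make the ‘virtual library’ idea a little more concrete, here is a minimal sketch of how a user interface might choose among the several copies that a single DOI could point to under the multiple resolution proposed in agenda item 1. It is not the DOI system's actual API: the `Copy` class, the `resolve` function, the DOI string, and all holders and URLs below are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of an item that a DOI record points to (illustrative only)."""
    holder: str             # who holds this copy
    kind: str               # "digital" or "paper"
    location: str           # URL for digital copies, holdings record for paper
    available: bool = True  # False simulates an outage at that holder


def resolve(copies, preferred_holders=()):
    """Return the reachable digital copies, preferred holders first.

    Copies from holders that are not in preferred_holders keep their
    original relative order because Python's sort is stable.
    """
    digital = [c for c in copies if c.kind == "digital" and c.available]
    rank = {holder: i for i, holder in enumerate(preferred_holders)}
    return sorted(digital, key=lambda c: rank.get(c.holder, len(rank)))


def paper_holdings(copies):
    """Return the paper copies, which point to holdings records."""
    return [c for c in copies if c.kind == "paper"]


# Hypothetical registry: one made-up DOI pointing to four copies.
registry = {
    "10.99999/example-report": [
        Copy("Issuing agency", "digital", "https://agency.example/report.pdf",
             available=False),  # the original source has gone offline
        Copy("GPO", "digital", "https://gpo.example/report.pdf"),
        Copy("Library A", "digital", "https://library-a.example/report.pdf"),
        Copy("Library B", "paper", "holdings:library-b/example-record"),
    ]
}

copies = registry["10.99999/example-report"]
local_first = resolve(copies, preferred_holders=("Library A",))
print([c.holder for c in local_first])             # ['Library A', 'GPO']
print([c.holder for c in paper_holdings(copies)])  # ['Library B']
```

In this sketch the link keeps working when the original source disappears, and a library's interface can prefer its own ‘local’ copy, which is the behavior the agenda hopes multiple resolution will provide.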
6. GPO will create an FDLP Collaborations Committee*.
Its purpose will be to work with FDLP and non-FDLP digital collections. Its goal will be to enhance access to existing collections (e.g., LOCKSS, FDLs, HathiTrust, IA), provide support to collaborative projects, and provide support for digitization projects.
7. GPO will create an FDLP Partnerships Committee*.
Its purpose will be to provide training and to facilitate the creation of partnerships between individual FDLs and individual government agencies for preservation and long-term access.
*These committees will be made up of FDLP community members and GPO staff.
Results: Enhanced Preservation and Access
GPO’s recent certification as an OAIS-compliant Trusted Digital Repository was a big milestone for preservation. GPO even has a succession plan, as required by OAIS. The succession plan is “a signed agreement with the successor (NARA in this case)” but, as far as we know, GPO has not made the details of the agreement public. Even so, GPO needs more than another under-funded federal agency as insurance against under-funding, intentional policy changes, or technological failures. Luckily, it has just what it needs in the FDLP! With digital deposit and a clear agenda for preservation, FDLs will provide both preservation and online access whenever anything prevents GPO from doing so.
Although GPO has done an excellent job of providing “access” to its digital collection through govinfo.gov, more is needed. “Access” in the digital age means more to users than simply reading or downloading a single document at a time (no matter how difficult it is to find the document one needs). GPO’s govinfo.gov fills an important, even essential, need. But GPO’s interface at govinfo.gov and its collection parameters present users with two barriers to finding what they need. First, the size of the collection and its organization (by producer) make it difficult to find what you need unless you already know what it is. Second, the gaps in coverage (govinfo is primarily Congressional information with only a small amount of executive and judicial content) mean that it is likely that what you need isn’t there. This makes it difficult for users to justify using govinfo.gov as a starting place.
In addition, the single-document-centric retrieval strategy, while well suited to print publications, is no longer adequate to deal with the digital age of multi-part documents, audio-visual files, datasets, series, and aggregate publications (like proceedings, laws, regulations, and judicial decisions). In the digital era, retrieving multiple documents at once or parts of long documents should be as easy and as routine as retrieving single “volumes.”
We cannot expect GPO, with its limited resources and limited mandate, to provide user-interfaces to meet the very different needs of vastly different user communities. But this is exactly where FDLs can complement (not duplicate) the services that GPO provides. With digital deposit and a clear agenda, FDLs can build collections that GPO cannot and can add value to those collections that GPO cannot.
Financing the Agenda
You may think that we forgot the biggest challenge: funding. We left funding out of the challenges above intentionally. Instead of assuming reduced funding and seeking ways to adapt, we choose to set our goals and objectives first. We believe that having a visionary agenda will attract funding. If the agenda is adopted and pursued aggressively, we believe it will drive increased traditional funding as well as new sources of grant funding.
In recent years, many libraries have faced inadequate budgets for two reasons. First, they have failed to provide an aggressive digital agenda that provides value to their constituencies. Second, they have promised reduced budgets based on the economies of cutbacks by focusing on "services without collections." We believe that these strategies are short-sighted and unsustainable.
The agenda we propose takes the opposite approach. It starts with users and their information needs and envisions libraries selecting and acquiring the digital information those specific communities need and then building digital services with those collections that meet the discovery and use-needs of those communities.
The result of this approach is that, instead of each library seeking funding in isolation and even in competition with other libraries, hundreds of libraries will be working toward a common purpose. Each improvement in each depository library will further the collective goals and will benefit all libraries. And each FDL will be able to participate at a level that matches its resources and the needs of its own communities. In this environment, those who fund FDLs will be able to see that they are getting more for each dollar they invest than they would otherwise. And they will see that they get more value than they are getting now because they are building digital collections that are controlled by the funded libraries, not just lists of pointers to content that may move, change, or disappear.
This, we believe, will lead to a turnaround: budgets going up instead of down. As FDLs demonstrate the value of this approach, all libraries will have a model to follow, first for public domain content, then for Open Access content — and beyond.
James A. Jacobs, University of California San Diego
James R. Jacobs, Stanford University
1. See, for example, what the Library of Congress said in its 2018 Report on Disseminating and Preserving Digital Government Information.
2. The PURL (“Persistent Uniform Resource Locator”) technology was designed in the early days of the web as an “intermediate step,” a “stopgap measure” on the way to the creation of truly permanent links to digital resources (Lynch, Library of Congress, Weibel). Unlike DOIs, PURLs lack a robust institutional infrastructure, and support has been inconsistent over the years, moving among several different institutions (e.g., OCLC, Zephir, Internet Archive, and Google) (Wikipedia). Without a strong institutional infrastructure, PURL service can be interrupted, as it was for weeks in 2015 when OCLC’s server experienced problems and for weeks in 2009 when GPO’s PURL server crashed. PURLs can only resolve to URLs while DOIs can resolve to objects, metadata, and more. PURLs can resolve only to a single copy, while DOIs can resolve to many copies (“multiple-resolution”). PURLs are not “global” (as DOIs are), meaning that each PURL is tied to a single resolver. As a result of these problems, PURLs are not widely used while DOIs are, by far, the most used persistent identifiers. For more details, see 20 Years of Persistent Identifiers–Which Systems are Here to Stay?, and Competitive Evaluation of PURLs, and GPO’s own evaluation of “Handles” (DOIs are based on the Handle technology): GPO’s Use of Handles.
3. DOI Handbook: 3.3 Multiple resolution. International DOI Foundation (March 21, 2017).
4. See, for example, what happens when the government shuts down. Luckily we have not had a total failure, yet. But the failures we have experienced demonstrate the risk of relying on a single system (govinfo.gov) for access to the record of our democracy.
5. Note that Chapter 41 of Title 44, which is the only legal basis for govinfo.gov, is vague and flexible about what government information is covered, does not require long-term preservation, and allows GPO to charge fees for its use. Note also the repeated attempts to privatize GPO.
6. DOI Handbook, 4.3 DOI metadata. International DOI Foundation (March 21, 2017).
7. We’ve already seen that FDLP libraries are willing to help. Fang Gao, GPO’s chief of Library Technical Services, shared the following data with us in an email on 12/27/2019 regarding the number of fugitive documents reported to GPO over the last 3 years. She noted that “the table below identifies the total number of the Fugitive publications/LostDocs documents received for processing through AskGPO system. Fugitive publications may be received for processing through a variety of channels and may not be identified as fugitives. These are processed routinely through the acquisitions/cataloging workflow.” These are encouraging numbers of fugitives reported basically via serendipity. Imagine a collaborative program, similar to the one that the fugitive documents working group (of which James R. is a member!) is in the midst of drafting, in which every depository submitted 10 fugitives per year: those numbers would increase to 11,200!

| Fiscal year | Total | Resolved | Completion % |
|---|---|---|---|
| 2020 | 151 | 127 | 84.1 |
| 2019 | 634 | 607 | 95.7 |
| 2018 | 619 | 617 | 99.6 |
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.