Introduction
In our previous post about GPO’s National Plan for Access to U.S. Government Information: A Framework for a User-centric Service Approach To Permanent Public Access, we suggested two small changes to the Plan that we believe would improve it by giving it a clearer focus (on preservation for communities of users) and a wider scope (to include the FDLP as well as GPO). We believe this would provide a foundation for strategic planning for both GPO and the FDLP that would create a truly collaborative infrastructure for a 21st century digital FDLP.
With such a plan in place, we believe that GPO and FDLP together can accomplish many things:
- Preserve the record of government.
- Make the individual documents and bulk data discoverable, deliverable, readable, usable, and re-usable.
- Address the different needs of different communities that search for, acquire, and use government information in different ways.
- Provide a succession plan for FDLP content hosted by GPO so that the content will not disappear if GPO is cut or eliminated.
- Provide individual FDLP libraries with value through the collaborative actions of all FDLP libraries.
The strategic planning process can design the specific goals and objectives that will create curated and preserved collections and provide the services for those collections that meet the needs of a wide variety of Designated Communities. These will not be just archived copies, but copies designed for use and re-use by specific communities of users. This will provide more than simple "access" (a paper-and-ink concept): it will provide government information that people can consume, interact with, use, and re-use, better informing the public and the democratic process.
This kind of strategic planning would focus on people and their information needs rather than on simply storing bits and providing undifferentiated "access" to those bits.
The “Desired Outcomes And Actions” section of the Plan will have to be evaluated to determine which items match a Mission focused on FDLP as well as GPO and which items match the functional requirements of OAIS. Before doing that, the current environment within which strategic planning will take place must be evaluated.
SWOT Analysis
While a Strategic Plan can provide the strategic planning process with a Mission and Values and broad Goals, the selection of more specific goals and the objectives to achieve those goals takes place in the real world of strengths, weaknesses, opportunities and threats (SWOT).[1] At every stage of planning over time, a good planning process will assess each of those to identify the best, most effective next steps.
We offer here an informal and preliminary overview of the SWOT for long-term free public preservation, access, and use of federal government information.
Strengths
- FDLP Libraries. There are 1,155 FDLP Libraries. GPO already has more than one thousand partners. The existing partnership between FDLP libraries and GPO is legislatively mandated (we can’t stress enough how important this is!) and has a long history of successfully providing collections and services to local communities. These libraries are already preserving the FDLP Historic Collections. They are at least modestly staffed with one or more trained librarians with a dedication to government information.
- FDLP Libraries that select a high percentage of available FDLP items. These include 47 Regional libraries (that select 100% of FDLP items) and 87 other libraries, mostly academic, that select 70% or more of FDLP items.[2]
- FDLP Libraries with an interest in Digital Preservation. In spite of GPO’s continued, long-term disregard of digital deposit, a significant number of FDLP libraries remain interested in receiving digital documents on deposit.[3]
- FDsys/GovInfo.gov. GPO’s Federal Digital System (FDsys) content management system, designed using the OAIS standard, and its new user interface/search service, GovInfo.gov, are publishing digital content and authenticating some of that content with digital signatures. GPO is working toward getting certification as a Trusted Digital Repository.[4]
- The LOCKSS-USDOCS project. Thirty-six libraries, including 10 regional depositories, are already preserving all FDsys collections and participating in what amounts to a de facto system of Digital Deposit. The number of libraries participating in the LOCKSS-USDOCS project has more than doubled since its inception in 2010.
- Existing Catalogs. The Monthly Catalog of United States Government Publications was printed between 1895 and December 2004. The Catalog of Government Publications (CGP) was originally an online counterpart of the Monthly Catalog and includes records from as far back as July 1976. CGP is now GPO’s Integrated Library System and as complete a “National Bibliography of U.S. Government Publications” as currently exists. It includes records for publications of all three branches of government and provides links to those that are available online.
- Existing metadata. Every agency creates and stores metadata about its own publications. Although the quality, format, style, and scope of this information vary wildly, its existence provides an essential, if primitive, component of a comprehensive catalog of all government information.
- Existing GPO Partnerships. GPO currently lists eighteen partnerships that provide permanent public access to content for the benefit of the public and the FDLP.[5] These include some large and significant non-government partners (e.g., the University of North Texas (UNT) Cybercemetery) and government partners (e.g., GAO, NASA).
- Significant Others. There is a growing number of projects outside of FDLP that provide government information online. These include many digitization projects listed on the digital registry, digitization archives that include both government and non-government information (e.g., HathiTrust and Internet Archive), collections built around themes (e.g., the National Security Archive at George Washington University and the Technical Report Archive and Image Library (TRAIL)), collections built around data (e.g., demographic and related data at the University of Minnesota), preservation projects (e.g., LOCKSS-USDOCS), web harvests (e.g., the End of Term web archives and GPO’s own foray into web harvesting), collections of hard-to-get government information (e.g., Congressional Research Service reports), services built with government data (e.g., election data at MapLight), and tools to make government information easier to use (e.g., GovTrack).
Weaknesses
- No union catalog of the Historic Collections. There is no catalog of printed federal government publications that is complete and comprehensive, and much of the available metadata is less than adequate. This means that there is no way to tell how many documents are in the FDLP Historic Collections, or how many FDLP libraries have a copy of each document, or the condition of those copies. This information is critical to building a distributed, collaborative system of preservation and access for paper and microform documents that were distributed to FDLP libraries and to providing access and authentication to any digitized versions of those documents. A digital version of a union catalog is also vital to the development of a comprehensive system of linked data.
- No National Bibliography of Digital Government Information. For the most part, the more than four hundred executive agencies (and more than 50 independent commissions) are not currently facilitating the building of a comprehensive national bibliography of born-digital government information. Almost no agencies share their metadata with libraries or with GPO for the creation of a national bibliography.
- The majority of digital government information is now "fugitive". The vast majority of digital government information created by federal executive agencies that is within scope of the FDLP is "fugitive" information (not sent to GPO by the agencies and not cataloged in the CGP even though it is technically within the scope of Title 44 of the U.S. Code). The fugitive document problem continues to expand exponentially. Executive agencies are given wide leeway in their publishing practices under the Office of Management and Budget (OMB) Circular A-130, which in practice undermines and erodes the national bibliography, and GPO does not have the authority to require Executive Agencies to deposit their content with GPO.[6]
- Absence of standards. Federal agencies lack adequate digital publishing standards. The absence of such standards makes it more difficult for libraries (and GPO) to identify, describe, harvest, version, and preserve content.[7]
- Lack of preservation repositories. The federal government does not have a certified Trusted Digital Repository.[8] Existing .gov digital repositories are limited in scope and most have few or no preservation or long-term access policies; few agencies have a specific Congressional mandate to preserve their content for long-term access and use. The activities of the existing repositories are not coordinated with each other. Note that even the LOCKSS-USDOCS project, which is built using the same software that has received very high marks in the TDR audit of the CLOCKSS archive, does not have official GPO partnership status.
Opportunities
- Agency Cooperation. A few agencies have deposited digitized content into FDsys.[9] Other agencies have agreed to self-preserve their own publications.[10] The Department of Energy’s Office of Scientific and Technical Information (OSTI) allows libraries to download MARC metadata in bulk. Successes of such cooperation could be a model for other agencies.
- Open Government. There is a robust and growing "Open Government" movement that includes initiatives inside the government (such as data.gov) and non-commercial activists, programmers, and researchers (e.g., the Sunlight Foundation, Civic Impulse, LLC, and Open The Government).
- New Initiatives. Large and small scale digital preservation projects offer new opportunities for collaboration and sharing the costs of building and maintaining an infrastructure while increasing visibility and utility. Examples include the Digital Preservation Network, the MetaArchive Cooperative, the National Digital Information Infrastructure and Preservation Program, and the Digital Public Library of America. GPO has already partnered with DPLA and is one of fourteen DPLA Content Hubs. There is also a growing interest in preservation of digital government information; see, for example, the 2014 Center for Research Libraries (CRL) event Leviathan: Libraries and Government Information in the Era of Big Data, this month’s Digital Preservation of Federal Information Summit, and GPO’s own proposal for a Federal Information Preservation Network (FIPNet).
- Tools. We are blessed with a proliferation of digital tools for building digital collections, web harvesting (an area with a vibrant and growing group of libraries and archives in the International Internet Preservation Consortium (IIPC)), creating and maintaining permanent URLs, digitizing print, etc.
Threats
- Privatization. There are repeated calls for "commercial" solutions to government information distribution.[11] This is a threat to long-term, free, public access for several reasons. It can result in information only being available for a fee when government agencies attempt to be "self-supporting" or just "recover costs" and when private companies succeed in getting exclusive distribution rights. It can result in restrictions on use and re-use of information when technological, legal, or licensing restrictions are used to protect the information "commodity." It can result in complete loss of access when government information is removed from the market when business models fail or change or when there is simply not enough demand.
- Loss of information. Born-digital information cannot be preserved by accident. Preservation must be intentional and explicit and ongoing, or information will be lost. We must assume that, in the absence of a government-wide preservation plan, born-digital information is being lost every day.
- Link Rot and Content Drift. These two issues are real and ubiquitous in the .gov domain space. When content moves and links break, content is effectively lost even if it still exists. When content changes over time, information is lost even if the link still appears to work. Without intentional and systematic preservation practices, including provisions for versioning and permanent links, information will be lost.[12]
- Small government. Keeping all digital government information solely in the hands of government agencies is very risky at any time but especially so in our current political environment. At a time of calls for "small" government, reduced deficits, and spending reductions to offset any new spending, we should consider that there are significant risks that existing government information will be taken offline because of political decisions. At a time when major, visible, essential infrastructure projects like bridges and water systems go unfunded and underfunded, we should consider government information at risk. When huge agencies such as the Department of Education, the Department of Energy, and the Environmental Protection Agency (EPA) are under threat of being abolished, we cannot assume that a small agency like GPO will, regardless of its own intentions, be able to provide "readily discoverable and free public access to Federal Government information, now and for future generations."
- Library priorities. There are too few libraries and librarians working in digital collection development, which includes both digitization *and* collecting born-digital documents. Library directors publicly complain that FDLP is an "unfunded mandate," insist on discarding their collections (or worse, relegating them to off-site storage whether or not they’re cataloged!), threaten to withdraw from FDLP, and rate government information as a low priority. Some libraries want "incentives" for participating in the FDLP rather than seeing participation as a good for the institution and its primary users, the General Public.
- "An alarmingly casual indifference to accuracy and authenticity." While many libraries are anxious to replace printed collections with digital surrogates, they seem to be willing to do so without regard to the completeness or accuracy of those surrogates. This could easily result in discarding of paper collections without having adequate numbers of copies.[13]
SWOT Conclusions
What can we learn from even this informal SWOT analysis?
First, the threats (and consequently, the risks of loss of information) are real and severe. Loss is occurring and, without action, loss will continue.
Second, the in-place strengths and existing opportunities can be used to ameliorate those threats. The threats are not insurmountable. They are not technological. The digital evolution of information dissemination does not make loss of information inevitable.
Third, although loss of information is not inevitable, it will take action to prevent it. The technological tools and standards exist already. We already have existing models of digital preservation and collaboration. We even have the beginnings of an infrastructure for a truly collaborative 21st century digital FDLP (an FDLP ecosystem, if you will).[14]
The conclusion is obvious and compelling: Since the threats are real and the solutions are at hand, FDLP libraries will either choose to do something to mitigate the risks and preserve this vital information, or they will choose to do nothing and risk losing this information.
Endnotes
1. GPO has used SWOT analysis before. See Recommendations of the Depository Library Council to the Public Printer 2010-2014; Federal Depository Library Program Strategic Plan, 2009-2014; Shaping and Transforming the Future of the FDLP Summary of the Regional Meeting October 2003, Arlington, VA; and SWOT: Strengths, Weaknesses, Opportunities and Threats Prepared by The Regional Planning Committee for the Fall 2003 Regional Meeting.
2. We chose 70% as an identifier of libraries with large, historic physical collections as the ones most likely to be interested in participating in both a preservation network for historic documents and digital collection development going forward.
3. It is almost impossible to get an accurate sense of how many FDLP libraries might be open to digital deposit because GPO has neither offered digital deposit as an option, nor consistently asked comparable questions over time in the biennial surveys of FDLP libraries. Nevertheless, responses to biennial survey questions have shown substantial (if far from universal) interest in digital deposit, even in the absence of specific GPO proposals or policies. At least 25% and sometimes as much as 37% of FDLP libraries have expressed an interest in digital deposit, and between 100 and 200 libraries are already downloading digital documents and hosting them locally.
4. See: GPO’s 2015 National Digital Stewardship Resident (August 25, 2015).
5. Specific agreements with partners may vary, but the sample partnership agreement for content specifies "permanent public access" and that any termination of the agreement will result in transferring the collections and metadata to GPO. Note, however, that, in its public list of requirements, GPO includes "System Admin/Security" but not preservation or long-term access.
6. Born-Digital U.S. Federal Government Information: Preservation and Access. 2014.
7. The Office of Management and Budget does have Information Policy regulations and is in the process of revising its Circular A-130 "Management of Federal Information Resources," to be renamed Managing Information as a Strategic Resource. FGI submitted comments on the revision which suggested that each agency should be required to have an “Information Management Plan” for the public information it acquires, assembles, creates, and disseminates and be required to make its own website compatible with a few basic, consistent requirements to make it easier for the public to discover, acquire and use its information.
8. Certification as a Trusted Digital Repository is based on OAIS (Reference Model for an Open Archival Information System), TRAC (Trustworthy Repositories Audit & Certification: Criteria and Checklist), and TRAC’s successor, TDR (Audit And Certification Of Trustworthy Digital Repositories, also known as Space data and information transfer systems - Audit and certification of trustworthy digital repositories).
9. The Department of Education has deposited ERIC reports, the Government Accountability Office has deposited older GAO Reports and Comptroller General Decisions, the Treasury Department has deposited several series, and the Administrative Office of the United States Courts has deposited United States Courts Opinions.
10. Agencies that have signed agreements with GPO guaranteeing that they will provide permanent public access to content include: NASA (NASA Technical Reports Server), the Department of Labor (Davis Bacon wage determinations), the Federal Reserve Bank of St. Louis (FRASER), the Government Accountability Office (GAO Reports and Comptroller General Decisions databases), the National Library of Medicine (publications and other data files), the National Renewable Energy Laboratory (laboratory and outreach publications), and the Naval Postgraduate School (Homeland Security Digital Library).
11. See: Privatization of GPO, Defunding of FDsys, and the Future of the FDLP.
12. See, for example, Thoughts on Referencing, Linking, Reference Rot by Herbert Van de Sompel; Temporal Context in Digital Preservation; Link Rot up to 51% for .gov domains; Government Link Rot; and Another study of link rot and content drift.
13. Diana Kichuk documented problems with inaccurate metadata and OCR digitized text in her article “Loose, Falling Characters and Sentences: The Persistence of the OCR Problem in Digital Repository E-Books” (Portal: Libraries and the Academy 15, no. 1 (2015): 59–91. doi:10.1353/pla.2015.0005). Paul Conway’s examination of page images in the HathiTrust showed that 35% of the volumes examined were not reliable surrogates (“Preserving Imperfection: Assessing the Incidence of Digital Imaging Error in HathiTrust”). See also: "An alarmingly casual indifference to accuracy and authenticity." What we know about digital surrogates.
14. Letter to Deputy CTO Noveck: “Open Government Publications” by James R. Jacobs (April 19, 2010).
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.