
Author Archives: James A Jacobs

The FDLP Historical Collections

A recent question on the govdoc-l mailing list asked if GPO had ever officially defined the term “legacy collection” or “legacy document” and if the definition goes beyond something that has historical value or importance. I posted a short answer there. Here, I document and explain that brief response.

The term was introduced by the Association of Research Libraries (ARL) and by Superintendent of Documents Judy C. Russell in 2003. The phrase has been used almost exclusively in the documents community in the context of digitizing and discarding FDLP historical paper collections ever since.

Before 2003

Before 2003, documents and articles that discuss collections (even in the context of digitizing them) rarely if ever used the adjective “legacy” to describe FDLP collections. For example, a 2002 GODORT report on digitizing government information did not include the word “legacy” to describe the collections to be targeted for digitization.

I did not find any references to “legacy collections” in DttP or govdoc-l or Google Scholar or Library, Information Science & Technology Abstracts before 2003.

2003: Introduction of a New Term

Judy C. Russell, then GPO Superintendent of Documents, apparently introduced the term in late 2003 in an announcement of an agreement between ARL and GPO to “digitize a complete legacy collection.” GODORT mentioned it at ALA in January 2004 and GPO included it in its Strategic Vision for the 21st Century in December 2004.

Russell also referred to “legacy content” at the Center For Research Libraries Forum on “Building Blocks of a National Print Preservation Network.” And, in a 2005 Dissemination Implementation Plan, GPO referred to the “legacy collection of tangible U.S. Government publications held in libraries participating in the Federal Depository Library Program (FDLP).”

Adoption of the term

After 2004, the mentions of an FDLP “legacy collection” increased. Documents librarians adopted the phrase to refer to the paper (and sometimes microform, and, occasionally, even “tangible” digital) documents that GPO had actually deposited into FDLP libraries.

What is the FDLP “Legacy Collection?”

Russell described the “legacy collection” as “tangible items in your libraries” in her remarks to DLC in April 2004. She also said that the legacy collection of U.S. government documents consisted of “an estimated 2.2 million print publications totaling approximately 60 million pages.” A report of the 2004 GPO meeting of experts on digital preservation described the legacy collection as “U.S. government documents currently held in depositories, estimated to be about 2.2 million items (excluding microfiche).”

It is worth noting here that the term was applied to all paper; there was no singling-out of any documents that would have more historical value or importance. The key to inclusion within the definition of “legacy collection” was, apparently, that they were paper and were in FDLP libraries and were targets of digitization (and, as we will see in a moment, targets for discarding).

Digitization

As noted above, the introduction of the phrase accompanied a plan to digitize the paper FDLP collections. GODORT referred to the initiative as “Digitizing Legacy Federal Documents Collections.” The GPO Strategic Vision described converting “printed legacy documents” into digital format. The Dissemination Implementation Plan enumerated priorities for digitization of the “Legacy Collection.” The purpose of the Experts Meeting was to address digitizing “the entire legacy collection of U.S. government documents.”

The government information community adopted that context along with the phrase. Every use of the phrase that I found was in the context of digitizing paper.

Why “Legacy”?

Why did ARL and Russell choose the term “legacy collection”? Since the use of the phrase was directly and explicitly tied to digitization of those collections, why not describe those collections as “analog materials” or “historical collections” or “paper collections” or even (ugh!) “tangible collections”? Why “legacy”?

The Merriam-Webster dictionary says that the word “legacy” was not used as an adjective until 1990. That use comes not from the libraries but from the computing world. It is used by IT managers to describe software or systems that are outdated and unwanted. Wikipedia says that it is often considered a “pejorative term” and is used to describe systems that are “potentially problematic.” And the New Oxford American Dictionary defines it as “software or hardware that has been superseded.” In practice, IT managers would like to stop supporting “legacy software” and discard it. Sound familiar?

It is, of course, possible that the choice of the term to describe the FDLP Historical Collections was not well thought out and no one intended to imply that the collections are problems that need to be discarded. But it is revealing that GPO’s own 2004 Strategic Vision statement not only used “legacy” to describe “printed documents,” but also said that GPO needed to reduce costs associated with the operation and maintenance of “stand alone, legacy computer systems.” This was not a mysterious, obscure word with an ambiguous meaning — even within the walls of GPO.

Legacy (adj.). Unwanted.

Thus, the use of the term “legacy” as an adjective to describe print FDLP collections reflects a particular attitude (one might even say a bias) about the FDLP Historical Collections. It defines the FDLP Historical Collections as out-of-date, unnecessary, and unwanted. Using this term pre-determines the fate of the collections. Those who use this term are expressly saying that they have already decided that they want to throw the collections away – even if they say that what they want is better access.

Using such terminology helps explain why the discussions about these collections have not focused on their intrinsic value, or their value to specific user communities, or the quality of the digital surrogates being used to replace (not supplement) them. Instead, the discussion has returned to a single question again and again and again: How many copies should we keep? – which is the wrong question.

Digitize and Discard

The phrase fits in well with ARL’s long-term advocacy of digitizing paper collections and then discarding them. See for example its 2008 report in which it proposed “a small number of physical regional legacy collections” and its 2010 report when it recommended that there should be “a distributed system for storage of print legacy collections that involves no more than 15 regionally distributed comprehensive print collections.” These recommendations to discard Historical Collections in order to reduce the number of paper copies in the FDLP are not supported with any evidence that such policies will either meet the needs of our communities or preserve the written record of the government.

Let me be clear. I am not an advocate of saving print collections for the sake of print collections. Tautologies are not useful for planning. But, in the same way, vague promises to enhance access through digitization are also not useful. Vague promises need to be backed up with procedures to minimize the risk of loss of information and long-term planning that provides adequate resources for preservation, access, and service. As James R. Jacobs and I have repeatedly argued (see endnotes), decisions about retention and discarding need to be premised on the needs of our communities and the ability of libraries to preserve and provide free access to the FDLP collections. Just labeling the collections as unwanted and out of date may be a clever way to try to persuade librarians to discard their collections without examining the outcomes of doing so. But labeling without evidence is not an application of Library and Information Science. It is rhetorical misdirection.

Libraries are free to digitize their collections (and they should!). If enhanced access is the goal, this can be done today without unnecessarily discarding a single document. But ARL and their supporters have been adamant that digitization must be linked to “flexibility … for the efficient management of the legacy collections” and reducing the number of print copies by requiring only a “small number of physical regional legacy collections (print and microforms).” And some libraries are using digitization as an excuse and a technique for discarding.

A better term: FDLP Historical Collections

I suggest that librarians use the term “FDLP Historical Collections.”

“Historical” because these documents tell us something about the past. Indeed, these documents are also, in a very real sense, “historic” in that they are the unique official record of our democracy.

“Collections” (plural) because we have many separate collections – not one big one – and we do not have an accurate and complete inventory of holdings across all FDLP libraries that would allow us to call it a single “collection.”

Legacy (noun). Gift, Inheritance.

I think it is fine to use the word “legacy” as a noun when speaking of our historical collections because they have been handed down to us. They are more like a valuable inheritance than an unwanted copy of WordStar. Who will preserve and take care of this legacy? Only FDLP libraries have this as their mission. Only FDLP libraries are responsible for the stewardship of this legacy.

For us to discard those paper publications without ensuring the accurate and complete preservation of the information in them would be to discard a valuable inheritance and ignore our responsibility.

Conclusions

Words matter. Library professionals are supposed to be professional and should be clear and unambiguous when they choose their terminology. This is important when making plans for the future and it is even more important when the planning involves irreversible decisions. Librarians should reject the use of the term “legacy collection” when discussing the FDLP Historical Collections and challenge those who use it.

But choosing a different term is not enough. We should clearly articulate both the inherent value of the FDLP Historical Collections and their specific value to our designated communities.

The documents in the FDLP Historical Collections may not exist anywhere outside of FDLP libraries. Even Judy Russell had to admit that discarding paper collections without a clear preservation and access strategy can be a big mistake. In her remarks to ARL in 2003, Russell said:

Many years ago GPO turned over its historical collection to the National Archives and almost immediately we began to regret the absence of a tangible collection. We have decided to re-establish a comprehensive collection of tangible and electronic documents as a collection of last resort for the program, and the new organization will dedicate staff resources to that effort.

Unfortunately, there has, apparently, been little progress in rebuilding GPO’s paper collection as a Collection of Last Resort. Instead, GPO is actively promoting changes that will make it easier to discard more paper collections.

While individual documents or volumes may exist elsewhere, FDLP libraries have collections that put those individual documents in the context of their provenance. Although casual internet users may not understand the value of context and provenance, librarians do (or should) and researchers require it. Before FDLP libraries use digitization as an excuse and a technique for discarding these collections, librarians should insist on several essential criteria. My colleague James R. Jacobs has developed a preliminary checklist in his What Are We To Keep? (FAQ). Let’s think about that checklist and think carefully before we assign pejorative labels to our valuable legacy.

Endnotes

Association of Research Libraries. 2008. Future Directions for the Federal Depository Library Program (Dec 4, 2008).

Association of Research Libraries. 2010. Statement of Principles on the Federal Depository Library Program (October 2010).

Center for Research Libraries. 2004. Building Blocks of a National Print Preservation Network. Focus on Global Resources, Vol. 24, Num. 1 (Fall 2004).

Depository Library Council. 2004. Advice to the Public Printer (January 22, 2004).

Federal Depository Library Program. 2014. Future Roles and Opportunities: An FDLP Forecast Study Working Paper (March 28, 2014).

GODORT. 2002. Report: Digitization Of Government Information. Ad Hoc Committee on Digitization Of Government Information, Cathy Nelson Hartman Committee Chair (June 14, 2002).

GODORT. 2004. First Steering Committee Meeting Agenda. ALA Midwinter Conference, San Diego, Friday, January 9, 2004.

Jacobs, James A. and James R. Jacobs. 2013. The Digital-Surrogate Seal of Approval: a Consumer-oriented Standard. D-Lib Magazine (2013).

Jacobs, James A. 2015. “An alarmingly casual indifference to accuracy and authenticity.” What we know about digital surrogates. FreeGovInfo (March 1, 2015).

Jacobs, James A. 2015. Legacy collections. “Discussion of Government Document Issues” (25 Jun 2015).

Jacobs, James R. 2014. Why GPO’s proposed policy to allow Regionals to discard is a bad idea. FreeGovInfo (August 27, 2014).

Jacobs, James R. 2015. What are we to keep? Documents to the People (Spring 2015).

Jacobs, James R. 2015. What Are We To Keep? (FAQ). FreeGovInfo (April 30, 2015).

Rossmann, Brian W. 2005. Legacy Documents Collections: Separate the Wheat from the Chaff. DttP: Documents to the People Volume 33, No. 4 (Winter 2005).

Russell, Judith. 2003. Remarks by Judy Russell, 142nd ARL Membership Meeting, Federal Relations Luncheon (May 15, 2003).

Russell, Judy C. 2003. Information Dissemination Operations. Remarks by Judy C. Russell Superintendent of Documents Depository Library Conference/Fall Council Meeting October 20, 2003, Administrative Notes Vol. 24, no. 13 (November 15, 2003).

Russell, Judy C. 2004. Remarks of Superintendent of Documents Depository Library Conference St. Louis, Missouri (April 18, 2004).

U.S. Government Printing Office. 2004. A Strategic Vision for the 21st Century (Dec. 2004).

U.S. Government Printing Office. 2004. Report on the Meeting of Experts on Digital Preservation. U.S. Government Printing Office Washington, D.C. (March 12, 2004).

U.S. Government Printing Office. Office of Information Dissemination. 2005. Information Dissemination Implementation Plan: Priorities For Digitization Of Legacy Collection. Washington, D.C. (September 15, 2005).

DOTD: Recovering from identity theft

The Federal Trade Commission web site IdentityTheft.gov is today’s Document of the Day.

“IdentityTheft.gov, a new website, offers step-by-step checklists of what to do right away, and what to do next, depending on the information that’s been stolen or exposed. It lists warning signs indicating your identity was stolen, and gives websites and phone numbers for organizations you’ll need to reach. And, it has sample letters for disputing fraudulent charges, correcting information in your credit reports, and getting business records relating to the theft.” (New One-Stop Resource for Identity Theft Victims [USA.gov announcement].)

FED study says many in debt, at risk, and cannot retire

Document of the Day. The results of a survey conducted on behalf of the Board of Governors of the Federal Reserve System have been reported in the news recently.

Some of the findings of the report include:

  • Only 53 percent of respondents indicate that they could cover a hypothetical emergency expense costing $400 without selling something or borrowing money.
  • 23 percent of the adult population has some form of education debt.
  • 31 percent of non-retirees have no retirement savings or pension. Among lower-income respondents, 55 percent plan to keep working as long as possible or never plan to retire.

Dodging the memory hole

Abbey Potter’s comments about preserving digital news are also very relevant to the preservation of government information.

Potter is the Program Officer with the National Digital Information Infrastructure and Preservation Program (NDIIPP). In her post on The Signal blog, she elaborates on her closing keynote address at the Dodging the Memory Hole II: An Action Assembly meeting in Charlotte, NC last month.

She quotes a presentation by Andy Jackson of the UK Web Archive in which he addresses the questions: “How much of the content of the UK Web Archive collection is still on the live web?” and “How bad is reference rot in the UK domain?”

By sampling URLs collected in the UK Web Archive, Jackson examined URLs that have moved, changed, or gone missing. He analyzed both link rot (a file gone missing) and content drift (a file that has changed since being archived). He shows that 50 percent of content had gone, moved, or changed so as to be unrecognizable in only one year. After three years the figure rose to 65 percent.
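To make the distinction between link rot and content drift concrete, here is a minimal sketch of how a sampled URL might be sorted into "gone," "moved," "changed," or "unchanged." This is not Jackson's actual method. The `Snapshot` and `LiveResult` structures, the `classify` and `summarize` functions, and the 0.5 similarity threshold are all assumptions made up for illustration, and the text comparison uses Python's standard `difflib` rather than whatever measure the UK Web Archive study used.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Optional


@dataclass
class Snapshot:
    """An archived copy of a page (hypothetical structure)."""
    url: str
    text: str


@dataclass
class LiveResult:
    """What came back when the same URL was requested later (hypothetical)."""
    status: int            # HTTP status code of the final response
    final_url: str         # URL after following any redirects
    text: Optional[str]    # body text, or None if nothing was returned


def classify(snapshot: Snapshot, live: LiveResult, threshold: float = 0.5) -> str:
    """Sort one sampled URL into gone / changed / moved / unchanged.

    The threshold is an arbitrary illustrative cutoff, not a published value.
    """
    # Link rot: the request failed or returned no content at all.
    if live.status >= 400 or live.text is None:
        return "gone"
    # Content drift: the page answers, but no longer resembles the archived copy.
    similarity = SequenceMatcher(None, snapshot.text, live.text).ratio()
    if similarity < threshold:
        return "changed"
    # Same content, but served from a different address.
    if live.final_url != snapshot.url:
        return "moved"
    return "unchanged"


def summarize(pairs):
    """Count how many sampled URLs fall into each category."""
    counts = {"gone": 0, "changed": 0, "moved": 0, "unchanged": 0}
    for snapshot, live in pairs:
        counts[classify(snapshot, live)] += 1
    return counts
```

A library running something like this against its own list of archived government URLs would supply the live responses from a real HTTP client. The sketch deliberately leaves the fetching out so that the classification logic can be read and tested on its own.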

Potter says that it is safe to assume that the results would be similar for newspaper content on the web. It would probably also be similar for U.S. government web sites.

What can we learn from this and what can we do? For newspapers, Potter says, libraries have acquisition and preservation methods that are too closely linked to physical objects and that too often exclude digital objects. This results in libraries having gaps in their collections – “especially the born-digital content.” She summarizes the problem:

Libraries haven’t broadly adopted collecting practices so that they are relevant to the current publishing environment which today is dominated by the web.

This sounds exactly like what is happening with government information.

First, because GPO has explicitly limited actual deposit of government information to so-called “tangible” products (Superintendent Of Documents Policy Statement 301 [SOD 301]). This policy does exactly what Potter says is wrong: it establishes collecting practices that are not relevant to the current publishing environment. (See more on the effects of SOD 301 here.)

Second, because most of the conversation within the FDLP in the last few years has been about our historic paper collections rather than about the real digital preservation issue we should be facing: born-digital government information. (See Born-Digital U.S. Federal Government Information: Preservation and Access.)

As Potter says, “We have clear data that if content is not captured from the web soon after its creation, it is at risk.” And, “The absence of an acquisition stream for this [born-digital] content puts it at risk of being lost to future library and archives users.”

Potter outlines a plan of action for digital newspaper information that is surprisingly relevant for government information. She suggests that libraries should establish relationships (and eventually agreements) with the organizations that create, distribute, and own news content. That sounds like exactly what FDLP libraries have done for 200+ years with paper and should be doing, could be doing, with digital government information today. There is no legal or regulatory barrier to GPO depositing FDLP digital files with FDLP libraries; indeed, GPO is already doing this de facto with its explicit actions that allow “USDocs” private LOCKSS network partners to download FDsys content.

Potter also recommends web archiving as another promising strategy. Since many agencies are reluctant to deposit digital content with FDsys, and because they are allowed by law to refrain from doing so, web archiving is a practical alternative, even if it is imperfect. Indeed, GPO runs its own web harvesting program. Although some libraries also do web harvesting that includes U.S. Federal government web sites, more needs to be done in this area. (See: Webinar on fugitive documents: notes and links.)

I find it ironic that libraries are not at least experimenting with preserving born-digital government information. It is difficult to find an article about library projects that does not assert scarcity of funds or high barriers of copyright to overcome in digital library projects. So, why not use born-digital government information as a test bed for preserving digital content? The FDLP agreements and commitments are already in place, most of the content is public domain, and communities of interest for the content already exist. FDLP libraries could start today by building digital library collections and test-bed technology for government information and later expand to other more difficult collections and build on a base of experience and success. The fact that this would help our designated communities, preserve essential information, and further the goals of the FDLP would be welcome side-effects.

Current Awareness for Statistical Information

Two of the most useful current awareness tools I use to learn about new reports and developments are InfoDocket by Gary Price and Full Text Reports by Gary Price and Shirl Kennedy. I only recently learned that they have another site that announces statistically-focused reports: StatFountain.

The announcement about StatFountain says:

If you haven’t already, do take a look at StatFountain, a new sister blog to FTR that aggregates statistically-focused reports. We know that people are often looking for numbers; it may be more helpful to corral all of these things together, categorized and tagged for easier searching. Posts are now being routinely split between FullTextReports and StatFountain.

StatFountain is a great resource and worth monitoring regularly. (Of course, it has its own RSS feed).

StatFountain just links to resources; it does not store them. As they remind us: “It’s the Internet. Stuff gets relocated, goes missing, ends up behind a paywall… whatever. If you see something potentially useful, download it and save it right then and there.”

Check out our blogroll (in the navigation column on the right side of most FGI pages) for other recommended current awareness sites!
