It is hard to keep up with everything even just in the world of government information. FGI provides a list of recommendations in its “blogroll” (look in the right column just below the list of recent comments). These are people, and blogs, and organizations that we find useful if you want to keep up with government information issues.
Today we add a link to the RSS feed for the University of Washington Gov Pubs Finds tumblr. I have been following it every day for some time and find it just wonderful. It provides a wonderful mix of interesting finds, historic documents, and just plain inspiration. It started with an examination of discarded federal documents. I cannot recommend it highly enough. Enjoy!
The description of the U.WA. tumblr:
This site originates from a final project for a Government Publications course taught at the University of Washington in the Fall of 2013. The students, both in the MLIS program, were given access to over 2,000 boxes containing discarded federal documents donated to the UW by the Seattle Public Library. In browsing through the boxes during the quarter, items were found that fit a theme of government research, policies, and programs investigating youth and family advocacy, health, and safety.
After the conclusion of the course this site will continue to be updated with items from the above-mentioned SPL gift collection, as well as other areas within the government publications holdings of the UW Libraries, that speak to the history of Seattle, the Pacific Northwest, and of the unique and oftentimes overlooked qualities of government information.
A recent question on the govdoc-l mailing list asked if GPO had ever officially defined the term “legacy collection” or “legacy document” and if the definition goes beyond something that has historical value or importance. I posted a short answer there. Here, I document and explain that brief response.
The term was introduced by the Association of Research Libraries (ARL) and by Superintendent of Documents Judy C. Russell in 2003. The phrase has been used almost exclusively in the documents community in the context of digitizing and discarding FDLP historical paper collections ever since.
Before 2003, documents and articles that discuss collections (even in the context of digitizing them) rarely if ever used the adjective “legacy” to describe FDLP collections. For example, a 2002 GODORT report on digitizing government information did not include the word “legacy” to describe the collections to be targeted for digitization.
I did not find any references to “legacy collections” in DttP or govdoc-l or Google Scholar or Library, Information Science & Technology Abstracts before 2003.
2003: Introduction of a New Term
Judy C. Russell, then GPO Superintendent of Documents, apparently introduced the term in late 2003 in an announcement of an agreement between ARL and GPO to “digitize a complete legacy collection.” GODORT mentioned it at ALA in January 2004 and GPO included it in its Strategic Vision for the 21st Century in December 2004.
Russell also referred to “legacy content” at the Center For Research Libraries Forum on “Building Blocks of a National Print Preservation Network.” And, in a 2005 Dissemination Implementation Plan, GPO referred to the “legacy collection of tangible U.S. Government publications held in libraries participating in the Federal Depository Library Program (FDLP).”
Adoption of the term
After 2004, the mentions of an FDLP “legacy collection” increased. Documents librarians adopted the phrase to refer to the paper (and sometimes microform, and, occasionally, even “tangible” digital) documents that GPO had actually deposited into FDLP libraries.
What is the FDLP “Legacy Collection”?
Russell described the “legacy collection” as “tangible items in your libraries” in her remarks to DLC in April 2004. She also said that the legacy collection of U.S. government documents consisted of “an estimated 2.2 million print publications totaling approximately 60 million pages.” A report of the 2004 GPO meeting of experts on digital preservation described the legacy collection as “U.S. government documents currently held in depositories, estimated to be about 2.2 million items (excluding microfiche).”
It is worth noting here that the term was applied to all paper; there was no singling-out of any documents that would have more historical value or importance. The key to inclusion within the definition of “legacy collection” was, apparently, that they were paper and were in FDLP libraries and were targets of digitization (and, as we will see in a moment, targets for discarding).
As noted above, the introduction of the phrase accompanied a plan to digitize the paper FDLP collections. GODORT referred to the initiative as “Digitizing Legacy Federal Documents Collections.” The GPO Strategic Vision described converting “printed legacy documents” into digital format. The Dissemination Implementation Plan enumerated priorities for digitization of the “Legacy Collection.” The purpose of the Experts Meeting was to address digitizing “the entire legacy collection of U.S. government documents.”
The government information community adopted that context along with the phrase. Every use of the phrase that I found was in the context of digitizing paper.
Why did ARL and Russell choose the term “legacy collection”? Since the use of the phrase was directly and explicitly tied to digitization of those collections, why not describe those collections as “analog materials” or “historical collections” or “paper collections” or even (ugh!) “tangible collections”? Why “legacy”?
The Merriam-Webster dictionary says that the word “legacy” was not used as an adjective until 1990. That use comes not from the libraries but from the computing world. It is used by IT managers to describe software or systems that are outdated and unwanted. Wikipedia says that it is often considered a “pejorative term” and is used to describe systems that are “potentially problematic.” And the New Oxford American Dictionary defines it as “software or hardware that has been superseded.” In practice, IT managers would like to stop supporting “legacy software” and discard it. Sound familiar?
It is, of course, possible that the choice of the term to describe the FDLP Historical Collections was not well thought out and no one intended to imply that the collections are problems that need to be discarded. But it is revealing that GPO’s own 2004 Strategic Vision statement not only used “legacy” to describe “printed documents,” but also said that GPO needed to reduce costs associated with the operation and maintenance of “stand alone, legacy computer systems.” This was not a mysterious, obscure word with an ambiguous meaning — even within the walls of GPO.
Legacy (adj.). Unwanted.
Thus, the use of the term “legacy” as an adjective to describe print FDLP collections reflects a particular attitude (one might even say a bias) about the FDLP Historical Collections. It defines the FDLP Historical Collections as out-of-date, unnecessary, and unwanted. Using this term pre-determines the fate of the collections. Those who use this term are expressly saying that they have already decided that they want to throw the collections away – even if they say that what they want is better access.
Using such terminology helps explain why the discussions about these collections have not focused on their intrinsic value, or their value to specific user communities, or the quality of the digital surrogates being used to replace (not supplement) them. Instead, the discussion has returned to a single question again and again and again: How many copies should we keep? – which is the wrong question.
Digitize and Discard
The phrase fits in well with ARL’s long-term advocacy of digitizing paper collections and then discarding them. See for example its 2008 report in which it proposed “a small number of physical regional legacy collections” and its 2010 report when it recommended that there should be “a distributed system for storage of print legacy collections that involves no more than 15 regionally distributed comprehensive print collections.” These recommendations to discard Historical Collections in order to reduce the number of paper copies in the FDLP are not supported with any evidence that such policies will either meet the needs of our communities or preserve the written record of the government.
Let me be clear. I am not an advocate of saving print collections for the sake of print collections. Tautologies are not useful for planning. But, in the same way, vague promises to enhance access through digitization are also not useful. Vague promises need to be backed up with procedures to minimize the risk of loss of information and long-term planning that provides adequate resources for preservation, access, and service. As James R. Jacobs and I have repeatedly argued (see endnotes), decisions about retention and discarding need to be premised on the needs of our communities and the ability of libraries to preserve and provide free access to the FDLP collections. Just labeling the collections as unwanted and out of date may be a clever way to try to persuade librarians to discard their collections without examining the outcomes of doing so. But labeling without evidence is not an application of Library and Information Science. It is rhetorical misdirection.
Libraries are free to digitize their collections (and they should!). If enhanced access is the goal, this can be done today without unnecessarily discarding a single document. But ARL and their supporters have been adamant that digitization must be linked to “flexibility … for the efficient management of the legacy collections” and reducing the number of print copies by requiring only a “small number of physical regional legacy collections (print and microforms).” And some libraries are using digitization as an excuse and a technique for discarding.
A better term: FDLP Historical Collections
I suggest that librarians use the term “FDLP Historical Collections.”
“Historical” because these documents tell us something about the past. Indeed, these documents are also, in a very real sense, “historic” in that they are the unique official record of our democracy.
“Collections” (plural) because we have many separate collections – not one big one – and we do not have an accurate and complete inventory of holdings across all FDLP libraries that would allow us to call it a single “collection.”
Legacy (noun). Gift, Inheritance.
I think it is fine to use the word “legacy” as a noun when speaking of our historical collections because they have been handed down to us. They are more like a valuable inheritance than an unwanted copy of WordStar. Who will preserve and take care of this legacy? Only FDLP libraries have this as their mission. Only FDLP libraries are responsible for the stewardship of this legacy.
For us to discard those paper publications without ensuring the accurate and complete preservation of the information in them would be to discard a valuable inheritance and ignore our responsibility.
Words matter. Library professionals are supposed to be professional and should be clear and unambiguous when they choose their terminology. This is important when making plans for the future and it is even more important when the planning involves irreversible decisions. Librarians should reject the use of the term “legacy collection” when discussing the FDLP Historical Collections and challenge those who use it.
But choosing a different term is not enough. We should clearly articulate both the inherent value of the FDLP Historical Collections and their specific value to our designated communities.
The documents in the FDLP Historical Collections may not exist anywhere outside of FDLP libraries. Even Judy Russell had to admit that discarding paper collections without a clear preservation and access strategy can be a big mistake. In her remarks to ARL in 2003, Russell said:
Many years ago GPO turned over its historical collection to the National Archives and almost immediately we began to regret the absence of a tangible collection. We have decided to re-establish a comprehensive collection of tangible and electronic documents as a collection of last resort for the program, and the new organization will dedicate staff resources to that effort.
Unfortunately, there has, apparently, been little progress in rebuilding GPO’s paper collection as a Collection of Last Resort. Instead, GPO is actively promoting changes that will make it easier to discard more paper collections.
While individual documents or volumes may exist elsewhere, FDLP libraries have collections that put those individual documents in the context of their provenance. Although casual internet users may not understand the value of context and provenance, librarians do (or should) and researchers require it. Before FDLP libraries use digitization as an excuse and a technique for discarding these collections, librarians should insist on several essential criteria. My colleague James R. Jacobs has developed a preliminary checklist in his What Are We To Keep? (FAQ). Let’s think about that checklist and think carefully before we assign pejorative labels to our valuable legacy.
Association of Research Libraries. 2008. Future Directions for the Federal Depository Library Program (Dec 4, 2008).
Association of Research Libraries. 2010. Statement of Principles on the Federal Depository Library Program (October 2010).
Center for Research Libraries. 2004. Building Blocks of a National Print Preservation Network. Focus on Global Resources, Vol. 24, Num. 1 (Fall 2004).
Depository Library Council. 2004. Advice to the Public Printer (January 22, 2004).
Federal Depository Library Program. 2014. Future Roles and Opportunities: An FDLP Forecast Study Working Paper (March 28, 2014).
GODORT. 2002. Report: Digitization Of Government Information. Ad Hoc Committee on Digitization Of Government Information, Cathy Nelson Hartman Committee Chair (June 14, 2002).
GODORT. 2004. First Steering Committee Meeting Agenda. ALA Midwinter Conference, San Diego (January 9, 2004).
Jacobs, James A. and James R. Jacobs. 2013. The Digital-Surrogate Seal of Approval: a Consumer-oriented Standard. D-Lib Magazine (2013).
Jacobs, James A. 2015. “An alarmingly casual indifference to accuracy and authenticity.” What we know about digital surrogates. FreeGovInfo (March 1, 2015).
Jacobs, James A. 2015. Legacy collections. “Discussion of Government Document Issues” (25 Jun 2015).
Jacobs, James R. 2014. Why GPO’s proposed policy to allow Regionals to discard is a bad idea. FreeGovInfo (August 27, 2014).
Jacobs, James R. 2015. What are we to keep? Documents to the People (Spring 2015).
Jacobs, James R. 2015. What Are We To Keep? (FAQ). FreeGovInfo (April 30, 2015).
Rossmann, Brian W. 2005. Legacy Documents Collections: Separate the Wheat from the Chaff. DttP: Documents to the People Volume 33, No. 4 (Winter 2005).
Russell, Judith. 2003. Remarks by Judy Russell, 142nd ARL Membership Meeting, Federal Relations Luncheon (May 15, 2003).
Russell, Judy C. 2003. Information Dissemination Operations. Remarks by Judy C. Russell Superintendent of Documents Depository Library Conference/Fall Council Meeting October 20, 2003, Administrative Notes Vol. 24, no. 13 (November 15, 2003).
Russell, Judy C. 2004. Remarks of Superintendent of Documents Depository Library Conference St. Louis, Missouri (April 18, 2004).
U.S. Government Printing Office. 2004. A Strategic Vision for the 21st Century, (Dec. 2004).
U.S. Government Printing Office. 2004. Report on the Meeting of Experts on Digital Preservation. U.S. Government Printing Office Washington, D.C. (March 12, 2004).
U.S. Government Printing Office. Office of Information Dissemination. 2005. Information Dissemination Implementation Plan: Priorities For Digitization Of Legacy Collection. Washington, D.C. (September 15, 2005).
In a recent paper published on arxiv.org entitled “On the Shoulders of Giants: The Growing Impact of Older Articles”, the authors examined how citations to older scholarly articles, and therefore the impact of those articles, have changed over time and with increased digital access. They found that citations to older articles have grown substantially as older papers have become as easy to find as new ones. Check out the arxiv blog for more explanation.
I’d like to see similar research on historic government documents. My sense is that, over time, digitized government documents will be used more — IF they’re made findable in lots of library catalogs and on the open Web and IF govdocs librarians will do more to “seed the cloud” with Q&As and blog posts about interesting documents they come across in their work! — AND that as they’re used more, the original paper documents from which they were scanned will also be used more. Anyone want to do the research?
On the Shoulders of Giants: The Growing Impact of Older Articles. Alex Verstak, Anurag Acharya, Helder Suzuki, Sean Henderson, Mikhail Iakhiaev, Cliff Chiung Yu Lin, Namit Shetty
That raises an interesting question — if old papers are now as easy to find as modern ones, are they having as great an impact?
Today we get an answer of sorts thanks to the work of Alex Verstak and pals at Google. These guys have studied how often older articles are cited in modern papers and how this has changed since the advent of electronic publishing in the 1990s. Their conclusion is that older papers are having an increasingly important impact on modern science — that the distinction between old and new, between the historical and the modern, no longer creates a division in science.
These guys base their work on a database of citations in scientific papers published between 1990 and 2013 in 9 broad areas of research subdivided into 261 subject areas. For each discipline, they then plotted the percentage of citations to papers that were at least ten years old.
The results show a clear trend. “Our analysis indicates that, in 2013, 36% of citations were to articles that are at least 10 years old and that this fraction has grown 28% since 1990,” say Verstak and co. What’s more, the increase in the last ten years is twice as big as in the previous ten years, so the trend appears to be accelerating.
The results solve an ongoing conundrum among researchers involved in scientometrics, the study of science and scientific research. Some of these researchers have long argued that the ongoing digitisation of historical papers should automatically ensure that they are cited more often. Others point out that there has been a huge increase in the number of scientific papers published in recent years so historical papers should be a smaller proportion of the total and therefore cited less.
The work of Verstak and co shows that the former effect has won out. “Now that finding and reading relevant older articles is about as easy as finding and reading recently published articles, significant advances aren’t getting lost on the shelves and are influencing work worldwide for years after,” they say.
The Slate Vault today highlighted a “data-packed” map of American immigration in 1903 from the annual report of the Commissioner-General of Immigration. The Vault always posts interesting and beautiful maps, images, etc. They also linked to a new-to-me site called Handsome Atlas that has some beautiful scans and visualizations of historic US atlases. Go and check them out.
But what they didn’t mention was that this Annual Report — technically titled the “Annual report of the Commissioner-General of Immigration to the Secretary of the Treasury for the fiscal year ended …” — is available in libraries around the country as it was distributed by the Federal Depository Library Program (FDLP) AND that the map “Race and occupation of immigrants by destination” is just one of the many maps, statistical tables, infographics, and photographs embedded in these annual reports. Stanford University Library, where I work, has the annual report going back to 1892!
And, yes, you can find this publication in Google Books, HathiTrust, and the Internet Archive, BUT you WON’T find any of the many foldout maps/infographics because they simply weren’t scanned.
A reader could use the map to see which proportion of the immigrant population of a state came from each of six “races or peoples”: “Teutonic,” “Keltic,” “Slavic,” “Iberic,” “Mongolic,” or Other. These designations echoed popular eugenic racial ideologies of the time, which used quasi-scientific theories to lump people into basic groups of origin understood to share common characteristics. The bars showing percentages of immigrants in each state color-code the newcomers according to “race or people,” so that these can be seen at a glance, then use text to explain which countries these “Mongolians” or “Slavics” came from.
The map was put together as part of an annual report made for the Commissioner-General of Immigration, and printed by the Government Printing Office in 1903.
CIA closes office that declassifies historical materials, By Ken Dilanian, Los Angeles Times (August 21, 2013).
“The Historical Collections Division is the latest casualty of sequester cuts. The office handling Freedom of Information Act requests will take over the work.
“…Some of the declassification is required by law, so the Historical Collections Division, which focused on discretionary declassification involving topics that scholars found compelling, was the easiest target for trimming costs….”
Hat tip to InfoDocket!