Tag Archives: historic paper collections
In a recent paper posted on arxiv.org entitled “On the Shoulders of Giants: The Growing Impact of Older Articles”, the authors examined how often older scholarly articles are cited and how that impact has changed over time with increased digital access. They found that citations to older articles (and therefore their impact) have grown substantially as older papers have become as easy to find as new ones. Check out the arxiv blog for more explanation.
I’d like to see similar research on historic government documents. My sense is that, over time, digitized government documents will be used more — IF they’re made findable in lots of library catalogs and on the open Web and IF govdocs librarians will do more to “seed the cloud” with Q&As and blog posts about interesting documents they come across in their work! — AND that as they’re used more, the original paper documents from which they were scanned will also be used more. Anyone want to do the research?
On the Shoulders of Giants: The Growing Impact of Older Articles. Alex Verstak, Anurag Acharya, Helder Suzuki, Sean Henderson, Mikhail Iakhiaev, Cliff Chiung Yu Lin, Namit Shetty
That raises an interesting question — if old papers are now as easy to find as modern ones, are they having as great an impact?
Today we get an answer of sorts thanks to the work of Alex Verstak and pals at Google. These guys have studied how often older articles are cited in modern papers and how this has changed since the advent of electronic publishing in the 1990s. Their conclusion is that older papers are having an increasingly important impact on modern science — that the distinction between old and new, between the historical and the modern, no longer creates a division in science.
These guys base their work on a database of citations in scientific papers published between 1990 and 2013 in 9 broad areas of research subdivided into 261 subject areas. For each discipline, they then plotted the percentage of citations to papers that were at least ten years old.
The results show a clear trend. “Our analysis indicates that, in 2013, 36% of citations were to articles that are at least 10 years old and that this fraction has grown 28% since 1990,” say Verstak and co. What’s more, the increase in the last ten years is twice as big as in the previous ten years, so the trend appears to be accelerating.
The results solve an ongoing conundrum among researchers involved in scientometrics, the study of science and scientific research. Some of these researchers have long argued that the ongoing digitisation of historical papers should automatically ensure that they are cited more often. Others point out that there has been a huge increase in the number of scientific papers published in recent years so historical papers should be a smaller proportion of the total and therefore cited less.
The work of Verstak and co shows that the former effect has won out. “Now that finding and reading relevant older articles is about as easy as finding and reading recently published articles, significant advances aren’t getting lost on the shelves and are influencing work worldwide for years after,” they say.
When we think about the historical paper-and-ink collections that FDLP libraries have built over the last 200 years, we often wish we could make them more accessible through digitization. But we have to be careful when we think this way. One thing I have learned repeatedly as I have worked with digital information over the last twenty-five years is that, in the digital world, “access” and “preservation” have to go together. When we neglect either, we lose both.
Some recent writings have reinforced this old idea and are worth remembering:
- All Digital Objects are Born Digital Objects, by Trevor Owens, The Signal (May 15th, 2012).
There is no large red button that says “digitize” on it, we make decisions about what significant properties we want to record from a physical object and we work to ensure that those properties are recorded in the newly created digital object. When we talk about the scanner “digitizing” it’s all too easy to forget the history of the creation of the digital object and we can easily forget that there are a range of individual and institutional authorial intentions that go into deciding what and how to digitize.
- Digitization is Different than Digital Preservation: Help Prevent Digital Orphans!, by Kristin Snawder, The Signal (July 15th, 2011).
Many institutions see the immediate value of having materials available electronically. This is valid reasoning. Many researchers no longer want to come and see the materials. They want access from the comfort of their own couch and fuzzy slippers. But, in the hurry to meet user expectations, institutions may scan large quantities of materials without having a solid plan for preserving the digital images into the future.
- Approaching Digitisation Through A Digital Preservation Perspective, by Alenka Kavčič-Čolić. Presented at the SEEDI (South-Eastern European Digitisation Initiative) 2012, Ljubljana, Slovenia.
Most libraries still conceive digitisation as a digital reproduction aimed to provide access to library materials only. The master files resulted from digitisation are usually not digitally preserved and the digital collections run the risk of being lost for the future.
The above examples are about short-term thinking and lack of planning when libraries aim for access without planning for preservation. The same mistake can be made the other way, too: when libraries plan for preservation without access. Paul Conway made this point more than 15 years ago:
For years, preservation simply meant collecting. The sheer act of pulling a collection of manuscripts from a barn, a basement, or a parking garage and placing it intact in a dry building with locks on the door fulfilled the fundamental preservation mandate of the institution. In this regard, preservation and access have been mutually exclusive activities often in constant tension. “While preservation is a primary goal or responsibility, an equally compelling mandate–access and use–sets up a classic conflict that must be arbitrated by the custodians and caretakers of archival records,” states a fundamental textbook in the field (Ritzenthaler, Mary Lynn. Preserving Archives and Manuscripts. Chicago: Society of American Archivists, 1993. p. 1). Access mechanisms, such as bibliographic records and archival finding aids, simply provide a notice of availability and are not an integral part of the object.
In the digital world, the concept of access is transformed from a convenient byproduct of the preservation process to its central motif. The content, structure, and integrity of the information object assume center stage; the ability of a machine to transport and display this information object becomes an assumed end result of preservation action rather than its primary goal. Preservation in the digital world is not simply the act of preserving access but also includes a description of the “thing” to be preserved. In the context of this report, the object of preservation is a high-quality, high-value, well-protected, and fully integrated version of an original source document.
— Paul Conway, Head, Preservation Department, Yale University Library. Preservation in the Digital World. Council on Library and Information Resources, Pub62 (March 1996).
An interesting perspective on the limitations of a simple web search comes today from an Emeritus Professor of Criminal Justice at the University of Nebraska at Omaha. He notes that “The contested history of Executive Order 11246 is an important aspect of the history of the modern women’s rights movement and of the presidency of Lyndon Johnson,” but that a simple search for it yields the revised, not the original, version of the order:
- The Perils of Internet Research: The Case of LBJ and Affirmative Action, By Samuel Walker, History News Network (5-28-12).
A standard Google search for “Executive Order 11246” yields multiple web sites, including those of the U.S. Department of Labor (which enforces the federal contractor provision), the National Archives, and Wikipedia. These sites post the current revised version of E. O. 11246. While it duly notes the many revisions over the years, only historians who are specialists on the subject and some employment law attorneys (but only those interested in history), will realize that it is not the original. Consequently, they will gain no hint of the contested initial history of affirmative action regarding sex discrimination or of LBJ’s record on women’s rights.
This is not an insignificant issue. Wikipedia is widely used by average Americans as a research tool. College undergraduates use it routinely, as do many graduate students. Only PhD or some MA students who are closely supervised by their faculty are likely to know they are missing some important history. Few people, moreover, are likely to question the National Archives as an authoritative source on American history. Executive Order 11246, finally, is hardly the only document where the original does not immediately appear through a Google search. Try finding the original text of the 1966 Freedom of Information Act, for example.
Experienced government information specialists will not be surprised by this and will recognize the need for sophisticated searching (and careful interpretation of search results) in general.
But this is also an example of the importance of our historical collections. Because government information is a record of the activities and attitudes and knowledge of a government at particular points in time, it retains historical value even when it is “out of date” — as in the above example. Different versions of laws, old censuses, series of annual reports, early maps, photographs: all these are important historical records which require the same attention and care we devote to the most current information.
Too often, however, I hear librarians focus on “currency” as a value to such an extent that they seem to deprecate the value of historical records. I feel this is the case when library administrators refer to our historical paper collections as “legacy” collections.
The word “legacy,” when used as an adjective, comes from computing and means superseded, no longer useful, difficult to use, and in need of replacement. Using “legacy” as an adjective to describe our historical collections is therefore both incorrect and demeaning. Those who call our historical collections “legacy collections” are diminishing the value of those collections. I don’t know whether they do this intentionally, but I do know that this use carries an implication that cheapens the collections. That can lead to bad decisions.
If we must use the term “legacy” to describe our historical collections, we should use it as a noun. The noun “legacy” means bequest, heritage, endowment, gift, and birthright. Our historical collections are a legacy from the past to us and to our children and must be treated with respect.