OSTI using Archive-It for E-Prints

The Energy Department's Office of Scientific and Technical Information (OSTI) is using the Internet Archive's Archive-It service to "provide uninterrupted access to more than a million online research papers from OSTI's E-print Network."

  • EPrint Network Special Collection
    This collection provides searching of more than 1 million scientific e-prints. The E-print Network is a deep Web source of scientific and technical information created by researchers active in a wide range of fields, including chemistry, biology and life sciences, materials science, nuclear sciences and engineering, energy research, and computer and information technologies. Information customers can use E-print Network to browse scientific Web sites, find scientific societies, receive alerts and search and access scientific e-prints, the documents circulated electronically to facilitate peer exchange and scientific advancement. OSTI leads development and adaptation of new capabilities for preservation and dissemination of research important to the U.S. Department of Energy (DOE).

See also: OSTI archives scientific data on the Web, by Trudy Walsh, GCN, 06/29/07.

"Without a way to periodically archive this material, important science content within this ever-growing, ever-changing online, e-print environment could disappear," said Walter Warnick, director of OSTI.

Good model for GPO

This is a great model for FDLP libraries, in concert with the GPO, to build on. Imagine if Archive-It were used to crawl the .gov/.mil/.state domains. Digital documents could be captured, harvested, described and distributed.

One thing I've been thinking about is next steps for Archive-It and other digital harvesting systems. We need to put a system in place for sifting through the vast amounts of data that get captured. One way to discern what's important (i.e., documents) and what's chaff (i.e., spacer gifs) is to create a system that connects queries to search results and automatically tags the items that users click on frequently. Tagged items could then be fed into the cataloging workflow (preferably distributed among the 1,250 FDLP libraries) in order to add more thorough descriptive metadata. Goodbye fugitive documents!
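To make the click-based tagging idea a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration only: the log format of (query, clicked URL) pairs, the `tag_for_cataloging` function name, the click threshold, and the list of "chaff" file extensions are all hypothetical, and none of it reflects how Archive-It actually works.

```python
from collections import Counter

# Hypothetical threshold: how many clicks make an item "frequently clicked".
CLICK_THRESHOLD = 3

# Hypothetical file extensions treated as chaff (e.g. spacer GIFs).
CHAFF_EXTENSIONS = (".gif", ".css", ".js")


def tag_for_cataloging(click_log, threshold=CLICK_THRESHOLD):
    """Return URLs clicked at least `threshold` times, most-clicked first.

    click_log is an iterable of (query, clicked_url) pairs, an assumed
    format for a search front end sitting on top of the harvested crawl.
    """
    counts = Counter(
        url
        for _query, url in click_log
        if not url.lower().endswith(CHAFF_EXTENSIONS)  # skip obvious chaff
    )
    # most_common() sorts by click count, highest first.
    return [url for url, n in counts.most_common() if n >= threshold]
```

The returned list would be the queue handed to catalogers for fuller descriptive metadata. A real system would likely need something smarter than a fixed threshold and an extension blacklist, but the basic shape would be the same: count clicks per captured item, filter, and pass the survivors on.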

Am I crazy? Perhaps, but I'd love to hear others' thoughts.
