Purls vs handles

Building on yesterday’s post on Critical GPO systems and the FDLP cloud, I’ve done a little digging into GPO’s proposed migration from PURLs (persistent URLs) to “handles.” According to RFC 3650, “Handle System Overview”:

The Handle System includes an open protocol, a namespace, and a reference implementation of the protocol. The protocol enables a distributed computer system to store names, or handles, of digital resources and resolve those handles into the information necessary to locate, access, and otherwise make use of the resources. These associated values can be changed as needed to reflect the current state of the identified resource without changing the handle. This allows the name of the item to persist over changes of location and other current state information. Each handle may have its own administrator(s) and administration can be done in a distributed environment (my emphasis). The Handle System supports secured handle resolution. Security services such as data confidentiality, data integrity, and non-repudiation are provided upon client request.

Purls and handles do roughly the same thing: they’re link resolvers. But, as Larry Stone’s 2000 article for MIT’s Persistent Naming discovery project, “Competitive Evaluation of PURLs,” points out, there are differences that make handles a better choice for long-term operation and persistence. Without getting too technical, handles are not tied to any one protocol (e.g., HTTP) or domain (e.g., .gov) and can therefore work regardless of the network design or protocol used. This is extremely important for scalability and persistence over the long term. In addition, handles can do more than resolve to URLs. “The Handle System design allows for various other types of resolution objects, metadata, and extensible additions to each Handle object record.”
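For readers who want to see the idea in miniature, here is a toy sketch of what “resolving a handle” means. It is not the Handle System protocol or its reference implementation; the class names, the `99999/example-doc` handle, and the example URLs are all made up for illustration. It only shows two points from the paragraph above: a handle can carry several typed values (not just a URL), and those values can change while the handle itself stays the same.

```python
from dataclasses import dataclass, field


@dataclass
class HandleRecord:
    """One handle plus its typed values (hypothetical types: URL, DESC)."""
    handle: str
    values: dict = field(default_factory=dict)


class ToyResolver:
    """In-memory stand-in for a handle service; illustration only."""

    def __init__(self):
        self._records = {}

    def register(self, handle, **values):
        # Store a new record with whatever typed values are supplied.
        self._records[handle] = HandleRecord(handle, dict(values))

    def update(self, handle, value_type, value):
        # Change one value in place; the handle string is untouched.
        self._records[handle].values[value_type] = value

    def resolve(self, handle, value_type="URL"):
        # Look up the current value of the requested type.
        return self._records[handle].values[value_type]


resolver = ToyResolver()
resolver.register(
    "99999/example-doc",                      # made-up handle
    URL="http://old.example.gov/doc.pdf",     # made-up location
    DESC="Example report",                    # made-up metadata
)
before = resolver.resolve("99999/example-doc")

# The document moves to a new host and protocol; only the record changes.
resolver.update("99999/example-doc", "URL", "https://new.example.org/doc.pdf")
after = resolver.resolve("99999/example-doc")
```

Anyone who cited `99999/example-doc` still lands on the document after the move, and the same record can answer with metadata instead of a location when asked for `DESC`.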

In short, handles are more persistent, more scalable, and can do more. But most importantly in my mind, handle administration “can be done in a distributed environment.” This makes handles perfect for the FDLP cloud because the work of resolving links can be shared among many institutions rather than resting on a single server. So I say, kudos to GPO for moving to the handle system.

Oh, hold that applause for a moment. My search also turned up the following document from the Fall 2007 Depository Library Council meeting, titled “Handles Council Briefing Topic” (PDF). This briefing document basically describes what I’ve just said above and lays out a gradual migration from purls to handles, with an anticipated timeline to “coincide with Release 1-C of FDsys in 2008.” There’s a March 6, 2008 report, “Report on the handles beta test,” that calls the handles beta test “satisfactory.” But no information is available after that report. So what happened?

I know the building of FDsys has been no easy task and that GPO staff have worked really hard to keep to their published release schedule; but I’d like to know why the handles migration didn’t occur in 2008. If more testing is involved, I’m sure there are libraries that would be willing to be beta-beta testers for handles. Perhaps this is an opportune time to finally implement the migration to the handles system.

–that is all.

NSF understands what GPO doesn’t seem to

The National Science Foundation (NSF) has been active in technology and Internet efforts for many years. I think it’s fair to consider most NSF people technologists. So, it was with pleasant surprise that libraries and the concept of geographically distributed local digital collections got several favorable mentions in the new NSF Cyberinfrastructure Vision and Strategic Plan.

Although the plan doesn’t explicitly deal with government information, I think it does have something to say to the depository community and to the Government Printing Office, which appears to be favoring a centralized model of information dissemination.

First, the plan explains the advantages libraries held in preserving scientific output in print (emphasis mine):

In print form, the preservation process is handled through a system of libraries and other repositories throughout the country and around the globe. Two features of this print-based system make it robust. First, the diversity of business models deriving support from a variety of sources means that no single entity bears sole responsibility for preservation, and the system is resilient to changes in any particular sector. Second, there is overlap in the collections, and redundancy of content reduces the potential for catastrophic loss of information. – Page 19

Rather than try to write off this robust system as a “legacy collection” of scientific information, the NSF has seen the future and has found it to be geographically and institutionally diverse (emphasis mine):

The national data framework is envisioned to provide an equally robust and diverse system for digital data management and access. It will: promote interoperability between data collections supported and managed by a range of organizations and organization types; provide for appropriate protection and reliable long-term preservation of digital data; deliver computational performance, data reliability and movement through shared tools, technologies and services; and accommodate individual community preferences. The agency will also develop a suite of coherent data policies that emphasize open access and effective organization and management of digital data, while respecting the data needs and requirements within science and engineering domains. – Page 20

Where does NSF expect to find the expertise and willingness to build these diverse digital collections? In part, in libraries! (Emphasis mine):

At the institutional level, colleges and universities are developing approaches to digital data archiving, curation, and analysis. They are sharing best practices to develop digital libraries that collect, preserve, index and share research and education material produced by faculty and other individuals within their organizations. The technological implementations of these systems are often open-source and support interoperability among their adopters. University-based research libraries and research librarians are positioned to make significant contributions in this area, where standard mechanisms for access and maintenance of scientific digital data may be derived from existing library standards developed for print material. These efforts are particularly important to NSF as the agency considers the implications of not just making all data generated with NSF funding broadly accessible, but of also promoting the responsible organization and management of these data such that they are widely usable.

The moral of this story for the depository library community is simple. If the Uber geeks at NSF appreciate the contributions of libraries and the logic of local digital collections, maybe we should too.
