Could DOIs Solve Three Depository Challenges?
I am not an expert on Digital Object Identifiers (DOIs), Handles, or other methods of creating permanent, persistent links to information on the web, so I pose this as a question. Could DOIs help solve three problems that, if solved, would provide better preservation, better access, and a better user experience?
The three challenges are:
1. The need for reliable, permanent, persistent links.
2. The need to provide a simple user interface to depository collections.
3. The need to guarantee authenticity of government information.
Here is why I think the answer is Yes.
Problem: Providing reliable, permanent, persistent links. Currently, GPO uses PURLs (Persistent Uniform Resource Locators) for creating permanent links. PURLs provide "persistent" links so that, when a page moves and its URL changes, that change need only be recorded once -- in the PURL database -- and all the hundreds or thousands of links to the PURL resolve to the new address automatically without being changed themselves. This is an efficient way to deal with the dynamic nature of web addresses, and the system works, but it is fragile. We had a graphic demonstration of that fragility last August when the GPO PURL server crashed. For more than two weeks, no one anywhere in the world who relied on the 115,000 PURLs pointing to government information could reach that information using those links. This was not the fault of GPO (although restoration time could have been reduced with better disaster planning). Rather, the very nature of PURLs makes them fragile in this way and vulnerable to the crash of a single server.
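For readers who think in code, the PURL idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not GPO's actual implementation: the lookup table, the PURL path, the example addresses, and the function names are all made up.

```python
# Hypothetical in-memory stand-in for a PURL database.
# The PURL path and the target URLs below are invented for illustration.
purl_table = {
    "/GPO/example1": "http://agency.example/reports/report.pdf",
}


def resolve(purl_path, server_up=True):
    """Return the current URL for a PURL, or None if it cannot be resolved."""
    if not server_up:
        # One server, one table: when it is down, every PURL fails at once.
        return None
    return purl_table.get(purl_path)


def move_document(purl_path, new_url):
    """Record a document's new address once, in the PURL table."""
    purl_table[purl_path] = new_url


before = resolve("/GPO/example1")
move_document("/GPO/example1", "http://agency.example/archive/report.pdf")
after = resolve("/GPO/example1")          # same PURL, new destination
during_outage = resolve("/GPO/example1", server_up=False)
```

The single update in `move_document` is what makes the approach efficient; the single table behind `resolve` is what makes it fragile.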
Solution: Persistence is a function of organizations, not of technology. DOIs address the fragility problem by building a social structure that guarantees persistence. As the DOI organization says, "Persistence is a function of organizations, not of technology; a persistent identifier system requires a persistent organization and defined processes. The International DOI Foundation (IDF) provides a federation of Registration Agencies (RA). Dependency on any one RA is removed." In other words, if one server crashes, others are available immediately. Rather than relying on a single organization (GPO) and a single server at that organization, DOIs rely on multiple Registration Agencies and multiple servers. DOIs are reliable because they use redundancy and have no single points of failure (Wikipedia).
Problem: Providing a simple user interface. Imagine with me for a moment a depository system that deposits digital documents in FDLP libraries. Once such a system is in place, we will have the same document in multiple locations -- perhaps one copy in GPO's Federal Digital System (FDsys), one copy in each of a dozen or more FDLP libraries, perhaps an "original" copy at house.gov or senate.gov, and so forth. What is the user to do? Will libraries show dozens of links with an explanation after each as to what it is and hope users will have the patience to read the explanations, make an informed decision, and, if that particular link is down, go back and repeat the process? This sounds like a lousy user experience to me.
Solution: Multiple redirections. DOIs provide a way to resolve multiple URLs with a single DOI (Resolution of Multiple URLs). This would mean that multiple copies of digital documents could be stored at many separate FDLP libraries and all could use a single, clickable link (a DOI) that would get users a copy of that document chosen according to criteria the library defines. For example, one library might have the DOI point to the original first and the local library copy second; another library might point to the "network-closest" copy first and then to other, more distant copies; and so forth. DOIs do this by storing metadata with the DOI. Rather than storing only the current URL of a registered item, DOIs can record a list of locations with hints for how the resolving client should select a location, including an ordered set of selection methods.
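A rough Python sketch may make the selection idea more concrete. It is not the DOI system's real data model or resolver: the `Location` record, the two ordering names, the "distance" numbers, and the example addresses are all assumptions invented for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Location:
    url: str
    role: str          # illustrative labels: "original" or "depository-copy"
    distance: int      # made-up measure of network distance from the user
    is_up: bool = True


# Each library could pick its own ordering (names are hypothetical).
ORDERINGS = {
    "original-first": lambda loc: (loc.role != "original", loc.distance),
    "closest-first": lambda loc: loc.distance,
}


def choose_location(locations, preference):
    """Return the URL of the first reachable copy under the chosen ordering."""
    for loc in sorted(locations, key=ORDERINGS[preference]):
        if loc.is_up:
            return loc.url
    return None  # no copy is reachable


# One identifier, several copies (all addresses are placeholders).
copies = [
    Location("https://fdsys.example/doc.pdf", "original", 30),
    Location("https://library-a.example/doc.pdf", "depository-copy", 5),
    Location("https://library-b.example/doc.pdf", "depository-copy", 60),
]

official_first = choose_location(copies, "original-first")
nearest_first = choose_location(copies, "closest-first")
```

The user still clicks one link. If the preferred copy is down, the loop simply moves on to the next one, so the "go back and repeat the process" step is handled for the user rather than by the user.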
Here is an illustration of how it works:
This solution would have the added benefit of enabling and facilitating a true digital depository system in which digital information is deposited into FDLP libraries. FGI is a strong advocate of a depository system that does this for several reasons that we have described repeatedly here and in our writings and presentations. In brief, we believe that this would make it possible for individual FDLP libraries to build their own local digital collections focused on the needs of their own user communities; it would aid preservation by ensuring that multiple copies exist under different technical, financial, and administrative structures; and it would create a better user experience by providing a way to integrate digital FDLP/Title-44 documents with non-Title-44 federal documents, documents from state and local governments, and other non-government information. DOIs would, therefore, facilitate preservation as well as access.
Problem: Guarantee Authenticity. How does a user know that the document they just retrieved is "authentic," that it has not been altered, that it really is what it purports to be? Many people hope for a technological solution (e.g., PKI, time stamps, encryption, digital signatures, watermarks). We at FGI believe that these are techniques that people use and that the authenticity comes, not from the technique, but from users' trust in the people who set up the techniques.
No one explained this better than Abby Smith ("Digital Authenticity in Perspective," in "Authenticity in a Digital Environment," Council on Library and Information Resources, Publication 92, May 2000). She noted that, when technologists were asked about how to establish the authenticity of a digital object, they were skeptical of technological "solutions" and said that "there is no technological solution that does not itself involve the transfer of trust to a third party."
Solution: Trust is a social phenomenon, not a technical one. So, imagine how this might work. Imagine a document that is in FDsys, and in the digital collections of several FDLP libraries, and also at the New York Times, and at any number of other places on the web. There might be a dozen URLs for that one document. But, if GPO assigned a single DOI to it and made sure it pointed to FDsys and to "Official Depository Copies" at FDLs, that one DOI would, by definition, point to "authentic" copies -- the original and those officially deposited in Title-44-authorized Federal Depository Libraries. The "prefix" part of a DOI identifies the registrant (in this case GPO) and would further help "brand" the DOI as authentic. Users wanting "authentic" government information would look for DOIs bearing the GPO prefix -- and they would find what they wanted with a single click, no matter where the particular copy they get is stored. (In addition, the DOI metadata can include authenticity information.)
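To show what "looking for the GPO prefix" could mean in practice, here is a small Python sketch. A DOI is written as a prefix beginning with "10.", a slash, and a suffix. The prefix value and the sample DOIs below are invented; I do not know what prefix GPO would actually be assigned.

```python
GPO_PREFIX = "10.99999"  # hypothetical placeholder, not a real GPO prefix


def split_doi(doi):
    """Split a DOI into (prefix, suffix) at the first slash."""
    prefix, slash, suffix = doi.partition("/")
    if not slash or not suffix or not prefix.startswith("10."):
        raise ValueError(f"not a well-formed DOI: {doi!r}")
    return prefix, suffix


def has_gpo_prefix(doi):
    """True if the DOI carries the (hypothetical) GPO prefix."""
    prefix, _suffix = split_doi(doi)
    return prefix == GPO_PREFIX


gpo_doi = has_gpo_prefix("10.99999/example-hearing-2009")
other_doi = has_gpo_prefix("10.88888/some-other-copy")
```

A matching prefix tells the user only who registered the identifier. Whether that is enough to call a copy "authentic" still rests on trust in the organization behind the prefix, which is the point of this section.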
Precedents. GPO would not be alone in using DOIs. Who uses DOIs? ICPSR, OECD, the European Communities' EU publications office, CrossRef, and many others.
Barriers. The main barrier I can see to adopting DOIs is cost. I assume that implementing DOIs will cost more than implementing PURLs. But the two costs cannot be compared directly because they buy different things. Implementing PURLs gets us a fragile redirection system. Implementing DOIs gets us a redirection system of persistent identifiers, the ability to have multiple redirects to multiple copies, and a new way of thinking about authenticity.
I welcome comments and responses to my question and particularly hope that those with more knowledge in this area will fill in the gaps I have left.