Could DOIs Solve Three Depository Challenges?

I am not an expert on Digital Object Identifiers (DOI) or Handles or other methods of creating permanent, persistent links to information on the web, so I pose this as a question. Could DOIs help solve three problems that, if solved, would provide better preservation, better access, and a better user experience?

The three challenges are:
1. The need for reliable, permanent, persistent links.
2. The need to provide a simple user interface to depository collections.
3. The need to guarantee authenticity of government information.

Here is why I think the answer is Yes.

Problem: Providing reliable, permanent, persistent links. Currently, GPO uses PURLs (Persistent Uniform Resource Locators) for creating permanent links. PURLs provide “persistent” links so that, when a page moves and its URL changes, that change need only be recorded once, in the PURL database, and all the hundreds or thousands of links to the PURL resolve to the new address automatically without being changed themselves. This is an efficient way to deal with the dynamic nature of web addresses, and the system works, but it is fragile. We had a graphic demonstration of that fragility last August when the GPO PURL server crashed. For more than two weeks, no one anywhere in the world could use any of the 115,000 PURLs pointing to government information to reach that information. This was not the fault of GPO (although restoration time could have been reduced with better disaster planning). Rather, the very nature of PURLs makes them fragile in this way and vulnerable to the crash of a single server.

Solution: Persistence is a function of organizations, not of technology. DOIs address the fragility problem by building a social structure that guarantees persistence. As the DOI organization says, “Persistence is a function of organizations, not of technology; a persistent identifier system requires a persistent organization and defined processes. The International DOI Foundation (IDF) provides a federation of Registration Agencies (RA). Dependency on any one RA is removed.” In other words, if one server crashes, others are available immediately. Rather than relying on a single organization (GPO) and a single server at that organization, DOIs rely on multiple Registration Agencies and multiple servers. DOIs are reliable because they use redundancy and have no single points of failure (Wikipedia).
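The redundancy argument can be sketched in a few lines of Python. This is only an illustration of the idea of failing over between independent resolvers, not how the DOI or Handle infrastructure is actually implemented; the function names, the two stand-in servers, the `10.99999` identifier, and the example.org URL are all hypothetical.

```python
from typing import Callable, Optional, Sequence

# A resolver takes an identifier and returns a URL, or None if it has no record.
Resolver = Callable[[str], Optional[str]]


def resolve_with_failover(identifier: str,
                          resolvers: Sequence[Resolver]) -> Optional[str]:
    """Ask each resolver in turn, skipping any that are unreachable."""
    for resolver in resolvers:
        try:
            url = resolver(identifier)
        except ConnectionError:
            continue  # this server is down; try the next one
        if url is not None:
            return url
    return None


# Hypothetical stand-ins for two independent resolution servers.
RECORDS = {"10.99999/example-doc": "https://example.org/docs/example-doc.pdf"}


def crashed_server(identifier: str) -> Optional[str]:
    raise ConnectionError("server unavailable")


def healthy_mirror(identifier: str) -> Optional[str]:
    return RECORDS.get(identifier)


if __name__ == "__main__":
    servers = [crashed_server, healthy_mirror]
    print(resolve_with_failover("10.99999/example-doc", servers))
```

With only `crashed_server` in the list the lookup fails, which is the single-server situation described above; adding a second, independent server is what keeps the link working.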

Problem: Providing a simple user interface. Imagine with me for a moment a depository system that deposits digital documents in FDLP libraries. Once such a system is in place, we will have the same document in multiple locations — perhaps one copy in GPO’s Federal Digital System (FDsys), one copy in each of a dozen or more FDLP libraries, perhaps an “original” copy at house.gov or senate.gov, and so forth. What is the user to do? Will libraries show dozens of links with an explanation after each as to what it is and hope users will have the patience to read the explanations, make an informed decision, and, if that particular link is down, go back and repeat the process? This sounds like a lousy user experience to me.

Solution: Multiple redirections. DOIs provide a way to resolve multiple URLs with a single DOI (see Resolution of Multiple URLs). This would mean that multiple copies of digital documents could be stored at many separate FDLP libraries and all could use a single, clickable link (a DOI) that would deliver whichever copy of the document best matches criteria the library defines. For example, one library might have the DOI point to the original first and the local library copy second; another library might point to the “network-closest” copy first and then other more distant copies; and so forth. DOIs do this by storing metadata with the DOI. Rather than storing only the current URL of a registered item, DOIs can record a list of locations with hints for how the resolving client should select a location, including an ordered set of selection methods.
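To make the "list of locations plus ordered selection methods" idea concrete, here is a minimal Python sketch. It does not reproduce the actual DOI metadata format or any real resolver API; the `Location` record, its `role` and `holder` fields, the policy list, and the `.example` URLs are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Location:
    url: str
    role: str    # "original" or "depository" (labels invented for this sketch)
    holder: str  # who stores this copy


Preference = Callable[[Location], bool]


def select_location(locations: List[Location],
                    preferences: Iterable[Preference],
                    is_reachable: Callable[[Location], bool]) -> Optional[Location]:
    """Return the first reachable copy matching the earliest preference."""
    for preferred in preferences:
        for location in locations:
            if preferred(location) and is_reachable(location):
                return location
    # No preferred copy is available: fall back to any reachable copy.
    for location in locations:
        if is_reachable(location):
            return location
    return None


COPIES = [
    Location("https://agency.example/doc.pdf", "original", "issuing agency"),
    Location("https://library-a.example/doc.pdf", "depository", "Library A"),
    Location("https://library-b.example/doc.pdf", "depository", "Library B"),
]

# One library's policy: the original first, then its own local copy.
LIBRARY_B_POLICY = [
    lambda loc: loc.role == "original",
    lambda loc: loc.holder == "Library B",
]

if __name__ == "__main__":
    original_is_down = lambda loc: loc.role != "original"
    chosen = select_location(COPIES, LIBRARY_B_POLICY, original_is_down)
    print(chosen.url if chosen else "no copy reachable")
```

The user still clicks one link; the ordering logic and the fallback when a copy is down happen behind it, which is the point of the proposal.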

[Illustration of how DOI multiple resolution works; image not available.]

This solution would have the added benefit of enabling and facilitating a true digital depository system in which digital information is deposited into FDLP libraries. FGI is a strong advocate of a depository system that does this for several reasons that we have described repeatedly here and in our writings and presentations. In brief, we believe that this would make it possible for individual FDLP libraries to build their own local digital collections focused on the needs of their own user communities; it would aid preservation by ensuring that multiple copies exist under different technical, financial, and administrative structures; and it would create a better user experience by providing a way to integrate digital FDLP/Title-44 documents with non-Title-44 federal documents, documents from state and local governments, and other non-government information. DOIs would, therefore, facilitate preservation as well as access.

Problem: Guaranteeing authenticity. How does a user know that the document they just retrieved is “authentic,” that it has not been altered, that it really is what it purports to be? Many people hope for a technological solution (e.g., PKI, time stamps, encryption, digital signatures, watermarks). We at FGI believe that these are merely techniques that people use, and that authenticity comes not from the technique but from users’ trust in the people who set up the techniques.

No one explained this better than Abby Smith (Digital Authenticity in Perspective in “Authenticity in a Digital Environment,” Council on Library and Information Resources, Publication 92. May 2000). She noted that, when technologists were asked about how to establish the authenticity of a digital object, they were skeptical of technological “solutions” and said that “there is no technological solution that does not itself involve the transfer of trust to a third party.”

Solution: Trust is a social phenomenon, not a technical one. So, imagine how this might work. Imagine a document that is in FDsys, and in the digital collections of several FDLP libraries, and also at the New York Times, and at any number of other places on the web. There might be a dozen URLs for that one document. But, if GPO assigned a single DOI to it and made sure it pointed to FDsys and to “Official Depository Copies” at FDLs, that one DOI would, by definition, point to “authentic” copies: the original and those officially deposited in Title-44-authorized Federal Depository Libraries. The “prefix” part of a DOI identifies the registrant (in this case GPO) and would further help “brand” the DOI as authentic. Users wanting “authentic” government information would look for DOIs bearing the GPO prefix, and they would find what they wanted with a single click, no matter where the particular copy they get is stored. (In addition, the DOI metadata can include authenticity information.)
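Checking a DOI for a known prefix is simple to sketch. A DOI name has the form prefix/suffix, where the prefix begins with "10." and the suffix is everything after the first slash. The prefix `10.99999` below is invented for illustration and is not a prefix assigned to GPO; the function names are also hypothetical.

```python
from typing import Set, Tuple

# Invented for this sketch; NOT a real prefix assigned to GPO.
HYPOTHETICAL_GPO_PREFIX = "10.99999"


def split_doi(doi: str) -> Tuple[str, str]:
    """Split a DOI name into (prefix, suffix) at the first slash."""
    prefix, slash, suffix = doi.partition("/")
    if not slash or not suffix or not prefix.startswith("10."):
        raise ValueError(f"not a DOI name: {doi!r}")
    return prefix, suffix


def has_trusted_prefix(doi: str, trusted_prefixes: Set[str]) -> bool:
    """True if the DOI was registered under one of the prefixes we recognize."""
    prefix, _suffix = split_doi(doi)
    return prefix in trusted_prefixes


if __name__ == "__main__":
    trusted = {HYPOTHETICAL_GPO_PREFIX}
    print(has_trusted_prefix("10.99999/cong-record/2010-01", trusted))
    print(has_trusted_prefix("10.12345/some-other-doc", trusted))
```

Consistent with the argument above, the check itself proves nothing cryptographically. It only tells the user which organization stands behind the identifier, and the trust comes from that organization.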

Precedents. GPO would not be alone in using DOIs. Who uses DOIs? ICPSR, OECD, the European Communities’ EU publications office, CrossRef, and many others.

Barriers. The main barrier I can see to adopting DOIs is cost. I assume that implementing DOIs will cost more than implementing PURLs. But the two costs cannot be compared directly because the costs buy different things. Implementing PURLs gets us a fragile redirection system. Implementing DOIs gets us a redirection system of persistent identifiers, the ability to have multiple redirects to multiple copies, and a new way of thinking about authenticity.

I welcome comments and responses to my question and particularly hope that those with more knowledge in this area will fill in the gaps I have left.


Who do you Trust? The Authentication Problem

How do we know when a digital document is “authentic”? While many in the library and academic communities hope that there will be a technological solution, the reality is that technology alone cannot solve the problem of authenticity. A report this week of research at a Chinese university illuminates one reason for this: technical tools are subject to failure, compromise, forgery, and hacking.

The report, published by CNET, describes a flaw in an official federal standard that was originally devised by the National Security Agency and is widely used to create and verify digital signatures in e-mail and on the Web. In fact, it is embedded in every modern Web browser and operating system. The article notes that, while the flaw that Chinese scientists discovered in the “Secure Hash Algorithm” is “theoretical,” it will eventually make it easier to forge electronic signatures.

But authenticity requires more than secure software. Even if we had a tool that could never be hacked and that would last forever, we would still only have part of a solution: the technical part. The other part of the solution is social: it is the issue of Trust.

Software provides the technical part of the solution

The technology of authentication provides a way to verify that a document is what it purports to be and to determine whether it has been altered. Document-creators can use software to create special files (called “hashes” or “signatures” or “keys”) based on the original document. These special files are typically stored with a “trusted third party” that is neither the document creator nor the recipient. Document-users can then use software to check the authenticity of the document in hand against that “hash.” The software is able to determine only whether the document in hand is identical to the original. Even the smallest change (e.g., the insertion or removal of a blank space) will result in a report that the documents are not identical.
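A minimal Python sketch of that check, using the standard library's `hashlib`. SHA-256 is chosen here only as an example of a hash function; the post does not say which algorithm any particular system uses, and the function names and sample text are invented for illustration.

```python
import hashlib


def fingerprint(document: bytes) -> str:
    """Compute a fixed-length hash of the document's exact bytes."""
    return hashlib.sha256(document).hexdigest()


def matches_original(document: bytes, recorded_hash: str) -> bool:
    """True only if the document is byte-for-byte identical to the original."""
    return fingerprint(document) == recorded_hash


if __name__ == "__main__":
    original = b"Section 1. All legislative powers herein granted"
    recorded = fingerprint(original)  # what the creator would deposit

    altered = b"Section 1. All legislative  powers herein granted"  # one extra space

    print(matches_original(original, recorded))  # identical copy
    print(matches_original(altered, recorded))   # copy with an inserted space
```

The check answers one narrow question, whether these bytes match the bytes that were hashed. It says nothing about who created the recorded hash or whether that record can be trusted.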

Trust is the social part of the solution

But this technological check does not solve the authentication problem by itself. The check against the hash is only as reliable as the trusted third party. The software just gives us a technical means of shifting who we trust — instead of trusting the party that delivered the document to us, for example, we trust a third party that tells us that the hash is correct and authentic. If the hash isn’t authentic and unchanged, the check against the hash is worthless.
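The dependence on the third party can be shown directly. In this hedged sketch the "registry" is just a Python dictionary standing in for whoever holds the recorded hashes; the names and sample documents are invented, and SHA-256 is again only an example algorithm.

```python
import hashlib
from typing import Dict


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify(document: bytes, registry: Dict[str, str], doc_id: str) -> bool:
    """Check a document against whatever hash the registry reports for it."""
    return registry.get(doc_id) == sha256_hex(document)


GENUINE = b"The committee voted to approve the measure."
FORGED = b"The committee voted to reject the measure."

# An honest third party recorded the hash of the genuine document.
honest_registry = {"report-1": sha256_hex(GENUINE)}

# A compromised third party now reports the hash of the forgery instead.
compromised_registry = {"report-1": sha256_hex(FORGED)}

if __name__ == "__main__":
    print(verify(FORGED, honest_registry, "report-1"))       # forgery is caught
    print(verify(FORGED, compromised_registry, "report-1"))  # forgery passes
    print(verify(GENUINE, compromised_registry, "report-1")) # genuine copy fails
```

The software behaves identically in all three calls. The only thing that changed is whether the party holding the hash deserved trust.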

This concept of a trusted third party is, therefore, an essential component of the authentication chain. That should lead us to an important question: who will we choose as our trusted third parties? This is important because the tools only work if we can trust the third party to do its job. In the case of government information essential to our democracy, this trust has to last forever.

Who do you trust?

Ask yourself: who in society is the most trusted third party for delivering information? The government? The press? Publishers? Technology companies like Microsoft and Verizon?

What about libraries?

Now ask yourself what we will do if we assume that technological verification is all we need to ensure authenticity and then find one day that the tools have failed, as described in the CNET article.

A Social Solution built on Trusted Institutions and Legal Deposit

Trust is a social phenomenon, not a technical one. What if, instead of putting all our faith in a potential technological “solution” for ensuring the authenticity of government documents, we relied on the existing infrastructure of depository libraries to ensure authenticity through their collective possession of multiple copies of digital government publications, distributed by GPO at the time of their publication under the legal mandate of 44 USC?

This solution promises to be a sound, sustainable one because it relies on libraries as the trusted repository of information. Libraries have a long, well-established social role of providing information; people trust libraries because of it. Libraries have a vested interest in ensuring that the information they provide is authentic and people trust them to do so because it is their primary mission — not a byproduct of publishing or making money or the various missions of government agencies.

The trust people place in libraries in general can be increased in the digital environment by relying, not on one or two libraries, but on many libraries with different funding streams and missions. Any unforeseen compromise in one institution becomes a single error in a large system of information-provision. (See Article outlines bottom-up standards for digital preservation systems.) Even in the paper and ink world, forgeries are possible — though more difficult than in the digital world — and one important way we determine authenticity is by comparing multiple copies.
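Comparing multiple copies can also be sketched in code. The following is a simple majority comparison of hashes across independently held copies, written as an illustration under stated assumptions; it is not the protocol of any existing preservation system, and the library names, function name, and sample text are invented.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Optional, Tuple


def audit_copies(copies: Dict[str, bytes]) -> Tuple[Optional[str], List[str]]:
    """Compare copies held by different institutions.

    Returns (consensus_hash, holders_that_disagree). If no strict majority
    of holders agree, the consensus is None and every holder is listed.
    """
    if not copies:
        raise ValueError("no copies to compare")
    digests = {holder: hashlib.sha256(content).hexdigest()
               for holder, content in copies.items()}
    digest, votes = Counter(digests.values()).most_common(1)[0]
    if votes * 2 <= len(copies):
        return None, sorted(digests)
    return digest, sorted(h for h, d in digests.items() if d != digest)


if __name__ == "__main__":
    held = {
        "Library A": b"Public Law text, as published",
        "Library B": b"Public Law text, as published",
        "Library C": b"Public Law text, as publishes",  # one altered copy
        "Library D": b"Public Law text, as published",
    }
    consensus, outliers = audit_copies(held)
    print(consensus is not None, outliers)
```

A compromise at one institution shows up as a single outlier against the agreement of the others. The more independently run holders there are, the harder it is to alter a majority of copies unnoticed.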

A different approach

This approach is subtly different from the approach of hoping for a technological solution to authenticity. It recognizes that the social issue of trust (along with the existence of multiple copies controlled by different parties) is paramount and the role of technology is secondary. The role of technology is simply to provide tools to help implement that trust. Indeed, if we used this social-trust legal-digital-deposit approach, libraries would still use technical tools (e.g., LOCKSS, PKI, state-of-the-art hash technologies) to validate the integrity of digital files. Combine these tools with trusted institutions, legal deposit, and multiple copies under multiple jurisdictions and you have a fail-safe recipe for ensuring authenticity.


The problem with hoping for a technological solution was clearly articulated back in 2000 by Abby Smith, Director of Programs at the Council on Library and Information Resources.

Interestingly, the scholar-participants suggested that technological solutions to the problem [of establishing the authenticity of a digital object] will probably emerge that would obviate the need for trusted third parties. Such solutions may include, for example, embedding texts, documents, images, and the like with various warrants (e.g., time stamps, encryption, digital signatures, and watermarks). The technologists replied with skepticism, saying that there is no technological solution that does not itself involve the transfer of trust to a third party. Encryption — for example, public key infrastructure (PKI) — and digital signatures are simply means of transferring risk to a trusted third party. Those technological solutions are as weak or as strong as the trusted third party. To devise technical solutions to what is, in their view, essentially a social challenge is to engender an “arms race” among hackers and their police.
Digital Authenticity in Perspective in “Authenticity in a Digital Environment,” Council on Library and Information Resources, Publication 92. (May 2000).

James A. Jacobs, November 3, 2005