
Tag Archives: purl

Our mission

Free Government Information (FGI) is a place for initiating dialogue and building consensus among the various players (libraries, government agencies, non-profit organizations, researchers, journalists, etc.) who have a stake in the preservation of and perpetual free access to government information. FGI promotes free government information through collaboration, education, advocacy and research.

Happy 2023! The state of government information in 2023

Happy new year 2023! We hope all our readers had a relaxing holiday break and are ready to get back to the important work of preserving government information and assuring its long-term access!

In the latest First Branch Forecast (you really should subscribe to this important newsletter if you haven’t already!), a side comment about the findings of the January 6th Committee caught our attention. In discussing the release of the Committee’s final reports, as well as the many witness transcripts, Daniel Schuman noted, “We’re linking to the PDF on the Wayback Machine because the Committee’s website will be toast in early January.”

This is the troubling reality we find ourselves in. Digital government information turns out to be extremely fragile and dependent on the political winds of Washington, DC. The Government Publishing Office (GPO) captured the committee’s final report and various hearings, though NOT the witness testimony transcripts that the committee released on its website (which I’m also linking to on the Wayback Machine!). The final report has already been published commercially (in this case by the New Yorker and Celadon Books), and it will no doubt be saved by the Library of Congress, NARA, and various libraries around the country. But each of those copies will have its own URL rather than the official URL from the committee that actually did the work. It would be amazing if there were a system of persistent identifiers (PIDs) that stay permanent and point to all the copies, in the same way that DOIs work for journal articles. Many of my depository library colleagues and I are working hard on putting a system like this in place for US government information. It was one of FGI’s resolutions for 2020, and I’ve been busy working on the Depository Library Council (DLC) Working Group Exploring the Durability of PURLs and Their Alternatives (charge). The working group is finishing up its work and will soon release its final report and recommendations.

Let’s hope that 2023 is the year that electronic government information is collected, preserved, and made easily accessible for the public!

New year’s resolutions for 2020: setting a new agenda for a new FDLP

Happy 2020! Now that we’re starting a new decade(!), with GPO having set up a working group to study and consider digital deposit and the Depository Library Council (DLC) soon to announce its PURL working group, it is time for FGI to make its new year’s resolutions and envision a new agenda for a new Federal Depository Library Program (FDLP). This new digital FDLP will focus on the digital needs of users by building digital services based on digital collections. It will lead the way for libraries of all kinds, showing the value of digital libraries in the twenty-first century.



GPO to migrate to PURLZ

Well, this is good news for anyone who remembers the great GPO PURL crash of ought-9. GPO just announced the contract for upgrading their permanent URL architecture, migrating to PURLZ. I hope they’ll build participation by depository libraries into their new architecture. It would be great to have a failsafe on non-.gov servers as well.

GPO is pleased to announce that a contract for upgrading the PURL Server architecture and hosting the new solution has been awarded to Zepheira. The new PURL architecture provides greater flexibility, new features, and the scalability to face an increased demand for PURLs. GPO is currently in the process of migrating to this new architecture (PURLS to PURLZ.)

This transition boasts many benefits, including:

* A more robust system architecture
* Immediate system back-up through synchronization
* Immediate system fail-over
* Enhanced statistical reporting
* Enhanced Web referral reporting
* Improved speed for resolution of redirects

The targeted date for the transition to this new architecture is summer 2010.

More information will be forthcoming at the Spring Depository Library Council Meeting in Buffalo, New York (April 26 – 28, 2010) and via the FDLP Desktop and fdlp-l.

Could DOIs Solve Three Depository Challenges?

I am not an expert on Digital Object Identifiers (DOIs), Handles, or other methods of creating permanent, persistent links to information on the web, so I pose this as a question. Could DOIs help solve three problems that, if solved, would provide better preservation, better access, and a better user experience?

The three challenges are:
1. The need for reliable, permanent, persistent links.
2. The need to provide a simple user interface to depository collections.
3. The need to guarantee authenticity of government information.

Here is why I think the answer is Yes.

Problem: Providing reliable, permanent, persistent links. Currently, GPO uses PURLs (Persistent Uniform Resource Locators) for creating permanent links. PURLs provide “persistent” links so that, when a page moves and its URL changes, that change need only be recorded once, in the PURL database, and all the hundreds or thousands of links to the PURL resolve to the new address automatically without being changed themselves. This is an efficient way to deal with the dynamic nature of web addresses, and the system works, but it is fragile. We had a graphic demonstration of that fragility last August when the GPO PURL server crashed. When that happened, no one anywhere in the world who relied on the 115,000 PURLs pointing to government information could reach that information through those links for more than two weeks. This was not the fault of GPO (although restoration time could have been reduced with better disaster planning). Rather, the very nature of PURLs makes them fragile in this way and vulnerable to the crash of a single server.
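
To make the mechanism concrete, here is a minimal sketch of the idea, not GPO's actual software. The class name, the PURL path, and the example.gov addresses are all made up for illustration. It shows both halves of the argument: one update in the table fixes every link, and one server outage breaks every link.

```python
class PurlResolver:
    """Toy in-memory PURL table: persistent path -> current target URL.

    Hypothetical illustration only; a real PURL server answers HTTP
    requests with redirects instead of returning strings.
    """

    def __init__(self):
        self._targets = {}
        self.online = True  # flip to False to simulate a server crash

    def register(self, purl_path, target_url):
        # Creating or updating a PURL is a single write to one table.
        self._targets[purl_path] = target_url

    def resolve(self, purl_path):
        if not self.online:
            raise ConnectionError("PURL server is down")
        return self._targets[purl_path]


resolver = PurlResolver()
resolver.register("/gpo/example123", "https://agency.example.gov/old/report.pdf")

# The document moves. Only the table entry changes; every catalog record
# and web page that cites /gpo/example123 stays exactly as it was.
resolver.register("/gpo/example123", "https://agency.example.gov/new/report.pdf")
current = resolver.resolve("/gpo/example123")

# The weakness: with a single server, an outage blocks every lookup.
resolver.online = False
try:
    resolver.resolve("/gpo/example123")
    outage_blocked_lookup = False
except ConnectionError:
    outage_blocked_lookup = True
```

The target document in this sketch is still sitting safely on the agency server during the outage. What fails is only the lookup, which is exactly what happened during the crash described above.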

Solution: Persistence is a function of organizations, not of technology. DOIs address the fragility problem by building a social structure that guarantees persistence. As the DOI organization says, “Persistence is a function of organizations, not of technology; a persistent identifier system requires a persistent organization and defined processes. The International DOI Foundation (IDF) provides a federation of Registration Agencies (RA). Dependency on any one RA is removed.” In other words, if one server crashes, others are available immediately. Rather than relying on a single organization (GPO) and a single server at that organization, DOIs rely on multiple Registration Agencies and multiple servers. DOIs are reliable because they use redundancy and have no single points of failure (Wikipedia).
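
The redundancy argument can be sketched in a few lines. This is not how the DOI infrastructure is actually implemented; it is only an illustration of "if one server crashes, others are available," and the resolver names, the identifier, and the example.org URL are assumptions invented for the example.

```python
def resolve_with_failover(identifier, resolvers):
    """Try each (name, lookup) pair in order; return the first answer.

    `resolvers` is a list of hypothetical lookup functions standing in
    for independent servers run by different organizations.
    """
    errors = []
    for name, lookup in resolvers:
        try:
            return name, lookup(identifier)
        except ConnectionError as exc:
            errors.append(f"{name}: {exc}")
    raise ConnectionError("all resolvers failed: " + "; ".join(errors))


# Made-up identifier and target, replicated on the healthy mirror.
TABLE = {"10.99999/example-doc": "https://repository.example.org/doc.pdf"}


def crashed_server(identifier):
    # Simulates the server that is down.
    raise ConnectionError("server unreachable")


def healthy_mirror(identifier):
    # Simulates a second server holding the same table.
    return TABLE[identifier]


answered_by, url = resolve_with_failover(
    "10.99999/example-doc",
    [("primary", crashed_server), ("mirror", healthy_mirror)],
)
```

The code is the easy part. Someone has to keep the mirror running and its table in sync, which is the organizational commitment the IDF quote is about.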

Problem: Providing a simple user interface. Imagine with me for a moment a depository system that deposits digital documents in FDLP libraries. Once such a system is in place, we will have the same document in multiple locations — perhaps one copy in GPO’s Federal Digital System (FDsys), one copy in each of a dozen or more FDLP libraries, perhaps an “original” copy at house.gov or senate.gov, and so forth. What is the user to do? Will libraries show dozens of links with an explanation after each as to what it is and hope users will have the patience to read the explanations, make an informed decision, and, if that particular link is down, go back and repeat the process? This sounds like a lousy user experience to me.

Solution: Multiple redirections. DOIs provide a way to resolve multiple URLs with a single DOI (Resolution of Multiple URLs). This would mean that multiple copies of digital documents could be stored at many separate FDLP libraries and all could use a single, clickable link (a DOI) that would get users a copy of that document chosen by criteria the library defines. For example, one library might have the DOI point to the original first and the local library copy second; another library might point to the “network-closest” copy first and then other more distant copies; and so forth. DOIs do this by storing metadata with the DOI. Rather than storing only the current URL of a registered item, DOIs can record a list of locations with hints for how the resolving client should select a location, including an ordered set of selection methods.
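
As a rough sketch of what "a list of locations plus an ordered set of selection methods" could look like in practice, consider the following. It is not the DOI system's real data model. The field names, roles, and example URLs are assumptions chosen to mirror the library scenario above.

```python
from dataclasses import dataclass


@dataclass
class Location:
    url: str
    role: str     # hypothetical labels: "original", "gpo-copy", "fdlp-copy"
    holder: str   # who stores this copy
    available: bool = True


def pick_location(locations, preferences):
    """Return the first available location matching the ordered preferences.

    `preferences` is an ordered list of predicates. Earlier predicates
    win, so each library can express its own policy.
    """
    for wanted in preferences:
        for loc in locations:
            if loc.available and wanted(loc):
                return loc
    return None


# One identifier, three copies (all addresses are invented).
record = [
    Location("https://committee.example.gov/report.pdf", "original", "committee"),
    Location("https://fdsys.example.gov/report.pdf", "gpo-copy", "GPO"),
    Location("https://library-a.example.edu/report.pdf", "fdlp-copy", "Library A"),
]

# Library A's policy: the original first, then its own copy, then anything.
library_a_prefs = [
    lambda loc: loc.role == "original",
    lambda loc: loc.holder == "Library A",
    lambda loc: True,
]

first_choice = pick_location(record, library_a_prefs)

# The committee site goes away; the same link now lands on the local copy.
record[0].available = False
second_choice = pick_location(record, library_a_prefs)
```

The user sees one link in both cases. The choice among copies, and the fallback when one disappears, happens behind it.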

Here is an illustration of how it works:

This solution would have the added benefit of enabling and facilitating a true digital depository system in which digital information is deposited into FDLP libraries. FGI is a strong advocate of a depository system that does this for several reasons that we have described repeatedly here and in our writings and presentations. In brief, we believe that this would make it possible for individual FDLP libraries to build their own local digital collections focused on the needs of their own user communities; it would aid preservation by ensuring that multiple copies exist under different technical, financial, and administrative structures; and it would create a better user experience by providing a way to integrate digital FDLP/Title-44 documents with non-Title-44 federal documents, documents from state and local governments, and other non-government information. DOIs would, therefore, facilitate preservation as well as access.

Problem: Guarantee Authenticity. How does a user know that the document they just retrieved is “authentic,” that it has not been altered, that it really is what it purports to be? Many people hope for a technological solution (e.g., PKI, time stamps, encryption, digital signatures, watermarks). We at FGI believe that these are techniques that people use and that the authenticity comes, not from the technique, but from users’ trust in the people who set up the techniques.

No one explained this better than Abby Smith (Digital Authenticity in Perspective in “Authenticity in a Digital Environment,” Council on Library and Information Resources, Publication 92. May 2000). She noted that, when technologists were asked about how to establish the authenticity of a digital object, they were skeptical of technological “solutions” and said that “there is no technological solution that does not itself involve the transfer of trust to a third party.”

Solution: Trust is a social phenomenon, not a technical one. So, imagine how this might work. Imagine a document that is in FDsys, and in the digital collections of several FDLP libraries, and also at the New York Times, and at any number of other places on the web. There might be a dozen URLs for that one document. But, if GPO assigned a single DOI to it and made sure it pointed to FDsys and to “Official Depository Copies” at FDLs, that one DOI would, by definition, point to “authentic” copies — the original and those officially deposited in Title-44-authorized Federal Depository Libraries. The “prefix” part of a DOI refers to the registering agency (in this case GPO) and would further help “brand” the DOI as authentic. Users wanting “authentic” government information would look for DOIs bearing the GPO prefix — and they would find what they wanted with a single click, no matter where the particular copy they get is stored. (In addition, the DOI metadata can include authenticity information.)
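
Checking a prefix is simple to sketch. A DOI is written as a prefix beginning with "10.", then a slash, then a suffix. The prefix value below is purely hypothetical (I do not know of a prefix assigned to GPO), and the function names are mine.

```python
HYPOTHETICAL_GPO_PREFIX = "10.99999"  # made up for illustration


def split_doi(doi):
    """Split a DOI into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


def has_gpo_prefix(doi):
    """True if the DOI was registered under the (hypothetical) GPO prefix."""
    prefix, _ = split_doi(doi)
    return prefix == HYPOTHETICAL_GPO_PREFIX


official = has_gpo_prefix("10.99999/fdsys-report-1")
unofficial = has_gpo_prefix("10.12345/reprint-of-report")
parts = split_doi("10.99999/series/vol-2")
```

Consistent with the argument above, the check itself proves nothing technical about the document. It tells the user who registered the identifier, and the trust comes from knowing who stands behind that prefix.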

Precedents. GPO would not be alone in using DOIs. Who uses DOIs? ICPSR, OECD, the European Communities’ EU publications office, CrossRef, and many others.

Barriers. The main barrier I can see to adopting DOIs is cost. I assume that implementing DOIs will cost more than implementing PURLs. But the two costs cannot be compared directly because they buy different things. Implementing PURLs gets us a fragile redirection system. Implementing DOIs gets us a redirection system of persistent identifiers, the ability to have multiple redirects to multiple copies, and a new way of thinking about authenticity.

I welcome comments and responses to my question and particularly hope that those with more knowledge in this area will fill in the gaps I have left.