
Tag Archives: distributed digital infrastructure

Our mission

Free Government Information (FGI) is a place for initiating dialogue and building consensus among the various players (libraries, government agencies, non-profit organizations, researchers, journalists, etc.) who have a stake in the preservation of and perpetual free access to government information. FGI promotes free government information through collaboration, education, advocacy and research.

Letter to Deputy CTO Noveck: “Open Government Publications”

[UPDATE 9/23/11: It’s come to our attention that scribd, the site that hosts the document below, does not make it easy for users to download. In some instances it appears as if the user has to subscribe to scribd before they can download. So I’ve attached a copy of the document below for your free downloading pleasure. JRJ]

In early April, Michael Keller, Stanford University Librarian and my boss, had a phone conversation with Beth Simone Noveck, US Deputy Chief Technology Officer for Open Government, who leads President Obama’s Open Government Initiative. Noveck requested a short report outlining how the digital FDLP would work.

Below is that report outlining a distributed ecosystem, or publications.gov, that “would incorporate collaborative cataloging/metadata creation, as well as shared or Peer-to-Peer (P2P) technical infrastructure in which data and technological redundancy and collective and proactive action reign.” As many of you already know, some of the pieces for a digital FDLP ecosystem are already in place. However, as our recent post, “The State of FDsys and the Future of the FDLP”, showed, some of those critical pieces are on shaky ground to say the least.

The report was forwarded to Bob Tapella and Mike Wash at GPO as well as Aneesh Chopra, Chief Technology Officer (CTO), Vivek Kundra, Chief Information Officer (CIO), and US Archivist David Ferriero.

FDLP issues are now front and center for the movers and shakers in the Obama administration. But we’ll need more libraries and librarians willing to step up and pitch in to make the digital FDLP ecosystem a reality.

Digital FDLP Ecosystem

Critical GPO systems and the FDLP cloud

[Update: 10/13/09: I’ve revised my thinking on the cloud as the term is loaded and doesn’t really mean what I’m describing. A friend from the San Diego Supercomputer Center said, “some greybeards are going back to the original metaphor: the grid” and suggested the term “shared digital libraries” which is good. But what I’m describing is more like a biological ecosystem, the FDLP ecosystem. jrj]

Last week’s GPO PURL server crash should be disconcerting to both the documents community and the public at large (in fact, although the hardware has been restored, resolution is ongoing as I write). I know GPO staff are just as worried about this and are doing everything they can to fix the PURL server.

“The PURL Server is currently inaccessible. GPO is working with IT staff to restore service as soon as possible. We regret any inconvenience caused by the server problems. An updated listserv will be sent once service is restored.”

But in the meantime, there are 1250+ library catalogs and innumerable links to government documents that are not working. The crash of a critical piece of GPO’s infrastructure brings a couple of things to mind:

1) What worries me is that FDsys, for all its supposed upgrades in hardware, software, and systems design, is for all intents and purposes the same as GPOaccess. That is, FDsys is a monolith where the failure of one piece can cause the whole system to grind to a halt. As our readers know, we’ve been advocating for a long time for a distributed digital FDLP (a *true* “digital depository” system!). We’re heartened by what we see of FDsys so far, but we need to be building a system with built-in redundancies.

I envision a collaborative and distributed system of digital content, collaborative cataloging/metadata creation, and shared technical infrastructure. With this kind of system in place, a failed PURL server would cause only a momentary blip in service while a backup PURL server kicks in, instead of an outage lasting several weeks or more. How many system degradations (WAIS) and failures (PURL server) will it take until we shift our thinking from “client-server” (with libraries decidedly on the “client” side of the equation) to “peer-to-peer” concepts and build systems with built-in redundancies that mirror what the FDLP has been for the last 150 years? How long before we build an FDLP cloud?
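To make the “momentary blip” idea concrete, here is a minimal sketch of failover resolution, in Python. It is an illustration under stated assumptions, not a description of how GPO's PURL server or any existing mirror works: the `ResolverDown` error, the two resolver functions, and the `example.org` addresses are all hypothetical stand-ins.

```python
from typing import Callable, Iterable, Optional

# A resolver takes a PURL and returns its target URL, or raises on failure.
Resolver = Callable[[str], str]


class ResolverDown(Exception):
    """Hypothetical error raised when a resolver cannot be reached."""


def resolve_with_failover(purl: str, resolvers: Iterable[Resolver]) -> Optional[str]:
    """Try each resolver in order and return the first target URL found."""
    for resolve in resolvers:
        try:
            return resolve(purl)
        except (ResolverDown, KeyError):
            # This peer is down or does not know the PURL; try the next one.
            continue
    return None  # no peer could resolve the PURL


# Hypothetical stand-in for the central server during an outage.
def primary(purl: str) -> str:
    raise ResolverDown("primary PURL server is unreachable")


# Hypothetical copy of the PURL table held by a depository library.
MIRROR_TABLE = {
    "http://purl.example.org/GPO/LPS0001": "http://agency.example.org/report.pdf",
}


def library_mirror(purl: str) -> str:
    return MIRROR_TABLE[purl]  # raises KeyError if the mirror lacks this PURL


target = resolve_with_failover(
    "http://purl.example.org/GPO/LPS0001", [primary, library_mirror]
)
print(target)
```

The design choice worth noticing is that the caller only sees a list of peers. Adding another library's mirror means adding one more entry to that list, which is the peer-to-peer redundancy argued for above.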

(This post was updated on October 26, 2017. The image was originally uploaded to Scribd, but Scribd deleted the document from their servers at some point, probably when they went to a pay-to-play model :-|. JRJ)

2) There was an interesting discussion of the PURL server outage on the code4lib list, including a good technical workaround (email pasted below). It is a reminder that what we do within the FDLP has an effect on others in the wider library community (not to mention the public at large!) and that “our” content and the systems built to serve that content are critical for the work of others, whether we know it or not. It also points to the need for us to reach out to those communities in order to build systems of use both to end users and to those building other systems, mashups, repositories, etc. So I would highly recommend that we be *more* proactive in connecting with other communities within the library world (LITA, CODE4LIB, WEB4LIB, ACRL, state associations, etc.) as well as outside the FDLP (the government transparency community, historians and other academic communities, journalists, etc.).

—————— CODE4LIB POST (with added info by James re MARC view) ——————————–

Thanks to everyone who helped me confirm that the GPO PURL server is down. An official announcement on the GPO Listserv said:

“The PURL Server is currently inaccessible. GPO is working with IT staff to restore service as soon as possible. We regret any inconvenience caused by the server problems. An updated listserv will be sent once service is restored.”

While the server is down, here is one workaround (thanks to Patricia Duplantis):

  1. Copy the purl link listed in your library catalog
  2. Go to http://catalog.gpo.gov/
  3. Click “Advanced Search”
  4. Search for word in “URL/PURL”, enter the PURL
  5. Click “Go”
  6. In MARC view, the original URL at the time of cataloging should appear in a 53x note.
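
[Added sketch: step 6 of the workaround above could be automated roughly as follows. This is an illustration only. It assumes the record has already been exported into a simplified list of (tag, text) pairs, which is a hypothetical layout and not the catalog's actual export format, and the sample record and `example.org` addresses are made up.]

```python
import re
from typing import List, Tuple

# Hypothetical simplified record: a list of (MARC tag, field text) pairs.
Record = List[Tuple[str, str]]

# Loose pattern for http/https URLs embedded in free-text notes.
URL_PATTERN = re.compile(r"https?://[^\s<>\"]+")


def urls_in_53x_notes(record: Record) -> List[str]:
    """Return every URL found in fields whose tag starts with '53'."""
    found: List[str] = []
    for tag, text in record:
        if tag.startswith("53"):
            for url in URL_PATTERN.findall(text):
                # Drop sentence punctuation that may trail a URL in a note.
                found.append(url.rstrip(".,;)"))
    return found


# Made-up sample record for illustration.
sample_record: Record = [
    ("245", "Annual report"),
    ("530", "Also available online. Address as of 1/5/09: http://agency.example.org/report.pdf."),
    ("856", "http://purl.example.org/GPO/LPS0001"),
]

print(urls_in_53x_notes(sample_record))
```

[The 856 field is deliberately ignored here, since during an outage it holds the PURL that no longer resolves.]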

This incident, however, illuminates a weakness in PURL systems: access is broken when the PURL server breaks, even though the documents are still online at their original URLs.

Maybe someone more familiar with PURL systems can tell me… is there any way to harvest data from a PURL server, so that a backup/mirror can be available?
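
One hedged way to think about the harvesting question: even without knowing whether a PURL server offers a bulk export, a library could record where each PURL in its own catalog points while the server is up, and keep that table as a backup. The Python sketch below assumes exactly that. The `lookup` function is injected and hypothetical (in practice it might request the PURL without following redirects and read the Location header), and the PURLs and targets shown are made up.

```python
import json
from typing import Callable, Dict, Iterable


def snapshot_purls(purls: Iterable[str], lookup: Callable[[str], str]) -> Dict[str, str]:
    """Record the current target of each PURL while the server is reachable."""
    table: Dict[str, str] = {}
    for purl in purls:
        try:
            table[purl] = lookup(purl)
        except LookupError:
            # Skip PURLs that cannot be resolved; keep harvesting the rest.
            continue
    return table


def dump_snapshot(table: Dict[str, str]) -> str:
    """Serialize the table so it can be stored or shared with other libraries."""
    return json.dumps(table, indent=2, sort_keys=True)


def load_snapshot(text: str) -> Dict[str, str]:
    """Read a previously saved table back into memory."""
    return json.loads(text)


# Made-up PURLs taken from a hypothetical library catalog.
CATALOG_PURLS = [
    "http://purl.example.org/GPO/LPS0001",
    "http://purl.example.org/GPO/LPS0002",
]

# Stand-in for the live server: it only knows the first PURL.
LIVE_TARGETS = {
    "http://purl.example.org/GPO/LPS0001": "http://agency.example.org/report.pdf",
}


def stub_lookup(purl: str) -> str:
    return LIVE_TARGETS[purl]  # KeyError (a LookupError) for unknown PURLs


backup = snapshot_purls(CATALOG_PURLS, stub_lookup)
restored = load_snapshot(dump_snapshot(backup))
print(restored)
```

A table saved this way is only as fresh as its last harvest, so it would need to be refreshed on a schedule to stay useful as a mirror.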


–that is all.