distributed digital infrastructure
[UPDATE 9/23/11: It's come to our attention that Scribd, the site that hosts the document below, does not make it easy for users to download the document. In some instances it appears that users have to subscribe to Scribd before they can download. So I've attached a copy of the document below for your free downloading pleasure. JRJ]
In early April, Michael Keller, Stanford University Librarian and my boss, had a phone conversation with Beth Simone Noveck, US Deputy Chief Technology Officer for Open Government, who leads President Obama's Open Government Initiative. Noveck requested a short report outlining how the digital FDLP would work.
Below is that report outlining a distributed ecosystem, or publications.gov, that "would incorporate collaborative cataloging/metadata creation, as well as shared or Peer-to-Peer (P2P) technical infrastructure in which data and technological redundancy and collective and proactive action reign." As many of you already know, some of the pieces for a digital FDLP ecosystem are already in place. However, as our recent post, "The State of FDsys and the Future of the FDLP", showed, some of those critical pieces are on shaky ground to say the least.
The report was forwarded to Bob Tapella and Mike Wash at GPO as well as Aneesh Chopra, Chief Technology Officer (CTO), Vivek Kundra, Chief Information Officer (CIO), and US Archivist David Ferriero.
FDLP issues are now front and center to the movers and shakers in the Obama administration. But we'll need more libraries and librarians willing to step up and pitch in to make the digital FDLP ecosystem a reality.
Digital FDLP Ecosystem
[Update: 10/13/09: I've revised my thinking on the cloud as the term is loaded and doesn't really mean what I'm describing. A friend from the San Diego Supercomputer Center said, "some greybeards are going back to the original metaphor: the grid" and suggested the term "shared digital libraries" which is good. But what I'm describing is more like a biological ecosystem, the FDLP ecosystem. jrj]
Last week's GPO purl server crash should be disconcerting to both the documents community and the public at large (in fact, although the hardware's been restored, resolution is ongoing as I write). I know GPO staff are just as worried about this and are doing everything they can to fix the purl server.
"The PURL Server is currently inaccessible. GPO is working with IT staff to restore service as soon as possible. We regret any inconvenience caused by the server problems. An updated listserv will be sent once service is restored."
But in the meantime, there are 1250+ library catalogs and innumerable links to government documents that are not working. The crash of a critical piece of GPO's infrastructure brings a couple of things to mind:
1) What worries me about this is that FDsys and its supposed upgrade in hardware/software/systems design is for all intents and purposes the same as GPOaccess. That is, FDsys is a monolith where the failure of one piece can cause the whole system to grind to a halt. As our readers know, we've been advocating for a long time for a distributed digital FDLP (a *true* "digital depository" system!). We're heartened by what we see of FDsys so far, but we need to be building a system with built-in redundancies.
I envision a collaborative and distributed system of digital content, collaborative cataloging/metadata creation, as well as technical infrastructure. With this kind of system in place, a failed purl server would cause only a momentary blip in service as a backup purl server kicks in, instead of an outage lasting several weeks or more. How many system degradations (WAIS) and failures (purl server) until we shift our thinking from "client-server" (with libraries decidedly on the "client" side of the equation) to "peer-to-peer" concepts and build systems with built-in redundancies that mirror what the FDLP has been for the last 150 years? How long before we build an FDLP cloud?
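To make the "backup purl server kicks in" idea a bit more concrete, here is a minimal sketch in Python of what failover across peer resolvers might look like. Everything in it is an assumption for illustration only: the function names, the sample purl path, and the example.org target are hypothetical, and the in-memory lookup table stands in for real servers. It is not how GPO's purl server works.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical: a resolver takes a purl path and returns the target URL,
# or raises ConnectionError when that server cannot be reached.
Resolver = Callable[[str], str]


class AllResolversDown(Exception):
    """Raised when no resolver in the list could be reached."""


def make_resolver(table: Dict[str, str], up: bool = True) -> Resolver:
    """Build a stand-in resolver backed by an in-memory lookup table."""
    def resolve(path: str) -> str:
        if not up:
            raise ConnectionError("server unreachable")
        return table[path]  # an unknown path raises KeyError and is not retried
    return resolve


def resolve_with_failover(
    path: str, resolvers: List[Tuple[str, Resolver]]
) -> Tuple[str, str]:
    """Try each resolver in order; return (resolver name, target URL)."""
    errors = []
    for name, resolver in resolvers:
        try:
            return name, resolver(path)
        except ConnectionError as exc:
            # This peer is down, so note the failure and move to the next one.
            errors.append(f"{name}: {exc}")
    raise AllResolversDown("; ".join(errors))


# Hypothetical data: one purl path, replicated on two peers.
table = {"/GPO/LPS0000": "https://example.org/docs/report.pdf"}
peers = [
    ("central", make_resolver(table, up=False)),  # simulate the crashed server
    ("peer-library", make_resolver(table)),       # a library-hosted replica
]
served_by, target = resolve_with_failover("/GPO/LPS0000", peers)
```

In this sketch the crashed central server costs the user one failed attempt, and the library-hosted replica answers the request. The hard part in practice is not this loop but keeping every peer's copy of the lookup table in sync, which is exactly the kind of collaborative work a distributed FDLP would have to take on.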