fdsys

Bringing GPO into the Digital World

Mike Wash, CIO for the Government Printing Office, is the Washington Post's Federal Player of the Week.

Also interesting, this article was jointly prepared by the Washington Post and the Partnership for Public Service, which strives to improve federal government performance and recognize the good, hard work of our public servants. In an environment replete with knee-jerk anti-government sentiment, such work seems especially important for enhancing the public's understanding of government professionals' work and impact. Another similar organization I follow is Understanding Government, which supports excellence in reporting of the executive branch. Of course, we need our watchdogs and gadflies to push back against government secrecy, but we also need to know when people at all levels of government are doing things well.

FDsys enhanced functionality and collections update

FYI the Government Printing Office (GPO) has announced a scheduled maintenance and update of their Federal Digital System (FDsys). New collections and functionalities will be added. Please be aware that there may be some intermittent downtime over the next day or two.

The U.S. Government Printing Office (GPO) will be performing scheduled maintenance on FDsys. During this time, new collections will be added and enhanced functionality will be introduced.

* Start Date: Wednesday, July 28, 2010 05:00 p.m. EDT
* End Date: Thursday, July 29, 2010 12:00 Noon EDT

During the scheduled maintenance, FDsys may experience brief intermittent downtime.

The new collections being added are:

* Independent Counsel Investigations
* House Journal
* Privacy Act Issuances
* Unified Agenda
* Education Reports from ERIC

As part of ongoing efforts to enhance FDsys, the ability to only display the most recent edition of a publication as well as one-click access to view all editions from simple search results is being introduced for selected applications. The affected applications are:

* United States Government Manual
* Congressional Directory
* Privacy Act Issuances
* United States Code (U.S. Code)
* House Rules and Manual
* Senate Manual
* Economic Indicators
* Economic Report of the President
* Code of Federal Regulations (CFR)

The new functionality being introduced requires reprocessing of selected U.S. Code and CFR data, which is expected to be completed within 2 days of launch. Select U.S. Code and CFR data may be inaccessible during the reprocessing period.

Thank you for your cooperation during this time.

More news on FR 2.0

We're getting closer to the release date for Federal Register 2.0, the Office of the Federal Register's joint project with GPO to take the XML version of the FR and make it into a "newspaper" of regulatory activity.

According to this press release, the beta version will be released to the public on 26 July 2010, a date chosen to coincide with the 75th anniversary of the Federal Register Act. My hope is that this project will set the bar for future uses of XML made available in FDsys, and I'm looking forward to reviewing this initial version at the end of the month.

GPO joins LOCKSS: digital deposit a reality

According to yesterday's press release, GPO has joined the LOCKSS alliance! The Stanford News Service also wrote a story about this historic event, complete with a goofy picture of yours truly :-)

But what the GPO press release didn't explain is that, as part of GPO's participation in the LOCKSS Alliance, GPO will assist the LOCKSS-USDOCS project (which I'm organizing) in preserving content harvested from fdsys.gov in a geographically distributed network of digital archives. GPO has put LOCKSS permission statements (for example here, and here and here) throughout the FDsys.gov site in order for LOCKSS-USDOCS to harvest GPO content. LOCKSS-USDOCS -- which is 18 libraries strong (including 4 regionals!) and growing -- replicates key aspects of the FDLP in the digital environment and furthers the concept of "digital deposit," an essential component of the digital FDLP.

We're actively looking for other libraries to participate in the project, especially regionals. Together we can provide an essential digital preservation piece to the FDLP. Please contact me (jrjacobs AT stanford DOT edu) with questions or interest.

--That is all.

FDsys Program Review

In addition to the recent GPO Inspector General's report on FDsys (see The State of FDsys and the Future of the FDLP), there is another new report on FDsys.

  • FDsys Program Review. Bob Tapella, Ric Davis, Mike Wash, Scott Stovall, Selene Dalecky, John Shuler, Suzanne Sears, Mike White. Government Printing Office (April 7, 2010)

    Summary: On Wednesday, April 7, 2010, Bob Tapella, Public Printer, United States Government Printing Office (GPO), convened a public meeting to review the status of GPO’s Federal Digital System (FDsys) program. The objective of the meeting was to receive a program status update and to discuss program successes, issues, and opportunities with key stakeholders including GPO’s Library Services and Content Management (LSCM) business unit, the Office of the Federal Register (OFR), and representatives from the Federal Depository Library Council. The meeting was also attended by observers from GPO, the House Administration Committee, and the House Committee on Appropriations.

This report gives a much more sanguine view of the state of FDsys than the Inspector General report gives. It does, indeed, step through "program successes, issues, and opportunities." As I noted in my coverage of the IG report, there are successes and there is lots to hope for when all the system requirements are met. This report notes that "The estimated cost to complete Release 1 was reduced from $62 million to $42 million, saving $20 million," while the IG report focuses on the fact that the original cost estimate for the first phase of FDsys implementation was $16 million and the fact that GPO has redefined "Release 1" (which originally was slated to include "basic, additional, and final features") to include only "basic" features and now calls "additional and final features" "Release 2."

Nevertheless, it does a good job of pointing out what GPO has accomplished, which is significant.

The new report also identifies one critical risk to FDsys:

[T]here is risk associated with a delayed completion of the core system. Mitigation steps include maintaining sufficient investment to complete the core system and preventing loss of key resources resulting in more cost and time.

It also includes this statement of purpose:

The purpose of FDsys is not to serve as a portal, but instead to provide access to official and authentic content from all three branches of the U.S. government on our site, and through links to official agency and partnering web sites. Our main system functions encompass publishing information, enabling searching for information, preserving the information, and providing version control.

This is a sound, and probably sustainable, purpose. The report notes with satisfaction that the provision of XML formatted information has powered other, more user friendly, websites such as FedThread.org, GovPulse.us, and Regulations.gov. This vision of FDsys is, perhaps, close to the view of those who say that the government should reimagine its role as an information provider, supplying raw data and leaving the fancy websites to others. (See The Federal Government Must Reimagine Its Role As An Information Provider.)

It is, however, probably not as close to the view that FDLP librarians have of easy access to government information. In light of the problems described in the IG report, it makes me wonder if there is a slight "re-imagining" of FDsys going on to make its vision fit closer with what GPO can do rather than what FDLP would like it to do. Time will tell.

Update. When asked about this issue at the DLC meeting yesterday (Monday, April 26, 2010), the Supt. of Docs. responded (as reported by Shari Laster): "It's an advanced search system, a content management system, and a digital repository. Is GPO Access/etc. a portal? No. This is an official content repository."

The report also intriguingly notes that "FDsys content is available in all major search engines." I did a couple of quick Google searches of full text hearings that are in FDsys and got no hits. I would be interested to hear if GPO has more details about what is "available" in all major search engines and what is not. (If you have different results, please share them!)

Oh, yes. One other little thing. Ric Davis, Director of Library Services and Content Management and Acting Superintendent of Documents lists several "opportunities" afforded by FDsys. One is "Digital Dissemination"!

While having a repository of content available at GPO is critical, there are opportunities to facilitate the availability of digital collections in libraries. Some in the FDLP community have expressed strong interest in having Access and/or Preservation level files digitally deposited in FDLP libraries. This will further the model established for tangible collections of content by having dispersed collections of electronic content, and through partnerships better ensure access and preservation of content.
(FDsys Program Review, page 7)

Thanks Ric!!

The State of FDsys and the Future of the FDLP

The recent report on the state of the Government Printing Office's Federal Digital System (FDsys) should raise important questions for GPO and should be a wake up call for FDLP libraries. The report documents that the project is over budget, behind schedule, and lacks sufficient resources and planning to move forward successfully.

The report (Federal Digital System (FDsys) Independent Verification and Validation (IV&V) – Tenth Quarter Report on Risk Management, Issues, and Traceability Report Number 10-05, "IV&V Risk Management, Issues, And Traceability Report," January 14, 2010) was written by American Systems, under contract to the GPO Office of Inspector General and is attached to the following memo available from GPO:

The consultants were charged with assessing the state of FDsys implementation. Some of the findings of the report that will be of most interest to FDLP libraries are:

  • FDsys as it exists today "bears only partial resemblance of the system that was envisioned."
  • The program is significantly over budget. The original cost estimate for the first phase of FDsys implementation was $16 million. Through August 2009, GPO had spent more than twice that (approximately $33.6 million) and, by the end of FY 2010, the total costs for FDsys contractor support will be approximately $42 million.
  • Even though the cost has more than doubled, the project is significantly behind schedule. "Release 1" of FDsys was slated to be accomplished in three phases ending in the Fall of 2009, but only the first phase has been deployed -- and even that phase is incomplete.
  • GPO now says that only 42 Collections will be migrated to FDsys instead of the originally-planned 55.
  • There is an on-going indexing problem within the FAST search product. The FDsys database was due to approach a critical threshold of 2.5 million records in December of 2009 and FAST will require changes to accommodate more records.
  • The majority of the work on FDsys after the release of the first phase of Release 1 has been centered on fixing problems and dealing with emerging issues. GPO is focusing more on fixing and upgrading a deployed system than on building the final system.
  • The FDsys Program has performed little to no analysis, planning, design, or development work for Release 2.
  • The "large number [25] of deployments [production builds] over a ten month period reflects the obvious fact that the originally deployed system contained numerous deficiencies."
  • The lack of clear definition of the system and the lack of a detailed implementation plan prevent GPO from determining realistic cost estimates for future development and endanger the ability of GPO to develop and deploy the final system.
  • The consultants say that GPO does not have sufficient system engineering expertise to direct and oversee the development of FDsys and that this has resulted in a system with incomplete functionality, design problems, and numerous deficiencies. They recommend that GPO hire a senior system engineer and say that, without one, these problems will continue and future releases will likely take longer and cost more than anticipated. GPO management, however, completely disagrees with this recommendation.

There is more in the 38 page report, but the above gives you the gist of the problems.
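To put the budget figures above in proportion, here is a minimal arithmetic sketch. The dollar amounts are the ones quoted from the report; the function and constant names are mine and purely illustrative.

```python
# Figures quoted from the IG report summary above, in millions of dollars.
ORIGINAL_ESTIMATE = 16.0        # original estimate for the first phase
SPENT_THROUGH_AUG_2009 = 33.6   # approximate spending through August 2009
PROJECTED_END_FY2010 = 42.0     # approximate contractor costs by end of FY 2010


def overrun_ratio(actual: float, estimate: float) -> float:
    """Return the actual cost as a multiple of the original estimate."""
    if estimate <= 0:
        raise ValueError("estimate must be positive")
    return actual / estimate


if __name__ == "__main__":
    for label, cost in [
        ("Through August 2009", SPENT_THROUGH_AUG_2009),
        ("Projected, end of FY 2010", PROJECTED_END_FY2010),
    ]:
        ratio = overrun_ratio(cost, ORIGINAL_ESTIMATE)
        print(f"{label}: ${cost:.1f}M is {ratio:.2f}x the original estimate")
```

In other words, spending through August 2009 was about 2.1 times the original estimate, and the projected FY 2010 total is a little over 2.6 times.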

The positive, the negative, and the risks.
There are some positive things. Much has been accomplished. Twenty-five of the most complex collections have been transferred from GPO Access to FDsys. The project managed to incorporate a significant design change during implementation to accommodate "Collections with numerous granules." The project also was able to create a new capability to support public access to FDsys information via the Data.gov website.

But these accomplishments are overshadowed by the numerous problems that the report documents. Even the already-deployed system is apparently overwhelmed with problems. The report documents 232 problems that adversely affect the accomplishment of an operational or mission-essential capability and notes that the many unresolved problems with the system create "a serious risk that the overall goals for FDsys ... will take much longer and require significantly more funding to achieve."

The failed "paradigm shift."
Since 1993, GPO has been championing a "paradigm shift" in responsibilities in which GPO arrogated to itself the responsibility for both access and preservation of government information and diminished the role of FDLP libraries. (See, for example, the discussion on GPO's draft regional libraries report and FGI comments.) We at FGI have been concerned about this shift from the beginning for a number of reasons. One of those reasons is the danger of entrusting all preservation and free access to any single organization. Any interruption or failure of that organization (financial, technological, political, etc.), could mean a catastrophic loss of access to government information for everyone.

We have been hopeful that interruptions would be small and short and failures would be in the future. But we have hoped to persuade the FDLP library community, including GPO, that it would be wiser and more prudent and more durable to build on the existing FDLP model of sharing responsibility for access and preservation across many institutions. These institutions with different infrastructures, governing bodies, technologies, and communities of users, would, we have argued, do a better job collectively than any one institution could do by itself.

We have feared the day when Congress would cut back GPO appropriations after we all were irreversibly dependent upon GPO -- a day when it would be too late to create a new system of free, permanent, public access to government information. With the release of this report, we worry that that day may be closer than we had imagined.

The original design and specifications for FDsys were expansive and ambitious. That was a good thing. It would be wonderful if GPO could support FDsys and all of its almost three thousand system requirements and features. OAIS compliance, persistent naming, metadata management, and support for RSS are among the features we look forward to. And we hope for other features: maybe, someday, APIs and OAI-PMH support, for example. But what do we do if GPO does not have the resources or the expertise to fully develop FDsys? It is hard to read this new report without being concerned that this is exactly the reality we face today. We worry that this report is "the writing on the wall" that is telling us that "the paradigm shift" will not work and is not sustainable.

Many FDLP libraries (or at least the directors of those libraries, and, in many cases, the FDLP librarians as well) have been hoping for almost two decades that they could rely on GPO to provide the services that the FDLP libraries themselves used to provide. If this is proving to be a false hope, what will happen next? Is it only a matter of time before Congress pulls the plug, or GPO throws in the towel, or the private sector raises a stink?

What are our options for the future?
Naturally, we hope GPO can continue to develop FDsys. It would be best for access and preservation to have FDsys in place. But it would be better if FDsys was not our only resource for preservation and access. It would be better if we had more systems in place to complement FDsys. It would be better if we had a digital FDLP that shared responsibility for access and preservation.

What options do we face right now? The obvious status-quo next step is for GPO to get more resources. It needs more money and more expertise so that it can deal with existing problems and move forward faster and with better planning that will make it easier for it to succeed and do so in a reasonable time frame and on budget.

But FDLP needs a "Plan B" to deal with the real possibility of GPO not getting adequate resources to finish or maintain FDsys. What will happen if we don't have a plan in place? We can imagine at least three generic scenarios: One, GPO will scale back and provide less access, or less secure preservation, or fewer collections, or some combination of those. Two, preservation and access will remain government-provided, but will become completely fee-based (somewhat like NTIS and STAT-USA). Three, the private sector will move in and demand, perhaps under OMB regulations, that GPO shouldn't have undertaken this job in the first place and that the government shouldn't provide a system that the private sector could provide. It would argue that raw information should be given to private sector companies who will produce their own preservation and access systems that will be fee-based. (We almost certainly will see a proposal to replace GPO's single-entity model with a private-sector, fee-based, single-entity model. Ithaka is already laying the groundwork for such a proposal. [See: Ithaka report on the future of the FDLP.] To those of us at FGI, this seems the worst of both worlds: relying again on a single organization rather than a community of organizations and moving that model to a fee-based system.)

There is also the possibility that none of these will happen and we will simply lose access because no one will take responsibility.

A better option; a more durable future.
But there is one other possibility: a collaborative effort in which GPO deposits digital files with FDLP libraries and those libraries preserve those files and make them accessible. This would be a real digital depository system with shared, distributed responsibility. It would have many advantages but, in the context of the current report, it has one major advantage over the current system: it has no single point of failure (which is what we have with the GPO, single-entity, paradigm-shift model).

Such a system will take planning and resources and will not be trivial to implement. But the time to start planning for such a system is now. It would be much worse to wait until FDsys is in technological or budgetary crisis. At that point it could be too late.

8 more collections added to GPO's federal digital system (FDsys)

The Government Printing Office (GPO) has just released 8 new collections into the Federal Digital System (FDsys) -- http://www.fdsys.gov/. That brings the number of collections in FDsys to 21 -- very cool indeed. The new collections are:

The Congressional Directory, Congressional Record (Bound), United States Government Manual, and United States Statutes at Large will be available with authenticated digital signatures.

There is a capabilities release schedule with an API and several other useful functionalities scheduled to be operational in 2010, only a few months away.

Given all the hubbub about the GPO purl server crash over 2 weeks ago (and counting), I decided to re-read FDSys Releases and Capabilities version 5.0 (PDF). There's nothing in the document about the migration from purls to handles (which seems to have been put on some back burner in a back closet). There's mention of "System Backup/Restore" (section 4.6.13), but this being a "definitions" document, there's no discussion about *how* the system backup/restore will occur nor how the system "shall support an average peak time availability of 99.7%." I hope that information regarding system infrastructure backup and redundancy is soon forthcoming.
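For a sense of scale, here is a rough sketch of what a 99.7% availability target would permit in downtime. It assumes, purely for illustration, that the figure applies around the clock; the document says "average peak time availability," so the real allowance depends on how peak time is defined. The function name is mine, not GPO's.

```python
HOURS_PER_DAY = 24


def allowed_downtime_hours(availability: float, period_days: float) -> float:
    """Hours of downtime permitted over period_days at the given availability.

    availability is a fraction between 0 and 1 (0.997 for 99.7%).
    Assumption: the target covers every hour of the period, not only peak hours.
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * period_days * HOURS_PER_DAY


if __name__ == "__main__":
    target = 0.997
    for label, days in [("30-day month", 30), ("365-day year", 365)]:
        hours = allowed_downtime_hours(target, days)
        print(f"{label}: about {hours:.1f} hours of downtime allowed")
```

Under that assumption the allowance works out to roughly 26 hours in a year. A single outage of more than two weeks (over 336 hours) would exceed it many times over, which is why the backup and redundancy details matter.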

Critical GPO systems and the FDLP cloud

[Update: 10/13/09: I've revised my thinking on the cloud as the term is loaded and doesn't really mean what I'm describing. A friend from the San Diego Supercomputer Center said, "some greybeards are going back to the original metaphor: the grid" and suggested the term "shared digital libraries" which is good. But what I'm describing is more like a biological ecosystem, the FDLP ecosystem. jrj]

Last week's GPO purl server crash should be disconcerting to both the documents community and the public at large (in fact, although the hardware's been restored, resolution is ongoing as I write). I know GPO staff are just as worried about this and are doing everything they can to fix the purl server.


"The PURL Server is currently inaccessible. GPO is working with IT staff to restore service as soon as possible. We regret any inconvenience caused by the server problems. An updated listserv will be sent once service is restored."

But in the meantime, there are 1250+ library catalogs and innumerable links to government documents that are not working. The crash of a critical piece of GPO's infrastructure brings a couple of things to mind:

1) What worries me about this is that FDsys, with its supposed upgrade in hardware/software/systems design, is for all intents and purposes the same as GPOaccess. That is, FDsys is a monolith where the failure of one piece can cause the whole system to grind to a halt. As our readers know, we've been advocating for a long time for a distributed digital FDLP (a *true* "digital depository" system!). We're heartened by what we see of FDsys so far, but we need to be building a system with built-in redundancies.

I envision a collaborative and distributed system of digital content, collaborative cataloging/metadata creation, as well as technical infrastructure. With this kind of system in place, a failed purl server will only cause a momentary blip in service as a backup purl server kicks on instead of a several week+ outage. How many system degradations (WAIS) and failures (purl server) until we shift our thinking from "client-server" (with libraries decidedly on the "client" side of the equation) to "Peer-to-peer" concepts and build systems with built-in redundancies that mirror what the FDLP has been for the last 150 years? How long before we build an FDLP cloud?
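To make the "backup purl server kicks on" idea concrete, here is a minimal sketch of failover across several peer resolvers. Everything in it is hypothetical: the function names, the example.gov addresses, and the in-memory lookup tables stand in for real PURL servers, and nothing here reflects how GPO's PURL service actually works.

```python
from typing import Callable, Dict, Iterable

# A resolver takes a PURL and returns the target URL, or raises on failure.
Resolver = Callable[[str], str]


class ResolutionError(Exception):
    """Raised when no resolver in the chain can answer."""


def resolve_with_failover(purl: str, resolvers: Iterable[Resolver]) -> str:
    """Try each resolver in order and return the first successful answer."""
    failures = 0
    for resolver in resolvers:
        try:
            return resolver(purl)
        except Exception:
            # Any failure (server down, unknown PURL) means: ask the next peer.
            failures += 1
    raise ResolutionError(f"no resolver could resolve {purl!r} ({failures} tried)")


def make_table_resolver(table: Dict[str, str]) -> Resolver:
    """Build a resolver backed by a local copy of the PURL-to-URL table."""
    def resolve(purl: str) -> str:
        return table[purl]  # raises KeyError for an unknown PURL
    return resolve


def down_resolver(purl: str) -> str:
    """Stand-in for a primary server that is currently unreachable."""
    raise ConnectionError("primary PURL server unreachable")


if __name__ == "__main__":
    # Hypothetical mirror held by a partner library.
    mirror = make_table_resolver(
        {"http://purl.example.gov/GPO/0001": "http://agency.example.gov/report.pdf"}
    )
    print(resolve_with_failover("http://purl.example.gov/GPO/0001",
                                [down_resolver, mirror]))
```

The design point is simply that the caller never depends on one server: as long as any peer holds a current copy of the table, resolution continues.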

FDLP Cloud

(**made with IHMC Cmap tools**)

2) There was an interesting discussion of the purl server outage on the code4lib list including a good workaround from a technological standpoint (pasted email below). It points to the fact/reminder that what we do within the FDLP has an effect on others in the wider library community (not to mention the public at large!) and that "our" content and the systems built to serve that content are critical for the work of others whether we know it or not. It also points to the need for us to reach out to those communities in order to build systems of use to both end-users as well as those building other systems, mashups, repositories etc. So I would highly recommend that we be *more* proactive in connecting with other communities within the library community (LITA, CODE4LIB, WEB4LIB, ACRL, state associations etc) as well as outside the FDLP (govt transparency community, historians and other academic communities, journalists etc).


------------------ CODE4LIB POST (with added info by James re MARC view) --------------------------------

Thanks to everyone who helped me confirm that the GPO PURL server is down. An official announcement on the GPO Listserv said:

"The PURL Server is currently inaccessible. GPO is working with IT staff to restore service as soon as possible. We regret any inconvenience caused by the server problems. An updated listserv will be sent once service is restored."

While the server is down, here is one workaround (thanks to Patricia Duplantis):

  1. Copy the purl link listed in your library catalog
  2. Go to http://catalog.gpo.gov/
  3. Click "Advanced Search"
  4. Search for word in "URL/PURL", enter the PURL
  5. Click "Go"
  6. In MARC view, the original URL at the time of cataloging should appear in a 53x note.

This incident, however, illuminates a weakness in PURL systems: access is broken when the PURL server breaks, even though the documents are still online at their original URLs.

Maybe someone more familiar with PURL systems can tell me... is there any way to harvest data from a PURL server, so that a backup/mirror can be available?

Keith

--that is all.

Congressional Documents on FDsys

Peggy Garvin has a new article that covers the basics of searching for Congressional information in GPO's Federal Digital System (FDsys). Thanks Peggy!

The Government Domain - Congressional Documents on FDsys: the Basics, by Peggy Garvin, LLRX (July 27, 2009).

Ten Great Government Web Sites

GCN's list of "great" .gov web sites this year includes GPO's FDsys.

  • Great .Gov Web Sites SPECIAL REPORT: "10 sites that take online government to the next level" by Joab Jackson, Government Computer News (Jul 27, 2009)

Other sites GCN lists include: data.gov, The California Metropolitan Transportation Commission's Transit.511.org, the U.S. State Department, the State of Utah, and Science.gov.

While the description of FDsys in the GCN article has no new information for those who have been following its development for years, its presence in the list is notable and important for at least two reasons. First, it is the only one of the ten that emphasizes permanence and long term access.

Second, it is revealing to see the technologies that GCN lists for each site. Every site on the list is noted for use of technologies that provide good access and rich content. These include the current batch of usual suspects, from Adobe Flash and Microsoft Silverlight, to RSS and Cascading Style Sheets; from Wikipedia and Twitter, to Google Keyhole Markup Language and ArcGIS. But only FDsys also includes technologies that are specifically designed for long-term preservation and for authenticating content: The Reference Model for an Open Archival Information System, and "Digital signatures."

Now if we could just combine that with digital deposit into FDLP libraries, we'd be able to multiply the technical guarantees of long-term free public access to government information by the number of participating FDLP libraries.

"Chat with GPO" Session on Authentication

Today I attended the "Chat with GPO" OPAL session, which focused on authentication and authentication for FDLP partners. Ted Priebe, GPO's Director of Library Planning & Development (LPD), and Lisa Russell, the Manager of LPD's Content Management unit, presented material and answered questions.

Basically, LSCM wants to partner with Federal Depository Libraries and find ways to authenticate content hosted by the FDL partners. The digital signatures of authentication will indicate partnership with the FDL institution and the contact information for that institution. This is great news, especially for those FDLs also interested in hosting digital content in partnership with GPO.

The authentication session is archived on the GPO OPAL site.

Public Printer's Letter to President Obama Regarding Open Government

The Public Printer recently released GPO's letter to the President regarding open government (PDF) (Robert C. Tapella, Public Printer, March 9, 2009). Since it specifically mentions FreeGovInfo, we feel the need to comment and contextualize a bit.

On the one hand, it's great that GPO is reaching out publicly to offer infrastructural help with the government transparency initiative. We're happy to assist in any way we can. We hope FDLP libraries will join GPO in such efforts.

On the other hand, FGI has always argued for a geographically dispersed system of local, official digital repositories, so we cannot support GPO’s goal 1 to make FDsys the official repository for Federal Government publications -- unless it includes a network of distributed repositories modeled after the Federal Depository Library Program (FDLP). What we can support is FDSys as the official distribution channel for federal government publications.

It's not a trivial distinction. "Repository" means that GPO assumes sole responsibility for preservation, a role not specified in legislation. "Distribution channel" means GPO continues its solid century and a half record of distributing information to other institutions which will continue their solid century and a half record of preserving government information for future use while making sure it remains freely available over the internet. Since digital deposit is currently #2 on The Sunlight Foundation's Our Open Government List (OOGL) of top ideas for the President's open government initiative, we can only assume that the public -- or at least those that are most interested in government transparency -- agrees that a geographically dispersed system is a key ingredient in government transparency.

We also believe it is important in discussions of transparency to plan for preservation of and long-term access to information. If, in concentrating on short-term access and on information-as-service, we fail to consider long-term access and instantiation of information for long-term preservation, we will inevitably lose information -- and that would be bad for transparency.

Incomplete Access

We commend and support GPO for building APIs into FDSys. It is heartening and encouraging to see that GPO is publicly and officially proclaiming that "access" means more than providing a web site. But APIs and a web site are only two of the three parts of a complete access system. GPO has yet to acknowledge or even mention the third part of access: the provision of unfiltered bulk data access to government information.

A GPO web site can provide a human-friendly interface for the public and APIs can provide a computer-program-friendly way of querying, fetching, and using information. But, even taken together, these two access points provide only the government-approved, government-designed, government-hosted view of government information.

The problem with these government-only views of government information is that they are limited. No single provider (government or non-government) can provide unlimited access points or views or interfaces.

APIs are not magic. Each is a design for access and the product of choices made by the designer. Each has its own constraints built in. For example, an API might be tied to a particular agency or department, which would limit cross-agency utility. Or an API might be generalized to work across agencies or departments and thus lose rich access to agency-specific information content or structure.

One way to overcome these limitations is for the government to provide bulk data access. This means allowing the public to download raw content in bulk. Where web sites provide one "page" at a time and APIs can provide one or many "facts" at a time, bulk data access provides the raw information so that users can build their own collections, interfaces, and APIs.

This could improve access in ways that GPO could never hope to do all by itself. Imagine, for example, an agricultural library building a digital collection that contains agricultural reports, data, and audio visual content from the Department of Agriculture, the EPA, the SBA, and NOAA combined with reports, maps, and GIS data from state and local government agencies and other content from its own institutional repository or university press. Then imagine that specialized digital collection having its own state-specific, agriculture-specific API and web site and bulk data access. Then imagine that these repositories are part of the rapidly expanding cloud and you get a sense of a rich govt information ecology.

Such scenarios are possible, but only if GPO and other government agencies make raw content easily and freely available in bulk for use, re-use, and re-purposing. Providing only government web sites and government APIs, without bulk data downloads and the ability for others to build collections for specific or general purposes, will provide only a tiny fraction of the open usability and transparency that we could have. There is nothing standing in the way of this happening today except the will of government agencies to make it happen.

Incomplete Preservation

The Public Printer's letter glosses over the problems of long-term access and preservation.

Let's be as clear as we can: we cannot and should not rely solely on GPO for long-term preservation and free access. The shift to digital does not change the methodology for long-term preservation and access. On the contrary, the tenuousness of digital information means that a distributed methodology is even more vital.

We cannot rely solely on GPO because the GPO Electronic Information Access Enhancement Act of 1993 does not even mention permanent access, nor does it guarantee that access will always be free. Indeed, the law specifically allows GPO to charge for access and even for use of its "directory" of information. The law also covers only "appropriate publications distributed by the Superintendent of Documents" -- effectively excluding huge bodies of born-digital information from the scope of what GPO is allowed to handle. Regardless of GPO's intentions, there is no existing legislative mandate for GPO to provide free, permanent, public access to government information and we therefore cannot rely on it alone to do so.

We should not rely solely on GPO because no single digital archive or repository can ever be as secure and safe as multiple archives, libraries, and repositories. Even if GPO had a legislative mandate to provide permanent preservation and access (which it does not), and even if anyone could guarantee that GPO would always get adequate funding so that it never had to withdraw anything or charge for access for anything (which no one can), it would still be impossible to guarantee that GPO would never lose any information. The nature of digital information is that it can easily be corrupted, altered, lost, or destroyed. It can become unreadable or unusable without constant attention. Relying on any single entity is simply not as safe as relying on multiple organizations. It is more than a truism that Lots of Copies Keep Stuff Safe -- safer than backups and "mirror sites." But this is about more than redundant copies. It is also about relying on different organizations because they have different funding sources, different constituencies, different technologies, and different collections. No single digital collection can ever be as safe as multiple, reliable digital collections.
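A small sketch can show why multiple independent copies are safer than one. When several organizations hold the same document, they can compare checksums, and a copy that disagrees with the majority stands out as damaged. This is a simplified illustration only, not the LOCKSS polling protocol; the holder names and the function `find_damaged_copies` are hypothetical.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a document's bytes."""
    return hashlib.sha256(data).hexdigest()


def find_damaged_copies(copies: dict) -> list:
    """Return the holders whose copy disagrees with the majority digest.

    On a tie, Counter.most_common keeps the digest seen first, so a real
    system would need a better rule; this sketch assumes a clear majority.
    """
    digests = {holder: digest(data) for holder, data in copies.items()}
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(h for h, d in digests.items() if d != majority_digest)


# Hypothetical holders; the third copy simulates silent corruption.
COPIES = {
    "library-a": b"Public Law text",
    "library-b": b"Public Law text",
    "library-c": b"Public Law t3xt",
}
print(find_damaged_copies(COPIES))  # ['library-c']
```

With only one copy there is nothing to compare against, so this kind of corruption could go unnoticed. With several, the damaged copy can be identified and repaired from the others.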

The good news

The good news is that there are existing organizations that can start working on this right away. Nothing prevents GPO and the existing FDLP libraries from implementing a digital depository system in which GPO enables FDLP libraries to download bulk data and build local digital collections.

There are existing technologies to facilitate this. The U.S. Government Documents Private LOCKSS Network is preserving "harvested" government information. Peer-to-peer (P2P) networks (like Napster and BitTorrent) have become increasingly popular because more and more people and some businesses have begun to realize that "distributed files" equals faster access and better preservation. (A geographically dispersed system of local, official digital repositories would be, for all intents and purposes, a P2P network.) Open source software for building digital repositories is widely available and increasingly easy to use.

Summary

APIs are good. They are a necessary part of adequate government information access. But digital distribution is also essential because only digital distribution will enable FDLP libraries and others to build new APIs, to de-ghettoize government information by better integrating it with non-government information, and to ensure long-term, free, public access and usability of government information.

GPO's "Google"?

A Washington Times article about FDsys describes it as GPO's "Google" for Federal Documents:

Web users could just as easily use Google or other search engines to find government information, but FDsys assures access to the original, authentic versions of government documents.

I am amused by the comments posted to this article so far:

Why would you call this "Google" for documents, unless you are simply try to get web hits by using the google name. I went to the site and it is not immediately clear where in tarnation to find anything. I'm 38, computer literate and immediately find the site unappealing, and definitely not user friendly. The genius of google is it's simplicity. The GPO site is just like the GPO building itself Old, archaic, and difficult to navigate. I suppose if I was in the business of publishing paper copies, I wouldn't want people finding things easily online either.

Perhaps a government documents librarian could help you navigate it? ;-)

$20 million, five years, and it's still not as good as Google - classic government work.
Why didn't they just give the docs to Google and let them do it?

Because Google would mess it up even worse. Look at Google Books and how they block full access to public domain government documents! Sigh...why did the article have to mention Google?

I found the search engine very easy to use and very powerful -- apparently it's metadata-aware, so it can do much more sophisticated searches than Google allows. It also looks like they built in a nice drilldown feature that lets you find what you're looking for when you don't know what you're looking for.

Glad this user noticed that. And my favorite quote:

It's a good start at providing information to interested citizens.
The information overload will be intense.

Yes, it is a good start. But the info overload is already intense!

Won't Get Fooled Again: Day 14

The Obama administration is well into the civic wilderness of administration at the national level. Several nominations have faltered and failed. The extremely critical, and massive, economic recovery legislation is wallowing in the half-century-long philosophical trench warfare between Democrats and Republicans. Obama attempts to include the Republicans in the discussion, but they continue to withhold their approval. So far the age of bipartisan tranquility is still aborning. Though the election in November clearly pointed the way to a desired change, our political parties do not want to follow the suggestion.

While the quick cannon shot of opportunity fades with time, the Government Printing Office announced today the first public release of FDsys. It is described as

"a one-stop site to authentic, published government information. FDsys allows GPO to receive information from federal agencies in all three branches of government and create a repository for permanent, public access. More than 154,000 documents are currently accessible, with additional documents being added daily. FDsys offers incredible search capabilities for users such as: searching by Congressional Committee, a Member of Congress, keyword and date. FDsys will replace GPOAccess in mid-2009 and releases with additional functionality will occur throughout the next several years."

The upshot for this small milestone on the journey to ALA's summer conference -- the more things change, the more they stay the same. We have our work cut out for us. While the GPO continues to evolve in many good ways, and while we have a President whose rhetoric and policy perspective mirror our own civic sensibilities, we should not be too surprised by the staying power of partisan muck to gum up the civic machinery of hope and change.

See you on Day 15.

Guide of the Week: Transition to Digital Television

Sad to say, for the second time during this Guide of the Week: Transition Edition, I've come up empty. While the Government Accountability Office identified the Transition to Digital Television as one of 13 critical transition issues, there appear to be no librarian-produced guides linked from the ALA GODORT Handout Exchange Wiki to inform people about this issue. If you know of one, please post it to the Handout Exchange.

As a consolation prize, it turns out that the phrase "Digital Television Transition" is a good way to kick the tires on the public beta of GPO's new FDsys at http://fdsys.gpo.gov. Type in the words without quotes and you'll get 2,654 results. Does that seem like too many to look at once? Then use the "narrow your search" options in the left-hand column. You might start with "congressional hearings (761)." That brings you a new list of "narrow your search" options. From there you might choose "House appropriations committee (81)." After that, try clicking on "see more" under "keywords." Try clicking on "public broadcasting" and you'll find four hearings. The search results contain snippets with your search terms and come from several different congresses.

As long as I'm talking about FDsys, I want to say THANKS to GPO for including "find in a library" links that are equal in effectiveness to their "purchase this item" links.

Next week I'll be dealing with librarian-produced guides relating to "Defense Readiness." So if you have any guides relating to that topic, please try to post them to the Handout Exchange this week.
