gpo

iConference presentation on the future of govt information

[UPDATE: I added the slides for Tom Bruce's talk]

Shinjoung and I submitted a panel on the future of govt information for iConference 2010 in Champaign, IL. We had a good far-reaching discussion with Tom Bruce (Cornell Legal Information Institute), Daniel Schuman (Sunlight Foundation) and Cindy Etkin (GPO). Below are my slides and notes. I've also attached the notes and abstract as PDFs. As Tom tweeted, "World's problems: solved."

If the other panelists agree, I'll post their notes/slides as well. This is of course an ongoing conversation so please feel free to leave comments, questions, rants etc.

--that is all!


3:45 - 5:15 pm Thursday, February 4, 2010
Roundtable 4 : : Technology Room
"Gone today, Here tomorrow: assuring access to government information in the digital age." ShinJoung Yeo, University of Illinois; and James R. Jacobs, Stanford University

Panelists:

  • Shinjoung Yeo, Moderator
  • James Jacobs, Stanford University Library
  • Thomas Bruce (Legal Information Institute, Cornell University)
  • Daniel Schuman (Sunlight Foundation policy director)
  • Cindy Etkin (Govt Printing Office)

[SLIDE 1: govt documents]

Right up front, I'm a librarian and a collaborator in the LOCKSS distributed digital preservation project (Lots of Copies Keep Stuff Safe). I've been in academia/education my whole life as a student, teacher, librarian and technologist. I've been a government information/FDLP librarian since 2002 and am currently serving a 3-year term on the Depository Library Council, the body which informs and advises the Govt Printing Office regarding issues of the Federal Depository Library Program (which Cindy talked about). So my mindset/perspective/bias is from one who assists in the scholarly communication process, one who believes that libraries have a place in the digital information landscape, and one who believes strongly in the idea that access to govt information is a fundamental right. As Ralph Nader has said, “There can be no daily democracy without daily citizenship.” And there can be no citizenship without access to government information.

[SLIDE 2: mmm documents]

With that in mind, I'd like to talk about the underlying historical ideals of the FDLP, discuss how those ideals have been under fire from both within and without the library community and argue that those ideals applied to today's information landscape give us the best chance at access to and long-term preservation and assurance of govt information.

[SLIDE 3: FDLP logo]

The federal depository library program (FDLP) has been around since 1813 in one form or another. The basis underlying the need for an FDLP is to give the public free access to government information. Depository libraries have long safeguarded the public's right to know by cooperating with and receiving for free the govt publications published by the Govt Printing Office (GPO), organizing, maintaining, and preserving those publications, assisting users in accessing said information in a geographically dispersed system and, most importantly, assuring that govt information is freely available and tamper-proof -- think Napster for govt information. Taken together, the collections of the 1238 depository libraries make up the historic corpus of govt information available for free to every citizen. Jessamyn West of librarian.net recently called the FDLP the longest running open source project. I would add that it's the longest government-run public-centric open-source project to support the democratic ideal.

[SLIDE CHUCK QUOTE]

Over the last 20-30 years, developments in publishing and Internet technologies have affected the way government information is produced, disseminated, controlled, and preserved. These changes have affected the policies and procedures of the GPO and, in turn, have affected the depository library program. Despite the often-heard promises that Web technologies will bring more information to more people more quickly and easily, the actual effects have been decidedly mixed. The highly visible, short-term successes of rapid dissemination of single titles directly to citizens (e.g., the large number of downloads of the 9/11 report) mask the loss of a secure infrastructure (GPO's Federal Digital System (FDsys) notwithstanding) for long-term preservation of and access to government information. More and more agencies publish content on their own Web sites rather than using the GPO conduit, producing what librarians call "fugitive documents," and very few agencies publish to any standards or have policies in place that deal with archiving and preservation. As Chuck Humphrey, a data librarian friend of mine, once said, “there seems to be an inverse relationship between convenience of dissemination and preservation standards.”

In addition to this lack of a secure infrastructure, the growing din of the call for digitization of historic govt publications (most recently the Ithaka/ARL report "Documents for a Digital Democracy: A Model for the Federal Depository Library Program in the 21st Century"), while no doubt a boon for access today, is somewhat of a red herring that makes library administrators believe that they will soon be able to dispose of their physical collections and use that space for today's or tomorrow's buzzword. This call for digitization may instead have the deleterious effect of damaging the long-term preservation of govt publications.

Lastly, the growing trend toward privatization of govt information has actually caused a decrease in public access despite its digital nature. This is not a new trend. Herbert Schiller noted this in 1986 in his book "Information and the Crisis Economy." Speaking of machine-readable formats, he wrote that, "Library information capability is greatly enhanced. Yet this benefit is accompanied by the abandonment of libraries' historical free access policy. User charges are introduced. The public character of the library is weakening as its commercial connection deepens. No less important, the composition and character of its holdings change as the clientele shifts from general public to the ability-to-pay user."

[SLIDE: GAO contract]

We've seen over the last 30 years a disturbing rise in Federal Agencies entering into contracts with private companies whereby public domain govt documents are digitized and then taken out of the commons via licensing agreements. See for example, the Government Accountability Office (GAO)'s deal with Thomson-West whereby Thomson-West digitized the GAO's 20,597 legislative histories of most public laws from 1915-1995 and in return received exclusive license to sell access to the content. GAO received nothing in return but an account on Thomson's service while the public received nothing at all.

Rapid technological change and the misplaced assumption that "it's all in google" have caused some in the FDLP community to question the need for the FDLP and some others to drop out of the program altogether. I believe that the inherent nature of digital information actually increases the need for a distributed network of dedicated, legislatively authorized libraries. It would be prudent to draw upon the existing infrastructure of FDLP libraries and the almost 200 years of cumulative experience of these institutions in assuring preservation of and access to government information. We must reinforce FDLP’s traditional mission of selection, collection, free access, and preservation of government information in the digital era in order to assure free access to this information into the foreseeable future. Some in the depository community, like my library, are doing just that by participating in the LOCKSS-USDOCS network, harvesting digital govt information -- for example, harvesting openCRS that Daniel mentioned along with other sites that post CRS reports -- and, yes, digitizing parts of their collections. But we need more libraries, not fewer.

[SLIDE: FDLP ecosystem]

Nobody knows for sure how to preserve digital content for the long-term. This means to me that a loosely coupled, independently administered, distributed ecosystem is the best way to assure long-term preservation -- many organizations with many funding models and a distributed technical infrastructure(s) have a better shot at preservation than 1 or 2 organizations -- especially if one of those organizations has a tenuous budget, or is a private corporation etc.

Imagine if you will 2 future govt information systems: on the one hand, the system where there are one or two digital collections (say for example GPO's Federal Digital System (FDsys) and Portico, the dark archive currently housing digital journals); and on the other hand, one with many digital collections in FDLP libraries. How would each of these deal with or react to different stress situations or threat models (e.g., reduced budgets, increased demand for privatization, increased demand for censorship or control or removal of information, media/hardware/software/network failure, natural disaster, organizational failure etc.)? It's easy to see that a highly replicated, distributed FDLP model of preservation would deal with these situations much better than a centralized model. A web is much stronger than a silo.

[SLIDE: Federal Register XML]

law.gov, Carl Malamud’s proposal for a registry and repository of all legal information, is, from what I've seen and heard and read, a compelling proposal for a significant piece of the federal (and state) legal information ecosystem. What we ought to be doing is a) figuring out how to make law.gov a reality; b) figuring out how to expand it beyond legal materials to include ALL federal information -- information from all 3 branches of government, federal agencies as well as the regional and local offices of those agencies, data and statistics, the entire Congressional/legislative process including the funding that goes into that process to grease the skids so to speak, and making sure public information stays in public control; and c) MOST IMPORTANTLY from my perspective as a librarian, figuring out how to preserve that ecosystem for the long term so that the public can inform itself not just today or tomorrow but 100 years from now. Now the 4 of us on this panel are just 4 players with dogs in this fight. But if we agree on the goals, then we ought to work together to proceed toward them and mobilize our communities and the public to support this endeavor.

It's going to take the government (and not just GPO) being serious about transparency and funding the necessary changes in its own federal information distribution system to include open format standards with no DRM, bulk data channels, indexing, description, collection and authentication of information resources, multiple digital preservation strategies to not only assure preservation but also to insure against tampering and deletion of vital information (which, as I've stated earlier, the FDLP historically has done very well!). It's also going to take libraries being serious about and applying the ideals of the FDLP to build a distributed digital infrastructure that takes into account access to as well as preservation of digital govt information.

I agree with Tom and am absolutely convinced that the changes in the information ecosystem that are needed should not be left to the market because the information market leans heavily toward monopoly, proprietary standards, licensing restrictions, lack of access, "rights management" and the like.

If an evolving ecosystem that is free, open, standards-based, authenticated, and privacy-protecting is built and sustained correctly then citizens, libraries, non-profit watchdogs, hackers, activists, AND government will thrive.

[SLIDE 7: THANKS! lockss, archive-it]

Digital changes a lot of things about information, but it doesn't change the need to fund it, collect it, share it, preserve it, and give access to it. As my friend and colleague Jim Jacobs recently stated, "lots of collections keep stuff safe!"

January 2010 Lost Docs Report and Appeal

With the January 2010 Lost Docs Report and Appeal, we have come to the last of our "saved receipts" with which we first seeded the blog. This means that starting February 1, 2010, every single posting to the Lost Docs Blog will be a receipt submitted during that month or during the last week of the preceding month. That means that if everyone who sent in a lost document report to GPO also sent it to lostdocs@freegovinfo.info, we would have an accurate report of the volume of document reports provided to GPO. We hope you will help make this happen.

Now on to the January 2010 Lost Docs Report and Appeal

REPORT

Thanks to the continued generosity of documents librarians, we posted 85 reports of fugitive documents submitted to GPO. About two thirds of these items were reported during December 2009/January 2010.

Of these 85 reported items, 11 items have been cataloged by GPO. You can view this list by visiting lostdocs.freegovinfo.info/category/found/ and looking at the postings with January 2010 dates. We are appreciative of these new records.

In our view, three of the items reported to GPO and posted to the blog in January were either out of scope for the Catalog of Government Publications (CGP) or were already in the catalog. You can view these items by visiting lostdocs.freegovinfo.info/category/false/ and looking for items with January 2010 dates.

There were two items added to the "E-Version Needs Cataloging" category. You can view these items by visiting http://lostdocs.freegovinfo.info/category/catalog-eversion and looking for items with January 2010 dates. If your library has either of these documents, please consider adding an 856 field to the record(s) so your patrons will be able to link to the electronic version(s) through your catalog.
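For anyone who has not added an 856 field before, here is a rough sketch of what such a field might look like in the mnemonic text form that many MARC editing tools accept. It assumes first indicator 4 (HTTP access) and second indicator 1 (a version of the resource described in the record), with the URL in subfield $u and an optional public note in $z. The function name, the example URL and the note text are made up for illustration; check your own catalog's conventions before loading anything.

```python
def build_856(url, note="", version_of_resource=True):
    """Return an 856 field as a mnemonic-format text line.

    Assumptions (not from the post): first indicator "4" means HTTP
    access; second indicator "1" means the link points to a version
    of the resource described in the record, "0" to the resource itself.
    """
    second_indicator = "1" if version_of_resource else "0"
    field = f"=856  4{second_indicator}$u{url}"
    if note:
        # $z holds a public note shown to catalog users
        field += f"$z{note}"
    return field


# Hypothetical URL and note, for illustration only
print(build_856("http://example.gov/report.pdf", note="Online version"))
```

The output is a single line that could be pasted into a record in a text-based MARC editor; your ILS may want the field entered differently.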

APPEAL

If you like the concept of a public listing of fugitive documents reported to GPO, there are a number of easy ways to help us:

  1. If you report a fugitive document to GPO, send your e-mailed receipt to lostdocs@freegovinfo.info. We welcome any item reported to GPO in the past month. It is best if you can send us the receipt the same day you get it from GPO. Some e-mail programs support auto-forwarding. If yours does, please consider auto-forwarding items where the subject contains "lostdocs submission."
  2. Visit the blog at lostdocs.freegovinfo.info and comment on the listed items. Comments can include -- Did your library receive the item? Did you find it in the CGP? Do you think the item is out of scope for the CGP? Did you report the item as well? And so on.
  3. Post the blog link to your website or share it on Facebook, Twitter, or other social media.
  4. Subscribe to the blog feed at lostdocs.freegovinfo.info/feed/
    or better yet incorporate the feed into your website or blog.
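If you want to incorporate the feed into your own site, a minimal sketch along these lines may be a starting point. It assumes the feed is plain RSS 2.0 (channel/item elements with title and link), which we have not verified here, and it parses a small made-up sample string rather than fetching anything over the network. The function name and sample items are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up sample in RSS 2.0 shape; the real feed's contents will differ.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Lost Docs (sample)</title>
    <item>
      <title>Hypothetical report A</title>
      <link>http://example.org/a</link>
    </item>
    <item>
      <title>Hypothetical report B</title>
      <link>http://example.org/b</link>
    </item>
  </channel>
</rss>"""


def feed_items(rss_xml, limit=5):
    """Return up to `limit` (title, link) pairs from an RSS 2.0 string."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.findall("./channel/item")[:limit]:
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        items.append((title, link))
    return items


for title, link in feed_items(SAMPLE_FEED):
    print(f"{title} <{link}>")
```

In practice you would download the feed text first and pass it to the function, then render the pairs as links on your page.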

Chat with GPO: Helping GPO Identify Fugitive Publications

If you'd like to hone your skills at locating and reporting fugitive documents, check out this e-mail from GPO:

----------------------

From: Announcements from the Federal Depository Library Program
[mailto:GPO-FDLP-L@LISTSERV.ACCESS.GPO.GOV] On Behalf Of FDLP Listserv
Sent: Thursday, January 21, 2010 12:40 PM
To: GPO-FDLP-L@LISTSERV.ACCESS.GPO.GOV
Subject: Chat with GPO: Helping GPO Identify Fugitive Publications

On Thursday, January 28, 2010 at 1:30PM EST, Joe McClane, Manager of
GPO's Content Acquisitions and Linda Nainis, GPO's Acquisitions
Librarian will discuss how documents librarians can help GPO identify
fugitive publications. 

The presentation will feature a 30-minute slideshow that explains how
GPO staff find fugitive documents and ways the community can help GPO
improve the researching and processing of new documents. Time will be
allocated at the end of the session for questions. 

Space is limited to the first 100 participants on a first come basis.
GPO recommends arriving at least 10 minutes early in order to reserve
your spot and test your connection.

Connect to the GPO OPAL Room:
<http://www.conference321.com/masteradmin/room.asp?id=rs38bb0e4b3a5a>.

For more information on GPO's OPAL implementation and OPAL requirements,
visit: <http://www.fdlp.gov/outreach/onlinelearning/68-opal>.

_________________________________

If you have questions or comments, please use the askGPO help service
at: <http://www.gpoaccess.gov/help>. When submitting a question,
please choose the category "Federal Depository Libraries" and the
appropriate subcategory, if any, in order to ensure that your question
is routed to the correct area.

-----------------------

If you have an interest in identifying fugitive publications, I strongly encourage you to attend this OPAL session. The better reports that GPO has, the faster any given item will be cataloged. This benefits everyone. Hope to see you there.

December 2009 Lost Docs Report and Appeal

In September 2009 we at Free Government Information (FGI) started the "lost docs blog" at lostdocs.freegovinfo.info to collect your receipts from GPO about the fugitive documents you reported through GPO's lost docs form at www.fdlp.gov/lostdocs or through GPO's Help system at gpo.custhelp.com.

Here is the December 2009 Lost Docs Report and Appeal:

REPORT

Thanks to the continued generosity of documents librarians, we posted 93 reports of fugitive documents submitted to GPO. More than two thirds of these items were reported during November/December 2009.

Of these 93 reported items, nine items have been cataloged by GPO. You can view this list by visiting lostdocs.freegovinfo.info/category/found/ and looking at the postings with December 2009 dates. We are appreciative of these new records.

In our view, seven of the items reported to GPO and posted to the blog in December were either out of scope for the Catalog of Government Publications (CGP) or were already in the catalog. You can view these items by visiting lostdocs.freegovinfo.info/category/false/ and looking for items with December 2009 dates.

This month we added a new subcategory of fugitive document to the blog, that of "needs URL added." These are reported documents where a record of the tangible version is in the CGP, but the record makes no reference to online availability. Since we feel that documenting online availability is important, we left them listed as fugitive documents because the electronic versions are unknown to GPO. This month there were 20 items where the CGP knew about the tangible version but not the internet version. You can view these items by visiting http://lostdocs.freegovinfo.info/category/catalog-eversion and looking for items with December 2009 dates. If your library has any of these documents, please consider adding an 856 field to the record(s) so your patrons will be able to link to the electronic version(s) through your catalog.

APPEAL

If you like the concept of a public listing of fugitive documents reported to GPO, there are a number of easy ways to help us:

  1. If you report a fugitive document to GPO, send your e-mailed receipt to lostdocs@freegovinfo.info. We welcome any item reported to GPO in the past month.
  2. Visit the blog at lostdocs.freegovinfo.info and comment on the listed items. Comments can include -- Did your library receive the item? Did you find it in the CGP? Do you think the item is out of scope for the CGP? Did you report the item as well? And so on.
  3. Post the blog link to your website or share it on Facebook, Twitter, or other social media.
  4. Subscribe to the blog feed at lostdocs.freegovinfo.info/feed/
    or better yet incorporate the feed into your website or blog.

UPDATE 1/5/2010

John Stevenson, my friend and distinguished government information librarian, reminded me that current GPO cataloging policy is to create multiple records for a given work based on format. This means that instead of adding a URL to an 856 field in an existing record, GPO would create a new record based on the electronic format of the document. I wonder how efficient that is, but that's another post for another time.

So we have renamed our new category "eVersion Needs Cataloging." You can get a feed for just these items by subscribing to http://lostdocs.freegovinfo.info/category/catalog-eversion/feed/.

Keep comments and your lostdocs receipts coming!

Malamud calls for a national scan center public works project

Carl Malamud posed this question over on twitter: "What if our national cultural institutions all worked together on a common problem, attracted White House support?" In his post on the O'Reilly blog, "A National Scan Center: A Public Works Project", Malamud scopes out the issues and calls for the Library of Congress, the Smithsonian Institution, the Government Printing Office, the National Archives and Records Administration, and the National Technical Information Service to come together and make the compelling case for funding a 5-year $500 million effort to create a National Scan Center. Hear, hear, Carl!

In the U.S., we face a similar deluge of paperwork that we faced in the 1930s. A huge backlog of paper, microfiche, audio, video, and other materials is located throughout the federal government. Little money has gone from Congress for digitization, and bureaucracies have resorted to a series of questionable private-public partnerships as a way of digitizing their materials. For example, the Government Accountability Office shipped 60 million pages of our Federal Legislative Histories (the record of each law from the initial bill through the hearings and conference reports) off to Thomson West, but didn't even get digital copies back. Another example is the recent failed effort by the Government Printing Office to digitize 60 million pages of the Federal Depository Library Program, an effort they tried to get through as a "zero dollar cost to the government" effort with the private sector.

There are no free lunches and there are no "no cost to the government" deals. The costs involve the government effort to supervise the contract, prepare the materials, and ship them, and in both the GAO and GPO cases, the government wasn't getting much back for its effort. What the government and the people usually get is a lien on the public domain, preventing the public from accessing these vital materials. Similar efforts are sprinkled throughout the government. I testified to Congress that I had learned that the National Archives was contemplating a scan of congressional hearings with LexisNexis under similar circumstances, and many may be aware of the questionable deal the Archives cut with Amazon where my favorite online superstore got de facto exclusive rights to 1,899 wonderful pieces of video.

Help GPO distribute library catalog records

A GPO staffer has asked that I post the notice below about a pilot MARC record distribution project to "ensure the automatic dissemination of bibliographic records to FDLP libraries." I hope libraries will volunteer to help out with this project as it seems like a significant step for GPO to take. We've talked for a while about collaborative cataloging of govt information; while this is primarily a "push" project, perhaps it could be the first step toward GPO opening up the cataloging workflow to depository libraries (many hands make light work, right?!) and lead to other data sharing opportunities (XML, OAI, RSS, APIs etc.) both within the FDLP and with the public. This could be a significant piece of the FDLP ecosystem.


Calling all depositories! FREE Records! FREE Records!

GPO is looking for libraries who wish to take part in the Cataloging Record Distribution Pilot. Applications are being accepted now through January 11, 2010.

Federal depository libraries will be chosen to participate in this pilot program in which GPO bibliographic records will be distributed from GPO’s Integrated Library System (ILS) to these libraries. GPO will be accepting a group of 30 – 35 FDLP libraries to participate.

GPO is looking for a mixture of different library sizes and types. Of that group, GPO would like some current MARCIVE subscribers, as well as some non-subscribers. GPO is also aiming to select a variety of libraries that use a diverse group of ILS vendors.

Visit the Cataloging Record Distribution Pilot Web page for more information on the project, including details on how to apply and an informational FAQ sheet on the details of the project.

How Long Does It Take to Catalog a Fugitive?

We started the LostDocs blog back in September 2009 to collect e-mail receipts for items that were reported to GPO as "fugitive documents" -- agency documents that should have made it into the Federal Depository Library Program and/or the Catalog of Government Publications.

In the process of running this blog, we have identified 40 documents reported since April 2008 that were cataloged by GPO after being reported as "fugitive documents." These fall into the "found documents" category of our blog.

You can find our list of 40 (and counting) cataloged fugitives here. This spreadsheet will be updated whenever we identify new GPO cataloging for items that had been reported as fugitive documents.

The results are interesting and somewhat disturbing, but not definitive.

The 40 items were cataloged in times varying from three days to 524 days. The mean cataloging time was 213 days. The median cataloging time was 184 days or about six months.
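For readers who want to run the same kind of calculation on their own receipts, here is a minimal sketch. It assumes you have, for each item, the date it was reported and the date it first appeared in the catalog. The three date pairs below are invented for illustration and are not rows from our spreadsheet; the function name is ours as well.

```python
from datetime import date
from statistics import mean, median

# Hypothetical (reported, cataloged) date pairs, NOT the actual data
SAMPLE = [
    (date(2009, 1, 5), date(2009, 1, 8)),
    (date(2008, 4, 1), date(2008, 10, 2)),
    (date(2008, 6, 1), date(2009, 11, 7)),
]


def cataloging_days(pairs):
    """Return the number of days between report and cataloging for each pair."""
    return [(cataloged - reported).days for reported, cataloged in pairs]


days = cataloging_days(SAMPLE)
print("shortest:", min(days), "longest:", max(days))
print("mean:", mean(days), "median:", median(days))
```

The mean is pulled upward by a few very long waits, which is why we report the median alongside it.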

If the cataloging times above were typical of all documents reported through the LostDocs process, we think this would be a major problem for GPO that would require some serious soul searching and dialog about how this result could be changed and what tradeoffs and/or extra community involvement would be required as a result.

We are NOT making the claim that these cataloging times are typical for reported fugitive documents. We honestly do not know what is typical. Jim Jacobs, FGI's resident data librarian, had this to say about our sample of cataloged documents:

As for sample size and relevance: the number of items in the sample can't tell us the significance or accuracy of the results. We'd have to know two other things: the size of the universe (of all reported lost docs), and the accuracy of the sample. Since the sample was self-selected (by those reporting) rather than random, and since we don't know if the sample is 1% or 85% of all submitted lostdocs, we can't claim that the findings necessarily reflect the status of the whole universe. (does that make sense? If only people w/ long waits reported to us, our sample does not accurately reflect all lostdocs.)

When we first thought about making lostdocs reports available to the community at large, we first approached GPO with a partnering opportunity. We would maintain the blog, and offer them the opportunity to comment on the blog whether something was out of scope for CGP or already in the catalog. In return, we asked them to modify their LostDocs form so that when they received a report, the blog would automatically get a copy. If this partnership had been accepted, then we would know the two facts Jim cited above that are needed to tell us whether we have typical results or not. GPO declined to accept our partnership agreement, citing their workload. We're not questioning that they are overworked.

We do feel that the results above deserve further investigation. Perhaps GPO could prepare a report on documents cataloged as a result of fugitive reports over the past few years. Unless they've discarded the e-mail receipts (which would be defensible), they have the dates of when documents were reported. The CGP lists when an item was first added to the CGP. They could have an intern make a semester project of putting the two together and then posting the results to fdlp.gov.

If they have tossed previous e-mail receipts, they could start saving them for a year starting in January 2010 and do the analysis we propose above in 2011. But in either case we feel the analysis should be done. If it confirms our results then it will be good ammunition in Congress to procure more cataloging staff or to start cataloging collaborations with FDLP members. If the GPO analysis concludes that items reported to lost docs are in fact cataloged in a timely manner, then that will help build trust with the documents community and motivate more people to report fugitive documents. Either way it is a win-win for GPO.

Lunchtime Listen: Finding Docs and Geology Information

While poking around the Government Printing Office's (GPO) OPAL training site at http://www.opal-online.org/archivegpo.htm, I found a couple of online workshops that I think will be valuable to beginner and expert alike:

Searching for Free Government Full Text Docs Online: Where to Begin? presented in October 2009 by Holly Harper, GPO intern and MLIS student at the University of Washington.

* Streaming audio with slides

Geology Librarianship and Government Documents presented in August 2009 by Stephanie Earls, GPO intern and MLIS student at the University of Washington.

* Streaming audio with slides

They appear to run best in Internet Explorer. The recordings were made by two library school interns working with GPO's Robin Haun-Mohamed. The intention was to create programming that would be helpful to generalist librarians and new depository staff.

I think they've done well at this and created some videos that should be shared with non-librarians as well. I publicly thank Robin and the GPO staff that made these possible. You may wish to pause the videos in places to make notes of URLs.

One new thing I learned (or was reminded of anew) from the "Full Text Docs" presentation was the ability to browse publications in FDsys by collection, congressional committee or date. Use the "last 24 hours" option to see just how much information government is pumping out these days. And that's just a fraction of what's available.

My highlighting these two OPAL presentations should not be interpreted as a slight on the other good material you can find there. Go, watch and explore.

November 2009 Lost Docs Report and Appeal

In September 2009 we at Free Government Information (FGI) started the "lost docs blog" at lostdocs.freegovinfo.info to collect your receipts from GPO about the fugitive documents you reported through GPO's lost docs form at www.fdlp.gov/lostdocs or through GPO's Help system at gpo.custhelp.com.

Here is the November Lost Docs Report and Appeal:

REPORT

Thanks to the continued generosity of documents librarians, we posted 60 reports of fugitive documents submitted to GPO. These receipts were a mixture of old receipts and items actually reported in November 2009.

Of these 60 reported items, 17 items have been cataloged by GPO. You can view this list by visiting lostdocs.freegovinfo.info/category/found/ and looking at the postings with November 2009 dates. We are appreciative of these new records.

In our view, only one of the items reported to GPO and posted to the blog in November was either out of scope for the Catalog of Government Publications or already in the catalog. You can view this item by visiting lostdocs.freegovinfo.info/category/false/ and looking for items with November 2009 dates.

APPEAL

If you like the concept of a public listing of fugitive documents reported to GPO, there are a number of easy ways to help us:

  1. If you report a fugitive document to GPO, send your e-mailed receipt to lostdocs@freegovinfo.info. We welcome any item reported to GPO in the past month.
  2. Visit the blog at lostdocs.freegovinfo.info and comment on the listed items. Comments can include -- Did your library receive the item? Did you find it in the CGP? Do you think the item is out of scope for the CGP? Did you report the item as well? And so on.
  3. Post the blog link to your website or share it on Facebook, Twitter, or other social media.
  4. Subscribe to the blog feed at lostdocs.freegovinfo.info/feed/ or, better yet, incorporate the feed into your website or blog.
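
For anyone who wants to incorporate the feed into a site, here is a minimal sketch of pulling titles and links out of a feed. It assumes the feed is plain RSS 2.0, which we have not verified here. The sample XML, the item titles and the `list_items` helper are made up for illustration; in practice you would fetch the feed URL above and pass its text in instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for the real feed (assumed RSS 2.0).
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
<title>Lost Docs (sample)</title>
<item><title>Sample fugitive document A</title><link>http://example.org/a</link></item>
<item><title>Sample fugitive document B</title><link>http://example.org/b</link></item>
</channel></rss>"""


def list_items(feed_xml, limit=5):
    """Return up to `limit` (title, link) pairs from an RSS 2.0 document."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.findall("./channel/item")[:limit]:
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        items.append((title, link))
    return items


# Print one line per item, ready to drop into a page template.
for title, link in list_items(SAMPLE_FEED):
    print(f"{title}: {link}")
```

The pairs returned by `list_items` can be rendered however your site builds its sidebars or link lists.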

CRS report on Congressional Printing

Congressional Printing: Background and Issues for Congress, by R. Eric Petersen, Congressional Research Service, R40897 (November 5, 2009).

This report, which will be updated as events warrant, provides an overview and analysis of issues related to the processing and distribution of congressional information by the Government Printing Office. Subsequent sections address several issues, including funding congressional printing, printing authorizations, current printing practices, and options for Congress. Finally, the report provides congressional printing appropriations, production, and distribution data in a number of tables.

Sunlight calls on GPO to publish The Constitution Annotated in XML

220 Years Later, It’s Time to Publish the Constitution Annotated Online in XML, By Daniel Schuman, Sunlight Foundation, (09/17/09).

The Constitution Annotated has been written by the Library of Congress for nearly 100 years, and contains analysis of nearly 8,000 U.S. Supreme Court cases.

Over the decades, GPO has published print versions of this extraordinary resource every two years, with limited electronic versions available from the 1992 edition onward. Although the Library of Congress has drafted the Constitution Annotated in XML for a number of years, that data is no longer present when it is published online by GPO.

8 more collections added to GPO's federal digital system (FDsys)

The Government Printing Office (GPO) has just released 8 new collections into the Federal Digital System (FDsys) -- http://www.fdsys.gov/. That brings the number of collections in FDsys to 21 -- very cool indeed. The new collections are:

The Congressional Directory, Congressional Record (Bound), United States Government Manual, and United States Statutes at Large will be available with authenticated digital signatures.

There is a capabilities release schedule with an API and several other useful functionalities scheduled to be operational in 2010, only a few months away.

Given all the hubbub about the GPO purl server crash over 2 weeks ago (and counting), I decided to re-read FDSys Releases and Capabilities version 5.0 (PDF). There's nothing in the document about the migration from purls to handles (which seems to have been put on some back burner in a back closet). There's mention of "System Backup/Restore" (section 4.6.13), but this being a "definitions" document, there's no discussion about *how* the system backup/restore will occur nor how the system "shall support an average peak time availability of 99.7%." I hope that information regarding system infrastructure backup and redundancy is soon forthcoming.

Critical GPO systems and the FDLP cloud

[Update: 10/13/09: I've revised my thinking on the cloud as the term is loaded and doesn't really mean what I'm describing. A friend from the San Diego Supercomputer Center said, "some greybeards are going back to the original metaphor: the grid" and suggested the term "shared digital libraries" which is good. But what I'm describing is more like a biological ecosystem, the FDLP ecosystem. jrj]

Last week's GPO purl server crash should be disconcerting to both the documents community and the public at large (in fact, although the hardware's been restored, resolution is ongoing as I write). I know GPO staff are just as worried about this and are doing everything they can to fix the purl server.


"The PURL Server is currently inaccessible. GPO is working with IT staff to restore service as soon as possible. We regret any inconvenience caused by the server problems. An updated listserv will be sent once service is restored."

But in the meantime, there are 1250+ library catalogs and innumerable links to government documents that are not working. The crash of a critical piece of GPO's infrastructure brings a couple of things to mind:

1) What worries me about this is that FDsys and its supposed upgrade in hardware/software/systems design is for all intents and purposes the same as GPOaccess. That is, FDsys is a monolith where the failure of one piece can cause the whole system to grind to a halt. As our readers know, we've been advocating for a long time for a distributed digital FDLP (a *true* "digital depository" system!). We're heartened by what we see of FDsys so far, but we need to be building a system with built-in redundancies.

I envision a collaborative and distributed system of digital content, collaborative cataloging/metadata creation, as well as technical infrastructure. With this kind of system in place, a failed purl server would cause only a momentary blip in service as a backup purl server kicks in, instead of an outage lasting several weeks or more. How many system degradations (WAIS) and failures (purl server) until we shift our thinking from "client-server" (with libraries decidedly on the "client" side of the equation) to "peer-to-peer" concepts and build systems with built-in redundancies that mirror what the FDLP has been for the last 150 years? How long before we build an FDLP cloud?

FDLP Cloud

(**made with IHMC Cmap tools**)

2) There was an interesting discussion of the purl server outage on the code4lib list, including a good workaround from a technological standpoint (pasted email below). It is a reminder that what we do within the FDLP has an effect on others in the wider library community (not to mention the public at large!) and that "our" content and the systems built to serve that content are critical for the work of others whether we know it or not. It also points to the need for us to reach out to those communities in order to build systems of use to both end-users as well as those building other systems, mashups, repositories etc. So I would highly recommend that we be *more* proactive in connecting with other communities within the library community (LITA, CODE4LIB, WEB4LIB, ACRL, state associations etc) as well as outside the FDLP (govt transparency community, historians and other academic communities, journalists etc).


------------------ CODE4LIB POST (with added info by James re MARC view) --------------------------------

Thanks to everyone who helped me confirm that the GPO PURL server is down. An official announcement on the GPO Listserv said:

"The PURL Server is currently inaccessible. GPO is working with IT staff to restore service as soon as possible. We regret any inconvenience caused by the server problems. An updated listserv will be sent once service is restored."

While the server is down, here is one workaround (thanks to Patricia Duplantis):

  1. Copy the purl link listed in your library catalog
  2. Go to http://catalog.gpo.gov/
  3. Click "Advanced Search"
  4. Search for word in "URL/PURL", enter the PURL
  5. Click "Go"
  6. In MARC view, the original URL at the time of cataloging should appear in a 53x note.

This incident, however, illuminates a weakness in PURL systems: access is broken when the PURL server breaks, even though the documents are still online at their original URLs.

Maybe someone more familiar with PURL systems can tell me... is there any way to harvest data from a PURL server, so that a backup/mirror can be available?

Keith
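
Keith's question about a backup/mirror can be sketched in a few lines. This is only an illustration of the idea, not a description of how GPO's PURL server works or could be harvested. It assumes a library has somehow obtained a table of PURLs and their original URLs (for example, from the catalog's MARC records). The `PurlMirror` class, the `broken_primary` stand-in and the example addresses are all hypothetical.

```python
from typing import Callable, Dict, Optional


class PurlMirror:
    """Local fallback table mapping PURLs to the original URLs recorded at cataloging time."""

    def __init__(self, table: Dict[str, str]) -> None:
        self._table = dict(table)

    def resolve(self, purl: str, primary: Callable[[str], str]) -> Optional[str]:
        """Ask the primary resolver first; fall back to the mirrored table if it is unreachable."""
        try:
            return primary(purl)
        except OSError:
            # Primary PURL server is down: use the locally mirrored target, if we have one.
            return self._table.get(purl)


def broken_primary(purl: str) -> str:
    """Hypothetical stand-in for a PURL server that is currently inaccessible."""
    raise ConnectionError("PURL server is currently inaccessible")


# Made-up example data; these are not real PURLs or document locations.
mirror = PurlMirror({
    "http://purl.example.gov/GPO/LPS0001": "http://agency.example.gov/report.pdf",
})
resolved = mirror.resolve("http://purl.example.gov/GPO/LPS0001", broken_primary)
missing = mirror.resolve("http://purl.example.gov/GPO/LPS9999", broken_primary)
print(resolved, missing)
```

The point of the design is that the fallback table lives with the library rather than with the central server, so an outage at one site degrades service instead of breaking every link.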

--that is all.

Internet Archive proposal for mass digitization

I had known that the Internet Archive had submitted a response to the GPO's RFP for mass digitization. A friend just sent me the link to the proposal submitted to GPO (embedded below and here's the link to the proposal and supporting documents).

As you can probably guess, we've been pulling for the Archive to get the bid, not least because the Archive is a 501(c)(3) non-profit library and we've stated on more than one occasion that privatization of public domain government information is a very bad idea. But we've also been heartened by the quality of the Archive's scans to date and by their openness and willingness to be collaborative in their processes and data access and sharing. Those qualities certainly come through in their proposal for mass digitization, not to mention the fact that they've actually made their proposal public!

While the award has not been officially announced, we really hope that the Archive wins the award. Perhaps GPO will name them as an official depository library and work with them not only on the "legacy" collection (there needs to be a better description of the deep and rich collections of depository libraries than the somewhat pejorative "legacy" :-| ) but on digital deposit of government documents going forward.

--that is all.


GITCO VIRTUAL FORUM - JOIN US MONDAY, MAY 18TH - 2PM PST/3PM MST/4PM CST/5PM EST

see this link for chat logs and more details on GITCO

Scroll down and start typing to join the chat, or if it does not seem to connect, join here.
What is Meebo and how do I get set up?

TOPIC: GITCO committee structure and our impact within GODORT and beyond

Accessing government information electronically is now common in both US and international contexts. How can GITCO best position itself within GODORT/ALA and beyond to provide leadership on issues associated with electronic government information?

This session is meant to be a brainstorm -- to collect ideas and examples, rather than to follow each contribution to its conclusion. The room will be open after the session if you would like to add things after the planned session. There is also a brief participant survey which includes a place for feedback.

Agenda for Today's Forum:

* introductions
* logistics
* reflections on past projects
* reflections on committee structure within GODORT
* take the survey

http://www.meebo.com/rooms
