OpenHouse Project Op-Ed on Databases

The latest in a series of op-eds in The Hill from the OpenHouse Project is essential reading for depository librarians. Using the case of legislative databases, it explains clearly and simply, with excellent examples, how the government must make information available directly to the public in formats that are usable and reusable.

Joshua Tauberer, who wrote the op-ed, speaks with authority and experience. He is the creator of GovTrack.us and the author of the “Legislation Database” chapter of the recent Open House Project report.[1] He says that “Congress should open up its legislative databases to the public. This does not just entail creating a ‘searchable and sortable’ website; the raw information should be made available to be downloaded so that others can transform it into new uses.”

Here at FGI, we hear some librarians say that there is no need for the government to take the responsibility of depositing usable and re-usable information in Federal Depository Library Program (FDLP) libraries. Those who say this rely on one of two arguments. First, they say, “Everyone can get everything they need from government web sites. Why should I go to the expense and work to duplicate that information in my library?” This is the “one size fits all” attitude toward government information. Second, they say, “By refusing to deposit digital materials in depository libraries, GPO has effectively made the FDLP obsolete. Therefore, libraries will have to create computer programs to spider government web sites and harvest information since the government refuses to deliver the information to us.”

Tauberer answers both of these false arguments eloquently, drawing on his practical experience of dealing with what the government does today and on his perspective of addressing the needs of citizens who want government information.

Of relying only on government web sites, he says, “no one view of Congress is useful for everyone.”

Take a hypothetical citizen who is interested in following legislation about immigration. He or she would need to visit a host of websites to get the facts: one site for the legislation itself, another for voting records, a third, fourth and fifth for committee documents, hearings information and campaign contributions.

The “one size fits all” approach will reduce service, access, and functionality.[2] The solution is for the government to create and distribute information that can be re-used and re-purposed, rather than locking it into databases with limited access.

As for relying on computer programs to try to gather everything instead of insisting that the government take responsibility for actively distributing and depositing information, Tauberer speaks from hard-earned experience building GovTrack.us. He says that web sites like OpenSecrets.org and GovTrack.us that try to re-use existing government information have to create their own databases…

…by ‘spidering’ and ‘screen-scraping’ the bits and pieces of information that can be found on official websites. But, as the names suggest, these methods are of the last resort because they leave gaps and sometimes errors, unbeknownst to the users of the site. For instance, on GovTrack.us, a member of Congress will on occasion be left off of a record of a vote, or bill status information will be out of date.
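To make that failure mode concrete, here is a minimal sketch of how a screen-scraper can silently drop a member from a vote record, while a raw, structured record can be checked for completeness. Everything in it is an assumption made for illustration: the HTML markup, the JSON layout, the `total_votes` field, and the member names are invented and do not describe GovTrack.us or any official site.

```python
import json
import re

# Hypothetical markup, invented for illustration; not the layout of any real site.
# The third row wraps the name in <b>, a small change the scraper did not expect.
SCRAPED_PAGE = """
<tr><td class="member">Rep. Adams</td><td>Yea</td></tr>
<tr><td class="member">Rep. Baker</td><td>Nay</td></tr>
<tr><td class="member"><b>Rep. Clark</b></td><td>Yea</td></tr>
"""

# Hypothetical raw record for the same vote, including a declared total.
RAW_RECORD = json.dumps({
    "vote_id": "example-1",
    "total_votes": 3,
    "votes": [
        {"member": "Rep. Adams", "position": "Yea"},
        {"member": "Rep. Baker", "position": "Nay"},
        {"member": "Rep. Clark", "position": "Yea"},
    ],
})

ROW_PATTERN = re.compile(r'<td class="member">([^<]+)</td><td>(Yea|Nay)</td>')


def scrape_votes(page):
    """Screen-scrape the page; rows that do not match the pattern are skipped silently."""
    return {name: position for name, position in ROW_PATTERN.findall(page)}


def load_votes(raw):
    """Parse the structured record and verify it against its own declared total."""
    record = json.loads(raw)
    votes = {v["member"]: v["position"] for v in record["votes"]}
    if len(votes) != record["total_votes"]:
        raise ValueError("record is incomplete")
    return votes


scraped = scrape_votes(SCRAPED_PAGE)
loaded = load_votes(RAW_RECORD)

# Members present in the raw record but lost by the scraper.
missing = sorted(set(loaded) - set(scraped))
print(missing)
```

The scraper raises no error when the markup changes; it simply returns fewer rows, which is the kind of gap users of a site would never see. The structured record, by contrast, carries enough information to detect a missing entry.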

One easy way to facilitate the process that Tauberer advocates is to insist that the government create digital information in reusable formats and actually provide it for deposit in FDLP libraries.[3]

The Tauberer piece is one of a series of Op-Eds written by contributors to the OpenHouse Project’s recent report on congressional information. The earlier Op-Eds are:


Footnotes

1. The complete report is available as a PDF document: Congressional Information & the Internet (The OpenHouse Project, May 8, 2007).

2. Another example of the problem with the one-size-fits-all approach to government information is the redesign of Thomas that is under way now. Those who are doing this are looking to create a “less featured search system” because they are trying to “appeal to a different audience.” (See Will GPO charge for a Bill Summary Database?) While this may make Thomas better for some, it will make it worse for others. (As someone wrote recently, “…search is not simple, particularly when completeness is important…” See Federated Search Systems For Government Information.)

3. Spidering will always be important and even necessary. The scope of what we could spider is enormous. Every bit of information that the government deposits with FDLP libraries will lessen the burden on projects that have to pick and choose what to spider.

CC BY-NC-SA 4.0 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

