Free Government Information (FGI) is a place for initiating dialogue and building consensus among the various players (libraries, government agencies, non-profit organizations, researchers, journalists, etc.) who have a stake in the preservation of and perpetual free access to government information. FGI promotes free government information through collaboration, education, advocacy and research.

Numbers Aren’t Enough: Providing Context

In my first post, I wrote about making information useful for ordinary people. It’s been a pleasure and an honor to guest blog here for the past month, and as October is nearly gone, it seems fitting to come back to this subject as my reign as “Blogger of the Month” comes to an end.

Large numbers in particular are difficult to comprehend, and the world of government information is full of them: earmarks range from hundreds of thousands to tens of millions of dollars, Barack Obama’s fundraising totals have eclipsed six hundred million dollars, and the $700 billion bailout package had pundits scrambling to describe things that cost $700 billion. The difficulty of explaining just how big some of these numbers are was taken to an absurd extreme when CNN presented a calculation of how many McDonald’s apple pies could be purchased for each US citizen with such a sum.
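The arithmetic behind that kind of comparison is simple division, and a rough sketch makes it concrete. The population figure and the pie price below are assumptions of mine, not numbers from CNN or from this post, so treat the result as illustrative only.

```python
# Assumed inputs: only the $700 billion total comes from the post.
BAILOUT_DOLLARS = 700_000_000_000  # the $700 billion package
US_POPULATION = 305_000_000        # rough 2008 estimate (assumption)
PIE_PRICE_DOLLARS = 1.00           # hypothetical price per pie


def per_person(total: float, population: int) -> float:
    """Split a total evenly across a population."""
    return total / population


def units_per_person(total: float, population: int, unit_price: float) -> int:
    """Whole units each person could buy with an even share of the total."""
    return int(per_person(total, population) // unit_price)


share = per_person(BAILOUT_DOLLARS, US_POPULATION)
pies = units_per_person(BAILOUT_DOLLARS, US_POPULATION, PIE_PRICE_DOLLARS)
print(f"${share:,.2f} per person, or {pies:,} pies each")
```

Under these assumptions the share comes to a little under $2,300 per person. A per-person figure is arguably easier to grasp than the total, though it still says nothing about how the sum compares to other government spending.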

One of the most useful ways I’ve seen of putting information in context, whether government information or anything else, is the sparklines at watchdog.net.

These graphics show the statistics of each lawmaker in context, as well as the general shape of the distribution across Congress as a whole. A congressperson’s request for $147 million in earmarks may sound like a lot, but seeing that it puts them outside the top 100 provides useful and much-needed context. The shape of the line also shows whether there is a smooth trend or a sharp jump, with a small handful of lawmakers raising or spending drastically more than others.
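To illustrate the two kinds of context described here, a rank within the group and the shape of the whole distribution, here is a minimal sketch. It is not how watchdog.net builds its graphics; the helper names `rank_of` and `sparkline` are my own, and the earmark totals are made-up numbers.

```python
BARS = "▁▂▃▄▅▆▇█"


def rank_of(value: float, values: list[float]) -> int:
    """1-based rank of value when values are sorted from largest to smallest."""
    return 1 + sum(1 for v in values if v > value)


def sparkline(values: list[float]) -> str:
    """Render values as a row of block characters scaled between min and max."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    return "".join(BARS[int((v - lo) / span * (len(BARS) - 1))] for v in values)


# Hypothetical earmark totals in millions of dollars, not real figures.
earmarks = [5, 12, 20, 33, 47, 60, 88, 147, 210, 950]
print(sparkline(sorted(earmarks)))
print(rank_of(147, earmarks), "of", len(earmarks))
```

With a single outlier at the top, nearly every bar sits near the bottom of the scale and the last one jumps, which is the "sharp jump" pattern a smooth list of numbers would hide.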

Hopefully more and more presentations of government information will follow the lead of the terrific watchdog.net and attempt to surround information with relative context so that government information isn’t simply available, but understandable.

The Words They Used

The Words They Used, by MATTHEW ERICSON, New York Times, September 4, 2008. “The words that speakers used at the two political conventions show the themes that the parties have highlighted.”

This is a bubble graph of the number of times words were used per 25,000 words spoken, along with a list of which speakers used which words. Ericson has done a good job of looking at phrases as well as individual words, of combining similar words and phrases, and of noting phrases that were used very little or not at all by one or both parties. It is another good example of how, when we have access to the “raw data” (as opposed to transaction-based, search-and-retrieve, one-page-at-a-time access), the data can be used, re-used, and analyzed.
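The "per 25,000 words" normalization is what lets speeches of different lengths be compared. A minimal sketch of that calculation follows; it is my own illustration rather than Ericson's method, it handles only single words (not phrases or combined variants), and the sample text is made up.

```python
import re
from collections import Counter


def rate_per(text: str, term: str, per: int = 25_000) -> float:
    """Occurrences of a single-word term, scaled to a rate per `per` words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return Counter(words)[term.lower()] * per / len(words)


# Hypothetical snippet standing in for a convention transcript.
speech = "Change is coming. We need change, and change needs us."
print(rate_per(speech, "change"))
```

Run over full transcripts from each convention, the same function would give rates that can be compared directly, whichever party spoke more words overall.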

Text Visualization Tools

What would it be like if we had true open access to large quantities of government text? We would be able to do much more than retrieve a page of the Congressional Record and read it. Researchers would be able to analyze the text and create new, innovative ways of discovering, browsing, searching, and reading text-based information.

Clifford Lynch has written eloquently about this in the realm of scholarly literature (Clifford A. Lynch, “Open Computation: Beyond Human-Reader-Centric Views of Scholarly Literatures,” in Open Access: Key Strategic, Technical and Economic Aspects, ed. Neil Jacobs, Oxford: Chandos Publishing, 2006, pp. 185-193).

I was reminded of these issues this morning when looking at Visualization Strategies: Text & Documents on Tim Showers Web Design Blog (August 20th, 2008). Tim lists more than a dozen examples of techniques and tools. One of my favorites is the visualization of the 2008 Democratic primary debates offered by the New York Times. You can hear the debate, search for keywords and see where they appear, browse a transcript, and more.

Shouldn’t we have free, open access to large bodies of all government texts (not just search-and-retrieve access to bits and pieces) so that we can easily create corpora that can be indexed, browsed, and analyzed?
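As a small illustration of what "indexed" could mean once the full text is in hand, here is a sketch of an inverted index that records where each word appears in each document. The function name `build_index`, the document IDs, and the sample sentences are all hypothetical; a real corpus would need better tokenization and storage.

```python
import re
from collections import defaultdict


def build_index(documents: dict[str, str]) -> dict[str, dict[str, list[int]]]:
    """Map each word to the documents and word positions where it appears."""
    index: dict[str, dict[str, list[int]]] = defaultdict(lambda: defaultdict(list))
    for doc_id, text in documents.items():
        for position, word in enumerate(re.findall(r"[a-z']+", text.lower())):
            index[word][doc_id].append(position)
    return index


# Hypothetical two-document corpus; the IDs and text are made up.
corpus = {
    "record-1": "The committee debated the budget.",
    "record-2": "The budget passed after a long debate.",
}
index = build_index(corpus)
print(dict(index["budget"]))
```

An index like this can only be built by someone who holds the whole body of text, which is the difference between bulk access and retrieving one page at a time.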

Thanks and a tip of the hat to Tim Dennis!