Announcing the Congress API, by Andrei Scheinkman and Derek Willis, New York Times, January 8, 2009.
The initial release exposes four types of data: a list of members for a given Congress and chamber, details of a specific roll-call vote, biographical and role information about a specific member of Congress, and a member’s most recent positions on roll-call votes.
The four work together, so you can start by retrieving a list of members, find the one(s) you’re interested in and then fetch additional details through other calls. We built this service to work with other publicly available data sources, so you can identify members of Congress with a seven-character code from the Biographical Directory of the United States Congress. For individual member responses, we included the numeric ID assigned by GovTrack, a free and open-source service that monitors legislative activity.
Our data comes directly from the U.S. House and Senate Web sites, and is updated throughout the day while Congress is in session....
You have to register for an API key to use the system, but it is free (for now). Check it out here!
(Note that this is an API and returns XML so that you can build live data applications. You agree not to "archive any of the API content for access by users at any future date after you have finished using the service...." It is for building interactive applications.)
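Because the API returns XML, chaining the four calls together is mostly a matter of parsing member identifiers out of one response and feeding them into the next. Here is a minimal sketch using only Python's standard library, run against a made-up response fragment -- the actual element names and structure are defined in the NYT documentation and may differ:

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of a members-list response. The real API's
# element names may differ; this only illustrates the join on the
# seven-character Biographical Directory ("bioguide") code.
sample = """<result_set>
  <results>
    <member>
      <id>K000105</id>
      <name>Edward M. Kennedy</name>
      <party>D</party>
      <state>MA</state>
    </member>
    <member>
      <id>S000148</id>
      <name>Charles E. Schumer</name>
      <party>D</party>
      <state>NY</state>
    </member>
  </results>
</result_set>"""

root = ET.fromstring(sample)
# Map bioguide code -> name. Because the code comes from the public
# Biographical Directory, it can link this data to other sources
# keyed on the same identifier.
members = {m.findtext("id"): m.findtext("name") for m in root.iter("member")}
print(members["K000105"])  # Edward M. Kennedy
```

From here, a member's bioguide code would be interpolated into the URL of the member-detail or vote-positions call.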
Bush Signing Statements Will Retire With Their Author, by Christopher Weaver, ProPublica, January 7, 2009.
"They will mean nothing" once Bush leaves office, said Stephen Saltzburg, a law professor at George Washington University and member of an American Bar Association task force that studied, and ultimately condemned, the practice of using signing statements to reject statutes. Presidents should veto laws they believe are unconstitutional, the task force said.
The National Coalition for History has the story: Presidential Records Reform Act is the First Bill Passed by the New House.
The end may finally be in sight to the seven-year battle historians and archivists have waged to overturn President Bush’s Executive Order 13233 of November 2001, which restricted access to presidential records. On January 7, 2009, the House of Representatives approved H.R. 35, the “Presidential Records Act Amendments of 2009,” by an overwhelming bipartisan vote of 359-58. H.R. 35 was chosen by the House leadership as the first piece of substantive legislation passed in 2009, as a symbol of government transparency.
The 2009 edition of the Statistical Abstract of the United States is now available from two sites:
- The 2009 Statistical Abstract. PDF and Excel
- Statistical Abstract of the United States, 2006-2009. PDF only.
It mystifies me that, with the problems of relying on proprietary formats and distributing data on CD-ROMs that became effectively unusable after a relatively short time, the Census Bureau is not making this essential publication available in a software neutral format such as CSV.
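The appeal of a format like CSV is precisely that nothing special is needed to read or write it -- any language's standard tooling suffices, now or decades from now. A small sketch (the figures below are invented for the example, not taken from the Statistical Abstract):

```python
import csv
import io

# A few illustrative rows in the style of a statistical table.
# The numbers are made up for this example.
rows = [
    ["year", "population_millions"],
    ["2007", "301.6"],
    ["2008", "304.1"],
]

# Writing CSV produces plain text: no proprietary software,
# no CD-ROM driver, nothing that goes obsolete.
buf = io.StringIO()
csv.writer(buf).writerows(rows)

# Reading it back requires nothing but the standard library.
parsed = list(csv.reader(io.StringIO(buf.getvalue())))
assert parsed == rows  # lossless round trip
```

A spreadsheet, a statistics package, or a ten-line script can all consume the same file, which is the whole point of a software-neutral format.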
NARA Cannot Assure Complete Transfer of Bush Records, by Steven Aftergood, Secrecy News, January 5, 2009.
Steven has a link to “National Archives Oversight: Protecting Our Nation’s History for Future Generations,” a hearing before the Senate Committee on Homeland Security and Governmental Affairs, May 14, 2008, and comments on the integrity of the process of transferring the records to the National Archives.
Although the President is supposed to obtain the written views of the Archivist prior to any proposed destruction of non-permanent records, "the final disposal authority rests with the incumbent president… regardless of the Archivist’s views."
SEC Approves Interactive Data for Financial Reporting by Public Companies, Mutual Funds, Securities and Exchange Commission, Press Release 2008-300, Washington, D.C., Dec. 18, 2008.
The SEC plans to phase out the EDGAR system and replace it with its Interactive Data Electronic Applications (IDEA) database. The IDEA system is based on eXtensible Business Reporting Language (XBRL), one of a number of XML markup languages which are used to encode documents and serialize data.
The Press Release says that, "With interactive data, all of the facts in a financial statement are labeled with unique computer-readable 'tags,' which function like bar codes to make financial information more searchable on the Internet and more readable by spreadsheets and other software. Investors will be able to instantly find specific facts disclosed by companies and mutual funds, and compare that information with details about other companies and mutual funds to help them make investment decisions."
Test Drive Interactive Data! The SEC's Interactive Financial Report Viewer allows you to interact with XBRL filings made as part of the SEC's Voluntary Filing Program.
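The "bar codes" analogy in the press release is worth unpacking: each fact in an XBRL filing is an XML element whose tag names the concept, so software can look facts up by name instead of scraping numbers out of formatted text. A drastically simplified, made-up fragment in the spirit of an XBRL instance (real filings use namespaces, context and unit declarations, and standardized taxonomies):

```python
import xml.etree.ElementTree as ET

# Simplified, invented fragment. A real XBRL instance document is far
# more elaborate; this only illustrates the tagged-fact idea.
instance = """<xbrl>
  <Revenues contextRef="FY2008" unitRef="USD">1000000</Revenues>
  <NetIncomeLoss contextRef="FY2008" unitRef="USD">250000</NetIncomeLoss>
</xbrl>"""

root = ET.fromstring(instance)
# Because every fact carries a machine-readable tag, pulling a
# specific figure is a dictionary lookup, not a text search.
facts = {el.tag: float(el.text) for el in root}
print(facts["NetIncomeLoss"])  # 250000.0
```

The same lookup works identically across companies that use the same taxonomy, which is what makes cross-company comparison feasible.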
A report by the Federal Web Managers Council provides some useful suggestions about how to make government information more useful.
- Putting Citizens First: Transforming Online Government. A White Paper Written for the 2008 – 2009 Presidential Transition Team by the Federal Web Managers Council, November 2008.
Among their findings and suggestions:
- There are approximately 24,000 U.S. Government websites now online (but no one knows the exact number).
- Only a minority of agencies have developed strong web policies and management controls. Some have hundreds of "legacy" websites with outdated or irrelevant content.
- We have too much content to categorize, search, and manage effectively, and there is no comprehensive system for removing or archiving old or underused content.
- Agencies should be required and funded to conduct regular content reviews, to ensure their online content is accurate, relevant, mission-related, and written in plain language. They should have a process for archiving content that is no longer in frequent use and no longer required on the website.
The report solicits comments, so I wrote the following to one of the co-chairs, Sheila Campbell:
I am writing to comment on and make a suggestion for
Putting Citizens First: Transforming Online Government A White Paper Written for the 2008 – 2009 Presidential Transition Team by the Federal Web Managers Council, November 2008 http://www.usa.gov/webcontent/documents/Federal_Web_Managers_WhitePaper....
May I suggest that, as you work with Federal Web Managers and with Congress on information dissemination requirements, you keep in mind two things:
1. Long-term preservation and usability of and access to even "out of date" government-created information is essential in a democracy. (We need an accurate *record* of government, not just a snapshot of what is current.)
2. The *primary* information role of the government is the creation and initial communication of information; government agencies will need help to ensure long-term preservation of information. (Agencies may cease to exist, or get merged with other agencies, or change their missions, or simply lack funding for providing long-term access to older information. Even the National Archives does not have a mandate to preserve everything that needs to be preserved.)
In keeping these two assumptions in mind, I suggest you promote two simple procedures:
1. Agencies should always, at the time information products are created, instantiate their information in open, preservable formats (e.g., not proprietary, commercial formats).
2. Agencies should always publicly announce and describe information products and make their digital information available through the Federal Depository Library Program (FDLP) and the Government Printing Office (GPO), where appropriate. GPO and the more than 1000 FDLP libraries can help preserve your digital information and keep it available for the long-term.
Finally, I realize that the day-to-day requirements of e-government and creating reliable transaction-based information services for citizens may seem to conflict with the long-term usability requirements of instantiating information in preservable, open formats. But there are successful models of doing both. For example, the Census Bureau makes its statistical information available through a transaction-based service (American FactFinder, http://factfinder.census.gov/) while, at the same time, making its raw data available in an operating-system-neutral, software-neutral format for researchers. There are many archivists and librarians and technical experts who can help agencies with these issues.
Thank you for your thoughtful report. I hope these comments help.
Here is an interesting series of interrelated posts discussing a proposal that Twitter should allow search filtering. Whether you twitter or not, and whether this was a good idea or not, the ideas are interesting and pertain to government information as well.
Here we find reference to 3 kinds of power:
1. the ability to force you to do what you don’t want to do;
2. the ability to stop you doing something that you want to do; and
3. the ability to shape the way you think.
Number 2 has an obvious analogy to the no-distribution FDLP because without distribution of the raw information it is difficult for libraries -- or anyone else -- to do interesting things like build full text indexes and specialized collections that combine government and non-government information or build mash-ups. But Number 3 is just as suggestive, because the interfaces that GPO and other government agencies provide give them the power to shape what questions we can ask and what answers we get.
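To make the "Number 2" point concrete: with the raw documents in hand, a library can build its own full text index with a few lines of code; with only an agency's search interface, it cannot. A toy sketch of an inverted index over an invented mini-corpus:

```python
from collections import defaultdict

# Toy corpus standing in for raw government documents.
# The ids and texts here are invented for illustration.
docs = {
    "doc1": "presidential records act amendments",
    "doc2": "executive order restricting access to presidential records",
}

# Build an inverted index: word -> set of document ids.
# This is only possible when you hold the raw text itself.
index = defaultdict(set)
for doc_id, text in docs.items():
    for word in text.split():
        index[word].add(doc_id)

print(sorted(index["presidential"]))  # ['doc1', 'doc2']
```

The same raw data could just as easily feed a specialized collection or a mash-up combining government and non-government sources -- exactly the uses that an interface-only distribution model forecloses.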
I'm not proposing a conspiracy of thought control; I am just saying that no one agency can provide all the possible views and interfaces and functionality that the richness of government information deserves.
GPO continues to insist that making information "publicly accessible [at an] Internet site" (SUPERINTENDENT OF DOCUMENTS POLICY STATEMENT 301) is adequate in the digital age. Distributing the raw data would do so much more and allow so much more extensive and better use of the data. Why not allow that? Is it about "power"...?
Happy new year to all of you. Whether you are on vacation and peeking at the news, or reading this as you just get back to work, here is something interesting and fun to see:
Martin Wattenberg is a computer scientist and new media artist. He is the founding manager of IBM’s Visual Communication Lab, which researches new forms of visualization and how they can enable better collaboration.
Check out his many projects (e.g., Name Voyager, the Baby Name Wizard with data from the Social Security Administration; history flow, visualizing the editing history of Wikipedia pages; or Many Eyes, an experiment in open, public data visualization and analysis).
This is another good example of what interesting things can be done when we have complete access to information. When the raw data are free, we can do so much more than the single views of data provided by government agencies.
Read more about Wattenberg here:
He creates ways of seeing information, by Billy Baker, Boston Globe, December 29, 2008.
The Diverse and Exploding Digital Universe: An Updated Forecast of Worldwide Information Growth Through 2011, An IDC White Paper - sponsored by EMC, John F. Gantz, Project Director, March 2008.
This report estimates the dimensions of the digital information explosion. With figures like 281 billion gigabytes (the size of the "digital universe" in 2007 -- a million times the amount of digital data hosted by the Library of Congress in 2008; see Berman, Francine, "Got data?: A guide to data preservation in the information age," Communications of the ACM 51, no. 12 (2008): 50-56) and estimates like "By 2011, the digital universe will be 10 times the size it was in 2006," the report has sobering implications for digital preservation. In fact, it notes that:
As forecast, the amount of information created, captured, or replicated exceeded available storage for the first time in 2007. Not all information created and transmitted gets stored, but by 2011, almost half of the digital universe will not have a permanent home.
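The headline figures are easier to grasp with a quick back-of-the-envelope conversion, which also checks that the "million times the Library of Congress" comparison hangs together:

```python
# Back-of-the-envelope check of the report's headline numbers.
digital_universe_2007_gb = 281e9  # 281 billion gigabytes

# 10^9 gigabytes = 1 exabyte, so the 2007 digital universe is 281 EB.
exabytes = digital_universe_2007_gb / 1e9
assert exabytes == 281.0

# If that is a million times the Library of Congress's digital
# holdings, the implied LoC figure is about 281,000 GB -- roughly
# 281 terabytes, consistent in scale with Berman's cited estimate.
loc_gb = digital_universe_2007_gb / 1e6
assert loc_gb == 281_000
```

Seen at that scale, the claim that nearly half of the digital universe will have no permanent home by 2011 is less surprising: storage simply is not growing as fast as the data.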