open access

Statistical Abstract of the United States, 2009 edition

The 2009 edition of the Statistical Abstract of the United States is now available from two sites:

It mystifies me that, given the problems of relying on proprietary formats and distributing data on CD-ROMs that became effectively unusable after a relatively short time, the Census Bureau is not making this essential publication available in a software-neutral format such as CSV.

SEC to use XBRL for Financial Reporting

SEC Approves Interactive Data for Financial Reporting by Public Companies, Mutual Funds, Securities and Exchange Commission, Press Release 2008-300, Washington, D.C., Dec. 18, 2008.

The SEC plans to phase out the EDGAR system and replace it with its Interactive Data Electronic Applications (IDEA) database. The IDEA system is based on eXtensible Business Reporting Language (XBRL), one of a number of XML markup languages which are used to encode documents and serialize data.

The Press Release says that, "With interactive data, all of the facts in a financial statement are labeled with unique computer-readable 'tags,' which function like bar codes to make financial information more searchable on the Internet and more readable by spreadsheets and other software. Investors will be able to instantly find specific facts disclosed by companies and mutual funds, and compare that information with details about other companies and mutual funds to help them make investment decisions."
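To make the idea of "tagged" facts concrete, here is a minimal sketch in Python. It does not use real XBRL: the element and attribute names (`filing`, `fact`, `concept`, and so on), the company names, and the numbers are all invented for illustration. It only shows why labeling each fact makes it easy for software to pull out one figure and compare it across companies.

```python
import xml.etree.ElementTree as ET

# Hypothetical tagged facts. These element names, companies, and values are
# made up for illustration; they are NOT taken from any real XBRL taxonomy.
SAMPLE = """
<filing>
  <fact company="Acme Corp" concept="Revenue" period="2008" unit="USD">1200</fact>
  <fact company="Acme Corp" concept="NetIncome" period="2008" unit="USD">150</fact>
  <fact company="Widget Inc" concept="Revenue" period="2008" unit="USD">900</fact>
</filing>
"""


def find_facts(xml_text, concept):
    """Return {company: value} for every fact tagged with the given concept."""
    root = ET.fromstring(xml_text.strip())
    return {
        fact.get("company"): float(fact.text)
        for fact in root.iter("fact")
        if fact.get("concept") == concept
    }


if __name__ == "__main__":
    # Compare one tagged fact across the companies in the sample.
    for company, value in find_facts(SAMPLE, "Revenue").items():
        print(company, value)
```

Because each number carries its own label, the same few lines work no matter how a company lays out its printed statement.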

See also:

SEC To Replace EDGAR With 'IDEA'

Test Drive Interactive Data! SEC's Interactive Financial Report Viewer. The Viewer allows you to interact with XBRL filings made as part of the SEC's Voluntary Filing Program.

Power and Influence and Authority

Here is an interesting series of interrelated posts discussing a proposal that Twitter should allow search filtering. Whether you use Twitter or not, and whether this was a good idea or not, the ideas are interesting and pertain to government information as well.

Here we find reference to 3 kinds of power:

1. the ability to force you to do what you don’t want to do; 
2. the ability to stop you doing something that you want to do; and 
3. the ability to shape the way you think.

Number 2 has an obvious analogy to the no-distribution FDLP because without distribution of the raw information it is difficult for libraries -- or anyone else -- to do interesting things like build full text indexes and specialized collections that combine government and non-government information or build mash-ups. But Number 3 is just as suggestive, because the interfaces that GPO and other government agencies provide give them the power to shape what questions we can ask and what answers we get.

I'm not proposing a conspiracy of thought control; I am just saying that no one agency can provide all the possible views, interfaces, and functionality that the richness of government information deserves.

GPO continues to insist that making information "publicly accessible [at an] Internet site" (SUPERINTENDENT OF DOCUMENTS POLICY STATEMENT 301) is adequate in the digital age. Distributing the raw data would do so much more and allow so much more extensive and better use of the data. Why not allow that? Is it about "power"...?

Happy New Year!

Happy new year to all of you. Whether you are on vacation and peeking at the news, or reading this as you just get back to work, here is something interesting and fun to see:

Martin Wattenberg. Data visualization. Media art. Collective intelligence.

Wattenberg is a computer scientist and new media artist. He is the founding manager of IBM’s Visual Communication Lab, which researches new forms of visualization and how they can enable better collaboration.

Check out his many projects (e.g., Name Voyager, the Baby Name Wizard with data from the Social Security Administration, or history flow, visualizing the editing history of Wikipedia pages, or Many-Eyes, an experiment in open, public data visualization and analysis).

This is another good example of what interesting things can be done when we have complete access to information. When the raw data are free, we can do so much more than the single views of data provided by government agencies.

Read more about Wattenberg here:

He creates ways of seeing information, by Billy Baker, Boston Globe, December 29, 2008.

LCSH.info RIP

LCSH.info, RIP, December 22, 2008, Tim, LibraryThing.

I am not as up on or enthusiastic about Ed's Semantic-Web intentions, but the open-data implications are clear: the Library of Congress just took down public data. I didn't think things could get much worse after the recent OCLC moves, but this is worse.

The time has come to get serious. The library world is headed in the wrong direction. It's wrong for patrons—and taxpayers. And it's wrong for libraries.

See also (if it is still there):

uncool uris, posted on December 19, 2008, 10:32 pm, by Ed Summers.

On December 18th I was asked to shut off lcsh.info by the Library of Congress. As an LC employee I really did not have much choice other than to comply.

The lcsh.info domain was registered by me in order to demonstrate how the Library of Congress Subject Headings could be represented as a Semantic Web application using SKOS....

LC is still considering running a service like lcsh.info at loc.gov, but it’s not there for me to link to yet.

Carl Malamud Featured in Wired Magazine

There is a great article about Carl Malamud and PACER over at Wired Magazine:

"Online Rebel Publishes Millions of Dollars in U.S. Court Records for Free" by Ryan Singel.

Malamud says he's looking forward to the day he doesn't have to game the system. "If I had $10 million, I'd make a copy of all the documents and be done."

I hear ya, Carl. I hear ya.

Obama Transition Team To Reveal Documents and Meetings with Groups

Your Seat at the Table, by Dan McSwain, Change.gov, December 5, 2008.

In a memo released today, Obama-Biden Transition Project Co-chair John D. Podesta announced that all policy documents from official meetings with outside organizations will be publicly available for review and discussion on Change.gov.

There are some really interesting documents on the Your Seat at the Table page of Change.gov.

See Also:

CFAC and MAPLight sue for public access

CFAC and MAPLight sue for public access to state’s legislative database, California First Amendment Coalition, December 03, 2008.

The California First Amendment Coalition and MAPLight.org have filed suit against the California Legislative Counsel’s office seeking a copy of California’s full legislative database--the texts of bills, amendments, votes, dates, etc.--for all legislation.

This is a particularly interesting lawsuit because, as the press release above notes, "Although the public currently can access this info one bill at a time through the state’s official website, that does not allow computer-assisted analysis of the data." CFAC calls these "government-controlled databases" and they are indeed controlled by the government.
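As a rough illustration of what "computer-assisted analysis" could look like once the whole database is in hand, here is a minimal Python sketch. The column names, bill numbers, and legislators are invented placeholders, not the actual layout of California's legislative database, which I have not seen.

```python
import csv
import io
from collections import Counter

# Invented sample rows standing in for a bulk export of roll-call votes.
# The columns and values are assumptions made for this example only.
SAMPLE_CSV = """bill,legislator,vote
AB 1,Member A,aye
AB 1,Member B,no
SB 2,Member A,aye
SB 2,Member B,aye
"""


def aye_counts(csv_text):
    """Count the 'aye' votes cast by each legislator across all bills."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["vote"] == "aye":
            counts[row["legislator"]] += 1
    return dict(counts)


if __name__ == "__main__":
    print(aye_counts(SAMPLE_CSV))
```

A question like this takes one pass over a bulk file, but it would mean visiting every bill's page by hand on a one-bill-at-a-time website.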

This case is an excellent example of why we cannot rely only on the government to provide the access we need to government information. What we want is to wrest control from the government so that government information can be used in ways that any particular government agency does not (or cannot) provide.

Principles for an Open Transition

Lawrence Lessig launched the website An Open Transition which offers President-elect Obama three principles to "guide the transition in its objective to build upon the very best of the Internet to produce the very best for government".

These principles include:

- No Legal Barrier to Sharing
- No Technological Barrier to Sharing
- Free Competition

Read more about these principles, view the video, and sign the petition at open-government.us.

Should Obama ditch YouTube?

Chris Soghoian at CNet makes some interesting arguments for Why Obama should ditch YouTube (November 24, 2008).

He says that the "use of YouTube and Google Analytics by the Obama transition team violates the privacy of Web site visitors and possibly even violates federal rules banning the use of permanent tracking cookies on government sites" and that the government should host its own videos.

...when a regular YouTube user views a video embedded in a blog or other third-party site, the user's cookie is automatically sent to YouTube's servers--even without the user clicking the play button. Given the widespread use of embedded videos, this gives Google, which owns YouTube, an even better idea of the surfing habits of millions of people around the world.

He suggests: "By all means, use streaming video to reach the masses, but let the bits flow from government-owned servers (preferably without privacy-invading cookies). If bloggers wish to embed YouTube videos of the speech on their own sites, that is fine. But Obama shouldn't."

I would go a step further by challenging the new President to use open-format standards for video and audio so that they can be preserved.

Obama's Technological Promises

Ok, Mr. President...fulfill your technological promises! I am very excited about some of his proposals, especially with regard to government information transparency and access.


Mashable.com posted "A Final Look at Presidential Technology Policy" earlier this week and they had this to say about Obama vs. McCain's plans:

Rather than focusing on anti-trust and subsidies, as Barack Obama intends to do, what would be better would be focusing on creating an environment where corporate taxes were lowered, and other tax incentives were emphasized for start-ups who focus on better information infrastructure. Senator McCain’s tax plan is moderately favorable towards this theory, though it is likely simply a coincidence convenient to this argument rather than a well thought out technology policy.

When it comes to the basics, both presidential candidates are generally on the right track, and are generally in agreement as well. I’ve outlined above where they differ, though, and I think history has shown that Barack Obama’s desired policy directions would be more detrimental to innovation and growth for the tech sector.

Interesting that they believe Obama's desired policies may be detrimental to technology. I'm not well versed enough on the issues of Broadband/Anti-trust & subsidies to know whether or not I agree. What do you think?

Mashable also has a great blog post on "Government 2.0: The Presidential Transition". I agree with the author's sentiment that the new President must look to the needs of the entire nation, and we need to give our input too.

...citizens should be engaged in the transition process,...In an increasingly fragmented media and information society, that level of engagement requires more than a press release and newspaper coverage. It means full multimedia engagement using blogging, speeches, informal gatherings, mobile technologies, podcasts, online video, and widgets. The outreach should also use social tools that allow bidirectional conversation, increasing citizen participation and interest in government.

Free E-Book on Copyright

Cory Doctorow, co-editor at boingboing.net, Fellow for the Electronic Frontier Foundation, and contributor to Wired, Popular Science, the New York Times, etc., has published a book called Content: Selected Essays on Technology, Creativity, Copyright, and the Future of the Future and it's available for download on his website...for free! Cory is an advocate of the Creative Commons organization, using some of their licenses for his own books.

Here is an excerpt:

Back in 1985, the Senate was ready to clobber the music industry for exposing America’s impressionable youngsters to sex, drugs, and rock-and-roll. Today, the Attorney General is proposing to give the RIAA legal tools to attack people who attempt infringement.

Through most of America’s history, the U.S. government has been at odds with the entertainment giants, treating them as purveyors of filth. But not anymore: today, the U.S. Trade Rep is using America’s political clout to force Russia to institute police inspections of its CD presses. (Savor the irony: post-Soviet Russia forgoes its hard-won freedom of the press to protect Disney and Universal!)

How did entertainment go from trenchcoat pervert to top trade priority? I blame the “Information Economy.”

No one really knows what “Information Economy” means, but by the early ’90s, we knew it was coming...

What the Next President Needs to Do for the Internet

There is a great blog post over at the Center for Democracy & Technology's Policy Beta Blog:

"Innovation, the Open Internet, and the Next President".

It gives an overview of what our new President should do (or not do!) to encourage innovation and keep the internet open. Some points include:

One of the new president’s first tasks will be to select top officials for executive branch positions. The FCC, the FTC, DoJ, NTIA, and the new Intellectual Property Enforcement Coordinator (created by recently passed legislation) all will have a hand in policies with potentially significant impact on the Internet...

The president also should avoid new copyright policies that fail to protect emerging forms of free expression in the digital realm...

If the next president wants to encourage innovation, preserving the open character of the broadband Internet should be a top priority, right up there with the commonly cited goal of continuing to improve the nation’s broadband infrastructure.

I would also add that our new President needs to support digital preservation technologies and standards, as well as digital authentication of documents online.

Here is another post in a similar vein: "Next President Has 'Open' Opportunity".

The Center for Democracy & Technology also has a page entitled "The Internet in Transition" with a blueprint for keeping the internet open, innovative, and free.

CRS Reports to the People!

Now that a new administration will be coming into office soon, it is more important than ever to encourage our Government to make Congressional Research Service (CRS) Reports publicly accessible online. Here at FGI, the topic of CRS Reports has been written about often, but I was inspired to create this blog post and take action after seeing Starr Hoffman’s DLC conference presentation last week (click on "Search Document" and enter "Starr Hoffman"; her PowerPoint, "Encouraging An Informed Citizenry," will come up as a PDF to download).

Starr is responsible for maintaining the University of North Texas's Congressional Research Service (CRS) Reports Archive. In her presentation, she gives tips for writing to Congressmen and lists some past legislative efforts (bills that never passed both houses of Congress) to make CRS Reports publicly accessible. I have gathered some other bills, as well as all the contact information for the sponsoring Congressmen, and have included them in my Delicious.com "CRS" tag as well as in this list:

1998 H.R. 3131, S. 1578
1999 H.R. 654, S. 393
2000 H.R. 4582
2001 S. Res. 21
2003 H.R. 3630, S. Res. 54
2007 H.R. 2545, S. Res. 401

Senator John McCain
Introduced S. 1578, S. 393, S. Res. 21, and S. Res. 54, and co-sponsored S. Res. 401
Senator Mike Enzi
Co-sponsored S. 393
Senator Patrick Leahy
Co-sponsored S. 393, S. Res. 21, S. Res. 54, and S. Res. 401
Senator Tom Coburn
Co-sponsored H.R. 4582 when he was in the House.
Senator Jim DeMint
Introduced H.R. 4582 when he was in the House.
Senator Joe Lieberman
Introduced S. Res. 401 and co-sponsored S. Res. 21 and S. Res. 54
Senator Tom Harkin
Co-sponsored S. Res. 54 and S. Res. 401
Senator Susan M. Collins
Co-sponsored S. Res. 401
Senator John Cornyn
Co-sponsored S. Res. 401

Congressman David Price
Co-sponsored H.R. 3131, H.R. 654, H.R. 3630, and H.R. 2545
Congressman Christopher Shays
Introduced H.R. 3131, H.R. 654, H.R. 3630, and H.R. 2545
Congressman John Campbell
Co-sponsored H.R. 654
Congressman Tom Tancredo
Co-sponsored H.R. 4582
Congressman Jay Inslee
Co-sponsored H.R. 3630 and H.R. 2545

And you can find and contact your own Senators and Representative too.

James A. Jacobs did a Google search this past June for "Received through the CRS Web" OR "CRS Report for Congress" combined with site:house.gov and then again for site:senate.gov and got around 600 hits with each. For example, here are some domains he found that you can search within for CRS Reports or to search for those in Congress who may support public access to CRS Reports: bartlett.house.gov, holt.house.gov, radanovich.house.gov, weldon.house.gov, bennelson.senate.gov, carper.senate.gov, lugar.senate.gov, murray.senate.gov, etc.
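If you want to repeat that search across several domains, the queries can be generated rather than typed. This is a small Python sketch under my own assumptions: the helper names `crs_query` and `search_url` are invented here, and the search URL pattern is simply the ordinary Google search address with the query passed in the `q` parameter.

```python
from urllib.parse import quote_plus

# The two phrases used in the search described above.
PHRASES = ['"Received through the CRS Web"', '"CRS Report for Congress"']


def crs_query(site):
    """Build the search string for one domain, e.g. house.gov."""
    return " OR ".join(PHRASES) + f" site:{site}"


def search_url(site, base="https://www.google.com/search?q="):
    """Return a URL for the query; `base` is an assumption and can be swapped."""
    return base + quote_plus(crs_query(site))


if __name__ == "__main__":
    for site in ["house.gov", "senate.gov", "holt.house.gov", "lugar.senate.gov"]:
        print(crs_query(site))
        print(search_url(site))
```

The number of hits will of course differ from what was found in June, since offices add and remove reports over time.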

For more information on CRS Report legislation efforts, visit this site which contains a "Campaign for Online Access" section.

Spread the word about this post and good luck in writing to your Congressmen! If you have other ideas, please share them in the comments.

Obstacles to the dream of universal access

This paper, while examining issues around open access to digital information from museums and cultural heritage institutions, touches on issues that are relevant to government information:

Crofts, of the Musées d'art et d'histoire, Switzerland, argues that one of the biggest obstacles to universal access is museums' commercial interests and their desire to "brand" their "assets."

To put it bluntly, universal access may be in conflict, or at least may be perceived to be in conflict, with an institution's commercial interests....

In the current economic climate there is strong pressure on museums of all sorts, both public and private, to maximise their performance - to turn a profit or, at least, to cut costs - and to demonstrate their relevance in terms of number of visitors. A museum's collections are its major "asset". Access to the collection and derived products can be commercialised directly or, in a not-for-profit organisation, leveraged so as to shine by whatever performance criteria are in place. In this context, allowing free unrestricted access to these assets may be seen simply as undermining the institution's potential or, more cannily, as a form of advertising....

Incorporated into a common search engine, digital assets tend to become fungible and anonymous, just part of an immense result set, or worse still, they may become identified with the search engine itself....

Copyright notices and other restrictions on institutional websites generally prevent or at least discourage reuse.

This reminds me of GPO and other government agencies that are forced through legislation, skimpy budgets, and OMB regulations to attempt to commercialize their "assets" -- what we might call "charging the public for information it has already paid for."

Different agencies attack these problems differently. I was particularly reminded of the PACER court records project when I read this in Crofts' paper:

...for many institutions, the accounting costs associated with charging for use of images far exceeds any revenue....

While making cultural material freely available is part of their mission, and therefore a goal that they are obliged to support, it may still come into conflict with other factors, notably commercial interests.

Stephen Schultze examined the profits being made by the PACER project in his recent seminar at the Berkman Center (see Lunchtime Listen: Open Access to Government Documents).

FDLP librarians have seen this approach tried over and over again. When GPO first launched GPO Access it charged for access while at the same time providing free access inside FDLP libraries. Libraries responded by creating gateways that provided free access to GPO Access. GPO eventually cooperated with this grass-roots effort (GPO Access Gateways Project) and finally dropped its effort to charge for access to GPO Access. More recently, we have seen agencies using licensing restrictions to restrict access (GPO details onerous restrictions on digital materials) and agencies cooperating with the private sector to commodify their resources (The NARA/TGN contract as a bad precedent). And, with the PACER project, we see a return to the old model of limiting free access to certain facilities (Pilot Project: Free Access to Federal Court Records at 16 Libraries).

When legislative bodies skimp on the budgets for public dissemination of public information and create regulations that favor the private sector over the public sector for dissemination (rather than relying on both equally), they create obstacles to access. When agencies seek to commercialize their information and control access to it, they set up barriers to access. These obstacles are not in the interest of the government or the people.

I would like to think that such efforts are doomed to failure the way charging for GPO Access failed. But when agencies use licenses to prevent free access and when libraries fail to take the initiative to demand free access and cooperate with projects that limit free access, it is difficult to imagine how free access will survive.
