transparency

USPTO latest agency looking to outsource their data

[UPDATE: Michael Keller, University Librarian at Stanford University (and my boss), wrote a letter to USPTO as well. Thanks Carl for posting it to scribd.]

Carl Malamud made me aware (see his letter to USPTO CIO John Owens below) of a posting on FedBizOpps of a Request for Information (RFI) from the US Patent and Trademark Office:

"This RFI seeks to obtain information from interested parties, including the vendor community, about potential opportunities to acquire patent and trademark data in bulk (my emphasis) and to provide such data to the public without cost. The USPTO is seeking comments on the identified problem and solutions that will make the data available to the public without charge."

While there is mention in the RFI of IP data being easily accessible to the public, there's no mention of data.gov. This seems to be purely a "no-cost" way for USPTO to upgrade their IT infrastructure by giving away public domain information.

This is worrisome on many levels: it is one more example of a government agency looking to outsource and privatize public domain information *and* its IT infrastructure -- see for example the Thomson West contract with the GAO to digitize their legislative histories. Additionally, in a vague nod to transparency, USPTO will be holding one (yes, only one) vendor information meeting on Sept 24. I'm not sure how USPTO thinks that two weeks' notice for a single meeting held in the DC area will help the cause of transparency. Shouldn't they hold several meetings in different geographic locations to talk about such a huge and important public resource (nearly 2 petabytes of data!!)?

Luckily, this is only at the RFI stage, not RFP stage. USPTO is currently only looking for information on how to do this. This is the time for the government information/transparency communities to submit ideas for how the USPTO could make their patent information available *without* giving it away to vendors. Please contact the USPTO at the addresses below and give them ideas for making their data open, standardized and freely available in bulk.

Public Meeting:
Thursday, 24 September 2009
9:00 a.m. - 11:00 a.m.
USPTO Campus in the Madison Auditorium
600 Dulany Street
Alexandria, VA, 22314

Contracting Office Address:
P. O. Box 1450 - Mail Stop 6
600 Dulany Street, MDE, 7th Floor
Alexandria, Virginia 22313-1450

Primary Point of Contact:
publicdatadissemination@uspto.gov

Secondary Point of Contact:
V. Anne Tugbang,
Contracting Officer
vanne.tugbang@uspto.gov
Phone: 571-272-6550
Fax: 571-273-6550


Letter to John B. Owens, II, September 15, 2009

Transparency camp west

Just a quick note to let folks know that I'm at Transparency Camp this weekend. It's a great meeting of activists and technologists concerned with all kinds of transparency in government. To follow what's happening, you can track the Twitter hashtag #tcamp09. Since it's an unconference, talks can be proposed on the fly. I gave a talk on the FDLP which got a lot of interest. Contact me if you've got other ideas for presentations. There are still slots available for Sunday.

The Power of Versioning: Climate Change Bill

Our friends at OpenCongress recently provided a concrete example of the benefit of being able to work with government-provided data. In a July 1, 2009 blog posting titled "See all the Last-Minute Changes to the Climate Change Bill," blogger Donny Shaw notes:

We may never get the details of the back-room negotiating that took place leading up to the bill’s passage in the House on Friday, but with OpenCongress’s legislative versioning tool we can see exactly what was changed in the bill in the process and then start to figure out why. Just go to the text of the bill as passed by the House and select “Show Changes.” You can scan the entire bill and see, with color-coded text, exactly what was changed – red, struck-out text denoting changed or removed sections in the bill, and green text denoting sections that were inserted or modified.

Donny spent about 30 minutes scanning through the bill's changes and documented what he found. What can you find?

This sort of quick work at spotting rushed changes is only possible because copyright-free federal legislation is available to transparency organizations like OpenCongress to feed into their revision-comparison software. This gives regular citizens the kind of specialized access to legislation that was formerly available only to subscribers of expensive premium services. This is a good thing. The Government Printing Office's talks with the Library of Congress about bulk distribution of legislative data will only make things easier.
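For the curious, here is a rough idea of how a "Show Changes" view can be built once the text of two bill versions is freely available. This is only a minimal sketch using Python's standard `difflib` module, not OpenCongress's actual implementation; the function name `mark_changes`, the `[-...-]` and `{+...+}` markers (standing in for the red and green text), and the sample sentences are all made up for illustration.

```python
import difflib


def mark_changes(old_text: str, new_text: str) -> str:
    """Compare two versions word by word.

    Removed words are wrapped in [-...-] and inserted words in {+...+}.
    (Hypothetical markers standing in for red and green highlighting.)
    """
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)

    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run of words: copy through as-is.
            pieces.append(" ".join(old_words[i1:i2]))
            continue
        if i1 < i2:
            # Words present only in the old version (deleted or replaced).
            pieces.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if j1 < j2:
            # Words present only in the new version (inserted or replacing).
            pieces.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(pieces)


# Invented sample sentences, not actual bill text.
introduced = "The Administrator shall issue rules within 12 months"
passed = "The Administrator shall issue final rules within 24 months"
print(mark_changes(introduced, passed))
```

A real tool would work section by section and render HTML, but the principle is the same: none of it is possible unless the underlying text is available in bulk and free of licensing restrictions.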

Sunlight Foundation's Transparency Corps Recruits People Amazon Turk Style

The Sunlight Foundation recently announced the creation of the Transparency Corps. Modeled after Amazon’s Mechanical Turk, the Transparency Corps aims to make it easy to harness small contributions from enthusiastic volunteers to advance government transparency.

From the June 30, 2009 Sunlight Foundation press release:

“Inspired by Amazon’s Mechanical Turk, Sunlight created Transparency Corps as a new way for people to volunteer to make government transparency a reality,” said Ellen Miller, executive director and co-founder of the Sunlight Foundation. “Now, when people ask ‘how can I help?’ Sunlight and future partners can provide micro-tasks that when aggregated, help solve research and data analysis problems when computers alone cannot properly scrutinize government information.”

Right now there are two projects:

Each time you complete a task, you get points. Those points add up and are how you move up the transparency leader board. I joined up to see what a task would look like. For the earmarks task I was presented with a PDF of a letter requesting funding for a local project and a form to the right of the letter to be filled in with data such as the amount requested, title of the project and other requester information. You can see an example of one of the letters on Scribd.

I am curious to see how big they can grow their corps & see what projects they target over the next year. I love that they are grabbing structured data. This particular task is part transcription and part encoding and reminds me of some of the work being done over on Freebase.com. For an example of one of the datasets they are building, take a look at their U.S. National Register of Historic Places base or the Government Commons.

More Gov Info Presentations @ ALA Annual

If you are going to the ALA Annual 2009 Conference in Chicago next week, please come to the "ALA Unconference" where I will be leading a broad discussion on Friday, July 10th from 11:10-12:00 on the library's role in current & emerging trends of civic engagement, transparency, preservation and access to Government information. The supporting materials and presentation will be linked in the Unconference wiki.

Also, please come to the LITA BIGWIG Social Software Showcase to discuss and learn about Government Information Mashups! I will be presenting on this topic and would love to have you help out and/or join in on the conversation! The presentation will be posted on their website but the face to face portion of the BIGWIG Showcase presentations will take place Monday, July 13th from 10:30am - 12:30pm in the McCormick Convention Center West, Room W-184.

How Congress Uses Twitter (Research Preview)

[UPDATE 7/12/10: I updated the link to the paper from Justin's site to the umd site where the paper was officially published. jrj]

I thought I would give the readers of FGI the first scoop on some early research that is coming out of the University of Maryland on how members of Congress are using Twitter.

Twitter Use by the U.S. Congress (currently under review)

Abstract: Twitter is a microblogging service boasting over 7 million members and growing at a tremendous rate. With the buzz surrounding the service have come claims of its ability to transform the way people interact and share information, and calls for public figures to start using the site. In this study, we examine the way Twitter is being used by legislators, particularly by members of the United States Congress. We read and coded over 4,500 posts from all members of Congress using the site. Our analysis shows that Congresspeople are primarily using Twitter to post information, particularly links to news articles about them and their blog posts, and to report on their simple activities. These tend not to provide new insights into government or the legislative process or to improve transparency; rather, they are vehicles for self-promotion. However, Twitter is also facilitating direct communication between Congresspeople and citizens, though this is a less popular activity. In this paper, we report on our results, analysis, and provide suggestions for how Twitter can be used by Congresspeople in ways that benefit the citizens, not just the PR machines of the legislators themselves.

From the results of this study we found that Twitter is being used effectively in some spaces and not as effectively in others. In particular, Twitter has created opportunities for increased communication between citizens and Congresspeople, but the majority of posts contained information or location and activities which were being used for outreach and self promotion rather than to provide information that is helpful to citizens.

* Note this paper has been submitted for an upcoming conference but has NOT been accepted, peer-reviewed, or published. Please DO NOT CITE this article but if you are interested feel free to contact me.

Data.gov Goes Live!

Data.gov is now live and ready for you to explore!

The purpose of Data.gov is to increase public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government.

You have a say in the future of Data.gov: you can suggest datasets to include as well as improvements and enhancements to the website.

Data.gov has a searchable data catalog that gives access to data through the "raw" data catalog and by using tools. "The Raw Data Catalog provides an instant download of machine readable, platform-independent datasets while the Tools Catalog provides hyperlinks to tools that allow you to mine datasets."

Please note that by accessing datasets or tools offered on Data.gov, you agree to the Data Policy, which you should read before accessing any dataset or tool.

Here is an excerpt from the policy that we need to read closely:

Secondary Use
Data accessed through Data.gov do not, and should not, include controls over its end use. However, as the data owner or authoritative source for the data, the submitting Department or Agency must retain version control of datasets accessed. Once the data have been downloaded from the agency's site, the government cannot vouch for their quality and timeliness. Furthermore, the US Government cannot vouch for any analyses conducted with data retrieved from Data.gov.

Citing Data
The agency's preferred citation for each dataset is included in its metadata. Users should also cite the date that data were accessed or retrieved from Data.gov. Finally, users must clearly state that "Data.gov and the Federal Government cannot vouch for the data or analyses derived from these data after the data have been retrieved from Data.gov."
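To make the citation requirements above concrete, here is a minimal sketch of a helper that assembles a compliant citation string. The disclaimer sentence is quoted from the policy excerpt; everything else (the function name `cite_dataset`, the "Accessed via Data.gov on" wording, the ISO date format, and the sample citation) is my own assumption, since the policy does not prescribe an exact layout.

```python
from datetime import date

# Sentence the Data Policy excerpt says users must clearly state.
DISCLAIMER = (
    "Data.gov and the Federal Government cannot vouch for the data or "
    "analyses derived from these data after the data have been retrieved "
    "from Data.gov."
)


def cite_dataset(preferred_citation: str, accessed: date) -> str:
    """Combine the agency's preferred citation, the access date,
    and the required disclaimer into one string.

    The layout is an assumption; only the three ingredients come
    from the policy excerpt.
    """
    return (
        f"{preferred_citation.strip()} "
        f"Accessed via Data.gov on {accessed.isoformat()}. "
        f"{DISCLAIMER}"
    )


# Hypothetical citation text; in practice it comes from the dataset's metadata.
example = cite_dataset("Example Agency. Example Dataset, 2009.", date(2009, 5, 21))
print(example)
```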

What do you think? Is the policy fair? Any suggestions for improvement we could make to Data.gov?

For more information, visit their FAQ and Tutorial.

Also, check out Sunlight Labs' "Apps for America 2: The Data.gov Challenge"!

Just as the federal government begins to provide data in Web developer-friendly formats, we're organizing Apps for America 2: The Data.gov Challenge to demonstrate that when government makes data available it makes itself more accountable and creates more trust and opportunity in its actions. The contest submissions will also show the creativity of developers in designing compelling applications that provide easy access and understanding for the public while also showing how open data can save the government tens of millions of dollars by engaging the development community in application development at far cheaper rates than traditional government contractors.

Now, let's go play around with this new site and make suggestions, shall we?

Transparency 2.0

There is a lot of talk about transparency in government, and it seems that the Obama team is investing a lot of honest effort in making government more transparent. But the obstacles are big. This article gives a good overview of the scale of the problem, concentrating particularly on HUD and recovery.gov.

NPR Discusses Government 2.0: "Transparency Kills Apathy"

NPR.org has a brief article and audiocast entitled "21st Century Crowbars Help Pry Open Government" by Andrea Seabrook. It is the second of a two-part series, of which the first part is entitled "Follow the Money: Web Site Tracks Stimulus Dollars".

Both stories highlight several "watchdog" websites such as OpenCongress.org, OpenSecrets.org, Filibusted.us, and Legistalker.org. Filibusted.us recently won the "Apps for America" contest hosted by the Sunlight Foundation.

Clay Johnson of Sunlight Labs, described by NPR as a 21st century government watchdog, states:

We live in a society now where if it's not on the Internet, it doesn't exist. The more transparent we make government, the more people can participate in it. And when people participate in it, they're no longer apathetic about it. So transparency kills apathy.

OOGL: Open Our Government List

The Sunlight Foundation has a new website called OOGL: Open Our Government List, where you can vote on and submit ideas for what the Open Government Directive should address.

Shortly after President Obama's inauguration, he issued a memo on transparency directing his top officials to develop plans for an Open Government Directive to promote transparency, participation, and collaboration. The Sunlight Foundation has created this page in order to add a public element to the crafting of this Open Government Directive that is itself transparent, participatory, and collaborative.

So far, the most votes have gone to Ethics Information, APIs & Bulk Data Access, and Procedural Information.

Spread the word and vote!
