The White House has issued a new Executive Order on open data:
- Making Open and Machine Readable the New Default for Government Information. EXECUTIVE ORDER, May 09, 2013.
To promote continued job growth, Government efficiency, and the social good that can be gained from opening Government data to the public, the default state of new and modernized Government information resources shall be open and machine readable. Government information shall be managed as an asset throughout its life cycle to promote interoperability and openness, and, wherever possible and legally permissible, to ensure that data are released to the public in ways that make the data easy to find, accessible, and usable. In making this the new default state, executive departments and agencies (agencies) shall ensure that they safeguard individual privacy, confidentiality, and national security. [emphasis added]
- Open Data Policy-Managing Information as an Asset. Memorandum For The Heads Of Executive Departments And Agencies M-13-13, Office of Management and Budget (May 9, 2013). [pdf. 12 pages]
- Landmark Steps to Liberate Open Data. by Todd Park and Steve VanRoekel White House Blog (May 09, 2013)
John Wonderlich at the Sunlight Foundation has an excellent analysis and commentary:
- Open Data Executive Order Shows Path Forward, by John Wonderlich, Sunlight Foundation Blog (May 9, 2013).
[T]he new policies take on one of the most important, trickiest questions that these policies face -- how can we reset the default to openness when there is so much data? How can we take on managing and releasing all the government's data, or as much as possible, without negotiating over every dataset the government has?
How can the public (or policymakers) request what they don't know exists? How can CIOs manage what they haven't surveyed?
...Today's Executive Order demonstrates a new approach to open data, moving beyond rhetoric and aspiration, requiring agencies to publicly report on what data can be made public, building a new backbone for federal open data policy, and setting an example for other governments to follow. [emphasis added]
- New Open Data Memorandum almost defines open data, misses mark with open licenses. by Joshua Tauberer (May 9th, 2013).
- President Obama’s New E.O.: Open Data, Not Government Transparency by Jim Harper, Cato Institute (May 9, 2013).
Sunlight Foundation's OpenGov Champion of the month is Sandra Moscoso. Sandra is a mom of two public school students in Washington DC, and a member of the Capitol Hill Public School Parent Organization (CHPSPO) -- oh and she just happens to manage an open data portal at the World Bank’s financial sector.
...she and other CHPSPO members were able to collect data to show how the schools that had a full time librarian had better test score results than those who had lost theirs due to budget cuts. The group was able to use that figure as an effective basis for their request to the city to restore funding for librarians.
Man, this week is Sunshine-week-alicious! The Sunlight Foundation has long advocated for -- along with FGI, library- and open govt organizations -- the free public access to Congressional Research Service (CRS) reports. CRS Reports are commonly not available to the public as CRS has this arcane and outdated rule that CRS reports are privileged communication between Congress and CRS. But CRS reports ARE available randomly online and Proquest, Penny Hill Press and other commercial publishers have long published them for a fee (I've even heard that CRS subscribes to Proquest to get access to their own reports historically!).
But this all may change. According to the Sunlight Blog, Representatives Leonard Lance (R-NJ) and Mike Quigley (D-IL) have reintroduced the bipartisan House Resolution 110 "Public Access to Congressional Research Service Reports Resolution of 2013" (text not received by GPO yet so not publicly available on Thomas). The Resolution would direct the Clerk of the House of Representatives to provide members of the public with Internet access to certain Congressional Research Service publications. Easy-peasy right?!
More than 30 organizations -- including Sunlight Foundation and FGI -- have signed on to a letter supporting the resolution. Please consider contacting your Representative and ask them to support H.Res. 110!
[UPDATE 4/2/13: We've had some questions about the meaning of "ALL." Please read the comment thread for clarification. We don't mean "records" (which fall under FOIA) and we don't mean classified information. We mean public domain documents, publications, reports, data, statistics and the like. JRJ]
A convergence of several things -- the White House's new policy on Open Access to federally funded scientific information, the NAPA Report on the GPO, the CASSANDRA Letter to the Public Printer, and Sunshine Week among them -- has led us to create a petition on the White House's We the People petition site. If you believe in free permanent public access to authentic government information, we hope you'll sign the petition and forward on to all your friends and social networks to help us reach our goal of 100,000 signatures by April 11, 2013! Thanks in advance!!
WE PETITION THE OBAMA ADMINISTRATION TO:
Require free online permanent public access to ALL federal government information and publications.
1. Assure that GPO has the funds to continue to maintain and develop the Federal Digital System (FDsys).
2. Raise ALL Congressional, Executive & Judicial branch information, publications & data to the level of federally funded scientific information & publish ALL government information as "Open Access."
3. Mandate the free permanent public access to other Federal information currently maintained in fee-based databases - including the Public Access to Court Electronic Records (PACER), the National Technical Reports Library (NTRL), & USA Trade Online.
4. Establish an interagency, govt-wide strategy to manage the entire lifecycle of digital government information w/ FDLP Libraries - publication, access, usability, bulk download, long-term preservation, standards & metadata.
The National Academy of Public Administration (NAPA) completed an operational review of the Government Printing Office (GPO) mandated under the 2012 Consolidated Appropriations Act (Public Law 112-74). The NAPA report, “Rebooting the Government Printing Office: Keeping America Informed in the Digital Age,” acknowledged the obligation Congress has to establish an interagency government-wide strategy to manage the lifecycle of digital government information. The report also acknowledged the vital role GPO plays in providing free permanent public access to authentic government information in tangible formats through its Federal Depository Library Program (FDLP) and to authentic government information in electronic formats via GPO’s Federal Digital System (FDsys).
However, Recommendation 4 states: “GPO and Congress should explore alternative funding models for the Federal Digital System in order to ensure a stable and sufficient funding source.” Among the models recommended are “…reimbursement for services; fees for end users; dedicated appropriations; and/or an automatic charge to agencies, depending on size, to encourage agencies to take advantage of GPO’s existing infrastructure and cover the cost of the services being provided by GPO.”
Just as the Obama Administration supports the public’s right to “free access over the Internet to scientific journal articles arising from taxpayer-funded research,” the Administration must support the creation of “stable and sufficient funding” to ensure free permanent public access to authentic government information arising from the work of taxpayer-funded Executive, Congressional, and Judicial Branch agencies.
- NAPA report, “Rebooting the Government Printing Office: Keeping America Informed in the Digital Age.”
- CASSANDRA Letter to US Public Printer in response to the NAPA Report.
- Expanding Public Access to the Results of Federally Funded Research. John P. Holdren, Director of the White House Office of Science and Technology Policy (OSTP).
- White House response to "We The People" petition "Increasing Public Access to the Results of Scientific Research"
- Government Accountability Office (GAO), Information Management: National Technical Information Service's Dissemination of Technical Reports Needs Congressional Attention. GAO-13-99, November 19, 2012. Context on the GAO report from FGI.
- GPO's Federal Digital System (FDsys): http://fdsys.gov
- PACER: http://www.pacer.gov
- National Technical Reports Library (NTRL): http://ntrl.ntis.gov
- USAtrade: https://www.usatradeonline.gov
- Federal Depository Library Program (FDLP). http://fdlp.gov
The Sunlight Foundation just put out their Open Legislative Data Report Card. California received a D grade :-| Find out how your state is doing. Below is the methodology that they used to grade state legislatures.
Each state was evaluated in six categories based largely on the Ten Principles For Opening Up Government Information. Each score is based on evaluations by at least two members of staff and a volunteer during our state survey. Additionally, state legislatures were contacted (unless noted in their score) to ensure that our information on bulk data availability and timeliness was as accurate as possible.
The specific criteria for each category are as follows:
We evaluated each state on the data collected by Open States: bills, legislators, committees, votes and events. We also took note if a state went above and beyond to provide this information and other relevant contextual information such as supporting documents, legislative journals and schedules. Points were deducted for missing data, often roll call votes.
- 0 State provides full breadth of legislative artifacts Open States collects: bills, legislators, votes, and committees.
- -1 State does not provide stand-alone roll call votes.
Legislative information is most relevant when it happens, and many states are publishing information in real time. Unfortunately, there are also states where updates are more infrequent and showing up days after a legislative action took place. States were dinged if data took more than 48 hours to go online.
- 1 Multiple updates throughout the day, real time or as close to it as systems will allow.
- 0 Site updates once or twice daily, typically at the end of the legislative day.
- -1 Updates take longer than 24 hours to appear on the site, often up to a week.
EASE OF ACCESS
For many sites, the Open States team wrote scrapers to collect legislative information from the website code—a slow, tedious and error prone process. We collected data faster and more reliably when data was provided in a machine-readable format such as XML, JSON, CSV or via bulk downloads. If a state posted PDF image files or scanned documents, it received the lowest score possible.
- 2 Essentially all data can be found in machine-readable formats.
- 1 Lots of data in machine readable format but substantial portions that still required scraping HTML.
- 0 No machine readable data but standard screen scraping techniques applied.
- -2 Site had information that was inaccessible to Open States due to use of scanned PDFs.
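To make the difference between the top and bottom of this scale concrete, here is a minimal sketch contrasting the two collection paths. The sample data, field names, and HTML layout are hypothetical and are not taken from Open States or any real state site; the point is only that the JSON path is one standard-library call, while the scraping path depends on the page's markup.

```python
import json
from html.parser import HTMLParser

# Hypothetical sample: the same two bills published as JSON and as an HTML table.
SAMPLE_JSON = (
    '[{"bill_id": "HB 1", "title": "Library funding"},'
    ' {"bill_id": "SB 2", "title": "Open records"}]'
)
SAMPLE_HTML = """
<table>
  <tr><td class="bill">HB 1</td><td class="title">Library funding</td></tr>
  <tr><td class="bill">SB 2</td><td class="title">Open records</td></tr>
</table>
"""


def bills_from_json(text):
    """Machine-readable case: parsing yields structured records directly."""
    return [(b["bill_id"], b["title"]) for b in json.loads(text)]


class BillTableScraper(HTMLParser):
    """Scraping case: logic is tied to the table layout, so a redesign can break it."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._cells = None      # cells of the row currently being read
        self._in_cell = False   # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells = []
        elif tag == "td" and self._cells is not None:
            self._in_cell = True
            self._cells.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._cells[-1] += data

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._cells is not None:
            if len(self._cells) == 2:
                self.rows.append((self._cells[0].strip(), self._cells[1].strip()))
            self._cells = None


def bills_from_html(text):
    scraper = BillTableScraper()
    scraper.feed(text)
    return scraper.rows
```

Both functions return the same list of (bill id, title) pairs for this sample, but only the scraper has to be rewritten when the site changes its templates.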
USE OF COMMONLY OWNED STANDARDS
Because our ability to access most of a state’s data is represented by the above “Machine Readability” metric, we decided to use this provision to measure how a state made their bill text available. Making text available in HTML or PDF is the norm, and was considered an acceptable commonly owned standard (PDFs are a commonly owned standard, but it would certainly be nice to see alternative options where bill text is only available via PDF). States that only make documents available in Microsoft Word or WordPerfect formats require an individual to purchase expensive software or rely on free alternatives that may not preserve the correct formatting. It is worth noting that all states except two met the common criterion of providing HTML and/or PDF only: one state (Kansas) went above and beyond, and another (Kentucky) did not even meet this threshold.
- 1 State made an effort to go above and beyond.
- 0 State provided bills in PDF and/or HTML format and nothing better (plaintext, ODT, etc.).
- -1 State only provided bills in a proprietary format.
Many states move or remove information when a new session starts, much to the dismay of citizens seeking information on old proposals and researchers that may have cited a link (e.g. http://somelegislature.gov/HB1 vs http://somelegislature.gov/2011/HB1) only to see it point to a different bill in the following session. Tim Berners-Lee, inventor of the World Wide Web, wrote an article declaring Cool URIs Don’t Change and we agree.
This poses a particular challenge to us since every page on OpenStates.org points to the page we collected data from, but if a state changes their site then users lose the ability to check us against the original source. Most (but not all) states are good about at least preserving bill information, but few were equally good about preserving information about out-of-office legislators and historical committees, equally important parts of the legislative process.
- 2 All information is available in a permanent location and data goes back a reasonable amount of time (a decade or so).
- 1 Almost all information has a permanent location but a single data set doesn't. (Or a recent change to the site has wiped out historical links but information appears to be preservable going forward.)
- 0 Some information (such as committees and legislators) lacks a permanent location, but most is acceptable.
- -1 Ability to link to old information is badly damaged and/or there is less than a decade of historical information.
- -2 Vital information like bills or versions lack a permanent location.
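As a rough illustration of how the point values in these rubrics could be tallied, here is a minimal sketch. The category keys are hypothetical labels chosen for this example, the point sets are only the values listed in the rubrics quoted here, and the example state is made up rather than real survey data. The excerpt does not say how totals map to letter grades, so the sketch stops at the sum.

```python
# Point values as listed in the rubrics above. The category keys are
# hypothetical labels for this sketch, not names used by Sunlight.
ALLOWED_POINTS = {
    "completeness": {0, -1},
    "timeliness": {1, 0, -1},
    "machine_readability": {2, 1, 0, -2},
    "standards": {1, 0, -1},
    "permanence": {2, 1, 0, -1, -2},
}


def total_score(scores):
    """Check each category score against its rubric and return the sum."""
    missing = set(ALLOWED_POINTS) - set(scores)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    for category, points in scores.items():
        if category not in ALLOWED_POINTS:
            raise ValueError(f"unknown category: {category}")
        if points not in ALLOWED_POINTS[category]:
            raise ValueError(f"{points} is not a listed value for {category}")
    return sum(scores.values())


# Hypothetical state used only to exercise the function.
example_state = {
    "completeness": 0,
    "timeliness": 1,
    "machine_readability": 1,
    "standards": 0,
    "permanence": -1,
}
print(total_score(example_state))
```

Validating against the listed values catches a mistyped score early, which matters when several reviewers are entering scores by hand.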
The great technology publisher O'Reilly is making its Open Government book files available for free for anyone to download, read and share. The files are posted on the O’Reilly Media GitHub account as PDF, Mobi, and EPUB files for now.
- We're releasing the files for O'Reilly's Open Government book: A #PDFtribute to Aaron Swartz (announcement) by Laurel Ruma, O'Reilly Radar (January 18, 2013).
- Open Government: Collaboration, Transparency, and Participation in Practice, Edited by Daniel Lathrop, Laurel Ruma, Foreword by Don Tapscott. O'Reilly Media (February 2010).
Be sure to check out Chapter 25, "When Is Transparency Useful?" by Aaron Swartz.
Shinjoung and I were stunned when we heard the news early yesterday morning that our friend -- and supreme friend of libraries and the Internet! -- Aaron Swartz left this world late Friday evening. Aaron was deeply committed to and passionate about internet freedom and making information and knowledge as available as possible. To those ends, he worked on many projects large and small in his short but influential life. He was 26.
The *many* heartfelt remembrances from communities as diverse as journalism, law and open source tech -- witness Rick Perlstein, Lawrence Lessig, Glenn Greenwald, Karl Fogel -- attest to Aaron's supreme impact on the world at large (and that's no hyperbole!).
Before I had even heard of his tragic demise, a few colleagues and I were in the midst of writing letters of support for Aaron's nomination for this year's James Madison award from the American Library Association (ALA). This award, named in honor of President James Madison, was established by the ALA in 1986 to honor individuals or groups who have championed, protected and promoted public access to government information and the public’s “right to know” on the national level. I hope now that ALA will award Aaron posthumously!
We're helping Archive-it staff harvest a Web archive of Aaron's work, writings, images, videos, and remembrances. If you've got a URI that you'd like to be included in the archive, please paste it to this Google Doc.
Remembrances of Aaron, as well as donations in his memory, can be submitted at http://rememberaaronsw.com
The world will miss you Aaron. Be at peace my friend!
The Sunlight Foundation recently named Liz Barry and her group at the Public Laboratory for Open Technology and Science (PLOTS) as OpenGov Champions. Sunlight highlights these champions for their work and ingenuity in furthering govt transparency.
Ms Barry and the PLOTS team are perhaps best known for using kites and helium balloons to map the BP Deepwater Horizon Oil Spill in 2010, producing the only high resolution images out in the media at the onset of the catastrophe. PLOTS uses "mapping and other scientific DIY methods to empower local residents and activists to issue their own data sets to better engage with their local governments in environmental and other issues in their communities."
Be sure to check out their many maps available in the PLOTS open data archive. And for all of you DIY scientists, you can chip in to the PLOTS DIY spectrometry kit kickstarter campaign and help them build a spectrum-sharing wiki.
Building off of last week's post on the Obama Administration's new digital government strategy, I came across this analysis over at TechPresident: "White House Rolls Out New Plan for Digital Government".
Among the changes called for in the plan:
- Within six months, the Office of Management and Budget will release new government-wide standards for open data, content, and web application programming interfaces. Agencies will have another six months to make sure they are following those policies. They are also going to be asked to take two customer-facing online services and expose the information it delivers through APIs to "appropriate audiences," meaning some set of developers will be able to build applications around them without necessarily working in close concert with the agency providing the data.
- Agencies will be asked to publish ever more data through APIs and as structured data, which are the building blocks of modern web design and mobile-ready websites. The White House line on this is that it will also encourage outside developers to build new businesses on top of government data.
- The General Services Administration will establish a Digital Services Innovation Center to work with agencies to modernize how they interact with citizens on the web.
- The White House will begin releasing its own source code on GitHub and launch a "presidential innovation fellowship" program to bring developers from the private sector into government for six-to-12-month projects.
- The federal government will work to develop "MyGov," a prototype central hub for citizens to access all the services and information they're looking for from government online.
- Through programs like one intended to encourage small businesses to compete for government business, the White House will work to change IT procurement practices and cut down on the number of high-dollar, low-output contracts. Other procurement-related initiatives include a government-wide vehicle for mobile device and wireless service contracting and government-wide guidance on bring-your-own-device policies.
- Data.gov, the federal repository for government data available online, will transition away from being a hub for data files and towards a central clearing house of government APIs that developers can incorporate into web applications.
While we're excited that the White House is continuing to espouse the importance of open government principles, our concern is that the plan (PDF) does not address digital preservation or authenticity, two critical issues for librarians in guaranteeing long-term FREE access to government information -- and issues we addressed in a 2010 letter to then deputy CTO for Open Government Beth Noveck.
It's all well and good to talk about IT reform, shared IT infrastructure and services, APIs etc, but who's going to manage all of this cool digital stuff for the long-term? And where will the funding (or RE-funding) come from to keep Data.gov afloat in order to manage all of the APIs? In an era where GPO's FY2012 request for $6 million to fund continuing development of their Federal Digital System (FDsys) is met with $0 funding by the House and an only slightly less catastrophic $500,000 by the Senate, talk is all well and good. Digital infrastructure and services, and more importantly the staff to manage them, cost $$ -- arguably much more $$ than distribution and preservation of paper collections in the FDLP. We need a government and politicians who won't short-change open government and transparency. We need them and the public to realize that "online" does NOT equal "free beer" but "free kittens!"
A new paper distinguishes between open government data that makes the government as a whole more transparent and politically neutral public data that have nothing to do with public accountability.
- Yu, Harlan and Robinson, David G., The New Ambiguity of 'Open Government' (February 28, 2012). Princeton CITP / Yale ISP Working Paper. Available at SSRN.
Today a regime can call itself "open" if it builds the right kind of web site -- even if it does not become more accountable or transparent. This shift in vocabulary makes it harder for policymakers and activists to articulate clear priorities and make cogent demands.
This essay proposes a more useful way for participants on all sides to frame the debate: We separate the politics of open government from the technologies of open data.
- Open Government vs. Open Data, by Joseph Marks, NextGov.