Daniel Schuman, Policy Counsel and Director, Advisory Committee on Transparency of the Sunlight Foundation, writes that Reps. Mike Quigley and Leonard Lance are leading the charge in the House of Representatives to make CRS Reports publicly accessible. They've introduced (or RE-introduced) H.Res.110 - Congressional Research Service Electronic Accessibility Resolution of 2013. Hopefully this will be the year that Congress decides to share.
Former Senator Daniel Patrick Moynihan famously said that "everyone is entitled to his own opinion, but not his own facts." In 1914, an uncharacteristically foresighted Congress spent $25,000 to establish a fact-finding arm whose mission was to gather "data ... bearing upon legislation, and to render such data serviceable to Congress." A century later, the Congressional Research Service generates hundreds of analytical non-partisan reports on legislative issues each year.
CRS reports often inform public debate. A recent analysis, which found no correlation between economic growth and cutting tax rates for the wealthy, set off a re-appraisal of long-held orthodoxy about tax policy. A 2006 analysis questioning the legal rationale supporting the Bush administration's warrantless wiretapping policy caused many to look at the issue with fresh eyes. CRS analyses are routinely cited in news reports, by the courts, in congressional debate, and by government watchdogs.
However, unlike its sister agencies that investigate federal spending and analyze the budgetary effects of legislation, CRS does not release its reports to the public on a regular basis. This was not always so, and even now CRS routinely shares its reports with officials in the executive and judicial branches and with the press upon request. Congressional offices also disseminate the reports, publishing some on their websites, frequently sending others to constituents in response to requests, and giving them to reporters (often to help push a political narrative).
But for a member of the public, it's difficult to access reports generated by the 600-person, $100 million-a-year agency in any comprehensive way. Efforts by non-profit organizations to gather and re-publish the reports online have met with limited success. The private sector has stepped in, selling access to the reports at $20 a pop, but the premium accentuates the gap between the elites and everyone else.
Experts called the third floor of the White House "an outstanding example of a firetrap." A federally commissioned report found the mansion's plumbing "makeshift and unsanitary," while "the structural deterioration [was] in 'appalling degree,' and threatening complete collapse." The congressional commission on the matter was considering the option of abandoning the structure altogether in favor of a built-from-scratch mansion, but President Truman lobbied for the restoration.
"It perhaps would be more economical from a purely financial standpoint to raze the building and to rebuild completely," he testified to Congress in February 1949. "In doing so, however, there would be destroyed a building of tremendous historical significance in the growth of the nation."
So it had to be gutted. Completely. Every piece of the interior, including the walls, had to be removed and put in storage. The outside of the structure, reinforced by new concrete columns, was all that remained.
Original Caption: The Shell of the White House during the Renovation, 05/17/1950
Created By: National Archives and Records Administration. Office of Presidential Libraries. Harry S. Truman Library. (04/01/1985 - )
From: Series: Photographs Relating to the Administration, Family, and Personal Life of Harry S. Truman, compiled 1957 - 2004, documenting the period 1849 - 2004
Contact: Harry S. Truman Library (NLHST), 500 West U.S. Highway 24, Independence, MO, 64050-1798. PHONE: 816-268-8272; FAX: 816-268-8295; EMAIL: email@example.com.
Production Dates: 05/17/1950
Scope and Content Note: Window openings provide bursts of light into the cavernous interior of the White House, supported only by a web of temporary steel supports. The exterior walls rest on new concrete underpinnings, which allow earth-moving equipment to dig a new basement.
Persistent URL: arcweb.archives.gov/arc/action/ExternalIdSearch?id=6982099
Truman Library URL: www.trumanlibrary.org/photographs/view.php?id=22
Access Restrictions: Unrestricted
Use Restrictions: Unrestricted
99% Invisible is one of my favorite podcasts. Roman Mars talks about architecture and design in a really thoughtful and compelling way. He had a recent episode about "razzle dazzle" ship camouflage, in which he included images from the Rhode Island School of Design (RISD)'s Fleet Library (which is NOT connected to the Navy in any way :-)). Check out this fascinating listen about ships and camouflage.
(Erik Gould, courtesy of the Fleet Library at RISD, Providence, RI.)
(Erik Gould, courtesy of the Fleet Library at RISD, Providence, RI.)
Becoming invisible within your surroundings is only one type of camouflage. Camoufleurs call this high similarity or blending camouflage. But camouflage can also take the opposite approach.
(Erik Gould, courtesy of the Fleet Library at RISD, Providence, RI.)
[Update 3/15/13: Here's the ALA announcement.]
It was just announced that Aaron Swartz will be awarded the American Library Association's James Madison Award, given annually to "honor individuals or groups who have championed, protected and promoted public access to government information and the public’s “right to know” on the national level." It is fitting that Aaron receive the award -- and that it be presented by Rep. Zoe Lofgren (D-CA), a strong advocate for digital rights in Congress who won the award last year and who introduced Aaron's Law to amend the Computer Fraud and Abuse Act (CFAA).
The ceremony will be webcast live tomorrow (Friday, March 15, 2013) at 8:30am eastern time. We'll post the video as soon as it's made available.
[Editor's note: This is a guest post from Amanda Wakaruk, Government Information Librarian at the University of Alberta Libraries.]
Over the past week, the British Columbia Freedom of Information and Privacy Association (FIPA) wrote about and then provided the public with access to documentation outlining a Web Renewal Action Plan that calls for the reduction of Government of Canada (GoC) websites from roughly 1500 down to 1 (see FIPA’s blog entries, linked below). This plan appears to exacerbate the problems I noted in an FGI blog post last year: Government of Canada Publications -– It’s About Access, Not Format. For example, there is no publicly available evidence that the GoC has implemented or plans to implement a comprehensive web archiving plan before reducing its web footprint.
As a practitioner, I run into the problem of missing (i.e., unarchived) born digital content on a regular basis. (And no, Library and Archives Canada is not collecting websites for public consumption – these programs stopped in 2009.) The question I lost sleep over last year is more pressing than ever: who is archiving the web content of the GoC?
A group of institutions is working hard to set up a LOCKSS network that will help preserve the content of the Depository Services Program’s (DSP) e-archive (see the nascent CGI-PLN Wiki – email me if you would like to become a member or can help with funding to try and make this content accessible in the event that we lose access to the DSP website). Our first collection -- as important and impressive as it is at over 110,000 PDFs -- only represents a fraction of the content produced by the GoC. (As you might recall, the DSP does not collect HTML, only PDFs… and the latter format is discouraged by current GoC web protocols.)
I am proud of the fact that the University of Alberta Libraries, my home institution, was able to capture select GoC websites using a fee-based (and US-based) Archive-It account, but no single academic institution can afford to act as steward for the output of the federal government. Happily, we have a colleague in the University of Toronto Libraries who started capturing GoC web content using Archive-It a few weeks ago as part of a joint “rescue mission” to save the contents of the Aboriginal Portal of Canada before it was deleted from government servers (the results of these crawls are accessible here and here).
The bigger question, of course, is this: If not the government, then who is responsible for collecting and preserving the born digital content of the GoC? If it *is* the academic sector’s responsibility then where will the funding come from? Recent provincial budget cuts in Ontario and Alberta have been hard on this sector, to say the least. If there is a White Knight out there, now would be a great time to step forward!
The elimination of print publications coupled with a lack of web archiving and a directive to make only ‘current’ information available online marks an incalculable loss. Countless students describe the sessional papers as “life changing” and scholars from all walks of life routinely draw on statistical information produced by their governments to help make sense of our place in the world and inform ways to improve it (as an aside, Statistics Canada plans to remove publications more than a few years old from their website). It is unthinkable that future generations will not have access to information produced by their government today… information that should be informing our cultural narrative.
Reaction to Web Renewal Action Plan
- Harper Government Centralizing, Slashing Federal Web Info
- Federal Open Government Minister Not a Fan of Open Government. Vincent Gogolek, Executive Director, Freedom of Information and Privacy Association Huffington Post Canada, British Columbia
- first post includes links to the Web Renewal Action Plan
- Stephen Harper asked Tony Clement to ‘significantly reduce’ number of government websites, says document. Mike de Souza, Edmonton Journal.
- Historical letters not wanted at Library and Archives Canada, critics say. Joseph Hall, Toronto Star
- Tories Restrict Online Data Mining, But Not for Social Media. Globe and Mail
Man, this week is Sunshine-week-alicious! The Sunlight Foundation -- along with FGI, library, and open government organizations -- has long advocated for free public access to Congressional Research Service (CRS) reports. CRS reports are generally not available to the public because CRS has an arcane and outdated rule that its reports are privileged communication between Congress and CRS. But CRS reports ARE available randomly online, and ProQuest, Penny Hill Press, and other commercial publishers have long published them for a fee (I've even heard that CRS subscribes to ProQuest to get access to its own historical reports!).
But this all may change. According to the Sunlight Blog, Representatives Leonard Lance (R-NJ) and Mike Quigley (D-IL) have reintroduced the bipartisan House Resolution 110, the "Public Access to Congressional Research Service Reports Resolution of 2013" (text not received by GPO yet, so not publicly available on Thomas). The Resolution would direct the Clerk of the House of Representatives to provide members of the public with Internet access to certain Congressional Research Service publications. Easy-peasy, right?!
More than 30 organizations -- including Sunlight Foundation and FGI -- have signed on to a letter supporting the resolution. Please consider contacting your Representative and asking them to support H.Res. 110!
[UPDATE 4/2/13: We've had some questions about the meaning of "ALL." Please read the comment thread for clarification. We don't mean "records" (which fall under FOIA) and we don't mean classified information. We mean public domain documents, publications, reports, data, statistics and the like. JRJ]
A convergence of several things -- the White House's new policy on Open Access to federally funded scientific information, the NAPA Report on the GPO, the CASSANDRA Letter to the Public Printer, and Sunshine Week among them -- has led us to create a petition on the White House's We the People petition site. If you believe in free permanent public access to authentic government information, we hope you'll sign the petition and forward on to all your friends and social networks to help us reach our goal of 100,000 signatures by April 11, 2013! Thanks in advance!!
WE PETITION THE OBAMA ADMINISTRATION TO:
Require free online permanent public access to ALL federal government information and publications.
1. Assure that GPO has the funds to continue to maintain and develop the Federal Digital System (FDsys).
2. Raise ALL Congressional, Executive & Judicial branch information, publications & data to the level of federally funded scientific information & publish ALL government information as "Open Access."
3. Mandate the free permanent public access to other Federal information currently maintained in fee-based databases - including the Public Access to Court Electronic Records (PACER), the National Technical Reports Library (NTRL), & USA Trade Online.
4. Establish an interagency, govt-wide strategy to manage the entire lifecycle of digital government information w/ FDLP Libraries - publication, access, usability, bulk download, long-term preservation, standards & metadata.
The National Academy of Public Administration (NAPA) completed an operational review of the Government Printing Office (GPO) mandated under the 2012 Consolidated Appropriations Act (Public Law 112-74). The NAPA report, “Rebooting the Government Printing Office: Keeping America Informed in the Digital Age,” acknowledged the obligation Congress has to establish an interagency government-wide strategy to manage the lifecycle of digital government information. The report also acknowledged the vital role GPO plays in providing free permanent public access to authentic government information in tangible formats through its Federal Depository Library Program (FDLP) and to authentic government information in electronic formats via GPO’s Federal Digital System (FDsys).
However, Recommendation 4 states: “GPO and Congress should explore alternative funding models for the Federal Digital System in order to ensure a stable and sufficient funding source.” Among the models recommended are “…reimbursement for services; fees for end users; dedicated appropriations; and/or an automatic charge to agencies, depending on size, to encourage agencies to take advantage of GPO’s existing infrastructure and cover the cost of the services being provided by GPO.”
Just as the Obama Administration supports the public’s right to “free access over the Internet to scientific journal articles arising from taxpayer-funded research,” the Administration must support the creation of “stable and sufficient funding” to ensure free permanent public access to authentic government information arising from the work of taxpayer-funded Executive, Congressional, and Judicial Branch agencies.
- NAPA report, “Rebooting the Government Printing Office: Keeping America Informed in the Digital Age.”
- CASSANDRA Letter to US Public Printer in response to the NAPA Report.
- Expanding Public Access to the Results of Federally Funded Research. John P. Holdren, Director of the White House Office of Science and Technology Policy (OSTP).
- White House response to "We The People" petition "Increasing Public Access to the Results of Scientific Research"
- Government Accountability Office (GAO), Information Management: National Technical Information Service's Dissemination of Technical Reports Needs Congressional Attention. GAO-13-99, November 19, 2012. Context on the GAO report from FGI.
- GPO's Federal Digital System (FDsys): http://fdsys.gov
- PACER: http://www.pacer.gov
- National Technical Reports Library (NTRL): http://ntrl.ntis.gov
- USAtrade: https://www.usatradeonline.gov
- Federal Depository Library Program (FDLP). http://fdlp.gov
Last month the National Academy of Public Administration (NAPA) released a report entitled "Rebooting the Government Printing Office: Keeping America Informed in the Digital Age" -- FGI responded with an analysis of the report and was particularly disturbed by recommendation #4, which said that GPO should consider "cost recovery" for access to FDsys.
A group of long-time government information librarians writing under the moniker of CASSANDRA (Concerned Government Information Professionals) has co-written a letter to Public Printer Davita Vance-Cooks offering their strong support for NAPA's conclusion that "free access to government information is both an important tenet of a democracy and a critical responsibility" while calling into question the same recommendation #4.
With CASSANDRA's permission (FYI, both Jim Jacobs and James Jacobs are signatories to this letter), we've posted the letter here for public knowledge and so that others may also write letters to the Public Printer and cite this letter in support of free permanent public access to authentic government information now and in the long-term.
The Sunlight Foundation just put out their Open Legislative Data Report Card. California received a D grade :-| Find out how your state is doing. Below is the methodology that they used to grade state legislatures.
Each state was evaluated in six categories based largely on the Ten Principles For Opening Up Government Information. Each score reflects evaluations by at least two staff members and a volunteer during our state survey. Additionally, state legislatures were contacted (unless noted in their score) to ensure that our information on bulk data availability and timeliness was as accurate as possible.
The specific criteria for each category are as follows:
COMPLETENESS
We evaluated each state on the data collected by Open States: bills, legislators, committees, votes and events. We also took note if a state went above and beyond to provide this information and other relevant contextual information, such as supporting documents, legislative journals and schedules. Points were deducted for missing data, often roll call votes.
- 0 State provides full breadth of legislative artifacts Open States collects: bills, legislators, votes, and committees.
- -1 State does not provide stand-alone roll call votes.
TIMELINESS
Legislative information is most relevant when it happens, and many states are publishing information in real time. Unfortunately, there are also states where updates are less frequent, showing up days after a legislative action took place. States were dinged if data took more than 48 hours to go online.
- 1 Multiple updates throughout the day, real time or as close to it as systems will allow.
- 0 Site updates once or twice daily, typically at the end of the legislative day.
- -1 Updates take longer than 24 hours to appear on the site, often up to a week.
EASE OF ACCESS
For many sites, the Open States team wrote scrapers to collect legislative information from the website code -- a slow, tedious, and error-prone process. We collected data faster and more reliably when data was provided in a machine-readable format such as XML, JSON, or CSV, or via bulk downloads. If a state posted PDF image files or scanned documents, it received the lowest score possible.
- 2 Essentially all data can be found in machine-readable formats.
- 1 Lots of data in machine-readable format, but substantial portions still required scraping HTML.
- 0 No machine-readable data, but standard screen-scraping techniques applied.
- -2 Site had information that was inaccessible to Open States due to use of scanned PDFs.
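To make the scoring concrete: the gap between machine-readable data and scraped HTML can be illustrated with a minimal sketch. The data below is invented for illustration (no real legislature publishes in exactly this shape), and it uses only the Python standard library -- one line of `json.loads` versus a hand-written parser that must assume the page's column order and breaks whenever the site is redesigned.

```python
import json
from html.parser import HTMLParser

# Machine-readable publication: one call, unambiguous field names.
vote_json = '{"bill": "HB1", "yes": 52, "no": 41}'
vote = json.loads(vote_json)

# The same (hypothetical) roll-call vote published as browser-oriented HTML.
vote_html = "<tr><td>HB1</td><td>52</td><td>41</td></tr>"

class VoteScraper(HTMLParser):
    """Collect the text of each <td> cell, in document order."""
    def __init__(self):
        super().__init__()
        self.cells = []
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        self._in_td = (tag == "td")

    def handle_data(self, data):
        if self._in_td:
            self.cells.append(data)

scraper = VoteScraper()
scraper.feed(vote_html)

# Fragile: the scraper must *assume* the column order is (bill, yes, no),
# an assumption the HTML itself never states.
scraped = {"bill": scraper.cells[0],
           "yes": int(scraper.cells[1]),
           "no": int(scraper.cells[2])}

assert scraped == vote  # same data, very different effort and fragility
```

Both paths recover the same record here, but only the JSON path survives a site redesign unchanged, which is why formats like XML, JSON, and CSV score higher above.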
USE OF COMMONLY OWNED STANDARDS
Because our ability to access most of a state’s data is represented by the above “Machine Readability” metric, we decided to use this provision to measure how a state made its bill text available. Making text available in HTML or PDF is the norm, and was considered an acceptable commonly owned standard (PDFs are a commonly owned standard, but it would certainly be nice to see alternative options where bill text is only available via PDF). States that only make documents available in Microsoft Word or WordPerfect formats require an individual to purchase expensive software or rely on free alternatives that may not preserve the correct formatting. It is worth noting that all states except two met the common criterion of providing HTML and/or PDF only; one state (Kansas) went above and beyond, and another (Kentucky) did not even meet this threshold.
- 1 State made an effort to go above and beyond.
- 0 State provided bills in PDF and/or HTML format and nothing better (plaintext, ODT, etc.).
- -1 State only provided bills in a proprietary format.
PERMANENCE
Many states move or remove information when a new session starts, much to the dismay of citizens seeking information on old proposals and researchers who may have cited a link (e.g. http://somelegislature.gov/HB1 vs. http://somelegislature.gov/2011/HB1) only to see it point to a different bill in the following session. Tim Berners-Lee, inventor of the World Wide Web, wrote an article declaring Cool URIs Don’t Change, and we agree.
This poses a particular challenge for us, since every page on OpenStates.org points to the page we collected data from; if a state changes its site, users lose the ability to check us against the original source. Most (but not all) states are good about at least preserving bill information, but few were as good about preserving information about out-of-office legislators and historical committees, equally important parts of the legislative process.
- 2 All information is available in a permanent location and data goes back a reasonable amount of time (a decade or so).
- 1 Almost all information has a permanent location but a single data set doesn't. (Or a recent change to the site has wiped out historical links but information appears to be preservable going forward.)
- 0 Some information (such as legislator and committee pages) lacks a permanent location, but most is acceptable.
- -1 Ability to link to old information is badly damaged and/or there is less than a decade of historical information.
- -2 Vital information like bills or versions lack a permanent location.
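The link-rot problem behind these scores comes down to URL design. As a minimal sketch, reusing the example domain from the text above (the paths are purely illustrative, not any real legislature's scheme): a session-less path silently re-points "HB1" at the newest session's bill, breaking every citation to the old one, while a session-scoped path gives each session's bill a URL of its own.

```python
def ambiguous_url(bill_id: str) -> str:
    # Session-less path: the same URL is recycled every session,
    # so a 2011 citation ends up pointing at the 2013 bill.
    return f"http://somelegislature.gov/{bill_id}"

def permanent_url(session: str, bill_id: str) -> str:
    # Session-scoped path: old links keep resolving to the old bill.
    return f"http://somelegislature.gov/{session}/{bill_id}"

# The same bill ID in two sessions collides under the first scheme...
assert ambiguous_url("HB1") == "http://somelegislature.gov/HB1"
# ...but stays distinct, and therefore citable, under the second.
assert permanent_url("2011", "HB1") != permanent_url("2013", "HB1")
```

The second scheme is what the e.g. in the text above points at: once a bill's URL encodes the session, the page never needs to move when a new session starts.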
The Archive-It team announced today the publication of their white paper, the Web Archiving Life Cycle Model. The model offers a thorough description of the entire process of Web archiving. Whether you've been Web archiving for 7 years or are just mulling jumping into the fray, this model will put you in a good headspace to do this critical work. Thanks Molly Bragg, Kristine Hanna, Lori Donovan, Graham Hukill, and Anna Peterson!
The Archive-It team is excited to publish our first white paper: The Web Archiving Life Cycle Model. With this paper we hope to share web archiving best practices and processes with organizations interested in developing and/or expanding their web archiving initiatives.
This white paper is the product of a collaboration between members of the Archive-It team as well as the larger Archive-It partner community. Several partners took part in in-depth interviews regarding their experiences using Archive-It and web archiving in general, and others helped with the design iteration phase of the model and read preliminary drafts of the paper.
The Web Archiving Life Cycle Model encompasses the following web archiving processes:
• Vision and Objectives
• Resources and Workflow
• Risk Management
• Appraisal and Selection
• Data Capture
• Storage and Organization
• Quality Assurance and Analysis