The Sunlight Foundation just put out their Open Legislative Data Report Card. California received a D grade :-| Find out how your state is doing. Below is the methodology that they used to grade state legislatures.
Each state was evaluated in six categories based largely on the Ten Principles For Opening Up Government Information. Each score is based on evaluations by at least two staff members and a volunteer during our state survey. Additionally, state legislatures were contacted (unless noted in their score) to ensure that our information on bulk data availability and timeliness was as accurate as possible.
The specific criteria for each category are as follows:
We evaluated each state on the data collected by Open States: bills, legislators, committees, votes and events. We also took note if a state went above and beyond to provide this information and other relevant contextual information such as supporting documents, legislative journals and schedules. Points were deducted for missing data, often roll call votes.
- 0 State provides full breadth of legislative artifacts Open States collects: bills, legislators, votes, and committees.
- -1 State does not provide stand-alone roll call votes.
Legislative information is most relevant when it happens, and many states are publishing information in real time. Unfortunately, there are also states where updates are less frequent and show up days after a legislative action took place. States were dinged if data took more than 48 hours to go online.
- 1 Multiple updates throughout the day, real time or as close to it as systems will allow.
- 0 Site updates once or twice daily, typically at the end of the legislative day.
- -1 Updates take longer than 24 hours to appear on the site, often up to a week.
EASE OF ACCESS
For many sites, the Open States team wrote scrapers to collect legislative information from the website code, a slow, tedious and error-prone process. We collected data faster and more reliably when data was provided in a machine-readable format such as XML, JSON or CSV, or via bulk downloads. If a state posted PDF image files or scanned documents, it received the lowest score possible.
- 2 Essentially all data can be found in machine-readable formats.
- 1 Lots of data in machine-readable formats, but substantial portions still required scraping HTML.
- 0 No machine-readable data, but standard screen-scraping techniques applied.
- -2 Site had information that was inaccessible to Open States due to use of scanned PDFs.
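To illustrate why machine-readable data is faster and more reliable to collect than scraped HTML, here is a minimal sketch in Python. The JSON record, the HTML markup, and the class names `bill-id` and `title` are all hypothetical; they are not taken from any state's site or from the Open States scrapers.

```python
import json
from html.parser import HTMLParser

# Hypothetical machine-readable record for one bill.
JSON_FEED = '{"bill_id": "HB 1", "title": "An act relating to open data"}'

# The same bill as it might appear on a hypothetical HTML page.
HTML_PAGE = (
    '<html><body><h1 class="bill-id">HB 1</h1>'
    '<p class="title">An act relating to open data</p></body></html>'
)


class BillScraper(HTMLParser):
    """Collects the text of elements whose class is 'bill-id' or 'title'."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # class of the element being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in ("bill-id", "title"):
            self._current = css_class

    def handle_data(self, data):
        if self._current is not None:
            self.fields[self._current] = data.strip()
            self._current = None


def from_json(text):
    """Machine-readable path: one parse call, fields addressed by name."""
    record = json.loads(text)
    return record["bill_id"], record["title"]


def from_html(text):
    """Scraping path: depends on markup details that can change at any time."""
    scraper = BillScraper()
    scraper.feed(text)
    return scraper.fields["bill-id"], scraper.fields["title"]
```

Both functions return the same pair here, but the scraper breaks as soon as the page's markup or class names change, while the JSON reader only depends on the field names.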
USE OF COMMONLY OWNED STANDARDS
Because our ability to access most of a state’s data is represented by the above “Machine Readability” metric, we decided to use this provision to measure how a state made their bill text available. Making text available in HTML or PDF is the norm, and was considered an acceptable commonly owned standard (PDFs are a commonly owned standard, but it would certainly be nice to see alternative options where bill text is only available via PDF). States that only make documents available in Microsoft Word or WordPerfect formats require an individual to purchase expensive software or rely on free alternatives that may not preserve the correct formatting. It is worth noting that all states except two met the common criterion of providing HTML and/or PDF only: one state (Kansas) went above and beyond, and another (Kentucky) did not even meet this threshold.
- 1 State made an effort to go above and beyond.
- 0 State provided bills in PDF and/or HTML format and nothing better (plaintext, ODT, etc.).
- -1 State only provided bills in a proprietary format.
Many states move or remove information when a new session starts, much to the dismay of citizens seeking information on old proposals and researchers that may have cited a link (e.g. http://somelegislature.gov/HB1 vs http://somelegislature.gov/2011/HB1) only to see it point to a different bill in the following session. Tim Berners-Lee, inventor of the World Wide Web, wrote an article declaring Cool URIs Don’t Change and we agree.
This poses a particular challenge to us since every page on OpenStates.org points to the page we collected data from, but if a state changes their site then users lose the ability to check us against the original source. Most (but not all) states are good about at least preserving bill information, but few were as good about preserving information about out-of-office legislators and historical committees, equally important parts of the legislative process.
- 2 All information is available in a permanent location and data goes back a reasonable amount of time (a decade or so).
- 1 Almost all information has a permanent location but a single data set doesn't. (Or a recent change to the site has wiped out historical links but information appears to be preservable going forward.)
- 0 Legislator and committee information lacks a permanent location, but most other information is acceptable.
- -1 Ability to link to old information is badly damaged and/or there is less than a decade of historical information.
- -2 Vital information like bills or versions lacks a permanent location.
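The point values in the lists above can be written down as a small lookup table. The sketch below is only an illustration: the category keys are labels chosen here, only the five rubrics quoted above are included, and the text does not say how category scores are combined into a letter grade, so summing them is an assumption.

```python
# Allowed point values per category, copied from the lists above.
# The category keys are labels chosen for this sketch.
ALLOWED_SCORES = {
    "completeness": {0, -1},
    "timeliness": {1, 0, -1},
    "machine_readability": {2, 1, 0, -2},
    "standards": {1, 0, -1},
    "permanence": {2, 1, 0, -1, -2},
}


def total_score(scores):
    """Check each category score against the rubric and return the sum.

    Summing is an assumption made for illustration; the methodology text
    does not describe how scores map to letter grades.
    """
    total = 0
    for category, allowed in ALLOWED_SCORES.items():
        value = scores[category]
        if value not in allowed:
            raise ValueError(f"{value} is not a valid {category} score")
        total += value
    return total


# A hypothetical state: no stand-alone roll call votes, daily updates,
# HTML scraping only, bills in PDF/HTML, one data set without permanent links.
example_state = {
    "completeness": -1,
    "timeliness": 0,
    "machine_readability": 0,
    "standards": 0,
    "permanence": 1,
}
```

Calling `total_score(example_state)` adds the five values; a score outside the rubric, such as a machine readability score of -1, raises `ValueError`.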
The Archive-It team announced today the publication of their white paper, The Web Archiving Life Cycle Model. The model offers a thorough description of the entire process of Web archiving. Whether you've been Web archiving for 7 years or are mulling over jumping into the fray, this model will put you in a good headspace to do this critical work. Thanks Molly Bragg, Kristine Hanna, Lori Donovan, Graham Hukill, and Anna Peterson!
The Archive-It team is excited to publish our first white paper: The Web Archiving Life Cycle Model. With this paper we hope to share web archiving best practices and processes with organizations interested in developing and/or expanding their web archiving initiatives.
This white paper is the product of a collaboration between members of the Archive-It team as well as the larger Archive-It partner community. Several partners took part in in-depth interviews regarding their experiences using Archive-It and web archiving in general, and others helped with the design iteration phase of the model and read preliminary drafts of the paper.
The Web Archiving Life Cycle Model encompasses the following web archiving processes:
• Vision and Objectives
• Resources and Workflow
• Risk Management
• Appraisal and Selection
• Data Capture
• Storage and Organization
• Quality Assurance and Analysis
In this week's Time to Wake Up speech, Senator Sheldon Whitehouse (D-RI) speaks about climate change making the Government Accountability Office's (GAO) High Risk List for the first time this year. GAO's High Risk List has been published every two years, at the start of each new Congress, since 1990. GAO "calls attention to agencies and program areas that are high risk due to their vulnerabilities to fraud, waste, abuse, and mismanagement, or are most in need of transformation."
"According to GAO, and I’ll quote again, “The nation’s vulnerability can be reduced by limiting the magnitude of climate change through actions to limit greenhouse gas emissions. . . . While implementing adaptive measures may be costly, [GAO continues] there is a growing recognition that the cost of inaction could be greater and—given the government’s precarious fiscal position—increasingly difficult to manage given expected budget pressures.”'
"Mr. President, Congress has been asleep long enough. We have a tradition in this body of taking the accounting of GAO, our non-partisan watchdog, seriously, and of taking GAO’s High Risk List seriously. GAO now joins our defense and intelligence communities, our scientific research communities, and our state and local governments, and major sectors of private industry, who have all elevated climate change from their “to-do” list to their “must-do” list. Mr. President, it is time for Congress to wake up to its duties, and to get to work."
According to the Air Force Times, the Air Force has reversed their policy of sharing monthly statistics on the number of airstrikes launched from drones (aka remotely piloted aircraft (RPA)). In the interest of access and transparency, we've posted the original statistics from December '12, January '13, and February '13.
As scrutiny and debate over the use of remotely piloted aircraft (RPA) by the American military increased last month, the Air Force reversed a policy of sharing the number of airstrikes launched from RPAs in Afghanistan and quietly scrubbed those statistics from previous releases kept on their website.
Last October, Air Force Central Command started tallying weapons releases from RPAs, broken down into monthly updates. At the time, AFCENT spokeswoman Capt. Kim Bender said the numbers would be put out every month as part of a service effort to “provide more detailed information on RPA ops in Afghanistan.”
The Air Force maintained that policy for the statistics reports for November, December and January. But the February numbers, released March 7, contained empty space where the box of RPA statistics had previously been.
Additionally, monthly reports hosted on the Air Force website have had the RPA data removed — and recently.
Those files still contained the RPA data as of Feb. 16, according to archived web pages accessed via Archive.org. Metadata included in the new, RPA-less versions of the reports show the files were all created Feb. 22.
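For readers who want to run a similar check on an archived page, here is a minimal sketch that builds a query for the Internet Archive's Wayback Machine availability endpoint. It only constructs the request URL and makes no network call. The page address is a placeholder, not the Air Force file referenced in the article, and the helper name `availability_query` is invented for this example.

```python
from urllib.parse import urlencode

# Public endpoint that reports the archived snapshot closest to a given date.
WAYBACK_AVAILABILITY = "https://archive.org/wayback/available"


def availability_query(page_url, timestamp):
    """Build a query URL for the snapshot closest to `timestamp` (YYYYMMDD)."""
    params = urlencode({"url": page_url, "timestamp": timestamp})
    return WAYBACK_AVAILABILITY + "?" + params


# Placeholder address; substitute the page you want to check.
query = availability_query("http://www.example.com/report.pdf", "20130216")
```

Fetching the resulting URL should return a JSON description of the closest archived snapshot, if one exists, which can then be compared with the live page.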
Lunchtime listen: Lawrence Lessig's Furman lecture titled "Aaron's Laws: Law and Justice in a Digital Age."
Submitted by jrjacobs on Wed, 2013-03-06 13:18.
This will be well worth your time! Listen, grok, act!
On Tuesday, Feb. 19, Lawrence Lessig marked his appointment as Roy L. Furman Professor of Law and Leadership at Harvard Law School with a lecture titled "Aaron's Laws: Law and Justice in a Digital Age." The lecture honored the memory and work of Aaron Swartz, the programmer and activist who took his own life on Jan. 11, 2013 at the age of 26. Swartz spent the last two years fighting federal charges that he violated the Computer Fraud and Abuse Act.
On his blog, Lessig wrote, “When a law professor is given a “chair” s/he gives a lecture in honor of the honor. … After Aaron’s death, I asked the Dean to let me reschedule the lecture. But after some more thought, I’ve decided to make the lecture about Aaron, and about how we need to honor his work.”
I was signed up for the Help! webinar on Homeland Security Digital Library, but unfortunately was unable to make the session. But luckily, all sessions are recorded and posted along with slides for future access on their site. This was a particularly interesting session presented by Greta Marlatt, the Outreach and Collection Development Manager for the Naval Postgraduate School’s Dudley Knox Library and the Content Manager for the Homeland Security Digital Library (HSDL). Greta pointed out several interesting aspects to the HSDL site:
- Compile hearing transcripts, prepared testimonies and video links from Committee pages
- Get permissions for hosting publications from other agencies and organizations (similar to our Everyday Electronic Materials (EEMs) project described earlier)
- Weekly email alerts for targeted search strategies
- Post CRS reports
- Homeland security related blogs aggregated
I think it's especially interesting that Greta and her team are compiling govt information and hosting digital files from other agencies and organizations. I highly recommend going back and listening to this presentation and ALL of the past Help! webinars!!
Kudos to Lynda Kellam and the rest of the group of North Carolina librarians putting out these interesting and informative Help! I'm an Accidental Government Information Librarian Webinars!
The Government Resources Section of the North Carolina Library Association welcomes you to a series of webinars designed to help us all do better reference work by increasing our familiarity with government information resources, and by discovering the best strategies for navigating them.
The Homeland Security Digital Library (HSDL) is the nation's premier research collection of open-source resources related to homeland security policy, strategy and organizational management. The HSDL is sponsored by the Naval Postgraduate School Center for Homeland Defense and Security and the U.S. Department of Homeland Security’s National Preparedness Directorate, FEMA.
Greta Marlatt is the Outreach and Collection Development Manager for the Naval Postgraduate School’s Dudley Knox Library and the Content Manager for the Homeland Security Digital Library (HSDL). She has over 30 years of experience working in libraries in various capacities. Ms. Marlatt has published several articles and is the author of a number of bibliographies and help guides for topics relating to Intelligence, Information Warfare, Special Operations, Homeland Security, Mine Warfare, Directed Energy Weapons, NBC Terrorism and more. She has given numerous presentations on topics related to conducting research in the homeland security and military arenas. Ms. Marlatt holds a Bachelor of Arts degree in English from Arizona State University, a Master of Library Science degree from the University of Arizona and a Master of Arts degree in National Security Studies from California State University, San Bernardino.
Thanks in part to a We the People petition signed by 65,000 people(!), President Obama's science advisor, John Holdren, issued a directive on Friday to all research funding agencies to develop plans to make the results of federally funded research publicly available free of charge within 12 months of publication. It also requires scientists receiving taxpayer dollars to improve the management and sharing of scientific data. This is huge! By my rough count, that means that approximately 20 US agencies will now make the science they fund available to the public. The only thing better would be for President Obama to support FREE access to ALL federal govt publications by assuring that FDsys remains freely available (one of the recommendations of the recent NAPA report was the tremendously backward and short-sighted suggestion that GPO charge for access to their FDsys database!)
See the policy memorandum, Expanding Public Access to the Results of Federally Funded Research
The Obama Administration is committed to the proposition that citizens deserve easy access to the results of scientific research their tax dollars have paid for. That’s why, in a policy memorandum released today, OSTP Director John Holdren has directed Federal agencies with more than $100M in R&D expenditures to develop plans to make the published results of federally funded research freely available to the public within one year of publication and requiring researchers to better account for and manage the digital data resulting from federally funded scientific research. OSTP has been looking into this issue for some time, soliciting broad public input on multiple occasions and convening an interagency working group to develop a policy. The final policy reflects substantial inputs from scientists and scientific organizations, publishers, members of Congress, and other members of the public—over 65 thousand of whom recently signed a We the People petition asking for expanded public access to the results of taxpayer-funded research.
To see the new policy memorandum, please visit: http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_publi...
To see Dr. Holdren’s response to the We the People petition, please visit: https://petitions.whitehouse.gov/response/increasing-public-access-resul...
Michael Stebbins is Assistant Director for Biotechnology at OSTP
If you're an aspiring (or accidental) data librarian, or just want to know more about the Inter-university Consortium for Political and Social Research (ICPSR), then here's a webinar for you!
Space is limited.
Reserve your Webinar seat now at:
This session will cover effective search strategies, ICPSR’s bibliography of data-related literature, our growing tools associated with the social science variables database, and more!
This session is for those who are searching for research data or teaching tools and those who are helping others to find data or teaching tools.
Title: Hands-On with ICPSR - Discovering ICPSR Data
Date: Monday, February 25, 2013
Time: 2:00 PM - 3:00 PM EST
After registering you will receive a confirmation email containing information about joining the Webinar.
The Sunlight Foundation has just released Open States for all 50 states, the District of Columbia and Puerto Rico. The site helps the public find their state legislators, review their votes, search upcoming legislation, and track bill progress. Open States gets their bill, legislator, committee and event data from official sources, linked at the bottom of each legislator, bill, vote, committee or event page. Check out their methodology for more. They rely primarily on scraping data from sites. Wouldn't it be awesome if all state legislatures had bulk data feeds so that 1000 sites like Open States could bloom? Join the Webinar on February 22nd to learn more about Open States.
After more than four years of work from volunteers and a full-time team here at Sunlight we're immensely proud to launch the full Open States site with searchable legislative data for all 50 states, D.C. and Puerto Rico. Open States is the only comprehensive database of activities from all state capitols that makes it easy to find your state lawmaker, review their votes, search for legislation, track bills and much more.
If you're interested in your state lawmaker, you'll be able to get notifications for their actions, a map of their district, voting records, committee assignments, campaign finance records from Influence Explorer, local news articles and contact information. If you're curious about a particular piece of legislation, Open States allows you to check on its status, find the sponsors, break down votes, view bill text and all supporting documents. Our powerful search capabilities allow you to find similar topics across states and view overview pages for each state, chamber and committee.