Last month the National Academy of Public Administration (NAPA) released a report entitled "Rebooting the Government Printing Office: Keeping America Informed in the Digital Age." FGI responded with an analysis of the report and was particularly disturbed by recommendation #4, which said that GPO should consider "cost recovery" for access to FDsys.
A group of long-time government information librarians writing under the moniker of CASSANDRA (Concerned Government Information Professionals) have co-written a letter to Public Printer Davita Vance-Cooks offering their strong support for NAPA's conclusion that "free access to government information is both an important tenet of a democracy and a critical responsibility" while calling into question the same recommendation #4.
With CASSANDRA's permission (FYI, both Jim Jacobs and James Jacobs are signatories to this letter), we've posted the letter here for public knowledge and so that others may also write letters to the Public Printer and cite this letter in support of free permanent public access to authentic government information now and in the long-term.
The Sunlight Foundation just put out their Open Legislative Data Report Card. California received a D grade :-| Find out how your state is doing. Below is the methodology that they used to grade state legislatures.
Each state was evaluated in six categories based largely on the Ten Principles For Opening Up Government Information. Each score is based on at least two members of staff and a volunteer during our state survey. Additionally, state legislatures were contacted (unless noted in their score) to ensure that our information on bulk data availability and timeliness was as accurate as possible.
The specific criteria for each category are as follows:
We evaluated each state on the data collected by Open States: bills, legislators, committees, votes and events. We also took note if a state went above and beyond to provide this information and other relevant contextual information such as supporting documents, legislative journals and schedules. Points were deducted for missing data, often roll call votes.
- 0 State provides full breadth of legislative artifacts Open States collects: bills, legislators, votes, and committees.
- -1 State does not provide stand-alone roll call votes.
Legislative information is most relevant when it happens, and many states are publishing information in real time. Unfortunately, there are also states where updates are less frequent and show up days after a legislative action took place. States were dinged if data took more than 48 hours to go online.
- 1 Multiple updates throughout the day, real time or as close to it as systems will allow.
- 0 Site updates once or twice daily, typically at the end of the legislative day.
- -1 Updates take longer than 48 hours to appear on the site, often up to a week.
EASE OF ACCESS
For many sites, the Open States team wrote scrapers to collect legislative information from the website code—a slow, tedious and error prone process. We collected data faster and more reliably when data was provided in a machine-readable format such as XML, JSON, CSV or via bulk downloads. If a state posted PDF image files or scanned documents, it received the lowest score possible.
- 2 Essentially all data can be found in machine-readable formats.
- 1 Lots of data in machine-readable format but substantial portions that still required scraping HTML.
- 0 No machine-readable data but standard screen scraping techniques applied.
- -2 Site had information that was inaccessible to Open States due to use of scanned PDFs.
USE OF COMMONLY OWNED STANDARDS
Because our ability to access most of a state’s data is represented by the above “Machine Readability” metric, we decided to use this provision to measure how a state made their bill text available. Making text available in HTML or PDF is the norm, and was considered an acceptable commonly owned standard (PDFs are a commonly owned standard, but it would certainly be nice to see alternative options where bill text is only available via PDF). States that only make documents available in Microsoft Word or WordPerfect formats require an individual to purchase expensive software or rely on free alternatives that may not preserve the correct formatting. It is worth noting that all states except two met the common criterion of providing HTML and/or PDF only: one state (Kansas) went above and beyond, and another (Kentucky) did not even meet this threshold.
- 1 State made an effort to go above and beyond.
- 0 State provided bills in PDF and/or HTML format and nothing better (plaintext, ODT, etc.).
- -1 State only provided bills in a proprietary format.
Many states move or remove information when a new session starts, much to the dismay of citizens seeking information on old proposals and researchers that may have cited a link (e.g. http://somelegislature.gov/HB1 vs http://somelegislature.gov/2011/HB1) only to see it point to a different bill in the following session. Tim Berners-Lee, inventor of the World Wide Web, wrote an article declaring Cool URIs Don’t Change and we agree.
This poses a particular challenge to us since every page on OpenStates.org points to the page we collected data from, but if a state changes their site then users lose the ability to check us against the original source. Most (but not all) states are good about at least preserving bill information, but few were as good about preserving information about out-of-office legislators and historical committees, equally important parts of the legislative process.
- 2 All information is available in a permanent location and data goes back a reasonable amount of time (a decade or so).
- 1 Almost all information has a permanent location but a single data set doesn't. (Or a recent change to the site has wiped out historical links but information appears to be preservable going forward.)
- 0 Some information (such as committees and legislators) lacks a permanent location but most is acceptable.
- -1 Ability to link to old information is badly damaged and/or there is less than a decade of historical information.
- -2 Vital information like bills or versions lack a permanent location.
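For readers who want to tally a state's score themselves, here is a minimal sketch of the arithmetic. It assumes the category names below (several categories above are unlabeled, and the text mentions six categories while only five point scales appear, so these names and the five-way split are our assumption), and it only sums the points. The mapping from a total to a letter grade is not given in the methodology, so the sketch does not attempt one.

```python
# Hypothetical tally of the rubric above. Category names are assumed;
# the point values are the ones listed under each category.
ALLOWED_SCORES = {
    "completeness": {0, -1},
    "timeliness": {1, 0, -1},
    "machine_readability": {2, 1, 0, -2},
    "standards": {1, 0, -1},
    "permanence": {2, 1, 0, -1, -2},
}


def total_score(scores):
    """Validate each category score against the rubric and return the sum."""
    for category, allowed in ALLOWED_SCORES.items():
        if category not in scores:
            raise KeyError(f"missing category: {category}")
        if scores[category] not in allowed:
            raise ValueError(
                f"{scores[category]} is not a listed score for {category}"
            )
    return sum(scores[category] for category in ALLOWED_SCORES)


if __name__ == "__main__":
    # An invented state: full data, real-time updates, some bulk data,
    # bills in PDF/HTML only, and damaged historical links.
    example = {
        "completeness": 0,
        "timeliness": 1,
        "machine_readability": 1,
        "standards": 0,
        "permanence": -1,
    }
    print(total_score(example))  # 1
```

Rejecting values that are not on the published scale keeps a typo (say, a 2 for timeliness) from quietly inflating a total.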
The Archive-It team announced today the publication of their white paper, Web Archiving Life Cycle Model. The model offers a thorough description of the entire process of Web archiving. Whether you've been Web archiving for 7 years or are mulling over jumping into the fray, this model will put you in a good headspace to do this critical work. Thanks Molly Bragg, Kristine Hanna, Lori Donovan, Graham Hukill, and Anna Peterson!
The Archive-It team is excited to publish our first white paper: The Web Archiving Life Cycle Model. With this paper we hope to share web archiving best practices and processes with organizations interested in developing and/or expanding their web archiving initiatives.
This white paper is the product of a collaboration between members of the Archive-It team as well as the larger Archive-It partner community. Several partners took part in in-depth interviews regarding their experiences using Archive-It and web archiving in general, and others helped with the design iteration phase of the model and read preliminary drafts of the paper.
The Web Archiving Life Cycle Model encompasses the following web archiving processes:
• Vision and Objectives
• Resources and Workflow
• Risk Management
• Appraisal and Selection
• Data Capture
• Storage and Organization
• Quality Assurance and Analysis
This week in his weekly Time to Wake Up speech, Senator Sheldon Whitehouse (D-RI) speaks about climate change making the Government Accountability Office's (GAO) High Risk List for the first time this year. GAO's High Risk List has been published every two years, at the start of each new Congress, since 1990. GAO "calls attention to agencies and program areas that are high risk due to their vulnerabilities to fraud, waste, abuse, and mismanagement, or are most in need of transformation."
"According to GAO, and I’ll quote again, “The nation’s vulnerability can be reduced by limiting the magnitude of climate change through actions to limit greenhouse gas emissions. . . . While implementing adaptive measures may be costly, [GAO continues] there is a growing recognition that the cost of inaction could be greater and—given the government’s precarious fiscal position—increasingly difficult to manage given expected budget pressures.”"
"Mr. President, Congress has been asleep long enough. We have a tradition in this body of taking the accounting of GAO, our non-partisan watchdog, seriously, and of taking GAO’s High Risk List seriously. GAO now joins our defense and intelligence communities, our scientific research communities, and our state and local governments, and major sectors of private industry, who have all elevated climate change from their “to-do” list to their “must-do” list. Mr. President, it is time for Congress to wake up to its duties, and to get to work."
According to the Air Force Times, the Air Force has reversed their policy of sharing monthly statistics on the number of airstrikes launched from drones (aka remotely piloted aircraft (RPA)). In the interest of access and transparency, we've posted the original statistics from December '12, January '13, and February '13.
As scrutiny and debate over the use of remotely piloted aircraft (RPA) by the American military increased last month, the Air Force reversed a policy of sharing the number of airstrikes launched from RPAs in Afghanistan and quietly scrubbed those statistics from previous releases kept on their website.
Last October, Air Force Central Command started tallying weapons releases from RPAs, broken down into monthly updates. At the time, AFCENT spokeswoman Capt. Kim Bender said the numbers would be put out every month as part of a service effort to “provide more detailed information on RPA ops in Afghanistan.”
The Air Force maintained that policy for the statistics reports for November, December and January. But the February numbers, released March 7, contained empty space where the box of RPA statistics had previously been.
Additionally, monthly reports hosted on the Air Force website have had the RPA data removed — and recently.
Those files still contained the RPA data as of Feb. 16, according to archived web pages accessed via Archive.org. Metadata included in the new, RPA-less versions of the reports show the files were all created Feb. 22.
Lunchtime listen: Lawrence Lessig's Furman lecture titled "Aaron's Laws: Law and Justice in a Digital Age."
Submitted by jrjacobs on Wed, 2013-03-06 13:18.
This will be well worth your time! Listen, grok, act!
On Tuesday, Feb. 19, Lawrence Lessig marked his appointment as Roy L. Furman Professor of Law and Leadership at Harvard Law School with a lecture titled "Aaron's Laws: Law and Justice in a Digital Age." The lecture honored the memory and work of Aaron Swartz, the programmer and activist who took his own life on Jan. 11, 2013 at the age of 26. Swartz spent the last two years fighting federal charges that he violated the Computer Fraud and Abuse Act.
On his blog, Lessig wrote, “When a law professor is given a “chair” s/he gives a lecture in honor of the honor. … After Aaron’s death, I asked the Dean to let me reschedule the lecture. But after some more thought, I’ve decided to make the lecture about Aaron, and about how we need to honor his work.”
A bill in the Minnesota legislature would allow government agencies to post official notices on their web sites instead of in newspapers and would require a "permanent record" of publications to be "maintained." Included would be publication of transportation projects, proceedings, official notices, and summaries of meetings. The bill apparently does not designate who will preserve the information nor does it specify how to preserve the information except for the caveat that the records must be in "a form accessible by the public."
- H.F. No. 1286, as introduced - 88th Legislative Session (2013-2014) Posted on Mar 05, 2013.
Subd. 4. Record retention. A political subdivision that publishes notice on its Web site under this section must ensure that a permanent record of publication is maintained in a form accessible by the public.
We would, of course, like to see a bit more detail of the implementation, perhaps even including requirements for deposit of records in a Trusted Repository, provisions for discovery, access, use, and bulk download, and, ideally, a state-law-compliant deposit into libraries.
One section of the bill does specify that print copies of "documents" published on the web must be made available at all public libraries within the jurisdiction. This is not a bad requirement, but it does seem to us to be short-sighted to require deposit of paper copies and not require deposit of digital copies. Libraries could provide enhanced access and service over what the government could provide and could provide redundant digital preservation.
Subd. 5. Print copies. When a political subdivision publishes exclusively on the Web site, it must also make print copies of all published documents available at the main office of the political subdivision, any other government offices designated by the political subdivision, all public libraries within the jurisdiction, and by mail upon request.
Activity this week at the State Agency Databases project (http://wikis.ala.org/godort/index.php/State_Agency_Databases) was focused on renovations to the Maryland page. More on that in a moment.
ORPHANS - WE'RE STILL STUCK AT THREE, PLEASE HELP OR SPREAD THE WORD
Our tally of states in need of volunteer document specialists is stuck at three, as no one came forward this week to claim any of the following: Hawaii, Minnesota, or Oklahoma.
If you are interested in adopting one of these pages, please read our volunteer guide and make sure you can accept the responsibilities of a project volunteer. Then contact project coordinator Daniel Cornwall at firstname.lastname@example.org with a statement of interest and your favorite database from the page you are adopting.
If you're NOT interested in adopting one of the states above, would you please forward this note or blog about the opportunity? Maybe someone you know would like to share or deepen their knowledge of Hawaii, Minnesota or Oklahoma produced databases. Thank you in advance for your forwards and reblogs.
As mentioned at the top, all of the database activity this week can be attributed to Siu Min Yu, volunteer for Maryland. Some of the databases added to Maryland this week include:
Appellate Court Opinions in PDF - From the website, "Only reported opinions are available here. Reported opinions appearing on this website may not be the final, official version of the text of Appellate Court opinions/orders. Only the bound volumes of the Maryland Reports and Maryland Appellate Reports contain the final, official texts of the reported opinions of the Maryland Court of Appeals and the Maryland Court of Special Appeals."
General Assembly, Code of Public General Laws of Maryland - From the website, "The Code is arranged by subject matter and organized into “Articles” (e.g. Transportation Article), which are further subdivided into “titles”, “subtitles”, “sections”, “subsections”, “paragraphs”, “subparagraphs”, etc."
Maps and Map Data - Search the GIS Map and Map Data Portal.
Water Quality Mapping Data - A searchable database containing hundreds of water quality maps.
Trademark Search - Search the Trademarks & Service Marks Database in the State of Maryland. From the website, "The list which results from your search is made up of summary information filed with the Office of the Secretary of State of Maryland. This information may not reflect the most recent information on file. The last updated date below does not necessarily pertain to the database information to be searched."
I signed up for the Help! webinar on the Homeland Security Digital Library but was unable to make the session. Luckily, all sessions are recorded and posted along with slides for future access on their site. This was a particularly interesting session presented by Greta Marlatt, the Outreach and Collection Development Manager for the Naval Postgraduate School’s Dudley Knox Library and the Content Manager for the Homeland Security Digital Library (HSDL). Greta pointed out several interesting aspects of the HSDL site:
- Compile hearing transcripts, prepared testimonies and video links from Committee pages
- Get permissions for hosting publications from other agencies and organizations (similar to our Everyday Electronic Materials (EEMs) project described earlier)
- Weekly email alerts for targeted search strategies
- Post CRS reports
- Homeland security related blogs aggregated
I think it's especially interesting that Greta and her team are compiling govt information and hosting digital files from other agencies and organizations. I highly recommend going back and listening to this presentation and ALL of the past Help! webinars!!
Kudos to Lynda Kellam and the rest of the group of North Carolina librarians putting out these interesting and informative Help! I'm an Accidental Government Information Librarian Webinars!
The Government Resources Section of the North Carolina Library Association welcomes you to a series of webinars designed to help us all do better reference work by increasing our familiarity with government information resources, and by discovering the best strategies for navigating them.
The Homeland Security Digital Library (HSDL) is the nation's premier research collection of open-source resources related to homeland security policy, strategy and organizational management. The HSDL is sponsored by the Naval Postgraduate School Center for Homeland Defense and Security and the U.S. Department of Homeland Security’s National Preparedness Directorate, FEMA.
Greta Marlatt is the Outreach and Collection Development Manager for the Naval Postgraduate School’s Dudley Knox Library and the Content Manager for the Homeland Security Digital Library (HSDL). She has over 30 years of experience working in libraries in various capacities. Ms. Marlatt has published several articles and is the author of a number of bibliographies and help guides for topics relating to Intelligence, Information Warfare, Special Operations, Homeland Security, Mine Warfare, Directed Energy Weapons, NBC Terrorism and more. She has given numerous presentations on topics related to conducting research in the homeland security and military arenas. Ms. Marlatt holds a Bachelor of Arts degree in English from Arizona State University, a Master of Library Science degree from the University of Arizona and a Master of Arts degree in National Security Studies from California State University, San Bernardino.