Category Archives: post

Senate to publish bulk data in XML

Big News!

Senate Joins House In Publishing Legislative Information In Modern Formats, by Daniel Schuman. Congressional Data Coalition (December 18, 2014).

There’s big news from today’s Legislative Branch Bulk Data Task Force meeting. The United States Senate announced it would begin publishing text and summary information for Senate legislation, going back to the 113th Congress, in bulk XML. It would join the House of Representatives, which already does this. Both chambers also expect to have bill status information available online in XML format as well, but a little later on in the year.

There is more good news, too. Read Daniel’s complete report at the link above.

What makes a “fugitive document” a fugitive?

First off, I’d like to thank GPO (now the Government Publishing Office!) for posting about this Historic Fugitive Document Available through the CGP. I’d like to give a little context and parse out what makes a fugitive document a fugitive. A fugitive document is one that is within scope of the Federal Depository Library Program (FDLP) but, for whatever reason, is not distributed by GPO to depository libraries.

Fugitives are a rapidly growing problem. According to GPO, 97% of all US documents are now born-digital, and most federal agencies now publish born-digital documents on their own .gov sites. That cuts GPO out of the publishing process and erodes the national bibliography that is the Catalog of Government Publications (CGP). (BTW, my colleague Jim Jacobs (yes, there are two of us!) and I will be giving a “Help! I’m an Accidental Government Information Librarian” webinar on fugitives next month, so stay tuned for the announcement!)

In the case of the 1991 “Report on Semiconductors, Fiber Optics, Superconducting Materials, and Advanced Manufacturing”, an emeritus professor gifted this document to my colleague Stella Ota, our physics and astronomy bibliographer, who passed it along to me. I thought for sure we’d have this stand-alone or in the United States Congressional Serial Set, the long-standing official collection of Congressional reports and documents near and dear to many govt information librarians’ hearts. (If you’re particularly nerdy, there’s a great book recently published about the Serial Set by Andrea Sevetson and Mary Lou Cumberpatch!)

But the more I looked, the less I found. It was announced as transmitted to Congress in the Congressional Record (137 Cong Rec S 4449) and in the Public Papers of the President. But it didn’t show up in the Serial Set or in my wider net of the CGP, FDsys, or Monthly Catalog (another gem, the precursor to the CGP published since 1895). It shows up as a stub in Google Books, but there is nothing in HathiTrust. No libraries are listed in the WorldCat record. It simply hadn’t been published, though it was announced that it had. (Pro tip: don’t always believe the Congressional Record when it says something has been published; check all the sources to make sure!)

I don’t know how this Stanford emeritus professor came to have the document in his possession, but it had clearly fallen through the FDLP cracks. Thanks to Astrid Smith, one of the fine staff who work in the Stanford Library digitization lab in Digital Library Systems and Services (DLSS), it was quickly and expertly digitized, OCR’d, and stored in our Stanford Digital Repository, and also made physically available in the library.

So there you have it, a day in the life of one fugitive US publication.

Historic Fugitive Document Available through the CGP

Details
Last Updated: December 18 2014
Published: December 18 2014
The 1991 report prepared by the Technology Administration, U.S. Department of Commerce, “Report of the President to the Congress on Federal Policies, Budgets, and Technical Activities in Semiconductors, Fiber Optics, Superconducting Materials & Advanced Manufacturing,” is now available through GPO’s Catalog of U.S. Government Publications.

OCLC Number: 898189404
CGP System Number: 000938821
SuDoc Class: C 1.202:SE 5
Item Number: 0129-B (EL)
PURL: http://purl.fdlp.gov/GPO/gpo53991

GPO thanks James Jacobs and the staff at Stanford University for collaborating with GPO to provide the public with access to this historic fugitive document.

GPO Is Now The Government Publishing Office

Press Release from GPO

FOR IMMEDIATE RELEASE: December 17, 2014 No. 14-27

GPO IS NOW THE GOVERNMENT PUBLISHING OFFICE

WASHINGTON – An agency whose mission has been producing, publishing, and recording our Nation’s history has made some history of its own. Section 1301 of H.R. 83, the legislation providing consolidated and further continuing appropriations for FY 2015 that was recently passed by Congress and signed into law last night by President Barack Obama, changes the name of the Government Printing Office to the Government Publishing Office. Publishing reflects the increasingly prominent role that GPO plays in providing access to Government information in digital formats through the agency’s Federal Digital System, apps, eBooks, and related technologies. The information needs of Congress, Federal agencies, and the public have evolved beyond only print and GPO has transformed itself to meet its customers’ needs.

Link to H.R. 83: http://www.gpo.gov/fdsys/pkg/BILLS-113hr83enr/pdf/BILLS-113hr83enr.pdf

“This is a historic day for GPO. Publishing defines a broad range of services that includes print, digital, and future technological advancements. The name Government Publishing Office better reflects the services that GPO currently provides and will provide in the future,” said Davita Vance-Cooks, who now holds the title of Director of the Government Publishing Office, the agency’s chief executive officer. “I appreciate the efforts of the Members of Congress for their support and understanding GPO’s transformation. GPO will continue to meet the information needs of Congress, Federal agencies, and the public and carry out our mission of Keeping America Informed.”

GPO opened its doors on March 4, 1861, the same day Abraham Lincoln was sworn into office as President of the United States. Since that day, GPO employees have produced our country’s most important documents such as the preliminary version of the Emancipation Proclamation, The Warren Commission Report, The 9-11 Commission Report, the U.S. passport, the Federal Budget, and all Congressional materials.

GPO is the Federal Government’s official, digital, secure resource for producing, procuring, cataloging, indexing, authenticating, disseminating, and preserving the official information products of the U.S. Government. The GPO is responsible for the production and distribution of information products and services for all three branches of the Federal Government, including U.S. passports for the Department of State as well as the official publications of Congress, the White House, and other Federal agencies in digital and print formats. GPO provides for permanent public access to Federal Government information at no charge through our Federal Digital System (www.fdsys.gov), partnerships with approximately 1,200 libraries nationwide participating in the Federal Depository Library Program, and our secure online bookstore. For more information, please visit www.gpo.gov

Senate Torture Report: the Senate Speaks

Our friend Daniel Schuman from Citizens for Responsibility and Ethics in Washington (CREW) (née Sunlight) has put together a helpful ebook “Senate Torture Report: The Senate Speaks” (archived copy as ePub, PDF, full text, etc.). The ebook pulls together the speeches on the floor of the Senate of several senators, including the Intelligence Committee Chair Dianne Feinstein, explaining their views and findings. “These speeches are a helpful, succinct introduction to what is now being called the Torture Report.”

On December 9, 2014, the Senate Intelligence Committee published a report severely criticizing CIA interrogation practices as brutal and ineffective. The committee released to the public a redacted version of the report’s executive summary—nearly 500 pages long—the culmination of seven years’ work. It includes the views of the majority of committee members, an additional statement by Senator Jay Rockefeller, and the views of dissenting committee members. The full report is classified and runs nearly 6,700 pages.

via Daniel Schuman.

The Official Senate CIA Torture Report

Update


GPO has released an official version of “THE SENATE CIA REPORT” as Senate Report 113-288. The digital version is available on GPO’s Federal Digital System (FDsys):
http://www.gpo.gov/fdsys/pkg/CRPT-113srpt288/pdf/CRPT-113srpt288.pdf
The print version is available for purchase at GPO’s retail and online bookstore for $29.
http://bookstore.gpo.gov/products/sku/052-071-01571-0

This is a single-volume, 712-page version. It contains:

Letter of Transmittal to Senate from Chairman Feinstein — i
Foreword of Chairman Feinstein — iii
Findings and Conclusions — x
Executive Summary — 1
Additional Views of Senator Rockefeller — 500
Additional Views of Senator Wyden — 503
Additional Views of Senator Udall of Colorado — 506
Additional Views of Senator Heinrich — 510
Additional Views of Senator King — 512
Additional Views of Senator Collins — 515
Minority Views of Vice Chairman Chambliss, Senators Burr, Risch, Coats, Rubio, and Coburn — 520
Minority Views of Senator Coburn, Vice Chairman Chambliss, Senators Burr, Risch, Coats, and Rubio — 678
Minority Views of Senators Risch, Coats, and Rubio — 682

GPO Press Release:

FOR IMMEDIATE RELEASE: December 15, 2014

GPO RELEASES THE OFFICIAL DIGITAL & PRINT VERSIONS OF THE SENATE CIA REPORT

WASHINGTON – The U.S. Government Printing Office (GPO) makes available the official and authentic digital and print versions of the Report of the Senate Select Committee on Intelligence Committee Study of the Central Intelligence Agency’s Detention and Interrogation Program, together with a foreword by Chairman Feinstein and Additional and Minority Views (Senate Report 113-288).

This document comprises the declassified Executive Summary and Findings and Conclusions, including declassified additional and minority views. The full classified report will be maintained by the Committee and has been provided to the Executive Branch for dissemination to all relevant agencies.


The release of the Senate’s Study of the CIA’s Detention and Interrogation Program presents some interesting issues for government documents collections.

Issues

There are three separate documents, and they are easy to find on the web on different web sites, but not all sites have all three documents, and the different copies of the individual documents are not the same.

The “official” copies are (at least today) listed on the home page of the Senate Committee’s web site (see below), but are not listed on the Committee’s Publications Page or its Press Release page, perhaps because the report is not an official committee document with an assigned “Document” or “Report” number. Presumably it will not be in FDsys unless or until it gets an official Document or Report designation.

(Why isn’t it “official”? The report was initially intended to be a full committee report. In 2009 the Committee voted 14–1 to initiate the study. But later in 2009 Republicans on the Committee withdrew from active participation in the study.)

My speculation is that the different PDF files that you can find on the web are slightly different because each one was produced by scanning a paper copy with different software. I do not know if the Committee only distributed a paper copy, but I do know that even its own PDF copy is (apparently) a scanned copy. You can tell because, if you try to copy the text from the PDF, you will discover that it is badly OCR’d (optical character recognition) text. For example, the digital text of names of Senators is sometimes badly converted: Chambliss becomes “CHAMBUSS” and Rubio becomes “Rvbio”. The official copies were created using Adobe PDF Scan Library 3.1 and ScandAll PRO V2.0.12.

Official Reports and Statements

The Senate Select Committee on Intelligence currently has links to three documents on its home page.

The CIA has its own responses to the report, currently listed on its Reports page.

Other official statements.

Unofficial Copies

A web search for the title of the report (“Committee Study of the Central Intelligence Agency’s Detention and Interrogation Program”) leads to many sites with copies. Many of these are apparently taken directly from the Committee site, but at least one news organization (the New York Times) evidently made its own scanned copy and digitized text version of the main report.

  • The New York Times has a PDF copy [108.4MB, 528 pages] and a plain text copy. The PDF version was created using Acrobat 11.0.9 Paper Capture Plug-in and Xerox WorkCentre 5150. Both are stored with an Amazon cloud service.

Timeline

ProPublica has created a useful timeline to put the report in perspective.

FDLP Library Actions

What can FDLP Libraries (or any library) do to ensure that their users will be able to find and get unaltered, official copies in the future? Just relying on the web may not be adequate, secure, consistent, transparent, or guaranteed. There are several issues. The existing links to even the official documents may not be stable. The official digital copies are only digital surrogates of the original paper copy. There are already other alternative digital surrogates available. The quality of the surrogates varies and the links to those copies may also not be stable.

I suggest the following actions by libraries:

  • Get copies of the official digital versions directly from the Committee web site as soon as possible (see links above).
  • Create a digital “hash” or “checksum” of the documents you download. (See a list of various tools and a discussion of checksums for preservation, if you are unfamiliar with the concepts.)
  • Catalog your copies and include them in your OPAC or other official library inventory and discovery databases. Include adequate metadata that describes how, when, and where you got your copies.
  • Ideally, you should store your copies in a Trusted Digital Repository. Unfortunately, there are, as yet, very few certified TDRs. Short of that, be sure that you have copies stored in more than one geographic location and that you have a way of verifying over time (using the checksum) that the files you stored have not been altered or corrupted.
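
For libraries unfamiliar with checksums, the checksum steps suggested above could be sketched roughly as follows. This is only an illustration, not a recommended preservation workflow: it assumes SHA-256 as the hash algorithm and uses only Python's standard library. The function names, the JSON manifest format, and any file names are hypothetical and not taken from any tool mentioned in this post.

```python
import hashlib
import json
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(paths, manifest_path):
    """Record a checksum per file in a JSON manifest (hypothetical format)."""
    manifest = {str(p): sha256_of_file(p) for p in paths}
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))
    return manifest


def verify_manifest(manifest_path):
    """Return the files that are missing or whose checksum no longer matches."""
    manifest = json.loads(Path(manifest_path).read_text())
    changed = []
    for name, expected in manifest.items():
        if not Path(name).is_file() or sha256_of_file(name) != expected:
            changed.append(name)
    return changed
```

Under these assumptions, you would call write_manifest once, right after downloading the official PDFs, and keep the manifest alongside your catalog metadata. Running verify_manifest periodically against each stored copy then returns an empty list as long as nothing has been altered, corrupted, or lost.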