- U.S. Government Printing Office Selects SDL Technology to Digitally Manage and Publish U.S. Congressional Legislation, September 12, 2012 09:32 ET.
SDL (LSE:SDL), the leading provider of Global Information Management solutions, today announced that one of the world's largest digital information facilities, the U.S. Government Printing Office (GPO), has selected SDL to automate the publishing process for printing and accessing select Congressional and Federal agency legislation. GPO provides the three branches of the U.S. federal government with expert publishing and printing services and awarded SDL the Composition System Replacement (CSR) contract following a rigorous search and evaluation process.
All U.S. Congressional legislation will be published using SDL XML Professional Publisher (XPP™), an automated XML publishing engine for the production of high-volume and complexly formatted publications. SDL XPP software will integrate with GPO's Federal Digital System and be the central point for composition of content for print and online access. SDL XPP replaces a proprietary system that was developed internally but could not scale to support the growth of the GPO.
- Government Printing Office adopts internal XML system, By Joseph Marks, Government Executive (September 12, 2012).
The Government Printing Office is adopting a new system that will manage and publish congressional bills and other publications entirely in a pared-down, machine-readable XML format, the company providing the system announced Wednesday.
GPO plans to launch a “proof of concept” for the new system with congressional bills before expanding it to other publications such as the Federal Register and the Congressional Record, Chief Technology Officer Ric Davis told Nextgov.
FGI just signed the letter below, written by the Sunlight Foundation, asking Congress to improve public access to legislative information by directing the Library of Congress to make its THOMAS database accessible in bulk format. If you and/or your organization believe that free access to Congressional information is of critical importance, please consider adding your name to the list of signatories on the letter. Daniel Schuman, Sunlight Foundation's policy counsel and director of the Advisory Committee on Transparency, requests that people sign on by COB on Monday, April 2nd. Interested people may also email Daniel (at email@example.com) with how they would like to be identified on the letter. Daniel thanks you and so do we!
We are writing to ask you to improve public access to legislative information by directing the Library of Congress to publish the THOMAS database online. Congress created THOMAS with the mission of making federal legislation freely available to the public. While times have changed, and technologies have changed, THOMAS has not kept up.
As a result, millions of Americans access basic information about legislation and congressional actions through online information providers like GovTrack, OpenCongress, and Washington Watch. These free non-governmental websites are forced to rely on brittle programs to harvest information from THOMAS’s complex website. This harvesting is imperfect, expensive, and time-consuming. The better approach -- which has been adopted by industry and many in government -- is to publish legislative information "in bulk" in addition to other means.
Bulk access would in essence make the entire legislative database available for download, instead of requiring users to gather information by visiting hundreds or thousands of web pages. It would make it easier for third parties to build innovative new tools, and ensure that Americans have the most accurate information at their fingertips. Congress already expressed its support for bulk access downloads in 2009, but the Library of Congress, which oversees THOMAS, has not acted. In the meantime, GPO, the executive branch, and the House of Representatives are already publishing information online in bulk.
The time has come for action. In this year's legislative branch appropriations bill, we urge you to direct the Library of Congress to implement bulk access to THOMAS within 120 days. The Library should also immediately create an advisory committee on improving public access to legislative information composed of people inside and outside of government. Congress should ensure that THOMAS lives up to its potential of making the legislative branch more open and transparent.
For more information, please contact Daniel Schuman, policy counsel, the Sunlight Foundation, at 202-742-1520 x 273 or firstname.lastname@example.org
I'm going to reprint James' comment from Wednesday on the Michigan digitization project here because I think it merits some serious discussion. There were a series of comments on the way the government documents have been cataloged in the Michigan catalog because the variance in cataloging has caused a lot of the documents to be barred from public viewing "due to copyright":
I see a collaborative project! It'd be great to be proactive on UMich's govt pubs. Rather than having to submit a form when an item is found that should be accessible/in the public domain, wouldn't it be cool if UMich put up a list of all their documents (in a wiki?) and let the community/public have at it to verify "public domainness" of documents. Documents classes could assign reviewing as well.
There is precedent for this kind of collaborative project. In 2006, the federal government set up a Web site to make public a vast archive of Iraqi documents captured during the war (which was later shut down because detailed accounts of Iraq’s secret nuclear research were available publicly! oops!!). A site called LibriVox has volunteers who read chapters of public domain books, many of which have been digitized by Project Gutenberg. The point is, let's leverage the power of the internet to help get govt information out to the public!
As it happens, right after I read James' comment I was in a meeting where I found out about another project occurring at the law school library at Rutgers University. The project, Congressional Documents Online, is a full-text archive of Congressional Hearings and Committee Prints from the Rutgers law library collection. The Law Library is in the process of digitizing its print collection of Congressional documents and the website says that there are "7064 documents available, totalling: 1581950 pages, 238814558897 total bytes, as of: Wed Dec 19 14:47:37 EST 2007". All are freely available online. There's a simple search box and a browseable list of the documents.
The Sunlight Foundation has just issued a press release about its new Open House Project. The goal of the project is to explore ways that the workings of the U.S. House of Representatives can be made more transparent using the Internet. The most encouraging news: the project has the support of Speaker Pelosi. The initial list of participants includes high-profile names such as Markos Moulitsas-Zuniga of Daily Kos and my favorite masher of congressional info, Josh Tauberer of Govtrack.us. The group intends to make a report to Congress in March 2007.
The Center for Democracy and Technology asked the public to identify categories of data that should be on the Web. From the thousands of lists, they narrowed it down to the 10 most wanted government documents.
1) Congressional Research Service (CRS) reports (Congress) -- CRS uses taxpayer dollars to produce reports on public policy issues ranging from foreign affairs to agriculture to health care. All of the reports are posted online, but access is available only to congressional offices through an intranet system. Citizens can order paper copies of the reports through their Member of Congress, but only by mail. Moreover, the general public cannot search through past reports, and a comprehensive index of the reports is not available online. (Some Members have posted some CRS reports online.) In the CDT/OMB survey, the CRS reports were the category of documents most frequently listed as sought after by researchers, students, librarians, government employees, and citizens alike.
2) Supreme Court Web site (including opinions and briefs) (Judiciary) -- The Supreme Court of Mongolia has its own official Web site, but the U.S. Supreme Court doesn't. Instead, the Court refers people to one or more of 10 different unofficial Web sites, which publish various subsets of opinions, updated with varying frequency. While Court officials have said that they are exploring the possibility of creating a Web site, there is no official source of information from the highest court in the land. In addition to opinions, the Court should post briefs, at least in cases accepted for oral argument. CAPTURED!
3) State Department's Daily Briefing Book (State) -- Nearly every day, the State Department prepares for its press secretary a book of answers to every question that might be asked during the daily press conference. These briefing books represent considerable effort on the part of Department officials and constitute the best overview of American foreign policy positions on breaking issues at any given time. All the material is cleared for public consumption, yet if a reporter doesn't ask a question on a particular topic, the information doesn't get released.
4) Pesticide Safety Database (EPA) -- Under the Federal Insecticide, Fungicide, and Rodenticide Act (FIFRA), the EPA is required to maintain an extensive database on pesticides and pesticide "incidents" by location. This information concerns the health of millions of Americans. Right now, individuals can make a paper request for information about a particular pesticide or area of the country, but the information is not searchable online and cannot be compared across communities. Internet tools could assist in understanding and analyzing this data. Providing this information online in the form of a searchable database, as the EPA has done with similar data sets, would enhance the public's understanding of the pesticide risks in local communities.
5) Full Text of all Congressional Hearings (Congress) -- Prompt access to written statements and hearing transcripts is essential to the public's participation in the legislative process. Printed records of hearings are often not available until a year or more after a hearing, sometimes long after legislation has been enacted or the term has ended. Some Committees regularly place witness statements and the full text of hearings online; others do not. Congressional committees should be more consistent in automating the process of posting witness statements and hearing transcripts to ensure speedy public access. Moreover, as transcripts are all word-processed, the Government Printing Office (GPO) could easily make them permanently available online (if they were provided to GPO).
6) Court Briefs (DOJ) -- The public deserves to know how the government interprets the laws. The Justice Department lawyers represent the US government and therefore are the people's lawyers. Their briefs are public documents presenting the position of the US government. Since these documents are word-processed, they could very easily be put online, starting with significant criminal and civil cases.
7) Congressional votes in searchable database (Congress) -- Congress has made roll-call votes available online in XML format, but has not yet provided a way to search votes by Member's name. Public accountability would be greatly enhanced if citizens could find out how their Members of Congress voted through an online, searchable database of recorded votes.
8) Endangered Species Recovery Plans (DOI) -- These documents detail how the government plans to defend endangered species and eventually get them off of the endangered species list. The Fish and Wildlife Service has told us that it plans to put these 700+ documents online eventually; meanwhile, researchers, students, and concerned citizens have to pay to have them sent in paper. CAPTURED!
9) Official Gazette of Trademarks (DOC) -- The Official Gazette of the United States Patent and Trademark Office (USPTO) is the official journal relating to patents and trademarks. It has been published weekly since January 1872. In searching for a reason why this publication is not online, USPTO said it was up to GPO. GPO said that it will put online anything USPTO or any other agency asks it to.
10) Circuit Court Web Sites (Judiciary) -- The federal Circuit and District Courts have been slow to embrace the Web. Only 5 of the 12 Circuit Courts of Appeals have Web sites providing access to opinions at no cost. While a number of law schools have stepped in to fill the gap, all circuit courts should have official sites providing the public with free access to court opinions. If five can do it, why can't the rest?
My first take on the THOMAS beta is now available online at LLRX.com -- specifically at http://www.llrx.com/columns/govdomain23.htm. (The full January issue of LLRX is not up yet, so you have to go directly to this URL.) The Library of Congress is looking for feedback, so be sure to use the THOMAS beta comment page if you have something to say.