jajacobs's blog

Lots of interesting information and ideas at APDU 2012

The annual meeting of the Association of Public Data Users was held recently in Washington, and it produced interesting discussions and insights into the current state of, and future directions for, official statistics. Here are links to presentations from the conference and an excellent overview of the conference by Peggy Garvin.

Complications of the U.S. Public Domain

Here is a useful and informative article about copyright by Peter Hirtle, Senior Policy Advisor at Cornell University Library. It includes a section on "The confusing case of government works."

Think 1923 is the magic year? Think again! "Probably the oldest work still protected by copyright in the U.S. is a letter from John Adams to Nathan Webb written on Sept. 1, 1755."

Congress.gov, the new THOMAS, launched in beta

The Library of Congress unveiled a new Web search tool for bills and other Congressional records Wednesday that will eventually replace the 17-year-old Thomas.gov website.

  • Congress.gov. Also see: About page.

    Congress.gov makes federal United States legislative information freely available to the public. Launched Sept. 19, 2012, this version of the site is an initial beta release of Congress.gov, created as a successor to THOMAS.gov, the current public site for legislative information. The Congress.gov beta site contains legislation from the 107th Congress (2001) to the present, member of Congress profiles from the 93rd Congress (1973) to the present, and selected member profiles from the 80th through the 92nd Congresses (1947 to 1972). Over the next two years, Congress.gov will be adding information and features, eventually incorporating all of the information currently available on THOMAS.gov.

  • Smartphone friendly, congressional search site unveiled, by Joseph Marks, NextGov (Sep 19, 2012).
     
  • Congress launches THOMAS successor Congress.gov, by Daniel Schuman, Sunlight Foundation (Sept. 19, 2012)

    What's noticeable about this evolving beta website, besides the major improvements in how people can search and understand legislative developments, is what's still missing: public comment on the design process and computer-friendly bulk access to the underlying data.

Update:
Here is another story:

  • What Congress.gov Means for a Congressional API, by Nick Judd and Miranda Neubauer, TechPresident (September 19, 2012)

    "I'm impressed," said Josh Tauberer, whose GovTrack scrapes data from THOMAS to provide it in a machine-readable form for other websites like OpenCongress, in an email. "From its new faceted search to its mobile-friendly HTML, they really hit the technology on the nail. And there's more explanation for people who aren't legislative pros. They may be slowly catching up to GovTrack.

    "This new site shows that the LOC actually has the technical chops to implement raw data properly, which was a serious concern of mine before," Tauberer also wrote.

    That said, Tauberer pointed out that the new site offers "no new actual information." House leadership has promised to offer access to the underlying data that fuels THOMAS and has repeatedly expressed a commitment to doing it. They just haven't committed to doing it during this Congress. And the lack of action on something that seems to them to be eminently doable has advocates kind of frustrated.

    Gayle Osterberg, Director of Communications for the Library of Congress, seemed to indicate in an email that the Library of Congress is ready to cooperate. They just need Congress — meaning the House and Senate both — to give them the go-ahead.

Another update:

  • Congress.gov Beta: An Early Look at a New THOMAS, by Peggy Garvin, InfoToday (September 27, 2012).

    The Congress.gov beta is still in the early stages of incorporating existing THOMAS content and implementing the improved search functions that THOMAS users have been waiting for. The Law Library of Congress, which is managing the transition, is anxious to get your feedback and suggestions via its form at http://beta.congress.gov/survey.

Sunlight on Thomas (Beta) and LIS and the future of Legislative Information

Looking Forward to the THOMAS Beta Website, by Daniel Schuman, Sunlight Foundation (Sept. 14, 2012).

In the near future, Congress is expected to release a major upgrade to its aging legislative information website THOMAS. The long-overdue update is part of a much larger effort to "enhance the effectiveness of mission-critical systems," a response to significant public and internal pressure to improve congressional efficiency and transparency. The launch of "THOMAS Beta" is the first step towards developing what the Library of Congress describes as a completely "modern legislative information system" that will replace THOMAS and Congress' more sophisticated internal legislative tracking website "LIS" in FY 2014. Both THOMAS and LIS will stay online alongside the beta website for several years.

While THOMAS Beta has been shown to stakeholders inside Congress, as far as I am aware there has been no formal engagement process with the public to identify specifications, discuss wireframes, or generally make sure the site meets the public's needs.

GPO Selects SDL Technology to Digitally Manage and Publish U.S. Congressional Legislation

Press Release:

  • U.S. Government Printing Office Selects SDL Technology to Digitally Manage and Publish U.S. Congressional Legislation, September 12, 2012 09:32 ET.

    SDL (LSE:SDL), the leading provider of Global Information Management solutions, today announced that one of the world's largest digital information facilities, the U.S. Government Printing Office (GPO), has selected SDL to automate the publishing process for printing and accessing select Congressional and Federal agency legislation. GPO provides the three branches of the U.S. federal government with expert publishing and printing services and awarded SDL the Composition System Replacement (CSR) contract following a rigorous search and evaluation process.

    All U.S. Congressional legislation will be published using SDL XML Professional Publisher (XPP™), an automated XML publishing engine for the production of high-volume and complexly formatted publications. SDL XPP software will integrate with GPO's Federal Digital System and be the central point for composition of content for print and online access. SDL XPP replaces a proprietary system that was developed internally but could not scale to support the growth of the GPO.

More here:

  • Government Printing Office adopts internal XML system, by Joseph Marks, Government Executive (September 12, 2012).

    The Government Printing Office is adopting a new system that will manage and publish congressional bills and other publications entirely in a pared down and machine-readable XML format, the company providing the system announced Wednesday.

    GPO plans to launch a “proof of concept” for the new system with congressional bills before expanding it to other publications such as the Federal Register and the Congressional Record, Chief Technology Officer Ric Davis told Nextgov.

Comparing LIS and Thomas

The Congressional Research Service has published an update to its handy guide for finding current legislation and regulations:

For those experienced in legislative and regulatory searching there won't be anything new or surprising here, but it is a handy introduction and reference.

One thing I particularly liked was the comparison on p. 13 of the "Legislative Information System," which provides access to legislative information to Members of Congress and their staff, and THOMAS, which makes information on federal legislation freely available to the public. That's right, one system for Congress and a separate system for us ordinary folk.

Here is a sample:

LIS | THOMAS
Best used for: Finding the most complete legislative information | Best used for: Working with constituents
Links from Bill Summary & Status display to CRS reports | No CRS reports
Links to Capitol Hill and selected outside sources of floor and committee schedule information | Minimal links
Special advanced search capabilities | Advanced search capabilities only in Bill Summary & Status database

Again, this won't be news to most of you, but it is a nice summary of what we are missing.

OMB issues Managing Government Records Directive

M-12-18, Managing Government Records Directive (August 24, 2012) (7 pages, 2.62 MB).

MEMORANDUM FOR THE HEADS OF EXECUTIVE DEPARTMENTS AND AGENCIES AND INDEPENDENT AGENCIES
FROM: Jeffrey D. Zients, Acting Director, Office of Management and Budget, and David S. Ferriero, Archivist of the United States
SUBJECT: Managing Government Records Directive

This Directive creates a robust records management framework that complies with statutes and regulations to achieve the benefits outlined in the Presidential Memorandum. This Directive was informed by agency reports submitted pursuant to Sec. 2 (b) of the Presidential Memorandum and feedback from consultations with agencies, interagency groups, and public stakeholders.

This Directive requires that to the fullest extent possible, agencies eliminate paper and use electronic recordkeeping. It is applicable to all executive agencies and to all records, without regard to security classification or any other restriction.

This Directive also identifies specific actions that will be taken by NARA, the Office of Management and Budget (OMB), and the Office of Personnel Management (OPM) to support agency records management programs. In addition, NARA will undertake a review to update relevant portions of the Code of Federal Regulations to take into account the provisions of this Directive.

Census Bureau Research Data Products

The Census Bureau provides the Research Data Products page with links to new tools that make data more accessible and understandable. Bureau researchers also create new data products from existing data collections.

There are some very interesting services here! Check out the innovative "synthetic data" projects: the Synthetic Survey of Income and Program Participation (a Beta version of synthetic microdata on individuals) and the Synthetic Longitudinal Business Database (a Beta version of synthetic microdata on all U.S. establishments). There are also more traditional tools, such as the Small Area Income and Poverty Estimates Interactive Map Tool and Quarterly Workforce Indicators, and much more!

  • Research Data Products
    • Demographic - People and Households
    • Economic - Businesses
    • Longitudinal Employer-Household Dynamics - Workforce

State Government Digital Documents: Providing Access and Preservation

The Colorado State Publications Library Digital Repository collects and preserves born digital publications from Colorado state agencies. Its mission is to provide Colorado residents with permanent public access to information produced by state government.

In a post to the Bestpractices mailing list today, Debbi MacLeod, the Director of the Colorado State Publications Library, says that the library joined The Colorado Alliance of Research Libraries' Digital Repository (ADR) in 2008. There are now more than 9,500 documents in the ADR.

One benefit of joining was increased exposure of the collection, because ADR content is exposed to search engines. The collection went from an average of 1,000 hits per month to 10,000-15,000.

Another benefit is shared preservation responsibility. MacLeod says:

[W]e no longer have to worry about a local catastrophic server failure. ADR staff keep track of the latest developments in digital preservation. They keep on top of server maintenance and periodic testing to ensure that files deposited in the system have not been corrupted. Also, there is an ongoing pilot with DuraCloud to test the pros and cons of a distributed back-up system using cloud technology.

This seems to be a successful example of building and sharing infrastructure and responsibilities in a way that leverages the strengths of cooperating organizations to accomplish more than any one could on its own.

Even better, MacLeod notes that ADR is willing to work with other states!

While The Alliance is located in Colorado, they are interested in expanding their base and having other state collections of documents or special collections join their consortium. Much of the ground work for the particulars to state documents has now been done and can be applied to other states. Robin Dean is the Director of the Alliance Digital Repository. She can be reached at 303-759-3399 x110 or robin at coalliance.org to start a conversation.

Data from city governments on data.gov

The federal government's data portal data.gov has a space for American cities to make their data available: cities.data.gov. Data from four cities, Chicago, New York, San Francisco, and Seattle, are available so far.

  • Cities.Data.gov

    Showcasing the applications and opportunities for harnessing the power of open data across the nation. City officials and developers working together to help improve the information available to city residents. Data in Cities.Data.Gov is not federal data.

  • We Want You: City Data Edition, by Nate Berg, The Atlantic Cities (Aug 02, 2012).

    The new clearinghouse features thousands of openly accessible data streams, including information on building permits filed in these cities, a regularly updated feed of Seattle Fire Department 911 dispatches, budget documents and tons of maps of things like parks, film locations and building footprints.

    Chicago has 1,826 data feeds on the site, New York has 1,087, Seattle has 711, and San Francisco has 310. The federal government has made 6,560 of their own available.
