Category Archives: Doc of the day

Our mission

Free Government Information (FGI) is a place for initiating dialogue and building consensus among the various players (libraries, government agencies, non-profit organizations, researchers, journalists, etc.) who have a stake in the preservation of and perpetual free access to government information. FGI promotes free government information through collaboration, education, advocacy and research.

Yay! Access to Congressionally Mandated Reports Act included in 2023 NDAA

This week the Senate passed the annual National Defense Authorization Act, which President Biden is expected to sign.

This year’s $858 BILLION bill is truly an omnibus bill: it provides a 4.6 percent pay increase for service members, raises the maximum allowable income to receive the Basic Needs Allowance, and adds funding to the Basic Allowance for Housing. It also addresses climate change and bolsters energy resiliency across the Department of Defense, makes new investments in Historically Black Colleges and Universities, and offers new support for survivors of sexual assault in the military by further expanding reforms to the Uniform Code of Military Justice.

But most relevant for government information and for us here at FGI (who have been following and advocating for this for over 10 years!), the package includes the Access to Congressionally Mandated Reports Act (the text starts on p. 3125 of this huge PDF).

The bill requires the Government Publishing Office (GPO) to create an online database for free public access to reports that agencies are required to submit to Congress, and requires agencies to provide copies of those reports to GPO for that purpose. GPO is directed to establish the database within one year, reusing existing systems to the extent possible. We assume these reports will live on govinfo.gov. This bill will go a long way toward solving (or at least relieving) the unreported documents issue that we have also been tracking for many years. Executive branch reports are a particularly egregious problem, as almost none of them make it into the Catalog of Government Publications (CGP) or are distributed to FDLP libraries.

This is an amazing early holiday present for the FDLP and for everyone who has been working these 10+ years to make this a reality!

Data Is Plural newsletter posts 2 amazing govinfo datasets: House committee witnesses and 1900 Census immigrant populations

I love Data Is Plural, Jeremy Singer-Vine’s weekly newsletter of useful/curious datasets! You can check out his archive from 2015(!) to the present and also explore the archive as a Google spreadsheet or as Markdown files (a dataset of interesting datasets :-)).

Today’s edition was especially good on the govinfo front: 2 really awesome datasets, one on House committee witnesses (1971 – 2016) and one a high-resolution transcription and CSV file of the Census Bureau’s 1900 report on immigrant populations. Check them out, and don’t forget to subscribe to the Data Is Plural newsletter!

House committee witnesses. Political scientists Lauren C. Bell and J.D. Rackey have compiled a spreadsheet of 435,000+ people testifying before the US House of Representatives from 1971 to 2016. They began with a text file scraped from a ProQuest database, provided by the authors of a dataset that focused on social scientists’ testimony (DIP 2020.12.23). Then, they determined each witness’s first and last name; type of organization; the committee, date, title, and summary of the relevant hearing; and more.

Immigrant populations in 1900. The 1900 US Census’s public report includes a table counting the foreign-born residents of each state and territory — overall and disaggregated into a few dozen origins, which range from subdivisions of countries (Poland is split into “Austrian,” “German,” “Russian,” and “unknown” columns) to entire continents (“Africa”). It’s officially available as a low-resolution PDF. Reporters at Stacker, however, recently transcribed it into a CSV file for easier use.

Integrated 2020 Redistricting Data (PL94-171) from CISER

Thanks to the Cornell Center for Social Sciences (née CISER) for posting the Census Bureau’s Integrated 2020 Redistricting Data (PL94-171).

“On August 12, 2021, the Census Bureau released the Public Law 94-171 data, also known as Redistricting Data, in four (4) parts per state. Users who want to have the complete redistricting dataset for a state in one file have to integrate these four parts of the Census Bureau files.

We’ve integrated the four parts and made them available in convenient ready-to-use formats — SAS, SPSS, STATA, and CSV. We’ve also made available SAS, STATA, and SPSS programs to read the CSV files, label the variables, and assign variables their correct type (as per the data dictionary).”
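For readers curious what "integrating the four parts" involves, here is a minimal sketch in Python. It is not CCSS's actual program. It assumes each part is a pipe-delimited file sharing a record key (called `LOGRECNO` here) and, purely for illustration, that each file carries a header row. The sample values and the `integrate_parts` helper are made up.

```python
import csv
import io


def integrate_parts(part_files, key="LOGRECNO", delimiter="|"):
    """Join several delimited files into one row per shared key value.

    The key name and delimiter are assumptions for this sketch.
    """
    merged = {}
    for handle in part_files:
        for row in csv.DictReader(handle, delimiter=delimiter):
            # Rows with the same key are combined into a single record.
            merged.setdefault(row[key], {}).update(row)
    return list(merged.values())


# Tiny made-up stand-ins for the four parts (one geography file, three data files).
geo = io.StringIO("LOGRECNO|NAME\n1|Example County\n2|Sample County\n")
seg1 = io.StringIO("LOGRECNO|P0010001\n1|100\n2|250\n")
seg2 = io.StringIO("LOGRECNO|P0030001\n1|80\n2|190\n")
seg3 = io.StringIO("LOGRECNO|H0010001\n1|40\n2|95\n")

rows = integrate_parts([geo, seg1, seg2, seg3])
for row in rows:
    print(row)
```

Each resulting row carries the geography name alongside the columns from all three data files, which is the "one file per state" convenience the integrated release provides.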

DCinbox: amazing collection of Congressional e-newsletters

As many of our readers know, government information includes critical but often “grey” or ephemeral information, including communications between our elected officials and their constituents. Here’s a very cool project called DCinbox, a database of Congressional e-newsletters. Lindsey Cormack, professor of political science at Stevens Institute of Technology, has been collecting Congressional e-newsletters since 2009. There are nearly 90,000 unique e-newsletters in the database, which is both searchable and available as a full dataset! This is a rich dataset that can help analyze partisan differences and ideology in all kinds of policy matters.

Congressional e-newsletters. For more than a decade, political scientist Lindsey Cormack’s DCinbox project has collected “every official e-newsletter sent by sitting members of the U.S. House and Senate.” You can search the corpus online and also download all the emails as a series of CSV files, grouped by month. For each of the 130,000+ mailings, the files provide the date, subject, body, and sender’s Bioguide ID. (April 2020 was the highest-volume month, with more than 2,300 messages, nearly all of them mentioning the coronavirus.)

HT to the Data Is Plural 2021.03.03 edition. Please subscribe to their weekly newsletter and see all of the datasets that they have highlighted in previous newsletters!
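As a rough illustration of the kind of analysis the DCinbox CSV files allow, the sketch below computes the share of mailings per month that mention a keyword. The column names (`date`, `subject`, `body`, `bioguide_id`), the ISO date format, and the sample rows are all assumptions for illustration, so check the real files before reusing any of this.

```python
import csv
import io
from collections import Counter


def keyword_share_by_month(csv_file, keyword):
    """Return {YYYY-MM: fraction of mailings mentioning keyword}.

    Assumes hypothetical columns 'date' (ISO format), 'subject', and 'body'.
    """
    totals, hits = Counter(), Counter()
    needle = keyword.lower()
    for row in csv.DictReader(csv_file):
        month = row["date"][:7]  # "2020-04-15" -> "2020-04"
        totals[month] += 1
        text = (row["subject"] + " " + row["body"]).lower()
        if needle in text:
            hits[month] += 1
    return {month: hits[month] / totals[month] for month in totals}


# Made-up sample rows standing in for a DCinbox export.
sample = io.StringIO(
    "date,subject,body,bioguide_id\n"
    "2020-04-01,Coronavirus update,Resources for constituents,X000001\n"
    "2020-04-15,Town hall,Join our call on coronavirus relief,X000002\n"
    "2020-05-02,Newsletter,Infrastructure news,X000001\n"
)

shares = keyword_share_by_month(sample, "coronavirus")
print(shares)
```

Run against the real monthly files, a function like this could be used to check claims such as the April 2020 coronavirus spike mentioned above.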

Library of Congress Completes Digitization of 23 Early Presidential Collections

This is awesome! The Library of Congress has just finished a 20-year(!) project to digitize the papers of the Presidents from George Washington to Calvin Coolidge. I hope GPO is going to catalog these collections so that the records get into library catalogs!

The Library of Congress has completed a more than two decade-long initiative to digitize the papers of nearly two dozen early presidents. The Library holds the papers of 23 presidents from George Washington to Calvin Coolidge, all of which have been digitized and are now available online.

The Library plans to highlight each presidential collection on social media in the weeks leading up to the next presidential inauguration on Jan. 20, 2021.

Full Set of Presidential Collections
