Tag Archives: government data
Data is plural newsletter posts 2 amazing govinfo datasets: House Comm witnesses and 1900 census immigrant populations
I love the Data Is Plural newsletter, Jeremy Singer-Vine’s weekly roundup of useful and curious datasets! You can check out his archive from 2015(!) to the present and also explore the archive as a Google spreadsheet or as Markdown files (a dataset of interesting datasets :-)).
Today’s edition was especially good on the govinfo front: 2 really awesome datasets, one on House Committee witnesses (1971–2016) and the other a high-resolution transcription and CSV file of the Census Bureau’s 1900 report on immigrant populations. Check them out, and don’t forget to subscribe to the Data Is Plural newsletter!
House committee witnesses. Political scientists Lauren C. Bell and J.D. Rackey have compiled a spreadsheet of 435,000+ people testifying before the US House of Representatives from 1971 to 2016. They began with a text file scraped from a ProQuest database, provided by the authors of a dataset that focused on social scientists’ testimony (DIP 2020.12.23). Then, they determined each witness’s first and last name; type of organization; the committee, date, title, and summary of the relevant hearing; and more.
Immigrant populations in 1900. The 1900 US Census’s public report includes a table counting the foreign-born residents of each state and territory — overall and disaggregated into a few dozen origins, which range from subdivisions of countries (Poland is split into “Austrian,” “German,” “Russian,” and “unknown” columns) to entire continents (“Africa”). It’s officially available as a low-resolution PDF. Reporters at Stacker, however, recently transcribed it into a CSV file for easier use.
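Because the table splits some origins across several columns, a first step for many uses is to recombine them. Here is a minimal sketch of that, assuming the CSV has one row per state or territory and one column per origin. The column names and all of the numbers below are made-up placeholders, not values from the Stacker file or the census report.

```python
import csv
import io

# Placeholder rows standing in for the real CSV. The header names and the
# counts are hypothetical; check the actual file for its column labels.
SAMPLE = """state,Poland (Austrian),Poland (German),Poland (Russian),Poland (unknown),Africa
Example State,10,20,30,5,2
Example Territory,1,0,4,0,0
"""

def poland_total(row):
    """Sum every column whose header starts with 'Poland'."""
    return sum(int(value) for name, value in row.items() if name.startswith("Poland"))

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
totals = {row["state"]: poland_total(row) for row in rows}

for state, total in totals.items():
    print(state, total)
```

To run this against the real file, swap `io.StringIO(SAMPLE)` for an open file handle and adjust the prefix test to match the real headers.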
A new report from the Government Accountability Office (GAO) has found that broadband access in tribal areas is likely even worse than previously thought because Federal Communications Commission (FCC) data overstate deployment. Now it seems we need to worry about government data going away AND about the veracity of government data.
“BROADBAND INTERNET: FCC’s Data Overstate Access on Tribal Lands”. GAO-18-630. September 2018.
The GAO report describes problems with the FCC’s Form 477 data collection, in which Internet providers submit deployment data to the commission twice a year.
The FCC provides subsidies to carriers to deploy broadband in areas where access is limited, such as through the Connect America Fund. Inaccurate data “could affect FCC’s funding decisions and the ability of tribal lands to access broadband in the future,” the GAO wrote.
“[The] FCC considers broadband to be ‘available’ for an entire census block if the provider could serve at least one location in the census block. This leads to overstatements of service for specific locations like tribal lands,” the GAO wrote.
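To see why the block-level rule inflates the numbers, here is a small sketch comparing coverage as reported under that rule with coverage counted location by location. The `CensusBlock` class, its field names, and the three toy blocks are all hypothetical illustrations, not FCC or GAO data.

```python
from dataclasses import dataclass

@dataclass
class CensusBlock:
    block_id: str      # hypothetical identifier
    locations: int     # total homes and businesses in the block
    serviceable: int   # locations a provider could actually serve

def reported_coverage(blocks):
    """Block-level rule: one serviceable location marks the whole block as covered."""
    total = sum(b.locations for b in blocks)
    covered = sum(b.locations for b in blocks if b.serviceable >= 1)
    return covered / total

def actual_coverage(blocks):
    """Location-level count: only locations that can really be served."""
    total = sum(b.locations for b in blocks)
    covered = sum(b.serviceable for b in blocks)
    return covered / total

# Toy numbers chosen only to show the effect.
blocks = [
    CensusBlock("A", locations=100, serviceable=1),   # counted as fully covered
    CensusBlock("B", locations=50, serviceable=50),   # genuinely fully covered
    CensusBlock("C", locations=50, serviceable=0),    # not covered
]

reported = reported_coverage(blocks)  # (100 + 50) / 200
actual = actual_coverage(blocks)      # (1 + 50 + 0) / 200
print(f"reported: {reported:.1%}, actual: {actual:.1%}")
```

In this toy case a single serviceable location in block A is enough to count all 100 of its locations as having access, which is the kind of overstatement the GAO describes.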
Moreover, the “FCC does not collect information on several factors—such as affordability, quality, and denials of service—that FCC and tribal stakeholders stated can affect the extent to which Americans living on tribal lands can access broadband services,” the GAO wrote.
The FCC also “does not have a formal process to obtain tribal input on the accuracy of provider-submitted broadband data,” the report said. About half of tribal stakeholders interviewed by the GAO said it’s difficult to get information about broadband deployment directly from providers.
This is a very cool idea as well as an important policy statement. Sunlight Foundation and a diverse coalition of government transparency, data innovation, scientific groups and environment defense advocates have come together to advocate for the “Preserving Data in Government Act of 2017”, which was recently introduced in the Senate. Sunlight has put the bill up on Madison, the site that allows for public collaboration on policy documents. So here’s your chance to read the bill and add your comments and suggestions to make the bill better!
This bill, which was introduced in the U.S. Senate this spring, would require federal agencies to preserve public access to data sets and prevent the removal of those data sets from the Internet without sufficient public notice. The Sunlight Foundation, a national, nonpartisan nonprofit that advocates for open government, supports the bill, but we want to make it better. You can comment on the full text of the Preserving Data in Government Act of 2017 below. We’ll make sure the Senate staff who drafted the bill see your contributions.
This certainly seems to be the year when open government data really flowers. From NASA to Census to the Federal Election Commission (FEC) and the Bureau of Labor Statistics (BLS), not to mention data.gov, agencies across the federal government are setting up developer sites with open APIs so that the public can reuse agency data and information. Just search “API site:*.gov” and you’ll find a bunch of agency open data sites.
With tens of thousands of datasets ranging from satellite imagery to material standards to demographic surveys, the U.S. Department of Commerce has long been in the business of Open Data. Through the Commerce Data Usability Project, go on a series of guided tours through the Commerce data lake and learn how you can leverage this free and open data to unlock the possible.
Unfortunately, there was a technological glitch and I didn’t get to finish my presentation on digital preservation at the 2013 House Legislative Data and Transparency conference. I’ve attached my presentation notes (PDF) in case anyone is interested. I’d be interested to hear comments.