
Author Archives: James R. Jacobs

Our mission

Free Government Information (FGI) is a place for initiating dialogue and building consensus among the various players (libraries, government agencies, non-profit organizations, researchers, journalists, etc.) who have a stake in the preservation of and perpetual free access to government information. FGI promotes free government information through collaboration, education, advocacy and research.

Lunchtime listen: “Storing Data Together” by Matt Zumwalt at Code4Lib2017

Drop everything and watch this presentation from the 2017 Code4Lib conference, which took place in Los Angeles March 6-9, 2017. Heck, watch the entire proceedings because there is a bunch of interesting and thoughtful stuff going on in the world of libraries and technology! But in particular, check out Matt Zumwalt’s presentation “How the distributed web could bring a new Golden Age for Libraries.” After submitting his talk, he changed the title to “Storing data together: the movement to decentralize data and how libraries can lead it” because of the DataRefuge movement.

Zumwalt (aka @FLyingZumwalt on Twitter) works at Protocol Labs, one of the primary developers of the InterPlanetary File System (IPFS). Grok their tagline: “HTTP is obsolete. It’s time for the distributed, permanent web!” He has spent much of his spare time over the last 9 months working with groups like EDGI, DataRefuge, and the Internet Archive to help preserve government datasets.

Here’s what Matt said in a nutshell: The Web is precarious. But using peer-to-peer distributed network architecture, we can “store data together” and collaboratively preserve and serve out government data. This resonates with me as an FDLP librarian. What if a network of FDLP libraries actually took this on? This isn’t some far-fetched, sci-fi idea. The technologies and infrastructures are already there. Over the last 9 months, researchers, faculty and public citizens around the country have already gotten on board with this idea. Libraries just have to get together and agree that it’s a good thing to collect/download, store, describe and serve out government information. Together we can do this!

Matt’s talk starts at 3:07:41 of the YouTube video below. Please watch it, let his ideas sink in, share it, start talking about it with your colleagues and administrators in your library, and get moving. Government information could be the great test case for the distributed web and a new Golden Age for Libraries!

This presentation will show how the worldwide surge of work on distributed technologies like the InterPlanetary File System (IPFS) opens the door to a flourishing of community-oriented librarianship in the digital age. The centralized internet, and the rise of cloud services, has forced libraries to act as information silos that compete with other silos to be the place where content and metadata get stored. We will look at how decentralized technologies allow libraries to break this pattern and resume their missions of providing discovery, access and preservation services on top of content that exists in multiple places.


CRS reports set to become public!

I can’t believe it’s finally happened, but today the House Appropriations Committee voted to “allow public access to all non-confidential CRS reports” as part of the FY 2018 Legislative Branch Appropriations bill. We’re one step closer to having public access to CRS reports! A bipartisan group of 40 nonprofit organizations (including FGI!) and 25 former CRS employees have been banging on Congress to do this, and the House today finally listened!

Librarians have advocated for public access to Congressional Research Service (CRS) reports for at least 20 years. It’s been an uphill battle because some in Congress and the Library of Congress have long viewed CRS reports (which provide non-partisan analysis of important policy issues before Congress) as “privileged communication” between Congress and the CRS. And because of this narrow thinking about *public domain* government information, libraries and the public have been forced to pay for these reports from private publishers, subscribe to expensive databases for access or find them serendipitously on the web.

Here is the appropriations report language:

“Public Access to CRS Reports: The Committee directs the Library of Congress’s Congressional Research Service (CRS) to make available to the public, all non-confidential reports. The Committee has debated this issue for several years, and after considering debate and testimony from entities inside the legislative branch and beyond the Committee believes the publishing of CRS reports will not impede CRS’s core mission in any impactful way and is in keeping with the Committee’s priority of full transparency to the American people. Within 90 days of enactment of this act CRS is directed to submit a plan to its oversight committees detailing its recommendations for implementing this effort as well as any associated cost estimates. Where practicable, CRS is encouraged to consult with the Government Publishing Office (GPO) in developing their plan; the Committee believes GPO could be of assistance in this effort.”

Read DemandProgress’ press release for more background.

Tweets of Congress, tweets of Trump archived and downloadable in bulk

The recently launched Tweets Of Congress is collecting and publishing daily archives of tweets by congressional representatives, caucuses, and committees. The site only got up and running last week, so there are daily archives starting June 21, 2017. There’s also the Trump Twitter Archive, which has collected more than 30,000 of @realDonaldTrump’s tweets and makes them searchable and downloadable in bulk.

But this points to a larger issue of the US government using commercial social media sites and tools to communicate with the public. This time around, the 2016 End of Term crawl included 9,000+ social media accounts (scraped from the .gov social media registry API): 44% Facebook, 37% Twitter, and 10% YouTube. We also collected ~130 TB of .gov FTP sites that agencies use to serve out their collected data sets.

Tweets of Congress is my attempt to collate the entirety of Congress’ daily Twitter output using an automated process that checks Twitter on a fixed interval. Archives are available on this site and in JSON form. You can find JSON datasets linked in posts or in this site’s Github repo. Due to size constraints, archives will be limited at some tbd point. This site is open-source, so feel free to fork or whatever to your heart’s content. For any issues or other feedback, file an issue in the repo or send me an email.

via About – Tweets of Congress.

HT Data Is Plural 2017.06.28 edition. Don’t forget to subscribe to Jeremy Singer-Vine’s Data Is Plural weekly newsletter!

GAO adds 2020 census to its high risk list

The U.S. Constitution — Article I, Section 2, clause 3, as modified by Section 2 of the 14th Amendment — requires a population census every 10 years for apportioning seats in the House of Representatives. However, in the wake of US Census Bureau Director John Thompson’s abrupt resignation in May — which garnered a rash of editorials and news articles decrying his resignation at this critical time! — and the Trump administration and GOP-led Congress failing to fully fund the 2020 effort, the 2020 census could be “heading for a train wreck” as Terri Ann Lowenthal, the former co-director of the Census Project, put it so succinctly.

Accordingly, the Government Accountability Office has added the 2020 US census to its high risk list. Issues which raised the threat level for GAO include cancelled field tests for 2017, critical IT uncertainties, information security risks, and “unreliable” cost estimates which do not “conform to best practices.”

Strap in, folks, we’re in for a bumpy couple of years for the census. If you have a Senator on the Senate Appropriations Committee or Representative on the House Appropriations Committee, please contact them early and often and ask (nay, plead!) that they fully fund the US Census Bureau in order to complete the constitutionally mandated decennial census.

For more background on the US census, see this CRS Report “The Decennial Census: Issues for 2020.”

Every 2 years at the start of a new Congress, GAO calls attention to agencies and program areas that are high risk due to their vulnerabilities to fraud, waste, abuse, and mismanagement, or are most in need of transformation. The 2017 update identified 3 new High Risk areas and removed 1 area. The update is available below.

via U.S. GAO – High Risk List.

Montana Library Association passes Resolution to fund US govt publications preservation

Bernadine Abbott Hoduski, the grande dame of government documents (she’s got a GODORT award named after her, for gosh sakes!), sent me this announcement. The Montana Library Association, at its annual membership meeting in March 2017, passed a packet of resolutions including their Resolution on Funding the Preservation of Federal Government Publications (text below). The resolution calls on the US Congress to “fully fund preservation of Federal government publications housed in federal depository libraries.”

The resolution has been sent to Montana’s US Senator Jon Tester, who happens to sit on the Senate Appropriations Committee. Please consider taking this text and passing the resolution at other state library associations, especially if your state’s senator sits on the Appropriations Committee. I’ve sent the text of this resolution to CA Senator Dianne Feinstein.

Thanks, Bernadine, for all your hard work on this and through the many years!

Resolution on Funding the Preservation of Federal Government Publications

Whereas, Democracy depends upon the public’s access to information from and about the United States federal government; and

Whereas, to preserve the historic record of our country, the United States Congress established a distributed system of Federal depository libraries to safeguard government information from dangers ranging from bit-rot to fire; and

Whereas, the United States Federal depository libraries provide public access to federal government publications and information without charge; and

Whereas, Federal depository libraries spend millions of dollars collecting, housing, cataloging, and providing public access to federal government information; and

Whereas, Federal depository libraries lack enough money to preserve millions of federal government publications in paper, microform, and digital formats; and

Whereas, the U. S. Government Publishing Office (GPO) established FIPNet (Federal Information Preservation Network) as part of the “National Plan for Access to U. S. Government Information” – a strategy for a collaborative network of information professionals working in various partner roles to ensure access to the national collection of government information for future generations. FIPNet contributes to the preservation of both tangible and digital government information, and elevates the public awareness and prestige of local initiatives, specific collections of government information, and the institutions and agencies that have stewardship over them; and

Whereas, GPO is not authorized to provide funding directly to depository libraries that agree to preserve federal government publications; and

Whereas, the United States Congress can authorize GPO to provide funding to depository libraries; and

Whereas, GPO needs additional funding and staff to provide on-site support for libraries in the building of an inventory and catalog of all their federal government publications in order to plan for preservation;

Therefore, be it resolved that:

The Montana Library Association urges the U. S. Congress to fully fund preservation of Federal government publications housed in federal depository libraries; and

The Montana Library Association urges the U. S. Congress to authorize the U. S. Government Publishing Office to provide funds directly to libraries for the preservation of the federal government publications (paper, microform, and digital) housed in their libraries; and

The Montana Library Association urges Congress to provide funding to the Superintendent of Documents (GPO) so agency librarians can travel to depository libraries to advise librarians in preservation activities, including inventorying, cataloging, and planning for preservation of government publications.

Adopted by the Montana Library Association Membership March 31, 2017
