Tag Archives: Lostdocs
- Four easy steps to reporting “unreported” publications
- Strategies for finding “unreported” documents (more tips and tricks!)
- Historically “Unreported” materials of particular interest
- History of the problem
- Appendix: how to fill out the askGPO form
“Unreported” publications (which were, until recently, called “fugitive” publications) are those that are within scope of the Federal Depository Library Program (FDLP) but for various reasons have slipped through the cracks and not been collected and cataloged by the Government Publishing Office (GPO), distributed to FDLP libraries, or included in the “National Collection” (See a partial list of historically “unreported” publications below).
We here at FGI consider “unreported” publications the paramount problem facing the FDLP today. FDLP librarians, with their critical information skills and expertise about the structure and publishing activities of the federal government, are a vital piece of the solution to this vexing problem. The National Collection is at the core of what FDLP libraries have done for the last 200+ years, so “unreported” publications erode that very foundation. During the spring 2021 virtual Depository Library Conference, I challenged every FDLP librarian to search for, find, and report to GPO five “unreported” documents every month. I’d like to reiterate that challenge here on FGI. If every one of the 1100+ FDLP librarians were to find and report 5 documents each month, that would add up to more than 5,500 reports a month, or over 66,000 a year, and we’d soon put a dent in this existential “unreported” documents problem.
To that end, we’d like to share some simple steps for how to find and report “unreported” documents to GPO:
- Find an interesting federal document or information product like a report, data set, video, or slide deck (see the “strategies” section below for tips and tricks for finding documents);
- Search the Catalog of Government Publications (CGP) to see if GPO has cataloged it;
- If it’s NOT in the CGP, go to askGPO and fill in the “unreported document” form. See appendix for how to fill out the askGPO form;
- Rinse and repeat!
Strategies for finding “unreported” documents (more tips and tricks!)
- Read the news with an eye toward those news items and sources which cover federal policies; (See for example, https://federalnewsnetwork.com, https://www.govexec.com, https://www.washingtonpost.com, etc.)
- Set up Google search and news alerts for publications from your favorite agency(ies), especially the Inspector General offices of those agencies (Inspector General reports are an especially critical and long-standing type of “unreported” document! Only a portion are even posted publicly on Oversight.gov);
- Find and report documents you use to answer reference/research consultations;
- Bookmark and visit the publications and/or press release pages of your favorite agency(ies);
- Follow on social media your favorite agency(ies), heads of agencies, your state’s Congressional delegation, known people within the executive branch, and Federal watchdog groups. New publications are often announced on government social media accounts.
Historically “Unreported” materials of particular interest
- Agency Inspector General reports;
- Executive branch agency publications. See the LostDocs project for examples of documents that have been reported to GPO;
- Communication/Letters from members of Congress to executive branch agencies;
- Communication/Letters from federal officials to a Presidential administration;
- Public datasets;
- Congressional Research Service (CRS) reports* (*CRS reports were, until 2018, considered “privileged communication” between Congress and the Library of Congress and were therefore never released via the FDLP. Here’s the back story).
History of the problem
Since 1813, when the FDLP started, there have always been “unreported” documents which slipped through the cracks and were lost to the sands of time (until very recently, these were termed “fugitive” documents) [Footnote 1]. This problem has grown exponentially as executive agencies’ publishing operations have exploded: agencies can now easily and freely distribute content online, and very few if any of them follow Title 44 and send their documents to GPO as they are required to by law. Only a minuscule fraction of born-digital executive branch information is cataloged in the Catalog of Government Publications (CGP) or makes it into the “National Collection.” This means that every year thousands, if not hundreds of thousands, of Federal documents, datasets, maps, and other born-digital materials [Footnote 2] are never preserved and are lost to the fog of history as websites are updated and historical content removed [Footnote 3].
Depository librarians reporting found publications are a critical part of a holistic solution to the “unreported” documents problem. By identifying federal information resources that are important to their local constituents, librarians are making sure that these documents will be cataloged, captured, and made accessible to a wider audience. Reporting documents also adds to a National Collection pipeline for long-term access and helps to make sure that what is collected and preserved reflects the needs and interests of the wide-ranging communities and the public which libraries serve.
Many hands make light work. Won’t you join in the effort? Please contact us if you have questions or comments at freegovinfo AT gmail DOT com.
1. See “‘Issued for Gratuitous Distribution:’ The History of Fugitive Documents and the FDLP.” James R. Jacobs. Article in special issue of Against the Grain: “Ensuring Access to Government Information”, 29(6) December 2017/January 2018.
2. My back of the napkin estimate is that well over 1/2 of the “National Collection” is unreported! The executive branch is far and away the largest portion of the National Collection, and is almost completely “unreported.” See slide 5 of my 2018 Canadian Govinfo presentation for some context. Jim Jacobs’ chart cites the 2008 End of Term crawl for context on how many born-digital government publications are on the Web. The 2016 End of Term crawl nearly doubled the 2008 crawl and went from 160 million URLs to 310 million URLs harvested. I expect the 2020 End of Term crawl happening at the time of this post’s publication to far surpass 310 million!
3. FGI has written about “link rot,” “content drift,” and other issues which make it difficult to collect and preserve born-digital information.
Appendix: how to fill out the askGPO form
The askGPO form can be used for single documents or for reporting multiple documents, for example, those listed on an agency’s publications index page. See below for the steps to filling out the askGPO form. If a site is extremely large and/or complex (e.g., the Office of the Director of National Intelligence (ODNI) reports site), send the URL and description of the site to the GPO Web archiving team at FDLPwebarchiving AT gpo DOT gov.
- Log in to ask.gpo.gov (This will automatically fill in your contact information and depository library number in the form if you have used the system before);
- Click on “Federal Depository Library Program”;
- Select category “Fugitive Publications” (which will soon be changed to “unreported publications”);
- Choose single publication or multiple publications (there’s an Excel template if you prefer to collect multiple documents and submit them all at once!);
- Enter title, publishing agency, publication URL, format (other fields are not required). Use your best guess if you are not sure;
- Upload PDF file as attachment (not required but helpful for GPO staff to have the document “in hand” when cataloging);
- Add any additional context that you think may aid GPO staff;
- Do the reCAPTCHA “I’m not a robot” test;
- Submit the document(s)!
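If you collect candidates over a month and submit them in one batch, it can help to keep your own log and check it before filling in the form. The following is only a minimal sketch in Python: it uses the four required fields named in the steps above, but the column labels (`title`, `publishing_agency`, `publication_url`, `format`) are hypothetical and are not taken from GPO's actual spreadsheet template, so you would still need to copy the values into that template or the form yourself.

```python
import csv
import io

# The four required fields named in the steps above. The column labels are
# hypothetical; they are NOT the headers of GPO's actual spreadsheet template.
REQUIRED_FIELDS = ["title", "publishing_agency", "publication_url", "format"]


def missing_fields(record):
    """Return the required fields that are absent or blank in one record."""
    return [f for f in REQUIRED_FIELDS if not str(record.get(f, "")).strip()]


def to_csv(records):
    """Serialize complete records to CSV text; raise if any record is incomplete."""
    for i, record in enumerate(records):
        gaps = missing_fields(record)
        if gaps:
            raise ValueError(f"record {i} is missing: {', '.join(gaps)}")
    buffer = io.StringIO()
    # extrasaction="ignore" drops any extra keys (e.g. personal notes) from the output.
    writer = csv.DictWriter(buffer, fieldnames=REQUIRED_FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

Calling `to_csv(...)` on a list of dictionaries returns CSV text you can save and open in a spreadsheet; a record with a blank required field raises an error so you can fix it (or enter your best guess) before submitting.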
“Documents obtained by InvestigateWest identify at least 46 reports from almost every program within the U.S. Energy Department’s energy efficiency and renewables office and seven national labs that have been stalled, downgraded or spiked.”
Here’s an oddity. On the Department of Labor’s blog, there was a post on September 6, 2016 titled “What is the ‘Real’ Unemployment Rate?” that described the “huge array of measures, which together provide a comprehensive picture of the state of job opportunities” in the US. As you’ll see if you click on that link, the post is now “404 page not found.” You’ll not find the post in the blog’s archive for September 2016 either. However, the post was archived by the Internet Archive on October 17, 2016, the last time that IA crawled the blog. So sometime between October 2016 and today (February 16, 2017) that post was scrubbed from the Department of Labor’s blog.
What’s stranger is that the archived site showed 26 posts in September 2016, but the live site’s blog archive for September 2016 shows only 10 posts. Unfortunately, IA didn’t crawl the monthly archive URLs, so there’s no way to know what the missing posts were about. There are also discrepancies for other months (e.g., the archived site shows 30 posts in August 2016, while the live site shows 17 posts!).
There’s nothing that I can discern in this one found post that could be considered controversial. It’s not like the CRS report, suppressed in 2012, that found no correlation between the top tax rates and economic growth, thereby destroying a key tenet of conservative economic theory. The post was written by Dr. Heidi Shierholz, the department’s chief economist.
So what gives? Why is the Department of Labor disappearing selected blog posts? We’ll let you know if we find out.
This is why US government information needs to be preserved off of .gov servers by FDLP libraries and other non-governmental organizations. It’s not enough that each agency has an Inspector General. Each agency should have one or more libraries collecting, preserving and giving access to its information *regardless* of political embarrassment or any other excuse for government information being deleted and lost.
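The check described in this post (is there a Wayback Machine capture of a page that has vanished from the live site?) can be partly automated. Below is a minimal sketch in Python using the Internet Archive's Wayback Machine Availability API at `https://archive.org/wayback/available`. The function names are my own, and the page URL in the usage note is a placeholder, not the actual address of the Department of Labor post.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the API query URL; timestamp is an optional YYYYMMDD[hhmmss] string."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_API + "?" + urlencode(params)


def closest_snapshot(response):
    """Pull the closest capture out of a parsed API response, or return None."""
    snap = response.get("archived_snapshots", {}).get("closest")
    if snap and snap.get("available"):
        return {"url": snap["url"], "timestamp": snap["timestamp"]}
    return None


def lookup(page_url, timestamp=None):
    """Query the live API (requires network access)."""
    with urlopen(availability_url(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

For example, `lookup("example.gov/blog/some-post", "20161017")` would return the capture closest to October 17, 2016, or `None` if the Internet Archive holds no capture of that page. A `None` result for a page that is also gone from the live site is a good reason to look for another copy and report it.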
The CIA inspector general’s office, the spy agency’s internal watchdog, has acknowledged it “mistakenly” destroyed its only copy of a mammoth Senate torture report at the same time lawyers for the Justice Department were assuring a federal judge that copies of the document were being preserved, Yahoo News has learned.
Although other copies of the report exist, the erasure of the controversial document by the CIA office charged with policing agency conduct has alarmed the U.S. senator who oversaw the torture investigation and reignited a behind-the-scenes battle over whether the full unabridged report should ever be released, according to multiple intelligence community sources familiar with the incident.
The deletion of the document has been portrayed by agency officials to Senate investigators as an “inadvertent” foul-up by the inspector general. In what one intelligence community source described as a series of errors straight “out of the Keystone Cops,” CIA inspector general officials deleted an uploaded computer file with the report and then accidentally destroyed a disk that also contained the document, filled with thousands of secret files about the CIA’s use of “enhanced” interrogation methods.
Docs of the week: Ferguson Grand Jury, 100 years of INS annual reports, and the historic Moynihan Report
Here at Stanford libraries, my colleague Kris Kasianovitz and I are busy putting context to the *massive* haystack that is the Internet — and we could use some help (want to be a lostdocs collector?!)! Below are just a few of the documents we’ve collected in the last week, stored in our Stanford Digital Repository and made accessible through our library catalog.
1) The Negro family, the case for national action, AKA the Moynihan Report. This document came to me from a recent New Yorker article, “Don’t Be Like That: Does black culture need to be reformed?” by Kelefa Sanneh. The article, a review of a new anthology called “The Cultural Matrix: Understanding Black Youth,” contextualized the sociology and cultural history of being black in America, describing in detail the ground-breaking work of Daniel Patrick Moynihan, who trained as a sociologist and was later well known as the liberal Senator from NY. As Sanneh notes, the Moynihan Report (originally printed in a run of 100, with 99 of them locked in a vault) was leaked to the press, causing the Johnson administration to release the entire document. Moynihan’s overarching theme was “the deterioration of the Negro family” and he called for a national program to “strengthen the Negro family.”
2) Annual Report of the Immigration and Naturalization Service. This one started out as a research consultation. A student wanted to analyze this report over the 100+ years that it’s been published. She found that the Immigration and Naturalization Service had digitized their historic run, but for some reason had taken the link down from their site and not restored it for over 2 weeks. I contacted INS and got the digitized documents restored, then downloaded them, deposited them in the Stanford Digital Repository (SDR) and had the PURL added to our bibliographic record. The added benefit of collecting this digital annual report is that it makes it easier for future users to access a publication chock full of important statistics. Our paper collection is shelved in several different areas of the US documents collection because INS has shifted among different agencies over the years (causing its call# to change over time), from Treasury (call# T21.1:) to Labor (call# L3.1: and L6.1:) to Justice (call# J21.1:) to Homeland Security (call# HS4.200).
3) Documents from the Ferguson Grand Jury. Ferguson has been in the news over the last year because of the fatal shooting of African American youth Michael Brown by police officer Darren Wilson and the ensuing protests it sparked. This important historic series of 105 Missouri state documents from the Grand Jury was released via Freedom of Information requests from CNN. Some of our government information colleagues around the country wondered online how to collect and preserve these documents for posterity and future researchers. Luckily, SUL is one library able to collect and preserve historically important born-digital government documents.
The overwhelming majority of state, local, US and international government documents these days are born-digital. Here at Stanford libraries, we continue to look for ways to maintain and expand both our historic and born-digital documents collections. Self-deposit will no doubt be one strategy among several (including Web archiving, LOCKSS and future initiatives) as we look to serve the information needs of citizens, faculty, students and researchers.