Tag Archives: End of term archive
Call to arms: What government information librarians can do to help save critical federal information from being lost
What is a government information librarian to do during these times when the very public information we base our daily work around is being redacted, cleansed, and deleted? First, make yourself aware of all the work that is already being done (and has been underway since 2008 and before). Our friend and PEGI colleague Lynda Kellam has helpfully created a growing Google document of the efforts currently underway to collect and preserve federal government information and data.
Then, what can each of us do, at our libraries, to make sure that government information, once published, is collected, described, preserved, and made freely and publicly available? Here are some things that EVERY government information librarian (regardless of the size of the organization they work for) can do.
1) Send in “unreported” documents to GPO. The executive branch is rife with unreported documents that should be part of the FDLP but have slipped through the ever-growing cracks. We should be absolutely flooding GPO with unreported documents for them to catalog and preserve. It’s quick and easy to do by following the directions on the FDLP website. And the form includes a space to attach a digital file so make sure to do that as well.
EVERY FDLP librarian should agree to track at least one federal agency and submit at least 10 unreported documents to GPO every week. We can’t assure long-term preservation of government information unless we ALL do this. Perhaps GPO or GODORT can help coordinate this? Maybe we can use govdoc-l to announce and update our commitments.
2) Use the Internet Archive’s “save page now” tool to save every .gov page that you visit. IA will crawl and preserve every one of these in the Wayback Machine. It’s quick and easy (and fun!) to copy/paste the url into the “save page now” tool and watch the Wayback Machine do its work! It’ll even save that page to your own personal web archive (if you’ve created a free “library card” and are logged in to the site!), so you can build your own web archive of important websites. You can also install their free browser extensions to save web pages with a single click. See something, save something! GPO, LC, and NARA can’t do it by themselves.
3) Donate to the Internet Archive. (We are NOT IA staff!) It’s time to put our money where our livelihoods are. The Internet Archive does yeoman’s work to preserve the web. They have long put their valuable resources, infrastructure, technology, and staff time toward making sure the End of Term Archive succeeds in collecting as much of the .gov/.mil web domain as it can. And they have started a new project called Democracy’s Library to collect the world’s born-digital, web-based government information and digitize historic government information. So you NEED to pitch in to help their efforts. Skip one or two Starbucks coffees and send them $10 a month. Every little bit helps them continue their valuable work.
4) If you work for a library or organization that has an institutional repository and/or digital infrastructure, then advocate with your administration to put that repository and infrastructure toward the common good of hosting local copies of documents and mirroring important data sets.
5) And if your institution has some budgetary and infrastructure wherewithal (and especially if your institution is already a LOCKSS member!), please consider joining the LOCKSS-USDOCS project. The project just had its 16th birthday of distributed preservation of all content on GOVINFO (and FDsys and GPOaccess before that!).
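For item 2 above, here is a minimal Python sketch of how a list of pages could be queued for the “save page now” tool. It assumes the tool accepts a page url appended to `https://web.archive.org/save/`; the function names, the sample urls, and the User-Agent string are hypothetical and only for illustration. The Internet Archive may rate-limit or change this behavior, so treat this as a starting point rather than a supported workflow.

```python
from urllib.parse import urlsplit
from urllib.request import Request, urlopen

# Assumed url pattern for the "save page now" tool (not confirmed by this post).
SAVE_PREFIX = "https://web.archive.org/save/"


def is_federal_host(url: str) -> bool:
    """Return True when the url's hostname ends in .gov or .mil."""
    host = (urlsplit(url).hostname or "").lower()
    return host.endswith(".gov") or host.endswith(".mil")


def save_page_now_url(url: str) -> str:
    """Build the url that asks the Wayback Machine to capture a page."""
    return SAVE_PREFIX + url


def request_capture(url: str, timeout: float = 60.0) -> int:
    """Send the capture request and return the HTTP status code.

    This makes a real network call, so it is not run in the demo below.
    """
    req = Request(
        save_page_now_url(url),
        headers={"User-Agent": "govinfo-saver-sketch/0.1"},  # hypothetical name
    )
    with urlopen(req, timeout=timeout) as resp:
        return resp.status


if __name__ == "__main__":
    # Hypothetical pages a librarian visited today.
    pages = [
        "https://www.epa.gov/climate-change",
        "https://www.defense.mil/",
        "https://example.com/not-federal",
    ]
    for page in pages:
        if is_federal_host(page):
            # Print the capture url; call request_capture(page) to submit it.
            print(save_page_now_url(page))
```

Pages hosted outside .gov/.mil are skipped here only to keep the sketch short; federal content on .edu, .org, or .com hosts deserves saving too.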
In short, be a librarian! Use every method open to you to participate in preserving government information that your users rely on. Dedicate time and energy (and the time and energy of your library) to long-term access to government information.
These are short-term strategies that all of us can pursue RIGHT NOW. We still need to use this historical moment as an opportunity to develop a long-term strategy for building a Digital Preservation Infrastructure for government information.
Finally: GPO, if you’re listening, please store a copy of EVERY document you catalog and provide a link to your stored copy. Whole websites are being deleted from the web and the only way to assure long-term access is to store a copy. Don’t POINT to a document when you should be COLLECTING every document, which is your legal and statutory purview.
Federal information scrubbing has begun. Please support the End of Term Archive and Environmental Data & Governance Initiative (EDGI)
It seems that the scrubbing of public information and communication from Federal government websites has begun. Along with erasing information that it does not like (mostly centered on climate change and the environment, science, health, DEI, civil rights, immigration, and the like), the new administration has also signed a raft of executive orders overturning policies from the previous administration (here’s a tracker of all the executive orders signed in recent days) and purged up to 18 inspectors general (IGs) from across the federal government. IGs are meant to be independent government watchdogs who conduct investigations and audits into malfeasance, fraud, waste, or abuse by government agencies and their personnel. So it seems pretty clear that the new administration wants to a) hide or delete information it doesn’t agree with; and b) make sure there are no watchdogs in place within agencies who could report on fraud, waste, or abuse by the new personnel being put in place by President Trump.
Luckily, there are librarians and NGO watchdog groups on the case. Ben Amata, Government Information Librarian at Sacramento State University, has started to track the issue in his new libguide Government Information: Eliminated, Suspended, Etc. His contact information is on the libguide, so please send him any news articles about the disappearance of federal information.
Our friends at the Environmental Data & Governance Initiative (EDGI) are busy archiving public environmental data, as they have done since late 2016, ahead of the first Trump administration.
The End of Term Archive is once again harvesting and preserving the .gov/.mil web domain as it has done since 2008 regardless of each president’s political party.
And all kinds of non-profit organizations like the umbrella watchdog group Democracy2025 are gearing up to “analyze Trump-Vance administration actions, support legal challenges, and provide resources for the pro-democracy community.”
Here are but a few examples of news items I’ve seen in the last few days. Feel free to leave us a comment pointing to other examples.
Scope of the communications hold on federal health agencies expands. Chris Dall, January 23, 2025. Center for Infectious Disease Research & Policy (CIDRAP), University of Minnesota.
The memo, sent to heads of operating divisions on January 21, orders recipients to “Refrain from publicly issuing any documents (e.g., regulation, guidance, notice, grant announcement) or communication (e.g., social media, websites, press releases, and communication using listservs) until it has been reviewed and approved by a presidential appointee,” through February 1.
The memo also bars participation in any public speaking engagements and sending documents intended for publication in the Office of the Federal Register.
Trump’s anti-DEI order yanks air force videos of Tuskegee Airmen and female pilots. Reuters (25 Jan 2025)
“…Donald Trump’s order halting diversity, equity and inclusion initiatives has led the US air force to suspend course instruction on a documentary about the first Black airmen in the US military, known as the Tuskegee Airmen, a US official said on Saturday.
Another video about civilian female pilots trained by the US military during the second world war, known as Women Airforce Service Pilots, or Wasps, was also pulled, the official said…”
Trump pardoned the January 6 convicts. Now his DOJ is wiping evidence of rioters’ crimes from the internet. Donie O’Sullivan and Katelyn Polantz, CNN (January 26, 2025)
“As President Donald Trump this week sought to rewrite the history of his supporters’ attack on the US Capitol, a database detailing the vast array of criminal charges and successful convictions of January 6 rioters was removed from the Department of Justice’s website.
The searchable database served as an easily accessible repository of all January 6, 2021, cases prosecuted by the US Attorney’s Office for the District of Columbia.
…Parts of the database were still accessible Sunday through the Internet Archive.
…The FBI — representing another leg of the Justice Department — also took offline its compendium of wanted Capitol rioters. Some of those individuals were fugitives or rioters who hadn’t been identified, and the FBI had posted images and other information of the suspects it was still seeking.
Thousands of pages that were part of the database now appear to be inaccessible. Details of January 6 cases are still accessible on the DOJ’s website in the form of press releases about charges and convictions. They are also still available through court records and services such as Pacer.”
Good Forbes piece on the End of Term crawl
Check out this recent piece in Forbes about the End of Term project (eotarchive.org). And if you’re moved to help out, you can nominate any federal government url through our handy nomination tool. There’s still time to help us save democracy’s information!
Meet The Citizens Racing To Save Government Websites From Vanishing. Leslie Katz, Forbes, Oct 23, 2024
…With the Nov. 5 election just a week away, they’re harvesting vast amounts of government data before the White House welcomes new residents or former ones in January. The information will live on in the End of Term Web Archive, a giant repository of federal government websites preserved for the historical record as one administrative term ends and a new one begins. Librarians, archivists and technologists across the country join forces every four years to donate time, effort and resources to what they dub the end-of-term crawl, with the resulting datasets available to the public for free…
“Citizens have a right to access information about what their government is doing in their name,” says James Jacobs, a government information librarian at Stanford University, an End of Term Web Archive project partner. “That’s why libraries have long collected these materials to make sure they are organized, preserved and easily accessible for the long term.”
End of Term crawl 2024 is now underway!
Well, it’s that time again. The 2024 End of Term web crawl of the federal .gov/.mil web space (and other domains 🙂 ) has begun. We have just posted our first public announcement on the Internet Archive blog.
As we have done since 2008 (NARA did the first comprehensive crawl in 2004), a group of volunteers from the Internet Archive, GPO, Library of Congress, NARA, University of North Texas, and Stanford will be doing a “comprehensive” web harvest of the Federal government’s web space. For more information and background on the project, see our home page at https://eotarchive.org/. These archives can be searched full-text via the Internet Archive’s collections search (https://web.archive.org/) and also downloaded as bulk data for machine-assisted analysis from the project site.
But MOST IMPORTANTLY, we need YOUR help! We are currently accepting nominations for websites to be included in the 2024 End of Term Web Archive. Submit a url nomination by going to our nomination tool (hosted by University of North Texas!) and clicking the big yellow “add a url” button in the top right:
https://digital2.library.unt.edu/nomination/eth2024/
We encourage you to nominate any and all U.S. federal government websites that you want to make sure get captured. We’re also interested in any and all urls of federal sites that are NOT hosted on .gov/.mil. There are lots of federal government sites hosted on .edu, .org, and even .com! That includes social media but also research labs and other private/public partnerships. We already have a solid list of top level domains (e.g., epa.gov, congress.gov, defense.mil, etc.). Nominating urls deep within .gov/.mil websites helps to make our web crawls as thorough and complete as possible. Prizes will be awarded for most url nominations by individuals and institutions!
So get to it! Help us do the most complete crawl we can and also assure that the sites, publications, videos, data, etc. that are most important to YOU make it into the archive!!
EDGI Releases Dataset of Federal Environmental Website Changes Under Trump
Thanks to the Environmental Data and Governance Initiative (EDGI) for releasing the Federal Environmental Web Tracker. This tool is a public dataset of searchable records of approximately 1,500 significant changes to federal agency environmental webpages under the Trump administration. These changes were almost always precursors or responses to policy changes, and they were drawn from a “list of 25,000 federal Web pages related to climate, energy, and the environment, including pages for 20 federal agencies such as EPA, NOAA, and NASA.” Here’s the Tracker’s explanatory page for more context and background.
EDGI continues to do important work in tracking the federal .gov Web domain. EDGI’s work goes hand in hand with the work of the End of Term Web Archive, which has harvested the .gov/.mil Web space every four years since 2008 and is now deep into its 2020 harvest. And we’re still accepting nominations, so go to the End of Term Nomination Tool hosted by the University of North Texas (UNT) library. Help us collect a snapshot of the federal Web domain!
Today, the Environmental Data & Governance Initiative (EDGI) publishes searchable records of approximately 1,500 changes to federal agency environmental webpages under the Trump administration. For four years, EDGI’s website monitoring team has identified and catalogued significant changes to federal websites using their open source monitoring software. EDGI’s Federal Environmental Web Tracker makes records of significant changes publicly available.
The information that’s available on federal websites can have important policy implications. As EDGI has often reported over the past four years, changes to the information that’s available on federal websites are almost always precursors or responses to policy changes. Federal websites provide information that the public is likely to access before commenting on a proposed rule to learn about current regulatory efforts, the science underlying a new policy decision, or likely impacts of a proposed rule. The information found (or not found) on a federal website can impact public participation in regulatory processes.
In the weeks after Trump’s election in November 2016, newly-formed EDGI compiled a list of 25,000 federal web pages related to climate, energy, and the environment, including pages for 20 federal agencies such as EPA, NOAA, and NASA. First using proprietary software and then building and using novel open source software, EDGI has compared versions of these web pages weekly since January 2017. This new dataset represents the documented changes that EDGI’s website monitoring team flagged as significant in some way over the past four years.
EDGI’s Federal Environmental Web Tracker gives journalists, academic researchers, and the public data that can be used to provide insight, documentation, and analysis of the information policies and priorities of the Trump administration.
The Federal Environmental Web Tracker will be updated quarterly as EDGI continues to monitor federal environmental websites.
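To make the idea of comparing page versions concrete, here is a minimal Python sketch that reports which lines were removed from or added to a page between two snapshots. This is NOT EDGI’s monitoring software; it uses only the `difflib` module from Python’s standard library, and the function name and sample text are made up for illustration.

```python
import difflib


def summarize_changes(old_text: str, new_text: str) -> dict:
    """Return the lines removed from and added to a page between snapshots."""
    # Normalize: strip whitespace and ignore blank lines.
    old_lines = [ln.strip() for ln in old_text.splitlines() if ln.strip()]
    new_lines = [ln.strip() for ln in new_text.splitlines() if ln.strip()]

    removed, added = [], []
    for entry in difflib.ndiff(old_lines, new_lines):
        if entry.startswith("- "):    # line only in the older snapshot
            removed.append(entry[2:])
        elif entry.startswith("+ "):  # line only in the newer snapshot
            added.append(entry[2:])
        # Lines starting with "  " are unchanged; "? " lines are diff hints.
    return {"removed": removed, "added": added}


if __name__ == "__main__":
    # Invented snapshots of a hypothetical agency page.
    last_week = "Climate change indicators\nContact us"
    this_week = "Contact us\nNew page notice"
    print(summarize_changes(last_week, this_week))
```

A real monitoring workflow would also need to fetch and store snapshots on a schedule and have people judge which changes are significant, which is the part of EDGI’s work that no diff tool replaces.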