Tag Archives: End of term archive
Vicky Reich and David Rosenthal receive CNI Paul Evans Peters Award for LOCKSS
Congratulations to Vicky and David, founders of LOCKSS, for receiving the Paul Evans Peters Award! This is a HUGE and justified honor for their lifetimes’ impactful work in libraries and digital preservation. They join other luminaries who have received the Peters Award, among them Tim Berners-Lee, Vint Cerf, Brewster Kahle, Paul Ginsparg, Daniel Atkins, Christine Borgman, Donald Lindberg, Herbert Van de Sompel, Francine Berman, Paul Courant, and Tony Hey. A veritable who’s who of the internet and libraries!
The Paul Evans Peters Award is a lifetime achievement award that “recognizes the most notable and lasting international achievements related to information technology and the creation and use of information resources and services that advance scholarship and intellectual productivity.” It is jointly awarded by the Coalition for Networked Information (CNI), Association of Research Libraries (ARL), and EDUCAUSE.
David and Vicky put up the lecture entitled “Lessons from LOCKSS” that they gave in receiving the Peters Award. It’s a fascinating look at their social AND technical work in developing LOCKSS and the history of digital preservation in libraries. I especially appreciate Vicky’s discussion about the importance of preserving government information and the spectrum of efforts that have gone into its preservation, including LOCKSS USDOCS, the End of Term Archive (EOT), and the Data Rescue Project, and how the concept of the Federal Depository Library Program (FDLP), with its tamper resistance and long-lasting impact on preserving the information of our democracy, is actually embedded in LOCKSS.
Call to arms: What government information librarians can do to help save critical federal information from being lost
What is a government information librarian to do during these times when the very public information we base our daily work around is being redacted, cleansed, and deleted? First, make yourself aware of all the work that is already being done (and has been under way since 2008 and before). Our friend and PEGI colleague Lynda Kellam has helpfully created a growing Google document of the efforts currently underway to collect and preserve federal government information and data.
Then, what can each of us do, at our libraries, to make sure that government information, once published, is collected, described, preserved, and made freely and publicly available? Here are some things that EVERY government information librarian (regardless of the size of the organization they work for) can do.
1) Send in “unreported” documents to GPO. The executive branch is rife with unreported documents that should be part of the FDLP but have slipped through the ever-growing cracks. We should be absolutely flooding GPO with unreported documents for them to catalog and preserve. It’s quick and easy to do by following the directions on the FDLP website. And the form includes a space to attach a digital file so make sure to do that as well.
EVERY FDLP librarian should agree to track at least one federal agency and submit at least 10 unreported documents to GPO every week. We can’t assure long-term preservation of government information unless we ALL do this. Perhaps GPO or GODORT can help coordinate this? Maybe we can use govdoc-l to announce and update our commitments.
2) Use the Internet Archive’s “save page now” tool to save every .gov page that you visit. IA will crawl and preserve every one of these in the Wayback Machine. It’s quick and easy – and fun! – to copy/paste the url into the “save page now” tool and watch the Wayback Machine do its work! And it’ll even save that page to your own personal web archive (if you’ve created a free “library card” and are logged in to the site!). You can create your own web archive of important websites. And you can install their free browser extensions to save web pages with a single click. See something, save something! GPO, LC, and NARA can’t do it by themselves.
3) Donate to the Internet Archive. (We are NOT IA staff!) It’s time to put our money where our livelihoods are. The Internet Archive does yeoman’s work to preserve the web. They have long put their valuable resources, infrastructure, technology, and staff time towards making sure the End of Term Archive is successful in collecting as much of the .gov/.mil web domain as they can. And they have started a new project called Democracy’s Library to collect the world’s born-digital, web-based government information and digitize historic government information. So you NEED to pitch in to help their efforts. Skip one or two Starbucks coffees and send them $10 a month. Every little bit helps them continue to do their valuable work.
4) If you work for a library or organization that has an institutional repository and/or digital infrastructure, then advocate with your administration to put that repository and infrastructure toward the common good of hosting local copies of documents and mirroring important data sets.
5) And if your institution has some budgetary and infrastructure wherewithal (and especially if your institution is already a LOCKSS member!), please consider joining the LOCKSS-USDOCS project. The project just had its 16th birthday of distributed preservation of all content on GOVINFO (and FDsys and GPOaccess before that!).
In short, be a librarian! Use every method open to you to participate in preserving government information that your users rely on. Dedicate time and energy (and the time and energy of your library) to long-term access to government information.
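For anyone who keeps a running list of pages to save, the “save page now” step can be sketched in a few lines of Python. This is only a minimal sketch: it builds the addresses you would open in a browser and sends nothing itself. It assumes the Wayback Machine's public `https://web.archive.org/save/` address and the `https://archive.org/wayback/available` lookup. The function names and the sample page are made up for illustration.

```python
from urllib.parse import quote, urlencode

# Assumed public Wayback Machine addresses; check them before relying on this.
SAVE_ENDPOINT = "https://web.archive.org/save/"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def save_page_now_url(page_url: str) -> str:
    """Build the address that asks the Wayback Machine to capture page_url."""
    # Leave ordinary URL punctuation alone; escape only characters such as spaces.
    return SAVE_ENDPOINT + quote(page_url, safe=":/?&=%")


def availability_url(page_url: str) -> str:
    """Build the lookup address that reports whether a capture already exists."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url})


if __name__ == "__main__":
    # Hypothetical working list; replace with the pages you actually rely on.
    for page in ["https://www.epa.gov/"]:
        print(save_page_now_url(page))
        print(availability_url(page))
```

Opening the first printed address in a browser does the same thing as pasting the page into the “save page now” box. The second shows whether a capture is already on file.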
These are short-term strategies for things that all of us can do RIGHT NOW, but we still need to use this current historical moment as an opportunity to develop a long-term strategy for building a Digital Preservation Infrastructure for government information.
Finally: GPO, if you’re listening, please store a copy of EVERY document you catalog and provide a link to your stored copy. Whole websites are being deleted from the web and the only way to assure long-term access is to store a copy. Don’t POINT to a document when you should be COLLECTING every document, which is within your legal and statutory purview.
Federal information scrubbing has begun. Please support the End of Term Archive and Environmental Data Governance Initiative (EDGI)
It seems that the scrubbing of public information and communication from Federal government websites has begun. Along with erasing information that it does not like (mostly centered on climate change and the environment, science, health, DEI, civil rights, immigration, and the like), the new administration has also signed a raft of executive orders overturning policies from the previous administration (here’s a tracker of all the executive orders signed in recent days) and purged up to 18 inspectors general (IGs) from across the federal government. IGs are meant to be independent government watchdogs who conduct investigations and audits into malfeasance, fraud, waste or abuse by government agencies and their personnel. So it seems pretty clear that the new administration wants to a) hide or delete information it doesn’t agree with; and b) make sure there are no watchdogs in place within agencies who could report on fraud, waste, or abuse by the new personnel being put in place by President Trump.
Luckily, there are librarians and NGO watchdog groups on the case. Ben Amata, Government Information Librarian at Sacramento State University, has started to track the issue in his new libguide Government Information: Eliminated, Suspended, Etc. His contact information is on the libguide, so please send him any news articles about the disappearance of federal information.
Our friends at the Environmental Data Governance Initiative (EDGI) are busy archiving public environmental data, as they did starting in 2016 ahead of the first Trump administration.
The End of Term Archive is once again harvesting and preserving the .gov/.mil web domain as it has done since 2008 regardless of each president’s political party.
And all kinds of non-profit organizations like the umbrella watchdog group Democracy2025 are gearing up to “analyze Trump-Vance administration actions, support legal challenges, and provide resources for the pro-democracy community.”
Here are but a few examples of news items I’ve seen in the last few days. Feel free to leave us a comment pointing to other examples.
Scope of the communications hold on federal health agencies expands. Chris Dall, January 23, 2025. Center for Infectious Disease Research & Policy CIDRAP, University of Minnesota.
The memo, sent to heads of operating divisions on January 21, orders recipients to “Refrain from publicly issuing any documents (e.g., regulation, guidance, notice, grant announcement) or communication (e.g., social media, websites, press releases, and communication using listservs) until it has been reviewed and approved by a presidential appointee,” through February 1.
The memo also bars participation in any public speaking engagements and sending documents intended for publication in the Office of the Federal Register.
Trump’s anti-DEI order yanks air force videos of Tuskegee Airmen and female pilots. Reuters (25 Jan 2025)
“…Donald Trump’s order halting diversity, equity and inclusion initiatives has led the US air force to suspend course instruction on a documentary about the first Black airmen in the US military, known as the Tuskegee Airmen, a US official said on Saturday.
Another video about civilian female pilots trained by the US military during the second world war, known as Women Airforce Service Pilots, or Wasps, was also pulled, the official said…”
Trump pardoned the January 6 convicts. Now his DOJ is wiping evidence of rioters’ crimes from the internet. Donie O’Sullivan and Katelyn Polantz, CNN (January 26, 2025)
“As President Donald Trump this week sought to rewrite the history of his supporters’ attack on the US Capitol, a database detailing the vast array of criminal charges and successful convictions of January 6 rioters was removed from the Department of Justice’s website.
The searchable database served as an easily accessible repository of all January 6, 2021, cases prosecuted by the US Attorney’s Office for the District of Columbia.
…Parts of the database were still accessible Sunday through the Internet Archive.
…The FBI — representing another leg of the Justice Department — also took offline its compendium of wanted Capitol rioters. Some of those individuals were fugitives or rioters who hadn’t been identified, and the FBI had posted images and other information of the suspects it was still seeking.
Thousands of pages that were part of the database now appear to be inaccessible. Details of January 6 cases are still accessible on the DOJ’s website in the form of press releases about charges and convictions. They are also still available through court records and services such as Pacer.”
Good Forbes piece on the End of Term crawl
Check out this recent piece in Forbes about the End of Term project (eotarchive.org). And if you’re so moved to help out, you can nominate any federal government url through our handy nomination tool. There’s still time to help us save democracy’s information!
Meet The Citizens Racing To Save Government Websites From Vanishing. Leslie Katz, Forbes, Oct 23, 2024
…With the Nov. 5 election just a week away, they’re harvesting vast amounts of government data before the White House welcomes new residents or former ones in January. The information will live on in the End of Term Web Archive, a giant repository of federal government websites preserved for the historical record as one administrative term ends and a new one begins. Librarians, archivists and technologists across the country join forces every four years to donate time, effort and resources to what they dub the end-of-term crawl, with the resulting datasets available to the public for free…
“Citizens have a right to access information about what their government is doing in their name,” says James Jacobs, a government information librarian at Stanford University, an End of Term Web Archive project partner. “That’s why libraries have long collected these materials to make sure they are organized, preserved and easily accessible for the long term.”
End of Term crawl 2024 is now underway!
Well, it’s that time again. The 2024 End of Term web crawl of the federal .gov/.mil web space (and other domains 🙂 ) has begun. We have just posted our first public announcement on the Internet Archive blog.
As we have done since 2008 (NARA did the first comprehensive crawl in 2004), a group of volunteers from the Internet Archive, GPO, Library of Congress, NARA, University of North Texas, and Stanford will be doing a “comprehensive” web harvest of the Federal government’s web space. For more information and background on the project, see our home page at https://eotarchive.org/. These archives can be searched full-text via the Internet Archive’s collections search (https://web.archive.org/) and also downloaded as bulk data for machine-assisted analysis from the project site.
But MOST IMPORTANTLY, we need YOUR help! We are currently accepting nominations for websites to be included in the 2024 End of Term Web Archive. Submit a url nomination by going to our nomination tool (hosted by University of North Texas!) and clicking the big yellow “add a url” button in the top right:
https://digital2.library.unt.edu/nomination/eth2024/
We encourage you to nominate any and all U.S. federal government websites that you want to make sure get captured. We’re also interested in any and all urls of federal sites that are NOT hosted on .gov/.mil (there are lots of federal government sites hosted on .edu, .org, and even .com! That includes social media but also research labs and other private/public partnerships). We already have a solid list of top-level domains (e.g., epa.gov, congress.gov, defense.mil, etc.). Nominating urls deep within .gov/.mil websites helps to make our web crawls as thorough and complete as possible. Prizes will be awarded for most url nominations by individuals and institutions!
So get to it! Help us do the most complete crawl we can and also assure that the sites/publications/videos/data, etc. that are most important to YOU make it into the archive!!
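If you are pulling together a list of urls to nominate, one way to spot the federal sites that live outside .gov/.mil is to sort your candidates by host name first. The Python below is a rough sketch, not part of the nomination tool: the function names and sample urls are hypothetical, and it assumes every url includes its `https://` prefix. Keep in mind that .gov also covers state and local governments, so the sort is only a first pass before you review the list by hand.

```python
from urllib.parse import urlsplit

# Host-name endings treated as the core government web space in this sketch.
GOV_SUFFIXES = (".gov", ".mil")


def is_gov_or_mil(url: str) -> bool:
    """Return True when the url's host name ends in .gov or .mil."""
    host = (urlsplit(url).hostname or "").lower()
    return host.endswith(GOV_SUFFIXES)


def split_candidates(urls):
    """Sort urls into (.gov/.mil hosts, everything else), keeping input order."""
    on_gov, elsewhere = [], []
    for url in urls:
        (on_gov if is_gov_or_mil(url) else elsewhere).append(url)
    return on_gov, elsewhere


if __name__ == "__main__":
    # Hypothetical candidates; example.org stands in for a federally funded lab site.
    candidates = [
        "https://www.epa.gov/",
        "https://www.defense.mil/",
        "https://example.org/federal-lab",
    ]
    on_gov, elsewhere = split_candidates(candidates)
    print("On .gov/.mil:", on_gov)
    print("Hosted elsewhere (easy to miss, worth nominating):", elsewhere)
```

The second list is the one to look at closely, since those sites are the least likely to be on the existing seed list.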