End of Term crawl 2024 is now underway!

Well, it’s that time again. The 2024 End of Term web crawl of the federal .gov/.mil web space (and other domains 🙂) has begun. We have just posted our first public announcement on the Internet Archive blog.

As we have done since 2008 (NARA did the first comprehensive crawl in 2004), a group of volunteers from the Internet Archive, GPO, Library of Congress, NARA, University of North Texas, and Stanford will be doing a “comprehensive” web harvest of the federal government’s web space. For more information and background on the project, see our home page at https://eotarchive.org/. These archives can be searched full-text via the Internet Archive’s collections search (https://web.archive.org/) and can also be downloaded from the project site as bulk data for machine-assisted analysis.

But MOST IMPORTANTLY, we need YOUR help! We are currently accepting nominations for websites to be included in the 2024 End of Term Web Archive. Submit a URL nomination by going to our nomination tool (hosted by the University of North Texas!) and clicking the big yellow “add a url” button in the top right:

https://digital2.library.unt.edu/nomination/eth2024/

We encourage you to nominate any and all U.S. federal government websites that you want to make sure get captured. We’re also interested in any and all URLs of federal sites that are NOT hosted on .gov/.mil. There are lots of federal government sites hosted on .edu, .org, and even .com! That includes social media, but also research labs and other private/public partnerships. We already have a solid list of top-level domains (e.g., epa.gov, congress.gov, defense.mil, etc.). Nominating URLs deep within .gov/.mil websites helps to make our web crawls as thorough and complete as possible. Prizes will be awarded for the most URL nominations by individuals and institutions!
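If you are sitting on a long list of candidate URLs, a small script can help sort them into the groups described above before you head to the nomination tool. The following is only a rough sketch under stated assumptions: the `triage` function and its labels are made up for illustration, it is not part of the nomination tool, and it cannot tell whether a site is actually federal (state governments also use .gov), so you still need to check that by hand.

```python
from urllib.parse import urlsplit

# Suffixes the post treats as the core federal web space.
FEDERAL_SUFFIXES = (".gov", ".mil")


def triage(url):
    """Label a candidate URL (hypothetical helper, illustration only)."""
    # Accept bare hostnames such as "epa.gov" by assuming https.
    parts = urlsplit(url if "://" in url else "https://" + url)
    host = (parts.hostname or "").lower()

    if not host.endswith(FEDERAL_SUFFIXES):
        # Could be a federal site on .edu/.org/.com; verify manually.
        return "off .gov/.mil"

    # "Deep" here just means anything beyond the site's front page.
    is_deep = parts.path.strip("/") != "" or parts.query != ""
    return "deep .gov/.mil page" if is_deep else "top-level .gov/.mil site"


if __name__ == "__main__":
    candidates = [
        "epa.gov",
        "https://www.epa.gov/climate-change/data",
        "https://www.example.edu/federal-lab/",  # placeholder, not a real nomination
    ]
    for candidate in candidates:
        print(triage(candidate), candidate, sep="\t")
```

Going by the paragraph above, URLs labeled as deep pages or as off .gov/.mil are the ones most worth nominating, since the project already has a solid list of top-level sites.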

So get to it! Help us do the most complete crawl we can, and also ensure that the sites, publications, videos, data, etc. that are most important to YOU make it into the archive!

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).

