Tag Archives: End of term archive

Our mission

Free Government Information (FGI) is a place for initiating dialogue and building consensus among the various players (libraries, government agencies, non-profit organizations, researchers, journalists, etc.) who have a stake in the preservation of and perpetual free access to government information. FGI promotes free government information through collaboration, education, advocacy and research.

EDGI Releases Dataset of Federal Environmental Website Changes Under Trump

Thanks to the Environmental Data and Governance Initiative (EDGI) for releasing the Federal Environmental Web Tracker. This tool is a public dataset of searchable records of approximately 1,500 significant changes to federal agency environmental webpages under the Trump administration. According to EDGI, such changes were almost always precursors or responses to policy changes. The tracked changes came from a “list of 25,000 federal Web pages related to climate, energy, and the environment, including pages for 20 federal agencies such as EPA, NOAA, and NASA.” Here’s the Tracker’s explanatory page for more context and background.

EDGI continues to do important work in tracking the federal .gov Web domain. EDGI’s work goes hand in hand with the work of the End of Term Web Archive, which has harvested the .gov/.mil Web space every four years since 2008 and is now deep into its 2020 harvest. We’re still accepting nominations, so go to the End of Term Nomination Tool hosted by the University of North Texas (UNT) library. Help us collect a snapshot of the federal Web domain!

Today, the Environmental Data & Governance Initiative (EDGI) publishes searchable records of approximately 1,500 changes to federal agency environmental webpages under the Trump administration. For four years, EDGI’s website monitoring team has identified and catalogued significant changes to federal websites using their open source monitoring software. EDGI’s Federal Environmental Web Tracker makes records of significant changes publicly available.

The information that’s available on federal websites can have important policy implications. As EDGI has often reported over the past four years, changes to the information that’s available on federal websites are almost always precursors or responses to policy changes. Federal websites provide information that the public is likely to access before commenting on a proposed rule to learn about current regulatory efforts, the science underlying a new policy decision, or likely impacts of a proposed rule. The information found (or not found) on a federal website can impact public participation in regulatory processes.

In the weeks after Trump’s election in November 2016, newly-formed EDGI compiled a list of 25,000 federal web pages related to climate, energy, and the environment, including pages for 20 federal agencies such as EPA, NOAA, and NASA. First using proprietary software and then building and using novel open source software, EDGI has compared versions of these web pages weekly since January 2017. This new dataset represents the documented changes that EDGI’s website monitoring team flagged as significant in some way over the past four years.

EDGI’s Federal Environmental Web Tracker gives journalists, academic researchers, and the public data that can be used to provide insight, documentation, and analysis of the information policies and priorities of the Trump administration.

The Federal Environmental Web Tracker will be updated quarterly as EDGI continues to monitor federal environmental websites.
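The weekly page comparison described above can be illustrated with a small sketch. This is not EDGI's monitoring software. It only assumes two saved text snapshots of the same page and uses Python's standard `difflib` module to list the lines that were added or removed. The function name `changed_lines` and the sample snapshot text are hypothetical.

```python
import difflib


def changed_lines(old_text: str, new_text: str) -> list[str]:
    """Return the lines added ("+ ") or removed ("- ") between two snapshots."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? "); drop those.
    return [line for line in diff if line.startswith(("+ ", "- "))]


if __name__ == "__main__":
    # Hypothetical snapshots of one page captured a week apart.
    earlier = "Climate change resources for communities\nContact us"
    later = "Local resources\nContact us"
    for line in changed_lines(earlier, later):
        print(line)
```

A line-level diff like this only surfaces raw changes. Deciding which changes are significant, as EDGI's team did, still takes human review.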

HT to InfoDocket!

Nominations sought for the U.S. Federal Government Domain End of Term 2020 Web Archive

It’s that time again, folks. The End of Term Archive is once again gearing up to harvest the .gov/.mil Web domain. For the End of Term 2020, the Library of Congress, University of North Texas Libraries, Internet Archive, Stanford University Libraries, and the U.S. Government Publishing Office (GPO) are joining efforts again, this time with new partners Environmental Data & Governance Initiative (EDGI) and the General Services Administration (GSA), to preserve public United States Government websites at the end of the current presidential administration ending January 20, 2021. This web harvest – like its predecessors in 2008, 2012, and 2016 – is intended to document the federal government’s presence on the World Wide Web during the transition of Presidential administrations and to enhance the existing collections of the partner institutions. This broad comprehensive crawl of the .gov domain will include as many federal .gov sites as we can find, plus federal content in other domains (such as .mil, .com, and social media content) and FTP’d datasets.

Here’s the official announcement asking for YOUR help. Please forward widely!

WE NEED YOUR HELP TO PRESERVE THE .GOV WEB DOMAIN!

How would YOU like to help preserve the United States federal government .gov/.mil Web domain for future generations? But, that’s too huge of a swath of Internet real estate for any one person or organization to preserve, right?!

Wrong! The volunteers working on the End of Term Web Archiving Project are doing just that. BUT WE NEED YOUR HELP!

And that’s where YOU come in. You can help the project immensely by nominating your favorite .gov website/document/dataset, other federal government websites, or governmental social media account with the End of Term Nomination Tool. You can nominate as many sites as you want. Nominate early and often! Win a prize for the most seed nominations!! Tell your friends, family and colleagues to do the same. Help us preserve the .gov domain for posterity, public access, and long-term preservation. Only YOU can help prevent … link rot!

The EPA’s Website after a year of climate change censorship

Here’s a good article from Time Magazine, “Here’s What the EPA’s Website Looks Like After a Year of Climate Change Censorship,” which accurately reports how the Trump Administration and EPA Administrator Scott Pruitt have changed, skewed or deleted government information from the EPA Website for crass political purposes. For more in-depth analysis of the issue of information scrubbing from federal websites, one should look to the work of the Environmental Data and Governance Initiative (EDGI) and especially their reports: “Changing the Digital Climate” and “The EPA Under Siege”.

According to former government officials and EPA staffers, the level of scrutiny is without precedent. In the hands of an administration that has eschewed facts for their alternative cousins, the agency’s site is increasingly unmoored from its scientific core.

“In my experience, new administrations might come in and change the appearance of an agency website or the way they present information, but this is an unprecedented attempt to delete or bury credible scientific information they find politically inconvenient,” Heather Zichal, a senior fellow at the Atlantic Council’s Global Energy Center, and previously President Barack Obama’s top White House adviser on energy and climate change, tells TIME.

The EPA’s site is now riddled with missing links, redirecting pages and buried information. Over the past year, terms like “fossil fuels”, “greenhouse gases” and “global warming” have been excised. Even the term “science” is no longer safe.

Christine Todd Whitman, the EPA Administrator under George W. Bush, says the overhaul is “to such an extreme degree that [it] undermines the credibility of the site”…

Of the more than 25,000 web pages tracked by the Environmental Data and Governance Initiative (EDGI) since Trump’s election, they say the EPA’s have been hit hardest. One section, which provided local communities with resources for combating climate change, disappeared for months only to resurface heavily redacted, including just 175 of its 380 pages.

via The EPA’s Website After a Year of Climate Change Censorship | Time.

2016 End of Term Web Archive is now available

The 2016 End of Term .gov/.mil web crawl is now available! We collected approximately 300TB of government websites, including over “70 million html pages, over 40 million PDFs and, towards the other end of the spectrum and for semantic web aficionados, 8 files of the text/turtle mime type” as well as roughly 100TB of public data via .gov FTP file servers! Thanks to everyone who participated in the project and to the thousands(!) of seed nominators, both individuals and those who came in via DataRefuge and EDGI tools and public events.

The End of Term Web Archive contains federal government websites (.gov, .mil, etc.) in the Legislative, Executive, or Judicial branches of the government. Websites that were at risk of changing (e.g., whitehouse.gov) or disappearing altogether during government transitions were captured. Local government websites, or any other site not part of the federal government domain, were out of scope.

The mystery of the suspended U.S. government Twitter accounts

Here’s a strange story unfolding. Our End of Term project friend Justin Littman from George Washington University was doing some maintenance on the official US government Twitter accounts that had been captured for the End of Term crawl and noticed that a number of .gov Twitter accounts had been suspended. Account suspension happens when an account is sending spam or has been hacked or compromised in some way. I’ll let Justin explain below, but I’ll be really interested to find out how the folks running the U.S. Digital Registry are going to respond.

…When collecting a large number of Twitter accounts, the list of accounts requires occasional maintenance, as sometimes Twitter accounts are deleted or protected. It’s understandable how U.S. government accounts would be expected to change over time as agencies and initiatives change. However, when I was doing maintenance earlier today, I noticed something odd: a number of the accounts were suspended, not deleted or protected.

Curious, I exported the tweets from some of the suspended accounts. Really odd – the tweets were in Russian.

Then I checked back in the U.S. Digital Registry. The U.S. Digital Registry is supposed to be the authoritative list of the official U.S. government social media accounts…

…Still, there are some immediate take-aways:

  • While the U.S. Digital Registry is a very important service for promoting trust and transparency in the U.S. government and invaluable for those of us attempting to archive the web presence of the U.S. government, it desperately needs a scrubbing and quality control processes put into place.
  • The U.S. government needs to take full advantage of verified status on Twitter (i.e., the blue check), perhaps even requiring it.
  • Twitter needs to deal with the problem of recycled screen names. A person or organization should be able to delete an account without the fear of being impersonated. In particular, for organizations such as government agencies, this is critical.

via Suspended U.S. government Twitter accounts • Social Feed Manager.
