Tag Archives: digital preservation
PEGI Project publishes Environmental Scan of Government Information and Data Preservation Efforts and Challenges
I’m happy to announce that today the PEGI Project released their Environmental Scan of Government Information and Data Preservation Efforts and Challenges. PEGI commissioned the most capable Sarah Lippincott as consultant to write this report, a multimodal environmental scan of at-risk federal digital content. This free, open publication describes the landscape of initiatives within and outside of government that aim to disseminate and preserve government information and data. It describes government-led initiatives, from dissemination through official agency websites to publication on third-party platforms, and reviews a range of initiatives that have emerged in recent years outside of government, both those intended to address perceived gaps and vulnerabilities in the federal government’s curation initiatives and those that add value to publicly available information and datasets. The report also addresses existing policies and infrastructure undergirding both government-led and non-government initiatives. Each section contains representative examples of initiatives relevant to federal government information.
Preserving government information is a long-term responsibility that requires ongoing coordination and commitment. By surveying the current environment, defining key features of the problem space, and identifying gaps and pressing needs, this Environmental Scan contributes to the resources available to all who seek to plan cooperative solutions.
The Preservation of Electronic Government Information (PEGI) Project is a two-year IMLS grant-funded initiative to address national concerns regarding preservation of born-digital government information by cultural memory institutions for long-term public access and use.
In a recent post on the blog of the Web Science and Digital Libraries Research Group, Shawn Jones reports on research that is vital to all those interested in long-term access to government information.
- How well are the National Guideline Clearinghouse and the National Quality Measures Clearinghouse Archived? Shawn M. Jones, Web Science and Digital Libraries Research Group (July 15, 2018).
In the post, Jones reports on his research into how much of the content of two sites (more…)
Please join the PEGI Project for their May webinar. There’s a great list of speakers who will be talking about various efforts and projects to identify, collect, and preserve born-digital government information. Please RSVP and forward on to any of your colleagues and networks who may be interested. See you there!
Please join the PEGI Project for a webinar on Monday, May 14th, 2018 at 12:00pm EDT to hear directly from trailblazing organizations about projects underway to identify, collect, and preserve born-digital government information. Leading figures from these organizations will be on hand to discuss the advocacy and coordination necessary to make an impact, and they can answer your questions about more ways to contribute to national efforts at a local level.
To hear about the current state of preservation efforts and contribute your ideas and priorities, please RSVP at the following link: http://bit.ly/PEGIMayWebinarRSVP.
Heather Joseph, Executive Director, SPARC
Brandon Locke, Director of LEADR at Michigan State University & Founder & co-organizer of Endangered Data Week
Rachel Mattson, Curator of the Tretter Collection for GLBT Studies at the University of Minnesota Libraries & Founder/co-leader of the Digital Library Federation’s interest group on Government Records Transparency & Accountability
Bernard F. Reilly, President, Center for Research Libraries
Justin Schell, Director, Shapiro Design Lab & Member of EDGI (Environmental Data & Governance Initiative)
Bethany Wiggin, Founding Director, Penn Program in Environmental Humanities (PPEH)
Shari Laster, PEGI Project Steering Committee
If you have any questions or comments, please direct them to [email protected]
Good on GPO for cataloging this important declassified CAESAR series of 54 online titles from the CIA. These working papers belong to a larger set: this collection of “declassified analytic monographs and reference aids, designated within the Central Intelligence Agency (CIA) Directorate of Intelligence (DI) as the CAESAR, ESAU, and POLO series, highlights the CIA’s efforts from the 1950s through the mid-1970s to pursue in-depth research on Soviet and Chinese internal politics and Sino-Soviet relations.”
And what’s even better is that the persistent URL (PURL) in their Catalog of Government Publications (CGP) (https://purl.fdlp.gov/GPO/LPS87177) points not to the CIA’s site but to GPO’s permanent.access.gpo.gov server, which means that GPO actually captured a copy for local storage and control. And I just confirmed with Marcive that the bibliographic records will soon be pushed out through their Documents Without Shelves service! Now if GPO would just move all the content they have on permanent.access.gpo.gov into their govinfo.gov digital repository (which, unlike permanent.access, is going through the Trustworthy Digital Repository Audit and Certification), then all would be right with the world 🙂
Series summary: GPO has cataloged 54 online titles from a declassified CIA numbered series known as the CAESAR series. The Director of the CIA established Project CAESAR in 1952, and this series of working papers was published from 1953 to 1972. The purpose of Project CAESAR was to study the members of, and events affecting, the Soviet leadership hierarchy. The collection focuses on internal policies and politics.
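For anyone who wants to check where a PURL like the one above actually lands, here is a minimal Python sketch. It follows the redirects and reports whether the final host is a GPO-controlled server or the agency's own site. The function names and the list of hosts counted as GPO-controlled are my own assumptions for illustration, not anything GPO publishes.

```python
from urllib.parse import urlparse
from urllib.request import urlopen

# Hosts treated as GPO-controlled storage. This set is an assumption for
# illustration, built from the two servers named in the post.
GPO_HOSTS = {"permanent.access.gpo.gov", "govinfo.gov", "www.govinfo.gov"}


def is_gpo_hosted(url: str) -> bool:
    """Return True if the URL's host is one of the assumed GPO-controlled hosts."""
    host = (urlparse(url).hostname or "").lower()
    return host in GPO_HOSTS


def resolve_purl(purl: str, timeout: float = 10.0) -> str:
    """Follow redirects from a PURL and return the final URL (needs network access)."""
    with urlopen(purl, timeout=timeout) as response:
        return response.geturl()


if __name__ == "__main__":
    final_url = resolve_purl("https://purl.fdlp.gov/GPO/LPS87177")
    status = "GPO-hosted copy" if is_gpo_hosted(final_url) else "hosted elsewhere"
    print(final_url, "->", status)
```

Run over a list of PURLs, a check like this would give a rough count of how many catalog records point at locally held copies rather than at agency websites.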
The 2016 End of Term .gov/.mil web crawl is now available! We collected approximately 300TB of government websites, which include over “70 million html pages, over 40 million PDFs and, towards the other end of the spectrum and for semantic web aficionados, 8 files of the text/turtle mime type”, as well as approximately 100TB of public data via .gov FTP file servers! Thanks to everyone who participated in the project and the thousands(!) of seed nominators, both individuals and those who came in via DataRefuge and EDGI tools and public events.
The End of Term Web Archive contains federal government websites (.gov, .mil, etc.) in the Legislative, Executive, or Judicial branches of the government. Websites that were at risk of changing (e.g., whitehouse.gov) or disappearing altogether during government transitions were captured. Local government websites, or any other sites not part of the federal government domain, were out of scope.