Can we rely on trying to ‘harvest’ the web?
Dr. David S.H. Rosenthal, Chief Scientist at LOCKSS, and Kris Carpenter Negulescu of the Internet Archive recently organized a workshop on the problems of harvesting and preserving the Web as it evolves from a collection of linked HTML documents into a programming environment whose primary language is JavaScript.
David and Kris, with help from staff at the Internet Archive, compiled a list of 13 areas that are already causing problems for Web preservation:
Read more about this on David’s blog:
Tags: Web harvesting