
Tag Archives: newspapers

Our mission

Free Government Information (FGI) is a place for initiating dialogue and building consensus among the various players (libraries, government agencies, non-profit organizations, researchers, journalists, etc.) who have a stake in the preservation of and perpetual free access to government information. FGI promotes free government information through collaboration, education, advocacy and research.

Dodging the memory hole

Abbey Potter’s comments about preserving digital news are also very relevant to the preservation of government information.

Potter is the Program Officer with the National Digital Information Infrastructure and Preservation Program (NDIIPP). In her post on The Signal blog, she elaborates on her closing keynote address at the Dodging the Memory Hole II: An Action Assembly meeting in Charlotte, NC, last month.


She quotes a presentation by Andy Jackson of the UK Web Archive in which he addresses the questions: “How much of the content of the UK Web Archive collection is still on the live web?” and “How bad is reference rot in the UK domain?”

By sampling URLs collected in the UK Web Archive, Jackson examined URLs that have moved, changed, or gone missing. He analyzed both link rot (a file gone missing) and content drift (a file that has changed since being archived). He shows that 50 percent of content had gone, moved, or changed so as to be unrecognizable in only one year. After three years the figure rose to 65 percent.
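To make the distinction between link rot and content drift concrete, here is a minimal sketch in Python. It is not Jackson's actual method: the similarity measure (`difflib.SequenceMatcher`), the 0.5 threshold, the function names, and the sample data are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical threshold: below this similarity the live page is treated as
# "changed so as to be unrecognizable". The value is an assumption.
DRIFT_THRESHOLD = 0.5


def classify(archived_text, live_status, live_text):
    """Label one sampled URL as 'link rot', 'content drift', or 'ok'."""
    # Link rot: the live URL no longer returns a successful response.
    if live_status is None or live_status >= 400:
        return "link rot"
    # Content drift: the URL resolves, but its text no longer resembles
    # the archived copy.
    similarity = SequenceMatcher(None, archived_text, live_text or "").ratio()
    if similarity < DRIFT_THRESHOLD:
        return "content drift"
    return "ok"


def rot_rate(samples):
    """Fraction of sampled URLs that are gone or unrecognizably changed."""
    labels = [classify(*sample) for sample in samples]
    return sum(label != "ok" for label in labels) / len(labels)


# Made-up sample: (archived text, live HTTP status, live text).
samples = [
    ("Council minutes for March", 200, "Council minutes for March"),
    ("Council minutes for March", 404, None),
    ("Council minutes for March", 200, "PARKED DOMAIN - BUY NOW"),
    ("Annual report 2013", 200, "Annual report 2013 (updated)"),
]
print(rot_rate(samples))
```

A real survey would fetch each live URL and would probably use a more robust similarity measure, but the two failure modes separate the same way: a missing response is link rot, and a response that no longer resembles the archived copy is content drift.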

Potter says that it is safe to assume that the results would be similar for newspaper content on the web. It would probably also be similar for U.S. government web sites.

What can we learn from this and what can we do? For newspapers, Potter says, libraries have acquisition and preservation methods that are too closely linked to physical objects and that too often exclude digital objects. This results in libraries having gaps in their collections – “especially the born-digital content.” She summarizes the problem:

Libraries haven’t broadly adopted collecting practices so that they are relevant to the current publishing environment which today is dominated by the web.

This sounds exactly like what is happening with government information.

First, because GPO has explicitly limited actual deposit of government information to so-called “tangible” products (Superintendent of Documents Policy Statement 301 [SOD 301]). This policy does exactly what Potter says is wrong: it establishes collecting practices that are not relevant to the current publishing environment. (See more on the effects of SOD 301 here.)

Second, because most of the conversation within the FDLP in the last few years has been about our historic paper collections rather than about the real digital preservation issue we should be facing: born-digital government information. (See Born-Digital U.S. Federal Government Information: Preservation and Access.)

As Potter says, “We have clear data that if content is not captured from the web soon after its creation, it is at risk.” And, “The absence of an acquisition stream for this [born-digital] content puts it at risk of being lost to future library and archives users.”

Potter outlines a plan of action for digital newspaper information that is surprisingly relevant for government information. She suggests that libraries should establish relationships (and eventually agreements) with the organizations that create, distribute, and own news content. That sounds like exactly what FDLP libraries have always done for 200+ years with paper and should be doing, could be doing, with digital government information today. There is no legal or regulatory barrier to GPO depositing FDLP digital files with FDLP libraries; indeed, GPO is already doing this de facto with its explicit actions that allow “USDocs” private LOCKSS network partners to download FDsys content.

Potter also recommends web archiving as another promising strategy. Since many agencies are reluctant to deposit digital content with FDsys, and because they are allowed by law to refrain from doing so, web archiving is a practical alternative, even if it is imperfect. Indeed, GPO runs its own web harvesting program. Although some libraries also do web harvesting that includes U.S. Federal government web sites, more needs to be done in this area. (See: Webinar on fugitive documents: notes and links.)

I find it ironic that libraries are not at least experimenting with preserving born-digital government information. It is difficult to find an article about library projects that does not assert scarcity of funds or high barriers of copyright to overcome in digital library projects. So, why not use born-digital government information as a test bed for preserving digital content? The FDLP agreements and commitments are already in place, most of the content is public domain, and communities of interest for the content already exist. FDLP libraries could start today by building digital library collections and test-bed technology for government information and later expand to other more difficult collections and build on a base of experience and success. The fact that this would help our designated communities, preserve essential information, and further the goals of the FDLP would be welcome side-effects.

State Agency Databases Project: 2014 Statistics

At the beginning of each year, I (Daniel) compile yearly statistics for the pages of the State Agency Databases Project at http://wikis.ala.org/godort/index.php/State_Agency_Databases. Here are a few highlights from 2014.

State Pages

We had five states top 10,000 visits a year, led by Missouri with 22,069 visits.

1. Missouri (Annie Moots)           22069
2. Florida (Wilhelmina Randtke)     17558
3. California (Joel Rane)           14012
4. Ohio (Kirstin Krumsee)           12364
5. Alaska (Daniel Cornwall)         10583

Virginia had the fewest visits at 1638, but even this state page was visited an average of 4.5 times a day in 2014.
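The per-day figure is simply the yearly total divided by the 365 days of 2014. As a quick check, here is a small Python sketch that applies the same arithmetic to the state totals listed above (the function and variable names are my own, chosen for illustration):

```python
DAYS_IN_2014 = 365  # 2014 was not a leap year

# Yearly visit totals as reported in this post.
visits_2014 = {
    "Missouri": 22069,
    "Florida": 17558,
    "California": 14012,
    "Ohio": 12364,
    "Alaska": 10583,
    "Virginia": 1638,
}


def visits_per_day(total, days=DAYS_IN_2014):
    """Average daily visits, rounded to one decimal place."""
    return round(total / days, 1)


for state, total in visits_2014.items():
    print(f"{state}: {visits_per_day(total)} visits/day")
```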

Subject Pages

Here are all eight of our subject collection pages, ranked by number of visits received in 2014:
1. Prisoner Locater Tools                       10373
2. Health Practitioner Databases A-M             7802
3. Historical Media databases                    2441
4. Biographical databases                        2251
5. Historical Newspaper and Magazine Indexes     2124
6. Health Practitioner Databases N-Z             1893
7. Official records databases                    1711
8. Museum Collection databases                    986

If you’d like to establish a new subject collection on the State Agency Databases project or would like to build a subject collection on your own site using project links, let me know.

If you are interested in full project statistics from 2011 forward visit:

If you have questions or comments about these statistics, please leave a comment here or e-mail me.

State Agency Databases Activity Report 4/6/2014

It was a busy week for the volunteers at the State Agency Databases Project at http://wikis.ala.org/godort/index.php/State_Agency_Databases.


Today’s featured database is from Susanne Caro who maintains the Montana page:

Montana Field Guides http://fieldguide.mt.gov/default.aspx

“These guides and this website are a collaborative effort between the Montana Natural Heritage Program and Montana Fish, Wildlife and Parks.  The Animal Field Guide provides information on identification, habitat, ecology, reproduction, range, and distribution of Montana’s animals; new features include a hierarchal approach to finding an animal of interest, thumbnail photos of the animals and additional links.  The Plant Field Guide offers information on plant species of concern, including references and photographs.” –FWP description



See the full story of the last week’s changes by visiting http://tinyurl.com/statedbs. Below are some highlights of the week.






Security Interest Filings – To search for information on security interest filings made with the State Department of Assessments & Taxation.  One can search by the business name, by individual’s name and/or by the file number.

NEW YORK (Michael Tatonetti)

Child Care Facility – Search for child care facilities. Search by license/registration ID, facility name, type, county, or zip code. You can also limit your search by facilities that administer medication and facilities that offer non-traditional hours of care.

VIRGINIA (Louise Buckley)

Virginia Natural Heritage Database Search – Users can generate lists of plants and animal species that occur in specific counties, watersheds, subwatersheds or regions such as the Cumberland Mountains or the Outer Coastal Plain. Searches can be done on individual or groups of resources, by scientific or common name, taxonomic group, federal or state legal status, and global or state rarity rank.

WISCONSIN (Mark Rozmarynowski)

Find Local Crime Victim Resources – from the Office of Victim Services. Resources for survivors of domestic violence, sexual assault, child abuse, drunk driving, homicide and other crimes. Resources searchable by county.

Historical Newspaper and Magazine Indexes

Maryland Online Digitized Newspapers – These online digitized newspapers are part of the Special Collections of the Maryland State Archives and date from the 18th century to the mid-20th century.  The newspapers are listed in alphabetical order along with the time period each covers.  Scroll down the page to see whether a particular newspaper title is on the list.  Please read the note on fair use at the bottom of the page.



MICHIGAN (Michael McDonnell)

Michigan School Report Card – Search schools by name, school district or zip code.  Search results provide a list of schools and links to their scores for student achievement in language and mathematics as well as indicators of school performance (teacher quality, student dropout rate). Former URL: https://baa.state.mi.us/ayp

NY Times publishes documents

More than once here at FGI we have lamented the fact that newspapers have not used the web to link to documents (of all kinds, not just government publications) that they cite. The New York Times is doing a better job of this than most.

I recently realized that they even have a server named documents.nytimes.com. I noticed this when following a story (Army History Finds Early Missteps in Afghanistan, By James Dao, December 30, 2009) about a new, unpublished Army history of the war in Afghanistan.

The report, “A Different Kind of War,” was “written by a team of seven historians at the Army’s Combat Studies Institute at Fort Leavenworth, Kan., and based on open source material, it is scheduled to be published by spring.” The Times posted the document online (http://documents.nytimes.com/a-different-kind-of-war#p=1), not as a PDF or other downloadable format but as a series of page images. I would certainly prefer to see the option of downloading the entire document and can’t see why the Times didn’t provide it. (There are no ads on the pages I viewed, so it isn’t a matter of forcing you to view an ad for every page you read.) Presumably the published version will be available for downloading and preservation, but it would be better if this version were also available for downloading and preservation. That would make it easier for scholars to use now and easier to compare changes when the final version is released.

I also noticed that, if you go to the root web directory of the Times documents web site (http://documents.nytimes.com) you are redirected to http://documents.nytimes.com/atom which is an RSS (actually “atom” — a similar format) feed of documents posted. That is excellent!
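A feed like this makes it easy for a library or an individual to keep track of newly posted documents. Here is a minimal Python sketch of reading an Atom feed with the standard library. The sample feed below is made up for illustration (it is not the actual content of the Times feed), and the function name is my own; only the Atom namespace comes from the Atom specification.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Atom syndication format.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample feed, NOT the real Times feed content.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example documents feed</title>
  <entry>
    <title>A Different Kind of War</title>
    <link href="http://documents.nytimes.com/a-different-kind-of-war"/>
    <updated>2009-12-30T00:00:00Z</updated>
  </entry>
</feed>"""


def list_documents(feed_xml):
    """Return (title, link) pairs for every entry in an Atom feed."""
    root = ET.fromstring(feed_xml)
    documents = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        link = entry.find("atom:link", ATOM_NS)
        # An entry may lack a link element; record None in that case.
        href = link.get("href") if link is not None else None
        documents.append((title, href))
    return documents


for title, href in list_documents(SAMPLE_FEED):
    print(title, href)
```

In practice you would download the feed text from the server first and pass it to the same function; anything beyond listing titles and links (such as saving the page images) would depend on how the site actually serves its documents.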

NY Times publishes some FOIA documents

In an investigation on how the Bush administration uses retired military officers to promote its message on the Iraq war, the New York Times successfully sued the Defense Department to gain access to 8,000 pages of e-mail messages, transcripts and records describing years of private briefings, trips to Iraq and Guantanamo and an extensive Pentagon talking points operation.

The story based on these documents (Behind Military Analysts, the Pentagon’s Hidden Hand, by David Barstow, New York Times, April 20, 2008) is supplemented online by “Audio, video and documents that show how the military’s talking points were disseminated” (How the Pentagon Spread Its Message) and a “Document Archive,” which allows users to read and download documents and parts of documents. Of the 8,000 pages, only a few are available online, but these include emails, a “Talking Points Memo,” excerpts from a Transcript of meeting with Mr. Rumsfeld, and a Pentagon document that reports “Monitoring of Analysts.”

Together, the audio-visual presentation and the documents are a small model for how newspapers could be using the power of the web to enhance their coverage and utility. I would certainly like to see all 8,000 pages online!

The story itself is a fascinating glimpse behind the scenes of the daily news.

Internal Pentagon documents repeatedly refer to the military analysts as “message force multipliers” or “surrogates” who could be counted on to deliver administration “themes and messages” to millions of Americans “in the form of their own opinions.”

…Analysts have been wooed in hundreds of private briefings with senior military leaders, including officials with significant influence over contracting and budget matters, records show. They have been taken on tours of Iraq and given access to classified intelligence. They have been briefed by officials from the White House, State Department and Justice Department, including Mr. Cheney, Alberto R. Gonzales and Stephen J. Hadley.