
Tag Archives: Transcription

Our mission

Free Government Information (FGI) is a place for initiating dialogue and building consensus among the various players (libraries, government agencies, non-profit organizations, researchers, journalists, etc.) who have a stake in the preservation of and perpetual free access to government information. FGI promotes free government information through collaboration, education, advocacy and research.

NARA and NOAA join Old Weather Project to crowdsource transcription of historic naval ship weather logs

According to today’s press release from NOAA, the National Archives (NARA) and NOAA are teaming up and joining the Old Weather Project, hosted at Zooniverse.org, to crowdsource the transcription of historic ships’ logs in order to extract critical environmental data. The Old Weather Project began over 2 years ago with British Royal Navy log books, and 16,400 volunteers have transcribed 1.6 million weather observations so far! Transcribed data produced by Old Weather volunteers will be integrated into existing large-scale data sets, such as the International Comprehensive Ocean Atmosphere Data Set (ICOADS). Human volunteers are so important in this case because Optical Character Recognition (OCR) technologies cannot currently recognize handwritten text.

Before there were satellites, weather data transmitters, or computer databases, there were the ship’s logs of Arctic sea voyages, where sailors dutifully recorded weather observations. Now, a new crowdsourcing effort could soon make the weather data from these ship logs, some more than 150 years old, available to climate scientists worldwide.

NOAA, National Archives and Records Administration, Zooniverse — a citizen science web portal — and other partners are seeking volunteers to transcribe a newly digitized set of ship logs dating to 1850. The ship logs, preserved by NARA, are from U.S. Navy, Coast Guard and Revenue Cutter voyages in the Arctic between 1850 and the World War II era.


Organizers hope to enlist thousands of volunteers to transcribe scanned copies of logbook pages via the Old Weather project. Information recorded in these logbooks will also appeal to scientists and professionals from other fields, including historians and genealogists, as well as current members and veterans of the U.S. Navy and Coast Guard.

[HT to Gary Price at InfoDocket for calling our attention to this project!]

Sunlight Foundation’s Transparency Corps Recruits People Amazon Turk Style

The Sunlight Foundation recently announced the creation of the Transparency Corps. Modeled after Amazon’s Mechanical Turk, the Transparency Corps aims to make it easy to harness small contributions from enthusiastic volunteers to advance government transparency.

From the June 30, 2009 Sunlight Foundation press release:

“Inspired by Amazon’s Mechanical Turk, Sunlight created Transparency Corps as a new way for people to volunteer to make government transparency a reality,” said Ellen Miller, executive director and co-founder of the Sunlight Foundation. “Now, when people ask ‘how can I help?’ Sunlight and future partners can provide micro-tasks that when aggregated, help solve research and data analysis problems when computers alone cannot properly scrutinize government information.”

Right now there are two projects:

Each time you complete a task, you get points. Those points add up and determine how you move up the transparency leaderboard. I joined up to see what a task would look like. For the earmarks task, I was presented with a PDF of a letter requesting funding for a local project and, to the right of the letter, a form to be filled in with data such as the amount requested, the title of the project and other requester information. You can see an example of one of the letters on Scribd.
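To give a sense of what a completed earmark task might look like once it becomes structured data, here is a minimal Python sketch. The field names, the sample values, and the majority-vote step for reconciling several volunteers' entries are all my own assumptions for illustration; I don't know how Sunlight actually stores or checks submissions.

```python
from collections import Counter
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class EarmarkEntry:
    """One volunteer's form submission for a single letter (hypothetical fields)."""
    project_title: str
    amount_requested: int  # whole dollars
    requester: str


def normalize(text: str) -> str:
    """Collapse whitespace and ignore case so trivial typing differences still match."""
    return " ".join(text.split()).casefold()


def consensus(entries: list[EarmarkEntry], min_votes: int = 2) -> dict[str, Optional[object]]:
    """Per field, keep the most common value if at least `min_votes` volunteers agree.

    Fields without enough agreement come back as None so a person can review them.
    """
    result: dict[str, Optional[object]] = {}
    for field in fields(EarmarkEntry):
        values = [getattr(entry, field.name) for entry in entries]
        keys = [normalize(v) if isinstance(v, str) else v for v in values]
        if not keys:
            result[field.name] = None
            continue
        top_key, votes = Counter(keys).most_common(1)[0]
        if votes >= min_votes:
            # Report the first submitted spelling that matches the winning key.
            result[field.name] = next(v for v, k in zip(values, keys) if k == top_key)
        else:
            result[field.name] = None
    return result
```

With three made-up submissions for the same letter, two of which agree on every field, `consensus` would keep the agreed values and leave anything contested as `None` for a second look.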

I am curious to see how big they can grow their corps and which projects they target over the next year. I love that they are grabbing structured data. This particular task is part transcription and part encoding, and it reminds me of some of the work being done over on Freebase.com. For an example of one of the datasets they are building, take a look at their U.S. National Register of Historic Places base or the Government Commons.