The journal Nature has a special issue about “Big Data” with articles by Clifford Lynch, Cory Doctorow, and others. The whole issue is worth reading and is freely available online for a short time.
Coping with floods of data is now one of science’s biggest challenges. In this Nature Special, we assess the need to complement smart science with smart searching; look at what the next Google will be; talk to the pioneering biologists who are trying to use wiki-type web pages to manage and interpret data; and recall that the first mass data crunchers were not computers, but the remarkable women of Harvard’s Observatory.
In the area of government information, David Goldston, the former chief of staff of the House Committee on Science, writes about environmental data.
- Big data: Data wrangling by David Goldston, Nature 455, no. 7209 (September 3, 2008): 15.
He notes that there is no set of environmental indicators that is regularly updated — something akin to economic statistics — and that a report by the Heinz Center on the State of the Nation’s Ecosystems (www.heinzcenter.org/ecosystems) is chock-full of lists of subject and geographical areas for which few if any data exist.
He calls attention to the Data Quality Act, which “has been anathema to environmental groups, which have seen it as a way to stymie regulation. And it has been primarily invoked by corporations questioning studies that raise alarms about their products.” (The act is less than half a page in a public law of more than seven hundred pages: Public Law 106-554 Sec. 515; Statutes at Large volume 114, pages 2763A-153 to 2763A-154, available online as plain text and as PDF.)
He also says, “Even when instrumentation is regularly funded, as some kinds of satellites are, money is often lacking to maintain the data or to make them sufficiently accessible or digestible.”