The United Kingdom has its own version of data.gov, and it has the added cachet of being promoted and advised by Sir Tim Berners-Lee.
This site seeks to give a way into the wealth of government data. [T]his means it needs to be: easy to find; easy to licence; and easy to re-use. We are drawing on the expertise and wisdom of Sir Tim Berners-Lee and Professor Nigel Shadbolt to publish government data as RDF – enabling data to be linked together.
- Tim Berners-Lee unveils government data project, BBC (21 January 2010).
Web founder Sir Tim Berners-Lee has unveiled his latest venture for the UK government, which offers the public better access to official data.
A new website, data.gov.uk, will offer reams of public sector data, ranging from traffic statistics to crime figures, for private or commercial use.
In this case, the Open Knowledge Foundation has gone a long way toward clarification…
SEE specifically: http://opendefinition.org/
From the “Open Knowledge Definition” home page:
“In the simplest form the definition can be summed up in the statement that ‘A piece of knowledge is open if you are free to use, reuse, and redistribute it’.”
In detail the definition suggests: [for sake of clarity, I have here deleted –- marked
“The term knowledge is taken to include:
1. Content such as music, films, books
2. Data be it scientific, historical, geographic or otherwise
3. Government and other administrative information
“Software is excluded despite its obvious centrality because it is already adequately addressed by previous work.
“The term ‘work’ will be used to denote the item of knowledge at issue.
“The term ‘package’ may also be used to denote a collection of works. Of course such a package may be considered a work in itself.
“The term ‘license’ refers to the legal license under which the work is made available. Where no license has been made this should be interpreted as referring to the resulting default legal conditions under which the work is available.”
“A work is ‘open’ if its manner of distribution satisfies the following conditions:
- Access: The work shall be available as a whole and at no more than a reasonable reproduction cost, preferably downloading via the Internet without charge. The work must also be available in a convenient and modifiable form.
- Redistribution: The license shall not restrict any party from selling or giving away the work either on its own or as part of a package made from works from many different sources. The license shall not require a royalty or other fee for such sale or distribution.
- Reuse: The license must allow for modifications and derivative works and must allow them to be distributed under the terms of the original work. The license may impose some form of attribution and integrity requirements: see principle 5 (Attribution) and principle 6 (Integrity) below.
- Absence of Technological Restriction: The work must be provided in such a form that there are no technological obstacles to the performance of the above activities. This can be achieved by the provision of the work in an open data format, i.e. one whose specification is publicly and freely available and which places no restrictions monetary or otherwise upon its use.
- Attribution: The license may require as a condition for redistribution and re-use the attribution of the contributors and creators to the work. If this condition is imposed it must not be onerous. For example if attribution is required a list of those requiring attribution should accompany the work.
- Integrity: The license may require as a condition for the work being distributed in modified form that the resulting work carry a different name or version number from the original work.
- No Discrimination Against Persons or Groups: The license must not discriminate against any person or group of persons.
- No Discrimination Against Fields of Endeavor: The license must not restrict anyone from making use of the work in a specific field of endeavor. For example, it may not restrict the work from being used in a business, or from being used for military research.
- Distribution of License: The rights attached to the work must apply to all to whom the work is redistributed without the need for execution of an additional license by those parties.
- License Must Not Be Specific to a Package: The rights attached to the work must not depend on the work being part of a particular package. If the work is extracted from that package and used or distributed within the terms of the work's license, all parties to whom the work is redistributed should have the same rights as those that are granted in conjunction with the original package.
- License Must Not Restrict the Distribution of Other Works: The license must not place restrictions on other works that are distributed along with the licensed work. For example, the license must not insist that all other works distributed on the same medium are open.”
Thomas Jefferson said:
"If nature has made any one thing less susceptible than all others of exclusive property, it is the action of the thinking power called an idea." And also --as noted in a previous blog -- "The field of knowledge is the common property of all mankind."
A century or so later US Supreme Court Justice Louis Brandeis (in the 1918 decision, International News Service v. Associated Press) wrote: "…the general rule of law is, that the noblest of human productions -- knowledge, truths ascertained, conceptions, and ideas -- become, after voluntary communication to others, free as the air to common use."
Jumping ahead a few decades, Stewart Brand famously declared at a Hackers Conference in 1984: “Information wants to be free…” (while also noting that information is “expensive”, creating an inevitable tension…)
When we call for "free" access to knowledge resources (used here to stand for data, information and knowledge [for some working definitions SEE: Moritz, Building the Biodiversity Commons Appendix 3 http://www.dlib.org/dlib/june02/moritz/06moritz.html ] ), we are saying that access to knowledge should not be a privilege with access granted only to those that can afford the current market price. Knowledge should not be placed behind “pay walls”. To assert this right to free access is to urge that as a national and global community our common welfare demands the free access to knowledge.
The creation of mechanisms of impedance to the free flow of knowledge has tremendous societal costs. Consider the “transaction costs” entailed every time any writer or researcher must simply contact an author or publisher for permission to use any resource. (I estimate that at the American Museum of Natural History, early in the last decade, we invested about $25,000 just to perform due diligence to secure our right to freely disseminate our own scientific publications.) Consider the transaction costs implied by the "inter-library loan" industry.
Add to transaction costs the possibility that additional charges may be assessed before an article can actually be used. As an independent researcher, without a current institutional base, I was forced to pay Nature/Macmillan $32 US for access to the "Commonwealth of Science" (1941) article cited in a previous blog. (Could I have found ways to circumvent this? Yes, of course, but that is hardly the point -- I intend to act in good faith, as do most people.) Consider the plight of any teacher ambitious enough to seek use of original source materials – or of any public school student or parent for that matter… Consider the costs associated with health care information...
Having asserted this right to access, we are obliged to address the question of cost and of fair compensation for the creation of knowledge. Since the era of Ronald Reagan and Margaret Thatcher the Anglo-American polity has been in a kind of thrall. A few years ago, when I proposed an alternative system of public compensation for knowledge creation, a colleague very highly placed in a professional scientific society asked me with incredulity: “You mean pay for it with taxes?”
It has become almost an a priori article of faith that public investment is somehow bad (except, I cannot help adding, when required to bail out major financial institutions and ensure exorbitant bonuses for financial executives).
Market fundamentalists (and “casino economists”! SEE: JM Keynes) notwithstanding, the United States has always depended upon public investment to ensure the viability of our economy. Whether by investment in postal service, energy, public schools/libraries/museums, the Interstate Highway system, the National Science Foundation or the Internet, it has been public investment that has created the infrastructure for our economic success and for innovation. And it has been the economic opportunities created for individuals by public investment that have continued to draw to our shores the ambitious and the energetic, the innovative and the productive. It is the rich diversity of America’s population that is our greatest asset and that holds our greatest hope for meeting 21st Century challenges.
Freeing “government information” is a fine starting point but all knowledge must be free…
I believe that we need an Andrew Carnegie for the 21st Century – assets that have been locked away behind pay walls should be placed in the public domain. And we need new paradigms that sustainably support and fairly compensate research and intellectual work but require release of knowledge products for free public use. (The open access publishing model suggests one such strategy…)
Next, "open"? And effective?
In the sciences, the general case for sharing of all scientific knowledge (and knowledge resources) has long been clearly articulated. Robert K. Merton, sociologist of science at Columbia, made the case in 1942, writing: “The substantive findings of science are a product of social collaboration and are assigned to the community. They constitute a common heritage in which the equity of the individual producer is severely limited…”
[SEE: Robert K. Merton, “A Note on Science and Democracy,” Journal of Legal and Political Sociology 1 (1942): 121.] At about the same time (1941) a "Declaration of Scientific Principles" appeared in Nature: “7. The pursuit of scientific inquiry demands complete intellectual freedom. And unrestricted international exchange of knowledge…” [SEE: “The Commonwealth of Science,” Nature No. 3753, October 4, 1941] These scientific values have been affirmed and re-affirmed many times. In fact, the predecessor to UNESCO was the League of Nations Committee on International Intellectual Cooperation... [SEE related: "The Union of International Associations" and "The Mundaneum" http://www.uia.be/node/85 ]
The principle that access to knowledge is an essential human right (and fundamental to effective citizenship) has also been widely affirmed. The Universal Declaration of Human Rights, Article 19 declares: "Everyone has the right to freedom of opinion and expression; this right includes freedom to hold opinions without interference and to seek, receive and impart information and ideas through any media and regardless of frontiers." [SEE: http://www.un.org/en/documents/udhr/index.shtml ]
In fact, the authors of the US Constitution realized that access to knowledge was essential to the public welfare. The notion of a limited monopoly on "intellectual property", as defined in the provisions for patent and copyright, makes this clear, as does the recognition of the public domain. (Thomas Jefferson, corresponding with his Secretary of War in 1807, wrote: "The field of knowledge is the common property of all mankind.")
In 1955, when Edward R. Murrow asked him about patenting the polio vaccine, Jonas Salk famously commented: "That would be like patenting the sun..." 35 years later Nobel Laureate Joshua Lederberg was warning of the deterioration of the ethic of sharing [SEE: “Data Sharing: A Declining Ethic? -- Commercial pressures and heightened competition are testing the notion that scientific data and materials should be widely shared.” Science v. 248, pp. 952-957, 25 May 1990]. It seems more than a little ironic that now -- 50 years later -- "20% of human gene DNA sequences are patented" [Science Magazine Policy Forum: K. Jensen and F. Murray, "Intellectual Property: Landscape of the Human Genome," Science 14 October 2005: Vol. 310, no. 5746, pp. 239-240].
Thus sadly, despite clear and longstanding articulation of the principles of free and open access to knowledge, there has been a strong countervailing trend toward restriction of knowledge ["commoditization" -- SEE: J. Birkinshaw and T. Sheehan, “Managing the Knowledge Life Cycle,” MIT Sloan Management Review, 44 (2) Fall, 2002:77].
So, assuming (as in my previous post) that librarians are both stewards and advocates -- how do we make our case...?
1. We marshal all available historical and philosophical evidence for support of open access [in effect, I've cited just a few of the diverse sources for such a case...]
2. We organize and broadly disseminate by publishing, presenting, discussing, teaching
3. We insist upon evidence-based public policy with full transparency -- not only of data but of the logic that directs the definition of data as evidence
4. We insist upon transparent processes by which data can be effectively scrutinized -- this means specifying all forms of transformation to which data are subject and presenting the "chain of custody" / provenance of data, thus certifying both logical validity and technical integrity
5. From a policy perspective, we begin with the "lowest hanging fruit", which politically means we start with bio-medicine (humans are, naturally enough, anthropocentric) -- thus NIH, CDC and UN/WHO and their evolving policies -- but we also push in other domains like conservation, agriculture and agrarian science, education...
6. We analyze carefully and advocate for the broad implementation of "qualified peer review mechanisms" (NSF provides an excellent model)
7. We aggressively advocate for K-gray science literacy...
More about all and each of these points will follow...
The University of Minnesota Libraries has taken a new approach to its planning process this year to help deal with seemingly conflicting realities. On the one hand, everything said publicly by University administration indicates that the U's financial future is Not Good. On the other, the Libraries has several projects in place that are innovative and many, many more on hold that would also be fabulous. These projects are in addition to the regular day-to-day work of a library. Something has to give somewhere, but the Libraries can't just metaphorically throw its hands in the air and say "the heck with this, I'm out".
So, the Libraries is hosting a speaker series with the goal of moving from lemons to lemonade. There have been two speakers so far - Lorcan Dempsey and Paul Courant. See https://wiki.lib.umn.edu/Staff/UniversityLibrariesSpeakerSeries for more information - future speakers will be Jim Neal and Clifford Lynch. While online access is limited during the talks, the future speakers will be recorded and the webcasts posted soon after for all to view. And, at the risk of sounding sycophantic, I believe the opening remarks by our University Librarian, Wendy P. Lougee, are also worth a listen on their own merits.
Lorcan Dempsey - "Discovery and Delivery"
Dempsey began by describing levels of rarity of library collections based on OCLC data, suggesting that libraries should focus their expenditures (presumably on preservation, simply having the space to hold items, doing really good digitization, etc.) on the rare items. Non-rare items could reasonably be entrusted to network-level services like the Hathi Trust. He then presented a typology of library collection types sorted by rarity and current levels of stewardship. Government publications fell into high stewardship, but low rarity. Dempsey acknowledged that this was a broad characterization and that there might be rare items within a category like government publications or maps. Also, the University of Minnesota is a partner in the Hathi Trust and has sent some of its government publications collection in for digitizing, so the Libraries are already on the path he's describing here. Caveats aside, I feel that he provides a well-reasoned and evidence-based rationale for shifting stewardship away from non-rare items and towards collections that are getting no real attention at all. This was only a tiny portion of his overall talk and I recommend going through the entire PowerPoint or webcast to get the full presentation.
Presentation, Webcast, Related Readings: https://wiki.lib.umn.edu/Staff/UniversityLibrariesSpeakerSeries#dempsey
Paul Courant - "Scholarly Communications and Publishing"
Courant's talk can best be described as a reflection on just what it is that we'd like to pay for. He framed part of the problem in terms of the Parable of the Anarchists' Annual Meeting (see http://www.econ.ucsb.edu/~tedb/Journals/anarchists.pdf). In short: with coordination - either between libraries or between libraries and smaller publishers or both - we can take at least some control of the journal publishing arena. We already spend a fortune on a situation we don't like. Surely the logical thing is to begin to spend some money on creating a situation more to our liking. This includes taking on more of a publishing role and allying ourselves with societies and small publishers (including university presses) who might be more interested in the benefits of open access than the big vendors. However, when I asked if he was advocating canceling contracts with big vendors, he answered (I'm paraphrasing) "Well, probably not. Well, not entirely. Might want to pass on those Big Deals they offer though."
He also felt the library community should speak up loudly in favor of the recent RFI from the Office of Science and Technology Policy regarding increased access to the products of federally funded research. At the same time he reiterated that open access isn't exclusively a library issue. In fact, he said it's a faculty issue. Libraries need to keep pushing on the topic, but pushing faculty to understand that this is an arena they can control if they choose to do so.
Courant isn't a librarian - he's an economist by background and I found his application of an economics perspective refreshing. Again, like Dempsey's talk, there was no magic "the Libraries should do this" moment because we are in a tough spot without easy resolution. But, also like Dempsey's talk, he has a great way of expressing the issues facing libraries.
Presentation, Webcast, Related Readings: https://wiki.lib.umn.edu/Staff/UniversityLibrariesSpeakerSeries#courant
I don't know if these speakers really will lead to concrete ideas for coping with our budget problems, but I sure am glad we're having them - each one has been thought-provoking.
The Office of Science and Technology Policy (OSTP) is requesting input regarding enhanced access to federally funded science and technology research results, including the possibility of open access to them. Comments can be e-mailed to email@example.com. The deadline for comments is January 7, 2010. For more, see the Federal Register announcement:
Federal Register: December 9, 2009 (Volume 74, Number 235) Page 65173-65175
On his first day in office, the President issued a Memorandum on Transparency and Open Government that called for an "unprecedented level of openness in government" and the rapid disclosure of one of our nation's great assets--information. Moreover, the Administration is dedicated to maximizing the return on Federal investments made in R&D. Consistent with this policy, the Administration is exploring ways to leverage Federal investments to increase access to information that promises to stimulate scientific and technological innovation and competitiveness. The results of government-funded research can take many forms, including data sets, technical reports, and peer-reviewed scholarly publications, among others. This RFI focuses on approaches that would enhance the public's access to scholarly publications resulting from research conducted by employees of a Federal agency or from research funded by a Federal agency.
[Thanks to Charles Bailey for originally posting to DIGLIB list!]
Thus, I finally updated the latest list of Bills and contact information for the sponsoring Congressmen in the Delicious.com "CRS" tag.
Following up on Can Proprietary Formats make Government More Open? :
Josh Tauberer of govtrack.us points us to The good and the bad of PDFs (OpenGovData.org wiki), in which Kevin Lyons, who works for the Nebraska legislature, wrote up some guidelines for PDF in government.
Lyons reminds us that not all PDF files are equal and he enumerates some of the advantages and disadvantages of encapsulating government information in PDFs.
Given how popular the PDF standard itself is, it shouldn't be a surprise that the term PDF actually covers a wide variety of different types of files. While all PDF files fit the PDF standard, there are several different subtypes of PDF that are helpful in the government world.
States urged to create data catalogs, by Joab Jackson, Federal Computer Week (Oct 07, 2009).
Federal agencies shouldn't be the only ones to open their data for the public — states and local governments should also be ramping up efforts to become more transparent, the National Association of State Chief Information Officers (NASCIO) contends in a newly published report.
The first thing a state should do is create a one-stop portal, or data catalog, for all its publicly-accessible data, along the lines of the White House's Data.Gov , the report states.
Law.Gov: America's Operating System, Open Source, by Carl Malamud, O'Reilly Radar (Oct 15, 2009).
Public.Resource.Org is very pleased to announce that we're going to be working with a distinguished group of colleagues from across the country to create a solid business plan, technical specs, and enabling legislation for the federal government to create Law.Gov. We envision Law.Gov as a distributed, open source, authenticated registry and repository of all primary legal materials in the United States. More details on the effort are available on our Law.Gov page.