Free Culture and the Digital Library
Ten days ago, I was privileged to participate in a symposium, Free Culture and the Digital Library, at Emory University in Atlanta, Georgia. The symposium included keynotes by Lawrence Lessig, Siva Vaidhyanathan, and Clifford Lynch and more than a dozen papers.
The following are the notes I used for the presentation I gave and are based on the paper, Government Information in the Digital Era: Free Culture or Controlled Substance? by Karrie Peterson (NCSU Libraries, North Carolina State University) and James A. Jacobs (Data Services, University of California San Diego). We provide these notes now as a brief summary of the paper. This is not a transcript.
Karrie and I are here today to talk to you about government information.
It may seem odd to you that we’re talking at a session dealing with the problems of copyright and orphaned works about a body of information that, for the most part, is not copyrighted and therefore has little or no “orphaned works” problems.
But, if you see the copyright issue as an issue of “control”, then what we have to tell you fits right in.
Today, to address issues of access to government information we have to deal with the same kinds of questions of control that haunt those who deal with copyrighted materials and orphaned works.
who controls access?
what information will be available?
when will information be altered, changed, or withdrawn?
where will users find information?
how much will readers have to pay?
The reason for this is that, in the digital age, control of government information is rapidly shifting from us (the public, libraries, multiple institutions, the free information-commons) to the federal government.
This shift means that, where once the decisions about content of collections, organization of the collections, access points, utility, privacy of users, and no-fee access were all up to us, locally, now these are all controlled by the government.
Unfortunately, this shift in control is not obvious and is masked by the enhanced access we’ve seen when the government puts information on the web.
We will mostly speak today about U.S. federal government information, though some of what we say applies to state and local government information as well.
By “Government Information” we mean information collected, compiled, and created by governments in their official capacity.
This is information created by us collectively with our tax dollars through government agencies acting under mandates of law.
It is created for us and is the official public record of our democracy.
And, the law requires that the information created by the federal government must be freely available to the public.
Examples of government information include:
– information that the government collects, such as information about toxins in the groundwater; censuses of population and business; and so forth…
– information about the performance of government, such as reports by the Government Accountability Office on the effectiveness or legality of government policies;
– Congressional deliberations as documented in the Congressional Record and committee hearings.
And government information comes in every imaginable form: as books, serials, maps, pamphlets, images, data, and so forth.
Let’s look at roles — past and present.
In the paper-and-ink world, the roles of government and libraries were clear:
The government collected, assembled, and created information and printed and distributed it. At the point the information was distributed, the role of government in access to and preservation of that information essentially ended.
Through a legislatively mandated program called the Federal Depository Library Program (FDLP), the government deposited documents in depository libraries.
The FDLP Libraries built collections of government information and provided access to and service for that information.
The FDLP Libraries preserved the information, and, if a particular library wanted to withdraw an item, the program provided mechanisms to ensure that the document was preserved somewhere in the system of over 1000 libraries nationwide.
In the digital world, all this has changed.
One conservative estimate by the Superintendent of Documents (who oversees the FDLP) is that only 14 percent of federal government information is deposited in libraries now.
So, we know that something like 86% of all government information is available only from government-controlled web servers.
Unfortunately, the provision of “easy access today” is not the same as providing a “secure, sustainable information infrastructure” or guaranteeing long-term access.
Let’s examine, then, why we should be concerned with government information only being available from government-controlled web servers.
There are several issues that we want to outline quickly for you. Most of these issues will be familiar to you in one form or another, so we’ll cover them quickly…
We see three categories of problem: technical, economic, and issues of control.
Let’s look briefly at the technical issues.
Put simply, technological constraints designed to protect copyrighted and licensed information may inadvertently limit and constrain access to government information.
If, for example, peer-to-peer tools are made illegal or regulated in such a way as to make their use difficult or problematic, or if P2P technologies are undermined in a way that smothers innovation, then we will not be able to use such tools (e.g., LOCKSS) for dissemination and re-distribution of government information.
If the government creates laws like the Induce Act or the Broadcast Flag regulation, these will affect how public domain materials can be used.
It is not even clear that we will be able to use our own hardware to make lawful copies of public domain material if the hardware industry follows proposals to incorporate copy control technologies aimed at prohibiting unlawful copying.
Then there is what I call the ‘poison-pill’ copyright problem.
Non-copyrighted government information is being mixed with copyrighted information and served through proprietary interfaces and bundled with proprietary software in proprietary formats.
The census data being distributed w/o fee to depository libraries is locked in a proprietary format that requires commercial software that only runs on current versions of Windows.
Imagine the dilemma of a librarian or a citizen being prevented by the DMCA (Digital Millennium Copyright Act) from reverse-engineering public domain government information wrapped in a proprietary interface.
And we are seeing an increasing number of federal government web sites that have vague disclaimers about part of the site containing copyrighted information. Most explicitly say that it is up to the user to figure out what is copyrighted and what is not, and what the user is allowed to do and prohibited from doing.
With the Internet Archive being sued for storing copies of copyrighted materials, it makes us wonder if we will be allowed to preserve government information by spidering and storing copies.
Of course, it is possible that more reasonable laws, regulations, and industry standards will be developed and we won’t find ourselves in a world where a DVD of a presidential press conference is locked down the same way as a new Hollywood blockbuster. But there is still a large potential problem of governments using the ‘wrong’ tools or using tools in the ‘wrong way.’
We saw examples of this recently when both the Copyright Office and FEMA said that they were developing web sites that required the Microsoft web browser Internet Explorer, which is notoriously bad at conforming to open web standards and runs only on current versions of Windows (no Mac or Linux users allowed). And both agencies gave essentially the same explanation: they didn’t have the time or money to develop web sites that conformed to open web standards.
Governments will use the tools that are available and if those tools assume copy protection, digital rights management, and so forth, governments will create information that has those characteristics.
Another example of this is the Government Printing Office’s interest in using “Digital Object Identifiers” (DOI) for the reasonable purpose of better managing the pointers to online materials. Unfortunately, the intended purpose of DOIs includes checking the authority of a person to access a document, to protect copyright, and to prevent “piracy.”
How can we ensure that a technology designed to do these things for commercial users won’t subvert legitimate use of public domain materials?
Our concern about this is compounded when we look at issues of economics and control.
Let’s look next at problems that are economic in nature.
The first economic problem is the cost of keeping digital information available over time. Digital preservation, format and media migration, maintaining documents online: all are expensive.
This will put the cost of information access and preservation in competition with other federal budget items.
Imagine Congress mulling over spending a few million dollars to maintain online access to employment data for women or minorities that is 10 or 20 years old, or an annual report from an agency that is now defunct, or “out of date” economic data. Imagine whether or not these expenses will get priority over national security, education, or social security.
The second economic problem is that information is valuable and government agencies may want to sell their information rather than give it away for free.
Government information that has economic value includes
– aggregate census information that allows marketers to identify neighborhoods for locating stores or zip codes for directing ads,
and a wide variety of information about individuals including:
– who has bought or sold property,
– who has married or divorced,
– who has had a child or a death in the family,
– GIS data
The problem in the digital age is that if agencies choose to sell digital information, they cannot make the same information available without charge — even to libraries. They have to be sure that the information they sell cannot be re-used or re-distributed. They can do this with licensing restrictions or DRM technologies.
We’ve seen dramatic evidence of this. An early attempt by the Government Printing Office to sell access to digital information it was simultaneously providing for free failed. Why would anyone pay for information they can get for free?
We saw an example of this recently when the Library of Congress produced a PDF document that GPO made available to depository libraries w/o fee, but with severe restrictions on use because the document was a “product for which costs must be recovered.”
The restrictions included:
* Files may “NOT be redistributed”
* Access only on “the premises”
* Digital access hidden from web crawlers
* Digital access prohibited by users outside the library
While it is easy to understand how a cash-strapped agency faced with a net cost of keeping information online might jump at the opportunity of turning that liability into an asset by selling that information, it is also easy to see how such policies result in citizens losing access to information.
This could mean that citizens and libraries would have to pay for access to public information.
Another economic problem involves so-called competition between governments and the private sector. The publishing industry has argued for years that governments should not compete with the private sector.
We see an increasing amount of government information being privatized. Most recently, there was a proposal to privatize a prestigious and important journal, “Environmental Health Perspectives.”
In the digital age, the private sector argues that the government should have a very limited role in the dissemination of information and offer online services only under limited circumstances “even if private-sector firms are not providing them” and that governments “should generally not aim to maximize net revenues or take actions that would reduce competition.”
If we rely only on the government to provide access to information it produces, we may find information leaving the public domain as it becomes privatized.
The third category of problem is the issue of control.
There is and will probably always be a tension between openness and secrecy, between government control of information and citizen access to and use of information.
While many government employees and politicians are very supportive of public access to government information, many are not. As they say in Washington, “information is power” and controlling access to information is a very great power.
This is not, by the way, a Republican vs. Democrat issue or a liberal vs. conservative issue. Though many people have become more aware of this issue in the last few years as we saw the government restrict and withdraw information, this is not even a “post 9/11” problem; it is simply a fact of political life.
There are many ways that the government can control information:
– They can simply remove files that were public when they become embarrassing and hope no one has copies. When the government insists that the only way the public can get an “authentic” copy of a government document is to go to a government web server, removal of a file allows them to disavow “unauthorized” copies that may exist.
– They can lock files with DRM technologies that allow their creator to limit who can read or use a document or even remove the ability to read the file after it has been downloaded by a user.
– Perhaps most insidious, though, is that government can do exactly what it is doing: take a *passive* role by “making information available” rather than *distributing* information. They can put a document on a web server but tell no one it is there, issue no press release, call no attention to it, and hope no one notices.
(We saw a particularly visible example of this recently when the Bureau of Justice Statistics released a report that the administration found embarrassing. Rather than announce the report, the Justice Department opted not to issue a news release on the findings and simply posted the report online. Then they removed the director of the Bureau who wrote the news release.)
This is the government playing a passive role. Rather than actively distributing and announcing and listing information they just “make it available” and it is up to us to find what is new, what has been changed, and what has been withdrawn.
Some see this as a good thing because it is an opportunity for librarians to make themselves useful in the digital age by trying to find information that the government no longer lists, catalogs, announces, or distributes. We believe, however, that this is a bad thing because the government is neglecting its responsibility to inform the public. It is the government taking a passive approach where it should take an active approach. We believe that this passive approach is inadequate and that we, as librarians, should be insisting that the government take an active role.
All of these problems (technical, economic, and control) leave us with the concern that, if we don’t have copies of government information in our control, in our libraries, in our institutional repositories, then we will not be able to guarantee long-term access to that information, free access, or user privacy.
We believe that the beginning of a solution to these problems is to rely on what we already have: a law (Title 44 of the U.S. Code) that requires deposit of government information into depository libraries.
While this won’t solve all the problems, we believe it is the first step that provides a foundation on which we can build.
Government must provide digital information at the time of its release, w/o fee, to depository libraries. The information must be free of DRM technologies that lock down its use, and must be “fully-functional” digital versions, not less-than-optimal surrogates.
And the information must be free of contractual restrictions that restrict use and re-use, distribution and re-distribution.
How would this help?
It would make sure that the actual digital information products are usable and reusable and fully-functional and not encumbered. This means that once a library or anyone has a copy, they will be able to post it and repost it and everyone will be able to use it and reuse it. “Documents” would not be technologically “withdraw-able.”
It would ensure that the government won’t use licenses in place of copyright to restrict access, use, re-use, and re-distribution of information.
One final thought.
We see the issue of access to government information as analogous to the issue of access to academic journal articles.
When we have access to journal articles online through publishers we gain a lot of convenience, but the publishers control what will be available, what will be withdrawn, who will have access, and at what cost. Access, preservation, user-privacy are, for publishers, objectives that are secondary to their primary mission of making money.
If, however, authors deposit copies of their articles in our institutional repositories, we are in control of those collections. We decide what is available. We are able to ensure free access and user privacy. It is our primary mission, not a secondary one.
Similarly, if we rely only on the government for access to government information, we are not in control of that information and cannot ensure access, no-fee access, or user privacy. If we have copies, though, we can do all that and help ensure better access and more use and re-use of the information.
I want to close with a question for you. We would like to draw on your experience, ideas, and strategies.
We believe that government information is essential to democracy.
We believe that government must take an active role in distributing information w/o fees or restrictions.
We believe that libraries should be able to perform their primary mission of keeping government information in the public domain, freely usable and re-usable.
Our question to you is: what other strategies can we use collectively to accomplish this?