FGI volunteers Daniel Cornwall, James R. Jacobs and Shinjoung Yeo were invited to give a panel at the Nevada Library Association’s 2005 annual conference in Reno, NV, on October 21, 2005. Below is the text of James’ part of the presentation about privacy and government information in the digital age. We’d love to hear your comments, corrections, suggestions or ideas.
NLA presentation, Friday October 21, 2005, 10:30 – 12:00
Government Documents Interest Group (GODIG)
Who’s Government Information? Our Government Information!
Presenters: Cornwall, Yeo, Jacobs
Welcome and thanks for coming. My name is James R. Jacobs. I’m a government information librarian at University of California San Diego. These are my colleagues Daniel Cornwall, from the Alaska State Library, and Shinjoung Yeo, also at UC San Diego. The three of us, along with Jim Jacobs (yes, there are actually 2 of us, both in the same department at UC San Diego!!) and James Staub from the Tennessee State Library, started the advocacy organization Free Government Information almost a year ago as a way to reach a broader audience and create dialogue among the various players (libraries, government agencies, non-profit organizations, researchers, journalists, citizens etc.) who have a stake in the preservation of and perpetual free access to government information. Today, we’d like to give some background of the debate as we see it currently working its way through the docs community, the libraries in the Federal Depository Library Program, the members of the Depository Library Council, and the Government Printing Office (GPO).
We’ll discuss the move, since about 1985, toward digital-only government information and how this move to digital is affecting, and will continue to affect, how libraries do their jobs. If you’ve visited the website at http://freegovinfo.info, you know that we are concerned that the move to digital, although exciting and full of possibilities for increased library services, contains some major obstacles to overcome in order to provide free, fully-functional digital information while protecting users’ privacy. We have discussed in various forums the need to continue the FDLP tradition of document deposit, and we hope to make a strong case here for digital deposit of government information as the primary means of preserving government information, giving widespread access to that information and protecting users’ privacy in what they read and access online.
–Background and history of government information:
I’d like to start with some background and history of government information. First off, what exactly IS government information? I found this poignant quote from Anne Morris Boyd in 1949 that sums up what I mean when I talk about government information.
“Government publications, commonly called public documents, are among the oldest records, and if measured by their influence on civilization, are probably the most important of all written records. They are the sources of political, economic, and social history of peoples of all times; they contain the authentic accounts of the world’s great explorations, discoveries, and inventions in every field of human endeavor; they reveal and explain the phenomenal scientific and technological developments of modern times; they open up great treasuries wherein man has attempted to give expression to his artistic impulses. They contain the history of civilization itself in all its aspects.”
Perhaps a bit hyperbolic but you get her drift! Government information is the information collected, compiled and created by governments in their official capacity, funded by our tax dollars, and used by us for our common understanding and our common good. Government information belongs to citizens who have a right to know their government’s activities in order to participate fully in the democratic process.
Since 1860, there has been a system in place to ensure public access to government information through a partnership between the Government Printing Office (GPO) and the hundreds of libraries in the Federal Depository Library Program (FDLP). Similar systems have been set up in nearly every state for state-produced information. The California State Library has been collecting documents from CA state legislative, executive, and judicial branches, commissions, etc. since 1850, before even the FDLP was around. CA has always been on the cutting edge! You may have seen Daniel’s previous presentation which shows that now Alaska is on the cutting edge in terms of digital government information!! He’ll talk more about that later.
In essence, the FDLP has provided centralized processing for a deliberate, distributed collection that has been well preserved over time. For almost 150 years, this system has been largely successful in providing for one of the core tenets of a democracy, an informed citizenry, not to mention being a veritable goldmine for researchers across the academic disciplines.
So now I’ve set the stage for Shinjoung who will talk about the current conditions we find ourselves in and ACCESS issues regarding the move to digital government information. I will then be back to talk about PRIVACY and then Daniel will wrap up with a discussion about PRESERVATION.
–Philosophy of privacy:
Libraries have traditionally protected patron privacy as one of the core tenets of librarianship. Patron privacy is an integral part of the practice of intellectual freedom inherent in the First Amendment of the Bill of Rights to the US Constitution. Like the Hippocratic oath, ALA’s Code of Ethics includes principles that guide our work as librarians. Among them are:
–We uphold the principles of intellectual freedom and resist all efforts to censor library resources.
–We protect each library user’s right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted.
ALA Code of Ethics
This professional code, along with individual libraries’ strong privacy protecting policies, has led ALA and librarians to work tirelessly to protect patron privacy. In fact the American Bar Association, in an article in their ABA Journal in August of this year, recognized the growing lobbying clout of ALA, calling them, “one of the most active players in legal fights over technology, copyright, national security, censorship and privacy law.”
Librarians have been at the forefront in the battle against such legislation as the Children’s Internet Protection Act (CIPA) and against certain provisions of the USA PATRIOT (Uniting and Strengthening America by Providing Appropriate Tools Required to Intercept and Obstruct Terrorism) Act (or USAPA in government documents parlance!), primarily section 215, the “library” section. Section 215 allows the FBI to access business records (including library circulation records showing what patrons are reading) with no probable cause and places a non-disclosure gag order on those librarians or libraries that are being asked for library records. USAPA also overrides state statutes protecting private information. The strong stance against USAPA has led many libraries, in an effort to circumvent USAPA, to proactively delete circulation records and Web server logs and has led many to evaluate what personal information they keep and for how long.
This strong historical pillar of privacy protection is perhaps most important where government information is concerned. What you read about your government (or about anything else for that matter) needs to be protected and kept strictly confidential. In the analog world, this was easier to assure since government information was distributed to libraries that then gave access and had the power and organizational policies to protect privacy.
Privacy in the digital world is on much more tenuous footing. In the digital world the servers of the information decide how much privacy they’ll let you have, and this decision is not necessarily based on codes of ethics or explicit policies protecting user privacy, but on business plans and economic realities. Downloading a digital document from a federal agency web site, or from a server at the Government Printing Office, allows those organizations to collect the following information:
- The IP address of your computer and Internet provider.
- The date and time you accessed their site.
- Information about the document or piece of data downloaded.
- The Internet address of the web site that referred you to their site.
- Tracking information via cookies.
In other words, with the personal information that is collected as a matter of course in the digital world, the government will know what you’re reading and from where you’re reading it!
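To make that list concrete, here is a minimal sketch of how much a single line in an ordinary web server log reveals. It assumes the server writes the widely used Apache "combined" access-log format; the sample entry, its IP address, document path and referring URL are all invented for illustration, and nothing here describes how GPO or any agency actually configures its logging. Cookie data is not part of this default format and would be recorded separately.

```python
import re

# Apache "combined" log format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line):
    """Return the fields of one access-log line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


# Hypothetical log entry: every value below is made up for this example.
SAMPLE_LINE = (
    '192.0.2.44 - - [21/Oct/2005:10:45:12 -0700] '
    '"GET /reports/example-report.pdf HTTP/1.1" 200 48213 '
    '"http://www.example.org/search?q=surveillance" "Mozilla/5.0"'
)

if __name__ == "__main__":
    # Print who asked (ip), when (time), for what (path), and from where (referrer).
    for field, value in parse_log_line(SAMPLE_LINE).items():
        print(f"{field}: {value}")
```

Even this one line ties an address to a document, a time and the page the reader came from, and a server writes one such line for every request it answers.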
–The Technology of privacy:
The Government Printing Office (GPO), as everyone here is well aware, is in the midst of creating its Future Digital System (FDsys) which, GPO maintains, will collect, describe, preserve and make accessible all past, present and future government information. On the face of things, this sounds like a good thing. However, this does not relieve libraries and librarians of the need to do what they have done so well for so long: selecting, acquiring, organizing, and preserving information; providing services for and access to that information; protecting the privacy of readers and users of that information; providing information without fees or stipulations.
We have asked GPO about privacy concerns inherent in some of the technologies being used in the Future Digital System, namely Digital Object Identifiers (DOI) and Public Key Infrastructure (PKI), and have asked for clarification on the use of Digital Rights Management (DRM) technologies within FDsys. We have been assured that GPO follows Office of Management and Budget (OMB) recommendations and has a long-standing tradition of protecting the privacy of its customers and users of digital information content. However, GPO has yet to release a statement on whether or not it will use or reject DRM technologies. This could be a fatal flaw for FDsys and could undermine libraries’ abilities to serve our users.
Despite these assurances, the technologies being implemented in the Future Digital System have the possibility of being used to abuse the privacy of users. Let me just say a few words about DRM, DOI and PKI. I promise to keep this short and not get into technological alphabet soup!
Digital Rights Management (DRM) is an umbrella term comprising several technologies used to control or restrict the use of digital content. Two common technologies are the Digital Object Identifier (DOI) and Public Key Infrastructure (PKI). There are many others used for specific file formats (like CSS encryption for DVDs). Two things are certain: all of these technologies can be circumvented by other technologies; and all of these technologies are based on systems having access to the user’s private information (IP address, name, address, credit card number…).
The Digital Object Identifier (DOI), which you may have seen in links to electronic journal articles in some of the larger article databases like this one, is similar to a persistent URL (PURL). But instead of simply being a persistent pointer to a digital object, the DOI is designed to verify the authenticity of a digital document, and also to check a user’s authority to access a document. Thus the DOI is designed to protect copyright, and prevent “piracy.” David Sidman wrote a very interesting article about DOI in 2001 entitled “The Digital Object Identifier (DOI): The Keystone for Digital Rights Management (DRM).”
Public Key Infrastructure (PKI) is a cryptographic program that provides for third-party vetting of, and vouching for, user identities and will be used by GPO to certify electronic content as authentic and official. Think of PKI as a kind of digital watermark that is verified by a third party. The drawback of PKI is that it shifts the verification of trust in a document to a third party (usually a private company) and again relies on private information in order to work. Both the creator and user of information must have access to the public key in order to decrypt the information. This means that the holder of the public key for a document can not only withdraw permission to access but can easily collect information on who is reading what.
These tools by themselves can be innocuous and even quite useful in the right situations. However, the private information used to make them run can also be saved, harvested and mined for other purposes.
Does this mean we cannot have Internet access to government information without Uncle Sam looking over our shoulder? The jury is still out on this. In GPO’s Future Digital System, the government will not need the USA PATRIOT Act because its content management system will be able to collect and mine personal data quickly and easily from its own servers. GPO takes great pains to note that the FDsys will be “policy-neutral.” As my colleague Jim Jacobs has pointed out, policy-neutral does not mean “neutral policies.” The FDsys is actually designed to accommodate changes to government information policy, including changes in how it deals with users’ private information.
One solution to this privacy conundrum would be to deposit electronic copies of government information with libraries and let the libraries serve the information on their own servers. That way, electronic government documents would be accessed from privacy-minded libraries. The issue of trust would be shifted from government agencies and third-party corporations to libraries, which have a long, well-established social role of providing information and protecting privacy. Even if the government used its new powers under the PATRIOT Act, it would have to make literally thousands of requests to find out who had handled a given document.
History tells us that the Government and private companies do not play well with others’ personal information and often misuse their power to access personal information for their own goals.
From 1956 to 1971, the FBI’s counterintelligence program (nicknamed COINTELPRO) attempted to neutralize political dissidents by, among other things, targeting library records to find out which users were consulting sensitive, although unclassified, technical information in libraries. More recently, in 2001, the Bureau of Indian Affairs had its internet and email privileges taken away by a federal judge because of its gross mismanagement of and failure to protect Indian land trust records (court-appointed experts had actually hacked into and gained access to supposedly protected electronic files!). Four years later, the BIA’s Web site STILL says “temporarily unavailable.”
An August 2005 report from the Government Accountability Office (GAO), titled “Data Mining: Agencies Have Taken Key Steps to Protect Privacy in Selected Efforts, but Significant Compliance Issues Remain,” examined programs at various agencies like the Small Business Administration, the Agriculture Department’s Risk Management Agency, the IRS, the State Department and the FBI. The report found that the agencies’ implementation of privacy and security measures was haphazard. The General Services Administration (GSA), which provides a database to the State Department, actually claimed that the Privacy Act did not apply to its system.
On the corporate side, there is even more evidence of mismanagement of private information. Here are a couple of examples of corporations selling personal information or otherwise not protecting private information:
–In 1999, RealNetworks, the company that popularized streaming audio broadcasts, slipped some code into its RealJukebox music player that surreptitiously transmitted information about the individual listening habits of its 13.5 million users to the company’s servers.
The program secretly searched the users’ hard drives for every type of music file, and, even worse, it made a note of every music disc they played in their CD-ROM drives. RealNetworks made no mention of this little intrusion in the privacy statement posted on the company’s Web site.
This clandestine snooping went on for months until Richard Smith, an Internet consultant, discovered it and blew the whistle on the company.
–In September 2002, JetBlue sold the itinerary information of over 1.5 million passengers, including passenger names, addresses, and phone numbers, to another private company called Torch Concepts, which was under contract with the US Army to do research on data mining.
–Just last month, Yahoo provided private information to Chinese State Security authorities that helped to convict a Chinese dissident journalist named Shi Tao.
–What presentation would be complete without a mention of Google! Google’s Gmail service indexes private email in order to target advertising. While this is technically not illegal, it is, to my mind, unethical.
Google has also amassed a huge amount of personal information (search history, images, email, etc.), meaning there is great potential for abuse. Google records everything it can:
For all searches they record the cookie ID, your Internet IP address, the time and date, your search terms, and your browser configuration. Increasingly, Google is customizing results based on your IP number. This is referred to in the industry as “IP delivery based on geolocation.” Google retains all data indefinitely and has no data retention policies. There is evidence that they are able to easily access all the user information they collect and save. Google won’t say why they need this data. Inquiries to Google about their privacy policies are ignored. When the New York Times (2002-11-28) asked Sergey Brin about whether Google ever gets subpoenaed for this information, he had no comment.
But enough of that. I’d like to conclude with a question for this audience:
Who do you want protecting your privacy and right to read in the digital age? Organizations (i.e., LIBRARIES) with a long history and a primary duty of protecting privacy, backed up by a strong code of ethics and institutional policies? Or organizations (government or private) whose tertiary duty MAY be to protect privacy, but who may have other reasons, nefarious or economic, for NOT protecting privacy?
Now I’d like to turn things over to Daniel who will discuss preservation of government information.