One of the many bright spots of last week's Fall 2012 Depository Library Conference -- the notes and proceedings will soon be posted on the desktop -- was the announcement by the Government Printing Office (GPO) that GPO and the US Department of the Treasury are partnering on a project to bring historic digitized Treasury publications onto the FDsys platform. This is a great step by GPO toward giving Federal agencies a platform for publishing their historically relevant publications and giving the public better access to them.
The U.S. Government Printing Office (GPO) and the U.S. Department of Treasury have partnered on a pilot project to make historical digitized content from the Treasury Library available on GPO’s Federal Digital System (FDsys). Through the pilot project, Treasury Reporting Rates of Exchange, 1956-2005, which list the exchange rates of foreign currencies based on the dollar, are now available on FDsys. Over the next year, additional historical documents within the Treasury’s library collection will be made available on FDsys through this pilot project.
The Minnesota Historical Society has several papers on authenticating digital legal information. Here you will find white papers that address authentication issues as well as information on the Uniform Electronic Legal Materials Act. Links to additional resources are also provided.
- Preserving state government digital information
Project partners have identified authentication of digital material -- the process by which information is assured to be what it appears or claims to be -- as a common interest. The trustworthiness of online state statutes and session laws is of particular interest.
The newest paper discusses five methods of authentication and their associated costs pertaining to authenticating primary legal materials in electronic format:
Hat tip to INFOdocket!
[UPDATE: several people have let me know that the link to Volume 1 of the report is dead. I've posted it to Dropbox for now (have patience, it's 21MB!) and will update this post when I get a response from the Coast Guard. That raises *another* issue: the link to volume 1 on the Coast Guard site is a *dynamic* link (you can see the sessionID in the URL). That means when the session ends, the link is dead. Documents NEED to have permanent links. One way to assure that is to send the document to the GPO for cataloging! More soon.]
Yesterday, the Deepwater Horizon joint investigation team released its final report. According to the Wall Street Journal:
Federal investigators released their final report Wednesday into the causes of the Deepwater Horizon drilling rig explosion in the Gulf of Mexico last year, castigating oil giant BP PLC and its contractors for their risky decisions and criticizing the government itself for gaps in oversight.
The report contains dozens of recommendations for improving off-shore oil drilling; but that's not really what I want to talk about. As a govt documents librarian, my concern is instead with *HOW* the report was released and its implications for trusted government information. Here are a few of the questions or issues that I have:
- Why did the committee created to do the work -- a joint task force between the Department of Interior's Bureau of Ocean Energy Management, Regulation and Enforcement (BOEMRE) and the U.S. Coast Guard -- create a separate .com site (http://deepwaterjointinvestigation.com) rather than doing/posting their work on the BOEMRE.gov site -- or at least requesting a .gov domain from General Services Administration (GSA)?
- Why was the report released on both the BOEMRE site as well as the Joint Investigation Team (JIT) site? The fact that it's on both .gov and .com sites calls into question the authenticity of the report -- and GPO has been very strong on digital authenticity and digital signatures of key govt documents to verify chain of custody and document integrity.
- Further, why were the actual documents themselves (the Adm. Papp/Director Bromwich cover memo, volume 1, volume 2, and appendices) posted on different sites -- cover memo, volume 2 and appendices on BOEMRE's .gov site and volume 1 on the Coast Guard's .mil site? And why were there redacted versions posted on the Coast Guard's site (look at the file names) and unredacted versions posted on BOEMRE and the joint taskforce site?!
- Lastly -- and this one particularly irked me because it blocked me from actually preserving the document in the Stanford Digital Repository -- why was volume 1 of the report (the part posted on the Coast Guard's .mil site) posted as a PDF with password security in place? I needed to combine the separate PDFs (memo, volume 1, volume 2 and appendices) into 1 file in order to save it in the Stanford Digital Repository (for more on that workflow, see the briefing on Everyday Electronic Materials (EEMs) that my colleague Katherine Kott did at CNI in Fall 2010). But because it was posted as a "secure" PDF, I was blocked from extracting pages or assembling the PDFs together.
So here's what I would suggest that agencies do with their reports -- especially their high-profile reports -- in the future:
- DO post them to a .gov site AND send a copy to the GPO so they can be cataloged and the bibliographic records can be distributed to federal depository libraries for more widespread access
- DO post the documents on the same domain as the press release
- DO give users a choice for large documents: downloading multiple files for specific pieces of a report AND downloading the report in its entirety as 1 file
- DO NOT put any sort of digital rights management on public domain govt publications (I can't stress this point too strongly!)
Is that so much to ask?
Today I attended the "Chat with GPO" OPAL session, which focused on authentication in general and authentication for FDLP partners. Ted Priebe, GPO's Director of Library Planning & Development (LPD), and Lisa Russell, the Manager of LPD's Content Management unit, presented material and answered questions.
Basically, LSCM wants to partner with Federal Depository Libraries and find ways to authenticate content hosted by the FDL partners. The digital signatures of authentication will indicate partnership with the FDL institution and the contact information for that institution. This is great news, especially for those FDLs also interested in hosting digital content in partnership with GPO.
The authentication session is archived on the GPO OPAL site.
I had to explain to a student patron and their Professor today what is meant by "born digital" and how digital government documents are wonderful resources for a paper if we do not have the print version or when the print version doesn't exist (or is horribly out of date). Have any of you had to explain this a lot?
It all started when the student patron told me she could only have three web sources for her Nursing research paper after I had shown her the wonderful world of digital documents online. She had found an eleven year old version of a government print source in our catalog but I cringed...born digital documents online via NIH or the U.S. Dept. of Health had more up to date medical information on her topic! I told her to use both the print and online sources. She would be able to see if there were any noticeable differences from the 1997 print version and the 2007/2008 online information on her topic.
I contacted the Professor and explained this too. All is well and she will allow for the use of online government information. She was just hoping to avoid the use of too many general (i.e. crappy) websites. I understand that but I wanted to make sure that the student would not be punished for using several good government online documents and websites for her paper.
I didn't get into the nitty gritty digital authentication of government documents, but with some Professors who require legislative research, I tell them about the digitally authenticated documents that currently exist from GPO.
I have a feeling we government document librarians are going to have to explain this concept of "born digital" gov docs and digital authentication more often...especially now that more and more gov docs are being born digitally.
As I am sure you know, we at GPO have been talking with the library community for several years now about our authentication efforts. This year, we were able to move beyond the discussion phase and implement authentication technology into some of our top GPO Access applications. In early 2008, we integrated an Automated PDF Signing system into our GPO Access workflows, and we successfully released the digitally signed and certified FY 09 Budget of the United States and 110th Congress Public and Private Laws documents on GPO Access. Digitally signing these publications was just the stepping stone for implementing our authentication initiative. Upon approval from publishing agencies, all publications ingested into the Federal Digital System (FDsys) will be digitally signed and certified in the future.
In addition, we will implement authentication technology at the granular level. Granular content, as described in relation to the FDsys, is content that is broken into smaller content units such as chapters, parts, or sections. Our next challenge is to identify at what level of granularity content should be authenticated and digitally certified for each content format. I am very interested in feedback on your thoughts on the level of granularity GPO should authenticate content to share with the team developing FDsys. I am also interested in learning more about your opinions and expectations for the future in relation to GPO’s authentication initiative. For more background on our authentication initiative, please visit http://www.gpoaccess.gov/authentication/.
...the U.S. is not the leader in e-Government...at least according to a study released last week by the Brookings Institution. We do rank third, but we are "falling behind other countries in broadband access, public-sector innovation and implementation of the latest interactive tools to federal Web sites".
Two other articles I read this morning also got me thinking about where we stand as a nation with digital government information: "Old-school Recordkeeping Meets the Digital Age" and "Government Data and the Invisible Hand". The first article made me feel quite frustrated with our lack of digital preservation progress, especially after reading this quote:
"...lacking a statutory prescription for maintaining electronic records, most agencies print and file [records] as they would paper documents, according to a recent investigation by the Government Accountability Office...Under current regulations, NARA does not require agencies to maintain records in their native formats. So for now, many agencies still print e-mail messages and file the paper versions.Although the filing process is relatively easy, the practice has a major weakness: It eliminates the searchability of digital documents". (Gee, ya think?!)
Envisioning all those emails being printed by government agency employees makes me think of Google's April Fool's joke: the "Google Paper" service!
I hope the next President and his administration will take the issue of e-government and digital preservation/authentication very seriously. Obama and McCain have touched on the issue a bit, including Obama's vague vision of online government transparency:
"I want people to be able to know, today, this issue is going on...Today, President Obama talked about his proposal for $4,000 student college-tuition credits. It’s going to be going to this congressional committee, these are the key leaders in the House and Senate who are going to be deciding on the bill, here are the groups that support it, you should contact your congressman. The more that we can enlist the American people to stay involved, that’s the only way we can move an agenda forward."
The second article touches on this issue as well, and urges the next Presidential administration to "embrace the potential of Internet-enabled government transparency [by reducing] the federal role in presenting important government information to citizens". A profound statement, but read the rest of their argument as stated in the abstract:
"Today, government bodies consider their own websites to be a higher priority than technical infrastructures that open up their data for others to use. We argue that this understanding is a mistake. It would be preferable for government to understand providing reusable data, rather than providing websites, as the core of its online publishing responsibility.
Rather than struggling, as it currently does, to design sites that meet each end-user need, we argue that the executive branch should focus on creating a simple, reliable and publicly accessible infrastructure that exposes the underlying data. Private actors, either nonprofit or commercial, are better suited to deliver government information to citizens and can constantly create and reshape the tools individuals use to find and leverage public data. The best way to ensure that the government allows private parties to compete on equal terms in the provision of government data is to require that federal websites themselves use the same open systems for accessing the underlying data as they make available to the public at large".
This makes sense if you think of it from the context of all the mashups, RSS feeds, and other interactivity with web content that exists. The rest of the article makes some other interesting points and counterarguments, such as
"A government data provider can provide a digital signature alongside each data item. A third party site that presents the data can offer a copy of the signature along with the data, allowing the user to verify the authenticity of the data item, by verifying the digital signature, without needing to visit the government site directly".
Easier said than done? Is the "digital signature" they talk about the same as GPO Digital Authentication?
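To make the idea in that quote concrete, here is a toy sketch of the sign-relay-verify flow. It is not GPO's system and not the article authors' code: it uses textbook RSA with two small, well-known Mersenne primes so that it runs with nothing but the Python standard library, and every name in it (`sign`, `verify`, the sample data item) is my own assumption. Real signatures use vetted libraries, much larger keys, and padding schemes.

```python
import hashlib

# Toy key material (assumption for illustration only): two well-known
# Mersenne primes. Far too small and too structured for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest_as_int(data: bytes) -> int:
    """Hash the data item and reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Government data provider: sign the digest with the private exponent."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Any user: check the signature using only the public values N and E."""
    return pow(signature, E, N) == digest_as_int(data)


# The provider publishes a data item together with its signature.
item = b"hypothetical government data item"
sig = sign(item)

# A third-party site relays both; the user never visits the government site.
print(verify(item, sig))                 # True
print(verify(item + b" (edited)", sig))  # False
```

The user still has to get the public key from somewhere they trust, which is where the "easier said than done" part comes in.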
We are making some progress in e-Government and digital preservation of government information but we need to do better. Like Obama said, we can start by contacting our congressmen to voice our concerns and suggestions for improvement on e-Gov initiatives and digital preservation...because I don't know about you, but I sure don't want the government to use "Google Paper".
[Cross posted on LegalResearchPlus.com]
On a daily basis I visit various court and other government websites, often to locate recent opinions, regulations, or agency decisions. It is a common practice for law librarians and for any researcher who wants very recent sources or does not have access to commercial databases. Admittedly it is far less often that I consider whether the case I just downloaded is an authentic representation of the court’s decision.
“The Official Reports page is primarily intended to provide effective public access to all of California's precedential appellate decisions; it is not intended to function as an alternative to commercial computer-based services and products for comprehensive legal research.”
“Although every effort is made to ensure that the information contained on this site is correct and timely, the First Circuit does not warrant its accuracy. Portions of the information may be incorrect or not current. The information contained on this site should not be cited as legal authority.”
In 2007 the American Association of Law Libraries (AALL) completed a survey of states' online statutes, regulations and case law to determine which states, if any, were deeming their online material to be official and/or authentic. The survey, “State-by-State Report on Authentication of Online Legal Resources,” is available from the Washington Affairs Office of AALL. Survey authors Richard Matthews and Mary Alice Baish concluded that while many states considered the primary legal material that they put online to be official, no state had taken steps to authenticate those materials.
In a world where online research is becoming the norm, are courts (and other government websites) really keeping up with the needs of the people they serve by not offering official and authenticated versions of their opinions online?
Who do you Trust? The Authentication Problem
How do we know when a digital document is "authentic"? While many in the library and academic communities hope that there will be a technological solution, the reality is that technology alone cannot solve the problem of authenticity. A report this week of research at a Chinese university illuminates one reason for this: technical tools are subject to failure, compromise, forgery, and hacking.
- U.S. mulls new digital-signature standard, By Anne Broache, and Declan McCullagh, CNET News.com, November 1, 2005.
The article reports a flaw in an official federal standard that was originally devised by the National Security Agency and is widely used to create and verify digital signatures in e-mail and on the Web. In fact, it is embedded in every modern Web browser and operating system. The CNET article notes that, while the flaw that Chinese scientists discovered in the "Secure Hash Algorithm" is "theoretical," it will eventually make it easier to forge electronic signatures.
But authenticity requires more than secure software. Even if we had a tool that could never be hacked and that would last forever, we would still only have part of a solution: the technical part. The other part of the solution is social: it is the issue of Trust.
Software provides the technical part of the solution
The technology of authentication provides a way to verify that a document is what it purports to be and determine if it has been altered or not. Document-creators can use software to create special files (called "hashes" or "signatures" or "keys") based on the original document. These special files are typically stored with a "trusted third party" -- neither the document creator nor the recipient. Document-users can then use software to check the authenticity of the document in hand against that "hash." The software is able to determine only if the document in hand is identical to the original. Even the smallest change (e.g., the insertion or removal of a blank space) will result in a report that the documents are not identical.
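A minimal sketch of that check, using Python's standard `hashlib`. The choice of SHA-256, the function names, and the sample text are my assumptions for illustration; the paragraph above does not prescribe a particular algorithm or tool.

```python
import hashlib


def fingerprint(document: bytes) -> str:
    """Return the SHA-256 hash of a document as a hex string."""
    return hashlib.sha256(document).hexdigest()


def is_unaltered(document_in_hand: bytes, trusted_hash: str) -> bool:
    """Compare the document in hand against the hash held by the third party."""
    return fingerprint(document_in_hand) == trusted_hash


# Hypothetical original document and the hash a trusted third party would store.
original = b"Sec. 1. This act takes effect on passage."
trusted_hash = fingerprint(original)

print(is_unaltered(original, trusted_hash))         # True
print(is_unaltered(original + b" ", trusted_hash))  # False: one added space
```

Note what the check does not tell us: it says nothing about whether `trusted_hash` itself is genuine.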
Trust is the social part of the solution
But this technological check does not solve the authentication problem by itself. The check against the hash is only as reliable as the trusted third party. The software just gives us a technical means of shifting who we trust -- instead of trusting the party that delivered the document to us, for example, we trust a third party that tells us that the hash is correct and authentic. If the hash isn't authentic and unchanged, the check against the hash is worthless.
This concept of a trusted third party is, therefore, an essential component of the authentication chain. That should lead us to an important question: who will we choose as our trusted third parties? This is important because the tools only work if we can trust the third party to do its job. In the case of government information essential to our democracy, this trust has to last forever.
Who do you trust?
Ask yourself who in society is the most trusted third party in delivering information? The government? The press? Publishers? Technology companies like Microsoft and Verizon?
What about libraries?
Now ask yourself what we will do if we think that technological-verification is all we need to ensure authentication and we find one day that the tools have failed as described in the CNET article.
A Social Solution built on Trusted Institutions and Legal Deposit
Trust is a social phenomenon, not a technical one. What if, instead of putting all our faith in a potential technological "solution" for ensuring authenticity of government documents, we instead relied on the existing infrastructure of depository libraries to ensure authenticity through their collective possession of multiple copies of digital government publications, distributed by GPO at the time of their publication under the legal mandate of 44 USC?
This solution promises to be a sound, sustainable one because it relies on libraries as the trusted repository of information. Libraries have a long, well-established social role of providing information; people trust libraries because of it. Libraries have a vested interest in ensuring that the information they provide is authentic and people trust them to do so because it is their primary mission -- not a byproduct of publishing or making money or the various missions of government agencies.
The trust people place in libraries in general can be increased in the digital environment by relying, not on one or two libraries, but on many libraries with different funding streams and missions. Any unforeseen compromise in one institution becomes a single error in a large system of information-provision. (See Article outlines bottom-up standards for digital preservation systems.) Even in the paper and ink world, forgeries are possible -- though more difficult than in the digital world -- and one important way we determine authenticity is by comparing multiple copies.
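Comparing multiple copies can be sketched in a few lines. This is only a simple majority comparison under my own assumptions (the library names, the sample text, and the `compare_copies` helper are hypothetical); it is not how LOCKSS or any actual depository system works.

```python
import hashlib
from collections import Counter


def compare_copies(copies: dict[str, bytes]) -> tuple[str, list[str]]:
    """Hash each library's copy; return the majority hash and any dissenters."""
    digests = {
        library: hashlib.sha256(data).hexdigest()
        for library, data in copies.items()
    }
    consensus, _count = Counter(digests.values()).most_common(1)[0]
    dissenters = sorted(lib for lib, d in digests.items() if d != consensus)
    return consensus, dissenters


# Hypothetical copies of one publication held by three depository libraries.
copies = {
    "Library A": b"Final report, volume 1.",
    "Library B": b"Final report, volume 1.",
    "Library C": b"Final report, volume 1 (altered).",
}

consensus, dissenters = compare_copies(copies)
print(dissenters)  # ['Library C']
```

The more independent copies there are, the harder it is for one compromised institution to pass off an altered document as the original.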
A different approach
This approach is subtly different from the approach of hoping for a technological solution to authenticity. It recognizes that the social issue of trust (along with the existence of multiple copies controlled by different parties) is paramount and the role of technology is secondary. The role of technology is simply to provide tools to help implement that trust. Indeed, if we used this social-trust legal-digital-deposit approach, libraries would still use technical tools (e.g., LOCKSS, PKI, state of the art hash technologies) to validate the integrity of digital files. Combine these tools with trusted institutions, legal deposit, and multiple copies under multiple jurisdictions and you have a fail-safe recipe for ensuring authenticity.
The problem with hoping for a technological solution was clearly articulated back in 2000 by Abby Smith, Director of Programs at the Council on Library and Information Resources.
Interestingly, the scholar-participants suggested that technological solutions to the problem [of establishing the authenticity of a digital object] will probably emerge that would obviate the need for trusted third parties. Such solutions may include, for example, embedding texts, documents, images, and the like with various warrants (e.g., time stamps, encryption, digital signatures, and watermarks). The technologists replied with skepticism, saying that there is no technological solution that does not itself involve the transfer of trust to a third party. Encryption -- for example, public key infrastructure (PKI) -- and digital signatures are simply means of transferring risk to a trusted third party. Those technological solutions are as weak or as strong as the trusted third party. To devise technical solutions to what is, in their view, essentially a social challenge is to engender an "arms race" among hackers and their police.
-- Digital Authenticity in Perspective in "Authenticity in a Digital Environment," Council on Library and Information Resources, Publication 92. (May 2000).
James A. Jacobs, November 3, 2005