Tag Archives: transactions or instantiations

Readers or Consumers? Citizens or Customers?

Are users of government information citizens or “customers”? What would Ranganathan say? Barbara Fister addresses the topic of library attitudes to their communities in her newest post on Inside Higher Ed:

    Books Are For Use, by Barbara Fister, Inside Higher Ed. Library Babel Fish (January 23, 2014).

    I have argued that as we increased access to information (for a high price), we’ve become more parochial (these collections are for authorized users only!) and more global (with our collections owned by distant corporations, not by institutions of higher learning).

We have written about this issue in terms of government information before here at FGI (E-Gov: are we citizens or customers? and Reflections on the end of a year and the beginning of a new year) but, as Fister says “market-based assumptions have so permeated our discipline they seem to be everywhere.” See for example Rick Anderson’s recent Ithaka S+R paper and our response: What’s love got to do with it? further thoughts on libraries and collections #lovegate.

Fister and colleagues are doing something about getting libraries right. They are surveying faculty at institutions of all kinds (particularly those in disciplines in which books matter) to assess support for a group of liberal arts college libraries to undertake founding an open access press. Read her post and forward the survey on to those who can help!

E-Gov: are we citizens or customers?

The idea of E-government initiatives is to make it easier for citizens to transact their business with their governments. This is surely a good idea, but it carries with it several problems including endangering the long term preservation of government information.

Take, for example, Adobe’s new product, The Adobe Digital Enterprise Platform for Customer Experience Management (CEM), which it hopes will attract government agencies:

This sounds attractive in a lot of ways. It promises better customer service, personalized information, and faster access to relevant information.

There are, however, several problems if this approach is used exclusively.

Citizens or Customers?

Adobe says that its software allows agencies to “stop making a distinction between customers and citizens.” Surely all of us would like to know that our “customer experience” with the DMV would be as easy and straightforward (and brief!) as our experiences with the best commercial web sites. Companies like Amazon have made their fortunes not because they offer better products, but because they make it easier to find and buy those products. Wouldn’t it be nice to have government agencies’ web sites work as well as the best commercial web sites? Wouldn’t it be great if government agencies could shrug off the old, cliched unfriendly-to-users image, and create new, user-friendly, customer-centered web sites?

It would, of course. But the problem is that, when we visit an agency web site, we are not always the “customer” of that agency. We are more often citizens seeking information than we are customers engaging in business-like transactions.

And that is the beginning of the problem. Treating citizens as customers can jeopardize our privacy, make it harder for us to find the information we want, and make it harder to preserve government information for the future.

Privacy.
There are, of course, big privacy issues if governments start replacing the dissemination of information with the personalized transactions of e-government. Citizens should be able to search, browse for, read, and use government information without the government tracking and recording each individual’s every search and use. The e-gov interaction between citizen and agency requires just such tracking, however. Adobe, for example, says of its product that it would provide “an instant, unified record of a customer’s interaction with a company or agency, regardless of where or how that interaction is happening.”

There are certainly occasions when citizens want and need to personally interact with a government, but we probably all hope that we don’t have to do this frequently. Going to the DMV to renew a license, or applying for a grant, or filing tax returns are not what we do (or want to do!) every day. These are the exceptions to our interactions with governments.

Most of our interactions with governments are about looking for information that the government has gathered, or compiled, or created as part of its mission. Whether it is about proposed legislation, or existing regulations, or the location of flood-plains, or the population of a city, or the latest economic indicators, or how to manage agricultural pests, the government does not need to know who we are and what we are looking for in order to deliver the information we need.

If governments replace the anonymous delivery of information with e-government “customer-based” services, we will lose our ability to read (or even look for information) privately. (If you are not convinced that privacy is important, see Privacy: “I have nothing to hide”.)

Filtering out what we want.
It seems almost counter-intuitive to say that personalization of web sites would make it harder to find what we need. Surely, personalization is designed to make it easier to find what we want, isn’t it? Take the example mentioned in the NextGov article above: Imagine you live in an area that has just been damaged by a flood or a hurricane and you go to the FEMA home page. Wouldn’t it be great to have the website “know” where you live and immediately show you links to specific services available and relevant to you? It would, but this example is not typical of all our interactions with government and therein lies a problem for relying only on customization.

Those who examine how people use the web have long understood and documented that customization of search results and browsing can do more to limit our understanding than enhance it. See, for example, Nicholas Negroponte or David Weinberger in 1995, and J.D. Lasica or Cass Sunstein in 2001.

And now Eli Pariser has written a book (The Filter Bubble: What the Internet Is Hiding from You) that documents how “the hidden rise of personalization on the Internet is controlling — and limiting — the information we consume.” Pariser says that, when we are seeking information, “personalization” silently filters out relevant content as it tries to predict what we “want.” (See Pariser’s excellent TED talk for more details and examples: Lunchtime listen: Eli Pariser on filter bubbles.)

Do we want governments to favor “customers” who require “personalization” over citizens who are seeking information? I worry that such an approach will likely lead to government web sites that silently filter out relevant search results in an attempt to show you what the government (or Adobe) thinks you want. If we do not know how this process works and if we have no control over whether or not to use this functionality, we will end up not knowing if we have found what would be most relevant to our information needs. This would be bad. As we know, If It Is Too Inconvenient, I’m Not Going After It. Citizens seeking a broad array of information are not the same as customers wanting to buy a single product. Governments delivering a cornucopia of information are not the same as businesses trying to persuade customers to buy the shiniest, newest, highest-profit-generating product.

How do you preserve something that you can’t get?
As we move to the delivery of government information through dynamic web sites (whether “customer” driven or not), we face an increasing problem of preserving that information because we have no direct access to the information that needs to be preserved.

In order to preserve information (even digital information), we require an “instantiation” of that information — a digital object to preserve. In the past, information was instantiated in physical books, pamphlets, maps, journals, posters, and even microfiche, CD-ROMs, and DVDs. Although some government information today is instantiated in PDFs and spreadsheets and even static web pages, government information is increasingly instantiated in databases that are not directly visible to users. When libraries (or GPO) cannot get copies of these databases, they cannot preserve them.

Websites use those databases to present selected information to users who visit web pages or who request information through searches. “Customer-driven” web sites will ensure that two people who make the same query or who visit the same URL will get different information. (To use the FEMA example again, if I live where there was just a flood and you live where there was just a hurricane and we both visit a customer-driven FEMA web site, I’ll get flood information and you’ll get hurricane information.) Web harvesting will be insufficient for preservation under these conditions.
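The harvesting gap described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `RECORDS` dictionary stands in for an agency database, `render_page` and `uncaptured_records` are hypothetical helpers, and `agency.example` is a placeholder URL. None of this describes how any real agency site works.

```python
# Hypothetical stand-in for an agency database; these records are illustrative only.
RECORDS = {
    "flood": "Flood recovery services",
    "hurricane": "Hurricane recovery services",
}

def render_page(url, visitor_profile):
    """Simulate a customer-driven page: same URL, content chosen per visitor."""
    topic = visitor_profile.get("recent_event")
    body = RECORDS.get(topic, "General disaster information")
    return f"{url}\n{body}"

def uncaptured_records(harvested_pages):
    """Return database records that appear in none of the harvested pages."""
    return sorted(
        text for text in RECORDS.values()
        if not any(text in page for page in harvested_pages)
    )

url = "https://agency.example/help"  # placeholder URL, not a real site
flood_view = render_page(url, {"recent_event": "flood"})
hurricane_view = render_page(url, {"recent_event": "hurricane"})
crawler_view = render_page(url, {})  # a crawler arrives with no profile

# The crawler's single snapshot contains neither personalized variant.
missing = uncaptured_records([crawler_view])
print(missing)
```

In this toy model the crawler saves one copy of the URL and misses both records that real visitors were shown. Only a copy of the underlying records would capture all of them.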

The essential problem here is that, if agencies see their information mission as one of processing transactions with individuals rather than one of creating and delivering preservable instantiations of information, it will be difficult if not impossible for digital preservation to be complete or accurate or successful. Gertrude Stein might say of the lack of access to preservable digital objects, “There is no there there.”

Concerns

At FGI, we are not technological determinists. We don’t believe that the existence of software such as Adobe’s CEM will inevitably lead to loss of privacy, information that is harder to find, and the inability of libraries to preserve digital government information. As noted above, there are circumstances where use of such software could yield better service and make it easier for users to find the information they need. We believe that thoughtful management of digital technologies can result in easier access and new functionalities without sacrificing long-term, free, public access to government information.

We also know, however, that technology is political and that sometimes organizations make bad technological decisions for apparently necessary reasons. Our concern is that agencies are under pressures that could easily lead to bad decisions and that software such as Adobe’s CEM could make it easier to make bad decisions. Specifically, agencies are under pressure to reduce the number of government web sites and to streamline existing websites using new technology-based plans to improve their customer service at the same time that budgets are under increasing stress, open government initiatives are being reduced drastically, and GPO is being hit by big budget cuts.

Our concern is that these pressures will result in bad decisions. We worry that agencies will not add new, much-needed functionality to existing web sites, but will instead replace a citizen-centered model with a customer-centered model. We worry that such a substitution will result in a loss of privacy and a loss of information-based functionality in favor of product-based functionality, and that all of this will make it even harder than it already is to preserve digital government information.

We agree with OMB Watch that information is a customer service and hope that agencies will keep this in mind when they make their information-technology decisions. But we worry that hard-pressed agencies will not.

Solutions

We believe that those (including GPO and FDLP libraries) who are interested in preserving government information should address the task of preserving the databases of government information that drive dynamic web sites. The information behind even customer-based transactions needs to be preserved; the transactions themselves do not. A model for this already exists with Census data. The Census Bureau has been able to provide a dynamic, database-driven web site and, at the same time, provide the databases behind the web site as preservable digital objects.

Preserving databases is a more complex task than preserving monographs or PDFs (see for example The Preservation of Databases, by Kevin Ashley, Vine, 34 (2004), 66-70), and agencies that personalize the delivery of information will have to ensure that personal information is kept separate from agency information, but database preservation and privacy protection can be accomplished. But to do so will take an active commitment.
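As a rough sketch of what "preserving the information but not the transactions" could look like, the Python below exports a public table to CSV (an open, software-neutral format) and refuses to export a table of visitor sessions. The table names, the `PUBLIC_TABLES` list, and the `export_table` helper are all assumptions made for illustration; a real agency database and a real preservation workflow would be far more involved.

```python
import csv
import io
import sqlite3

# Hypothetical agency database: one table of public information,
# one table of personal transaction data that should not be archived.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE services (id INTEGER PRIMARY KEY, region TEXT, description TEXT);
    CREATE TABLE visitor_sessions (id INTEGER PRIMARY KEY, visitor TEXT, query TEXT);
    INSERT INTO services (region, description) VALUES
        ('coastal', 'Hurricane recovery services'),
        ('river valley', 'Flood recovery services');
    INSERT INTO visitor_sessions (visitor, query) VALUES ('alice', 'flood help');
""")

# Assumption: the agency has identified which tables hold public information.
PUBLIC_TABLES = ["services"]

def export_table(conn, table):
    """Export one public table as CSV text with a header row."""
    if table not in PUBLIC_TABLES:
        raise ValueError(f"{table} is not marked as public information")
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY id")
    header = [column[0] for column in cursor.description]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(cursor)
    return out.getvalue()

# The archive holds agency information only; the session table is never exported.
archive = {table: export_table(conn, table) for table in PUBLIC_TABLES}
print(archive["services"])
```

The design choice worth noting is the explicit list of public tables: personal data stays out of the archive because nothing is exported unless it has been marked as agency information.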

Putting Citizens First: Transforming Online Government

A report by the Federal Web Managers Council provides some useful suggestions about how to make government information more useful.

Among their findings and suggestions:

There are approximately 24,000 U.S. Government websites now online (but no one knows the exact number).

Only a minority of agencies have developed strong web policies and management controls. Some have hundreds of “legacy” websites with outdated or irrelevant content.

We have too much content to categorize, search, and manage effectively, and there is no comprehensive system for removing or archiving old or underused content.

Agencies should be required and funded to conduct regular content reviews, to ensure their online content is accurate, relevant, mission-related, and written in plain language. They should have a process for archiving content that is no longer in frequent use and no longer required on the website.

The report solicits comments, so I wrote the following to one of the co-chairs, Sheila Campbell:


Ms. Campbell,

I am writing to comment on and make a suggestion for

Putting Citizens First: Transforming Online Government. A White Paper Written for the 2008 – 2009 Presidential Transition Team by the Federal Web Managers Council, November 2008. http://www.usa.gov/webcontent/documents/Federal_Web_Managers_WhitePaper.pdf

May I suggest that, as you work with Federal Web Managers and with Congress on information dissemination requirements, you keep in mind two things:

1. Long-term preservation and usability of and access to even “out of date” government-created information is essential in a democracy. (We need an accurate *record* of government, not just a snapshot of what is current.)

2. The *primary* information role of the government is the creation and initial communication of information; government agencies will need help to ensure long-term preservation of information. (Agencies may cease to exist, or get merged with other agencies, or change their missions, or simply lack funding for providing long-term access to older information. Even the National Archives does not have a mandate to preserve everything that needs to be preserved.)

In keeping these two assumptions in mind, I suggest you promote two simple procedures:

1. Agencies should always, at the time information products are created, instantiate their information in open, preservable, formats (e.g., not proprietary, commercial formats).

2. Agencies should always publicly announce and describe information products and make their digital information available through the Federal Depository Library Program (FDLP) and the Government Printing Office (GPO), where appropriate.  GPO and the more than 1000 FDLP libraries can help preserve your digital information and keep it available for the long-term.

Finally, I realize that the day-to-day requirements of e-government and creating reliable transaction-based information services for citizens may seem to conflict with the long-term usability requirements of instantiating information in preservable, open formats. But there are successful models of doing both. For example, the Census Bureau makes its statistical information available through a transaction-based service, American FactFinder (http://factfinder.census.gov/), while at the same time making its raw data available in an operating-system-neutral, software-neutral format for researchers. There are many archivists and librarians and technical experts who can help agencies with these issues.

Thank you for your thoughtful report. I hope these comments help.

2008 Society of American Archivists Convention: “Citizens in the Dark? Government Information in the Digital Age”

These are my speaker-notes for the presentation, “Citizens in the dark?
Government Information In the Digital Age,” which I gave on Friday
Aug 29, 2008, at the meeting of the Acquisitions and Appraisal Section
of the Society of American Archivists Convention in San Francisco.
The theme of the convention was “Archival R/Evolution & Identities.”

This is not a transcript of what I actually said, but an outline from which
I spoke. There are sentence fragments and inconsistent capitalization and
other less-than-final-draft editing. I hope that this is useful to you in
spite of these distractions.

I do include the points I tried to make and a bit of the verbiage and all
of the links I have.

– Jim Jacobs.


The title of this presentation is:

Citizens in the dark? Government Information In the Digital Age

We are seeing a fundamental change in the way governments communicate
with citizens. These changes are NOT caused by technology, although
they are enabled by technologies. They are driven and determined by
economic, political, and social issues.

The solutions are, therefore, not technological either, although they
will be enabled by technology. The solutions are economic, social,
and political.

Abby Smith, Director of Programs at the Council on Library and Information
Resources, in a CLIR report on “authenticity” in a digital age, summed
this up quite nicely:

Interestingly, the scholar-participants suggested that technological
solutions to the problem [of establishing the authenticity of a digital
object] will probably emerge that would obviate the need for trusted
third parties. Such solutions may include, for example, embedding
texts, documents, images, and the like with various warrants (e.g.,
time stamps, encryption, digital signatures, and watermarks). The
technologists replied with skepticism, saying that there is no
technological solution that does not itself involve the transfer of
trust to a third party. Encryption — for example, public key
infrastructure (PKI) — and digital signatures are simply means of
transferring risk to a trusted third party. Those technological
solutions are as weak or as strong as the trusted third party. To
devise technical solutions to what is, in their view, essentially a
social challenge is to engender an “arms race” among hackers and their
police.

Abby Smith, “Digital Authenticity in Perspective.” in “Authenticity in a
Digital Environment,” Council on Library and Information Resources,
Publication 92. (May 2000).
http://www.clir.org/pubs/reports/pub92/smith.html

“Trust” is a social issue, not a technological one.
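A small Python sketch may make the technologists' point concrete. It uses a plain SHA-256 checksum rather than a digital signature or PKI, so it is a simplified stand-in for the warrants named in the quotation, and the helper names `fixity_digest` and `verify` are my own.

```python
import hashlib

def fixity_digest(data: bytes) -> str:
    """Compute a SHA-256 digest to record alongside a deposited object."""
    return hashlib.sha256(data).hexdigest()

def verify(data: bytes, recorded_digest: str) -> bool:
    """Check an object against a digest that someone recorded earlier."""
    return fixity_digest(data) == recorded_digest

original = b"Report text as deposited"
recorded = fixity_digest(original)  # whoever keeps this value is the trusted party

tampered = b"Report text as altered"

# An honest record detects the change.
print(verify(original, recorded))   # True
print(verify(tampered, recorded))   # False

# But a party that controls both the file and the record can replace both,
# and the check passes again. The digest is only as good as its keeper.
forged_record = fixity_digest(tampered)
print(verify(tampered, forged_record))  # True
```

The arithmetic never fails; what can fail is the custody of the recorded value, which is the social question.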

———————————————————————–
As we look at these technological changes, and at the way governments are
using and not using, adopting and avoiding, and in general coping with
them, I think we all see trends.

The agenda of this conference reflects these trends and changes,
with its theme of Revolution and Evolution and with sessions on
everything from
– digital repositories
– born digital materials,
– digitization
– digital manuscripts
– e-mail
– e-records
– e-discovery
– the “e-tiger”

And of course, representatives from NARA, LC, and GPO are here to discuss
their projects

I assume that all of us are familiar at least in a general way with
many of the difficulties of digital archiving, things like:

– format obsolescence
– media deterioration
– content that is tied to a particular operating system or application
– the need for new kinds of metadata

and

– emulation and migration strategies.

So, I will not cover those today.

What I do want to do is to give you a (perhaps) slightly different
perspective and some (possibly) different ideas and approaches to these
challenges and bring up some issues that I believe have not received
enough attention yet.

THE PAST
———————————————————————–

1. In the past, government information archiving was straightforward

a) We knew and could fairly easily define and identify records

b) We could (again, in a fairly straightforward way) identify
responsibility for record creation, scheduling, retention, deposit,
preservation, access, etc.

c) We could establish procedures to get things done. predictable,
definable, etc.

So… in the past, we had a pretty clear path of preservation:

– of what we wanted to preserve and
– of how to preserve it and
– of who was responsible at each stage from record creation through
retention and disposition and preservation.

We could define and identify what we wanted to preserve and seek and
possibly fund the preservation.

We may not have always been 100% effective, there may have been failures,
gaps, short-funding, recalcitrant agencies, mistakes, etc. but we at least
knew what we were doing and where the gaps were…

THE PRESENT
———————————————————————–
A lot has changed, perhaps everything. Here are four areas
of fundamental change that affect our ability to archive the
complete historical record of governments:

1) WHAT. While to some extent we can still define and identify records,
the job of doing so is much less clear. There may be some things that
we cannot get a hold on to define as records. There may be things that
are part of the record which the govt does not even possess, or for
which it lacks licensing or copyright permission to possess or copy.

2) WHO. Even to the extent that we can identify (broadly) what we want
to preserve, it may be hard to identify who is responsible and
difficult to create adequate, implementable, schedules for
preservation.

3) HOW. Even if we can do all that, digital preservation itself is
difficult and it is very hard to move from a quick-moving,
service-oriented, bureaucratic, day-to-day, digital environment, to an
environment of digital preservation.

4) ACCESS. While preservation without access is not preservation at all,
“access” is a very different process than preservation.

It seems to me that the very processes that make it *easier* for a
current end-user to find and use digital information make it *harder*
for the archivist to preserve that same information and ensure its
usability in the future.

EXAMPLES
———————————————————————–

Let’s look at some examples

EMAIL (1)
————————————————————————
E-mail certainly provides good examples of the “recalcitrant agency” problem.
But I want to emphasize some other issues that will plague us even if we
solve that one.

An article in Technology Review gave several good examples.

One related a story that Allen Weinstein tells about how he discovered
in his FBI files a newspaper clipping with a note hand-written on it by
J. Edgar Hoover.

If that same communication happened today, it would most likely happen
in an email with, perhaps an attachment of the article, or worse, a
link to the article.

Even if we had in place all the new laws and regulations that are being proposed
to ensure that we can actually save email, would we have a complete record? Or
would we have a partial record with a key part missing? And would we be able
to find or identify that part? Would we be permitted to archive it?

Talbot, David. “The Fading Memory of the State.” Technology Review, July
2005.
http://www.technologyreview.com/printer_friendly_article.aspx?id=14583&channel=infotech&section=.
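To make the "partial record" problem concrete, here is a minimal Python sketch that parses a saved message and lists what it carries with it versus what it merely points to. The message text, the addresses, and the `external_dependencies` helper are invented for illustration; only the standard library `email` and `re` modules are used.

```python
import re
from email import message_from_string
from email.policy import default

# An invented message: the note survives, but the article is only a link.
RAW = """\
From: director@agency.example
To: staff@agency.example
Subject: See this article
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8

Note the third paragraph: https://news.example/story/123
"""

def external_dependencies(raw):
    """Return (attachment filenames, linked URLs) for one saved message."""
    msg = message_from_string(raw, policy=default)
    attachments = [part.get_filename() for part in msg.iter_attachments()]
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""
    links = re.findall(r"https?://\S+", text)
    return attachments, links

attachments, links = external_dependencies(RAW)
print(attachments)  # nothing travels with the message
print(links)        # the key part of the record lives somewhere else
```

Saving this message saves the handwritten-note equivalent but not the clipping. Every URL in the second list is a separate acquisition (and permission) problem.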

EMAIL (2)
————————————————————————
Another problem with email is the difficulty in knowing what to preserve.
The simplest algorithm for preservation of email is to preserve everything, but
that means preserving many trivial, unimportant messages that would not
normally be scheduled for retention in any rational universe.

RECORD OF INFORMATION USED IN DECISION MAKING
————————————————————————
Another example from that same Technology Review article:

The mistaken bombing in 1999 of the Chinese embassy in Belgrade.
U.S. officials blamed the error on outdated maps used in targeting.

Today’s planners would use GIS software to zoom and pan, and run
calculations about the topography to make a targeting decision.

Would the software preserve the decision making process?

There are layers of challenges here:
– the data used (spatial data, databases of locations, topography, etc.)
– the software used to analyze and use the spatial data
– the code behind that software that has its own algorithms for implementing
particular user-analyses
– the actual use by the end-users, the trail of how they used the software
to analyze the data

These are difficult things to archive!

When decisions are based on computer models working on dynamic databases,
will we be able to preserve for future historians the state of the database
and the algorithms built into the models?

PUBLIC DOCUMENTS
————————————————————————
When we think about the preservation of the historical record, we have to
include public documents as well as private communications and decision-making
records.

In the past, “public documents” meant “publications” that were widely distributed
to the public and depository libraries.

Today, it means web sites.

As you probably know the Library of Congress recently announced a big
project with several partners to crawl the .gov domain at the end of the
current presidential administration. This will harvest a lot of digital
content that might otherwise disappear with the change of administrations.

http://www.loc.gov/today/pr/2008/08-139.html

An article about this project discusses the software being used and some
of the issues.

Quint, Barbara. “Consortium–Minus NARA–Archiving Bush Administration
Websites.” Information Today NewsBreaks, August 28, 2008.
http://newsbreaks.infotoday.com/nbReader.asp?ArticleId=50486.

See also a discussion about NARA’s role at the ArchivesNext blog:

http://www.archivesnext.com/?p=137

And some more on NARA here:

http://freegovinfo.info/taxonomy/term/189

Web-harvesting is not a definitive solution, though.

When we compare web harvesting with active deposit by the government of
documents in depository libraries we can get a glimpse at the scope of
the preservation problem we face.

Web harvesting puts the onus on the harvesters.

It releases the government from the obligation of actively depositing information.

It is a step back in time. It means that archivists and librarians have
less control on their own selection and acquisition and that agencies have
less responsibility.

WHERE IS GOVINFO? (ON THE WEB…?)
————————————————————————
A study by the Center for Democracy and Technology late last year
examined

“Why Important Government Information Cannot Be Found Through
Commercial Search Engines”
http://www.cdt.org/righttoknow/search/

The reasons for the failure of search engines to adequately index
government web sites are the same reasons that make it difficult for
web harvesting to be successful. If we can’t find the information,
we cannot harvest it.

We cannot preserve what we cannot save.

BEYOND .GOV AND .MIL
————————————————————————
One problem (and a rapidly growing one) is that not all government information is
on the .GOV and .MIL domains.

Here are some examples:

– TWITTER.

twitter.com is the very popular “micro blogging” site, where people post
very short entries about what they are doing, where they are having lunch,
and so forth. Did you know that many government agencies twitter?
Among them:
– the White House Communications Office
– the Department of Health & Human Services: Office on Women’s Health
– and more than 60 others.
– 20 or 30 members of Congress

http://twitter.pbwiki.com/USGovernment
http://www.sourcewatch.org/index.php?title=Members_of_Congress_who_Twitter

– YOUTUBE

The military is actively using YouTube to post videos

Zavis, Alexandra. “U.S. military offers up its side of the Iraq war on
YouTube.” Los Angeles Times, May 1, 2007, A-4.
http://articles.latimes.com/2007/may/01/world/fg-cyberwar1

One military youtube channel says that:

“Video clips document action as it appeared to personnel on the ground and
in the air as it was shot.”
http://www.youtube.com/profile?user=MNFIRAQ

– NASA posts videos on YouTube and iTunes

– NOT JUST FEDERAL…

I’m mostly giving examples of the federal government today, but the trends
stretch across all levels of government.

for example:

the PRINCE WILLIAM COUNTY SERVICE AUTHORITY IN VIRGINIA is posting
videos on YouTube.
http://www.fcw.com/print/22_25/technology/153418-1.html?type=pf

– the STATE OF CALIFORNIA has a youtube channel
http://www.youtube.com/californiagovernment

and GOVERNOR SCHWARZENEGGER posts to twitter.
http://twitter.com/schwarzenegger

HOUSE MEMBERS
– According to GovTech magazine, more than 100 House members have
multimedia pages and YouTube links on their Web sites
http://www.govtech.com/gt/241670

FLICKR / LC

– FLICKR. I’m sure you read about the success the Library of Congress
has had by posting photos on flickr.com
http://www.flickr.com/photos/library_of_congress/

QIK.com

qik.com allows users to stream video live from their cell phones.

Congressman John Culberson of Texas is a big fan and has his own qik
channel where he streams and posts interviews, meetings and more.
http://qik.com/johnculberson

Is this official government information? or political? or both?

While these examples may strike you as “not official” or “non-governmental”
the point here is that the environment for distribution of information is
changing rapidly and we must keep up with the changes. If Culberson’s qik
site is not “official,” for example, we need a way of appraising it as such
and a way of differentiating it from the next channel that appears
that *is* official.

HYBRID SITES WITH MIXED MESSAGES
————————————————————————
The military is providing us with lots of examples of archiving problems.
These involve issues of provenance, use-rights, copyright, and just plain finding
and getting information.

This is an extension of what I call the “copyright poison-pill” in which
copyrighted material appears in an otherwise non-copyrighted government
publication and creates confusion over the rights of libraries and archives
to save, reproduce, and display any or all of such materials. We see this
today in the way the Google book project has blocked access to most
government publications because they “might” be covered by copyright.

DEPARTMENT OF DEFENSE
————————————————————————
The DOD “official website of Multi-National Force – Iraq” is a .com site,
not a .mil.

http://www.mnf-iraq.com/

There we find
– links to other commercial sites with streaming video without download
links

– web links designed to be clever (with javascript and hidden urls)
but which add an additional level of difficulty in identifying and
bookmarking links and downloading pages.

Another DOD site that provides video clips is .mil but a lot of the content
is actually hosted by .coms

DODvClips.mil

– While this is a .mil domain, it is actually operated by the Intel
Corporation and is hosted and maintained by a commercial organization
known as The FeedRoom or Globix Corporation

– While you can download video, you are bound by an END-USER LICENSE
AGREEMENT, in which Intel claims all proprietary rights to the content
and videos on the site.

– Those who try to harvest the content from this site will find
that the site instructs robots that they must not save copies of
videos or even web pages.
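For those who have not looked at how a site "instructs robots," the sketch below uses Python's standard `urllib.robotparser` to show how a well-behaved harvester would read such instructions. The robots.txt contents here are assumptions chosen to illustrate a blanket exclusion; I am not reproducing the actual file from any site, and `video.example` and `example-archive-bot` are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt contents for illustration: a blanket exclusion.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# For contrast: an empty Disallow value permits everything.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

def may_harvest(robots_txt, url, agent="example-archive-bot"):
    """Ask whether a polite crawler may fetch this URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

clip = "https://video.example/clips/briefing.wmv"  # placeholder URL
print(may_harvest(BLOCK_ALL, clip))  # False: a compliant harvester must skip it
print(may_harvest(ALLOW_ALL, clip))  # True
```

Two lines of text on the server are enough to keep every compliant harvester away from the videos and the pages that describe them.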

HORMUZ
————————————————————————
Here is another example of DOD video problems.

In January of 2008, the Pentagon broadcast a video of the “Strait of
Hormuz incident” in which an unidentified voice says, apparently to a US
battleship “You will … explode.”

It was much in the news. (I found more than 1000 items on LexisNexis over
about a 4 week period.)

In January, two defense department web sites linked to a video of the
incident and one labeled it as “From Defense Department Video.” One
of those pages still exists.
http://www.defenselink.mil/transcripts/transcript.aspx?transcriptid=4116

But by June 2008, that url linked to a chef doing a promo for his show
called “Grill Sergeant” and searches for “hormuz” turned up zero hits.

Last week, when I checked again, the link I got was to a 15 second ad for
“the pentagon channel” that said:
“Embrace accountability for all that you do — for everything in your
area of responsibility.”

A shorter version of the Hormuz video, complete with the “you will explode”
quote, is available here:
http://www.defenselink.mil/dodcmsshare/briefingslide%5C320%5C080107-D-6570C-001.wmv

Background information, including why it is hard to know what goes missing, is here:
Documenting the Government — Strait of Hormuz edition
http://freegovinfo.info/node/1567

But, this is more than the tale of a broken link.

– The link was not to a .mil site, but to a commercial site (FEEDROOM again).

– The video was provided only as a streaming video and no download was
available.

So, here we have a critical piece of the historical record, with no
indication of who filmed it or edited it or posted it or took it down.

And we have no easy way to preserve this video and no guarantee that
anyone will or can take responsibility for doing so.

WHAT MAKES A WEBSITE OFFICIAL?
————————————————————————
How do we determine what makes a website official?

One document I found is explicit, but vague. It says that

an “official website” includes any website hosted on the .mil domain,
but also “any website PUBLISHED or SPONSORED by a military command but
hosted on a commercial server.”
http://www.mnf-iraq.com/images/stories/For_The_Troops/bloggers_policy.pdf

Unfortunately, this creates a cascade of problems.

– Will archivists overlook these sites because they are not .mil or
.gov sites?

– Upon finding them, can we identify who is actually responsible for
the content? (Were they Published or Sponsored by the government?)

– If we find the site and identify it as something that is
government-generated, are we allowed to archive it?
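
The first of these problems can be made concrete with a small sketch in
Python. It assumes a harvester that decides what to collect by looking only
at domain suffixes; the function name and the suffix list are my own
illustration, not anyone's actual selection policy.

```python
from urllib.parse import urlparse

# Assumption: a crawler that treats only these suffixes as "government".
GOVERNMENT_SUFFIXES = (".mil", ".gov")


def needs_manual_appraisal(url: str) -> bool:
    """Return True when the host is NOT under a .mil or .gov domain.

    Such a URL would be skipped by a suffix-only crawl, so a person has
    to decide whether it is nonetheless an official site.
    """
    host = (urlparse(url).hostname or "").lower()
    return not host.endswith(GOVERNMENT_SUFFIXES)


if __name__ == "__main__":
    for url in (
        "http://www.mnf-iraq.com/",
        "http://www.defenselink.mil/transcripts/transcript.aspx?transcriptid=4116",
        "http://www.epa.gov/multimedia/",
    ):
        print(url, needs_manual_appraisal(url))
```

A check like this flags the MNF-Iraq site for review, but it cannot answer
the second and third questions: a .mil address operated by a commercial
host passes the test without telling us who is responsible for the content
or whether we may archive it.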

STANDARDS BY ANY OTHER NAME
————————————————————————
One of the biggest problems digital archivists face is that of file
formats. When formats are tied to particular software or operating
systems or operating environments, it creates barriers to preservation.

“Standards” that work well for the end-user (and the service provider)
one year may be exactly the wrong standard for the archivist.

We can see an example of user-friendly, archive-unfriendly at the EPA.

EPA
————————————————————————
The EPA has a nice site that has videos, audios, podcasts, and more.

But they have chosen the “Flash based” video format as a “standard.”
This is indeed a common format for streaming video, but it adds additional
layers of difficulty for anyone wanting to preserve the videos by
downloading them.

http://www.epa.gov/multimedia/

Feds set sights on small screen
By Wade-Hahn Chan FCW August 11, 2008
http://www.fcw.com/print/22_25/technology/153418-1.html?type=pf

DISAPPEARING PUBLICATIONS
————————————————————————
There have not been any substantial, comprehensive studies of what gets
withdrawn from the web by government agencies.

(For an example of one attempt, see “Chronology of Disappearing
Government Information,” data collected through May 8, 2002, compiled by
Barbara Miller for the ALA/GODORT Education Committee with special
assistance of Karrie Peterson.
http://www.library.okstate.edu/Govdocs/chronchart.doc )

We are left with anecdotes about things disappearing or being
withdrawn and random discoveries of something here today and gone
tomorrow.

Anyone who works with government agencies for very long will encounter,
as I have over the years, as many “policies” as there are individuals
who administer those policies.

So we sometimes see agencies that are very careful about keeping older
documents online and others that express the opinion that “No one wants
last year’s (or last month’s) report.”

Here is a recent example:
————————————————————————
AT: http://www.mnf-iraq.com/

We find a link to issue 6 of a “newspaper” but no links to, or indication
of, earlier issues being available.
http://www.mnf-iraq.com/images/Unit_Newsletters/080826_aam_al-binaa_english.pdf

E-GOV
————————————————————————
I have left for last the concept of “E-government” — not because it is
less important, but because it is emerging and something to watch.

E-government is intended to transform the way government communicates with
citizens and business and itself.

To the extent that it creates communications that are faster, more
accurate, and more convenient, it is a Good Thing.

But, for us, it, again, fundamentally transforms the role of government.

In the past, the role of government in information dissemination ended at
the point of dissemination. Governments would collect and create and
assemble and edit and publish information products and distribute them to
the public and to libraries.

But today, the government is taking on a new, continuing role.

With e-government, governments are saying we must go to them to get our
information today, and tomorrow, and forever.

As governments move to e-government, we are going to increasingly see
government information provided as “transactions” as opposed to
“instantiations.”

Here is a simple example:

I can call 411 and get a phone number: that’s a transaction and is a
big improvement over having to locate and use a bulky telephone book
which may not even be current.

Lots of kinds of government information lend themselves to this kind of
transaction delivery and make for better, more accurate, more timely
service.

But, if I am a journalist and I want to look at a directory of all
employees in a department, or if I’m an historian and want to see who
was in a particular office last year (or 10 or 50 years ago), or if I’m
a demographer and I want to do a surname or given-name analysis of an
agency’s employees, then a current, up-to-date
one-transaction-at-a-time system won’t help me at all. I need an
instantiation of the information from one or more time periods.
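
The difference can be sketched in a few lines of Python. Everything here
is hypothetical: the office, the names, and the years are invented, and
the functions only illustrate the two kinds of access.

```python
# Hypothetical directory snapshots (instantiations), one per year.
# All names and offices are invented for illustration.
SNAPSHOTS = {
    2006: {"Office of the Director": ["A. Rivera", "B. Chen"]},
    2007: {"Office of the Director": ["B. Chen", "C. Okafor"]},
    2008: {"Office of the Director": ["C. Okafor", "D. Silva"]},
}


def lookup_current(office):
    """Transaction: answers one question, and only about the latest state."""
    latest_year = max(SNAPSHOTS)
    return SNAPSHOTS[latest_year].get(office, [])


def staff_in_year(office, year):
    """Historian's question: possible only if a dated snapshot was kept."""
    return SNAPSHOTS.get(year, {}).get(office, [])


def departures(office, earlier, later):
    """Who was listed in the earlier snapshot but not in the later one?"""
    before = set(staff_in_year(office, earlier))
    after = set(staff_in_year(office, later))
    return sorted(before - after)


if __name__ == "__main__":
    office = "Office of the Director"
    print(lookup_current(office))
    print(staff_in_year(office, 2006))
    print(departures(office, 2006, 2008))
```

If an agency offers only the equivalent of lookup_current, the last two
questions cannot be answered by anyone outside the agency.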

THE CENSUS
————————————————————————
Let me give you a concrete example: The Census

Every 10 years the federal government takes a population and housing census.

Through the government’s American FactFinder web site, the Census Bureau
delivers a transaction-based service where you can find census facts and
tables.
http://factfinder.census.gov/

But, in addition, the Bureau makes the raw, anonymized census data
available for downloading and has deposited the data in the largest social
science data archive in the U.S. (ICPSR at the University of Michigan).
http://www.icpsr.umich.edu/cocoon/ICPSR/SERIES/00166.xml

What this means for us is that we can preserve the census. There is an
instantiation of the census in a format that we can preserve over time.
That instantiation is what is behind American FactFinder, but it is a
preservable form of that information.

This means that we can preserve the data
– without crawling a web site
– even if the Census Bureau budget is cut and it takes data offline

It also means that the raw data are available for uses and re-uses
beyond the transactions that the Bureau makes available.

This is a model for making government information available and preservable
and usable and re-usable for the long-term.

It is important to note that this model benefits users today, not just in
the future. Transaction-interfaces offer a limited number of possible uses
of the underlying information. When the raw data are available, users
can analyze, use, and re-use the data in many ways not provided by the
transaction-interface.

Clifford Lynch has written eloquently about the need that scholars
have to get access to the raw information in the realm of scholarly
literature (Clifford A. Lynch, “Open Computation: Beyond
Human-Reader-Centric Views of Scholarly Literatures,” Open Access: Key
Strategic, Technical and Economic Aspects, Neil Jacobs Ed., Oxford: Chandos
Publishing, 2006, pp. 185-193.).
http://www.cni.org/staff/cliffpubs/OpenComputation.htm

Governments may not like the idea of doing this, though. They may want to
keep control and may want to do so under the guise of “accuracy.” (E.g.,
“Last year’s phone book isn’t accurate anymore. We don’t want copies of it
out in the world confusing people.”) Indeed, we hear that very argument
from some who still argue that the people should not have free open access
to Congressional Research Service reports. Local governments in particular
may also see information as an “asset” and wish to charge for access or
use of it.

And the private sector may not like the idea of raw information being
freely distributed because they want to control access so they can charge
for it. (Indeed we see something like that with CRS reports!)

It may be a challenge to get governments to understand this concept and,
once they do, to embrace distribution.

WHAT CAN WE DO?
————————————————————————

There is no single solution. And we should not expect any single entity or
agency or archive to “solve” the problems.

We need a multifaceted approach to preserving the historical record.

Here are some general approaches that I hope will guide you in your
local environments.

1) Do you have influence over the creation of information? Then make
sure that the information is created with preservation in mind. Talk
to creators about providing an instantiation of information in addition
to transaction-based access. Advocate free and open access. Insist on
open formats (e.g., ODF http://opendocument.xml.org/) rather than
proprietary formats.

2) Identify your partners in your organization.
– IT depts. They may have tools that will help you do your job. They may
be able to do things differently that would enable preservation, but they
haven’t thought of them.

– Managers who want information access in the near term. Managers may not
think of long-term access and usability, but they usually do understand the
benefits of having their own information usable in the near-term (1 to 5 years).
If you can *guarantee* something will be usable in 5 years, you can probably
guarantee that you are going to be able to preserve it for longer periods.

3) Identify other partners
– The Internet Archive is doing a lot right now to preserve information
on the web and you can work with them to have them do preservation for you.
http://www.archive.org/index.php
http://www.archive.org/create/
http://www.archive-it.org/

– Look for others locally and regionally with whom you can collaborate.
Universities may want to collaborate with governments and vice-versa, for
example.

4) Are you a partner?
Even if you are in an archive that clearly has no responsibility for
preservation of (say) the records of an agency, you may have the
opportunity (and obligation) to collect information that is relevant to
and even part of the complete historical record. That opportunity may
come from your own archival mandates (you have personal records of a
government official, soldier, or elected official) or from your
constituency (users at a university who need the complete record for
historical analysis).

The library model of having many copies dispersed over many institutions
has worked well for preserving and authenticating published materials, and
it may work in the archival environment as well when we are no longer tied
to a single copy of record. Software already exists to help with this:
Lots of Copies Keep Stuff Safe
http://www.lockss.org/lockss/Home
http://www.clockss.org/clockss/Home
http://lockss-docs.stanford.edu/
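
To show why many dispersed copies help with authentication, here is a
simplified sketch in Python. It is NOT the LOCKSS protocol, which is far
more careful; it only illustrates the underlying idea that institutions
can compare checksums and spot a copy that disagrees with the rest. The
function names and the sample holders are my own invention.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def digest(data: bytes) -> str:
    """Return the SHA-256 checksum of one copy, as a hex string."""
    return hashlib.sha256(data).hexdigest()


def find_damaged_copies(copies: Dict[str, bytes]) -> List[str]:
    """Return the holders whose copy disagrees with the most common checksum.

    Simplification: the most common checksum is assumed to be the good
    one. With a tie, whichever checksum was seen first wins.
    """
    if not copies:
        return []
    digests = {holder: digest(data) for holder, data in copies.items()}
    majority_digest, _count = Counter(digests.values()).most_common(1)[0]
    return sorted(h for h, d in digests.items() if d != majority_digest)


if __name__ == "__main__":
    # Hypothetical holders and content.
    copies = {
        "library-a": b"annual report, final text",
        "library-b": b"annual report, final text",
        "library-c": b"annual report, altered text",
    }
    print(find_damaged_copies(copies))
```

With a single copy of record there is nothing to compare against; with
several independent copies, damage or alteration in one of them shows up.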

I want to close with a quote from that same Technology Review article
that I quoted earlier.

In it, computer scientist Robert F. Sproull of Sun Microsystems
Laboratories, who chaired a National Academy of Sciences panel that
advised NARA, said:

“If you become obsessed with getting the technical solution, you will
never build an archive.”

The challenges we face are as much political, sociological, and
economic, as technological.

LCSH heading change from “govt publications” to “govt information”

My pal, Jenna Freedman, the Lower East Side Librarian, started subscribing to the Library of Congress Subject Heading Weekly list (probably out of love for Libraries’ great good friend and cataloger extraordinaire Sandy Berman!). This week she came across a strange one that I hope our readers can expound on in the comments, especially since I’m not a cataloger.

150 Electronic government information [May Subd Geog]
* 450 UF Electronic government publications [EARLIER FORM OF HEADING]
* 550 BT Government publications

Is this LC’s documentation of a move away from government publications as the instantiation of our government’s work toward e-government and government information as transaction? Should we be worried about this change in the heading? Is this just semantics? Is there a cataloger in the house?

The other one that I found strange was:

(C) 150 Global cooling [Not Subd Geog]
450 UF Cooling, Global
550 BT Global temperature changes

Is that some sort of Newspeak?!?!
