OMB policy on posting information sparks debate. By Jason Miller, Government Computer News, 12/23/05.

The Office of Management and Budget’s new policy asking agencies to improve how they disseminate public information is at the heart of a larger battle over how much categorization is needed to make government information publicly accessible.

Patrice McDermott, deputy director for government relations for the American Library Association, said:

“Essentially, what OMB appears to be saying is, for information you want to make publicly accessible, if you put it on your Web site or post it electronically, you have fulfilled all requirements of law….That is not true. That is not the spirit or intent of the law.”

  1. I’m not at all surprised that the OMB official who made the statement below chose to be anonymous:

    “This policy certainly meets the spirit and intent of the E-Government Act and capitalizes the extraordinary advances in search technology including the way they crawl and index information preparing it for retrieval over the Internet,” said an OMB official, who requested anonymity. “To say that this policy ‘only’ requires agencies to post information ignores the great advances in search technologies over the past two or so years and the considerable ongoing research.”

    I don’t think I’m being nasty when I say that only someone who doesn’t search for government information on the net could make such a statement. I DO think that the statement is somewhat true when one is looking for a known item and has the agency-issued title handy. The statement is kind of true if you don’t have an exact title, but do know the agency (or preferably the sub-office) AND know the DNS name of the server it is likely to be on.

    But for general searches, when you only know a little or when you want to know what the government has on a given topic, good luck! I think the problem has increased since state and local governments have been allowed to use the “.gov” domain, which once belonged only to federal government sites. This situation has made it impossible for FirstGov’s “restrict to federal sites” function to work properly. Any search for federal information brings up state and local information.

    Another problem with relying on general search engines is that most of them cannot sort by date, so you may get a superseded copy of a document when you want the most current version, and the superseded copy isn’t clearly labelled as such. But if you remove superseded documents from the web, you thwart historical and current affairs research (e.g., how have human rights in Egypt changed from 2002 to 2004?).

    Mere web posting also makes it difficult to determine when new material has been added to a web site unless that site has an RSS feed, which is still new to many Americans.

    There’s a whole paper’s worth of material on why general search engines won’t really meet the needs of government information seekers. Hopefully someone will start writing soon.

    “And besides all that, what we need is a decentralized, distributed system of depositing electronic files to local libraries willing to host them.” — Daniel Cornwall, tipping his hat to Cato the Elder for the original quote.

  2. I applaud Clay Johnson’s new common-sense policy. I for one don’t want my tax dollars going to a government-funded jobs program for librarians and information architects who don’t want to change their business model to account for new technology. Also, I use Google to find Government info all the time, and have never had a problem.

    That said, I’ll agree with the previous poster that some Government web sites do not have good search engine placement. The solution for that, however, is better web site design and search engine optimization (SEO), not more meta-tagging and cataloging.

    Meta-tagging is a waste of time and money…in the case of government info, MY money.

  3. Hi,

    If you read the actual policy (http://www.whitehouse.gov/omb/memoranda/fy2006/m06-02.pdf), you’ll see that OMB is not framing this as an either-or proposition. They are merely stating that agencies should leverage commercial search technologies appropriately when disseminating information to the public – which not all of them are currently doing well, by the way. The policy also states that where search technology is inadequate, agencies should continue to use formal information models such as taxonomies and metadata element sets to categorize government information to enable discovery. By the way, this position is consistent with the findings from the Efficient and Effective Information Sharing study (http://www.cio.gov/documents/EEIRS_RFI_Response_Analysis.pdf) commissioned by GSA.

    The recommendations of the ICGI Categorization of Government Information (CGI) Working Group were considered by OMB but not used. I’m not sure why, but I suspect it’s because they were biased and slanted. The chair of this group (http://www.gils.net/eliot.html) wanted to use this group’s recommendations to resurrect the hopelessly obsolete GILS system, which he invented. The media has been calling GILS obsolete since 1997 (http://www.gcn.com/16_27/news/32197-1.html). Also, NIST has withdrawn the FIPS on which GILS is based for the same reason (http://www.fcw.com/article89673-07-25-05-Print).

    The ICGI CGI WG was also co-opted by the librarian community, which wants to manually categorize and tag every government web page – all 1.02 billion of them. This is a colossal waste of taxpayer dollars, given the search technology available. Of course, it would indeed provide a government-funded jobs program for librarians, and give them a convenient excuse not to update their business practices to leverage 21st century technology.

  4. Thanks for dropping by and joining the conversation. Thanks also for the relevant links to the information you cite.

    However, there is one startling claim that you make that you do not source:

    The ICGI CGI WG was also co-opted by the librarian community, which wants to manually categorize and tag every government web page – all 1.02 billion of them.

    Where did you find this bit, and what do you mean by the “librarian community”? I like to think of myself as well-versed in the literature of librarianship and have many contacts in the documents community. I’ve never heard any librarian or read any article that suggested that every government web page should be manually tagged by librarians. Without a source, I don’t think your statement is a fair characterization of the belief of either the community of documents librarians, or of librarianship as a whole.

    All of the suggestions I’ve seen for government-wide metatagging depend on document creators to enter the tags. This would be the most cost-effective method. I accept GILS as basically dead, but mostly because there were too many required data elements, not because metatagging is a bad idea.

    Speaking strictly for myself, my main concerns are for notification and preservation. How do we know when an agency issues a new document, and how do we keep the agency from deleting a document if it becomes embarrassing for the government? A policy that treats simple web posting as sufficient doesn’t address either question. A document can be posted to a sub-page somewhere and be considered “published” even if there is no other notice outside the agency. Search engines would only pull up the document when it was asked for, usually by title. And if you don’t know the title of the document, you’re a lot less likely to find it, as shown by the example in another comment.

    In terms of preservation, too, a policy of simple web posting fails. Agencies are understandably oriented towards the short term. Yesterday’s quarterly report is old news to them, but gold for researchers tracing policies and accomplishments. The last Administration’s reports that don’t support the current government are bad for the agency’s health, so off the server they go. A simple take-down letter to the Internet Archive (www.archive.org) takes the document down there as well. In a simple agency posting model, or in a centralized gov’t server like GPO’s Future Digital System, that’s the last we see of the document.

    But not when the document is deposited in a library, an institution where access, preservation, and privacy are our watchwords. We want the documents so we can continue to provide an accountable record of the government’s actions, research, and policies. You can have the web pages.

    “And besides all that, what we need is a decentralized, distributed system of depositing electronic files to local libraries willing to host them.” — Daniel Cornwall, tipping his hat to Cato the Elder for the original quote.

  5. The anonymous comment (Tue, 2005-12-27 21:59) that says that meta-tagging is a waste of time and money is simply wrong for several reasons.

    1. The best search retrieval we have today (Google) is good, not because it relies on keyword indexing of full text, but because it ranks results using human-created metadata (links that people create from one web page to another). Human indexing (“meta tagging”) is nothing more than a concerted, consistent, organized, professional approach to this process.

    2. Keyword indexing can be very powerful, but does not work for all searches. For example, suppose you want to find the size of the National Guard and run this Google search:

    size “national guard”

    You won’t see the document you need: Selected Manpower Statistics, published by the Statistical Information Analysis Division (SIAD), Directorate for Information Operations and Reports (DIOR).

    3. OMB shouldn’t be seeing the answer to this question as either/or but as both/and. We shouldn’t be thinking “either search or metadata”; we should be thinking of both: smart indexing with search and ranking technologies, enhanced by human-assigned indexing.

    4. The controversy is partly over whether or not OMB is accurately implementing the law (the E-Government Act of 2002) and the recommendations from the Interagency Committee on Government Information that was established by the law. As the article says, one federal official says that OMB ignored the committee’s suggestions, and Sen. Joseph Lieberman, the bill’s author, said there are “serious concerns about whether OMB’s new guidelines comply with the act’s requirements.”

    5. It is important to remember that “metadata” is not a luxury, but essential information content. For example, geospatial information is worthless without rich metadata. For an example of how metadata is useful for information location and retrieval as well as information utilization, see “The Technology Behind the New Geodata.gov and the Non-Technology Challenges Ahead” by Adena Schutzberg and Joe Francica (Mar 08, 2005).

    6. If we are really concerned about how government is spending its time and money, we should certainly be concerned about ensuring that government information (paid for with tax money) is available without cost to taxpayers. One simple way to ensure that is to turn the information over to FDLP libraries, a simple step that GPO has, so far, refused to take.

    7. The reality is very different from the characterization in the anonymous comment of “a government-funded jobs program for librarians and information architects who don’t want to change their business model to account for new technology.” In reality, those of us who advocate digital deposit are eager to use advanced technologies to create new and different views of government information once we have digital documents to work with. This is very different from the one-size-fits-all approach that GPO (and now OMB) envisions.

