Cabbage statistics and google bombs

The recent govdoc-l posts regarding cabbage statistics, selection and google piqued our collective FGI interest (thanks Chuck, Carlos and Amy!). I thought it might be interesting to share our thoughts and connect these threads together.

After reading the above govdoc-l posts, Daniel, Jim and I had a spirited discussion about the Congressional Record, databases, and google. We explored whether google indexes the CR (it turns out that, for cabbages anyway, google gets to at least some of the “correct” data), how much un-indexed information is stuck in govt databases, and the fact that anyone who relies only on google will miss out on govt info in the “dark web.”

What really resonated in the original thread was that Chuck was willing to deselect that item because cabbage is not of interest to Illinoisans (sorry Chuck, but I’m glad Carlos pointed out that that decision would be short-sighted). But also, we found it interesting that the google searches all hit on a site at Cornell rather than a gpoaccess purl, and even the purl pointed to the Cornell site. And this leads to the main gist of our internal discussion.

Rather than shrugging our collective shoulders at the ubiquity of google, we should be using google’s indexing strategy and people’s search inclinations to our advantage (google docs bombing, if you will). The article that Amy referenced (“Agencies are working with Google to boost rankings and increase traffic” by Trudy Walsh) discussed how govt agencies were starting to do just that (if inadvertently!).

We need to create our own govt info sites, remix govt info, and open up our catalogs, or better yet use tools like scriblio, a not-yet-released WordPress plugin that converts marcXML to blog posts (check out the Cook Memorial Library in Tamworth, NH for a look at scriblio, née WPopac). And once we have govt info in as many indexable places as possible, we (meaning docs librarians but also librarians in general) need to make sure that all those index points lead the info-curious back to our analog collections. In other words, we need to leverage the google algorithm and users’ search proclivities to position ourselves in the online environment.
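For anyone wondering what "converting marcXML to blog posts" might look like in practice, here is a rough Python sketch. To be clear, this is not scriblio's actual code (scriblio is a WordPress plugin, and I haven't seen its internals). The sample record, the placeholder example.org link, the `record_to_post` function and the shape of the resulting "post" are all made up for illustration. The only things taken from the MARC standard are the MARCXML namespace and the field tags (245 for title, 650 for subjects, 856 for the online link).

```python
import xml.etree.ElementTree as ET

# Standard MARCXML ("MARC21 slim") namespace.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical record, invented purely for this illustration.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="245" ind1="0" ind2="0">
    <subfield code="a">Cabbage statistics</subfield>
  </datafield>
  <datafield tag="650" ind1=" " ind2="0">
    <subfield code="a">Cabbage</subfield>
    <subfield code="x">Statistics</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2="0">
    <subfield code="u">https://example.org/cabbage</subfield>
  </datafield>
</record>"""


def subfields(record, tag, code):
    """Return the text of every subfield `code` inside datafields with `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [el.text.strip() for el in record.findall(path, MARC_NS) if el.text]


def record_to_post(xml_text):
    """Map one MARCXML record to a simple, indexable 'blog post' dictionary."""
    record = ET.fromstring(xml_text)
    titles = subfields(record, "245", "a")
    links = subfields(record, "856", "u")
    return {
        "title": titles[0] if titles else "(untitled)",
        # Subject headings become tags that a search engine can index.
        "tags": subfields(record, "650", "a") + subfields(record, "650", "x"),
        "link": links[0] if links else None,
    }


post = record_to_post(SAMPLE)
print(post)
```

The point is simply that each catalog record becomes a plain web page with a title, subject tags and a link, which is exactly the kind of thing google is good at crawling.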

Here’s a real-world example (apologies for tooting the FGI horn 🙂 ). The Wall Street Journal contacted us this week wanting to know about government podcasts (Daniel’s audio interview will be up in a few days!). We found it odd that WSJ would want to talk with us until we did a few searches and realized why. If you do a Google search on the words “government podcasts,” FGI is the first hit, above the one for usa.gov. FGI is 2nd on Yahoo, just below usa.gov. We’re #1 on ask.com. We didn’t do anything special; we just described a subject area that happened to be of interest to the WSJ. And this is what libraries can and should be doing.

Shinjoung and I saw Rick and Megan Prelinger talk two nights ago. For those of you who don’t know, Rick is a film archivist and is on the board of the Internet Archive, and Megan’s an independent scholar and bird rescuer; the two of them have built their own amazing library! A couple of things that stuck in my mind from their talk were their descriptions of the library as an “analog-digital landscape of ideas” and an “information ecology.”

I think that’s what we need to be building. That’s what archives do with digital finding aids, and that’s what we need to do with the rest of the library. WE know that only an extremely minuscule portion of knowledge/information will ever be digitized, and only a small portion of that will be accessible via google (I *hate* snippets!), but the general public doesn’t know that. We also know, as Carlos said, “You just never know when that piece of information will be useful to anybody.”

I’ll end with a link to David Weinberger’s blog where he quotes (with permission) an email from Bobbi Carlton about a conversation she had recently with Bernie Margolis, president of the Boston Public Library:

“Bernie Margolis was … talking about how people think the Web is going to put libraries out of business. He says that the more hits on the BPL website, the more visitors come to the library. The more people learn about the library, the more they come in. The BPL sees a direct correlation between web traffic and foot traffic but that is because the library is more than a repository of things and information – it is a resource as well.”

CC BY-NC-SA 4.0 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
