Today I noticed that Search Engine Watch has posted a well-written review of the new FirstGov search engine.
For those of you not familiar with it, FirstGov was established during the Clinton Administration and has this current vision:
FirstGov.gov, the official U.S. gateway to all government information, is the catalyst for a growing electronic government. Our work transcends the traditional boundaries of government and our vision is global–connecting the world to all U.S. government information and services
If you interact with government at all, this is a site worth visiting for its well-organized lists of resources grouped by user (citizen, business, other gov’t) and by subject. The creators of FirstGov realized that not everyone is on the web yet, so they also provide 800 numbers for agency helplines.
I’ve never thought that much of their search engine because it never distinguished between federal and state resources very well. Despite the great improvements outlined in the Search Engine Watch article above, this failing remains.
For a quick demonstration, go to FirstGov’s advanced search page and limit a search on “unemployment” to “federal only”. Mixed in with the federal results are hits from Washington, California, Ohio, and Alabama. Almost any search you run will pull up state as well as federal hits.
Conversely, if you search on “unemployment” and choose “all states”, the hits from the states above are taken out of your results, despite being state results.
The searches above expose the limits of current search technology. An indexing program has no idea whether a given piece of content is local, state or federal. It has to follow a rule. The rules FirstGov appears to have been given are:
1) A file is federal if its URL ends in .gov.
2) A file is state level if its URL ends in [state abbreviation].us (e.g. ak.us).
Back in the mid-90s when FirstGov was born, these were sensible rules. The .gov domain was reserved for federal agencies, and the states were assigned domains based on their postal abbreviations plus .us. So Alaska was assigned ak.us. But a few years back, .gov was opened up to ANY level of government. So now you have not only states with .gov, but even cities (like sanantonio.gov). Because the search engine is programmed with the rules above, FirstGov can no longer distinguish pages and documents from differing levels of government, even though a person could tell the difference immediately.
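To make the failure concrete, here is a rough sketch of the two rules as I have guessed them. This is not FirstGov's actual code; the function name `classify_by_domain` and the short list of state abbreviations are my own assumptions for illustration, and I apply the rules to the host name rather than the full URL.

```python
from urllib.parse import urlparse

# Assumption: only a few postal abbreviations are listed here for illustration.
STATE_ABBREVIATIONS = {"ak", "al", "ca", "oh", "wa"}


def classify_by_domain(url):
    """Guess the level of government from the host name alone (hypothetical)."""
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")

    # Rule 1: anything whose host ends in .gov is treated as federal.
    if labels[-1] == "gov":
        return "federal"

    # Rule 2: anything whose host ends in [state abbreviation].us is state level.
    if len(labels) >= 2 and labels[-1] == "us" and labels[-2] in STATE_ABBREVIATIONS:
        return "state"

    return "unknown"
```

Fed a made-up address on a host such as www.state.ak.us, the sketch answers "state", which is fine. Fed an address on sanantonio.gov, it answers "federal", because the host name is the only thing it looks at.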
Ironically, better precision could be had simply by adding one metatag, something like GovernmentLevel, with a value of local, state or federal. Agencies could set it as a default in their document creation software. No librarians involved, but what a difference it would make!
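As a minimal sketch of how an indexer might read such a tag, assuming it were written as an ordinary HTML meta element with name "GovernmentLevel" (my invention, not an existing standard), something like the following would do. The class and function names are hypothetical as well.

```python
from html.parser import HTMLParser

# Assumption: these are the only three values the hypothetical tag allows.
VALID_LEVELS = {"local", "state", "federal"}


class GovernmentLevelParser(HTMLParser):
    """Records the first valid GovernmentLevel meta tag found in a page."""

    def __init__(self):
        super().__init__()
        self.level = None

    def handle_starttag(self, tag, attrs):
        # Only meta tags matter, and the first valid value wins.
        if tag != "meta" or self.level is not None:
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        content = (attributes.get("content") or "").strip().lower()
        if name == "governmentlevel" and content in VALID_LEVELS:
            self.level = content


def government_level(html):
    """Return 'local', 'state' or 'federal', or None if the tag is absent."""
    parser = GovernmentLevelParser()
    parser.feed(html)
    return parser.level
```

With a tag like this in place, the indexer could trust the declared level first and fall back on domain rules only when the tag is missing.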
Level of government is probably already defined by some existing metadata scheme, but I don’t have that at my fingertips.