Google to index government deep web?

Google seeks better access to government information, by Daniel Pulliam, GovExec (October 25, 2006).

As much as 40 percent of the content on agency Web sites is invisible to Google's crawlers, [J.L.] Needham [a strategic partner development manager at Google] said, which means that for a majority of Internet users who do not know how to look beyond a search engine site, that information is effectively invisible.

Needham said he is meeting with a variety of undisclosed agencies to discuss how the information housed in their databases can be made available in the search results from engines such as Google, Yahoo or MSN. One method would be to use Google Sitemaps.
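
For readers wondering what the Sitemaps approach might look like in practice, here is a rough sketch, not anything taken from the article. It assumes the sitemaps.org 0.9 XML format (urlset, url, loc, lastmod) and uses made-up record URLs on a hypothetical agency.example.gov site; the build_sitemap helper is likewise hypothetical. The idea is simply that an agency lists the database-backed pages a crawler cannot discover by following links.

```python
import xml.etree.ElementTree as ET

# Namespace of the sitemaps.org 0.9 protocol (an assumption; the article
# only says "Google Sitemaps" without naming a schema version).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(records):
    """Return sitemap XML for an iterable of (url, last_modified) pairs.

    last_modified is expected as an ISO date string such as "2006-10-01".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in records:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical database-backed pages that are reachable only through a
# search form, so a link-following crawler would never find them.
records = [
    ("https://agency.example.gov/records?id=1", "2006-10-01"),
    ("https://agency.example.gov/records?id=2", "2006-10-15"),
]

sitemap_xml = build_sitemap(records)
print(sitemap_xml)
```

In a real deployment the record list would come from a database query rather than a hard-coded list, and the resulting file would be published where search engines can fetch it.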

Perhaps of even greater interest is this:

A Dec. 16, 2005, memorandum from Clay Johnson, deputy director for management at the Office of Management and Budget, states that all agencies must set up their public information so that it is searchable by Sept. 1, 2006. It states that "increasingly sophisticated Internet search functions" can "greatly assist agencies in this area."

Agencies also were required to provide all public data in an open format that allows the public to aggregate "or otherwise manipulate and analyze the data to meet their needs" by Dec. 31, 2005, according to a separate OMB memorandum signed by Johnson on Dec. 17, 2004.


So, Google misses 40% of gov content?

I'd like to see a presentation around this fact at an upcoming Depository Library Council meeting. It would be a counterbalance to some folks who talk about how libraries should forget describing Internet resources because "people want Google."

------------------------------------
"And besides all that, what we need is a decentralized, distributed system of depositing electronic files to local libraries willing to host them." -- Daniel Cornwall, tipping his hat to Cato the Elder for the original quote.
