Here is a bit more information about “All The Government’s Information,” Carl Malamud’s “Google Tech Talk” (51-minute video) of May 24, 2006. (See also our post: Carl Malamud on “All The Government’s Information”).
I urge you to listen to this because there is so much that government information specialists will find interesting. The prepared part of the talk is only about 22 minutes. There is enough technology to make the talk interesting to technologists, but not so much as to make it uninteresting to the non-technologist. Most of the talk is about ideas and how to make them happen, not the technology.
Here are my notes about some interesting parts — with (very approximate) references to the minute:second time within the video.
The heart of Carl’s talk is his description of his new project, The Public Memory Trust (17:11), and its first project, “Washington Bridge” (14:43). The trust has a mission “of building public works projects on the internet.” The Washington Bridge project is an attempt to get video from Congressional hearing rooms out to the real world and onto the internet in real time, and to archive the video permanently. His goal is to demonstrate the feasibility, practicality, and utility of this kind of service so that it can be taken over and run by others. (This is the same thing he did with SEC data.) He envisions training others to do the same thing at the state and local level and replicating the service (48:30). He says that if there is a public proceeding, the public needs to have access to it, now and forever, so it needs to be archived.
I find it particularly interesting that Carl’s vision is to provide “the data” (i.e., the raw video streams) not just to service providers (such as television networks and Google Video), but also to anyone who might want to “take this data and do something with it.” He mentions technology companies that could use huge amounts of video as raw material for their experiments with creating digital signatures, creating metadata automatically, and doing speech-to-text conversion, but, presumably, “the data” would be available for anyone to use and reuse as they please. At question time someone asks about peer-to-peer distribution and Carl says, “bittorrent service on the entire archive is a no-brainer” (40:25).
Carl says he spoke to Bruce James at GPO about the project and GPO is giving the project a “GPO fellow” position (18:00 – 20:00). But a key part of Carl’s goal is to create a “large public domain archive,” something that GPO has never explicitly stated that it is willing to do in quite the same way. GPO’s model is to hold onto and control access to government information; Carl’s model is to distribute it and make it available to others. He notes that “in the long run, I would like to see every library in the country have a peta-byte of disk and a permanent archive of all congressional proceedings.” (18:07)
Net neutrality comes up briefly (47:00 to 48:30).
Metadata creation is discussed in several places (e.g., at about 25 minutes and 35 minutes).
Carl’s slide with links is at http://public.resource.org/google.techtalk.html.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.