Smithsonian
Smithsonian digitization strategic plan
Submitted by jrjacobs on Mon, 2010-06-07 08:13.
The Smithsonian has just released their digitization strategic plan for fiscal years 2010-2015 called "Creating a Digital Smithsonian" -- executive summary and full report.
I'm of two minds about this and similar digitization plans. On the one hand, the digitization of Smithsonian collections -- books, research reports, data, music, film and other sounds (like frog vocalizations!) -- could be a boon to online access to some really amazing materials.
On the other hand, this quote from the executive summary worries me:
To preserve our collections, the Smithsonian constantly battles the destructive forces of time and environment. Despite our best efforts, plastics discolor, wax cylinder recordings distort, and botanical specimens become brittle. Digitization offers a way to make objects — and the valuable information they contain — available without jeopardizing their integrity by handling or by exposure to the elements.
While they mention a "life cycle-management approach to digitization," there doesn't seem to be a serious amount of thought given to the fact that digital objects degrade faster than physical objects, and that digital preservation is an ongoing and potentially more expensive effort. I worry that SI.edu will broker the same kind of disastrous deal that GAO did with Thomson-West whereby a whole swath of public domain information was privatized.
I would call on SI.edu and ALL .gov agencies to insert a clause into ANY digitization contract that ALL digital files and metadata will be accessible via free and open sites. That means that, where applicable, copies of all digital content would be ingested into GPO's FDsys, Library of Congress, NARA and/or publicly accessible non-profit sites (e.g., UNT digital library or Internet Archive). Please help us get this message across to your friends in the .gov sector. Public information should remain public!
Malamud calls for a national scan center public works project
Submitted by jrjacobs on Wed, 2009-12-30 14:54.
Carl Malamud posed this question over on Twitter: "What if our national cultural institutions all worked together on a common problem, attracted White House support?" In his post on the O'Reilly blog, "A National Scan Center: A Public Works Project", Malamud scopes out the issues and calls for the Library of Congress, the Smithsonian Institution, the Government Printing Office, the National Archives and Records Administration, and the National Technical Information Service to come together and make the compelling case for funding a 5-year, $500 million effort to create a National Scan Center. Hear, hear, Carl!
In the U.S., we face a similar deluge of paperwork that we faced in the 1930s. A huge backlog of paper, microfiche, audio, video, and other materials is located throughout the federal government. Little money has gone from Congress for digitization, and bureaucracies have resorted to a series of questionable private-public partnerships as a way of digitizing their materials. For example, the Government Accountability Office shipped 60 million pages of our Federal Legislative Histories (the record of each law from the initial bill through the hearings and conference reports) off to Thomson West, but didn't even get digital copies back. Another example is the recent failed effort by the Government Printing Office to digitize 60 million pages of the Federal Depository Library Program, an effort they tried to get through as a "zero dollar cost to the government" effort with the private sector.
There are no free lunches and there are no "no cost to the government" deals. The costs involve the government effort to supervise the contract, prepare the materials, and ship them, and in both the GAO and GPO cases, the government wasn't getting much back for its effort. What the government and the people usually get is a lien on the public domain, preventing the public from accessing these vital materials. Similar efforts are sprinkled throughout the government. I testified to Congress that I had learned that the National Archives was contemplating a scan of congressional hearings with LexisNexis under similar circumstances, and many may be aware of the questionable deal the Archives cut with Amazon where my favorite online superstore got de facto exclusive rights to 1,899 wonderful pieces of video.
Smithsonian Reports Its Web and New Media Strategy
Submitted by jajacobs on Sat, 2009-08-01 09:11.
This new Smithsonian document "describes a transformational change for the Smithsonian, which will have impact on the Institution’s culture, operations, allocation of resources, talent recruitment, and priorities. This strategy can only become operational with adequate resources, and will require the Smithsonian to rethink the ways in which it generates revenues and prioritizes how resources are allocated to programs."
A key part of the strategy is to create a Smithsonian Commons "dedicated to the free and unrestricted sharing of Smithsonian resources and encouraging new kinds of learning and creation through interaction with Smithsonian research, collections, and communities" but it will also "Use the commons to attract the funding necessary to update the Smithsonian’s Web and New Media operations and business models."
The report notes that:
Attempting to directly monetize access to, and use of, museum content does not appear to be a sustainable business model. Through these low-margin business practices, we alienate users, perpetuate the practice of institutions charging each other, discourage research and publications, and undermine our civic mission. The commons presents an alternative: gradually reduce our dependence on access and use fees by aggregating larger number of visitors under a strong brand supported by sponsorships and other value-added products and services. It is likely that the Smithsonian will make more money by promoting “free” resources to a large audience than it can make charging small amounts for small transactions to a small audience, and it is a much better fit with the mission.
Searchability at the Smithsonian website
Submitted by Susannaleers on Wed, 2008-02-06 09:53.
The January 7 issue of Federal Computer Week had an article by Ari Schwartz (deputy director of the Center for Democracy and Technology) that summarized the findings of the "Hiding in Plain Sight" report that was blogged about here on FGI in December. In the article, Mr. Schwartz mentions that the report found information obscured on the websites of the Smithsonian Institution (among other agencies), "including many online content collections in the Smithsonian Institution Research Information System".
The Smithsonian has responded in a letter to the editor of Federal Computer Week: "All of the databases in the Smithsonian Institution Research Information System (SIRIS) are site-mapped according to the international standard ... SIRIS has 1,679,277 records available via the sitemaps to crawlers such as Googlebot. According to Google, 1,567,170 records from SIRIS, or 93 percent, are included in the Google index. We started working on the sitemaps in February 2007 and worked directly with Google engineers in June to enhance the accessibility and ranking of our records. We understand that the public expects to find all Smithsonian information in one system, but as stated on our home page, SIRIS contains only information from the Smithsonian’s libraries, archives and the Smithsonian American Art Museum’s research databases. The museum collections information is available through other systems described on the Smithsonian’s home page. However, efforts are under way to make one-stop searching available to the public in the future. We continue to work hard to raise the visibility of our data to the public through multiple channels, including search engines."
Thanks for the clarification, Smithsonian, and we're glad to hear you are responsive to the public.
Smithsonian Image Claims Challenged
Submitted by PGarvin on Fri, 2007-05-18 18:29.
Carl Malamud has challenged Smithsonian Institution restrictions on use of the images at the Smithsonian Images website. The Associated Press reported today on the action by Malamud's Public.Resource.Org site.
Smithsonian campaign and hacker tax credit
Submitted by jrjacobs on Wed, 2006-11-22 10:24.
Here's a twofer to give you some more reading matter over the long weekend:
A friend sent me this internet campaign to shed public light on the secret Smithsonian/Showtime contract that would give Showtime a 30-year, non-competitive stranglehold on Smithsonian (i.e., public domain!) archives. Background on the story can be found at boingboing. If you want to be added as a signatory, please send email to Carl Malamud (carl@media.org) no later than Sunday, November 25, 2006.
After I signed the letter, I was looking around the public.resource site and came across another campaign (perhaps dated but still viable!) that Carl had put together calling for a hacker tax credit! The basic idea is that open source software, because it is the driving force behind our new information culture, should be supported publicly so that more growth can happen. Check out the text of the letter below that Malamud suggests you send to your Congresspeople. This campaign, as I said earlier, may be dated (he lists Vice President Al Gore as a suggested addressee), but open source software (sometimes called FLOSS) is still something for which we should all be advocating!
Pablo Picasso once said that good art is created, but great art is stolen. On the Internet, the same holds true. Good code is created, but great code is copied over and over.
The Internet was created from open source software, code that people can freely use to build new code, to run their networks, to create a new business, or to build a service that people can use.
Take for example the work of Paul Vixie, who has placed in the public domain the software that the Domain Name System runs on. This software has been used by every major Internet Service Provider and has been bundled into the operating system products of IBM, DEC, Silicon Graphics, and Sun.
Open source software created the Internet, and created the economic boom we now see in Silicon Valley. Most of the large web sites in the world run on the open source Apache web server. The $4 billion Netscape Corporation was built from the open source Mosaic. The PERL programming language was created as open source, but now fuels over $100 million in book sales for publishers like O'Reilly & Associates.
But, we are eating our seed corn. There is no systematic national effort to create open source software and it is increasingly difficult to keep this infrastructure alive. For every success story like Apache, there are dozens of projects that languish because of the lack of formal support for open source projects.
In the global village, open source software is not an alternative to commercial software, just as in our real cities public parks are not an alternative to our commercial districts. The parks make our cities thrive, and thriving cities are a good place to do business.
It is a happy accident that we have open source software, but there are simple steps that the federal government can take to provide even more fuel for the growth of our information economy. Here is a simple algorithm for a Hacker Tax Credit that could be added to the U.S. Code:
#/us/usc/irs
if { You produce software that is in the public domain ; }
andif { That software is used by at least 1000 people ; }
then { You may deduct your development and operational costs from your gross income for tax purposes ; }
If the U.S. Congress could compile this simple subroutine into the U.S. Code, this simple step would have a greater effect than any cuts in capital gain taxes. I urge you to consider steps that the U.S. Congress can take to insure a strategic national reserve of open source software.
Sincerely,
Carl Malamud
media.org