From the NARAtions Blog:
The National Archives just joined iTunes U, a dedicated area within the iTunes Store giving users public access to thousands of free lectures, videos, books and podcasts from learning institutions all over the world. If you already have iTunes on your iPhone, iPad, iPod, or computer, you can search for “National Archives” on iTunes U to find our channel, or visit us at http://itunes.apple.com/us/institution/national-archives-and-records/id4.... Our initial collections feature selected archival documents, lesson plan materials, podcasts by the Presidential Libraries, videos from our “Inside the Vaults” series and more.
Rolling Stone's Matt Taibbi has a new piece online "Is the SEC Covering Up Wall Street Crimes?" that will churn your stomach. A whistleblower SEC attorney named Darcy Flynn came forward to Congress earlier this summer to describe the systematic destruction of SEC records that were supposed to be archived for 25 years.
Taibbi's piece shows the worst of the revolving door between .gov regulators and the industry they're entrusted to regulate and investigate. It also highlights the tenuous nature of the preservation of government information and the critical need for trusted 3rd party organizations -- libraries! -- to be part of that system of preservation.
This is a different world, one far friendlier to lawbreakers, where even the suspicion of wrongdoing gets wiped from the record.
That, it now appears, is exactly how the Securities and Exchange Commission has been treating the Wall Street criminals who cratered the global economy a few years back. For the past two decades, according to a whistle-blower at the SEC who recently came forward to Congress, the agency has been systematically destroying records of its preliminary investigations once they are closed. By whitewashing the files of some of the nation's worst financial criminals, the SEC has kept an entire generation of federal investigators in the dark about past inquiries into insider trading, fraud and market manipulation against companies like Goldman Sachs, Deutsche Bank and AIG. With a few strokes of the keyboard, the evidence gathered during thousands of investigations – "18,000 ... including Madoff," as one high-ranking SEC official put it during a panicked meeting about the destruction – have apparently disappeared forever into the wormhole of history.
Under a deal the SEC worked out with the National Archives and Records Administration, all of the agency's records – "including case files relating to preliminary investigations" – are supposed to be maintained for at least 25 years. But the SEC, using history-altering practices that for once actually deserve the overused and usually hysterical term "Orwellian," devised an elaborate and possibly illegal system under which staffers were directed to dispose of the documents from any preliminary inquiry that did not receive approval from senior staff to become a full-blown, formal investigation. Amazingly, the wholesale destruction of the cases – known as MUIs, or "Matters Under Inquiry" – was not something done on the sly, in secret. The enforcement division of the SEC even spelled out the procedure in writing, on the commission's internal website. "After you have closed a MUI that has not become an investigation," the site advised staffers, "you should dispose of any documents obtained in connection with the MUI."
Many of the destroyed files involved companies and individuals who would later play prominent roles in the economic meltdown of 2008. Two MUIs involving con artist Bernie Madoff vanished. So did a 2002 inquiry into financial fraud at Lehman Brothers, as well as a 2005 case of insider trading at the same soon-to-be-bankrupt bank. A 2009 preliminary investigation of insider trading by Goldman Sachs was deleted, along with records for at least three cases involving the infamous hedge fund SAC Capital.
The National Archives is apparently preparing to reverse a long-standing policy of providing free public access to Census Schedules when it releases the 1940 Census next year. (See "Update" below for additional information.)
Early next year the National Archives and Records Administration (NARA) can make the 1940 Census Schedules available to the public for the first time. (See "Background" below.) NARA has digitized these files and created metadata for them in preparation for making this valuable trove of information accessible on the web. It now only remains for NARA to decide who should provide access and at what cost to users. Should NARA provide free access? Or should it contract this service out to a private company that will impose fees on users and make a profit by providing access to this public information?
The answer should be obvious. For decades NARA has provided free access to Census Schedules at regional archive facilities and has sold microfilm to libraries that provide free access to their users. Now that online digital access is possible, NARA can provide better access online without having to maintain physical access at its regional facilities. It can distribute digital files to libraries for little or no cost so that libraries could further increase access and functionality for all of the information or subsets of it.
It seems, however, that NARA is choosing privatization instead of free public access. In an eight-page RFI (Services Request for Information (RFI) NAMA-11-RFI-0004, 1940 Census [Microsoft Word .docx] or see the PDF version), NARA is seeking "industry input" for a "no-cost contract" to provide managed hosting and online access to the 1940 Census. The RFI is intended to explore options and "may or may not lead to a solicitation" for an actual contract. This means that NARA could, apparently, make a decision to do this work itself, but it is exploring the privatization route first. (Presumably, libraries could respond to the no-cost RFI as well. Responses were due on June 22.)
According to NextGov, the "no-cost contract" means that "the vendor would do the work for free and then charge the public a fee to access the records." (Archives Wants to Put 1940 Census Online, by Joseph Marks, NextGov TechInsider, July 15, 2011.)
The RFI does not explain why NARA is pursuing this path or what advantages it sees to privatization. I would guess the most likely reason is that NARA does not anticipate that it can get adequate funding to host the data online itself. But has it asked for funding? Has it made the case for continuing its historic provision of free public access to Census Schedules? Has it justified privatizing public information?
We have seen NARA follow this path before. NARA partnered with footnote.com to digitize selected holdings. This resulted in access restrictions including membership fees, per-page charges for downloading, and age restrictions to these digitized public documents. NARA partnered with Ancestry.com to make public records available for a fee. At the time, a spokesperson said that budget constraints and other priorities kept the Archive from making this information available itself.
"In a perfect world, we would do all this ourselves and it would be up there for free," she said. "While we continue to work to make our materials accessible as widely as possible, we can't do everything." -- Ancestry.com unveiled more than 90 million U.S. war records, New York Times (May 24, 2007).
In 2008, NARA contracted with TGN to digitize and provide access to some of NARA's holdings. The contract restricted free public access for five years. We've written about this here at FGI before (The NARA/TGN contract as a bad precedent) and believe that deals like this remain bad for NARA and bad for the country. We believe these kinds of deals set a bad precedent -- a precedent that is now being unnecessarily followed with the 1940 Census.
Those past deals were different from this one in one key way, however. They involved digitization of materials by the private contractors. In the case of the 1940 Census, the materials are already digitized, according to the RFI. The arguments we heard in the past were that digital access was so much better that it was worth privatizing access in order to get the digitization done. Without privatization, it was argued (even by some librarians and archivists), the materials could not be digitized and we'd be stuck with analog access. This is not the case for the 1940 Census since the materials are already digitized. The decision for online digital access has been made. The only question now is whether to make the existing digital files freely available or available through privatization.
Of course, the cost of any project providing access to all the 1940 Census Schedules and maps will not be insignificant. According to the RFI, NARA has created 3.8 million JPEG images, comprising 20 terabytes of data.
Twenty terabytes is a lot of data, but it is fast becoming an almost modest size for a digital library. For comparison, the HathiTrust has over 3 billion pages and over 400 terabytes of data, OCLC has a capacity of over 600 terabytes, the Wayback Machine contains 100 terabytes, the Library of Congress web archive is 235 terabytes, the University of California Curation Center has 70 terabytes in its Merritt digital preservation repository, and NARA's own Electronic Records Archives (ERA) has more than 90 terabytes. These terabyte-scale digital libraries are virtually the new norm and petabyte-scale digital libraries are already being built. Some of these are the Shoah Foundation Institute's digital library (8 petabytes), the Stanford Digital Repository, which anticipates a capacity of petabytes, and the Digital Hammurabi Project, which is building a petabyte-scale digital library and museum of virtual 3D cuneiform tablets.
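To put those numbers side by side, here is a rough back-of-the-envelope sketch. It uses only the figures quoted above and assumes decimal terabytes (10^12 bytes), which the RFI does not specify; the variable names are illustrative, and several of the comparison sizes are stated in the text as "over" or "more than," so the ratios are lower bounds at best.

```python
# Back-of-the-envelope arithmetic using only the figures quoted in the text.
# Assumption: 1 terabyte = 10**12 bytes (decimal); the RFI does not say which unit it uses.
TB = 10**12

census_images = 3_800_000   # JPEG images, per the RFI
census_tb = 20              # total size in terabytes, per the RFI

# Average size of one scanned image, in megabytes (10**6 bytes).
avg_mb = census_tb * TB / census_images / 10**6

# Sizes in terabytes of the other collections as cited in the text.
collections_tb = {
    "HathiTrust": 400,
    "OCLC (capacity)": 600,
    "Wayback Machine": 100,
    "Library of Congress web archive": 235,
    "UC Curation Center (Merritt)": 70,
    "NARA ERA": 90,
}

# How many times larger each collection is than the 1940 Census image set.
ratios = {name: size / census_tb for name, size in collections_tb.items()}

if __name__ == "__main__":
    print(f"Average 1940 Census image: {avg_mb:.2f} MB")
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
        print(f"{name}: {ratio:.1f}x the 1940 Census files")
```

By this estimate the average image is a little over 5 MB, and every collection in the comparison is at least several times larger than the entire 1940 Census image set.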
But the cost of providing access to microfilm at 13 regional offices was not cheap either. To me the question is whether or not the government is willing to continue its historic mission of providing free access or if it is ready to abandon that mission to the private sector.
There have always been those who argue that fee-based private-sector access should take precedence over free public-sector access. But privatization of access to Census schedules would represent a reversal of long-standing policy. What was once an unquestioned government function is now, apparently, being considered a commercial function. Where, in the past, it was the government that provided free, public access to census schedules, now, when access can be improved, the government is abrogating its role and turning access over to private companies that will provide the information for a fee. The issues are not new. The precedents for providing free public access exist and have a long and respectable history. The only thing new is that NARA seems to have accepted privatization as inevitable.
Will there be funding for NARA to provide access to 1940 Census Schedules? There may not be. We have argued here at FGI for years that relying solely on Congressional funding for permanent, free public access to government information is risky because there is always the chance that Congress will not fund it. In these highly politicized, economically troubled times it is easier to imagine a lack of any funding than to imagine adequate funding for the long term.
But this does not mean that privatization is the only option. There are precedents for government projects that are supported by donations and public-private partnerships. The American Memory Project is one notable example. And individual libraries or groups of libraries could step in and offer to provide free public access.
Now is the time for NARA, supported by researchers, libraries, and archivists, to actively promote and pursue free public access solutions. There is no reason to accept privatization as inevitable.
Update: Note that the NARA website 1940 Census page says that "The digital images will be accessible at NARA facilities nationwide through our public access computers as well as on personal computers via the internet." Additionally, a comment on the Ancestry World web site said: "NARA will make the digitized copies of the 1940 Census population schedules available to the public, free of charge, on April 2, 2012 through our new Online Public Access search (http://www.archives.gov/research/search/)."
It is not clear from the above if the policy on free public access has changed with issuance of the RFI or not.
Background: The Census Bureau conducts the Decennial Census every ten years. The Bureau summarizes its findings in reports that contain no information on individuals. The raw information collected, including names and addresses of those surveyed (sometimes called the "manuscript census" or the "census schedules"), is protected by law and is kept confidential for 72 years. After 72 years, that raw information is released by the National Archives and Records Administration. This information is invaluable to genealogists and other researchers. Typically, this raw information has been made available on microfilm at regional National Archives offices (Availability of Census Records About Individuals). This 72-year period for the 1940 Census expires on April 2, 2012.
- Measuring America: The Decennial Censuses From 1790 to 2000
- Census Of Population And Housing: Reports, 1790-2010
- All of the information that the U.S. Census Bureau collects under 13 USC 9 is confidential.
- The "72-Year Rule"
- Census information and records can be invaluable tools in genealogical research
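The 72-year rule described above reduces to simple year arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it computes just the release year, not the exact release day, which is set separately (April 2, 2012 for the 1940 Census, as noted above).

```python
# Illustrative sketch of the "72-Year Rule" as described in the text.
# Only the year is computed here; the exact release day is not derived.
CONFIDENTIALITY_YEARS = 72

def release_year(census_year: int) -> int:
    """Return the year in which a decennial census's schedules become releasable."""
    return census_year + CONFIDENTIALITY_YEARS

if __name__ == "__main__":
    for year in (1930, 1940, 1950):
        print(f"{year} Census schedules: releasable in {release_year(year)}")
```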
The National Archives is asking volunteer transcribers at Wikisource to turn paper and ink historical manuscripts into simple, searchable Web text.
- National Archives' first Wikipedian in residence to bring more holdings to the public, by Joseph Marks, NextGov (07/11/2011).
An interesting part of this story is the issue of the quality of scans.
A major barrier, McDevitt-Parks said, is the quality of the Archives' digitized files, the most important of which were scanned in the 1990s using early technology that makes them difficult to read online.
...Unfortunately, it's hard to [make the case to] go back and scan things that are already scanned when there are millions and millions of things that aren't in any digitized form at all.
I think there is an important lesson here. As we develop policies for our individual libraries today and plans for the FDLP of the future, we should always remember that digitization technologies improve over time and the uses we make of digital documents evolve over time. We should avoid choices that are merely good-enough today.
We should make choices that will enable us to increase access and functionality in the future, not lock us into what we are technologically capable of today.
Roll Call reports:
The National Archives could be just months away from starting a long-planned project to create for the first time a searchable digital log of the archives of Congress.
The project, which had been discussed for about six years, would essentially catalog Congressional records dating back to 1789 and create a database where researchers could search for specific topics.
Though it wouldn't digitize the records themselves, the database would point researchers to places within the expansive records where that topic is discussed.
"The idea is to take those various sources ... and to make it a state-of-the-art finding aid," Senate Archivist Karen Paul said.
+ Final Plan Approved Yesterday
+ Archivist of the United States David Ferriero Expressed Concern Over the Cost
+ Cost Estimate Should Be Provided in Six Months
+ Size? 500 Million Pages (200,000 Cubic Feet of Records)
+ Expected To Take Five Years to Complete
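For a sense of what those figures imply, here is a small sketch of the arithmetic. It uses only the numbers in the list above; the assumption that the work would be spread evenly across the five years is mine, not Roll Call's, and the variable names are illustrative.

```python
# Rough scale of the Congressional records finding-aid project,
# using only the figures reported in the list above.
total_pages = 500_000_000   # estimated pages of records
cubic_feet = 200_000        # estimated volume of records
project_years = 5           # expected duration

# Implied density of the holdings.
pages_per_cubic_foot = total_pages // cubic_feet

# Assumption: work is spread evenly over the project's duration.
pages_per_year = total_pages // project_years

if __name__ == "__main__":
    print(f"Pages per cubic foot: {pages_per_cubic_foot:,}")
    print(f"Pages to be covered per year: {pages_per_year:,}")
```

That works out to 2,500 pages per cubic foot and, on an even schedule, 100 million pages of records covered by the finding aid each year.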
The "Pentagon Papers," officially titled "Report of the Office of the Secretary of Defense Vietnam Task Force," are now online at the National Archives in PDF format:
- Pentagon Papers, U.S. National Archives and Records Administration.
On the 40th anniversary of the leak to the press, the National Archives, along with the Kennedy, Johnson, and Nixon Presidential Libraries, has released the complete report. There are 48 boxes and approximately 7,000 declassified pages. Approximately 34% of the report is available for the first time.
What is unique about this, compared to other versions, is that:
- The complete Report is now available with no redactions compared to previous releases
- The Report is presented as Leslie Gelb presented it to then Secretary of Defense Clark Clifford on January 15, 1969
- All the supplemental back-documentation is included. In the Gravel Edition, 80% of the documents in Part V.B. were not included
- This release includes the complete account of peace negotiations, significant portions of which were not previously available either in the House Armed Services Committee redacted copy of the Report or in the Gravel Edition
For all you (de)classification geeks out there, here's an interesting new .gov blog to add to your blogrolls. NARA's Public Interest Declassification Board now has a blog called Transforming Classification. They'll be posting a bunch of white papers on various topics over the coming months. I'm looking forward to the conversation.
The Public Interest Declassification Board is an advisory board established by Congress to promote the fullest possible public access to a thorough, accurate, and reliable documentary record of significant U.S. national security decisions and activities. The Board’s mandate includes advising the President and other government officials on policies deriving from the issuance by the President of Executive orders regarding the classification and declassification of national security information...
...President Obama has charged the Board with designing a more fundamental transformation of the security classification system. In response to his request, we are proposing new solutions that address the shortcomings of the current system and tackle the challenges of digital records. By reducing inefficiencies and increasing public access, our proposals aim to improve the classification/declassification system’s capacity to protect and serve the American people.
Every other Wednesday over the next eight weeks, we will post either two or three “white paper” synopses to the blog describing an element of our proposed transformation.
[HT to Meredith Stewart]
From UNREDACTED (Decades Later, NARA Posts Documents on Guatemalan Syphilis Experiments, by Kate Doyle, April 25, 2011):
Between 1946 and 1948, U.S. public health researchers infected hundreds of Guatemalan prisoners, soldiers and mentally ill patients with syphilis and gonorrhea, without their knowledge or consent, in order to test the effectiveness of penicillin. The experiments were carried out in Guatemala under the cloak of confidentiality, and the results were never published in the United States. But after a scholar discovered archives chronicling the program at the University of Pittsburgh and published her findings last year, the U.S. National Archives and Records Administration (NARA) took custody of the documents and on March 29 made them publicly available.
From the NARA press release:
The National Archives at Atlanta announced that on March 29, 2011, it will release online the papers of Dr. John C. Cutler. Dr. Cutler, a former employee of the U.S. Public Health Service, 1942-1967, was involved in research on Guatemalan soldiers, prisoners, and mental health patients who were exposed to the syphilis bacteria. The collection is available online http://www.archives.gov/research/health/cdc-cutler-records and at the National Archives at Atlanta, located at 5780 Jonesboro Road, Morrow, Georgia, 30260.
The Government Accountability Office (GAO) has a new report on the National Archives and Records Administration's (NARA) development of an Electronic Records Archive (ERA) to preserve and provide access to massive volumes and all types of electronic records.
- Electronic Records Archive: National Archives Needs to Strengthen Its Capacity to Use Earned Value Techniques to Manage and Oversee Development GAO-11-86 January 13, 2011.
GAO recommends, among other things, that NARA establish a comprehensive plan for all remaining work; improve the accuracy of earned value performance reports; and engage executive leadership in correcting negative trends. NARA generally concurred with GAO's recommendations.
- Costs soaring for Archives' digital library, auditors say, by Lisa Rein, Washington Post (February 4, 2011)
The cost of building a digital system to gather, preserve and give the public access to the records of the federal government has ballooned as high as $1.4 billion, and the project could go as much as 41 percent over budget...
The Government Accountability Office blames the cost overruns and schedule delays on weak oversight and planning by the National Archives which awarded a $317 million contract to Lockheed Martin Corp. six years ago to create a modern archive for electronic records....
National Archives joins Foursquare!, by Dawn, NARAtions, The Blog of the United States National Archives (February 2, 2011).
With the help of Foursquare, the National Archives is reaching out to people where they live and where they visit, by leaving tips related to NARA from coast to coast. Each tip highlights an interesting place or event related to the Archives. In the initial launch, the National Archives is focusing on providing tips pertaining to the Washington, DC, Kansas City, and Philadelphia Archives locations. Links between the Archives and the Washington Monument, Independence Hall, Edgar Allan Poe House, US Penitentiary and many more are now available to users visiting those locations. A Foursquare user is able to walk back through time and view scenes from the motion picture "March on Washington" while standing on the steps of the Lincoln Memorial. The Archives is now rolling out tips through Foursquare, and we'll continue to release new tips about the historical places behind all of the National Archives regional locations.