According to the Future Digital System blog, FDSys is ingesting a full run of Statutes at Large from the Library of Congress.
The scanned files take up 1.4TB of storage space, and the blog notes that “The next step is for GPO to assess the content and determine whether the content complies with GPO specifications and create access derivatives (including OCR text) of the content.”
People who are considering LOCKSS boxes to store federal content shouldn’t blanch at the 1.4TB figure for Statutes at Large. Generally speaking, scanned files (which are images) are much larger than born-digital content. For example, GPO deposited a year’s worth of 10 born-digital federal e-journals during their LOCKSS pilot. These 10 “journal-years” worth of content took up 900MB, or roughly 0.9GB. At that rate, we could have harvested these 10 journals for over 250 years before filling up our 250GB hard drive. Of course, we’d need to upgrade our hard drives well before that.
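For anyone who wants to check the arithmetic, here is a rough sketch of the calculation in Python. It assumes decimal units (1TB = 1,000GB) and a constant harvest rate of 0.9GB per year for the 10 journals; the function and variable names are made up for illustration and are not part of any LOCKSS or GPO tool.

```python
def years_until_full(disk_gb: float, gb_per_year: float) -> float:
    """Years of harvesting at a constant yearly rate before the disk fills."""
    if gb_per_year <= 0:
        raise ValueError("gb_per_year must be positive")
    return disk_gb / gb_per_year


# Figures from the post, in decimal gigabytes (an assumption).
JOURNALS_GB_PER_YEAR = 0.9   # 10 e-journals, one year of content (900MB)
DISK_GB = 250.0              # the 250GB hard drive
STATUTES_GB = 1400.0         # 1.4TB of scanned Statutes at Large

# How long the 10 journals could be harvested before the drive fills.
years_on_disk = years_until_full(DISK_GB, JOURNALS_GB_PER_YEAR)

# How many years of those same 10 journals would equal the scanned Statutes.
statutes_in_journal_years = STATUTES_GB / JOURNALS_GB_PER_YEAR

print(f"Years before the drive fills: {years_on_disk:.0f}")
print(f"Statutes scan equals about {statutes_in_journal_years:.0f} years of the 10 journals")
```

By this estimate the drive would last a bit under 280 years, which squares with the "over 250 years" figure above, and the scanned Statutes at Large alone are equivalent to well over a thousand years of those 10 born-digital journals.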
Having said that, it will be interesting to see what uses GPO can put this material to.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.