Likely space requirements for digital deposit

One understandable concern that Federal Depository Libraries have expressed about locally storing and serving electronic government publications is that the storage requirements are unknown. How big a hard drive and how expensive a computer would a depository library need to maintain a local digital collection?

Our assumption at FGI is that if libraries were offered the option of digital deposit, they would only store materials that fit their depository profile. Thanks to the wonderful folks at Documents Data Miner, we now have a way to estimate storage requirements for particular depositories.

I selected the libraries below at random and used the “URL Locator” feature to determine how many electronic titles these libraries would have received in 2005 if a digital deposit system had been in place. I then calculated yearly storage space twice, once assuming an average document size of 5MB and once assuming 10MB, counting 1,000MB to the GB. For comparison, an Alaska State Document averages 3.7MB per document, so I’m being VERY conservative in my space estimate. If GPO has a figure for the average size of a federal electronic publication, I’d love to know it.

Here are my estimates for:

A Regional (100% Selection)
Number of Documents: 5855
Yearly Storage (5MB/Doc): 29.3 GB/yr
Yearly Storage (10MB/Doc): 58.6 GB/yr

Downey City Library (0041A / 7% Selection)
Number of Documents: 313
Yearly Storage (5MB/Doc): 1.6 GB/yr
Yearly Storage (10MB/Doc): 3.1 GB/yr

Alaska State Library (0016 / 15% Selection)
Number of Documents: 707
Yearly Storage (5MB/Doc): 3.5 GB/yr
Yearly Storage (10MB/Doc): 7.1 GB/yr

Cal State Univ Chico (0045B / 25% Selection)
Number of Documents: 2478
Yearly Storage (5MB/Doc): 12.4 GB/yr
Yearly Storage (10MB/Doc): 24.8 GB/yr

Cal State Univ LA (0062 / 58% Selection)
Number of Documents: 3572
Yearly Storage (5MB/Doc): 17.9 GB/yr
Yearly Storage (10MB/Doc): 35.7 GB/yr

Los Angeles Public Library (0057 / 84% Selection)
Number of Documents: 3721
Yearly Storage (5MB/Doc): 18.6 GB/yr
Yearly Storage (10MB/Doc): 37.2 GB/yr
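
For anyone who wants to rerun these numbers for their own library, here is a minimal Python sketch of the arithmetic. The document counts are the 2005 figures listed above. The function name, the dictionary, and the choice of decimal units (1 GB = 1,000 MB) with round-half-up to one decimal place are my own assumptions, so results may differ by a tenth of a GB from hand-rounded figures.

```python
from decimal import Decimal, ROUND_HALF_UP

# 2005 electronic title counts quoted in the post.
DOCS_2005 = {
    "A Regional": 5855,
    "Downey City Library": 313,
    "Alaska State Library": 707,
    "Cal State Univ Chico": 2478,
    "Cal State Univ LA": 3572,
    "Los Angeles Public Library": 3721,
}

MB_PER_GB = 1000  # assumption: decimal gigabytes


def yearly_storage_gb(num_docs, avg_mb_per_doc):
    """Return yearly storage in GB, rounded half-up to one decimal place."""
    total_gb = Decimal(num_docs * avg_mb_per_doc) / Decimal(MB_PER_GB)
    return total_gb.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


for name, docs in DOCS_2005.items():
    low = yearly_storage_gb(docs, 5)    # 5MB average document
    high = yearly_storage_gb(docs, 10)  # 10MB average document
    print(f"{name}: {docs} docs, {low} GB/yr at 5MB, {high} GB/yr at 10MB")
```

To estimate a different library, add its document count to the dictionary, or call yearly_storage_gb directly with another average size such as 4 for something close to the Alaska figure.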

With 300GB hard drives becoming common on $2,000 computers, even a Regional would have approximately five years’ worth of storage space on a single drive based on an average document size of 10MB, and ten years’ worth based on an average document size of 5MB. The rest of us would have a lot longer, but even smaller libraries can replace computers every five years. By then, hard drives will be even larger.
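
The drive-lifetime claim is a simple division, sketched below in Python. It assumes the whole 300GB drive is available for documents and that the yearly intake stays flat. Both are simplifications of mine, since the operating system takes some space and the number of electronic titles may grow. The function name is hypothetical.

```python
DRIVE_GB = 300  # drive size mentioned in the post


def years_of_capacity(drive_gb, yearly_gb):
    """Years until the drive fills, assuming constant yearly intake."""
    return drive_gb / yearly_gb


# Regional figures from the estimates above.
for label, yearly_gb in [("10MB/doc", 58.6), ("5MB/doc", 29.3)]:
    years = years_of_capacity(DRIVE_GB, yearly_gb)
    print(f"Regional at {label}: about {years:.1f} years")
```

Swapping in a smaller library’s yearly figure shows how much longer the same drive would last.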

CC BY-NC-SA 4.0 This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

1 Comment

  1. I just found this chart that shows cost for hard drive space over time. The latest date listed is 2004 and it was at about $1 / GB and dropping. So really, from an infrastructure standpoint, it’s never been cheaper to build and maintain a collection. Anyone know the cost/book for long-term storage to compare? Whatever the cost of paper storage, I’m sure digital is MUCH cheaper.
