Over the holidays, we switched FGI to new CMS software and a new theme and, in the process, installed some new back-end tools allowing us to do things like easily check for broken links. FGI went online in November 2004, so we have a little more than 9 years of outgoing links. Of those, 2676 link to .gov web sites and we discovered that 540 of those links are broken. That is about 20%.
That is actually lower than the 51% that the recent Chesapeake report found in its newest link rot study, but it is still disconcertingly high. For those libraries that rely on pointing to URLs in their OPACs as a means of linking users to information, these kinds of numbers can lead to one of two conclusions: either a) you had better do link checking and link repair frequently, or b) your “collection” is slowly disappearing. Adding to your workload is no fun, and angering your users with bad links probably does not encourage them to increase your funding for better services. As the Chesapeake reports concluded: “documents posted on web sites will disappear at an increasing rate over time.”
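For anyone who wants to run a similar count on their own links, here is a minimal sketch of the idea in Python. It is not the tool we used on FGI (that was a back-end add-on to our CMS); the function names `http_status`, `is_gov`, and `broken_share` are made up for illustration, and treating "no response or a status of 400 and above" as broken is my assumption, not a standard definition.

```python
from http.client import HTTPException
from typing import Callable, Iterable, List, Optional, Tuple
from urllib.error import HTTPError
from urllib.parse import urlsplit
from urllib.request import Request, urlopen


def http_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no usable response arrived."""
    try:
        # HEAD avoids downloading the document body; some servers reject it,
        # so a real checker might retry with GET before calling a link broken.
        request = Request(url, method="HEAD",
                          headers={"User-Agent": "link-check-sketch"})
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # The server answered, but with an error status such as 404 or 410.
        return err.code
    except (OSError, ValueError, HTTPException):
        # DNS failure, refused connection, timeout, or a malformed URL.
        return None


def is_gov(url: str) -> bool:
    """True when the URL's host is under the .gov top-level domain."""
    host = urlsplit(url).hostname or ""
    return host.endswith(".gov")


def broken_share(
    urls: Iterable[str],
    check: Callable[[str], Optional[int]] = http_status,
) -> Tuple[int, List[str], float]:
    """Return (number of .gov links, list of broken ones, broken fraction)."""
    gov_links = [u for u in urls if is_gov(u)]
    broken = []
    for url in gov_links:
        status = check(url)
        # Assumption: no response, or any 4xx/5xx status, counts as broken.
        if status is None or status >= 400:
            broken.append(url)
    share = len(broken) / len(gov_links) if gov_links else 0.0
    return len(gov_links), broken, share
```

Calling `broken_share(list_of_urls)` on a list of outgoing links returns the count of .gov links, the broken ones, and the fraction broken. With our numbers that fraction is 540 / 2676, or about 0.20. Note that a check like this only catches links that fail outright; a page that still answers but now shows different content will pass.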
As I browsed through the broken links on FGI, I found a variety of reasons for link breakage.
- abandoned domains. There is no “2010.census.gov” or any “amlife.america.gov” any more.
- cache problems. GPO uses Akamai technology to “cache” frequently requested documents on Akamai servers throughout the world so that requests for those documents can be completed more quickly. In two cases we carelessly copied the “akamaitech” cache URL instead of the actual GPO URL. I checked and the documents still exist at their GPO address. But I do wonder how often users (and even libraries?) make this mistake of copying a very temporary cache URL.
- redesigned sites change URLs. The House Appropriations Committee Subcommittee on Legislative Branch URL apparently changed from appropriations.house.gov/Subcommittees/sub_leg.shtml to appropriations.house.gov/Subcommittees/Subcommittee/?IssueID=34776 and the similar Senate subcommittee changed from appropriations.senate.gov/legislative.cfm to appropriations.senate.gov/sc-legislative.cfm
- minor changes. Why would the BLS change its Data Finder search page from /query to /find at http://beta.bls.gov/dataQuery/ ? At least the data finder is still there!
- e-government interest changes. What was once pandemicflu.gov is now flu.gov and blog.pandemicflu.gov is gone. HHS still has information about “Pandemic Awareness” but has evidently changed its focus to flu in general.
- re-branding. The “govgab” blog, once at blog.usa.gov/roller/govgab/ and later at govgab.gov, is either gone or maybe just replaced by blog.usa.gov/
- suspended blogs. A blog for “examining rumors, conspiracy theories and false stories” at blogs.america.gov/rumors/ has been “archived or suspended” — but we don’t know which or where any “archive” might be.
- temporary sites are … temporary. The site change.gov simply says “the transition has ended” and invites you to go to whitehouse.gov where, apparently, “agendas” have changed to “issues.” Compare http://change.gov/agenda/technology with http://www.whitehouse.gov/issues/technology/
- CMS changes? Why would HRSA change a nice, lean URL like datawarehouse.hrsa.gov/NSSRN.htm for the National Sample Survey of Registered Nurses Web site to datawarehouse.hrsa.gov/data/datadownload/nssrndownload.aspx ? My guess is they changed their software and are now using a new content management system, which dictates how URLs will be constructed.
- scrubbing? When a report is controversial, is it just easier to remove it than to keep it online? The link to the Wegman Report at the House Energy and Commerce Committee is broken.
- FDLP “out of date” information? FDLP is not immune to link rot. Where are the questions for a 2009 DLC discussion?
- GPO moves stuff too. It is not easy to work in a bureaucracy that itself changes and, in doing so, changes how it does things. Remember “The Federal Bulletin Board”? Probably not. Back in the 1990s it had 4,500 individual Federal agency files, in a variety of formats. GPO operated the bulletin board, which could “be accessed 24 hours a day, 7 days a week, by direct dialing 202/512-1387 from a modem using any communications software.” (Just type “/GO FAC.”!) As time moved on, GPO moved the files to permanent.fdlp.gov/fbb/, but they are not all there, at least not under the same links. This one, for example: http://fedbbs.access.gpo.gov/library/compare/compr5.pdf is now, apparently, here: http://beta.fdlp.gov/file-repository/about-the-fdlp/gpo-projects/legislative-comparison-report/1189-legislative-comparison-report-2008-revised/file
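Some of this sorting could be done by script rather than by browsing. The sketch below is one rough way to start, under stated assumptions: it flags links whose host contains “akamaitech” (the marker from the cache URLs mentioned above) and counts broken links per host, so that abandoned domains show up as clusters. The helper names are hypothetical, and it would not catch the subtler cases in the list, like scrubbed reports or renamed paths.

```python
from collections import Counter
from typing import Iterable
from urllib.parse import urlsplit

# Assumption: cache copies can be recognized by this string in the host name,
# based on the "akamaitech" URLs described in the list above.
CACHE_HOST_MARKER = "akamaitech"


def host_of(url: str) -> str:
    """Return the lowercased host of a URL, tolerating a missing scheme."""
    if "://" not in url:
        # Many links are written without a scheme, e.g. "govgab.gov".
        url = "http://" + url
    return urlsplit(url).hostname or ""


def looks_like_cache_url(url: str) -> bool:
    """True when the link points at a cache host rather than the original site."""
    return CACHE_HOST_MARKER in host_of(url)


def broken_by_host(broken_urls: Iterable[str]) -> Counter:
    """Count broken links per host; large counts suggest an abandoned domain."""
    return Counter(host_of(u) for u in broken_urls)
```

Running `broken_by_host(broken).most_common(10)` over a list of broken links would show which hosts account for the most breakage, which is a quick way to separate “the whole domain is gone” from “one page moved.”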
I’ll stop there. I’ve only looked at about one-tenth of the broken links, but the above should give us an idea of the kinds of problems we face with pointing instead of collecting. Of course, some of the information may still exist somewhere with different URLs, but some may be gone permanently. We should be telling our library managers that “pointing” is not a cheap way to provide good service. It is a laborious task that is not necessarily easier than collecting, and it certainly is not as reliable.
By the way: links to FGI pages are not immune from the kinds of link rot described above. We went to a lot of trouble in our switchover to a new theme to minimize broken links, but we know there are some that we were unable to duplicate. We’re still fixing the ones we can and invite you to let us know if you find any. We’re not saying FGI is better than .gov; we are saying that libraries should not rely on pointing. 🙂