Tag Archives: Permanent Public Access
Librarian of Congress Carla Hayden announced today that reports from the Congressional Research Service (CRS) are now online at crsreports.congress.gov. This is HUGE news indeed because many librarians and open government advocates have been asking for this for at least 25 years.
The site is a good first step, and hopefully it will only get better over time. For example, I’d love to see CRS reports offered in multiple formats (not just PDF) and in bulk, distributed to FDLP libraries, and accompanied by MARC records from LoC so that libraries could download the metadata and add it to their local catalogs, as DOE’s Office of Scientific and Technical Information (OSTI) has been doing for years.
However, Daniel Schuman, one of the co-founders of everyCRSreport.com and a long-time advocate for public access to CRS reports, points out that the site leaves much to be desired so far:
I messed up my thread on the new CRS reports website. Bottom lines:
-They are missing THOUSANDS of reports
-They're disclosing author names
-Faceted searching appears decent (but slow)
-They have some archival reports with stable URLs, but possible implementation problem
— Daniel Schuman (@danielschuman) September 18, 2018
Many of us are hopeful that the site will continue to improve over time and that the Library of Congress will reach out to the library and open government communities for ideas on how to make the site better for public access. Rome, and the CRS reports database, were not built in a day 😉
I’m pleased to announce that, for the first time, the Library of Congress is providing Congressional Research Service (CRS) reports to the public. The reports are available online at crsreports.congress.gov. Created by experts in CRS, the reports present a legislative perspective on topics such as agriculture policy, counterterrorism operations, banking regulation, veteran’s issues and much more.
Founded over a century ago, CRS provides authoritative and confidential research and analysis for Congress’ deliberative use.
The Consolidated Appropriations Act of 2018 directs the Library to also make CRS reports publicly available online. We worked closely with Congress to make sure that we had a mutual understanding of the law’s requirements and Congress’ expectations in our approach to this project.
The result is a new public website for CRS reports based on the same search functionality that Congress uses – designed to be as user friendly as possible – that allows reports to be found by common keywords. We believe the site will be intuitive for the public to use and will also be easily updated with enhancements made to the congressional site in the future.
Many were thrilled earlier this spring when the FY 2018 Omnibus Appropriations Law included a provision for “public access to all non-confidential CRS reports.”
Well, not so fast it seems. Daniel Schuman, Kevin Kosar, and Josh Tauberer (3 folks doing great work over the last several years on the CRS reports issue) have found that the “Library plan to publish CRS reports falls short of the law, and is unduly expensive.” The LOC plan “does not comport with the law or best practices for creating websites and is unusually expensive,” they wrote. By contrast, their own collection of 14,000 reports on everyCRSreport.com cost about $20,000. Another point of criticism is that the Library plans to publish the reports only as PDF files, rather than in both HTML and PDF formats, making them harder to access on mobile devices and potentially inaccessible to people with visual impairments. The plan also apparently ignores a directive to publish a separate index of all the reports published by CRS, they said, which would make it easier for laypeople to see all available documents at once.
The group makes several important recommendations. To comply with the law, the Library should:
- Update its implementation plan to ensure that it publishes all CRS reports — we believe there are many more than the 2,900 the Implementation Plan says will be published by Spring 2019 — by the statutory deadline of September 19 of this year. We request it aim for September 17th, which is Constitution Day. The Library’s implementation extends beyond April of next year;
- Update its implementation plan to include all CRS Reports, including insights, infographics, sidebars/legal sidebars, in focus, and testimony;
- Revise its implementation plan to ensure that HTML versions of the reports are available to the public just as they are already available to Congressional staff — this would help the visually impaired read the reports as well as allow reports to be read on mobile devices;
- Revise its implementation plan to include an index of CRS reports, in accordance with the law’s requirements; and
- Review the code we published to see whether it would help the Library meet its obligations, in particular our automated author information redaction functionality, or whether the Library could develop an automated tool that would enable it to comply with the timeline.
With respect to the website design, the Library should:
- Consult with the Government Publishing Office and the public on how best to implement bulk access;
- Develop a plan to respond to any initial heavy loads on the website;
- Implement a robust website search capability and develop a plan to do so;
- Create predictable URLs for CRS reports and a landing page for a report series, and set forth a plan to do so;
- Keep down costs by examining our approach to see whether it can use some of our techniques to save money; and
- Consider engaging an entity like the General Services Administration’s 18F to help keep down costs and ensure a quality product.
So far, attempts to communicate with the Library of Congress have fallen on deaf ears. So if any of our readers have connections to Carla Hayden’s office, please forward this on to her.
The End of Term 2016 collection is still going strong, and we continue to receive email from interested folks about how they can help. Much of the content for the EOT crawl has already been collected and some of it is publicly accessible already through our partners. Last month we posted about ways to help the collection process. At this point volunteers are encouraged to help check the archive to see if content has been archived (i.e., do quality assurance (QA) for the crawls).
Here’s how you can help us assure that we’ve collected and archived as thoroughly and completely as possible:
Step 1: Check the Wayback Machine
Search the Internet Archive to see if the URL has already been captured. Please note this is not a specific End of Term collection search and does not include ALL content archived by the End of Term partners, but will be helpful in identifying whether something has been preserved already.
You may type in specific URLs or domains or subdomains, or try a simple keyword search (in Beta!).
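For anyone checking many URLs, the same lookup can be scripted. The sketch below is only an illustration, not an End of Term tool: it assumes the Internet Archive's public availability endpoint at `https://archive.org/wayback/available` and the JSON shape (`archived_snapshots` / `closest`) that endpoint has returned in the past, and the helper names `build_query`, `closest_snapshot`, and `check` are made up for this example. Like the web search, it covers the Wayback Machine in general, not the End of Term collection specifically.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the Internet Archive's public availability API.
WAYBACK_AVAILABLE = "https://archive.org/wayback/available"


def build_query(url, timestamp=None):
    """Build the lookup URL for a page, optionally near a YYYYMMDD timestamp."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_AVAILABLE + "?" + urlencode(params)


def closest_snapshot(response_text):
    """Return (snapshot_url, timestamp) for the closest capture, or None.

    Assumes a response shaped like:
    {"archived_snapshots": {"closest": {"available": true, "url": ..., "timestamp": ...}}}
    """
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


def check(url, timestamp=None):
    """Query the live service (requires network access)."""
    with urlopen(build_query(url, timestamp), timeout=30) as resp:
        return closest_snapshot(resp.read().decode("utf-8"))
```

Calling `check("www.stlouisfed.org", "20161215")` would return the closest capture to that date if one exists, or `None` if nothing is reported. A `None` result is a hint that the page may be worth nominating, but it is not proof that no End of Term partner has captured it.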
1a: Help Perform Quality Assurance
If you do find a site or URL you were looking for, please check whether it was captured completely. A simple way to do this is to click around the archived page: try the navigation, links on the page, images, etc. We need help identifying parts of sites that the crawlers might have missed, for instance specific documents or pages that you are looking for but that we haven’t archived. Please note that crawlers are not perfect and cannot archive some content. IA has a good FAQ about the challenges crawlers face.
If you do discover something is missing, you can still nominate pages or documents for archiving using the link in step 3 below.
Step 2: Check the Nomination Tool
Check the Nomination Tool to see if the URL or site has been nominated already. There are a few ways to do this:
- View all reports here
- Check this list here for a list of everything nominated or search here.
- You can also check our bulk lists here
Step 3: Nominate It!
If you don’t see the URL you were looking for in any of those searches, please nominate it here.
Questions? Please contact the End of Term project at eot-info AT archive DOT org.
[Editor’s note: Updated 12/15/16 to include updated email address for End-of-Term project queries (eot-info AT archive DOT org), and information about robots.txt (#1 below) and databases and their underlying data (#5 below). Also updated 12/22/16 with note about duplication of efforts and how to dive deeply into an agency’s domain at the bottom of #1 section. jrj]
Here at FGI, we’ve been tracking the disappearance of government information for quite some time (and librarians have been doing it for longer than we have; see ALA’s long-running series published from 1981 until 1998 called “Less Access to Less Information By and About the U.S. Government.”). We’ve recently written about the targeting of NASA’s climate research site and the Department of Energy’s carbon dioxide analysis center for closure.
But ever since the NY Times last week wrote a story “Harvesting Government History, One Web Page at a Time”, there has been renewed worry and interest from the library and scientific communities, as well as the public, in archiving government information. There’s also been increased interest in the End of Term (EOT) crawl project. Though there’s increased worry about the loss of government information with the incoming Trump administration, it’s important to note that the End of Term crawl has been going on since 2008, with both Republican and Democratic administrations, and will go on past 2016. EOT is working to capture as much of the .gov/.mil domains as we can, and we’re also casting our net to harvest social media content and government information hosted on non-.gov domains (e.g., the St Louis Federal Reserve Bank at www.stlouisfed.org). We’re running several big crawls right now (you can see all of the seeds we have here as well as all of the seeds that have been nominated so far) and will continue to run crawls up to and after the Inauguration as well. We strongly encourage the public to nominate seeds of government sites so that we can be as thorough in our crawling as possible.
Government information specialists know the value of the information that government agencies gather, create, assemble, and distribute, but wouldn’t it be nice to have a book that documents that value and provides examples of how that information is used? Wouldn’t it be nice to have a book that doesn’t just list useful databases, but describes the missions and histories of the agencies that produce the information?
Back in 2013, Dr. Miriam Drake, longtime director and dean of libraries at Georgia Institute of Technology, wanted to create such a book: a book about the value of public information and how the communities that libraries serve actually use that information. The result is this new book that we think deserves the attention of practicing government information professionals and teachers:
- Public Knowledge: Access and Benefits, Edited by Miriam A. Drake and Donald T. Hawkins, Foreword by Judith Coffey Russell. Medford NJ: Information Today, Inc. (2016).
Government documents librarians know and use FDsys (and now govinfo), and USA.gov, and the Catalog of Government Publications and specialty web sites like the Census Bureau’s American FactFinder and the Bureau of Economic Analysis and The National Archives and Congress, and GPO’s federated search engine MetaLib, and probably at least a few more. But after the basics, it is hard to keep track of the wealth of information available and how to find it. You might know, for example, that there are 123 U.S. federal government agencies that collect and distribute important statistical data, but how do you find it and which agency is best for which statistic? Have you ever used the Library of Congress’s Performing Arts Encyclopedia, or thought about the non-government, public knowledge in the LoC, such as historic newspapers online? How many of the Databases, Resources & APIs at the National Library of Medicine have you explored? You’ve used USA.gov, but have you tried Science.gov or WorldWideScience.org? Are you helping your community find datasets, but you haven’t used OSTI data explorer?
And, if you have used some of those, but haven’t had time to understand the subtle differences between databases or agencies (e.g., do you know when to use NASA Technical Reports Server and when to use The National Technical Information Service?), you will find this book useful. This book will be useful for those who answer reference questions and work with communities who need information in almost any discipline. It gives the historical context of the development of the vast government information infrastructure and describes how agencies are changing rapidly and planning for the future. If you are a new or “accidental” government information librarian, or if you teach government documents, this book is for you.
And, yes, we wrote a chapter of this book, but we’d be praising its utility even if we were not part of it. The publisher has kindly allowed us to offer you a PDF copy of the chapter we wrote for this book.
- Beyond LMGTFY*: Access to Government Information in a Networked World. by James A. Jacobs and James R. Jacobs. (*LMGTFY = “Let me google that for you”)
Every chapter is different and every chapter is worthwhile. Here is a complete list of the chapters and authors.
Table of Contents
- The Relationship Between Citizen Information Literacy and Public Information Use. Forest “Woody” Horton Jr.
- Beyond LMGTFY: Access to Government Information in a Networked World. James A. Jacobs, University of California-San Diego Library, and James R. Jacobs, Stanford University Libraries.
- Government Resources in the Classroom. Susanne Caro, Maureen and Mike Mansfield Library, University of Montana.
- The U.S. Government Publishing Office. Miriam A. Drake and Donald T. Hawkins.
- The Library of Congress. Miriam A. Drake.
- The National Library of Medicine. Katherine B. Majewski, MEDLARS Management Section, and Wanda Whitney, Reference and Web Services Section, National Library of Medicine.
- The Department of Energy Office of Scientific and Technical Information, Part 1: Extending the Reach and Impact of DOE Research Results. Brian A. Hitson and Peter M. Lincoln, Department of Energy Office of Scientific and Technical Information.
- The Department of Energy Office of Scientific and Technical Information, Part 2: Bringing the World’s Research to DOE. Brian A. Hitson and Peter M. Lincoln, Department of Energy Office of Scientific and Technical Information.
- NASA’s Scientific and Technical Information for a Changing World. Lynn Heimerl, NASA STI Program.
- The National Technical Information Service: Public Access as a Driver of Change. Gail Hodge, IIa (Information International Associates).
- Federal Statistics Past and Present. Mark Anderson, Michener Library, University of Northern Colorado.
- Agricultural Information and the National Agricultural Library. Marianne Stowell Bracke, Purdue University Libraries.
- Hidden Government Information. Miriam A. Drake.
- The Future Is Open. Barbie E. Keiser, Barbie E. Keiser, Inc.
James A. Jacobs
James R. Jacobs