gpo
Demonstration videos of GPO's FDsys database
Submitted by jrjacobs on Fri, 2008-11-28 15:18.
Check out the search demonstrations of GPO's FDsys (nee Future Digital System). GPO's Federal Digital System (FDsys) will "manage federal govt documents, allow them to be uploaded, accessed via the internet, included in the depository library program (italics added!), and preserved for the future." The video images are a bit fuzzy, but you can see that the basic utility of FDsys from an end-user's perspective is getting close to full functionality. I'm most interested in APIs and other tools and services for exporting large chunks of data and associated metadata for reuse, digital deposit into library repositories/LOCKSS caches, etc., and generally being able to expand on access, preservation and long-term sustainability. Hopefully, future video demonstrations will elaborate on those possibilities.
- part 1: simple search
- part 2: advanced search
- part 3: citation search
- part 4: boolean search
- part 5 is mentioned in part 4, but there's no video available as of 11/28/08 from GPO's youtube page.
Questions and comments should be emailed to pmo AT gpo DOT gov. Feel free to leave comments here as well.
Give Your Feedback on FDL Video
Submitted by dcornwall on Wed, 2008-11-19 21:12.
Today, thanks to subscribing to the "fdlp" tag on del.icio.us, I was introduced to the first video that GPO produced as part of its "Easy as FDL" campaign:
Since GPO is allowing ratings and comments on this video, I really want you to go and watch, rate and comment. You need to have a YouTube account to rate and comment, but it's easy to set up. If you'd prefer not to set up a YouTube account, please leave your name and comment and I'll post it for you.
I rated the video a 3 out of 5. It's a great video for people already interested in the Federal Depository Library program. If I weren't a former depository librarian, I don't think I would have hung out until 3 minutes in when they started talking about what the program could do for me.
Don't get me wrong, I appreciated all the librarians and GPO staff who appeared in the video. Plus the production values were excellent and light years beyond what *I'll* ever come up with. It just didn't feel user oriented until the middle. And today's potential users won't wait that long.
Here are the suggestions I left at YouTube:
I'd strongly recommend flipping the content of this video and leading off with Cindy Elkins talking about the types of questions that can be answered at an FDL, then Mary Alice and the others highlighting material (Adventures of Echo the Bat, etc) that's available. Then end with background on the program. Hook people first, then explain. Finally, the end URL should be to the Depository Directory and not GPO Access. Though you should make videos about GPO Access!
GPO also posted several versions of the video and more background information at http://www.fdlp.gov/promotion/easyasfdlvideo.html.
Watch the video for yourself and let us know what you think, preferably at YouTube, but here will do.
Finally, despite the comments above, it is a GREAT THING that GPO is producing videos and other promotional content. Let us, the librarians who work with users every day, help them tweak what are decent products into real user creation machines. But bless them for giving us something to work with!
Promoting the FDLP
Submitted by jrjacobs on Tue, 2008-11-04 14:11.
The GPO just released a new video promoting the Federal Depository Library Program (FDLP), a network of 1250 libraries across the US where the public can get access to government information in a variety of formats. This is a nicely done, succinct explanation of the FDLP. It's nice to see friends and fellow government documents librarians expounding on the FDLP! Thanks GPO!!
FDLP Interactive Community Site
Submitted by rdavis on Thu, 2008-08-28 15:07.
As we all know, the Federal Depository Library Program (FDLP) consists of libraries throughout the United States. While geographic separation is key to putting our Government’s information into the hands of the American people, Federal depository librarians have been at a bit of a disadvantage when it comes to connecting to their colleagues.
All that is about to change! The U.S. Government Printing Office (GPO) has developed an interactive community site that is available to Federal depository librarians.
Currently available in beta mode, I encourage the community to check out the site and provide feedback during the beta period. Located at http://community.fdlp.gov, the site offers the following features:
- Create an online profile that includes an avatar, contact information, biography, the ability to self-identify expertise, and more. Profiles are not publicly accessible for security purposes.
- Based on user profiles, members can search for other users. For example, you can search for all users from academic libraries in the state of Kansas who are members of ALA or all those who self-identify as experts in Geography & Earth Science.
- Create buddy lists.
- Send private messages to users.
- Blog about issues that are important to you and the community. Blogs can include images, links, videos, and more.
- Comment on user blogs.
- Create photo albums and upload images.
- Add events to the community calendar.
- Add links to Web resources on a variety of topics.
As part of the beta launch, users can peruse the site and provide overall feedback, but will be unable to create accounts and populate/test the interactive features listed above. Users who would like to participate in a more hands-on test can sign up to become a beta test user. We are limiting the closed beta test to the first 30 members of the Federal depository library community who sign up. Accounts for beta testers will be created and sent on or about September 3rd. Testing will be open for two weeks.
To sign up to be a beta tester and to find out more information, complete this form on the FDLP Desktop: http://www.fdlp.gov/latest/betatesters.html
More features are coming to the FDLP Desktop in the coming weeks. As part of my last blog post here at FGI, here is a taste of what is coming:
- While blogs are great for expressing individual ideas and comments, they are not as conducive to discussion. Listservs, meanwhile, generate a lot of email in our already overwhelmed inboxes. Our next unveiling will be the FDLP Community Forum. Integrated into the FDLP Community site, and so sharing a single login, the forum will give the community the ability to discuss a variety of issues/topics while also offering the ability to create sub-communities, search threads, bookmark threads/topics, share files, and much more!
- Also in the works is a redesign of the FDLP Desktop. We have learned a great deal since our initial redesign and are preparing to unveil the next generation. You may notice that several of the FDLP Community site features listed above mirror those on the current FDLP Desktop. The upcoming re-release of the FDLP Desktop will be for library coordinators only and will be focused on disseminating FDL Program-specific content only. Most interactive features are moving to the FDLP Community site.
Stay tuned. We have more up our sleeves as well.
Once again, thank you for the opportunity to be FGI's guest blogger. I have thoroughly enjoyed the experience and will share my thoughts here from time to time in the future.
Catalog of U.S. Government Publications Enhancements Coming
Submitted by rdavis on Wed, 2008-08-27 10:02.
Library Services and Content Management is continually working to improve the Catalog of U.S. Government Publications and the services it provides. One of the upcoming services that we are excited about is the creation of a login page for depository libraries that will enable them to take advantage of a range of authenticated services not otherwise available. These include:
- Selective dissemination of information. This will give depositories the ability to direct the system to send emails when resources in a particular area of interest are cataloged. Depository libraries will be able to set up notifications by item number or by SuDocs stem, for example;
- “Save records to local pc”. Currently the options are to email records to a defined email address up to twenty at a time, or to search, retrieve, and download up to one thousand records from the CGP per session.
- RSS feeds;
- Retained preferences that will persist across sessions;
- Links to FDLP-related pages including the FDLP Desktop and the Federal Depository Library Directory.
We are anticipating a demonstration of the FDLP login page at the Fall Conference and a subsequent December release of this functionality.
Also on the agenda is an enhanced Federal Depository Library Directory. We would like to ask for input from users for improvements we could make to the FDLD to enhance the user experience. Please submit suggestions through AskGPO at http://gpo.custhelp.com/cgi-bin/gpo.cfg/php/enduser/ask.php. Use the category Federal Depository Libraries, subcategory Catalog of U.S. Government Publications, then CGP Enhancements/Suggestions.
Federal Document Authentication--What level is appropriate?
Submitted by rdavis on Sun, 2008-08-24 19:51.
As I am sure you know, we at GPO have been talking with the library community for several years now about our authentication efforts. This year, we were able to move beyond the discussion phase and implement authentication technology into some of our top GPO Access applications. In early 2008, we integrated an Automated PDF Signing system into our GPO Access workflows, and we successfully released the digitally signed and certified FY 09 Budget of the United States and 110th Congress Public and Private Laws documents on GPO Access. Digitally signing these publications was just the stepping stone for implementing our authentication initiative. Upon approval from publishing agencies, all publications ingested into the Federal Digital System (FDsys) will be digitally signed and certified in the future.
In addition, we will implement authentication technology at the granular level. Granular content, as described in relation to the FDsys, is content that is broken into smaller content units such as chapters, parts, or sections. Our next challenge is to identify at what level of granularity content should be authenticated and digitally certified for each content format. I am very interested in your thoughts on the level of granularity at which GPO should authenticate content, so that I can share them with the team developing FDsys. I am also interested in learning more about your opinions and expectations for the future in relation to GPO’s authentication initiative. For more background on our authentication initiative, please visit http://www.gpoaccess.gov/authentication/.
Identifying Value in Being a Federal Depository library
Submitted by rdavis on Thu, 2008-08-21 10:07.
As one means of seeking input for the strategic plan on the FDLP's future, I am sending a letter to each Depository Library Director this week asking them to identify the value depository designation creates at the local level for the library, its staff, and users. The letter also actively seeks success stories and anecdotes about the value of the depository to feature on the FDLP Desktop.
What are the various ways your library derives value from the FDLP? How do your users benefit by using depository resources? Do you have success stories to share or anecdotes? What are ways GPO can improve the value of the FDLP? How can GPO assist in improving the value of the depository to you, your library and community?
Subject: GPO and Library Services are Going Green!
Submitted by rdavis on Tue, 2008-08-19 18:28.
GPO’s Library Services and Content Management (LSCM) unit is committed to carrying out our mission of "Keeping America Informed" by producing and distributing a vast array of Federal government information products and has been doing so for over 140 years. LSCM is making strides not only to keep and strengthen this commitment, but to do so in an eco-friendly manner. LSCM has looked for ways to improve existing services and practices while doing its part to preserve the earth for generations to come.
Reducing paper usage has been one major area of focus within LSCM. As many GPO resources are making an electronic transition, GPO is doing its part to save on paper waste. Some of the important resources that have gone electronic are:
- The Federal Depository Library Directory (FDLD): Through a new, dynamic online interface, the FDLD provides important information on every depository library, such as mailing address, Web site, Director, Depository number and more. Additionally, depositories can edit their own library’s information online.
- The Federal Depository Library Handbook: Now online as a living document, the Handbook contains legal requirements, program requirements, and guidance for depository operations. Each chapter also includes best practices, tips, and resources for library administrators.
- At the recommendation of the Depository Library Council to the Public Printer, GPO requested of the Joint Committee on Printing that there be a waiver of the requirement of Title 44, Section 1711 to print the Monthly Catalog of U.S. Government Publications and the Congressional Serial Set Catalog. This was approved and GPO has instituted an online replacement using the OPAC module of our Aleph Integrated Library System. The Catalog of U.S. Government Publications with the searchable subsets of serial set, periodicals, and serials records online has proven to be a successful new online resource.
- The List of Classes: Previously, this publication was published bi-monthly, and two copies were sent to each depository. In current practice, this publication is published twice per year and one copy is sent to each depository. Electronic files of data from the List of Classes are updated monthly and uploaded to GPO’s Federal Bulletin Board on the first Friday of each month.
- Administrative Notes: Now available in electronic form only.
- Item selection update cycle materials: These are now online functions.
LSCM’s Depository Distribution business unit has undergone some eco-friendly changes as well. Process changes in invoice management have greatly reduced the number of photocopied pages produced during packaging and preparation. Also, depository shipments formerly required upwards of 10 zone sheets each; new processes use only one. For box preparation, large boxes were formerly used, and filler was added if boxes were not at full capacity. Now smaller boxes are used, and filler has been eliminated.
LSCM’s use of the Customer Relationship Management (CRM) system for answering customer inquiries has significantly reduced paper waste. The CRM provides an electronic record of customer transactions, which previously were recorded in paper logs.
In each of LSCM’s intricate processes in producing and distributing Federal government publications, all materials which can be recycled are recycled. Cardboard, microfiche, printer cartridges, and paper are recycled throughout each process in LSCM. Furthermore, LSCM management has undertaken a mission to educate staff with small tips to reduce waste, such as using paper efficiently and conserving electricity and water.
As we work to help GREEN the GPO, tell us what you think. Any suggestions? How are all of you doing this at your libraries?
Some exciting things have been happening at GPO in the world of digitization
Submitted by rdavis on Wed, 2008-08-13 13:55.
As you have likely heard by now, we have a goal of digitizing all retrospective federal publications back to the earliest days of the Federal Government. A Request for Proposal (RFP) for Mass Digitization Opportunities has now been released via Federal Business Opportunities. Here's a link to this proposal and additional information on GPO's digitization initiatives. Proposals are due by September 19, 2008.
We are in search of a cooperative, mutually beneficial relationship with private or public sector participants where the uncompressed, unaltered files created as a result of the conversion process are delivered to GPO at no cost to the Government. These files will serve as the digital master copies that will be preserved and used for the creation of access derivatives within GPO's Federal Digital System. In exchange, the contractor will be able to maintain a collection of files produced in the process for inclusion in their collections (e.g., search indices, book search sites). This content will be made available online, free of charge from GPO.
Also, if you haven't yet seen it, we have re-launched the Registry of U.S. Government Publication Digitization Projects, which contains records for projects that include digitized copies of publications originating from the U.S. Government.
The Registry...
- serves as a locator tool for publicly accessible collections of digitized U.S. Government publications;
- increases awareness of U.S. Government publication digitization projects that are planned, in progress, or completed;
- fosters collaboration for digitization projects.
GPO is actively soliciting all interested parties who plan to digitize federal publications within the scope of the FDLP to contribute to the registry of digitization projects.
I am very interested in hearing what you think about GPO's direction regarding digitization and where you would like to see us go.
GPO Forecasting the Future With FDLP Partners
Submitted by rdavis on Mon, 2008-08-04 18:46.
In administering the Federal Depository Library Program (FDLP) in partnership with Federal depository libraries, GPO relies heavily upon Title 44 of the United States Code, GPO's A Strategic Vision for the 21st Century (PDF) document, the Depository Library Council document Knowledge Will Forever Govern (PDF), along with policy documents, whitepapers, and of course feedback from our FDLP partners.
At the Spring Depository Library Council Meeting in Kansas City, Missouri, GPO had a session on Shaping and Transforming the Future of the FDLP. A deliverable from this session is that we are preparing a draft Strategic Plan for the next Federal Depository Library Conference and Fall Depository Library Council Meeting.
I am very interested in your thoughts on additional sources of information GPO should consider as we prepare this document, thinking 10 or more years out into the future, and the impact of potential technology changes. As an example, if you have not seen it, the Elon University/Pew Internet Project site Imagining the Internet: A History and Forecast is worth reading. Of particular note is the Forward 150, Back 150 section.
Partnering With GPO
Submitted by rdavis on Sun, 2008-08-03 09:01.
GPO recognizes that with the ever-increasing amount of electronic U.S. Government information, we need your help! Since 1997, depository libraries have worked with GPO to ensure permanent public access to electronic content and to provide services to assist other depositories and the public by becoming a GPO partner.
Our recent partnerships include:
- Government Information Online: Ask A Librarian
- Homeland Security Digital Library
- Historical Publications of the U.S. Commission on Civil Rights
Does your library have a project, resource, or service that would benefit the depository library community and the public? Consider a partnership with GPO and have a direct impact upon citizens' access and use of government information. Learn more about GPO's partnership program.
The ever-increasing amount of electronic U.S. Government information requires a team effort.
Right to govt information the cornerstone of the GPO
Submitted by rdavis on Fri, 2008-08-01 12:45.
For my first blog post, I want to begin by saying that I am honored to be the guest "Blogger of the Month" and thrilled to share my thoughts on various tools, technologies, trends, and other events that impact my professional passions.
Our Founding Fathers fought to ensure that the people have the right to their Government's information. That is the cornerstone of the U.S. Government Printing Office's (GPO) Federal Depository Library Program (FDLP) and a cause to which I have dedicated the past 16 years of my Federal service.
Technology and innovation are core elements necessary for change. As the Director of Library Services & Content Management and the acting Superintendent of Documents at GPO, I am constantly seeking ways in which GPO and the FDLP can improve the dissemination of Federal information to the public as well as ways in which GPO can better serve our library partners.
Many of the topics that I will be covering this month will relate to the challenges of keeping up with the latest technology, improving our services to our partner libraries, and ways in which we can stay innovative. I am eager to read your feedback and encourage readers to stay in touch even after I pass the torch to next month's blogger.
GPO's draft regional libraries report and FGI comments
Submitted by jrjacobs on Sun, 2008-07-06 09:51.
A few weeks ago, the Government Printing Office released their draft report entitled, Regional Depository Libraries in the 21st Century: A Time for Change? and asked for comments until June 30. I'm not sure how many comments they received, but I wanted to publish the comments we submitted. Lynne Bradley, Director of the American Library Association Washington Office, DID submit comments that were endorsed by the Association of College & Research Libraries (ACRL), the Association for Library Collections & Technical Services (ALCTS), and the Government Documents Roundtable (GODORT). GODORT republished Ms. Bradley's letter on their wiki.
While we are in general agreement with ALA's letter calling for increased flexibility of Title 44 (*not* wholesale changes in the title) and increased appropriations for GPO initiatives and "regional depository libraries to help offset the costs of storing and preserving government property," our comments deal with the more philosophical issues embedded in the draft report. Please let us know what you think.
I. Delete from the report all uses of the adjective "legacy" when referring to collections. The use of the word "legacy" as an adjective comes from computer science and is used to indicate things that are "outdated" and "undesirable." When the report uses the phrase "legacy collections" it implies that it is referring to unwanted and outdated collections. (The report uses "legacy" as an adjective in only one other context: in its reference to sections 1911 and 1912 of Title 44 USC as "Legacy Sections" -- apparently in order to define these sections as out of date and undesirable.) Thus, the use of the phrase "legacy collections" is either inaccurate and misleading, or imprecise.
In its place GPO should use phrases that accurately describe the collections it wishes to discuss. For example, in place of "legacy collections" the report could use phrases such as "collections without adequate bibliographic records" or "collections of print materials" or "collections without digital equivalents" or other phrases that accurately describe the collections GPO is referring to.
If GPO does wish to refer to unwanted out of date materials it should describe them that way explicitly rather than use the term "legacy."
II. The report should more explicitly and accurately address the difference between roles and responsibilities that are legally mandated and those that have been assumed without a legal mandate.
Specifically, we object to the following sentences of the report (Section V.B. pages 16-17) that gloss over these differences. (These sentences refer to Public Law 103-40, The Government Printing Office Electronic Information Access Enhancement Act of 1993.)
The implementation of the GPO Access Act ushered GPO into the online age and accelerated the paradigm shift in the FDLP that changed GPO’s relationship with depository libraries. Regional depositories have the responsibility for permanent public access in the tangible publication environment. In the online information environment GPO has assumed primary responsibility for ensuring content and permanent public access. [emphasis added]
We suggest the following wording instead:
While the GPO Access Act specifically required GPO to "provide a system of online access" and to "operate an electronic storage facility for Federal electronic information," it did not specify any change in the roles of the depository libraries. It added new roles for GPO, but did not reduce, alter, or delete the roles of depository libraries.
Since 1993, Congress has consistently provided funds to GPO for the "distribution" of government publications to designated depository libraries. This wording was carefully chosen. In 2000 the House attempted to substitute the wording "on-line access" for "distribution," but that language was rejected.
Nevertheless, GPO has chosen to implement this law in a way that is shifting the relationship between GPO and depository libraries. GPO has chosen to assume responsibility for permanent public access to digital materials and has chosen not to offer digital deposit as an option to FDLP libraries.
This has resulted in a paradigm shift in access, preservation, and service within the FDLP. Instead of relying on FDLP libraries and their different locations, funding, and technological infrastructures, GPO has chosen to implement policies a) that do not "distribute" digital objects to FDLP libraries, b) that make it difficult for FDLP libraries to build local digital collections, and c) that create a preservation system that depends on a single centralized collection with a single funding source.
While these choices seemed appropriate 15 years ago, much has changed over the years. Many libraries are developing institutional repositories and other digital collections. In a survey in August of 2005, 85% of responding FDLP libraries expressed "high" or "very high" interest in being able to "pull" content from GPO and 65% were equally interested in GPO "pushing" digital content to FDLP libraries. In the current survey of Regionals, 52% expressed a willingness to receive digital files on deposit. Commercial and open source software for managing digital collections is now widely available. As we look at new models and roles for FDLP libraries, we need to consider true digital deposit as a viable and important option. We need to look beyond the now-old model of relying solely on GPO having primary responsibility for ensuring content and permanent public access.
EPA Tagging Results - Ready and Promising
Submitted by dcornwall on Wed, 2008-05-07 19:11.
Our report on our experiment in using del.icio.us to tag EPA documents originally harvested by GPO is now completed and available for your review and comment at http://freegovinfo.info/node/1825.
For more information about this project, including a list of tags assigned to documents by project participants, please see http://freegovinfo.info/epatagging.
Our thanks to the project participants!
EPA Tagging Results and Future Directions
Submitted by dcornwall on Wed, 2008-05-07 18:57.
Back in January we asked people to use del.icio.us to tag a sample of 32 documents taken from the 100 EPA documents posted by the Government Printing Office (GPO) to http://www.gpoaccess.gov/harvesting/index.html.
We asked people to tag documents from 1/18/2008 through /18/2008. A spreadsheet of the results is available at http://spreadsheets.google.com/pub?key=pybymZBlZ80PVat2ggty2GA.
This brief article informally discusses some of our results, offers some lessons learned, and offers suggestions for future projects. Finally, a short list of articles on other research relating to tagging is presented.
1) Findings
- Number of tagged documents - 31
- Average number of people tagging a given document - 2.5
- Highest number of taggers for a document - 8, for the document "Environmental Results Under EPA Assistance Agreements"
- Average number of deduplicated tags per document - 11.25
- Number of documents with descriptions - 31, with a majority of documents having more than one human generated description.
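As a rough illustration of how a figure like the average number of deduplicated tags per document could be computed, here is a minimal Python sketch. The sample data is made up for the example, the function names are hypothetical, and treating tags that differ only in capitalization as duplicates is an assumption; this write-up does not spell out the deduplication rule that was actually used.

```python
def dedupe_tags(tags_by_user: dict[str, list[str]]) -> set[str]:
    """Merge every user's tags for one document, ignoring case (an assumed rule)."""
    return {tag.lower() for tags in tags_by_user.values() for tag in tags}


def average_unique_tags(documents: dict[str, dict[str, list[str]]]) -> float:
    """Average number of deduplicated tags across all documents."""
    counts = [len(dedupe_tags(users)) for users in documents.values()]
    return sum(counts) / len(counts)


# Hypothetical sample: document title -> user -> tags that user assigned.
sample = {
    "Doc A": {"user1": ["epa", "energy", "homes"], "user2": ["EPA", "hvac"]},
    "Doc B": {"user1": ["epa", "air"]},
}
print(average_unique_tags(sample))  # (4 + 2) / 2 = 3.0
```

A stricter rule, for example also merging variants such as "air-sealing" and "airsealing", would lower the per-document counts.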
2) Some Promising Results
While we would have liked to have seen more participation (see below under "study limitations"), these initial results are somewhat positive. There is some interest in tagging. Tagged documents tended to receive meaningful descriptions beyond what a brief bibliographic record would provide. For example, for the document "Air Sealing: Building Envelope Improvements", we have the following descriptions from five users:
* Mount Desert Spring Water was able to win a bid to provide bottled water and water coolers to the University of Maine. Mount Desert Spring Water was successful because the water coolers it provided were energy efficient and the lowest cost to the Universi - samchap
* Describes the benefits of proper air sealing for homes. EPA awards the EnergyStar when legal minimum standards are exceeded. - mkvs
* Conserving energy in your house by having it sealed correctly - bookswoman
* "Air sealing the building envelope is one of the most critical features of an energy efficient home." "25-40% of energy" "ENERGY STAR qualified homes, constructed to exceed [building] codes with air sealing, can offer a better quality product." - keyvowel
* This Energy Star news release describes ways homeowners can reduce home heating and cooling costs by implementing air sealing techniques. - tadamich
Without question, the first description is problematic, but the other four descriptions are in agreement about what this document is about AND provide more relevant information than a brief bibliographic record.
For the most part, the tags we got were also meaningful and descriptive. Staying with the document "Air Sealing", we have the following tags:
Air, air-sealing, airsealing, building-insulation, efficient, energy, energy-efficiency, Energy-Star-Branding, energyconservation, energystar, epa, EPA-advertising, globalwarming, greenhousegases, home-building, home-building-techniques, home-construction, home-improvement, homes, hvac, indoor, leakage, money-saving, quality, sealing, ventilation
Contrast that with a brief bibliographic record that simply has title, agency, and URL. How would people know that this document is part of the EnergyStar initiative, or that it was related to home building or energy efficiency? Clearly, in this instance and in a number of other project documents, there was a clear value added.
3) Limitations of current study
Our promising results were limited by three factors, the most important of which was the lack of participation. We estimate that about ten people participated in our tagging project. The available research on tagging is pretty firm on stating that good social tagging requires many users. Some say 100 or so is good, others suggest higher numbers. Our numbers are clearly too low. There are also too many instances (12) when a document was tagged by a single user. This could greatly bias how a document gets tagged. Consider if the only description of "Air Sealing" had been the mistaken one about water coolers. That would have been worse than useless. But even in this instance, a user pulling up this document while searching for water coolers could have provided a more accurate description.
The low number of taggers also made it difficult to see how much tag agreement existed among the various taggers.
Another problem was self-inflicted. We forgot to instruct people on tag construction. These were our original instructions:
1) Visit http://www.archive.org/search.php?query=epapilotproject and go to a document on the list. Open the pdf file in a separate browser window.
2) In del.icio.us, tag the page for the Internet Archive record (i.e. not the PDF file) after examining the PDF file.
3) In the del.icio.us "notes" field, write a one or two sentence description of what the document is about.
4) In the tags field, please use epapilotproject, for:freegovinfo and then any tags that you feel describe this document.
del.icio.us uses a space-separated tag system. In other words, a space begins a new tag. So tagging something as "air quality" results in the two tags of "air" and "quality" and not the more helpful tag of "air quality". This resulted in some of the tagging becoming meaningless. If we had asked people to put dots or dashes in multiple-word tags, we would have gotten more meaningful tags. We still got some useful tags because some of our taggers were used to the del.icio.us system, but we shouldn't have assumed that everyone tagging would know how to construct multiword tags in del.icio.us. On the other hand, this problem might have been less noticeable if we had more taggers per document.
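To make the space problem concrete, here is a minimal Python sketch. It is only an illustration of space-separated tagging in general, not del.icio.us's actual code, and the function names split_tags and join_multiword are hypothetical helpers invented for this example.

```python
def split_tags(tag_field: str) -> list[str]:
    """Split a tag field the way a space-separated tagging system would.

    Assumption: any run of whitespace starts a new tag.
    """
    return tag_field.split()


def join_multiword(phrases: list[str], marker: str = "-") -> str:
    """Build a tag field in which each phrase survives as a single tag.

    Spaces inside a phrase are replaced with `marker` (a dash by default).
    """
    return " ".join(marker.join(phrase.split()) for phrase in phrases)


# Typed as-is, the phrase "air quality" falls apart into two tags.
naive = split_tags("epapilotproject air quality")

# Joining the words with a dash first keeps the phrase together.
safe = split_tags(join_multiword(["epapilotproject", "air quality"]))

print(naive)  # ['epapilotproject', 'air', 'quality']
print(safe)   # ['epapilotproject', 'air-quality']
```

A one-line instruction to taggers along the lines of "join the words of a phrase with a dash" would have had the same effect as join_multiword.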
Our final problem is one we think could be avoided in future projects: people tagging different files with the same document title. We asked people to bookmark the Internet Archive page for a given document, which has a link to the PDF file. We specifically asked people NOT to tag the PDF file because del.icio.us doesn't populate the title field of bookmarked PDFs. But one person in our project consistently bookmarked a document's PDF file instead of the Internet Archive page, and this separated that person's tagging from everyone else's and made it more difficult to compile tagging info for every document.
4) What next? Some suggestions
Our findings indicate that tagging does have potential to add value to web harvested documents that do not receive full cataloging, but for this benefit to be fully realized, there must be more taggers. When we realized we didn't have the number of taggers we wanted, we headed for the literature and found some articles listed below under "References Consulted." They offer some interesting guidance for other document tagging efforts.
While all of the papers below talked about user motivation, I think Tim Spalding said it best in a post titled "When tags work and when they don't: Amazon and LibraryThing":
"Something is going on here—something with broad implications for tagging, classification and "Web 2.0" commerce. There are a couple of lessons, but the most important is this: Tagging works well when people tag "their" stuff, but it fails when they're asked to do it to "someone else's" stuff. You can't get your customers to organize your products, unless you give them a very good incentive. We all make our beds, but nobody volunteers to fluff pillows at the local Sheraton."
The EPA documents are sort of like fluffing pillows at the local Sheraton, to me at least. My primary interest isn't environmental documents and EPA documents are not a major component of my library's depository collection. In addition, our particular sample was unintentionally heavy on flyers, applications, and brochures. It could be that another agency's documents, say NASA or DoD, might get more attention.
There's another angle too. In my anecdotal experience, librarians don't see web stuff as theirs, so they don't spend much processing time on it. Or, if they are concerned about web documents, perhaps their administration is not. So how could we make them owners and think of web harvested materials as "their stuff" so they'll make their "documents beds"? A few suggestions follow:
1) For the EPA documents, GPO could partner with libraries that do have a strong environmental collection. Perhaps candidate libraries could be determined through item selection analysis.
2) GPO might wish to consider doing a depository survey to see which agencies depositories would most like to see web-harvested. The survey could include a question asking libraries if they would tag if the desired content was harvested.
There wouldn't have to be a commitment to tag every document, but to tag some of the documents.
While GPO should continue with web harvesting no matter what, we wouldn't blame them for not moving forward with a documents tagging initiative if the depository community failed to register interest in such a project.
3) If GPO re-harvests EPA or moves on to another agency, it should consider setting up RSS feeds for newly harvested documents. Subject specialists from inside and outside the library community could take part in tagging. Again, GPO would need to start with some broadly popular agencies to have a chance of recruiting a significant number of taggers.
4) If GPO or another organization does a large scale tagging project, significant thought should go into tagging conventions. Not the vocabulary itself -- research seems to show that once an item reaches 100 tags or so, the proportion of tags stays constant. That is to say that agreed-upon terms appear to predominate over idiosyncratic or spam tags (See Golder and Huberman below for details). What needs to be spelled out is how multi-word tags should be constructed -- is it air-quality, air.quality, or air_quality? They all mean the same thing, but del.icio.us and other tagging services interpret them differently. A consistent new-word marker, or a choice of tagging site that supports spaces inside tags, will make any tagging project go more smoothly.
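If a project ends up with mixed conventions anyway, the variants can be folded together after the fact. The Python sketch below is one possible cleanup, under the assumption that dots, dashes, and underscores are all being used as word separators. The names normalize_tag and count_normalized are hypothetical and not part of any tagging service.

```python
import re
from collections import Counter


def normalize_tag(tag: str, marker: str = "-") -> str:
    """Lowercase a tag and rewrite '.', '_' or '-' separators as one marker.

    Assumption: those three characters are used only to separate words.
    """
    parts = [part for part in re.split(r"[-._]+", tag.lower()) if part]
    return marker.join(parts)


def count_normalized(tags: list[str]) -> Counter:
    """Count tags after normalization, so variant spellings pool together."""
    return Counter(normalize_tag(tag) for tag in tags)


counts = count_normalized(["air-quality", "air.quality", "Air_Quality", "hvac"])
print(counts["air-quality"])  # 3
print(counts["hvac"])         # 1
```

Note what this does not fix: a run-together tag such as airsealing has no separator to work with, so it would still be counted apart from air-sealing. That is one more reason to settle the convention before tagging starts.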
These are our thoughts. What are yours? Look at our spreadsheet. Check out the item pages on del.icio.us and read the articles below. Then let us know what you think about the future of social tagging for government documents.
References Consulted
- "HT06, Tagging Paper, Taxonomy, Flickr, Academic Article, ToRead" by Cameron Marlow, Mor Naaman, danah boyd, Marc Davis http://www.danah.org/papers/Hypertext2006.pdf
- "The Structure of Collaborative Tagging Systems" by Scott A. Golder and Bernardo A. Huberman
http://www.hpl.hp.com/research/idl/papers/tags/
http://www.hpl.hp.com/research/idl/papers/tags/tags.pdf
- "Can Social Bookmarking Improve Web Search?" by Paul Heymann, Georgia Koutrika, and Hector Garcia-Molina
http://heymann.stanford.edu/improvewebsearch.html
http://dbpubs.stanford.edu/pub/showDoc.Fulltext?lang=en&doc=2008-2&format=pdf&compression=&name=2008-2.pdf
- "When tags work and when they don't: Amazon and LibraryThing"
Thingology Blog, posted by Tim Spalding Tuesday, February 20, 2007
http://www.librarything.com/thingology/2007/02/when-tags-works-and-when-...