Tag Archives: pdfs

Our mission

Free Government Information (FGI) is a place for initiating dialogue and building consensus among the various players (libraries, government agencies, non-profit organizations, researchers, journalists, etc.) who have a stake in the preservation of and perpetual free access to government information. FGI promotes free government information through collaboration, education, advocacy and research.

Do Not Assume PDF files are all permanent

Government information relies heavily on the PDF format. Indeed, PDF files are used so widely that it is tempting for us to assume that PDF files are, by definition, a safe way of preserving information for the long term. Would it surprise you to learn that the Digital Preservation Coalition (DPC) lists PDF files as “Endangered” and that even PDF/A files are “Vulnerable”?

These judgements are in the DPC’s newest list of “digitally endangered species.”

The Bit List includes 74 content types and groups them (more…)

The good and the bad of PDFs

Following up on “Can Proprietary Formats make Government More Open?”:

Josh Tauberer of govtrack.us points us to “The good and the bad of PDFs” (OpenGovData.org wiki), in which Kevin Lyons, who works for the Nebraska legislature, wrote up some guidelines for PDF in government.

Lyons reminds us that not all PDF files are equal and he enumerates some of the advantages and disadvantages of encapsulating government information in PDFs.

Given how popular the PDF standard itself is, it shouldn’t be a surprise that the term PDF actually covers a wide variety of different types of files. While all PDF files fit the PDF standard, there are several different subtypes of PDF that are helpful in the government world.

PDF is now ISO standard

PDF (Portable Document Format) has been approved as a standard by the International Organization for Standardization (ISO). The new standard is ISO 32000-1.

ISO 32000-1:2008 specifies a digital form for representing electronic documents to enable users to exchange and view electronic documents independent of the environment in which they were created or the environment in which they are viewed or printed. It is intended for the developer of software that creates PDF files (conforming writers), software that reads existing PDF files and interprets their contents for display and interaction (conforming readers) and PDF products that read and/or write PDF files for a variety of other purposes (conforming products).

See also: “PDF now ISO standard,” by Joab Jackson, GCN, 07/03/08.

With Adobe relinquishing control of PDF, the ISO Document Management Applications Technical Committee will review any changes made to the format.

The openly published standard provides the technical information required for writing software programs that can create and read PDF files, ensuring that organizations will always have some tools available to render PDFs, even if Adobe stops shipping its PDF viewer.

PDF/A (Electronic document file format for long-term preservation) was approved by ISO earlier.
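Because the standard is openly published, even a few lines of code can look inside a PDF. The following is a minimal sketch, not a validator: it assumes the usual `%PDF-x.y` header comment near the start of the file and assumes that a file claiming PDF/A conformance declares it in its XMP metadata through the `pdfaid:part` and `pdfaid:conformance` properties, stored uncompressed. The function name `sniff_pdf` is hypothetical, and a real conformance check would need a dedicated PDF/A validator.

```python
import re


def sniff_pdf(data: bytes) -> dict:
    """Heuristically inspect raw PDF bytes (hypothetical helper, not a validator)."""
    # The header comment "%PDF-x.y" is expected near the start of the file.
    header = re.search(rb"%PDF-(\d\.\d)", data[:1024])
    # A PDF/A claim appears in XMP metadata, either as an XML element
    # (<pdfaid:part>1</pdfaid:part>) or as an attribute (pdfaid:part="1").
    part = re.search(rb"pdfaid:part(?:>|=[\"'])\s*(\d)", data)
    conf = re.search(rb"pdfaid:conformance(?:>|=[\"'])\s*([A-Za-z])", data)
    return {
        "version": header.group(1).decode() if header else None,
        "pdfa_part": part.group(1).decode() if part else None,
        "pdfa_conformance": conf.group(1).decode().upper() if conf else None,
    }
```

To try it on a file, read the bytes with `open(path, "rb").read()` and pass them to `sniff_pdf`. A result with no `pdfa_part` only means the file makes no visible PDF/A claim; a file that does make the claim may still fail full validation.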

Stupid Gov’t PDF Tricks

Stephen Abram of OPAC vendor SirsiDynix talks about government documents and relates his pet peeves about how governments use PDFs to hide information. As Stephen is Canadian, I’m not sure which government he’s talking about, but some of his complaints sound familiar to me:

3. Worse, let’s create a 10,000 page PDF and try to ask any citizen to download and print that! If your report is too short to make it too big, just append all your data into the appendices and make it HUGE.

5. Place your PDF on your website and don’t link to it with an index, table of contents, press release or some other finding tool. Make sure there are no links for the search engine crawlers to crawl! You have plausible deniability and can say with a straight face that it’s available on the web!

6. And my favourite government opacity strategy? Only place a minimum of metadata on the PDF on the web. Say, just a number like 1237D-f but make sure it’s not linked to any real number and just represents a non-sequential accession number for the web file. Then it will be nigh on impossible to find it.

 

His other tricks make a good read too, but to list them all I’d be reproducing his entire posting. He ends his posting with thanks to librarians and catalogers. Stephen doesn’t usually talk about docs; he’s more of a Web 2.0/Library 2.0 sort of guy, but his Stephen’s Lighthouse blog is always interesting reading.
