

Peter Suber on “Trojan Horse” PDFs

Last month, there were stories in the press about a Canadian company, Remote Approach, that is developing tools to help “identify, manage and measure” the use of PDF documents “in real-time.” Now Peter Suber writes about this and related technologies in the context of eprints of open access scholarly literature.

Suber reports that “Remote Approach is working on executable scripts embedded in PDF files that will report back to their creators whenever the files are opened, even after they have been copied and redistributed.” He notes that there is an obvious issue of reader privacy but there is also a deeper problem:

‘The deeper problem is that once we allow scripts in text files for benign purposes, it will be very hard to block, let alone detect, scripts for malign purposes. Malign scripts could subvert fair-use rights and open access.’

He also notes press reports of “a feature that would let a company block a document from being read if there’s no Internet connection.” He sees dangers in such technologies:

‘Imagine downloading a copy of a self-archived PDF to your personal hard drive on Monday. On Tuesday, when you want to read it offline, you discover that it is unreadable. The publisher had remotely set it to deactivate when taken offline.

‘Imagine a new and “improved” script that can deactivate a file even for online reading. Imagine self-archiving such a PDF file on Wednesday. On Thursday, the publisher remotely deactivates it so that nobody can read it, even though it is still online.

‘As soon as publishers can remotely disable PDFs so that users can’t read them offline or from certain addresses online, then PDFs will be unsuitable for disseminating science and scholarship, especially in OA repositories. They won’t be suitable again until we have trustworthy tools for scrubbing them clean of the remote activation code.’
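Suber's point about needing tools to detect and scrub such scripts can be made a little more concrete. The sketch below is only a rough first-pass check, not a trustworthy scrubbing tool: it counts a handful of PDF name tokens (such as `/JavaScript`, `/OpenAction`, and `/SubmitForm`) that the PDF format uses for scripts and for automatic or network-facing actions. The function name `find_active_content` and the token list are our own illustrative choices, and a plain byte scan like this will miss tokens hidden inside compressed object streams or written with escaped characters.

```python
import re

# PDF name tokens associated with scripts or with automatic/network actions.
# This list is illustrative, not exhaustive.
SUSPECT_NAMES = (
    b"/JavaScript",
    b"/JS",
    b"/OpenAction",
    b"/AA",
    b"/Launch",
    b"/SubmitForm",
    b"/URI",
)


def find_active_content(data: bytes) -> dict[str, int]:
    """Count occurrences of each suspect name token in raw PDF bytes.

    Hypothetical helper: it only sees uncompressed, unescaped tokens,
    so an empty result does not prove a file is clean.
    """
    counts: dict[str, int] = {}
    for name in SUSPECT_NAMES:
        # Require a non-alphanumeric character (or end of data) after the
        # token so that /JS does not match a longer name such as /JSFoo.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        hits = len(re.findall(pattern, data))
        if hits:
            counts[name.decode("ascii")] = hits
    return counts
```

To try it on a file, read the file in binary mode and pass the bytes in, for example `find_active_content(open("report.pdf", "rb").read())`, where `report.pdf` is a placeholder name. Any non-empty result is a reason to look closer, not a verdict.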

These are important issues for government information as well. Governments at all levels rely heavily on PDFs, but this kind of technology will, once available, surely spread to other document distribution formats where control, access, permission, rights, and authenticity are issues. This is very relevant to free, public, fully functional, permanent access to government information. Imagine trying to use the digital government documents that we do manage to get into our libraries (through deposit, if GPO will deposit them, or through downloading and web-crawling projects) if they are still controlled by GPO or the issuing agency in the way Suber describes.

There are many questions that need to be asked as GPO (and other government publishers) look for ways to use technology to address the issue of authenticity and as GPO attempts to provide “free and ready public access” to electronic documents while simultaneously distributing the same information “on a cost recovery basis.” [GPO Strategic Vision, p. 1] Some of these questions include:

  • Will GPO attempt to use technology to guarantee authenticity of digital government publications?
  • Will GPO rely on the same technologies that the private sector and entertainment industries use?
  • If so, how will those technologies affect the privacy of users?
  • What guarantees of permanent public access will GPO provide if it uses technologies capable of disabling access?

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
