David Rosenthal gave another fascinating talk about the state of the web and whether we can expect to preserve it by harvesting it. He gave the talk at the 2013 Spring CNI Membership Meeting in San Antonio, TX, and has posted an edited text, with links to the sources, on his blog:
- Talk at Spring 2013 CNI, David Rosenthal, DSHR’s Blog (April 4, 2013).
David and co-presenter Kris Carpenter Negulescu note, among other things, that the days of a document-centered web are long over and that today, what most web pages do “is download and run programs in the current Web’s primary language, Javascript. Javascript is a programming language, not a document description language. Your browser is only incidentally a document rendering engine, its primary function is as a virtual machine.”
This presents problems for those wishing to preserve information. Among these problems:
- Database driven features & functions
- Complex/variable URI formats & inconsistent/variable link implementations
- Dynamically generated, ever changing, URIs
- Rich Media
- Scripted, incremental display & page loading mechanisms
- Scripted, HTML forms
- Multi-sourced, embedded material
- Dynamic login/auth services: captchas, cross-site/social authentication, & user-sensitive embeds
- Alternate display based on user agent or other parameters
- Exclusions by convention
- Exclusions by design
- Server side scripts & remote procedure calls
- HTML5 “web sockets”
- Mobile publishing
For more about these problems, see also: IIPC Future of the Web Workshop — Introduction & Overview, International Internet Preservation Consortium (May 17, 2012).
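To make a couple of these problems concrete (scripted display and dynamically generated URIs), here is a minimal sketch of what a harvester sees when it parses HTML without running scripts. It is not taken from the talk: the sample page, the `LinkExtractor` class, and the `extract_static_links` function are all hypothetical, and real crawlers are far more sophisticated. The point is only that a link created by Javascript at load time never appears in the markup a non-scripting crawler reads.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags, the way a simple
    crawler that does not execute scripts might."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_static_links(html):
    """Return the links visible in the raw markup only."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page: one ordinary link in the markup, plus a second
# link that only exists after the browser runs the script.
SAMPLE_PAGE = """
<html><body>
<a href="/about.html">About</a>
<div id="feed"></div>
<script>
  var a = document.createElement("a");
  a.href = "/articles/" + Date.now();  // URI changes on every load
  document.getElementById("feed").appendChild(a);
</script>
</body></html>
"""

if __name__ == "__main__":
    # Only the static link is found; the scripted one is invisible.
    print(extract_static_links(SAMPLE_PAGE))
```

A browser would show two links on this page, and the second would have a different URI on every visit. The parser above finds only the first, which is one reason a harvester may need to behave like the "virtual machine" Rosenthal describes rather than like a document reader.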
Read David’s complete post for a rich discussion of the issues.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.