October 28, 2008
Data Doomed By Digital Dark Age
What stands a better chance of surviving 50 years from now, a framed photograph or a 10-megabyte digital photo file on your computer's hard drive?
The framed photograph will inevitably fade and yellow over time, but the digital photo file may be unreadable to future computers, an unintended consequence of our rapidly digitizing world that may ultimately lead to a "digital dark age," says Jerome P. McDonough, assistant professor in the Graduate School of Library and Information Science at the University of Illinois at Urbana-Champaign.
According to McDonough, the issue of a looming digital dark age originates from the mass of data spawned by our ever-growing information economy: at last count, 369 exabytes worth of data, including electronic records, tax files, e-mail, music and photos, for starters. (An exabyte is 1 quintillion bytes; a quintillion is the number 1 followed by 18 zeroes.)
The concern for archivists and information scientists like McDonough is that, with ever-shifting platforms and file formats, much of the data we produce today could eventually fall into a black hole of inaccessibility.
"If we can't keep today's information alive for future generations," McDonough said, "we will lose a lot of our culture."
Contrary to popular belief, electronic data has proven to be much more ephemeral than books, journals or pieces of plastic art. After all, when was the last time you opened a WordPerfect file or tried to read an 8-inch floppy disk?
"Even over the course of 10 years, you can have a rapid enough evolution in the ways people store digital information and the programs they use to access it that file formats can fall out of date," McDonough said.
Magnetic tape, which stores most of the world's computer backups, can degrade within a decade. According to the National Archives Web site, by the mid-1970s only two machines could read the data from the 1960 U.S. Census: One was in Japan, the other in the Smithsonian Institution. Some of the data collected from NASA's 1976 Viking landing on Mars is unreadable and lost forever.
From a cultural perspective, McDonough said there's a "huge amount" of content that is being developed or is available only in a digital format.
"E-mail is a classic example of that," he said. "It runs both the modern business world and government. If that information is lost, you've lost the archive of what has actually happened in the modern world. We've seen a couple of examples of this so far."
McDonough cited the missing White House e-mail archive from the run-up to the Iraq War, a violation of the Presidential Records Act.
"With the current state of the technology, data is vulnerable to both accidental and deliberate erasure," he said. "What we would like to see is an environment where we can make sure that data does not die due to accidents, malicious intent or even benign neglect."
McDonough also cited Barack Obama's political advertising inside the latest editions of the popular videogames "Burnout Paradise" and "NBA Live" as an example of something that ought to be preserved for future generations but could possibly be lost because of the proprietary nature of videogames and videogame platforms.
"It's not a matter of just preserving the game itself. There are whole parts of popular and political culture that we won't be able to preserve if we can't preserve what's going on inside the gaming world."
McDonough believes the loss of data in a digital dark age would also have an economic cost.
"We would essentially be burning money because we would lose the huge economic investment libraries and archives have made digitizing materials to make them accessible," he said. "Governments are likewise investing huge sums to make documents available to the public in electronic form."
To avoid a digital dark age, McDonough says that we need to figure out the best way to keep valuable data alive and accessible by using a multipronged approach: migrating data to new formats, devising methods of getting old software to work on existing platforms, using open-source file formats and software, and creating data that's "media-independent."
"Reliance on open standards is certainly a huge part, but it's not the only part," he said. "If we want information to survive, we really need to avoid formats that depend on a particular media type. Commercial DVDs that employ protection schemes make it impossible for libraries to legally transfer the content to new media. When the old media dies, the information dies with it."
Enthusiasm for switching from proprietary software such as Microsoft's Office suite to open-source software such as OpenOffice has only recently begun to gather momentum outside of information technology circles.
"Software companies have seen the benefits of locking people into a platform and have been very resistant to change," McDonough said. "Now we are actually starting to see some market mandates in the open direction."
McDonough cites Brazil, the Netherlands and Norway as examples of countries that have mandated the use of non-proprietary file formats for government business.
"There has been quite a movement, particularly among governments, to say: "ËWe're not going to buy software that uses proprietary file formats exclusively. You're going to have to provide an open format so we can escape from the platform,' " he said. "With that market demand, you really did see some more pressure on vendors to move to something open."
Image: Jerome P. McDonough says an unintended consequence of our rapidly digitizing world is the potential of a "digital dark age." Photo by L. Brian Stauffer.