A few weeks ago I began putting together MITH’s new digital curation workstation. The primary reason for the workstation was to build a testbed for the BitCurator environment, an open source suite of digital forensics (DF) tools that have been repurposed for the curation of born-digital materials. While there are commercial DF workstations available on the market (for example, see Digital Intelligence’s FRED system), their cost can be prohibitive, especially compared to the ever-diminishing cost of desktop workstations.
What I wanted was a system under $1,000 that would allow access to as many forms of digital media as possible. After researching the options, I ultimately chose a Windows 7 64-bit workstation with an Intel i7 processor, 24GB of RAM, and a 2TB SATA hard drive. When building a digital curation workstation running BitCurator, perhaps the most critical component is the system RAM, because BitCurator is designed to run in a virtual machine on a host operating system (hosts can be Windows, OS X, or Linux). As a general rule, the more RAM you can dedicate to a virtual machine, the better performance you can expect. These specs reflect an optimal configuration, however; if you want to repurpose an existing workstation instead of buying a new one, you can run the BitCurator environment on any PC with a 64-bit capable CPU (most Intel and AMD CPUs have been able to run 64-bit operating systems for the last few generations), 2GB of RAM, and a 250GB SATA hard drive.
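If you are weighing whether an older machine is worth repurposing, the minimum figures above can be written down as a quick checklist. The Python sketch below is only an illustration: the `HostSpec` class and `shortfalls` function are names I made up for this example, not part of BitCurator, and you would fill in the numbers by hand from your system's own hardware report.

```python
from dataclasses import dataclass
from typing import List

# Minimum figures quoted in the paragraph above (hypothetical constant names).
MIN_RAM_GB = 2
MIN_DISK_GB = 250


@dataclass
class HostSpec:
    """Hand-entered description of a candidate host machine."""
    cpu_is_64bit: bool
    ram_gb: float
    disk_gb: float


def shortfalls(spec: HostSpec) -> List[str]:
    """Return the reasons a host falls short of the minimum; empty if none."""
    problems = []
    if not spec.cpu_is_64bit:
        problems.append("CPU is not 64-bit capable")
    if spec.ram_gb < MIN_RAM_GB:
        problems.append(f"RAM is {spec.ram_gb}GB, below the {MIN_RAM_GB}GB minimum")
    if spec.disk_gb < MIN_DISK_GB:
        problems.append(f"disk is {spec.disk_gb}GB, below the {MIN_DISK_GB}GB minimum")
    return problems


# The workstation described above: 64-bit, 24GB of RAM, 2TB (2000GB) drive.
mith_workstation = HostSpec(cpu_is_64bit=True, ram_gb=24, disk_gb=2000)
# An imagined older desktop that misses on every count.
old_desktop = HostSpec(cpu_is_64bit=False, ram_gb=1, disk_gb=80)

for name, spec in [("MITH workstation", mith_workstation), ("old desktop", old_desktop)]:
    issues = shortfalls(spec)
    print(name, "meets the minimum" if not issues else "falls short: " + "; ".join(issues))
```

Passing the minimum only means the environment should run; as noted above, RAM beyond the minimum is what makes the virtual machine comfortable to work in.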
Choosing the system components was just the first step in building our digital curation workstation, however. The primary challenge most digital archivists face is getting physical access to their media. To address this, the MITH digital curation workstation includes an “All-in-One” memory card reader, which allows access to everything from Sony Memory Sticks to micro SD cards. For an optical drive, you want one that can read as many formats as possible, so a drive that reads Blu-ray discs and is backwards compatible with older DVDs and CD-ROMs is ideal (you’ll also want to make sure your drive can reliably read burned media). Zip disks, though comparatively short-lived, are still common enough that a born-digital curation workstation would be incomplete without a Zip drive. USB Zip drives are still available new from the manufacturer, and are also easily found on auction sites such as eBay.
Perhaps the media types that present the biggest challenge are 3.5” and 5.25” floppy disks. For 3.5” floppy disks I recommend a USB 3.5” floppy disk drive because 1) such drives are still readily available, and 2) USB devices integrate into the BitCurator environment more easily than those connected to a floppy disk controller on the motherboard. For the 5.25” drive we used the FC5025 device by Device Side, a USB piece of hardware that, when coupled with a 5.25” drive, allows a Windows, Mac or Linux PC to read a wide variety of 5.25” floppy disk formats, including Apple DOS 3.2 and 3.3, MS DOS, Commodore 1541, and Atari 810, just to name a few (see the above link for a full listing). For more on accessing 3.5” and 5.25” floppy disks, I recommend an article by Doug Reside (MITH alum and digital curator at the NY Public Library) titled “Digital Archaeology: Recovering your Digital History”. Doug’s article is particularly helpful because he not only outlines the various media types and their required drives, but also tells you where you can find the drives themselves, many of which are no longer manufactured.
Once the workstation was complete, we didn’t have long to wait before putting it to use. Travis Brown, one of the directors here at MITH, came to me with a stack of 5.25” floppy disks containing some of Neil Fraistat’s early work on Percy Shelley’s manuscripts. Back in the early 90s, Neil had transcribed the Prometheus Unbound portions of Percy Shelley’s notebooks (ms Shelley e.1, e.2, e.3), painstakingly recreating in WordPerfect 4.2 each of Shelley’s hand-marked notations. For example, lines that were struck through in the manuscript were likewise struck through in the WordPerfect documents, along with word changes and emendations. If we could recover Neil’s early digital transcriptions, they could serve as a foundation for the work being done on the Shelley-Godwin Archive, another project here at MITH. Using the FC5025, we were able to access the disks and copy the file contents onto the workstation’s hard drive. We then used the current version of WordPerfect to convert the transcriptions into a modern document format. The time saved by recovering Neil’s original transcriptions goes beyond just that needed to retype the original documents; it includes the careful validation work done by Neil’s collaborators at the Bodleian Library. Taken together, their work represents a significant example of early digital humanities work, work that is now available to us because of the tools described above. From here the electronic transcriptions will form a foundation for further TEI encoding and be an important part of the Shelley-Godwin Archive.
This is just one example of how we here at MITH anticipate being able to use our new digital curation workstation, and we think it makes a pretty compelling case for similar workstations being an essential part of any digital humanities center.
Porter Olsen is a Ph.D. candidate in the University of Maryland Department of English and a Graduate Research Assistant with the BitCurator project at MITH.