At MITH we have been experimenting with the networked, distributed transcription and encoding of manuscripts during the first phase of our work on the Shelley-Godwin Archive, a project that aims to provide the digitized manuscripts of Percy Bysshe Shelley, Mary Wollstonecraft Shelley, William Godwin, and Mary Wollstonecraft. The Archive will thus bring together for the first time ever the widely dispersed handwritten legacy of this uniquely gifted family of writers. The result of a partnership between the New York Public Library and MITH, in cooperation with Oxford’s Bodleian Library, the Archive also includes key contributions from the Huntington Library, the British Library, and the Houghton Library. In total, these partner libraries contain over 90% of all known relevant manuscripts.
The most immediate goal for the Archive’s current first phase is to provide access, under open licenses, to page images of as many of these manuscripts as possible, through a series of public releases. These began on Halloween with the release of the fully transcribed and encoded Frankenstein Notebooks, containing all known draft and fair copy of the novel. Frankenstein will be followed in the Spring by the fully transcribed and encoded fair-copy manuscripts of Percy Shelley’s greatest poem, Prometheus Unbound. Given the limits of funding and labor, the digitized manuscripts of the Archive will typically be publicly released in one of three stages of development:
- page images with transcriptions that are fully corrected and TEI-encoded (as with Frankenstein and Prometheus Unbound);
- page images with transcriptions that have not yet been corrected (as will be the case for most of Percy Bysshe Shelley’s manuscripts at the Bodleian Library);
- page images only.
The curatorial status of each page in the Archive is color-coded so that during the first phase users will understand the relative trustworthiness of transcriptions. In the Archive’s subsequent phases the color-coding will also serve as an indication of what type of curatorial work users might best contribute.
The innovative technical infrastructure of the Shelley-Godwin Archive builds on linked open data principles and emerging standards such as the Shared Canvas data model and the Text Encoding Initiative’s Genetic Editions vocabulary in order to open the contents of the Archive to widespread use and reuse, and to support distributed user curation in subsequent phases of the project. Users of the Archive can see and search for additions, deletions, substitutions, retracings, insertions, transpositions, shifts in hand, displacements, paratextual notes, and other variables related to the composition process.
Currently, users of the Frankenstein Notebooks have several available views of the data, beginning with the choice of a “physical” or “logical” view to order the page images. For instance, the page images can be browsed in their Notebook order, or by chapter sequence. Once that choice has been made and an image has been selected, the default page view aligns the page image and transcription side by side. Other possible views include the XML-encoded transcription, or a redacted clear-reading text that omits deletions, inserts additions in their appropriate location, transposes text as indicated in the draft, and arranges text fragments in their intended sequence. Users can also zoom in on individual page images until they fill the entire window, or choose to limit their view of the transcription so that only the text written by Mary or Percy Shelley is highlighted. Those interested in searching the Frankenstein Notebooks for the word “monster,” for instance, will see thumbnails of all the pages containing that word. This list can be filtered so that only the pages with the word “monster” in Mary Shelley’s or Percy Shelley’s hand appear, or one can choose to see only the pages in which the word “monster” has been added, or only those in which it has been deleted in either or both hands.
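To make the clear-reading view and the filtered search a little more concrete, here is a minimal sketch in Python. It is not the Archive's own code. It assumes a heavily simplified fragment in which additions and deletions are marked with the TEI elements `add` and `del`, each carrying a `hand` attribute. The sample line, the hand identifiers `#mws` and `#pbs`, and the function names `reading_text` and `find_word` are all illustrative assumptions, and the sketch ignores transpositions, displacements, and the other features described above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment for illustration only;
# it is not copied from the Archive's transcription files.
SAMPLE = (
    '<line>the <del hand="#mws">creature</del>'
    '<add hand="#pbs">monster</add> fled '
    '<add hand="#mws">swiftly</add></line>'
)


def reading_text(elem):
    """Build a clear-reading text: drop <del> content, keep <add> content."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag != "del":
            parts.append(reading_text(child))
        # Text following a child belongs to the parent, so it is always kept.
        parts.append(child.tail or "")
    return "".join(parts)


def find_word(root, word, hand=None, status=None):
    """List (status, hand) pairs for <add>/<del> elements containing `word`.

    `hand` and `status` ("add" or "del") are optional filters.
    """
    hits = []
    for elem in root.iter():
        if elem.tag not in ("add", "del"):
            continue
        if word.lower() not in "".join(elem.itertext()).lower():
            continue
        if hand is not None and elem.get("hand") != hand:
            continue
        if status is not None and elem.tag != status:
            continue
        hits.append((elem.tag, elem.get("hand")))
    return hits


root = ET.fromstring(SAMPLE)
print(reading_text(root))                       # the monster fled swiftly
print(find_word(root, "monster", hand="#pbs"))  # [('add', '#pbs')]
```

In this toy example the deleted word “creature” drops out of the reading text, and the search for “monster” can be narrowed by hand or by whether the word was added or deleted, mirroring the filters described above at the level of a single line rather than a page.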
The kind of networked, distributed transcription and encoding at the heart of the project has been pioneered during the Archive’s first phase by a team of students in two graduate seminars at the University of Maryland and the University of Virginia, who transcribed and encoded roughly a third of the manuscript pages of Frankenstein, overseen by an expert encoder and a Shelley scholar. By scaling up such experiments in its next phase, the Archive will help to move textual scholarship into the classroom and, eventually, out to the public so as to make students and citizen humanists active, knowledgeable, and critical participants in the great cultural transition now underway from a Textual to a Digital Condition.
By making the Shelley-Godwin Archive material massively addressable in a form that encourages user curation and exploration, we will be transforming it into what some are calling an “animated archive,” an archive as work-site rather than simply a point of access, that can ultimately take the form of a commons through which various discourse networks related to its texts intersect and interact. Most important of all, we will be pioneering, modeling, and building an open source participatory platform through which other archives dependent on manuscripts can effect similar transformations—helping literary manuscripts to thrive in a digital world.
News of the Archive’s launch appeared in such venues as the New York Times, the Washington Post, and the Chronicle of Higher Education, and some 60,000 unique visitors viewed the site within 24 hours of its launch. We invite you to see for yourself by visiting the project’s website.