TILE Blog

TILE 1.0 Release

We’ve officially released version 1.0 of TILE. Here are the release notes since 0.9:

Release Notes for TILE 1.0

General

  • Interface added for tagging and annotating manuscript images (Image Annotation)
  • Interface added for automatically tagging lines using basic image analysis tools (Auto Line Recognizer)
  • Dialog tools for loading and saving data
  • Support for TEI P5 formatted XML data
  • Support for XML data using Alto metadata scheme
  • Improved visuals for attaching metadata to transcript lines
  • Fixed label attachment bugs
  • Improvements to the Auto Line Recognizer (ALR)
  • Improved workflow and accuracy using gray-image detection for ALR
  • New documentation for the TILE User Guide

Library Updates

  • jQuery 1.6.2 update

UI

  • ALR instructions made clearer
  • Changed version message area

For Developers

 

Posted in News | Comments closed

Seadragon and Djatoka

In an earlier post, I suggested that it might be cool to link the open source version of Microsoft’s Seadragon to Los Alamos National Laboratory’s JPEG2000 server, Djatoka. Since that post, the folks at OpenSeadragon have cleaned up a lot of the JavaScript, making it much easier to do this. I’ve essentially just rewritten the “get tile url” function so that it loads the tile dynamically from Djatoka rather than from a static file. There’s a bit of wacky math because of the odd shapes of the tiles and the way Djatoka processes tile size, but nothing too crazy. The code is at GitHub and I’ve put up a demo on this server.
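As a rough illustration of the kind of “get tile url” rewrite described above (this is not the code in the repository), here is a minimal sketch. It assumes Djatoka’s OpenURL getRegion service, where the region is given as Y,X,H,W with the offsets measured at full resolution and the height and width measured at the requested level. The function name, the option names, and the example addresses are all hypothetical, so check the parameters against your own Djatoka installation.

```javascript
// Hypothetical helper, not the code from the repository.
// Builds a Djatoka getRegion URL for the tile at (col, row) of a given level.
// Assumes levels count up from 0 (smallest) to maxLevel (full size),
// each level doubling the dimensions of the one before it.
function djatokaTileUrl(opts) {
  // Number of full-resolution pixels covered by one pixel at this level.
  var scale = Math.pow(2, opts.maxLevel - opts.level);

  // Offsets are expressed in full-resolution pixels.
  var y = opts.row * opts.tileSize * scale;
  var x = opts.col * opts.tileSize * scale;

  var params = [
    "url_ver=Z39.88-2004",
    "rft_id=" + encodeURIComponent(opts.imageId),
    "svc_id=info:lanl-repo/svc/getRegion",
    "svc_val_fmt=info:ofi/fmt:kev:mtx:jpeg2000",
    "svc.format=image/jpeg",
    "svc.level=" + opts.level,
    "svc.rotate=0",
    // Height and width are expressed in pixels at the requested level.
    "svc.region=" + [y, x, opts.tileSize, opts.tileSize].join(",")
  ];
  return opts.resolver + "?" + params.join("&");
}

var url = djatokaTileUrl({
  resolver: "http://example.org/adore-djatoka/resolver", // assumed address
  imageId: "http://example.org/images/page1.jp2",        // assumed image id
  maxLevel: 5, level: 3, col: 2, row: 1, tileSize: 256
});
```

The sketch ignores the smaller tiles at the right and bottom edges of an image, which is where most of the odd arithmetic tends to come from.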

Posted in Uncategorized | Comments closed

Introducing TILE 0.9

We’re excited to announce the redesigned website for and public release of The Text-Image Linking Environment (TILE), a web-based tool for creating and editing image-based electronic editions and digital archives of humanities texts. This initial release of TILE 0.9 features tools for importing and exporting transcript lines and images of text, an image markup tool, a semi-automated line recognizer that tags regions of text within an image, and a plugin architecture to extend the functionality of the software.

There are a number of ways to try TILE 0.9 and learn more. You can visit the MITH-hosted sandbox version that allows you to use the tool online, or download a customizable version of the software.

If you’d like to learn more, we’ve made end-user and developer documentation available, and we’re ready to answer your questions on our forums.

Supported by an NEH Preservation and Access Grant, TILE is a collaboration between the Maryland Institute for Technology in the Humanities (Doug Reside, Dave Lester) and Indiana University (Dot Porter, John Walsh).

Posted in News | Comments closed

TILE partners with EMiC

TILE has formed a partnership with Dean Irvine and Meg Timney and their work with Editing Modernism in Canada.

More to come, but for now here’s a link to EMiC.

Posted in Uncategorized | Comments closed

A Simple Page Turner

I’m not really sure the world needs another page turning application. The Internet Archive already has a pretty good open source one here that has some great features and is fairly simple to use. There’s also METS Navigator from Indiana University. Still, when project co-PI Dot Porter recently tweeted to ask for a simple, no-frills page turner with no server dependencies for a TEI project she was working on, few tweeps tweeted back. So, in case anyone finds a very simple TEI-centric page turner useful, I’ve written one and uploaded it here.

Unzip the file and you’ll find, in the root directory, a file called “config.js”.

If you open it in a text editor (TextWrangler for Mac or Notepad++ for Windows are favorites), you’ll find a JSON object with a bunch of parameters. If you use the HTML provided in the zip file, you will only need to modify the value of “imageFile” to point to the TEI file that lists your images and the value of “imagePath” to point to the directory where the images live (if that information is already in the TEI file, just change this parameter to an empty string, “”).
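For orientation, the two settings might look something like the sketch below. Only the parameter names imageFile and imagePath come from the description above; the variable name, the file name, and the directory are made-up examples, and the real config.js contains other parameters as well.

```javascript
// Hypothetical excerpt of config.js; only imageFile and imagePath are
// named in the post, and every value here is an example.
var config = {
  // TEI file that lists the page images (example file name).
  imageFile: "edition.xml",
  // Directory where the images live; set to "" if the TEI file
  // already gives the full location of each image.
  imagePath: "images/"
};
```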

Note that the code that builds the list of images is in “getImages.js.”  This code assumes a TEI document with images listed in the “url” attribute of the “graphic” tag.  If you don’t want to use TEI and you’re fairly comfortable with jQuery you can simply replace the code in this function to generate the image list in a different way.
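If you do want to swap in your own list-building code, the general shape is simple. The sketch below is not the contents of getImages.js: it uses plain DOM methods rather than jQuery, and the function name and arguments are hypothetical. It assumes the TEI file has already been parsed into an XML document (for example, from an XMLHttpRequest’s responseXML).

```javascript
// Hypothetical stand-in for the image-list code; the zip's version uses jQuery.
// teiDoc: a parsed XML Document containing <graphic url="..."/> elements.
// imagePath: prefix added to each url, or "" if the TEI already has full paths.
function listImages(teiDoc, imagePath) {
  var graphics = teiDoc.getElementsByTagName("graphic");
  var images = [];
  for (var i = 0; i < graphics.length; i++) {
    var url = graphics[i].getAttribute("url");
    if (url) {
      // Skip <graphic> elements that have no url attribute.
      images.push(imagePath + url);
    }
  }
  return images;
}
```

Any function that returns an array of image locations in page order could play the same role, whether the source is TEI, a directory listing, or a hand-written list.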

You can see the code in action here.

Download: PageTurner.zip

Posted in Uncategorized | Comments closed

The Open Source Seadragon

Many image-based archives today make use of an interface technology first made popular by web-based mapping sites: “deep zoom.” Deep zoom (sometimes called tiled zoom) allows users to download only the particular portion of a very large image they need at any moment. For example, if you are looking at a neighborhood in Google Maps, you do not download the many gigabytes (perhaps terabytes?) of map data Google has available; Google simply sends you enough data to fill your screen at the zoom level you requested. Google and similar sites accomplish this by generating many copies of an image at various levels of resolution, and then cutting the copies into small squares, or tiles, that can be sent to the user as needed. This technique is sometimes called an “image pyramid.” At the base are the many tiles that compose the highest resolution copy of the image, and, at the tip, a single, low resolution image (with many levels of intermediary resolutions in-between).
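To make the pyramid concrete, the sketch below works out the levels for an image of a given size. It is only an illustration of the idea: it assumes each level is half the width and height of the one beneath it and that the pyramid stops once the whole image fits in a single tile. Real tiling tools differ in their details, and the function name is hypothetical.

```javascript
// Illustrative only: list the levels of an image pyramid, base first.
// Assumes each level halves the one below it and that the tip is the
// first level small enough to fit in one tile.
function pyramidLevels(width, height, tileSize) {
  var levels = [];
  var w = width;
  var h = height;
  while (true) {
    levels.push({
      width: w,
      height: h,
      cols: Math.ceil(w / tileSize), // tiles across
      rows: Math.ceil(h / tileSize)  // tiles down
    });
    if (w <= tileSize && h <= tileSize) {
      break; // reached the tip of the pyramid
    }
    w = Math.max(1, Math.ceil(w / 2));
    h = Math.max(1, Math.ceil(h / 2));
  }
  return levels;
}

// A 1024 x 768 image cut into 256-pixel tiles.
var levels = pyramidLevels(1024, 768, 256);
```

For this small example the base level is a 4 by 3 grid of tiles, the middle level is 2 by 2, and the tip is a single tile, so a viewer zoomed all the way out fetches one tile rather than twelve.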

Of the many deep zoom interfaces to have emerged over the last few years for viewing high-resolution images on the web, Microsoft’s Seadragon is both one of the most beautiful and one of the easiest to use. Unlike many of its competitors, the software seamlessly transitions between zoom levels without noticeably replacing the tiles, an effect that is arguably both more aesthetically pleasing and more functional than a page refresh, as the user need not relocate their position each time a new zoom level loads. Additionally, Microsoft has provided a well-developed code library for annotating images and binding regions of the image to particular actions when clicked.

The software is available in several versions based on several technologies. A proprietary version based on Microsoft’s Silverlight platform serves as the basis for the company’s Pivot visualization suite and offers the most functionality. A JavaScript version, sometimes called Seajax, offers most of the important functionality and is available for non-commercial use under somewhat ambiguous licensing terms. The World Digital Library makes good use of this version of the code, but for many projects with funding that depends on clear, common, open source licenses, the terms under which this code is made available are unacceptable. However, after an email exchange with the extraordinarily helpful Donald Brinkman of Microsoft Research (a wing of the company that demonstrates to me that Microsoft is at least as relevant to digital humanities research as any of the technology companies more frequently discussed of late), I learned that Microsoft has, in fact, released nearly all of the code for Seajax in their Ajax Control Toolkit (ACT) under a very liberal New BSD license. This is a very common license used by many other popular software libraries, and so should satisfy the open source requirements of most funders and cultural institutions. Building the deep zoom tool out of the pieces in this library is not necessarily intuitive, however, and so I think it is worthwhile to use this space to document the process through which it can be accomplished.

The first step (regardless of what version of Seadragon is used) is cutting the large source image into a Deep Zoom Image (or DZI) tile set. By far the easiest way to do this is through Microsoft’s Deep Zoom Composer, a Windows-only, free, but proprietary piece of desktop software. However, there is also a set of third-party tools, some open source, that can generate the tiles on Macintosh and Linux machines. Microsoft maintains a list of these tools on the Seadragon website.
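It helps to know what a tile set looks like on disk when checking the output of one of these tools. The sketch below assumes the usual Deep Zoom layout, in which a descriptor such as hamlet.xml sits beside a hamlet_files folder containing one numbered subfolder per level, with each tile named by its column and row. The function name is hypothetical, and the image format should be whatever your descriptor file declares.

```javascript
// Illustrative only: where a given tile lives in a conventional DZI tile set.
// descriptorPath: path to the .xml or .dzi descriptor file.
// format: file extension declared in the descriptor (for example "jpg").
function dziTilePath(descriptorPath, level, col, row, format) {
  // "Hamlet/hamlet.xml" -> "Hamlet/hamlet_files"
  var folder = descriptorPath.replace(/\.(xml|dzi)$/i, "") + "_files";
  return folder + "/" + level + "/" + col + "_" + row + "." + format;
}

var tile = dziTilePath("Hamlet/hamlet.xml", 9, 1, 0, "jpg");
```

If a viewer shows a blank page, requesting one of these paths directly in the browser is a quick way to confirm that the tiles were generated and uploaded where the viewer expects them.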

After tiling the images, you’ll need to download the Ajax Control Toolkit source code. It can be found on Microsoft’s CodePlex, but I’ve also mirrored it here in case later versions break or change the relevant functionality (the New BSD license should permit this sort of redistribution). Unzip it and copy the Seadragon folder from /Client/MicrosoftAjax.Extended/Seadragon to the directory where you keep your JavaScript. You will also need the MicrosoftAjax.js file at /SampleWebSites/AjaxClientWebSite/Scripts/MicrosftAjax/MicrosoftAjax.js

In the premade Seajax.min.js file Microsoft hosts, these files are compressed (or, in Microsoft terms, minified) into one file. If you have a Windows machine, you can do the same with the Microsoft Ajax Minifier (free from CodePlex here), which probably speeds up performance a bit but is not ultimately necessary.

If you’re going to use the unminified version, the Seadragon scripts will need to be linked in your HTML document in a particular order; I have pasted the necessary code below for convenient copying and pasting (note that I assume you have pasted the Seadragon directory and the MicrosoftAjax.js file into the same folder as your HTML file):

<script language="JavaScript" type="text/JavaScript" src="./MicrosoftAjax.js"></script>
<script language="JavaScript" type="text/JavaScript" src="./Seadragon/Seadragon.Config.pre.js"></script>
<script language="JavaScript" type="text/JavaScript" src="./Seadragon/Seadragon.Strings.pre.js"></script>
<script language="JavaScript" type="text/JavaScript" src="./Seadragon/Seadragon.Profiler.pre.js"></script>
<script language="JavaScript" type="text/JavaScript" src="./Seadragon/Seadragon.Point.pre.js"></script>
<script language="JavaScript" type="text/JavaScript" src="./Seadragon/Seadragon.Rect.pre.js"></script>
<script language="JavaScript" type="text/JavaScript" src="./Seadragon/Seadragon.Spring.pre.js"></script>
<script language="JavaScript" type="text/JavaScript" src="./Seadragon/Seadragon.Utils.pre.js"></script>
<script language="JavaScript" type="text/JavaScript" src="./Seadragon/Seadragon.MouseTracker.pre.js"></script>
<script language="JavaScript" type="text/JavaScript" src="./Seadragon/Seadragon.ImageLoader.pre.js"></script>
<script language="JavaScript" type="text/JavaScript" src="./Seadragon/Seadragon.Buttons.pre.js"></script>
<script language="JavaScript" type="text/JavaScript" src="./Seadragon/Seadragon.TileSource.pre.js"></script>
<script language="JavaScript" type="text/JavaScript" src="./Seadragon/Seadragon.DisplayRect.pre.js"></script>
<script language="JavaScript" type="text/JavaScript" src="./Seadragon/Seadragon.DeepZoom.pre.js"></script>
<script language="JavaScript" type="text/JavaScript" src="./Seadragon/Seadragon.Viewport.pre.js"></script>
<script language="JavaScript" type="text/JavaScript" src="./Seadragon/Seadragon.Drawer.pre.js"></script>
<script language="JavaScript" type="text/JavaScript" src="./Seadragon/Seadragon.pre.js"></script>

You’ll also need to create a function to start the Deep Zoom script. In a response to a question on Microsoft Live Labs’ forums, Microsoft representative Aseem Kishore notes that the sample ASP.NET code in /SampleWebSites/AjaxControlToolkitSampleSite/Seadragon/Seadragon.aspx translates to the following JavaScript (see note 1 at the end of this post):

Sys.Application.add_init(function() {
    $create(Sys.Extended.UI.Seadragon.Viewer, {
        "controls":[],
        "overlays":[],
        "prefixUrl":"/AJAX/AjaxControlToolkit/Samples",
        "xmlPath":"sample.xml"
    }, null, null, $get("ctl00_SampleContent_Seadragon"));
});

The $create function above instantiates the Seadragon Viewer with the following parameters:

  • controls: An array of HTML elements that can serve as interface controls for built-in or custom functionality. We will ignore this for now.
  • overlays: An array of HTML objects to be drawn on top of the image and, optionally, scaled with each zoom. These would be useful for annotations.
  • prefixUrl: A string which contains a URI (with an http:// prefix) pointing to the directory that contains the DZI/XML file and all of the tiles.
  • xmlPath: A string which contains the file path to the DZI/XML file (where the root is assumed to be the location of the HTML file).

The two nulls fill the events and references arguments of $create, neither of which is needed here. The final $get function selects, by id, the HTML element into which the viewer should be inserted. In the example above there would be an element (perhaps a DIV) in the HTML with id="ctl00_SampleContent_Seadragon".

Finally, I wrap all of this in a function which is called on page load. For instance:

function init() {
    viewer = null; // no var: viewer becomes a global
    Sys.Application.add_init(function () {
        viewer = $create(Sys.Extended.UI.Seadragon.Viewer, {
            "controls": [],
            "overlays": [],
            "prefixUrl": "http://mith.umd.edu/tile/seadragon/Hamlet",
            "xmlPath": "Hamlet/hamlet.xml"
        }, null, null, $get("SeadragonContainer"));
    });
}

The standard way to call Ajax Control Toolkit functions on page load seems to be something like the following (used on many forums and help pages):

Seadragon.Utils.addEvent(window,"load",init);

You can see my example page with all of this working here.

Note that on the demo page, Firefox seems to have problems when the code executes faster than the images load and throws the following error:

uncaught exception: [Exception... "Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIDOMCanvasRenderingContext2D.drawImage]" nsresult: "0x80040111 (NS_ERROR_NOT_AVAILABLE)" location: "JS frame :: http://mith.umd.edu/tile/seadragon/Seadragon/Seadragon.Drawer.pre.js :: anonymous :: line 117" data: no]

This problem does not seem to occur in Chrome or Safari, nor have I previously experienced this bug in Firefox, so I don’t believe this is simply Microsoft favoring Internet Explorer. A helpful Microsoft representative looked into the problem but was unable to solve it. Notably, the World Digital Library (which uses the less liberally licensed Seajax version of the JavaScript) does not seem to trigger the same error.
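Errors like NS_ERROR_NOT_AVAILABLE from drawImage are commonly reported when a canvas is handed an image that has not finished loading, which fits the timing described above. One possible guard is sketched below. It is an untested assumption that something like this would help inside Seadragon.Drawer, and the function name is hypothetical rather than part of the toolkit.

```javascript
// Hypothetical guard, not part of Seadragon.Drawer.
// Draws the image only if the browser reports it as fully loaded and decoded.
// Returns true if the image was drawn, false if the caller should retry later.
function drawIfLoaded(context, image, x, y, width, height) {
  var ready = image.complete && image.naturalWidth > 0;
  if (!ready) {
    return false;
  }
  context.drawImage(image, x, y, width, height);
  return true;
}
```

A caller that gets false back could simply leave the tile for the next redraw instead of letting the exception escape.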

Nonetheless, the open source Seadragon shows a great deal of promise and could become a standby for libraries and archives looking for deep zoom interfaces to display their content. It would be especially interesting if someone could write a translator to get the JPEG2000 image server, Djatoka, to generate DZI stacks on the fly.

Next time I will describe how to make use of the overlay and control features to create an annotation tool for Seadragon images.

1 Actually, Kishore’s code used the namespace “AjaxControlToolkit” where the example above has “Sys.Extended.UI”. Another developer on the forum changes the namespace to “Sys.Extended.UI” in a follow-up comment. The change is necessary if you are only working with the Seadragon code and ignoring the rest of the package (as we are).

Posted in Deep Zoom | Comments closed

External review of TILE

Our external evaluator, Melissa Terras, has submitted her external review of year one of our work on TILE. It is a very helpful analysis and is available here for those interested.

Posted in Uncategorized | Comments closed

How does TILE relate to TEI?

One question that we frequently get about TILE is how it relates to TEI. TEI is the Text Encoding Initiative, the de facto standard (or, more properly, a set of flexible guidelines) for humanities text encoding. The most recent version of TEI, P5, includes guidelines for incorporating images into text editions: linking the TEI document to image files representing the document (either its source or, for a TEI document containing annotations, the object of those annotations), noting specific areas of interest on the images, and linking the areas of interest to sections of the TEI document corresponding to them (either transcribed text appearing in the images, or annotations on the images).

So, how does TILE relate to TEI? Although the directors of the TILE project have long been involved with the TEI Consortium and are active users of TEI, and although the TEI community is one of the major intended audiences of TILE, TILE is not a TEI tool as such. It does not rely on TEI for its internal design and, unlike the Image Markup Tool (http://tapor.uvic.ca/~mholmes/image_markup/), which has as its output a single TEI-conformant document type, TILE is being designed to enable output in a variety of formats. Given the needs of the TILE partner projects, initially TILE will provide output in TEI (any flavour, including the EpiDoc customization), using facsimile or SVG for the image-linking mechanism, and in the IMT flavour of TEI, as well as in METS. However, when complete, TILE will be flexible enough to provide any output that can be defined using the TILE API – including output not in XML.

One result of this flexibility is that, again unlike the IMT, TILE will not be “plug and play”, and processing of the output will be the responsibility of projects using the software. This will require a bit of work on the part of users. On the other hand, as a modular set of tools, TILE will be able to be incorporated into other digital editing software suites that would otherwise have to design their own text-image linking functionality or go without. We hope that the flexibility of output makes TILE attractive for the developers of other software, and that the variety of text-linking functionality it supplies will make it equally attractive to editors and other end-users.

In a future blog post, we’ll discuss TILE functionality in detail.

Posted in Uncategorized | Comments closed

Some Thoughts on TILE Partner Projects

Newton, Swinburne, Kirby: One of these things is not like the other?

TILE is a community-driven effort, with many partners. As one of those partners, my role, at least as I see it, is to provide use case scenarios that help guide the development of the TILE tools, to implement the tools in the context of some projects that we hope will provide challenging testing environments, and to provide feedback that will lead to evolution and improvement of the TILE tools. I have other roles and responsibilities in TILE, related to earlier phases of tool design, metadata modeling, and such, but bringing the tools to bear on my various projects is to me the most interesting and exciting part of TILE.

The projects I bring to the table are The Chymistry of Isaac Newton, The Algernon Charles Swinburne Project, and Comic Book Markup Language (or, CBML). Three projects, one on early modern science; another on Victorian poetry, fiction, and criticism; and a third on twentieth-century popular culture. Three admittedly diverse research projects. Given the range of topics covered by these three projects, folks sometimes wonder, and sometimes ask, What the hell do these projects have to do with one another? How do they cohere as part of a unified research agenda?

In this blog post, I’ll try to begin answering that question in a general sense and then look more specifically at what the projects, as a group, have to offer the TILE enterprise.

The larger research agenda is not about Newton, Swinburne, or Kirby. (That’s Jack Kirby, by the way, one of the most influential creators in the history of comics. Working at Marvel comics, along with Stan Lee, Kirby transformed the comic book industry in the 1960s with a new “Marvel method” of creative collaboration and the development of characters such as the Fantastic Four, the Hulk, the Avengers, and others.) The larger research agenda is about exploring the digital representation of complex documents, not just texts, but documents—manuscripts; printed books; comic books; the original, annotated artwork for comic books—documents in all their glorious materiality. The various, often fading and messy inks of Newton’s manuscripts that make us wonder when or if Newton is using his own recipe “To make excellent Ink.”

[Image: Newton's Excellent Ink]

And then we have Swinburne’s poems. His Atalanta in Calydon, with a binding designed by Dante Gabriel Rossetti. The large blue foolscap paper on which Swinburne composed most of his works. The visual documents, artworks, by Whistler, Rossetti, and others, that inspired many of Swinburne’s poems (“Before the Mirror,” “A Christmas Carol,” “Hermaphroditus”). Comic books with their yellowed newsprint held together by rusty staples, the panels of artwork and word balloons and narrative captions, the Sea Monkey advertisements, and fan mail.

[Image: sea monkeys advert]

Any research into the many theoretical, technical, practical and other issues related to digital representations of complex document types would be seriously disadvantaged by a focus on a homogenous set of documents from any one particular historical period or genre. By examining 17th-century scientific manuscripts, 19th-century literary manuscripts and published books, and twentieth-century pop culture artifacts, I bring to the problem a reasonably diverse set of documents with a large and varied set of issues and challenges. And in the context of TILE, a Text and Image Linking Environment, these documents provide a rich suite of text-image relationships. In all cases, transcriptions of text need to be linked to facsimile page images. Newton’s manuscripts have additional graphic elements, in the form of alchemical symbols, diagrams, and Newton’s own pictorial illustrations. As mentioned above, Swinburne has poems inspired by visual art. Swinburne wrote a book-length study of Blake’s poetry, critical remarks on the Royal Academy Exhibition of 1868, “Notes on Designs of the Old Masters at Florence,” and a famous defense of Victorian artist Simeon Solomon. In these works, Swinburne’s texts share complex relationships with external, graphic documents. Comic books intricately weave together textual and graphic elements, and digital representation of these documents requires mechanisms to link these elements and describe the relationships.

Rich textual-graphic relationships are one feature shared by these diverse document types. Many documents in these three projects also share a richness of authorial and editorial annotation.

So with Newton, Swinburne, and CBML, we have three diverse projects being pursued under the umbrella of larger investigations into the issues related to representation of complex documents in digital space and in the context of larger, linked information environments. Our other TILE partner projects bring similarly complex documents. We hope this community of people and projects will provide a robust foundation on which to develop a widely usable suite of open source text-image linking tools.

Posted in Uncategorized | Comments closed