General Overview
The following is a general overview of the
procedures used throughout this project. Each element is
described in the Element Descriptions.
The texts in this project are encoded with
Extensible Markup Language (XML) to facilitate rendering and searching the
digital text. For this reason, it is essential to mark in the digital text both
the structure and the appearance of the original source. For
example, if the structure of the original source comprises multiple
subdivisions, these are nested in the digital text like the subsections of an
outline might be nested inside levels of headings. Likewise, if a portion of
the original text appears in italics or underlined, these textual treatments
are noted in the markup of the digital text.
Essentially, the more the digital text is
encoded to indicate the structure and appearance of the original source, the
more options the end-user has in searching and viewing the bibliographic codes of the
resulting digital text.
Each digital document should be prepared
from a specific edition, which is named in the header of the digital
text. If appropriate, the digital text can be proofed against an additional
edition.
Omissions
The following portions of the original
source are omitted from this project:
- All preliminaries, such as front matter
and title pages, except in those cases when the preliminaries are considered
authorial artifacts.
- Editorial comments except those for
which the author might be responsible and those in which significant textual
variation is indicated.
- Catchwords
- Page breaks
Since 1987, the
Text Encoding Initiative (TEI) has provided
mark-up standards that help scholars to encode various types of literary texts
for online research and teaching. To mark up, or encode, in this
sense means making an interpretation of a text explicit through metalanguage.
Unlike HTML, the metalanguage of XML is focussed on the meaning of data, not
its presentation. As TEI explains it, "With descriptive instead of procedural
markup the same document can readily be processed in many different ways, using
only those parts of it which are considered relevant."
See TEI Consortium for more information.
For example, in HTML the
presentation of a title is encoded:
<center>The Title</center>
In XML the meaning is encoded:
<title>The Title</title>
The Document Type Definition (DTD) used for this project is a subset of the TEI. The DTD is
used to define the legal elements of the XML document. The XML document should
be parsed against the DTD before submission to EADA, as the digital document
must be formed according to the parameters established by the DTD before the
XML document can be included in the EADA database.
For example, if you are using a text
editor like XMetal, you will be required to provide a "rule" document or the DTD.
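Before a file is parsed against the DTD, it can help to confirm that it is at least well-formed XML. The sketch below is only a precheck and rests on stated assumptions: it uses Python's standard library, which does not validate against a DTD, and the function name is_well_formed and the two sample strings are hypothetical. Full validation still requires a validating parser or an editor such as XMetal with the project DTD.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def is_well_formed(xml_text: str) -> Tuple[bool, str]:
    """Return (True, "") if xml_text parses, else (False, parser message).

    This checks well-formedness only; it does NOT validate against the DTD.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, ""


# Hypothetical samples: the second one never closes its <p> element.
good = "<text><body><div0><p n='1'>Sample</p></div0></body></text>"
bad = "<text><body><div0><p n='1'>Sample</div0></body></text>"

print(is_well_formed(good))
print(is_well_formed(bad)[0])
```

A file that fails this check will also fail DTD validation, so the precheck catches tag mismatches early.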
Each Electronic Edition is named in all
lowercase with the source author's last name and the first word or words of the
title that make that title unique. It is important that each title be unique to
avoid file duplication and overwriting.
For example, Anne
Bradstreet's "To My Dear Children" and "To My Dear Husband" would be named
"bradstreettochildren.xml" and "bradstreettohusband.xml", respectively.
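The naming convention can be sketched as a small helper. This is an illustration under assumptions, not a project tool: the function names edition_filename and find_duplicates are hypothetical, the editor still chooses which title words make the title unique, and the helper keeps ASCII letters and digits only.

```python
import re


def edition_filename(last_name, title_words):
    """Join the author's last name and the chosen title words.

    Assumption: names are reduced to lowercase ASCII letters and digits.
    """
    parts = [last_name] + list(title_words)
    stem = "".join(re.sub(r"[^a-z0-9]", "", part.lower()) for part in parts)
    return stem + ".xml"


def find_duplicates(filenames):
    """Return the names that occur more than once, sorted."""
    seen, dupes = set(), set()
    for name in filenames:
        if name in seen:
            dupes.add(name)
        seen.add(name)
    return sorted(dupes)


# The editor picks the distinguishing words from each Bradstreet title.
names = [
    edition_filename("Bradstreet", ["To", "Children"]),
    edition_filename("Bradstreet", ["To", "Husband"]),
]
print(names)                   # ['bradstreettochildren.xml', 'bradstreettohusband.xml']
print(find_duplicates(names))  # []
```

Running find_duplicates over the full list of file names before submission guards against the overwriting problem described above.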
The TEI-Header for each document is
structured according to pre-established guidelines. Each header comprises
significant metadata that will help identify and categorize each document,
including publication information and particular editorial decisions.
A Web
header example and a downloadable XML header template are provided.
(Note: to view the XML header template, right-click it and save the file as "EADAHeader.xml".)
Each document has a basic division of
<div0>. Each internal division is marked with the appropriate subsection
element (<div1>, <div2>, and so on). Subsections may include numbered
sections or chapters or whole poems depending on the structure of the original
source. Often, the subsections can be identified by the presence of distinct
titles or headings.
<text>
<body>
<div0>
  <head type="main" rend="all-caps">[Head Text inserted here]</head>
  <div1>
    <head>[Head Text for first <div1>]</head>
    <p n="1">[First Paragraph of <div1>]</p>
  </div1>
  <div1>
    <head>[Head Text for second <div1>]</head>
    <div2>
      <head>[Head Text for <div2>]</head>
      <div3>
        <head>[Head Text for <div3>]</head>
        <p n="22">[First Paragraph of <div3>]</p>
        <p n="23">[Second paragraph of <div3>]</p>
      </div3>
    </div2>
  </div1>
  <div1>
    <head>[Head Text for third <div1>]</head>
    <div2>
      <head>[Head Text for <div2>]</head>
      <p n="24">[First paragraph of <div2>]</p>
    </div2>
  </div1>
</div0>
</body>
</text>
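The nesting rule illustrated above (a <div0> at the top, each <div1> inside it, each <div2> inside a <div1>, and so on) can be checked mechanically. The following is a minimal sketch, assuming the file is well-formed and that no level may be skipped; the function name nesting_errors and the sample text are hypothetical, and the DTD remains the authority on what is legal.

```python
import re
import xml.etree.ElementTree as ET

DIV_TAG = re.compile(r"^div(\d+)$")


def nesting_errors(root):
    """Return a message for every divN whose nearest div ancestor is not div(N-1)."""
    errors = []

    def walk(element, parent_level):
        level = parent_level
        match = DIV_TAG.match(element.tag)
        if match:
            level = int(match.group(1))
            # The outermost division is expected to be <div0>.
            expected = 0 if parent_level is None else parent_level + 1
            if level != expected:
                errors.append(
                    f"<{element.tag}> found where <div{expected}> was expected"
                )
        for child in element:
            walk(child, level)

    walk(root, None)
    return errors


# Hypothetical sample: the second <div1> jumps straight to <div3>.
sample = """<text><body><div0><head>Main</head>
<div1><head>One</head><p n="1">First</p></div1>
<div1><head>Two</head><div3><p n="2">Skipped a level</p></div3></div1>
</div0></body></text>"""

print(nesting_errors(ET.fromstring(sample)))
```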
Non-keyboard characters (that is, characters that
do not appear on the main letter and number keys of the computer keyboard) are
encoded in the text with standardized Unicode numeric character references.
These values can be
found at http://www.unicode.org/charts/.
For example, the em dash in
the following lines
count my vain sighs for nought?
—For such is joy and such the price of pain.
would be encoded as
<l>count my vain sighs for nought?</l>
<l>&#8212;For such is joy and such the price of pain.</l>
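Numeric character references can be produced mechanically instead of looked up one at a time. The sketch below assumes decimal references are acceptable (the Unicode charts list the same code points in hexadecimal) and uses a hypothetical helper name, to_numeric_references. It does not escape markup characters such as & or <, which need separate handling.

```python
def to_numeric_references(text):
    """Replace every non-ASCII character with a decimal reference such as &#8212;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# \u2014 is the em dash that opens the second line of the example above.
line = "\u2014For such is joy and such the price of pain."
print(to_numeric_references(line))
# &#8212;For such is joy and such the price of pain.
```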
All changes to the typography of the text (i.e., font changes) are encoded to facilitate bibliographic code rendering.
Font changes (e.g., for titles, foreign words, and emphasized words) are recorded in the rend
attribute of the element that encloses them. The following are possible values for the rend
attribute:
- rend="italic"
- rend="bold"
- rend="underline"
- rend="strike-out"
- rend="superscript"
- rend="subscript"
- rend="small-caps"
For example, a title that
appears in italics would be tagged as <title rend="italic">Tarry shadow
of my scornful treasure</title>.
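A misspelled rend value is easy to miss by eye. The sketch below lists every rend value that is not among the seven given above; the names ALLOWED_REND and unknown_rend_values and the sample line are hypothetical. Because the division example elsewhere in this overview uses rend="all-caps", the list may not be exhaustive, so flagged values are prompts for review and not necessarily errors.

```python
import xml.etree.ElementTree as ET

# The seven values listed in this section; assumed, not confirmed, to be complete.
ALLOWED_REND = {
    "italic", "bold", "underline", "strike-out",
    "superscript", "subscript", "small-caps",
}


def unknown_rend_values(root):
    """Return (tag, rend) pairs whose rend value is not in ALLOWED_REND."""
    return [
        (el.tag, el.get("rend"))
        for el in root.iter()
        if el.get("rend") is not None and el.get("rend") not in ALLOWED_REND
    ]


# Hypothetical sample with one misspelled value ("italics").
sample = (
    '<l>Or hath <hi rend="italic">Canutus</hi> and '
    '<title rend="italics">A Title</title></l>'
)
print(unknown_rend_values(ET.fromstring(sample)))  # [('title', 'italics')]
```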
All paragraphs <p> are numbered sequentially through the entire text.
Line <l> and line group <lg> numbering restarts for each new set of <lg>s within a text. That is, if the whole text is one poem or song, all the line groups and lines are numbered sequentially throughout the entire text. If the text contains several poems or songs, each is encoded as a separate <div#>, and its line groups and lines are numbered sequentially within that <div#>. In this case, line numbering begins again for each <div#>.
Please note: If the source is numbered, the electronic file reflects that system regardless of source errors. Otherwise, numbers are
entered by the editor sequentially throughout the document.
The example below represents a long poem.
<lg n="56">
. . .
<l n="325">And to conclude, I may not tedious be,</l>
<l n="326">Man at his best estate is vanity.</l>
</lg>
<lg n="57">
<head rend="italic">Old Age.</head>
<l n="327">WHAT you have been, ev'n such have I before:</l>
<l n="328">And all you say, say I, and something more.</l>
. . .
</lg>
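Sequential numbering can be spot-checked with a short script. This is a sketch under assumptions: the helper name numbering_gaps and the sample are hypothetical, and the check is run on one numbering unit at a time (the whole text for <p>, or a single <div#> for <lg> and <l>). Since a numbered source is followed even where it contains errors, a reported gap calls for a look at the source, not an automatic correction.

```python
import xml.etree.ElementTree as ET


def numbering_gaps(root, tag):
    """Return (previous, current) pairs where n does not increase by exactly one."""
    numbers = [
        int(el.get("n"))
        for el in root.iter(tag)
        if el.get("n", "").isdigit()
    ]
    return [(a, b) for a, b in zip(numbers, numbers[1:]) if b != a + 1]


# Hypothetical sample: line 328 is missing from the second line group.
sample = (
    '<div1><lg n="56"><l n="325">a</l><l n="326">b</l></lg>'
    '<lg n="57"><l n="327">c</l><l n="329">d</l></lg></div1>'
)
root = ET.fromstring(sample)
print(numbering_gaps(root, "l"))   # [(327, 329)]
print(numbering_gaps(root, "lg"))  # []
```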
Punctuation is usually recorded as it appears
in the source. For larger structures such as <p> and <l>, punctuation
appears within the element. For shorter tagged strings, punctuation is most
often placed outside the tags.
In the example below, it is appropriate to
include the punctuation within the <l> tags as the punctuation is part of
the line of verse, but it is not necessary to include the punctuation within
the <hi> tags as the comma is not italicized in the text.
<l n="29">By fraud or force usurp'd thy flowring crown,</l>
<l n="30">Or by tempestuous warrs thy fields trod down?</l>
<l n="31">Or hath <hi rend="italic">Canutus</hi>, that brave valiant <hi rend="italic">Dane</hi></l>
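Punctuation that has slipped inside a short tagged string can be listed for review. The sketch below assumes that only <hi> elements are of interest and that ASCII punctuation is enough to check; the helper name trailing_punctuation and the sample are hypothetical. A flagged string is not necessarily wrong, since the punctuation may itself be italicized in the source.

```python
import string
import xml.etree.ElementTree as ET


def trailing_punctuation(root, tag="hi"):
    """Return the text of each <tag> element whose content ends in ASCII punctuation."""
    flagged = []
    for el in root.iter(tag):
        text = "".join(el.itertext()).rstrip()
        if text and text[-1] in string.punctuation:
            flagged.append(text)
    return flagged


# Hypothetical variant of line 31 with the comma wrongly placed inside <hi>.
sample = (
    '<l n="31">Or hath <hi rend="italic">Canutus,</hi> that brave valiant '
    '<hi rend="italic">Dane</hi></l>'
)
print(trailing_punctuation(ET.fromstring(sample)))  # ['Canutus,']
```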
Spacing is limited to one space between
characters. Spacing should always appear outside of a tagged string whenever
possible.
In the example below, the spacing between the tagged
strings has been removed because the tags themselves mark the structural
difference between the strings.
<closer><salute>My Lord <lb/>Your Lordship most <lb/>Humble Servant,</salute><signed>George Alsop</signed></closer>
In the following example, strings that are
tagged within sentences are tagged without spacing. The spacing is left around
the tagged string.
<l n="358">That but for shrubs they did themselves account.</l>
<l n="359">Then saw I <hi rend="italic">France</hi> and <hi rend="italic">Holland</hi>, sav'd <hi rend="italic">Cales</hi> won,</l>
<l n="360">And <hi rend="italic">Philip</hi> and <hi rend="italic">Albertus</hi> half undone.</l>
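Both spacing rules (one space at a time, and spaces kept outside tagged strings) can be checked with a short script. This is a sketch under assumptions: the helper name spacing_problems is hypothetical, only <hi> is treated as an inline tagged string by default, and leading indentation on a line is ignored so that pretty-printed files are not flagged.

```python
import xml.etree.ElementTree as ET


def spacing_problems(root, inline_tags=("hi",)):
    """Return messages for padded inline elements and doubled spaces."""
    problems = []
    for el in root.iter():
        if el.tag in inline_tags:
            content = "".join(el.itertext())
            if content != content.strip():
                problems.append(f"<{el.tag}> content is padded: {content!r}")
        # el.text precedes the first child; el.tail follows the closing tag.
        for chunk in (el.text, el.tail):
            for line in (chunk or "").splitlines():
                if "  " in line.strip():
                    problems.append(f"doubled space near: {line.strip()!r}")
    return problems


# Hypothetical variant of line 359: a space inside <hi> and a doubled space.
sample = (
    '<l n="359">Then saw I <hi rend="italic">France </hi>and '
    '<hi rend="italic">Holland</hi>,  sav\'d <hi rend="italic">Cales</hi> won,</l>'
)
for message in spacing_problems(ET.fromstring(sample)):
    print(message)
```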
Any questions about this description
should be sent to Tanya Clement, MITH Program Associate, at
tclement@wam.umd.edu.