September 9, 2010

Exam reading: “Electronic textual editing”

“Electronic Textual Editing” describes the main data archiving standards effort for the humanities. It’s not really a dynamic read (how thrilling can a collection of essays on XML and database construction really be?), but it’s a useful overview of the TEI:

Summary: A collection of essays on editing and archiving electronic texts. It focuses on the Text Encoding Initiative (TEI), a project to create best practices and markup languages (first SGML, then XML) for the humanities. The book can be broken into three main parts: general guidelines for creating and editing digital scholarly editions, case studies and lessons from editing both older and modern texts, and specific technical methods (e.g., digitizing documents, dealing with character encoding and markup). For scholarly editions, the important considerations are accurate documentation and thorough inclusion of text variants. Digital editions and collections let researchers create quite accurate versions of a text (e.g., scanned copies), collect multiple versions of that document, and dynamically link them all. The functions of a markup language include labeling (and linking) sites of variability among texts, and replicating structural and layout elements in electronic versions of originally print documents. A few of the case studies had some interesting points. The digitizer of the Canterbury Tales stresses the importance of having explicit principles for transcription before starting, and discusses how an electronic version of the text changes the editing and reading experience. For the creator of an electronic Thomas Edison archive, the major task seemed to be developing a good database to link text- and image-based documents. For poetry digitization, it was key to pay attention to both words and layout.
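
To make the point about markup labeling sites of variability a bit more concrete, here is a minimal sketch (mine, not from the book). The element names (lg, l, app, lem, rdg) and the wit attribute are TEI's verse and critical apparatus elements, but the verse line, the witness labels #A and #B, and the readings() helper are all invented for illustration, and the fragment leaves out the TEI namespace and header that a real edition would have.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-style fragment: one verse line with one point of variation.
# <lg>/<l> record the verse structure; <app> marks the site of variability,
# <lem> holds the editor's preferred reading and <rdg> a variant reading.
# The text and the witness labels (#A, #B) are made up for this sketch.
SAMPLE = """
<lg type="couplet">
  <l n="1">The <app><lem wit="#A">grey</lem><rdg wit="#B">gray</rdg></app> fox ran home</l>
</lg>
"""


def readings(xml_text):
    """Return one {witness: reading} dict per <app> element in the fragment."""
    root = ET.fromstring(xml_text)
    sites = []
    for app in root.iter("app"):
        variants = {}
        for child in app:
            if child.tag in ("lem", "rdg"):
                variants[child.get("wit")] = (child.text or "").strip()
        sites.append(variants)
    return sites


if __name__ == "__main__":
    for site in readings(SAMPLE):
        print(site)
```

The same fragment carries both kinds of information mentioned above: the l and lg elements replicate the layout of the verse, while app groups the competing readings so that software can show either witness or link between them.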

Comments: I glossed over the detailed technical essays and focused on what I thought were the most salient points. Most authors were quite keen on XML for its formatting abilities; I’ve used related technologies (XHTML & CSS). As I’m not involved with archiving or creating digital editions, this was more of an overview of this area of T&T.

Links to: McGann (TEI, digital archives); Headrick (classification systems in general)



