Paolo Monella, In the Tower of Babel: modelling primary sources of multi-testimonial textual transmissions

The creation of a scholarly edition of a literary work and its textual tradition is based upon a comparison (collatio) of the representations of the text in different primary sources. To perform this comparison, a digital scholarly edition must rely on a digital modelling of primary sources, formalised in a way that allows the computer to compare them. As highlighted by scholars such as Tito Orlandi and Raul Mordenti, a problem in this respect is that each witness within a textual tradition (a papyrus, manuscript, early print edition etc.) implements a different encoding system to represent the same text. Discrepancies between such systems range from non-overlapping alphabets (e.g., in Latin, the presence or absence of a u/v or i/j distinction) to other handwriting or print conventions (including punctuation, capitalisation, scribal abbreviations, word boundaries, use of space on the page etc.).

In order to make the representations of the text in different primary sources digitally comparable, a uniform layer of digital modelling of each witness' text is necessary. TEI P5 implies this 'alphabetic regularisation', while providing methods for encoding relevant idiosyncratic scribal conventions. Ideally, however, for each textual witness:

(*) one layer (layer A) should model its graphical representation of the text, mirroring its specific encoding system (alphabet, writing conventions etc.). This should constitute our digital representation of the witness' graphical representation of the text;
(*) a second layer (layer B) should constitute our digital representation of the text of that witness.

The two modelling layers should be formally and explicitly distinct, though interrelated. For instance, where a Latin manuscript has the 'qq'-like abbreviation for 'quoque', the philologist:

(1) should use a specific digital convention to encode the abbreviation in layer A (e.g. an XML entity specific to the modelling of that manuscript, like &AbbrQuoque;);
(2) should then recognise that abbreviation as the representation, in the scribe's graphical encoding system, of 'quoque' (as an entity within the Latin linguistic system shared by the scribe and the philologist), and provide - in layer B - a representation of that portion of the text in their own digital encoding system (e.g. a sequence of Unicode code points such as U+0071 for 'q', U+0075 for 'u' etc.).

In addition to presenting these views and discussing the related open issues, in my talk I shall explore how TEI P5 can address the theoretical modelling issues sketched above. These theoretical issues have a direct impact on the creation of digital scholarly editions of ancient texts with multi-testimonial textual traditions - a field that still counts few projects, particularly in Classical literatures. Also, the larger and more ambitious frame encompassing this enquiry is the long-term goal of integrating representations of primary sources in the existing TEI-encoded corpora of ancient texts through a standard and interoperable, yet theoretically grounded, model.
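
The distinction between layer A and layer B, and the way it supports collation, can be illustrated with a small Python sketch. This is not TEI P5 markup and not an implementation proposed in the talk. The class Witness, the field sign_table, the sigla M1 and M2 and the sample reading 'tu quoque' are all hypothetical, chosen only to make the 'quoque' example concrete. The sketch assumes that each witness carries its own table mapping the signs of its graphical system (layer A) to the editor's Unicode encoding of the text (layer B).

```python
from dataclasses import dataclass


@dataclass
class Witness:
    """One witness, modelled on two explicit and interrelated layers."""

    siglum: str
    # Layer A: the words as written, each a list of signs in the witness's
    # own graphical system (letters, abbreviation entities etc.).
    layer_a: list[list[str]]
    # Hypothetical per-witness table: layer A sign -> layer B Unicode string.
    sign_table: dict[str, str]

    def layer_b(self) -> list[str]:
        """Derive layer B: the editor's encoding of the text of this witness."""
        return [
            "".join(self.sign_table[sign] for sign in word)
            for word in self.layer_a
        ]


def agreements(first: Witness, second: Witness) -> list[bool]:
    """Compare two witnesses word by word on layer B only."""
    return [a == b for a, b in zip(first.layer_b(), second.layer_b())]


# Assumed witness M1: distinguishes 'u' and abbreviates 'quoque'.
m1 = Witness(
    siglum="M1",
    layer_a=[["t", "u"], ["&AbbrQuoque;"]],
    sign_table={"t": "t", "u": "u", "&AbbrQuoque;": "quoque"},
)

# Assumed witness M2: writes 'v' for the vowel and spells the word out.
m2 = Witness(
    siglum="M2",
    layer_a=[["t", "v"], ["q", "v", "o", "q", "v", "e"]],
    sign_table={"t": "t", "v": "u", "q": "q", "o": "o", "e": "e"},
)

print(m1.layer_b())        # ['tu', 'quoque']
print(m2.layer_b())        # ['tu', 'quoque']
print(agreements(m1, m2))  # [True, True]
```

In this sketch the two witnesses differ in every sign on layer A, yet agree on layer B, which is the level at which collation takes place. Keeping the sign table separate for each witness is one possible way of making the relation between the two layers explicit rather than implied by a silent regularisation.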
