Longer abstract (536 words)

Talk title

The table of signs in the scholarly digital edition: a Saussurian approach (original title: La tabella dei segni nell'edizione scientifica digitale: un approccio saussuriano)

Abstract

In principle, each textual document has its own writing system. This is most evident in handwritten documents: medieval manuscripts, modern handwritten notes, diaries and drafts such as those left by Saussure or Wittgenstein. Distinctive features include the very composition of the graphematic and alphabetic systems (u/v, i/j), allographs (s/ſ, u/v, i/j, c/σ/ς), abbreviation systems, punctuation and other conventional marks.

TEI's current approach to grapheme encoding simply consists in recommending the use of the Unicode standard. In practice, this is sufficient when we encode printed documents based on contemporary typographical writing systems, whose set of graphic signs (graphemes, diacritics, punctuation etc.) can be considered standard for most editorial and processing purposes.

However, the TEI 'Unicode-compliance' principle is not sufficient to define graphemes in handwritten writing systems. Let us assume that manuscript A has two distinct graphemes, 'u' and 'v', while manuscript B has only one grapheme, 'u'. If we identified both the 'u' of the first manuscript and the 'u' of the second with the same Unicode code point (U+0075), our encoding would imply that they are the same grapheme, which they are not. Each of them is instead defined contrastively by the network of relations within its own writing system, as Saussure taught us, and the network of contrastive relations of manuscript A differs from that of manuscript B, because the latter has no 'u/v' distinction. This is even more evident with other graphic signs such as punctuation, whose expression (shape) and content (value) have varied enormously over time.
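
The contrast between manuscripts A and B can be made concrete with a minimal JavaScript sketch. The two inventories, the function names (describeGrapheme, sameGrapheme) and the data layout are illustrative assumptions, not part of TEI or of any existing edition. The point is only that two signs can share a Unicode code point and still be different graphemes once each is described by the signs it contrasts with.

```javascript
// Hypothetical grapheme inventories for the two manuscripts in the example.
const manuscriptA = { id: "A", graphemes: ["u", "v"] };
const manuscriptB = { id: "B", graphemes: ["u"] };

// Describe a sign by its writing system and by the signs it contrasts with,
// rather than by its Unicode code point alone.
function describeGrapheme(manuscript, sign) {
  if (!manuscript.graphemes.includes(sign)) {
    throw new Error(`Sign "${sign}" is not declared for manuscript ${manuscript.id}`);
  }
  return {
    system: manuscript.id,
    sign,
    codePoint: sign.codePointAt(0),
    contrastsWith: manuscript.graphemes.filter((g) => g !== sign),
  };
}

// Two descriptions denote the same grapheme only within the same system.
function sameGrapheme(a, b) {
  return a.system === b.system && a.sign === b.sign;
}

const uInA = describeGrapheme(manuscriptA, "u");
const uInB = describeGrapheme(manuscriptB, "u");

console.log(uInA.codePoint === uInB.codePoint); // true: both are U+0075
console.log(sameGrapheme(uInA, uInB));          // false: different writing systems
console.log(uInA.contrastsWith, uInB.contrastsWith); // 'u' contrasts with 'v' only in A
```

In this sketch the identity of a grapheme is the pair (writing system, sign). A real edition might choose a richer key, but a bare code point would not be enough.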

This is why Tito Orlandi ("Informatica testuale", 2010) suggests formally declaring and defining, for each document edited, every graphic sign that the encoder decides to distinguish, identify and encode in his or her digital edition. This is what he calls a "table of signs".

After discussing this issue at the 2013 TEI annual conference, I came to the conclusion that the very nature of XML, the current technological framework of TEI, requires the encoder to formally declare only those "non-standard" graphemes that are not already declared in Unicode. This is the purpose, for example, of the TEI gaiji module, which only allows for the description of "non-standard characters", i.e. graphemes and other signs not included in Unicode.

The TEI Guidelines currently suggest that encoders define as few "characters" as possible, while I am suggesting that they should declare and define all signs encoded in the edition.

This does not mean that it is technically impossible to implement a "table of signs" for each document in a digital edition. In my prototype critical digital edition of the "De nomine" by Ursus of Benevento (9th century), I implemented a GToS ("Graphematic table of signs") as a simple CSV file, probably the simplest digital format for a table. I then wrote software (in JavaScript) that processes the GToS as an essential component of the edition model.
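
As an illustration of how a table of signs stored as CSV might be processed, here is a minimal JavaScript sketch. It is not the software of the edition: the column names (sign, type, description), the sample rows and the functions parseTableOfSigns and findUndeclaredSigns are all assumptions made for the example. It loads the table and then checks that a transcription uses only signs the table declares.

```javascript
// Hypothetical GToS content: the columns and rows are assumptions made for
// this sketch, not the actual table used in the edition.
const gtosCsv = [
  "sign,type,description",
  "u,grapheme,no u/v distinction in this manuscript",
  "ſ,allograph,long s",
  ".,punctuation,short pause",
].join("\n");

// Parse a simple CSV (no quoted fields, no commas inside values)
// into a Map keyed by the sign itself.
function parseTableOfSigns(csv) {
  const [headerLine, ...rows] = csv.trim().split("\n");
  const columns = headerLine.split(",");
  const table = new Map();
  for (const row of rows) {
    const values = row.split(",");
    const entry = Object.fromEntries(columns.map((name, i) => [name, values[i]]));
    table.set(entry.sign, entry);
  }
  return table;
}

// Return every sign used in a transcription that the table does not declare.
function findUndeclaredSigns(transcription, table) {
  const undeclared = new Set();
  for (const sign of transcription) {
    if (/\s/.test(sign)) continue; // whitespace is treated as a separator here
    if (!table.has(sign)) undeclared.add(sign);
  }
  return [...undeclared];
}

const table = parseTableOfSigns(gtosCsv);
console.log(table.get("ſ").type);                // allograph
console.log(findUndeclaredSigns("uſu.", table)); // no undeclared signs
console.log(findUndeclaredSigns("uva", table));  // 'v' and 'a' are undeclared
```

Keying the table by the sign itself makes the check a plain lookup. A table with multi-character signs or quoted CSV fields would need a proper CSV parser and a tokenizer instead of character-by-character iteration.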

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.