Interaction of various emendation types

Discussion of how different kinds of textual emendation interact at the encoding level

For any of the elements in which a source reading is supplemented by an emendation (abbr, orig, sic), the encoding becomes more complex when two or more of these elements (or the phenomena they represent) are present. When correcting a typographical error or expanding an abbreviation in an old text, for example, should the corrected or expanded reading use the typography of the original (for instance, the long s)? Because of the way it encodes these alternatives, P4 makes it a little difficult to express the full complexity of some possible readings; P5 offers a more flexible alternative, about which see . Here we provide some basic strategies which will handle most situations when using P4.

1. Tag the letter, not the word

This conundrum can be largely avoided if you apply the encoding to a single letter rather than to the entire word, e.g.: happi<sic corr="n">h</sic>e&s;&s;e rather than <sic corr="happinesse">happihe&s;&s;e</sic>.

This in itself is a good reason to limit the scope of such encoding to the minimum possible number of characters. We strongly recommend placing sic, orig, and abbr on the individual letter or minimum string wherever possible. Situations where this approach is applicable include:

  • typographical errors involving a single incorrect or omitted letter or two transposed letters
  • old-style typography
  • abbreviations involving one or two letters (such as manuscript abbreviations, brevigraphs)
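
The difference between letter-level and word-level tagging can be seen by generating both readings from each encoding. The following is a minimal sketch in Python, not part of any P4 toolchain: the render function and the seg wrapper are hypothetical names used only to make a parseable fragment, and the long s is written as a literal character rather than as the &s; entity so that the fragment parses without a DTD.

```python
import xml.etree.ElementTree as ET

# P4-style elements paired with the attribute that carries the emendation.
EMENDATION_ATTR = {"sic": "corr", "abbr": "expan", "orig": "reg"}

def render(elem, emended):
    """Return the text of elem, using source or emended readings throughout."""
    attr = EMENDATION_ATTR.get(elem.tag)
    if emended and attr and attr in elem.attrib:
        # The attribute value replaces the whole element content.
        return elem.get(attr)
    parts = [elem.text or ""]
    for child in elem:
        parts.append(render(child, emended))
        parts.append(child.tail or "")
    return "".join(parts)

# Letter-level tagging: only the erroneous letter is wrapped in sic.
letter = ET.fromstring('<seg>happi<sic corr="n">h</sic>eſſe</seg>')
# Word-level tagging: the whole word is wrapped, and corr is a modern string.
word = ET.fromstring('<seg><sic corr="happinesse">happiheſſe</sic></seg>')

print(render(letter, emended=True))  # happineſſe (long s survives)
print(render(word, emended=True))    # happinesse (long s is lost)
```

With letter-level tagging the corrected reading keeps the long s automatically, because the long s characters lie outside the sic element and are never replaced.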

2. Problems that persist in single-letter tagging

Taking the single-letter approach will not prevent occasional interactions between sic, orig, and abbr: there may be a few cases where two or more of these elements apply to a single letter. In such cases, you need to think about which of the possible results you want to be able to produce. For instance, take the word convocatiō, where the final letter is a lower-case o with a macron marking an abbreviation, but this o is in fact a typographical error. If we encode this as convocati<sic corr="um"><abbr expan="on">&omacr;</abbr></sic> we can produce three of the four possible results: the uncorrected, unexpanded reading (&omacr;); the uncorrected, expanded reading (on); and the corrected, expanded reading (um). The fourth option (the corrected, unexpanded reading, which might hypothetically be &umacr;) cannot be produced from this encoding. Since this reading is entirely hypothetical, its omission makes some sense in this case. But you should consider your own situation carefully and think about which readings you want to be able to produce through your encoding.
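
To see which readings a nested encoding can yield, it may help to treat correction and expansion as two independent switches. The following Python sketch makes that assumption; the reading function and the seg wrapper are hypothetical, and the macron o is written as a literal character rather than as the &omacr; entity so that the fragment parses without a DTD.

```python
import xml.etree.ElementTree as ET

def reading(elem, correct, expand):
    """Return the text of elem with correction and expansion switched separately."""
    if elem.tag == "sic" and correct and "corr" in elem.attrib:
        # The correction replaces everything inside sic, including any abbr.
        return elem.get("corr")
    if elem.tag == "abbr" and expand and "expan" in elem.attrib:
        return elem.get("expan")
    parts = [elem.text or ""]
    for child in elem:
        parts.append(reading(child, correct, expand))
        parts.append(child.tail or "")
    return "".join(parts)

word = ET.fromstring(
    '<seg>convocati<sic corr="um"><abbr expan="on">ō</abbr></sic></seg>'
)

for correct in (False, True):
    for expand in (False, True):
        print(correct, expand, reading(word, correct, expand))
```

Both settings with correct=True give the same string, because the corr value replaces the whole content of sic; this is why the corrected, unexpanded reading has no representation in this encoding.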

3. When you have to tag the entire word

There are also cases where the entire word must be encoded. Examples include:

  • whole-word abbreviations
  • typographical errors involving whole-word transpositions or omitted words

In these whole-word cases, the values of corr, expan, and reg should usually be considered unavoidably modern information: they have been altered from the source reading and therefore have no reason to conform to it in other respects. In particular:

  • whole-word abbreviations: the expansion should be a modern word (since the act of expanding is itself a modernization of the text)
  • whole-word corrections of typographical errors: if the correction involves supplying letters or words which are missing from the text, these should be supplied in modern form (again, since the act of correction is a modernization). This includes choice of i/j, u/v and w/vv, as well as the use of long s and the choice of variant spellings. The only exception would be cases where the correction involves a word already present in the text (e.g. a whole-word transposition), in which case you should use the spelling and form of the word as it already appears in the text.
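
The exception for transpositions can be checked mechanically: a correction that merely reorders words should contain exactly the word forms already present in the source. The following Python sketch illustrates one way to test this; the function name and both example encodings are hypothetical and are not drawn from any actual text.

```python
import xml.etree.ElementTree as ET

def transposition_keeps_source_forms(sic):
    """True if corr holds exactly the words of the sic content, in any order."""
    source_words = "".join(sic.itertext()).split()
    corrected_words = sic.get("corr", "").split()
    return sorted(source_words) == sorted(corrected_words)

# The correction reorders the words but keeps the source spellings.
kept = ET.fromstring('<sic corr="vvith loue">loue vvith</sic>')
# The correction reorders the words and also modernizes them.
modernized = ET.fromstring('<sic corr="with love">loue vvith</sic>')

print(transposition_keeps_source_forms(kept))        # True
print(transposition_keeps_source_forms(modernized))  # False
```

A check like this applies only to transpositions; corrections that supply missing letters or words are expected to differ from the source and would fail it by design.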