Special characters: inverted characters [007]

Abstract

Treatment of characters which are printed upside down in the source

Discussion

The WWP collection includes many examples of characters printed upside down. These are most frequently “n” and “u”, but other characters are affected as well.

The WWP encodes inverted characters using an entity reference and a <sic> element, e.g.

m<sic corr="a">&inverteda;</sic>n

The reason for this apparent redundancy is that neither the element nor the entity reference by itself is sufficient to guarantee the appropriate output in all cases. If an entity reference is used alone, we run into problems when two occurrences of the same inverted letter need two different corrected readings, since every occurrence of a given entity reference must resolve to the same character. Similarly, if a <sic> element is used alone, there is no way of representing the inverted character itself.

Our transcription strategy is founded on an “encode what you see” philosophy, independent of hypotheses about what the printer intended. Thus if a letter *looks* exactly like an “n”, even if it is in a place where a “u” should be, we encode it as an “n” and we do not try to imagine whether it could “really” be an inverted “u”. Similarly, if we see a letter which looks exactly like all the other “w”s in the text (except for being upside down), we encode it as an inverted “w” even if it is functioning in the text like an “m”.

1. In cases where an inverted letter is printed, and the correct reading is the same letter rightside up, we encode the letter with an entity reference and a <sic> element:

morld (where the m is an inverted w)

     <sic corr="w">&invertedw;</sic>orld

2. Similarly, in cases where an inverted letter is printed, and the correct reading is some other letter, we encode the letter with an entity reference and a <sic> element:

mother (where the m is an inverted w)

     <sic corr="m">&invertedw;</sic>other

The only difference between these two cases is that in case 1, the correct reading requires the inverted letter to be interpreted as itself, turned rightside up, whereas in case 2, the correct reading requires the inverted letter to be interpreted as another letter which happens to resemble it.
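
The need for both layers of markup can be seen in a small sketch. The Python below is illustrative only and is not the WWP's actual processing: the function names, the regular expressions, and the choice of the Unicode turned w as a display glyph are all assumptions made for this example. It derives a corrected reading from the corr attribute and an as-printed reading from the entity reference.

```python
import re

# Assumed display mapping from entity name to a Unicode turned letter.
# Only the one entity used below is listed; the glyph choice is hypothetical.
TURNED_GLYPHS = {
    "invertedw": "\u028d",  # LATIN SMALL LETTER TURNED W
}

# Matches the simple <sic corr="...">...</sic> pattern used in this entry.
SIC = re.compile(r'<sic corr="([^"]*)">(.*?)</sic>')
# Matches entity references of the form &invertedX;
ENTITY = re.compile(r"&(inverted[a-z]);")


def corrected(text):
    """Corrected reading: replace each <sic> element with its corr value."""
    return SIC.sub(lambda m: m.group(1), text)


def as_printed(text):
    """As-printed reading: drop the <sic> tags, show each entity as a turned glyph."""
    unwrapped = SIC.sub(lambda m: m.group(2), text)
    return ENTITY.sub(lambda m: TURNED_GLYPHS[m.group(1)], unwrapped)


case_1 = '<sic corr="w">&invertedw;</sic>orld'
case_2 = '<sic corr="m">&invertedw;</sic>other'

print(corrected(case_1), corrected(case_2))
print(as_printed(case_1), as_printed(case_2))
```

Both words contain the same entity reference, so the as-printed reading shows the same turned glyph in each. The corrected readings differ ("world" and "mother") only because the <sic> element carries the correction separately.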

3. In cases where a rightside-up letter is printed, and it appears that the printer meant to use that letter upside down as a substitute for some other letter, the WWP expresses agnosticism about causes and simply encodes the letter that appears as a typographical error, using a <sic> element:

Her wother nursed her... (rightside-up “w” probably intended to substitute for an “m”, but inserted upside down...)

     <sic corr="m">w</sic>other

4. In cases where a letter has been printed which might be rightside up or upside down, in the absence of evidence to the contrary, we assume that it is rightside up. The encoder should compare it carefully with other letters in the text to see whether similarities or differences might indicate exactly which letter is printed. If (as might be the case with “n” and “u”) two letters look exactly the same (except for the inversion), the encoder will follow the “assume it’s rightside up” rule. If, however, there are slight differences (if the u has a longer tail, for instance) these can override the “assume it’s rightside up” rule.

So in the case of the word “trnth” (which we assume should read “truth”):

If the two letters look identical:

     tr<sic corr="u">n</sic>th

If the two letters look different, and we’re sure it’s a “u” upside down:

     tr<sic corr="u">&invertedu;</sic>th

If the two letters look different, and we’re sure it’s an “n” rightside up:

     tr<sic corr="u">n</sic>th
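
The three outcomes above can be summarized as a small decision function. This is a sketch under stated assumptions, not a WWP tool: the function and parameter names are hypothetical, and the entity name is built by extrapolating the &invertedX; pattern seen in the examples in this entry.

```python
def encode_letter(apparent, turned, correct, distinguishable, judged_inverted=False):
    """Return markup for one printed letter (hypothetical helper).

    apparent        -- the letter the glyph looks like as printed, e.g. "n"
    turned          -- the letter it would be if it were an inverted sort, e.g. "u"
    correct         -- the reading the text calls for, e.g. "u"
    distinguishable -- True if the typeface lets the encoder tell the two apart
    judged_inverted -- the encoder's judgment; only used when distinguishable
    """
    if distinguishable and judged_inverted:
        # Evidence overrides the default: encode the inverted letter itself.
        return f'<sic corr="{correct}">&inverted{turned};</sic>'
    # Default rule: assume the letter is rightside up and encode what we see.
    if apparent == correct:
        return apparent
    return f'<sic corr="{correct}">{apparent}</sic>'


# The "trnth" example, in the same order as the three cases above.
print("tr" + encode_letter("n", "u", "u", distinguishable=False) + "th")
print("tr" + encode_letter("n", "u", "u", True, judged_inverted=True) + "th")
print("tr" + encode_letter("n", "u", "u", True, judged_inverted=False) + "th")
```

The first and third calls produce the same markup, which mirrors the text: an “n” that cannot be told apart from an inverted “u” is treated exactly like a confirmed rightside-up “n”.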
