Dates, errors in

date phrase-level encoding
date sic corr corr docDate choice

Encoding errors in dates

In transcribing primary sources, you may sometimes encounter a typographical error in a date, and you may wish either to signal the error (so that it is not misinterpreted as a transcription error) or to correct it. These activities differ from providing a regularized value (as described in Dates: format for the value and when attributes), in that they say that the date as printed is factually incorrect, rather than saying that it is expressed in a different dating system or in a more verbose form.

In P4

To signal or correct an erroneous date, use the sic element, with its corr attribute supplying the correction. The sic element should go inside the date element, and should enclose only the part of the date that is incorrect. This is fairly simple if the date you’re transcribing is in a standard format, for example:

The Bastille fell on <date value="1789-07-14">July 
<sic corr="1">2</sic>4, 1789</date>.

Note that only one regularized value can be provided here; which reading you regularize (the error or the correction) will depend on your purpose in regularizing. If you plan to generate timelines of events, the corrected value will be more useful. If you want to enable readers to search for specific dates mentioned in the text, the intended date (July 14) is also the more likely target of the search. Readers who are interested in the date as printed can still search the text itself, since the reading July 24, 1789 is preserved in the transcription.
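
As a rough illustration of how a processor might recover each of these readings from the P4 encoding, here is a minimal Python sketch. It assumes the fragment is parsed on its own, without a DTD or namespace, and the helper names (normalize, corrected_text) are hypothetical, not part of any TEI tool.

```python
import xml.etree.ElementTree as ET

# The P4 example from above, as a standalone string (no DTD or namespace assumed).
P4_FRAGMENT = '<date value="1789-07-14">July \n<sic corr="1">2</sic>4, 1789</date>'


def normalize(text):
    """Collapse runs of whitespace (including line breaks) to single spaces."""
    return " ".join(text.split())


def corrected_text(elem):
    """Rebuild the text of elem, replacing each <sic> that carries a corr
    attribute with the value of that attribute."""
    parts = [elem.text or ""]
    for child in elem:
        corr = child.get("corr") if child.tag == "sic" else None
        parts.append(corr if corr is not None else corrected_text(child))
        parts.append(child.tail or "")
    return "".join(parts)


date = ET.fromstring(P4_FRAGMENT)
p4_as_printed = normalize("".join(date.itertext()))  # reading preserved in the transcription
p4_corrected = normalize(corrected_text(date))       # reading with the corr value applied
p4_value = date.get("value")                         # the single regularized value
```

With this fragment, the printed reading comes out as July 24, 1789, the corrected reading as July 14, 1789, and the regularized value as 1789-07-14, which matches the division of labor described above.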

This method is appropriate for errors which are certainly inadvertent, and would have been corrected by the original publisher if noticed. For errors which are intentional (that is, errors that reflect a false belief), you will need to decide whether you wish to treat the text as a record of fact or as a textual artifact. If you treat it as a record of fact, it may be more useful to provide a corrected reading, and possibly a correct value attribute as well. If you treat it as a textual artifact, the error may be of interest in itself.

In P5

In P5, the treatment of typographical errors is somewhat different, with the error and the correction being grouped together within a choice element. The choice expressed (between an error and its correction) may be encoded at the level of the date, saying in effect that the date is incorrect, as in this example:

<choice>
   <sic>
      <date when="1789-07-24">July 24, 1789</date>
   </sic>
   <corr>
      <date when="1789-07-14">July 14, 1789</date>
   </corr>
</choice>

This encoding represents the error as a misconception about the date rather than as a typographical error (substituting a 2 for a 1). Because it presents both dates in regularized form, it supports searching and processing on both the error and the correction, but we suspect that this functionality will only rarely be useful. This encoding is also somewhat cumbersome and redundant. To represent the error as a single-character typographical error, we recommend the following simpler encoding:

<date when="1789-07-14">July 
   <choice>
      <sic>2</sic>
      <corr>1</corr>
   </choice>4, 1789
</date>
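
To show how the two readings might be extracted from this simpler P5 encoding, here is a minimal Python sketch. It assumes the fragment carries no namespace (a full TEI P5 document would normally use the TEI namespace, and the tag tests would need adjusting), and the function name reading is hypothetical.

```python
import xml.etree.ElementTree as ET

# The recommended P5 encoding from above, as a standalone string.
# Assumption: no namespace declaration on the fragment.
P5_FRAGMENT = """<date when="1789-07-14">July 
   <choice>
      <sic>2</sic>
      <corr>1</corr>
   </choice>4, 1789
</date>"""


def reading(elem, prefer):
    """Return the text of elem, keeping only the `prefer` alternative
    ('sic' or 'corr') inside every <choice> below it."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "choice":
            # <choice> holds only elements, so the whitespace between
            # its children is layout and is skipped here.
            for alt in child:
                if alt.tag == prefer:
                    parts.append(reading(alt, prefer))
        else:
            parts.append(reading(child, prefer))
        parts.append(child.tail or "")
    return "".join(parts)


def normalize(text):
    """Collapse runs of whitespace (including line breaks) to single spaces."""
    return " ".join(text.split())


date = ET.fromstring(P5_FRAGMENT)
p5_as_printed = normalize(reading(date, "sic"))   # the date as printed
p5_corrected = normalize(reading(date, "corr"))   # the corrected date
p5_when = date.get("when")                        # the regularized value
```

Here the printed reading is July 24, 1789, the corrected reading is July 14, 1789, and the when attribute gives 1789-07-14. Skipping the whitespace directly inside choice matters: concatenating every text node naively would leave a stray space between the 2 and the 4.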