Forme work, encoding within

delimiter forme work catchword
mw fw sic corr supplied unclear del gap

Discussion of the types of encoding which may appear within the mw element

The content of fw elements differs in kind from the content of the main text stream. In a sense, the words that appear within the forme work are not part of the text’s linguistic structure: they function as part of the text’s infrastructure, but they do not signify in the same way as the rest of the text. If the word Jane appears as a catchword, it does not serve as another instance of the personal name, but simply repeats a string of characters which serves as a name in the main text.

For this reason, very little further encoding is needed or appropriate within the content of fw elements. We recommend against encoding names, dates, emphasis, and other phrase-level features. Encoding may be included to signal and correct discrepancies, to represent transcription problems such as illegibility, or to capture basic renditional information.
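
As a minimal sketch of the Jane example above: the catchword carries only the character string, while the same word in the main text receives its usual name encoding. The type value "catch", the use of persName and pb, and the sentence itself are illustrative assumptions, not taken from a particular text or required by this entry.

```xml
<!-- Foot of the page: the catchword is transcribed as a plain string, with no name encoding -->
<fw type="catch">Jane</fw>
<pb/>
<!-- Top of the next page: in the main text the word is encoded as a name (hypothetical sentence) -->
<p><persName>Jane</persName> was not at home.</p>
```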

The following are specific cases where we suggest that further encoding may be useful within fw:

  1. Where there is a typographical error in the forme work, we recommend encoding it using sic so that the error is flagged (and can be distinguished as an error in the source rather than a transcription error). For catchwords, we do not recommend correcting the error, since there are no circumstances where a corrected catchword is likely to be useful, and the discrepancy can be seen and understood by the reader in any case. See the entry on catchwords for information on the treatment of discrepancies between the catchword and the main text.
  2. Where there is highlighting in part of the forme work, we recommend encoding it using hi, regardless of its cause. Rendition that applies to an entire fw element should be recorded on the rend attribute of fw. See Forme work: renditional issues for more information on encoding rendition in forme work.
  3. If the content of the forme work is obscured, illegible, or deleted, we recommend encoding this as usual using supplied, unclear, del, or gap.
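
The three cases above might be sketched as follows. All transcribed content here is invented for illustration, and the type values ("catch", "header", "sig", "pageNum") and the rend value "italic" are assumptions; substitute the values your project's conventions prescribe.

```xml
<!-- Case 1: a typographical error in a catchword is flagged with sic and left uncorrected -->
<fw type="catch"><sic>teh</sic></fw>

<!-- Case 2: highlighting on part of a running header is encoded with hi -->
<fw type="header">The <hi rend="italic">Preface</hi>.</fw>

<!-- Case 2, continued: rendition covering the whole element goes on the rend attribute of fw -->
<fw type="header" rend="italic">The Preface.</fw>

<!-- Case 3: a partly legible signature and a wholly illegible page number -->
<fw type="sig">B<unclear>3</unclear></fw>
<fw type="pageNum"><gap reason="illegible"/></fw>
```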

Finally, characters that are usually encoded as delimiters using rend (e.g. quotation marks, brackets, and so forth) should instead be transcribed as ordinary character data (#PCDATA) within the forme work, since they do not function as delimiters for the catchword and will never need to be suppressed or altered on output.
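
A short sketch of this last point, again with invented content and assumed type values: the quotation mark and the brackets are simply typed as characters inside the element rather than being recorded on rend.

```xml
<!-- The opening quotation mark is part of the catchword's character string -->
<fw type="catch">“When</fw>

<!-- Brackets printed around a page number are likewise transcribed literally -->
<fw type="pageNum">[12]</fw>
```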