In general the WWP does not capture the details of typographic design, including the appearance of a particular font or of particular letters in a font (e.g. swash letters, variant letter forms, ligatures between kerned characters). We preserve some aspects of earlier typographical practices which carry interpretive meaning or are of interest to modern scholars, and tend to ignore those which seem to have no bearing on textual meaning.
Characters of text are transcribed using the corresponding ASCII character, or with the appropriate entity reference. The appearance of the original character is never used as a motivation for choosing a different ASCII character (for instance, the accidental resemblance of a small C to an inverted comma, or of the numeral one to the letter I). Even in the case of the letters “n” and “u”, which are frequently printed upside-down and resemble one another closely, an attempt is made to determine what letter was actually used.
Long s is considered (for the time being, and with less confidence) to be a distinct character and is captured using an entity reference (&s;).
Ligatures: The WWP does not transcribe ligatures, which are letter combinations joined together for convenience because of their frequent use. Examples include st, ct, fl, and other letter combinations involving long s or f. These letters should be transcribed as if they were not joined.
Digraphs or ligatures which represent the Greek/Latin ae and oe letter combinations are an exception to this rule. These are characters which express a different phonetic quantity than the two letters taken separately, and hence need to be preserved. They are transcribed using the entity references for æ (ae ligature), Æ (AE ligature), œ (oe ligature), and Œ (OE ligature).
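The rules above on ligatures, long s, and the ae/oe digraphs can be sketched as a small normalization routine. This is an illustration only, not part of any WWP tool: the function name `transcribe` and both lookup tables are hypothetical, and the entity names `&aelig;`, `&AElig;`, `&oelig;`, and `&OElig;` are assumed from the standard ISO entity sets rather than taken from the text (only `&s;` is named there). The sketch takes a Unicode string in which typographic forms are encoded as single code points and returns the plain transcription.

```python
# Hypothetical tables for illustration; not an official WWP mapping.

# Typographic ligatures are split into their component letters.
# (There is no precomposed Unicode code point for the "ct" ligature.)
LIGATURES = {
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
    "\ufb05": "\u017ft",  # long s + t; the long s is handled below
    "\ufb06": "st",
}

# Characters treated as distinct are replaced by entity references.
ENTITIES = {
    "\u017f": "&s;",      # long s (entity name given in the guidelines)
    "\u00e6": "&aelig;",  # assumed ISO entity name
    "\u00c6": "&AElig;",  # assumed ISO entity name
    "\u0153": "&oelig;",  # assumed ISO entity name
    "\u0152": "&OElig;",  # assumed ISO entity name
}


def transcribe(text: str) -> str:
    """Split typographic ligatures, then encode long s and ae/oe digraphs."""
    for ligature, letters in LIGATURES.items():
        text = text.replace(ligature, letters)
    return "".join(ENTITIES.get(ch, ch) for ch in text)
```

Ligatures are split before the entity pass so that a long s inside a ligature (as in the long s + t form) is still captured as `&s;`. The sketch does not escape ampersands already present in the input, which a real transcription workflow would need to consider.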