Regularization: silent [144]


Features which the WWP silently regularizes, including details of spacing, delimiters, type size, and typography


The WWP silently regularizes, or declines to record precise information about, a number of features of the presentation of the source text:

1. Space after punctuation, between words, or between words and punctuation. We silently regularize the space between words to one space, the space between a word and the following punctuation to zero spaces, and the space after the end of a sentence or after a colon to one space. We regularize the spacing around em-dashes to zero. We do not record variations in white space within a line where this results from the tightness or looseness of the line.

2. Delimiters on page numbers and signatures (e.g. parentheses, brackets, other marks of punctuation). We do not encode delimiters on page numbers or signatures. In some cases, characters which are ordinarily used as delimiters may be used informationally, for instance where they are used to distinguish between two separate sequences (e.g. between signature sequence A, A2, A3 and A., A.2, A.3). In such cases they are no longer delimiters and are transcribed along with the rest of the signature.

3. Length of dashes. We regularize all punctuational dashes (or sequences of dashes) longer than an em-dash to a single entity, &sdash; (superdash). However, in cases where dashes or hyphens are used to indicate a number of missing letters (e.g. in a concealed name), they are recorded exactly as they appear.

4. Exact appearance of rules and ornaments. We do not record any information about the appearance of ornaments. We do not record any information about the length or weight of ruled lines, or whether they are single, double, or ornamented. For more information, see the entries on ornaments and rules.

5. Ligatures. We do not encode ligatures which simply affect the kerning of two characters and do not involve the omission of any letters or the representation of a digraph. (Digraph ligatures which we do encode include œ, Œ, æ, Æ.)

6. Uneven baselines. We do not record the presence of an uneven baseline.

7. Exact location of marginalia and marginal notes. We do not indicate by any precise means (coordinates, exact positioning relative to any other feature) the location of any marginal material, printed or handwritten. We indicate its general position by the align() and place() keywords on the rend= attribute. (For more information, see entries on these keywords.) We also indicate its position implicitly, in some cases, by the position of the anchor for marginal notes.

8. Exact typeface and type size. We do not record the exact typeface (e.g. Baskerville) or the exact type size (e.g. 12 pt, Pica, Great Primer). We record whether the type is roman, italic, or black letter. We record shifts in type size only implicitly, where they indicate an element boundary.

9. Vertical white space. We do not record variations in vertical white space, except implicitly where these are the indicator of an element boundary.

10. Special letterforms, whether in print or in handwriting. We do not record the presence of special letterforms (e.g. swash characters, alternative forms such as rounded or square E) apart from the long s.
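
As an illustration of items 1 and 3 only, the spacing and dash rules can be sketched in Python. This is a minimal sketch, not a WWP tool: the function names regularize_spacing and regularize_dashes are hypothetical, and it assumes that the transcription is a plain string holding one line of running text, that the em-dash is the character U+2014, and that a dash longer than an em-dash has been typed as two or more consecutive em-dashes. It does not insert a missing space after a full stop or colon, since that could damage strings such as the signature A.2.

```python
import re

EM_DASH = "\u2014"
SUPERDASH = "&sdash;"  # entity named in item 3

def regularize_spacing(line: str) -> str:
    """Apply the horizontal spacing rules of item 1 to one line of text."""
    # One space between words (this also leaves one space after a
    # sentence end or a colon wherever any space was present).
    text = re.sub(r"[ \t]+", " ", line.strip())
    # No space between a word and the punctuation that follows it.
    text = re.sub(r" ([,.;:!?])", r"\1", text)
    # No space on either side of an em-dash.
    text = re.sub(" *" + EM_DASH + " *", EM_DASH, text)
    return text

def regularize_dashes(text: str) -> str:
    """Replace a run of two or more em-dashes with the superdash entity.

    Assumption: the caller applies this only to punctuational dashes.
    Dashes that stand for missing letters (e.g. in a concealed name)
    must be left exactly as transcribed, and this function cannot
    tell the two cases apart.
    """
    return re.sub(EM_DASH + "{2,}", SUPERDASH, text)

if __name__ == "__main__":
    raw = "Dear  Sir ,  I write :  come \u2014  soon ."
    print(regularize_spacing(raw))
    print(regularize_dashes("Stay\u2014\u2014\u2014or go\u2014now"))
```

The decision about whether a given dash is punctuational is left to the encoder, because item 3 makes it depend on the meaning of the dash rather than on its form.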
