Regularization of sizing and spacing, including regularization of vertical and horizontal space and of type size

White space on the printed page is difficult to measure accurately and meaningfully. Its significance, for most purposes, lies not in its exact quantity but in the separation it creates between the textual units on the page, which makes their distinct existence perceptible to the reader. Thus, for encoding purposes, the word and element boundaries captured in the encoded file are usually a sufficient representation of the white space in the source text. Variations from the norm (uneven spacing between words, or exceptional space between paragraphs) may reflect casual variation, sloppy printing, or some compositorial exigency such as an inaccurate estimate of paper or word count; they may also indicate some special textual division or emphasis. Unless you think spacing variations are likely to be significant in some analysis, we recommend that space be regularized following the suggestions below.

  1. We recommend regularizing variable or extraordinary spacing within words (for instance, between the letters of a word in a title). Regularize interword spacing to a single space. In very tightly packed lines, there may be cases where you need to determine whether two words are in fact a single word (particularly in older texts and those where spelling and usage are variable). In a normally spaced line, where two words are extraordinarily close together, we recommend encoding them using sic so as to retain the original information, in case the proximity indicates a special usage that should not be silently regularized. See example 1.
  2. We recommend that vertical spacing (such as the amount of space between text lines, between paragraphs, or between lines in a display page) not be captured except as evidence of a boundary between elements.



Example 1:

<p>Now the coach<sic corr="&blank;"></sic>horses begin to whinny...</p>
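
The recommendation to regularize interword spacing to a single space can be illustrated with a minimal Python sketch. It is not part of these guidelines. The function name `regularize_spacing` is hypothetical, and the sketch assumes it is applied to the transcribed text of a single element, since it also flattens line breaks and tabs. It does not attempt to detect words that are set unusually close together; that judgment remains with the encoder.

```python
import re


def regularize_spacing(text: str) -> str:
    """Collapse each run of whitespace to one space and trim both ends.

    Hypothetical helper for illustration only. It treats spaces, tabs,
    and line breaks alike, so apply it within one element's text.
    """
    return re.sub(r"\s+", " ", text).strip()


# Unevenly spaced transcription of a single line of text.
print(regularize_spacing("Now  the   coach horses\tbegin to whinny..."))
# prints: Now the coach horses begin to whinny...
```

Because the function discards the amount of space it finds, it suits only cases where spacing variation has already been judged not to be significant.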