Down the rabbit-hole
I’m quite pernickety regarding typesetting of continuous text, or ‘body text’. Done well, it eases the path of the author’s words into the reader’s mind; done badly, it can get in the way. Consider the setting of plain narrative; this is about as simple as it gets—but that’s not very simple, as we’ll see.
There are so many elements in play that it’s hard to know where to begin. To reduce them to what I specifically want to talk about here, we’ll assume that appropriate choices of typeface, type size and leading (pronounced like the metal, and meaning line spacing) have been made, and that page size and margin widths have been established.
In passing, and linked to margins: the length of a text line, or measure, is bounded above and below by two practical effects. If it is too long in relation to type size, the reader gets lost trying to find the start of the next line; if it is too short (and the setting is fully justified), the word spacing becomes noticeably uneven from line to line, as is often seen in narrow newspaper columns. These are two good reasons why line lengths in books are what they are.
So we have reduced our task to ‘pouring’ text at a predetermined type size into a series of ‘frames’ on successive pages; this sounds simple enough to be left entirely to the computer, but it’s not if our book is to look even moderately refined.
Typographers talk about the ‘colour’ of the text block on a page taken as a whole, by which they mean how dark or light it seems, and how uniform this effect is across the page. Achieving good colour in most books requires line-end hyphenation: with ragged-right text, where the word spacing will generally be uniform, this helps make the line lengths less unequal; with fully justified text, it minimises the variation in word spacing between one line and another; in both cases, overall appearance is improved. Line-end hyphens are not a visual problem unless they occur on more than two or three successive lines, and this can be controlled by software.
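The check on successive line-end hyphens is easy to picture in code. What follows is only a minimal sketch in Python, not a description of how any layout program does it: the function name `longest_hyphen_run` and the limit `MAX_RUN` are my own inventions, and the input is assumed to be a paragraph already broken into lines.

```python
def longest_hyphen_run(lines):
    """Return the length of the longest run of consecutive lines ending in a hyphen."""
    longest = current = 0
    for line in lines:
        if line.rstrip().endswith("-"):
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


# Hypothetical sample: a paragraph already broken into lines.
sample = [
    "Achieving good colour in most books re-",
    "quires line-end hyphenation, which mini-",
    "mises the variation in word spacing be-",
    "tween one line and another.",
]

MAX_RUN = 2  # assumed house limit; the right value is a matter of taste
print(longest_hyphen_run(sample))            # 3
print(longest_hyphen_run(sample) > MAX_RUN)  # True
```

A real hyphenation engine would refuse to create the third break rather than report it afterwards, but the counting is the same.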
The location on the page of the first or last line of a paragraph can be unsightly. Last lines at the head of a page and first lines at the foot are worth avoiding (though few trade books respect this today) unless, of course, the line is the paragraph. ‘Widows and orphans’ they are called, though there is no consensus regarding which is which. I cheerfully murder them wherever possible as I massage the text into shape. A very short line ending a paragraph (anywhere on the page) is also worth avoiding; this is called a ‘runt’ by some, and one which is shorter than the next paragraph’s first line indent looks particularly odd.
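Spotting these offences, as opposed to curing them, is mechanical. Here is a rough sketch in Python, assuming each paragraph has already been set into lines and every page holds the same number of lines. The name `find_strays`, the labels, and the `min_last_len` threshold (a character count standing in for the indent width) are all hypothetical. Given the lack of consensus, I label the cases by position rather than as widow or orphan.

```python
def find_strays(paragraphs, page_depth, min_last_len=4):
    """Flag stranded first/last lines and runts.

    paragraphs: list of paragraphs, each a list of already-set lines (strings).
    page_depth: number of text lines per page (assumed constant).
    min_last_len: assumed minimum length, in characters, of a last line.
    Returns a list of (kind, paragraph_index, page_number) tuples.
    """
    problems = []
    line_no = 0  # running index of the current line in the whole text flow
    for p, lines in enumerate(paragraphs):
        n = len(lines)
        first, last = line_no, line_no + n - 1
        if n > 1:  # a one-line paragraph is exempt: the line is the paragraph
            if first % page_depth == page_depth - 1:
                problems.append(("first line at foot", p, first // page_depth + 1))
            if last % page_depth == 0:
                problems.append(("last line at head", p, last // page_depth + 1))
            if len(lines[-1].strip()) < min_last_len:
                problems.append(("runt", p, last // page_depth + 1))
        line_no += n
    return problems


# Hypothetical text: three paragraphs flowed onto four-line pages.
full = "x" * 40
paragraphs = [[full] * 3, [full] * 2, [full, full, "it."]]
for problem in find_strays(paragraphs, page_depth=4):
    print(problem)
# ('first line at foot', 1, 1)
# ('last line at head', 1, 2)
# ('runt', 2, 2)
```

The two-line paragraph in the middle is the worst case: it straddles the page break, so it offends at both the foot and the head.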
What are our facilities for eliminating this catalogue of visual offences? Well, one common feature in wordprocessing and page layout programs is called something like ‘keep lines together’, and generally has a default value of two. The software does its best to ensure at least two lines from a paragraph are at the head and foot of each page. I never rely on this because it works by simply reducing the depth of the text on an affected page by one, or even two, lines. The reader may not notice this or be bothered by it, but it generally loses the symmetry of text depth on spreads, and can be obtrusive when there is ‘furniture’ at the foot, such as the page number, to draw attention to the resulting additional space. This measure may still be resorted to manually if absolutely no widows or orphans can be tolerated, and if the technique described below is not completely successful, but we should at least try to keep the number of lines of text the same on facing pages. I think I have used it once.
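Checking that facing pages stay the same depth is simple bookkeeping. This is a small sketch in Python under assumptions of my own: the function name `uneven_spreads` is hypothetical, pages are numbered from 1, and page 1 is a right-hand page with nothing facing it, so the spreads are pages 2 and 3, 4 and 5, and so on.

```python
def uneven_spreads(lines_per_page):
    """Return (verso, recto) page-number pairs whose text depths differ.

    lines_per_page[i] is the number of text lines on page i + 1.
    """
    uneven = []
    # Even-numbered (left-hand) pages face the odd-numbered page that follows.
    for verso in range(2, len(lines_per_page), 2):
        recto = verso + 1
        if lines_per_page[verso - 1] != lines_per_page[recto - 1]:
            uneven.append((verso, recto))
    return uneven


# Hypothetical line counts for pages 1 to 5; page 4 has been cut short by a line.
print(uneven_spreads([30, 32, 32, 31, 32]))  # [(4, 5)]
```

The short last page of a chapter would be flagged too, so a real check would need to leave those out.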
A better approach to the despatching of our waifs and strays is to progressively increase or decrease the word spacing of one or more paragraphs, possibly a page or more in advance of the location, until the problem is eliminated. Page layout software like Adobe InDesign facilitates this. But only certain paragraphs are eligible for this treatment, and you quickly learn to spot them: one to be shortened by a line needs a very short last line, and one to be lengthened needs a long one, otherwise the adjustment required in order to have any effect on line count is excessive; and the candidate must contain enough lines not to be made too tight or loose by the operation. Then, to maintain even colour on a spread, we may want to tighten or loosen other paragraphs on it, but taking care not to alter their number of lines. Unlike the ‘keep lines together’ method, this one does affect text colour, but generally not in a noticeable way.
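The eligibility test can be put as a little arithmetic. This sketch is in Python, and everything in it is an assumption for illustration: the function names, the 15% and 85% thresholds, and the five-line minimum. The second function gives a first-order estimate of how much each word space must be tightened to pull the last line up. It ignores the way line breaks shift as the text reflows, so treat it as an indication only.

```python
def classify_candidate(line_widths, measure, min_lines=5, short=0.15, long=0.85):
    """Say whether a paragraph could plausibly lose or gain a line.

    line_widths: natural width of each set line, in the same units as measure.
    Returns "shorten", "lengthen", or None. All thresholds are assumed values.
    """
    if len(line_widths) < min_lines:
        return None  # too few lines to absorb the change unobtrusively
    fill = line_widths[-1] / measure  # how much of the measure the last line fills
    if fill <= short:
        return "shorten"
    if fill >= long:
        return "lengthen"
    return None


def tightening_per_space(last_line_width, word_spaces):
    """Rough reduction needed per word space to absorb the last line."""
    return last_line_width / word_spaces


measure = 300.0  # hypothetical measure in points
print(classify_candidate([300, 300, 300, 300, 300, 30], measure))   # shorten
print(classify_candidate([300, 300, 300, 300, 300, 270], measure))  # lengthen
print(classify_candidate([300, 300, 300, 280], measure))            # None
```

With those figures, `tightening_per_space(30, 50)` comes to about 0.6 pt per word space, if the five full lines hold fifty word spaces between them. Halve the number of lines and the figure doubles, which is why short paragraphs make poor candidates.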
Having taken these measures we may still be unlucky and spot some unsightly ‘rivers’ of white space flowing down the page; a further small tweak to word spacing will usually fix this without altering the number of lines in a paragraph.
There are some other ways to achieve the above aims, but to my mind they are ‘taboo’. One is to enter judicious soft line breaks here and there to force the text to reflow more pleasingly. This should never be done because it changes the text itself (and in a way unique to this setting), a fact which will come back to bite us when we least expect it and are least able to recall why. Another way is to alter the character spacing, or even the widths of the characters, or glyphs. This looks bad and is an insult to both the reader and the type designer.
Sometimes a widow or orphan will not be eliminated, because in the local context the cure has side-effects worse than the affliction, or simply because there’s a limit to the amount of time it’s worth investing in the task.
I mentioned at the outset that we were dealing with narrative text. This was because it generally has paragraphs of a length where my preferred technique can be applied. With fiction, particularly the chattier kind, it often cannot be, owing to the preponderance of very short paragraphs.
Text massaging to the level I’ve described has never been achieved by software, and I hope I’ve succeeded in indicating why. As a programmer myself I am sometimes tempted to go down rabbit-holes like this one, but I would be motivated only by the challenge and the fun, rather than any business justification. The problem to be solved is even trickier than I think it is.