The Raster Tragedy at Low-Resolution Revisited:
Opportunities and Challenges beyond “Delta-Hinting”

There are four aspects of converting fonts to digital that I consider key to understanding the font rendering process. In this chapter we will discuss these fundamentals in detail, using simple math and plenty of numerical examples. I feel strongly that without developing a healthy intuition for these fundamentals, we would keep groping in the dark, leaving most raster tragedies—and therefore “hinting”—in the realm of mystery.

Scalable fonts define the shape of each character with lines and curves. The lines and curves themselves are defined by control points. Control points are essentially dots on a drawing board, and the lines and curves just “connect the dots.” Lines connect two dots in a straight line while curves use a mathematical formula that defines how the connection should “curve” from one dot to the other. Collectively, the lines and curves are called outlines.

The control points of the lowercase ‘m’ of Times New Roman.
⇒ Hover your mouse over the illustration to “connect the dots.”

The important items here are the control points. These are the parts required to define the entire shape of the character. The rest is taken care of by the math that connects the control points. Moreover, a property of the outlines used in both TrueType and Type 1 fonts allows for easy zooming in or out. All that’s needed to do so is to space all the control points further apart or closer together. Doing so readily yields a larger or smaller character. Its shape remains the same, regardless of whether it is larger or smaller.

Specifically, each control point is composed of two coordinates, the x- and the y-coordinate. The x-coordinate is a number that specifies the exact location of the control point in the horizontal direction (how far to the left or right?). The y-coordinate does the same for the vertical direction (how far up or down?). Spacing all control points further apart or closer together then simply means increasing or decreasing all these numbers by the same percentage. This property makes the outlines scalable, and the percentage is called the scaling factor.
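
To make this concrete, here is a minimal sketch in Python. The control point values and the 2048-unit em are made-up assumptions, but the arithmetic is exactly the “same percentage” just described:

    from fractions import Fraction

    def scale_outline(points, scale):
        """Scale an outline by multiplying every coordinate by the same factor."""
        return [(x * scale, y * scale) for (x, y) in points]

    # Pixels per em = point size * DPI / 72; the scaling factor divides that by
    # the number of design units per em (2048 is a common, but here assumed, value).
    scale = Fraction(12 * 96, 72 * 2048)            # 12 point at 96 DPI -> 1/128

    points = [(250, 0), (250, 939), (409, 1200)]    # made-up control points
    print(scale_outline(points, scale))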

The outline of the lowercase ‘m’ of Times New Roman.
Outlines are scaled by applying the same scaling factor to both the x- and the y-coordinate of every control point.
⇒ Hover your mouse over the illustration to scale the ‘m.’

Let’s assume we know the scaling factor needed to zoom the outlines to whatever size we want (cf and ). Then, to render the character, all that is left to do is to “color in” the outlines. Sounds simple enough, but this is where the problem starts.

In the digital domain the “coloring in” is severely restricted. It is as if we could apply the “coloring pen” only in a few discrete spots on the drawing board, and with each application we get a big square of ink.

The outline of the lowercase ‘m’ of Times New Roman, scaled to 12 point at 96 DPI, and about to be sampled (i.e. “colored in”).
⇒ Hover your mouse over the illustration to sample the outline.

It is an all-or-nothing process. Each time the pen is applied, it will light an entire square. Each time it isn’t applied, it will leave the square blank. But there is nothing in-between. There are no intermediate positions on which to apply the pen, nor are there different pens that produce smaller squares.

The correct technical term for this process is sampling. At discrete spots, spaced by regular intervals in both x- and y-direction, the process tests if a particular spot is “inside” the outline, or “outside.” Since we’re not using different pens on different parts of the character, it doesn’t matter how deep inside or how far outside it is. It’s either “in” or “out”—period.

Accordingly, the sampling process assigns the value of 1 to each “inside,” while each “outside” gets 0. The technical term for these values is samples. The distance or rate at which they are spaced is called the sampling rate.
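
For the curious, here is a toy version of that all-or-nothing test in Python. The outline is just a made-up triangle, and the even-odd “inside” test is a simplifying assumption rather than the actual rasterizer, but it shows the sampling: one test per pixel center, 1 for “in,” 0 for “out”:

    def inside(px, py, outline):
        """Even-odd rule: count edge crossings of a ray running to the right."""
        crossings = 0
        n = len(outline)
        for i in range(n):
            (x0, y0), (x1, y1) = outline[i], outline[(i + 1) % n]
            if (y0 > py) != (y1 > py):                       # edge spans the ray
                x_at = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
                if x_at > px:
                    crossings += 1
        return crossings % 2 == 1

    outline = [(0.5, 0.5), (7.5, 0.5), (4.0, 6.5)]           # a made-up triangle, in pixels
    for row in range(7, -1, -1):                             # top row first
        print("".join("#" if inside(col + 0.5, row + 0.5, outline) else "."
                      for col in range(8)))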

Except of course that despite all these technical terms, the sampled character looks nothing like the timeless elegance of the original outlines. True! But that’s not because there is anything wrong with the sampling process nor its terminology. We simply haven’t fulfilled the necessary prerequisites to satisfy the Sampling Theorem.

The sampling theorem is the theory behind just about anything digital, particularly behind the round trip from analog to digital and back. The theorem says that in theory this round trip is possible without any loss if—if and only if—the sampling rate is at least twice the rate corresponding to the smallest feature to be sampled (Nyquist Limit, or Nyquist Frequency).

For instance, when a violin is recorded, the recording device will choose a sampling rate of 44.1 kHz or 44,100 cycles per second (or higher). Never mind the somewhat odd number; this is the standard that was agreed upon for CDs some 30 years ago. What’s important is that, since the young human ear can hear “features” or harmonics up to about 20 kHz or 20,000 cycles per second, a sampling rate of 44.1 kHz satisfies the sampling theorem: 44,100 is more than twice 20,000. At least in theory this should work.

Unfortunately, we don’t get to choose the sampling rate. The pixels that represent our samples on our screens have already been “chosen.” The “chosen” sampling rate is the number of pixels or dots per inch (DPI for short). Typical computer screens today have a resolution of somewhere between 96 DPI and 120 DPI, which is way too low to satisfy the sampling theorem.

To get an idea how far off we are, let’s have another look at the above ‘m.’ This time we’ll scale it to be sampled at 8 pt on a 96 DPI screen, which represents somewhat of a worst-case scenario. At this small size and low resolution, the serifs measure 3/16 of a pixel across. This is just about the smallest feature on the ‘m.’

The outline of a lowercase serif of Times New Roman, scaled to 8 point at 96 DPI. The serif measures 3/16 of a pixel across.

Now, the theory says that the sampling rate should be such that this feature measures at least 2 pixels. Hence in our case the sampling rate is off by a factor of 2 divided by 3/16, for which I get 10 2/3. In other words, the sampling rate should be 10 2/3 times the 96 DPI we’re stuck with, or 1024 DPI!
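
In Python, the same back-of-the-envelope calculation looks like this (nothing here is new; it merely replays the numbers above):

    from fractions import Fraction

    smallest_feature = Fraction(3, 16)    # the serif: 3/16 of a pixel at 8 pt, 96 DPI
    required_pixels  = 2                  # the sampling theorem asks for at least 2
    factor = required_pixels / smallest_feature
    print(factor)                         # 32/3, i.e. 10 2/3
    print(factor * 96)                    # 1024 (DPI needed to satisfy the theorem)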

Times New Roman, lc ‘m,’ rendered at 8 point and 96 DPI.
⇒ Hover your mouse over the illustration for 8 point and 1024 DPI.

At 8 point and 96 DPI, we’re lucky to get anything remotely resembling a lowercase ‘m.’ Even when sampled at 1024 DPI, the ‘m’ continues to look like a bit of a “digital caricature” of the analog original.

The trained eye will notice that it could use some “hinting” to render the three equal stems with equal pixel counts, and some font “smoothing” to soften the “jaggies.” We’ll get to both of these. In fact, since we really don’t get to choose the sampling rate, and since we’re way below the minimum sampling rate, we’ll discuss a lot of “hinting” and “smoothing.”

By now, we understand the process of rendering fonts on a digital device as a sampling process. However, we don’t get to choose the sampling rate, and the one we’re stuck with falls short of the minimum required to sample the smallest features of the fonts. Hence we will have to find workarounds to make this shortcoming more tolerable. The keywords here are workarounds and tolerable. Workarounds aren’t real solutions, and they may not be equally tolerable to everybody.

Recall the outlines are considered scalable. Scalability without any restrictions or constraints is the ultimate goal, because this makes text truly portable (device independent). But the unconstrained scaling led to features that were too small to be sampled. Hence let’s try to somehow constrain the scaling mechanism in such a way as to prevent select features from becoming too small.

The outline of the uppercase ‘H’ of Times New Roman, sampled at 11 point and 96 DPI. The coarse sampling “misses” the crossbar and several serifs.
⇒ Hover your mouse over the illustration to constrain the outline. With the constrained outline, coarse sampling no longer “misses” any parts.

In the above illustration, I have constrained the scaling mechanism. I made a deliberate choice that, no matter how small the outline is scaled, there shall always be a minimum distance of one pixel between pairs of control points suitable to define the size of a feature. This choice is a compromise: It doesn’t increase the smallest features enough to satisfy the sampling theorem, but it gets us at least one pixel where the theory would have asked for two. Moreover, it gets us one pixel for both the crossbar and the serifs.
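
Here is a minimal sketch of such a constraint in Python. The coordinates are made-up, and growing the feature symmetrically around its center is my simplification; a real “hinting” program gets to pick which edge stays put:

    def constrain_min_distance(lo, hi, min_dist=1.0):
        """Keep two feature-defining coordinates at least min_dist pixels apart,
        growing the feature around its center if necessary."""
        if hi - lo < min_dist:
            center = (lo + hi) / 2
            lo, hi = center - min_dist / 2, center + min_dist / 2
        return lo, hi

    # A crossbar whose edges scaled down to 0.625 pixels apart gets 1 full pixel:
    print(constrain_min_distance(4.25, 4.875))    # (4.0625, 5.0625)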

Times New Roman, UC ‘H,’ rendered at 8 to 24 point and 96 DPI. The coarse sampling “misses” several crossbars and many serifs.
⇒ Hover your mouse over the illustration to constrain the rendition with a minimum distance of 1 pixel. With this constraint, coarse sampling no longer “misses” any parts.

The sampled characters are still a mere effigy of the original, but at least they look like a serifed ‘H.’ The unconstrained sampling would have missed several crossbars, rendering the ‘H’ as a pair of ‘I.’ This is intolerable. Likewise, unconstrained sampling would have missed many serifs. This may or may not be intolerable. If space gets tight, it may be preferable not to render some of the “inner” serifs at all.

Readers familiar with TrueType may object that I forgot to turn on drop-out control. Drop-out control is a stopgap measure that inserts pixels deemed missing as a result of coarse sampling. However, at the level of pixels, it is not always obvious if a pixel is missing. Likewise, once determined missing, it is not always obvious where to insert the pixel. This may explain why TrueType offers several types of drop-out control.

Times New Roman, UC ‘H,’ rendered at 11 point and 96 DPI, using TrueType’s drop-out control feature. From left to right, the selected SCANTYPE[] is 1, 4, and 5, respectively.
⇒ Hover your mouse over the illustration to turn drop-out control off.

Notice how different types result in different positioning of the crossbar and different lengths of the serifs. Accordingly, I simply left drop-out control off, in favor of a method that lets me decide which features I want to caricature, and how.

With the scaling mechanism constrained the way I have so far, I can assert that sampling will render every feature with at least one pixel. But, as the previous illustrations made painfully obvious, sampling will also render equal features with seemingly random pixel counts. For instance, in the examples below, the two stems are seemingly randomly sampled with one or two pixels.

Times New Roman, UC ‘H,’ 10 to 12 point at 96 DPI, constrained to render all parts of the character with a minimum distance of 1 pixel. While this doesn’t “miss” any crossbars or serifs, it renders substantially equal stems and serifs with seemingly random pixel counts.

If the scaling mechanism were truly scalable, shouldn’t it scale like features down to like sizes? It certainly should, and it actually does. The lack of scalability is not caused by the scaling operation per se, at least in theory, but by the sampling. Once sampled, it looks as if the samples had been produced from a set of outlines wherein the stems’ edges have been constrained to the nearest sample boundaries.

Times New Roman, UC ‘H,’ 12 point at 96 DPI, constrained to be rendered with a minimum distance of 1 pixel as above. Additionally, the edges of the (vertical) stems have been rounded to the nearest pixel boundary.
⇒ Hover your mouse over the illustration to “un-round” the stems' edges. Notice how both outlines yield the exact same set of pixels.

The technical term for this phenomenon is aliasing. Two equally sized features sample to unequal sets of samples, or vice-versa. According to the theory, this is to be expected, because our sampling rate is (still) way too small. We will have to find another workaround. And no, the workaround is not anti-aliasing just yet. I feel strongly that without understanding the all-or-nothing process of “on” or “off” pixels it will be a lot harder to understand pixels with intermediate shades of gray or color.

So, let’s have a closer look at the above ‘H’ at 12 pt and 96 DPI. Specifically, let’s have a look at the x-coordinates of the control points on either side of the left and right stem. I determine the following x-coordinates:

                        Left stem                    Right stem
                  left edge    right edge      left edge    right edge
identifier           xLl          xLr             xRl          xRr
x-coordinate
(in pixels)        1 50/64      3 18/64         8 13/64      9 45/64
stem width
(in pixels)              1 32/64                      1 32/64

Times New Roman, UC ‘H,’ 12 pt at 96 DPI, x-coordinates of left and right edge of both stems, along with their widths.

TrueType maintains scaled control points to a precision of the nearest 1/64 of a pixel (cf ), hence all the fractions. Different scalable font formats may use different number representations, but the following argument will be the same.
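
If you want to play with these numbers yourself, the 1/64 fractions are easy to mimic in Python. The identifiers below are mine; TrueType itself stores the coordinates as plain integers counting 1/64ths of a pixel (the so-called 26.6 fixed-point format):

    from fractions import Fraction

    F26DOT6 = Fraction(1, 64)              # TrueType's unit: 1/64 of a pixel

    x_Ll = 114 * F26DOT6                   # 1 50/64 pixels (stored as the integer 114)
    x_Lr = 210 * F26DOT6                   # 3 18/64 pixels (stored as the integer 210)
    print(x_Lr - x_Ll)                     # 3/2, i.e. 96/64 = 1 32/64 pixels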

If we look at the widths of the stems, by subtracting the x-coordinates of the left edges from the respective right edges, we get 1 32/64 pixel for either stem. For instance, for the left stem

3 18/64 − 1 50/64 = 210/64 − 114/64 = 96/64 = 1 32/64

and analogously for the right stem. In other words, for the left stem the following equation holds

xLr − xLl = w

or, putting left edge plus width equals right edge,

xLl + w = xLr

and likewise

xRl + w = xRr

for the right stem. So far so good, but despite all the math we haven’t learned anything new. To do so we repeat the above exercise, except that this time around we will use the aliased x-coordinates that look like they have been constrained or rounded to the nearest sample boundary. Notice that in order to distinguish this from the previous exercise, I’ve tagged all the identifiers with a prime (′).

                        Left stem                    Right stem
                  left edge    right edge      left edge    right edge
identifier          x′Ll         x′Lr            x′Rl         x′Rr
aliased
x-coordinate
(in pixels)           2            3               8            10
aliased
stem width
(in pixels)                 1                            2

Times New Roman, UC ‘H,’ 12 pt at 96 DPI, aliased or rounded x-coordinates of left and right edge of both stems, along with their widths.

With the rounded numbers, the widths of the stems (aliased stem widths) are no longer the same. More formally

x′Ll + wx′Lr
x′Rl + wx′Rr

More math, more numbers, but what have we learned? We have learned that scaling by itself is benign, but scaling plus sampling or rounding causes things to no longer add up!

The true width w of the stems is 1 1/2 pixels, but sampling produces either 1 or 2 pixels. Since we do not get to choose the sampling rate, we do not get to choose the pixel size, either. Hence we will have to make up our minds as to what we consider the “real” true width by rounding both stem widths to the same size w′. This will be either 1 or 2 pixels. If we choose w′ = 1, then the first equation will add up, but the second won’t. If instead we choose w′ = 2, then the second equation will add up, but not the first.
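
Here is that dilemma spelled out in a few lines of Python, using the aliased coordinates from the table above:

    x_Ll, x_Lr, x_Rl, x_Rr = 2, 3, 8, 10          # aliased x-coordinates, in pixels

    for w in (1, 2):
        left_ok  = (x_Ll + w == x_Lr)             # does the left stem add up?
        right_ok = (x_Rl + w == x_Rr)             # does the right stem add up?
        print(f"w' = {w}: left stem {left_ok}, right stem {right_ok}")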

Generally, and this is the core point of the argument, we will encounter many situations where two unrounded coordinates add up

a + b = c

but the corresponding rounded coordinates

a′ + b′c′

do not add up. Formally

f(a) + f(b) ≠ f(c)

where f represents the rounding function, and the parentheses () mean that f applies to everything between those parentheses. Since we know that the unrounded coordinates do add up, we can substitute c = a + b

f(a) + f(b) ≠ f(a + b)

on the right hand side. This tells us that individually rounding a and b and then adding them up is not the same as adding up a and b and then rounding the sum. Math calls such a function non-linear. Conversely, to be linear, it has to satisfy

f(a + b) = f(a) + f(b)

for any pair of numbers a and b. It is not good enough to satisfy the above equation sometimes—it has to do so always—for any values of a and b.
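
Two throw-away lines of Python make the point; the values 1.3 and 1.4 are arbitrary, any pair whose fractions “conspire” will do:

    a, b = 1.3, 1.4
    print(round(a) + round(b))    # 1 + 1 = 2
    print(round(a + b))           # round(2.7) = 3: not the same!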

Even more math, equations, parentheses—really!—why does this matter? It matters because it shows how rounding or sampling makes font rendering non-linear. Once it is non-linear, it is no longer scalable. The two concepts linearity and scalability are equivalent here.

The startling conclusion is that rounding or sampling makes the allegedly scalable font format non-scalable!

Not that we have rounded anything in the first place, or at least not explicitly so. It’s the pixels, produced by the sampling, that make it look as if some things had been rounded, and rather carelessly so. Consequently, we will have to become very careful just exactly what we round, and when.

Specifically, we will look for a workaround that—at least partially—restores the linearity lost in the sampling process, as illustrated below.

Times New Roman, UC ‘H,’ 12 point at 96 DPI, and constrained to render with a minimum distance of 1 pixel as above.
⇒ Hover your mouse over the illustration to constrain the outline some more. The additional constraints no longer render substantially equal parts with unequal pixel counts.

In the above illustration, I have constrained the scaling mechanism some more. Again, I made a couple of deliberate choices. First, for both stems I chose to round the x-coordinate of one of their edges to the nearest sample boundary. Next, I computed a value for the rounded stem width. Finally, I defined the x-coordinate of the other edges as the sum of the previously rounded first edge plus (or minus) the rounded stem width. Thus, formally, I get

x″Lr = x′Ll + w′
x″Rl = x′Rrw′

which defines the right edge of the left stem x″Lr as the sum of the left edge of the left stem x′Ll plus the width w′. Likewise, it defines the left edge of the right stem by subtracting the width from its right edge. Notice the double prime (″) e.g. on x″Lr. I used this to distinguish it from x′Lr. They are not the same—that’s the whole point of the argument!
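
In Python, with the unrounded coordinates from the first table, the additional constraints look like this. The rounding function (Python’s round(), which rounds ties to even) is my assumption; which rounding rule to use is itself one of the deliberate choices discussed here:

    from fractions import Fraction

    x_Ll, x_Lr = Fraction(114, 64), Fraction(210, 64)   # left stem:  1 50/64, 3 18/64
    x_Rl, x_Rr = Fraction(525, 64), Fraction(621, 64)   # right stem: 8 13/64, 9 45/64

    w_prime = round(x_Lr - x_Ll)       # rounded stem width: round(1 32/64) = 2
    xp_Ll   = round(x_Ll)              # round one edge of the left stem:   2
    xp_Rr   = round(x_Rr)              # round one edge of the right stem: 10
    xpp_Lr  = xp_Ll + w_prime          # derive the other edge:  2 + 2 = 4
    xpp_Rl  = xp_Rr - w_prime          # derive the other edge: 10 - 2 = 8

    print(xpp_Lr - xp_Ll, xp_Rr - xpp_Rl)   # both stems now 2 pixels wide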

Notice that the nature of this argument is very general. It doesn’t apply to stem widths only. Rather, it applies any time we want equal parts of a character to be rendered with equal pixel counts. For instance, in the above illustration, the serifs have substantially equal dimensions, and hence we would expect them to be rendered with substantially equal sets of pixels.

Accordingly, this illustration has been generated by properly constraining both stems, the crossbar, and all serifs. Here is what this looks like for the two preceding examples:

Times New Roman, UC ‘H,’ 10 to 12 point at 96 DPI, constrained to render substantially equal parts with equal pixel counts.
⇒ Hover your mouse over the illustration to see what this looks like without these constraints.

Times New Roman, UC ‘H,’ 8 to 24 point at 96 DPI, constrained to render substantially equal parts with equal pixel counts.
⇒ Hover your mouse over the illustration to see what this looks like without these constraints.

The choices or priorities of the constrained scaling mechanism have to be made very carefully. Prioritizing the wrong edge of a stem or crossbar can cause the positioning of the entire stem to be off. This is most easily noticed on a crossbar when it ends up rendered below center while it was designed above (cf ).

Prioritizing the wrong edges of two stems can cause the black-body width of a character to be rendered too wide or too narrow. For a given advance width, this can throw off the apparent side-bearing spaces (cf ). Conversely, prioritizing black-body width and side-bearing spaces can throw off the rendered advance width and in turn the scalability of text layout (cf ). We’ll see examples for all of the above in the following chapters.

Let’s recap this now: Sampling causes font rendering to become non-scalable. Sampled features no longer add up. To prevent this from happening, I have constrained the scaling mechanism. Doing so involves prioritizing which feature(s) must remain scalable, and in turn, which feature(s) may get compromised. With the above caveats in mind, this partially restores the scalability of font rendering.

By the end of the previous section, I had constrained the scaling mechanism so that features too small to be sampled are still rendered, and so that equal features are sampled with equal pixel counts. The reason behind this is to restore the scalability of font rendering lost in the sampling process—or at least restore it to the degree that this is possible at all. Ultimately, of course, we’ll always wind up with pixels or samples, which make the overall process inherently non-scalable.

But before we reach that point there are a few more workarounds to make font rendering “as scalable as possible.” Recall that math calls a function linear if it satisfies

f(a + b) = f(a) + f(b)

for any pair of numbers a and b. Strictly speaking, this is merely a special case. In general, to be considered linear, said function must satisfy

f(λ·a + μ·b) = λ·f(a) + μ·f(b)

for any pair of numbers a and b and for any pair of proportionality factors λ and μ. To see what this tells us in terms of scaling and constraining fonts, I’ll discuss two sets of proportionality factors next.

For the first set of proportionality factors, I’ll consider the (black body) width of a glyph vs. its height, or the weight of the (horizontal) crossbars vs. the weight of the (vertical) stems. Both situations have in common that they can be described by a simple proportion. The width of the glyph is so-and-so many percent of the height. The weight of the crossbar is a percentage of the weight of the stem. It may be 100% for a monoline font like Courier, or way less than 100% for a font like Bodoni.

Formally

c = λ·a

where a is e.g. the height of the glyph, c is its (black body) width, and λ the “width-to-height ratio.” Now, to be linear, math requires that

f(λ·a) = λ·f(a)

which I get from the general case by putting b = μ = 0 (it’s really any number, so I can use b = 0, and any proportionality factor, so I can again use μ = 0).

This tells us that if things were truly scalable, I should be able to take a number a, “shrink” it by a factor λ, and round it, and that this should be the same as taking that same number, round it, and then “shrink” it. In other words, I should be able to swap the order of rounding and “shrinking” and the outcome should be the same.

Not so! Sampling is non-linear, rounding is non-scalable! This explains why we may see a font like Verdana rendered with a stroke design contrast as if at some sizes it were Optima (cf ), or glyphs like a lc ‘o’ seemingly randomly rendered as “square” even though it is clearly designed “non-square” (cf ).
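
A small numeric sketch of what goes wrong; the 80% crossbar-to-stem ratio and the 1.8 pixel stem are made-up numbers:

    lam  = 0.8           # crossbar designed at 80% of the stem weight (assumed)
    stem = 1.8           # scaled stem width in pixels (assumed)

    print(round(lam * stem))    # f(lambda*a): round(1.44) = 1
    print(lam * round(stem))    # lambda*f(a): 0.8 * 2    = 1.6  -- not the same
    # Rendered in whole pixels the stem gets 2 and the crossbar 1:
    # a designed 80% contrast turns into a rendered 50% contrast.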

Like I said before, we have to be very careful just exactly what we round, and when we do so. Accordingly, if it is a priority to render the contrast between horizontal and vertical strokes or the proportions between widths and heights of glyphs as designed, then we will have to constrain the scaling mechanism accordingly. Not doing so may or may not represent the designer’s intent.

For the second set of proportionality factors, I’ll consider the under- or overshoots of round glyphs vs. the baseline and cap height (or x-height), or the weight of (vertical) round strokes vs. the weight of (vertical) straight strokes. Informally speaking, both situations have in common that one of the numbers is “a little bit more” than the other number.

For instance, the top of the UC ‘O’ is a little bit above the top of the UC ‘H.’ Likewise, unless it’s a monoline font, the widest part of the (vertical) round stroke of the UC ‘O’ is a little wider than that of the UC ‘H.’ More formally

c = a + Δa

where a is e.g. the top of the UC ‘H’ (the cap height), c is the top of the UC ‘O,’ and Δa is the small amount by which the round glyph overshoots the cap height. To be scalable, math requires that

f(a + Δa) = f(a) + f(Δa)

which I get from the general case by choosing b = Δa and λ = μ = 1. This tells us that I should be able to take a number a, add “a little bit” Δa, and round the result, and that this should be the same as taking that same number, round it, and then add “a rounded little bit.”

Chances are that at small type sizes and device resolutions said “little bit” will have rounded to nothing. Hence, adding it to anything else will add nothing. For instance, if the cap height is 10 pixels, and the overshoot is about 2% of the cap height, Δa amounts to 0.2 pixels, which rounds down to 0. This is good, because if we have only 10 pixels for the cap height, there is no way we can render a 2% overshoot.

However, to get this, once again we have to be very careful just exactly when and what we round. Let’s say the cap height was 10.4 pixels before rounding, and thus the 2% overshoot amounts to 0.208 pixels. Adding the overshoot to the cap height we get 10.608 pixels for the top of the UC ‘O.’ Now round the cap height, and separately round the top of the UC ‘O,’ and one rounds to 10 while the other rounds to 11 pixels.

Clearly, this is not good. 11 vs. 10 pixels renders a 10% overshoot. Most likely, this does not represent the designer’s intent. So, at the risk of repeating myself, we really have to be very careful when and what we round.
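
The same numbers, replayed in Python; nothing is assumed here beyond the 10.4 pixel cap height and the 2% overshoot used above:

    cap_height = 10.4                    # scaled cap height, in pixels
    overshoot  = 0.02 * cap_height       # 0.208 pixels
    top_of_O   = cap_height + overshoot  # 10.608 pixels

    print(round(cap_height))             # 10
    print(round(top_of_O))               # 11: rounded separately, a 10% overshoot
    print(round(cap_height) + round(overshoot))   # 10: round the "little bit" first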

The undershoots vs. the baseline, and similarly the weight of (horizontal) round strokes vs. the weight of (horizontal) straight strokes should be the same math, apart from a negative sign

f(a − Δa) = f(a) − f(Δa)

which reduces to the previous equation by choosing μ = −1. After all, subtracting 2% should be the same as adding “negative 2%,” right?

Well… this depends just exactly what the rounding operation does. Let’s assume it rounds up for .5 and above, else down. Let’s assume that the cap height is 25 pixels, hence our 2% overshoot winds up at 25.5 pixels, and accordingly our 2% undershoot winds up at −0.5 pixels. Strictly speaking, rounding up would take this to 26 and 0 pixels, respectively. The result is that the overshoot shows up, while the undershoot doesn’t!

This may or may not be what you wanted. You may have expected the undershoot to round “up” to −1. If you are surprised at all that −0.5 would round up to 0, repeat the above exercise for the “baseline” and “x-height” of superiors. Now both numbers are positive and, if they are “something-point-five,” they will round up!
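
To see the sign dependence in code, the “rounds up for .5 and above” rule has to be spelled out, because Python’s built-in round() uses a different tie-breaking rule:

    import math

    def round_half_up(x):
        """Round .5 and above up, everything else down."""
        return math.floor(x + 0.5)

    cap_height = 25                              # pixels
    delta      = 0.5                             # 2% of the 25 pixel cap height

    print(round_half_up(cap_height + delta))     # 26: the overshoot shows up
    print(round_half_up(0 - delta))              # 0:  the undershoot does not
    print(round_half_up(0 + delta))              # 1:  same magnitude, positive, rounds up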

Consequently, beyond being very careful about what and when we round, we must know the exact nature of the rounding operation in the first place! This knowledge will have to be respected in the way we constrain the scaling mechanism, if the symmetry of under- and overshoots or similar situations is at all a priority. Absence of symmetry may or may not represent the designer’s intent.

By now, we have pretty much covered the simpler cases of restoring scalability of font rendering. We have seen how we get small features to render at all, how we get equal stems and crossbars to render equally, how we may have to tackle various proportions, and how we might have to work on the symmetry of under- and overshoots. Much like the math behind the under- and overshoots, we may have figured out how to have the vertical round strokes of the UC ‘O’ render a little “thicker” than the vertical straight strokes of the UC ‘H.’

In practice, though, things may not be quite as simple as separating the round from the straight strokes. For instance, the vertical round part of the top bowl of a UC ‘B’ may not be quite as “thick” as the bottom one. The vertical round part of the UC ‘S’ may in fact be thinner than the vertical straight strokes of the UC ‘H.’ Yet it would seem reasonable to assume that rendering like strokes with like numbers of pixels or samples would best represent the intent of the design.

Moreover, a font may have been converted from another format. For instance, a font may have been digitized in Ikarus but now is printed in PostScript, or it may have been designed in Fontographer but now is displayed in TrueType. Among other things, these font format conversions will have to scale the outlines between the different internal resolutions (or “em-squares”) used by the respective formats.

We have already seen what happens with equal stems when they are scaled. They may end up unequal, albeit by merely one design unit of the targeted format. In other words, 2 stems may have been designed with “dead-on” equal weights in one font format, but after conversion to another font format they become unequal.

Accordingly, trying to “stack-rank” the various stroke weights, and asserting that they render “a little thicker” if they appear to be designed “a little thicker,” and recursively throughout the entire (!) ranking, might get a bit tedious, to say the least.

Instead, after systematically inspecting all occurring stroke weights, we may identify clusters of “substantially identical” stroke weights which should render with “substantially identical” pixel or sample counts. Following is an example:

Histogram of the (vertical) stem weights of all uppercase characters of Times New Roman. The horizontal axis represents the weights of these stems, while the vertical axis illustrates how often they occur.

In the above histogram of the (vertical) stem weights of all uppercase characters of Times New Roman it is fairly easy to see 3 clusters of “substantially identical” stem weights, labeled ‘A,’ ‘B,’ and ‘C.’ They represent the following sets of stems:

In contrast to , instead of computing a rounded stroke weight from 2 equal stroke weights as input, we use all stroke weights of the same cluster as input to compute a rounded stroke weight. Subsequently, we use this stroke weight as “surrogate” whenever applicable. Specifically, when constraining the scaling of a stroke, we will use the corresponding surrogate instead of the actual stroke weight—the type size and resolution permitting.

The above example yields a number of surrogates small enough to apply either of the previous strategies for restoring scalability to the surrogates, instead of the individual stroke weights. For instance, cluster C could be “tied” to cluster A by “proportional scalability” (cf ), while cluster B might be “tied” to cluster A by “relative scalability” (cf ). Within a cluster, the individual stroke weights will be constrained “by association”—again the type size and resolution permitting.
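
As a rough sketch of the idea (not the actual analysis behind the histogram), clustering could be as simple as grouping weights that lie within a small tolerance of each other and averaging each group into its surrogate; the weights and the tolerance below are made-up design units:

    def cluster_weights(weights, tolerance=8):
        """Group sorted weights: a weight joins the current cluster if it lies
        within `tolerance` design units of the cluster's last member."""
        clusters = []
        for w in sorted(weights):
            if clusters and w - clusters[-1][-1] <= tolerance:
                clusters[-1].append(w)            # close enough: same cluster
            else:
                clusters.append([w])              # start a new cluster
        return clusters

    weights = [196, 199, 201, 204, 265, 268, 271, 312, 315]   # made-up design units
    for members in cluster_weights(weights):
        surrogate = round(sum(members) / len(members))        # one weight per cluster
        print(members, "->", surrogate)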

Notice that the above example only considers vertical strokes of uppercase characters. Horizontal and diagonal strokes have been omitted to make it easier to understand the argument. In practice, we would have to factor in horizontal and diagonal strokes, of course—whether by inclusion in the same set of clusters or in separate sets of clusters.

For the same reason lowercase characters and other groups of characters have been excluded. In practice, we would have to factor in all groups of characters, along with their “ties” within and across character group boundaries (cf also ). However, this adds a degree of complexity which is beyond the scope of this website.

Up until now, we have looked at increasingly intricate situations where we somehow managed to restore a certain degree of scalability of the font rendering process. This has worked—or sort of worked—because we were operating within the limited precision of pixels or samples, because very few features were involved (straight stem vs. round stem, cap height vs. overshoot), and/or because we have traded scalability of some features at the expense of others.

For instance, if full pixels are all we have to represent a 1.4 pixel wide straight stem, then a 1 pixel wide rendition of that stem is as precise as it gets. Not particularly precise, but the best we can do in that context. Likewise, rendering a 1.6 pixel wide round stem with 1 pixel only may be “relatively” precise in the sense that it is our best shot at making it “a little wider” than the straight stem. The most precise way to approximate this “little bit wider” is to render this difference with 0 pixels.

However, by rendering the 1.6 pixel wide round stem with a single pixel only, we have accumulated a rounding error of 0.6 pixels. Outside of the above context, this would be bad, because it is not even as good as the limited precision of full pixels would allow. But within the context of trying to make the round stem “a little wider” than the straight stem, this may still be tolerable. The keywords here are context and tolerable.

Let’s repeat the above exercise assuming we had a chance to render detail with the precision of quarter pixels (cf ) instead of full pixels. Accordingly, instead of deciding whether the 1.4 pixel wide straight stem should be 1 or possibly 2 pixels wide, we can now pick “intermediate” values like 1.25 or 1.5 pixels. Of these two, the closer rendition would be 1.5 pixels. Likewise, the “little bit” by which the round stroke exceeds the straight stroke can be rendered by 0.25 pixels. This would make the round stroke a total of 1.75 pixels wide.

Yet, even with the precision of quarter pixels, we have again accumulated a rounding error. 1.75 pixels are not the same as 1.6 pixels. While the rounding error has diminished to 0.15, outside of the above context we could have done better still: 1.6 is closer to 1.5 than to 1.75. But that would not have allowed us to render it a “little bit wider” than the straight stem. Hence, once more, in this context the larger error may be more tolerable.
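
Once more in Python, with exact fractions so the rounding errors are plainly visible; the 1.4 and 1.6 pixel widths are the ones from the example above:

    from fractions import Fraction

    def round_to(value, step):
        """Round to the nearest multiple of step (e.g. 1/4 pixel)."""
        return round(value / step) * step

    quarter  = Fraction(1, 4)
    straight = Fraction(7, 5)              # the 1.4 pixel straight stem
    round_st = Fraction(8, 5)              # the 1.6 pixel round stem

    q_straight = round_to(straight, quarter)    # 1 1/2 pixels
    q_round    = q_straight + quarter           # 1 3/4: forced "a little bit wider"

    print(q_round - round_st)                   # 3/20 = 0.15 pixels of error
    print(round_to(round_st, quarter) - round_st)   # -1/10: 1.5 alone would be closer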

We could repeat this exercise with increasingly higher rates of fictitious precision. Eventually, we may find that, at least for this particular example, we wind up with no rounding error whatsoever. There are two ways to get there: sheer chance or infinitely high precision (i.e. infinitely small pixels or samples). Either case would be called exact. Anything less is at best precise.

Granted, this may sound like splitting hairs, but precise is not the same as exact. Depending upon the degree of precision, it may be close, very close, or not so close, but unless it is “dead-on,” it is not exact. That last bit of precision, however small the error may be, can sometimes make all the difference.

To give you an idea, I have come across control points that were off by 1/5 of 1/64 of a pixel (that is, 1/320 of a pixel), yet this presumably invisible error got me an extra pixel on an italic stroke! Talk about Raster Tragedy…

It seems as if—no matter how precise we get—there is always a residual chance for error. That’s the simple reality! Thus, ultimately, digital fonts will always remain non-scalable. However, that’s neither a good enough reason to give up on scalability altogether, nor to systematically ignore rounding errors. The engineering challenge is to determine when we can afford a little error here and there, and when we should try to avoid these little errors at all costs.

This gets me back to the keywords context and tolerable. In the context of a single character, rendered in “black-and-white” pixels, it may very well be tolerable to “squeeze” the above round stroke into a single pixel. It may be equally tolerable that the advance width of said character may be off by 0.5 pixel, or even more.

Think about trying to render equal stems, equal counter-forms, and substantially equal left and right side-bearing spaces on a lc ‘m,’ and you’ll get the idea why more than 0.5 pixels may (have to) be tolerable (cf lc ‘m’ illustrated in ).

Conversely, in the context of multiple characters, these little rounding errors can add up, and very quickly so. Legend has it that in the past, clever—if somewhat criminal—computer programmers working at large banking institutions have become rich by accumulating similar little rounding errors. They simply transferred all the round-offs to their own bank accounts…

Clearly, this is not tolerable, albeit in a slightly different context. In the context of “zoomable” applications and scalable operating systems, the accumulated round-offs may not have quite that dramatic a side-effect. But they may and will (!) add up (cf text layout example illustrated at the end of ). Hence it will be necessary to keep a keen eye firmly on the “big picture.”

*****

To my understanding, the above presentation is the simplest way to explain the sampling process of digital fonts. If I tried to make it any simpler, taking out the math, the numbers, and the hair-splitting for good measure, the explanation would remain vague. Hand waving comes to mind. We would continue to grope in the dark, like trying to grapple with the sampling beast blindfolded. But with the “nuts and bolts” understood, the following plain English summary of the four fundamentals shouldn’t sound vague anymore:

  1. There are not enough pixels and the ones we have are too large.
  2. We have to exaggerate features to render them with such pixels.
  3. We have to distort some features to force others to be tolerable.
  4. All these workarounds don’t work equally well in all contexts.

We will explore the potential of “smaller pixels” in the next two chapters, with plenty of illustrations on how to exploit this to reduce exaggerations and distortions. Later, we will discuss different kinds of workarounds in different contexts, again with many illustrations. Hopefully, altogether, this will help to develop a healthy intuition for the cause of a wide range of Raster Tragedies.
