The Raster Tragedy at Low-Resolution Revisited:
Opportunities and Challenges beyond “Delta-Hinting”

In this chapter we will finally get to discuss the important questions. We have established the fundamentals, a range of opportunities, and a number of challenges. With this “infrastructure” we now should be ready to reason with reason.

CAUTION: Many illustrations in this chapter are original size renditions of characters. They have been prepared for an LCD screen with a vertical RGB sub-pixel structure. To view these illustrations the way they were intended to be seen, please be sure to double-check your computer’s settings per the check-list introduced in , if you haven’t done so already.

“Hinting” is a font industry term that loosely refers to “improving the appearance of small text at low resolution.” Many people seem to use this term quite confidently. But when scratching below the surface, “hinting” starts to mean different things to different people.

Some of the concepts may have been developed when bi-level (“black-and-white”) rendering was the only method available, while some concepts may simply reflect the first encounter with this mysterious activity—and hence terms like “super-hinting” or “delta-hinting” and what not.

Frankly, I find “hinting” a rather silly word. Seriously! It doesn’t begin to explain the process—particularly when using an imperative programming language like TrueType. That’s why I keep putting it in quotes. But I also found that as soon as I try to use what I think would be a more appropriate term, I seem to lose my audience. Hence “hinting” it is—but here is my definition of “hinting:”

Definition: “Hinting” designates the act of constraining the scaling of outline fonts to yield sets of samples that best represent the type designer’s intent at any point size, on any raster output device, having any device resolution, using any rendering method, in any software context, and for any end-user.

Compared to the 1-dimensional world of bi-level (“black-and-white”) rendering, where the ppem size was all you needed, the above definition calls for targeting no less than 6 independent dimensions. Specifically:

  1. the point size,
  2. the raster output device,
  3. the device resolution,
  4. the rendering method,
  5. the software context, and
  6. the end-user.

Today’s anti-aliasing methods for font rendering may have “smoothed” the pixilation, but they don’t seem to have made “hinting” all that much simpler. Anti-aliasing comes with many opportunities, but implementing and managing the current and future diversity surely represents a bit of a challenge.

The end-users try to read text off of a computer screen. This text is presented to them as sets of pixels. But the end-users rarely understand these pixels as the combination of a particular “hinting” strategy with a particular rendering method. Rather, it’s “just” text. It may look “blurry” or “pixilated,” it may show “color fringes” or look like a “computer font,” but most of the time it really is “just” text.

End-users having “issues” with their text experience may have a hard time getting their questions answered. There are many discussions in the blogosphere with plenty of trolls and fanboys contributing their personal opinions and anecdotal evidence. Some websites show a fair level of education and insight but miss the importance of “hinting” (cf the discussions below) or factors beyond the font and software makers’ control (cf chapter , particularly ).

But even if you wind up on an educated internet forum, when it comes to “hinting,” there is a fair chance that its members are, at times, talking at cross purposes, if not trying to avoid the elephant in the room. There are simply too many variables.

Is it the “font smoothing” that makes text “smooth” but “blurry” or is it the “font-hinting” that makes text “sharp” but “hammered” into pixel boundaries? Why does sub-pixel rendering (aka ClearType) look more pixilated in Windows than sub-pixel rendering (aka Quartz) on a Mac? Or sometimes, anyway? Does “hinting” put “clear” into ClearType or does “hinting” compromise font design? What does “hinted for ClearType” mean, in the first place?

These are all good questions. Yet when it comes right down to it, beyond the challenges discussed in chapter the answers lie in the individual combinations of “hinting” strategies and rendering methods. The font makers choose the “hinting” strategy. You, the end-users, don’t get to choose that—so far, anyway (but cf ). At least you get to choose the rendering method—unless the application software tries to second-guess your preferences (cf ).

The following table illustrates a range of combinations of “hinting” strategies and rendering methods.

[Table of renderings: rendering methods across the columns, “hinting” strategies 1 through 8 down the rows; combinations of strategy and rendering method that do not apply are left blank.]

“Hinting” vs Rendering: A few key characters from the font Calibri, “hinted” with a range of strategies (top-down) and rendered with different methods (left to right).
⇒ Be sure to use the above buttons to change the type size, device resolution, and gamma correction, and to hover your mouse over any of the illustrations to enlarge it to 400%
(Notice that this is not the RTM version of Calibri)

For the above illustration I have deliberately chosen characters with straight vertical, horizontal, and diagonal strokes. Out of the 3 horizontal strokes or bars, two are adjacent to the baseline and x-height while the 3rd is “in the middle.” Likewise, there are round strokes with under- and overshoots. The “hinting” strategies are the following:

  1. Weights of all strokes are rounded to full pixels. Vertical and horizontal strokes are positioned on pixel boundaries, and so are baseline and x-height. For any of the above point sizes, no under- or overshoots are rendered. Diagonal and round strokes are adjusted to obtain an acceptable pixel pattern—for bi-level rendering only.
  2. Weights and positions of vertical stems are generally rounded to fractional pixels (sample boundaries) but their positions are adjusted to render them with a “black core.”
  3. Weights and positions of vertical stems are rounded to fractional pixels without further adjustments.
  4. Weights and positions of horizontal bars are generally rounded to fractional pixels (sample boundaries) but their positions are adjusted to render them with a “black core.”
  5. Adjustments to render with a “black core” are turned off for horizontal bars “in the middle.”
  6. The advance width is rounded to the nearest sample boundary, instead of the nearest pixel boundary. This requires text layout by fractional pixel positioning (“sub-pixel” positioning). It chiefly affects inter-character spacing, but not individual character renditions per se. Notice, however, that one and the same character may be rendered with different “color fringes” at different positions within a line of text (cf ).

    In theory, fractional pixel positioning could be implemented for full-pixel anti-aliasing, as well. I simply never got around to doing so in VTT, hence I can’t show you. For sufficiently small combinations of point sizes and device resolutions I would expect this to look even “blurrier” than when implemented in [hybrid] sub-pixel anti-aliasing.

    Asymmetric sub-pixel anti-aliasing can be laid out by fractional pixel positions while maintaining full-pixel positions for all horizontal “features.” For sufficiently small type sizes and device resolutions this need not reveal too much of its asymmetric nature. It simply constituted a discontinuity in the progression I wanted to illustrate, which is why I left it out.

  7. Both baseline and x-height are rounded to the nearest sample boundary, instead of the nearest pixel boundary. This has the effect of turning off rendition with a “black core” for the remaining horizontal bars.

    If the baseline of lowercase characters is at 0, which is a typical case, this appears to have an effect only on the parts of the glyphs adjacent to the x-height (intrinsic asymmetry as a result of the origin of the coordinate system located inside the em-square, as opposed to [well] outside thereof).

    This could be applied to full-pixel anti-aliasing as well, but without fractional pixel positioning it makes little sense, hence I left it out.

  8. The constraint to render any of the strokes with a minimal distance of 1 pixel is turned off.

    This can be applied to full-pixel anti-aliasing as well, regardless of fractional pixel positioning, but like asymmetric sub-pixel anti-aliasing laid out by fractional pixel positions it constituted a discontinuity in the progression I wanted to illustrate, hence I left it out.

    In theory, asymmetric sub-pixel anti-aliasing could turn off this constraint for (vertical) stems only. However, for sufficiently small type sizes and device resolutions this leads to “reverse stroke design contrast” (cf end of ): The horizontal bars will be rendered heavier than the vertical stems even though they were designed otherwise. This is unlikely the best representation of the designer’s intent.

The above list of “hinting” strategies is loosely ordered by how “true” these strategies are to the original outline. The higher the number the closer to the design it is. Notice however that not all rendering methods can “keep up” with all “hinting” strategies.

Obviously, bi-level rendering cannot use fractional stroke weights and positions. Likewise, asymmetric sub-pixel anti-aliasing cannot use fractional stroke weights and positions in the direction parallel to the long side of the LCD sub-pixels. Hence the non-applicable combinations of “hinting” strategy and rendering method have been omitted.

Now, if the exclusive goal is to be as “true” to the original outline as possible, even strategy #8 will not be good enough. In fact, even turning off “hinting” altogether (turning off all TrueType instructions) will not be perfect. Recall that TrueType rounds its “original” coordinates to the nearest 1/64 of a pixel. This may be close, but not necessarily “true” to the original outline—ignoring for now the unfiltered downsampling this incurs (cf ).

At the same time, attempting to “hint” as close to the original outline as possible takes the combination of “hinting” strategy and rendering method as far away as possible from the sharply defined high-contrast letterforms of printed text. The smaller the type size and screen DPI the farther it gets.

Now, being “true” to the original outline seems easy enough to define. Turn off all “hinting” and expect the rasterizer to be precise “enough” with scaling the original outlines (cf and ).

But just what is “sharply defined high-contrast?” Practically, a row or column of black pixels on a white background is as sharply defined a high-contrast line as an LCD can produce. It doesn’t get any better than that. Inconveniently, it can only go downhill from there. There are 3 contributing factors:

Beyond the above 3 contributing factors, overcorrecting gamma (due to non-defeatable software gamma correction, cf ) can further “help” to reduce perceived contrast, and so do “designer” websites that typeset the fine print in some light shade of anthracite against some pastel background and the like.

There is no way around it: Plain bi-level rendering of black text against a white background gets you the sharpest pixels, the “sharpest” strokes, and the highest contrast. At the same time this also gets you the most pixilated way of rendering text—not necessarily an added benefit.

It therefore stands to reason that we revisit the combinations of “hinting” strategies and rendering methods. But this time around, instead of pursuing the exclusive goal of “trueness” to the original outlines, we’ll have a look at it from the aspect of how much sharpness and contrast we are willing to give up for an acceptable level of smoothness.

Fact is that beyond bi-level rendering only a box filter applied to full-pixel anti-aliasing can render a single black pixel as black. Any “bleeding” filter will render a single black pixel as “less than black.” This applies in particular to any of the filters used in sub-pixel anti-aliasing. “Bleeding” filters will compromise sharpness and contrast from the very outset.
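
If it helps, think of the box filter as nothing more than averaging the samples that fall inside each pixel. Here is a minimal sketch in Python (my own illustration with made-up sample data, not rasterizer code):

def box_filter(samples):
    """samples: one list of sub-samples per pixel (1 = ink, 0 = background).
    Returns one coverage value per pixel, 0.0 through 1.0."""
    return [sum(pixel) / len(pixel) for pixel in samples]

# 4x-oversampled scanline: one fully covered pixel between two empty ones
print(box_filter([[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0]]))   # [0.0, 1.0, 0.0]

A filter that also weighs in samples from neighboring pixels can no longer return 1.0 for that middle pixel unless its neighbors are inked as well, which is exactly the “bleeding” referred to above.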

Accordingly, let’s have another look at “plain gray-scaling” with a box filter (cf ). The following table illustrates the same characters as above, but rendered exclusively with full-pixel anti-aliasing and a box-filter:

[Table of renderings: columns = rounding to 1, 1/2, 1/4, and 1/8 pixel; rows = 6, 7, 8, 9, 10, 11, 12, 14, 16, and 18 point.]

“Hinting” vs Rendering: A few key characters from the font Calibri, all rendered in full-pixel anti-aliasing using a box-filter, shown at different point sizes (top-to-bottom), using different rounding “granularities” for stroke weights and positions (left-to-right), and always adjusting stroke positions to render strokes with a “black core.”
⇒ Be sure to use the above buttons to change the device resolution and gamma correction, and to hover your mouse over any of the illustrations to enlarge it to 400%
(Notice that this is not the RTM version of Calibri)

The above table uses exclusively “plain gray-scaling” with a box filter, essentially combining strategies 2 (in x-direction) and 4 (in y-direction) as illustrated in , but with different rounding fractions (cf actual rounding fractions in ). The following fractions are used (from left-to-right): 1 pixel, 1/2 pixel, 1/4 pixel, and 1/8 pixel.
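
For readers who prefer code to prose, the following Python sketch models the rounding used in this table. It is my own illustration, not VTT or rasterizer code, and all names in it are made up:

import math

def round_to(value, fraction):
    # round half up to the nearest multiple of `fraction` (in pixels)
    return math.floor(value / fraction + 0.5) * fraction

def hint_stroke(lo, hi, fraction):
    """Round a stroke's weight and position to `fraction` of a pixel; if the
    stroke is at least 1 pixel wide but fully covers no pixel, snap its leading
    edge to the pixel grid to restore a black core."""
    weight = max(round_to(hi - lo, fraction), fraction)   # never round a stroke away
    lo = round_to(lo, fraction)
    has_black_core = math.floor(lo + weight) - math.ceil(lo) >= 1
    if weight >= 1.0 and not has_black_core:
        lo = round_to(lo, 1.0)                            # snap to a pixel boundary
    return lo, lo + weight

# a 1.3 px wide stem with its left edge at x = 2.6, rounded to 1/4 pixel:
print(hint_stroke(2.6, 3.9, 0.25))   # (3.0, 4.25): the pixel from x = 3 to x = 4 is solid

The snapping step in the middle is what strategies 2 and 4 above refer to as adjusting stroke positions to render them with a “black core.”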

Notice that rounding stem edges to full pixels represents “Standard Font Smoothing” as implemented in Windows. In theory, every font in Windows could be rendered like that. In practice, the introduction of “gray-scaling” faced a range of problems not unlike those discussed in , especially .

There weren’t any “explosions” because the resolution wasn’t increased asymmetrically as in ClearType. Rounding of stroke weights and positions was kept at full pixel “granularity,” probably because this was the easiest way to guarantee a “black core” without “re-hinting” all the fonts. But plenty of delta instructions wreaked the same kind of havoc as they later did for ClearType.

Maybe the real problem wasn’t well understood at the time, or maybe there was not enough time to understand it before getting the “font smoothing” feature shipped in Windows 95. At the time, the “work-around” was to introduce a set of “flags” instructing the operating system to deliberately revert to “black-and-white” rendering at ppem sizes at which the font maker may have deemed the “hard-wired black-and-white hints” unfit for “gray-scaling”.

In turn, this may have fueled the misconception that “gray-scaling doesn’t work at small ppem sizes.” It’s not the rendering method “gray-scaling” that doesn’t “work,” nor the box-filter used in the process. Rather, it’s the “hinting” that wasn’t prepared for anything but “black-and-white pixels” and the aforementioned “flags” that try to “hide” the shortcomings of the “hinting.”

The most obvious of these shortcomings are indeed the delta instructions that were completely “hard-wired” into the code, as previously illustrated in for sub-pixel anti-aliasing, and repeated below for full-pixel anti-aliasing.

Full-pixel anti-aliasing applied to outlines that were “delta-hinted” and “hard-wired” for bi-level rendering. Notice the unnatural “thinning” caused by the delta instructions (Times New Roman UC ‘W,’ 12 pt, 96 DPI, RTM version of the font, but bypassing the flags that instruct the operating system to revert to bi-level rendering)
⇒ Hover your mouse over the above illustration to revert to bi-level rendering

Once the delta instructions are out of the code path, and once the stroke weights are no longer forced to full pixels, “plain gray-scaling” performs remarkably well, as illustrated below:

Full-pixel anti-aliasing applied to outlines that were “re-hinted” without “delta-hinting” nor forcing stroke weights to full pixels
⇒ Be sure to try the above buttons to select different minimum distance constraints for de-emphasizing serifs and hairlines and observe the effect they have at different point sizes: At 9 pt and 96 DPI, setting the minimum distances for strokes and serifs to 1/2 and 1/4 pixels has about the same effect as “turning off hinting,” while doing the same at 18 pt and 200 DPI doesn’t have much of an effect

Naïvely applying “gray-scaling” to “black-and-white outlines” is not unlike using ClearType on “black-and-white outlines” without the “jury rigged” rasterizer (cf ). It looks as if this doesn’t “work,” whether it is “gray-scaling” or ClearType, but in either case it is the “hinting” that doesn’t “work,” not the rendering method. “Hinting” makes or breaks the difference!

When implemented intelligently, full-pixel anti-aliasing has a number of advantages that may make it a viable alternative to sub-pixel anti-aliasing:

All of the above comes with a memory footprint that is a fraction of that of hybrid sub-pixel anti-aliasing. Full-pixel anti-aliasing, as implemented in Windows as Standard Anti-Aliasing, gets by on 4 bits per pixel for smooth characters in both x- and y-direction, while hybrid sub-pixel anti-aliasing requires 32 bits per pixel.

Now, whatever the rendering method and memory footprint, it sure looks like “hammering” the strokes into pixel boundaries yields the “sharpest” strokes and the highest rendering contrast. Depending on the type size, device resolution, and rendering method, this can make or break readability and affect accessibility. Following is an example I came across in Google Maps:

Caption below a historical landmark in Switzerland (Google Maps)
left: actual screenshot, right: cleaned up background
⇒ Hover your mouse over either of the illustrations to enlarge it to 400%

On my 120 DPI laptop, the original illustration measures about 1/2 inch (1 1/4 cm) in width. I estimate the font size at 9 or 10 ppem, equivalent to 5.4 or 6 pt, or a little over half the type size I would be comfortable reading. If I didn’t know what “stone” this referred to, I wouldn’t know for sure what I was looking at!

To illustrate what went wrong, I’ll try to recreate the above example with Tahoma at 9 ppem. I don’t really recognize the font, hence my “re-enactment” won’t generate the exact same set of pixels, but hopefully it’ll be close enough to illustrate the point.

[Table of renderings: columns = “hinting” off and on; rows = rendering methods.]

“Hinting” vs Rendering: “Schillerstein” rendered in 9 ppem Tahoma with various rendering methods (top-down) and “hinting” off (left) or on (right)
⇒ Be sure to use the above buttons to change the gamma correction, and to hover your mouse over any of the illustrations to enlarge it to 400%

Surprising what “hammering” the strokes into pixel boundaries (aka “hinting”) can do, isn’t it?

For my combination of screen DPI and visual acuity—given the challengingly small type size—my “nod” goes to full-pixel anti-aliasing, with stroke positions and weights rounded to full-pixel boundaries at this ppem size, and rendered with a box-filter (, column “hinting” on). It seems to provide a “stronger signal” to my brain than bi-level rendering (), and a “sharper signal” than ClearType ().

If “hinting” were not an option, my “nod” would go to ClearType, simply because at this particular ppem size, ClearType happens to provide at least some level of contrast between the various ‘i’ and ‘l’ characters—color fringing and all. A bit “fuzzy” for sure, but if nothing else, it makes this word “decipherable” (aka “accessible”). Your mileage may vary.

Once more, there is no way around it: Color fringes are an unavoidable side-effect of LCD sub-pixel anti-aliasing. But “hinting” may be able to tame them. Different color fringes are generated depending on the sample boundary on which a stroke’s edges fall—or on which they are “hinted” to fall. The following table illustrates this for a 1 pixel wide (vertical) stroke (cf ).

−1/2 px −1/3 px −1/6 px ±0 px +1/6 px +1/3 px +1/2 px

A 1 pixel wide, nominally black stem, rendered in ClearType, and positioned at increments of 1/6 of a pixel

For instance, the stems on the far left and right are positioned with an offset of ±1/2 pixel relative to the full-pixel boundary. This yields a combination of the colors “tango orange” and “royal blue.” Art class may call this combination complementary colors. When juxtaposed, this color combination may seem to “vibrate.”
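
Where these particular colors come from is easy to model, at least approximately: treat each pixel as three sub-pixel stripes of 1/3 pixel each and measure how much of each stripe the stem covers. The Python sketch below is my own simplified model; keep in mind that actual ClearType additionally filters these values, so the real fringes come out somewhat softer:

def overlap(a0, a1, b0, b1):
    return max(0.0, min(a1, b1) - max(a0, b0))

def fringe(stem_left, pixel):
    """Per-channel brightness (1 = white background, 0 = full ink) of `pixel`
    on an RGB-striped LCD, for a 1-pixel-wide black stem starting at `stem_left`."""
    channels = []
    for stripe in range(3):                 # the R, G, and B sub-pixels
        sub0 = pixel + stripe / 3
        sub1 = pixel + (stripe + 1) / 3
        coverage = min(1.0, overlap(stem_left, stem_left + 1, sub0, sub1) * 3)
        channels.append(round(1 - coverage, 2))
    return tuple(channels)

# stem offset by +1/2 pixel: the two affected pixels come out orange-ish and blue-ish
print(fringe(0.5, 0), fringe(0.5, 1))   # (1.0, 0.5, 0.0) (0.0, 0.5, 1.0)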

The reason behind some of these “vibrations” is chromatic aberration. When light passes through the lenses of your eyes, it gets focused on the retina, much like light passing through a photographic lens and getting focused on film—or the light sensor of a digital camera. However, unless the lens has provisions to correct this, different colors or wavelengths get focused differently—like passing light through a prism.

When looking at colorful scenery, red light gets focused behind the retina, as if we were far-sighted to red light. By contrast, blue light gets focused in front of the retina, as if we were near-sighted to blue light. Now, lenses suitable for color photography are designed to correct these aberrations to some degree. Look for designations like “APO,” which stands for “apochromatically corrected.”

Alas, our eyes don’t seem to have this type of correction “built-in.” Depending on the context, we may see colors “popping out” of flat artwork, as if there was some kind of “3D-effect” (chromostereopsis). If the part of the image we’re trying to focus on is mainly made up of orange and blue, juxtaposed as above, the (ciliary) muscles responsible for focusing our lenses may frantically try to “lock-on” to both. In the process, focus may alternate between one and the other, and hence the perception of “vibrating” colors. As far as I understand, this can provoke symptoms like headaches and dizziness.

Even if you don’t experience any “vibrating” colors, you may still be uncomfortable with the low level of contrast between the various ‘i’ and ‘l’ characters above as rendered in ClearType. In turn, this begs the question, just how low is this contrast? In the presence of all these colors, can the level of contrast be quantified? How does it compare to text rendered in plain bi-level but at some level of gray against a background in another level of gray, instead of black against white?

The World-Wide Web Consortium (W3C) has an answer. They propose a metric that maps the perceived contrast of any pair of text and background colors to a number (“Contrast Ratio”) in the range 1:1 through 21:1. On their scale, 1:1 corresponds to no contrast at all, such as gray text against an identical gray background [sic], while 21:1 corresponds to maximum contrast, achievable by black pixels against a white background. A minimum contrast ratio of 7:1 is recommended for an “AAA Rating” for readability of text smaller than 18 point.
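
For reference, the arithmetic behind the contrast ratio, as well as behind the brightness and color differences used further below, is public and simple enough to sketch in a few lines of Python. The contrast ratio uses the relative luminance defined by WCAG 2.0, while the brightness and color differences come from the W3C's older accessibility evaluation techniques:

def _linear(c):                      # sRGB channel (0..255) to linear, per WCAG 2.0
    c /= 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb):
    r, g, b = (_linear(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(text, background):      # "Contrast Ratio," 1.0 through 21.0
    hi, lo = sorted((relative_luminance(text), relative_luminance(background)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)

def brightness(rgb):                       # "color brightness," 0 through 255
    r, g, b = rgb
    return (299 * r + 587 * g + 114 * b) / 1000

def color_difference(c1, c2):              # "color difference," 0 through 765
    return sum(abs(a - b) for a, b in zip(c1, c2))

black, white = (0, 0, 0), (255, 255, 255)
print(round(contrast_ratio(black, white), 2))        # 21.0
print(brightness(white) - brightness(black))         # 255.0 (>= 125 passes)
print(color_difference(black, white))                # 765   (>= 500 passes)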

To put the low level of contrast between the above ‘i’ and ‘l’ in context, I’ll repeat the example as rendered in ClearType—for what ClearType deems black text against a white background. I’ll follow this by bi-level text rendered in blue against an orange background, corresponding to the color combinations produced by ClearType on the letters ‘i’ and ‘l.’ Last but not least, I’ll follow this by bi-level text rendered in gray against a gray background, with the levels of gray chosen by both “straight” desaturation, and desaturation to yield the same contrast ratio as per the W3C metric:

                        ClearType,         Bi-Level,          Desaturated        Desaturated,
                        black text on      matching CT        from matching      matching CT
                        white background   colors of “ill”    CT colors          Contrast Ratio

Complete Caption        [rendering]        [rendering]        [rendering]        [rendering]
Focusing on “ill”       [rendering]        [rendering]        [rendering]        [rendering]

Brightness Difference
(≥125 for AAA Rating)                      73.5               42                 56
Color Difference
(≥500 for AAA Rating)                      382                126                168
Contrast Ratio
(≥7:1 for AAA Rating)                      2.16:1             1.70:1             2.16:1

“Hinting” vs Rendering vs Contrast: “ill” rendered in 9 ppem Tahoma in “black” ClearType on a white background, along with 3 strategies to render a matching level of rendering contrast in bi-level. None of the above examples has been subjected to any gamma correction in software (but cf , #6).

For the above table I have deliberately focused on the part of the previous example that was hard or impossible to read. “ill” rendered in bi-level with “royal blue” text against a “koromiko yellow” background produces the same color combinations as does “hinted” ClearType rendered in “black” against a white background.

For these color combinations the W3C metric yields brightness and color differences that all fail the minimum recommended threshold for contrasting colors. Specifically, when using ClearType to render nominally black text against a white background, the W3C metric yields a contrast ratio of 2.16:1 on the characters “ill.” The way the font is “hinted” and when rendered in ClearType, this misses the recommended minimum contrast ratio of 7:1 by a long shot. Be sure to recall that on the W3C scale the maximum contrast ratio is 21:1.

This example shows that even for black text against a white background, ClearType on its own improves contrast very little over bi-level rendering of middle gray against… middle gray!

Remember that this is text that is nominally rendered in black against a white background. Any use of color can further decrease this contrast ratio. Moreover, notice that these illustrations were taken before software gamma correction gets a chance to “wash out” contrast even more (cf ).

The reason why you may still find these “low contrast colors” easier to “decipher” than either of the desaturated gray level pairs is that they are, after all, different colors, however close their brightnesses may be. Assuming that your color vision is not overly impaired and that you are actually reading this on a color screen, this fact may be reflected by the “Color Difference” figures. While they fail the W3C threshold, they are still much higher for color-on-color than for gray-on-gray in this example. As to the “Brightness Difference” figures I will have to defer any explanation to someone with actual insight into the matter.

Now, since the “color combinations” are determined by the sample boundaries on which the edges of the strokes are positioned, deliberate positioning (aka “hinting”) can help to reduce the effect of “vibrating” colors and increase the perceived contrast ratio. The dilemma of course is that this comes at the expense of restricting stroke positions to the one or two sample boundaries that minimize the effect of “vibrating” colors and maximize contrast. Maybe this is not “hammered” into pixel boundaries, but at least forced into some kind of a “straitjacket?”

What to do? Keep it “true” to the outline but “fuzzy?” Or “hint” it into a “straitjacket” to make it “sharp?” My conclusion is this:

If you want to read 9 ppem text, as opposed to merely look at an attempt to break through the Nyquist limit, “hinting” outlines into some kind of a “straitjacket” is unavoidable.

We have looked at the 9 ppem “Schillerstein” above, and I encourage you to draw your own conclusions. Granted, 9 ppem may be somewhat of an extreme example, but sadly, it’s not all that rare.

Now recall the Sampling Theorem introduced in . In simple terms the theorem says that we need at least 2 pixels to render a feature or else we’re trying to break through the Nyquist limit. Accordingly, once we’re at a ppem size at which the vertical stems are 2 pixels or more, we should be fine—at least for the stems.

With 2 pixels to work with, anti-aliasing—including “plain gray-scaling”—can do a fine job rendering the stems. Strategy #3 will yield a “black core” without adjusting the stems beyond the nearest sample boundary. Rounding to sample boundaries, in turn, is required if you want to render like stems with like sample counts, and if you want to avoid the side-effects discussed in . But compared to other raster tragedies, the distortions introduced by rounding to sample boundaries are relatively minor.

Keep in mind that the vertical stems likely are the “heaviest” features. Horizontal crossbars may be considerably thinner. Likewise, serifs may be even thinner than the crossbars. The 2 pixel limit merely helps with the vertical stems. Hence just because we’re doing fine on these stems doesn’t mean we’re “out of the woods” for the rest of the character. But at least we get the main “structural parts” of Latin characters (including Greek and Cyrillic).

To get an idea what kind of ppem size or combination of point size and device resolution a 2 pixel wide stem translates to, recall . Unless you are fortunate enough to use a screen with a resolution of around 200 DPI, this ppem size will translate to more than 8 pt. Even at 144 DPI, you’re looking at 11 pt to get 2 pixel wide Arial stems.
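
Here is the back-of-the-envelope arithmetic. The stem width of roughly 0.09 em is my own approximation of a regular-weight sans serif like Arial, chosen so as to reproduce the numbers above; the rest is plain unit conversion:

def points_for_stem(stem_em, stem_px, dpi):
    """Smallest point size at which a stem of `stem_em` em renders `stem_px` pixels wide."""
    ppem = stem_px / stem_em      # pixels per em required for the targeted stem width
    return ppem * 72.0 / dpi      # convert ppem to points at this device resolution

for dpi in (96, 120, 144, 200):
    print(dpi, "DPI:", round(points_for_stem(0.09, 2, dpi), 1), "pt")
# 96 DPI: 16.7 pt | 120 DPI: 13.3 pt | 144 DPI: 11.1 pt | 200 DPI: 8.0 pt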

In turn, this means that “hinting” should enable a smooth transition between the strategy used for stems that are 2 or more pixels wide, and the strategy used for the smallest ppem size we want to read. The keyword here is smooth transition. This is unlike turning off “hints” at sufficiently large ppem sizes and unlike switching to “unhinted gray-scaling” at sufficiently small ppem sizes.

At the beginning of this section I have tried to illustrate ways to transition from (almost) “unhinted” to a “full-pixel straitjacket.” These strategies focused on positioning strokes to render them with a “black core.” But this is certainly not the only way to make this transition. In we have looked at the guidelines by Hersch et al. for “gray-scaling,” and the preceding illustrations may have given you an idea on how to do it for sub-pixel anti-aliasing.

When you decide on your own strategy, keep both the software context and the end-user in mind. For a font used in a print preview scenario the strategy may be different than when used for reading e-mail. Not every end-user responds equally to color fringes and other artifacts of anti-aliasing. This may seem to make it difficult to find the “best” compromise (but cf ).

Now then… remember that question at the beginning of this section? Is it the “font smoothing” that makes text “smooth” but “blurry” or is it the “font-hinting” that makes text “sharp” but “hammered” into pixel boundaries? The answer is… it depends. There are really many variables, and hence many separate questions:

Out of the many variables above, “hinting” may seem the “murkiest.” It’s up to the font makers what strategies they implement, and how consistently they do so. A font that is “hinted for ClearType” may have certain “tweaks” to optimize for “sharp” strokes, at least at some sizes and for RGB sub-pixel structures, or it may all but ignore rendering methods other than ClearType.

Likewise, assuming I understand this correctly, Quartz ignores any “hinting” that may be present. Instead, it favors the most faithful rendition of the outlines at the expense of attempting to make text “clear” at the smallest sizes. This would explain some of the perceived differences between ClearType and Quartz.

“Hinting” can be both very powerful and overpowering. Yet with all the power it has, it may not address the end-users’ preferences or satisfy their requirements with respect to color vision, visual acuity, or plain accessibility. “Hinting” seems “stuck” in the bitmap world. “Hinting” does not come across as being adaptive. We will revisit this issue in , particularly and .

Ever since Gutenberg introduced the Western World to movable type, printing had to target a specific page size. Whether it was “Octavo,” “Folio,” or “Broadsheet,” width and height of the page were known upfront, and hence its aspect ratio. Page layout, margins, text area, and type size were determined for that specific page size.

Conceptually, by the time the movable letters were tightly bound in the “forme,” the position (x- and y-coordinates) of every character or ligature on the page was known. Books and newspapers didn’t have “zoom” menus to get a closer look at the paper, or “buttons” to enlarge the type size for accessibility. Importantly, movable type was “analog.” Unlike today’s screens and printers, paper doesn’t come with a “resolution” (DPI).

By contrast, text layout on today’s computers is a lot more flexible. It harbors the opportunity to make reading accessible by “zooming in” or by enlarging the type size. But even single column layout comes with the challenges of “digital.” Screens and printers do have a “resolution” (DPI), and this resolution is likely different for different devices.

It can be advantageous for text layout to try to be resolution independent. In the context of resolution independent layout, your 2 page résumé prints the same way e.g. on a 360 DPI inkjet printer as it does on a 1200 DPI laser printer. It displays the same way on a 96 DPI desktop or on a 120 DPI laptop screen. If you see 2 pages on screen, you will get 2 pages from the printer. There is no need for multiple print-outs to check if it does indeed fit into 2 pages.

Granted, the pixels are coarser on screen than on the inkjet printer, and in turn on the 1200 DPI laser printer. But at least you can get the same layout, the same line breaks, and the same number of pages. The text doesn’t inadvertently reflow and cause the 2 page résumé to “overflow” to page 3. Just be sure to not allow your word processor to use “printer metrics” to lay out text.

To make text layout resolution independent, the respective software chooses a “high enough” internal resolution and calculates the position (x- and y-coordinates) of every character in terms of this internal high-resolution. If it helps to understand the argument, imagine the internal resolution being the same as the design resolution of the fonts used in the process. For many TrueType fonts this would be 2048 font design units, hence imagine the internal high-resolution at 2048 DPI.

To display or print a page, the software then merely has to translate the high-resolution character positions to the resolution of the individual output device. For instance, for the desktop screen the software may have to translate it to 96 DPI while for the inkjet printer this may be 360 DPI. The high-resolution character positions stay the same, and hence the layout stays the same.
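
In code, this translation is little more than a multiplication and a rounding step. The sketch below uses the imagined internal resolution of 2048 DPI from above and made-up character positions:

INTERNAL_DPI = 2048                           # imagined internal layout resolution

def to_device(x_internal, device_dpi):
    # translate an internal high-resolution coordinate to whole device pixels
    return round(x_internal * device_dpi / INTERNAL_DPI)

line = [0, 1180, 2290, 3480]                  # internal x-positions of 4 characters
print([to_device(x, 96) for x in line])       # [0, 55, 107, 163] on a 96 DPI screen
print([to_device(x, 360) for x in line])      # [0, 207, 403, 612] on a 360 DPI inkjet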

Ahem… Haven’t we discussed this problem before? We sure have! In we were looking at pairs of edges delimiting the width of strokes, and the strokes being rendered with one pixel count in one instance and another pixel count in another instance. Substantially the same stroke was rendered once with 1 pixel, once with 2 pixels.

It is the same problem here, except that instead of 2 edges of a stem we have the left and right side-bearing points delimiting the advance width of the entire character. Doing the above translation from font design coordinates to screen coordinates, a pair of left and right side-bearings may translate to 8 pixels of advance width in one instance while only 7 pixels in another instance.

At first, this doesn’t seem like that much of a difference, until you start thinking about it a bit more. At the chosen type size, the font maker may have “hinted” the character to an advance width of 8 pixels for good reason. If the layout software “truncates” this to 7 pixels, the side-bearing space will get compromised.

In turn, the compromised character may “clash” with the adjacent character, making the resulting character combination hard to read, or even worse, seemingly changing the text, as illustrated below:

“Hinting” vs Text Layout: A possible “side-effect” of naïve resolution independent text layout is that independent scaling of the left- and right side-bearing points to the targeted device resolution may compromise the advance widths. In this example, a lc ‘c’ gets its advance width reduced by 1 pixel, causing a lc ‘l’ to “follow too closely.” The resulting character pair ‘cl’ reads like a ‘d’ (Verdana 11 pt at 96 DPI)
⇒ Hover your mouse over the above illustration to see the uncompromised advance widths.

Clearly, this is not good. Simply translating the high-resolution character positions to their low-resolution counter-parts is like scaling stems without “hinting” them. We have seen the ensuing Raster Tragedies. Accordingly, we need a better strategy.

Since the font maker has been meticulous with “hinting” for appropriate side-bearing spaces, it could be argued that the best strategy is to simply “lay down one character after another.” At least for the extent of a word, this strategy would thus follow the low-resolution advance widths, as “hinted” by the font maker, and without causing any “clashing” characters.

However, to maintain some degree of resolution independence, the strategy eventually would have to account for the accumulated differences between the low- and the high-resolution character positions. For instance, it could try to “re-sync” at the beginning of the next word. Put another way, the strategy would “cheat a little bit” with resolution independence within a word, but then “make up for it” in the “gap” between consecutive words—or at least hope to be able to do so.

While this strategy and its variations have considerable merit, it is not guaranteed to work always. I have seen combinations of fonts, type sizes, and screen resolutions where the accumulated rounding differences literally “ate up” the entire gap between consecutive words. This­does­not­make­for­a­particularly­pleasant­reading­experience. Refer to the table of average advance width errors in to get an idea how this can happen.

Now, depending on the context of the text layout we may not need resolution independence. For instance, in most email I read, the lines are either as long as the author typed them, or they “wrap-around” at the right edge of the mail window. If the line gets uncomfortably long, I can easily adjust the window. No typography in the layout, but no clashing characters either, and it can be very readable.

Similarly, when the internet was in its beginnings, authors of webpages did not have much control over page layout, margins, text area, and type size. Text could be a little larger or a little smaller than the “default,” and it could be bold, italic, or underlined. But, at the time, already for a simple “first line indentation” I had to use a few “non-breaking spaces” in a row. Again, not much typography in the layout, but still no clashing characters either.

Text was simply “poured” into a text area that essentially comprised the whole window. Yet it was very readable. On a small computer screen you simply got short lines, but you didn’t have to “pan” around to see the ends of the lines. On a large screen you got long lines, but you could shorten them to a comfortable length by adjusting the window. If you needed a larger type size, you could set your text size to “larger” or “largest” and the text was simply “re-poured” into the same text area with a larger text size.

This is the archetypal “lay down one character after another” strategy. It works because there are no restrictions on the layout. It doesn’t have to fit a certain number of characters or words on a line to match the line breaks of the printed output. Likewise, it doesn’t have to fit a certain number of lines on a page.

The page width is adjustable, and the page height is “bottomless.” The text size is adjustable, yet you don’t have to “pan” for the line ends. The text can be very readable—even to “tired” eyes. Just “page down” to read the next portion of the “bottomless” page.

Eventually Cascading Style Sheets (CSS) were introduced. CSS are a means of separating content (text, images, buttons, etc.) from its presentation (character, paragraph, and page attributes, etc.). In principle, this is an excellent idea: You no longer have to “hard-wire” font names and sizes into the text, and you get control over margins, line lengths, line spacing, and a lot more.

In practice, some aspects of CSS may not have been well understood by some web authors. Remember websites that said “This page optimized for 1024×768 pixels?” Web authors had started to “design” their websites. Being exacting about every pixel on the 1024×768 “page,” it looked as if they used the “lay down one pixel after another” strategy.

It may have looked “pixel-perfect” to them, on their screen, with their resolution (DPI), and their visual acuity. But it may have been hard to read for somebody else because their screen DPI is higher or their eyes are older. What was worse, it was often impossible to increase the type size without completely breaking the pixel-perfect layout! That is, if it was possible to increase the type size in the first place!

Next, browsers started to add “zoom” menus to change the 1024×768 “page size.” “Zooming” to 125% “fixed” the pixel-perfect website, “hard-wired” to 96 DPI screens, for displaying it on a 120 DPI laptop. But it also re-introduced the resolution independence problem, albeit from a “bottom-up” perspective, as opposed to the “top-down” perspective we took in the preceding sub-section.

Moreover, to some degree the “bottom-up” approach actually exacerbates the resolution independence problem. For example, assume the pixel-perfect website is set in 10 pt. 10 pt at 96 DPI correspond to a font height of 13 1/3 pixels (ppem). As discussed in , most TrueType fonts request this to be rounded to the nearest integer. Thus the actual font height will be 13 pixels, yielding an actual type size of 9 3/4 pt at 96 DPI.

Now the browser “zooms” the alleged 10 pt type size to 125%, which is 12 1/2 pt at 96 DPI or 10 pt at 120 DPI. 10 pt at 120 DPI correspond to a font height of 16 2/3 pixels (ppem). Again, this gets rounded to the nearest integer, or 17 pixels, yielding an actual type size of 10 1/5 pt at 120 DPI.

In other words, by forcing ppem sizes to be integer, the combination of type size and resolution rounds one way at one resolution (DPI) and the other way at the other resolution—another case of déjà-vu all over again. The result is that, in addition to the 125% “zoom factor” we asked for, the font has contributed its own “zoom factor” of approximately 105% (from 9.75 pt to 10.2 pt).

Notice that this extra 105% zoom factor is solely caused by rounding ppem sizes to integer. There is no further rounding involved in the above consideration. We are not talking about full-pixel vs fractional pixel character positioning. This error would happen even if we could position characters at infinite precision.
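
If you want to verify the arithmetic, here it is, spelled out in a few lines of Python. The only rounding modeled is the font’s request for integer ppem sizes:

def effective_pt(requested_pt, dpi):
    ppem = round(requested_pt * dpi / 72)     # font height forced to integer ppem
    return ppem * 72 / dpi                    # the type size that actually renders

before = effective_pt(10, 96)                 # 13 ppem, i.e. 9.75 pt at 96 DPI
after  = effective_pt(10, 120)                # 17 ppem, i.e. 10.2 pt at 120 DPI
print(before, after, round(after / before, 3))   # 9.75 10.2 1.046: the extra ~105% "zoom"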

Also notice that we are not talking about “side-effects” of “hinting.” To that end, let’s have a look at an individual, knowingly well-“hinted” example:

“Hinting” vs Text Layout: A knowingly well-“hinted” lc ‘m’ about to be “zoomed” by 125% (Verdana, 10 pt at 96 DPI). The blue line on the right denotes the “un-hinted” and “un-rounded” advance width, while the green line denotes the “hinted” and rounded advance width.
⇒ Hover your mouse over the illustration to see the same lc ‘m’ at 120 DPI and compare the green lines. This should reflect a 125% “zoom,” but the effective “zoom” turns out to be 155% (!)

In the process of “zooming” the above lowercase ‘m’ from 10 pt at 96 DPI to 10 pt at 120 DPI, the advance width gets “zoomed” from 11 to 17 pixels. This corresponds to an effective “zoom factor” of 155%—a lot more than we asked for!

Notice that there is nothing wrong with Verdana. That’s the way Matthew Carter designed it—for good reason—and that’s the way Tom Rickner “hinted” it—again for good reason. However, what this example shows is the degree to which the layout must be “flexible” to absorb any errors caused by the “lay down one character after another” strategy.

The problem with the “zooming” is that anything but the fonts in the pixel-perfect layout will scale with the correct 125% “zoom factor.” For instance, if said pixel-perfect layout allocates a certain amount of space for a certain amount of text, say 200×400 pixels (W×H) for a “side-bar,” a 125% “zoom” will grow this “text box” to 250×500 pixels. This is spot-on.

If the “side-bar” were 201×399 pixels, said “zoom” would grow this to 251×499 pixels. Both of these numbers are “off” by 1/4 pixel—not quite spot-on, but probably close enough. Even if you could tell it is 1/4 pixel too tall, this is going to be the least of your worries. Hence, for the sake of the argument, let’s call this the correct 125%.

But if the text inside this “side-bar” grows by more than—or much more than—125%, chances are that the text will no longer fit.

“Hinting” vs Text Layout: A piece of text, rendered in 10 pt and 96 DPI Verdana inside a 200×400 pixel (W×H) text-box (left), and “zoomed” by 125% (right). The text has grown by more than 125%, resulting in an extra line.

The “layout” has just been broken! Granted, there are ways to deal with this situation in CSS, but the web authors need to be aware of it. If they are not, this leaves the browsers with the formidable challenge to… “fix the layout?”

I didn’t think so, either. The real problem, of course, is that with CSS it is easy to mix layout metaphors. Conceptually, part of the “layout” may be scalable (resolution independent), while the remainder is expected to be reflowable (resolution dependent). At the same time there is no easy way to define page layout intent in function of the actual “page size” (window size) and type size (end-user preference). Without knowing the intent of the design, the browsers may have a really tough time trying to fix “accidents” like the above example.

This begs the question, how much scalability (of fonts) is really needed? When you are preparing your résumé to fit on 2 pages, when you are creating the PowerPoint presentation of your lifetime, or when you are typesetting books, magazines, or newspapers for a living, then scalability is the rule, not the exception. You are using a computer to preview what you are going to get from the printer (or projector), but without wasting all the paper for “trial runs.”

The software you are using will match the printer (or projector) layout—have no doubts about that. But it will not “re-hint” your fonts to do so. Instead, you may find the inter-character spacing compromised and end up with “clashing” characters (a ‘c’ and an ‘l’ displayed so closely together that they look like a ‘d’). Hence the more scalable your “hinting” strategy is, the fewer unwanted “side-effects” you will get, and the more readable this preview will become.

Now, “we’ll-need-a-scalable-layout-mechanism-anyway” is no excuse for squeezing the internet into a fixed page size. Strive for a reflowable layout in contexts where a fixed page size makes no sense! This can be anytime the properties of the final target device (eg the screen) are not known upfront, such as

The “original” internet didn’t force content into fixed page sizes. Take notice, and gradually free today’s internet from fixed page layout “straitjackets.”

Until we get intelligent adaptive layout that always gracefully reflows while respecting the original design intent, let’s have a look at how much font “hinting” can help or hinder scalable, resolution independent text layout. Recall where we have looked at multiple rounding operations that compound to increase the advance width of an 11 point lowercase ‘e’ to 12 point.

To get an idea of the consequences these compounded rounding errors may have on resolution independent layout, I have developed a simple metric that yields a single number per font size, as follows:

  1. Determine (signed) difference between actual advance width as rendered and ideal advance width for entire alphabet (lowercase ‘a’ to ‘z’).
  2. Take weighted average (weighted arithmetic mean) of all ideal advance widths, the weights reflecting the letter frequency.
  3. Take weighted average of all (signed) differences.
  4. Express result from step 3 as a percentage of result from step 2.
I am aware that letter frequencies are different for different languages, and that even within the same language they may be different for different authors. Also, the following data has been compiled by simulating the behavior of the TrueType rasterizer in Excel, not by exercising the rasterizer on the actual font. Hence don’t take this as the last word on the issue.
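
In code, the metric of steps 1 through 4 looks about like this. The three-letter tables at the bottom are placeholders for illustration only; they are not the data behind the Calibri figures below:

def average_advance_width_error(ideal, rendered, frequency):
    """Weighted average advance width error, expressed as a percentage of the
    weighted average ideal advance width. `ideal` and `rendered` are advance
    widths in pixels, `frequency` holds relative letter frequencies."""
    letters = ideal.keys()
    total = sum(frequency[c] for c in letters)
    mean_ideal = sum(frequency[c] * ideal[c] for c in letters) / total
    mean_error = sum(frequency[c] * (rendered[c] - ideal[c]) for c in letters) / total
    return 100.0 * mean_error / mean_ideal

ideal     = {'a': 7.2, 'b': 7.6, 'c': 6.4}    # placeholder values
rendered  = {'a': 7.0, 'b': 8.0, 'c': 6.0}
frequency = {'a': 8.2, 'b': 1.5, 'c': 2.8}
print(round(average_advance_width_error(ideal, rendered, frequency), 2))   # -2.44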

Nevertheless I think that there are a few trends that can be observed, and semi-formal conclusions that can be drawn. Here is the “raw” data, obtained from the font Calibri:

Average advance width error at 96 DPI. Rows: point size. Columns, left to right: (1) Vista SP1 natural widths; (2) integer ppem sizes & widths; (3) fractional ppem sizes, integer widths; (4) Vista SP1 fractional widths; (5) fractional ppem sizes & widths.
7 pt −0.56% −3.60% 0.00% −2.81% 0.16%
8 pt 7.12% 4.46% 2.16% 3.59% 0.48%
9 pt −0.40% −0.40% −0.40% 0.43% 0.42%
10 pt −0.95% −3.08% −0.15% −2.41% −0.18%
11 pt 2.98% 1.04% 0.32% 2.40% −0.39%
12 pt −1.09% −1.09% −1.09% 0.49% −0.07%
13 pt −1.64% −1.64% 0.01% −1.45% 0.37%
14 pt 0.41% 0.41% −0.34% 1.96% 0.34%
15 pt 2.72% 2.72% 2.72% 0.08% 0.08%
16 pt −2.31% −2.41% −0.75% −1.61% −0.04%
18 pt 1.39% 1.39% 1.39% 0.13% 0.12%
20 pt 0.00% 0.00% −0.73% 1.38% 0.19%
24 pt 0.60% 0.46% 0.46% 0.13% 0.02%
28 pt −1.08% −2.11% 1.88% −0.64% −0.06%
32 pt 1.24% 1.14% −0.56% 0.92% 0.09%
36 pt −0.16% −0.17% −0.17% 0.13% −0.11%

“Hinting” vs Text Layout: Improving the average advance width error of the typeface Calibri with advanced “hinting” in both full-pixel and fractional pixel positioning (“sub-pixel” positioning). A negative number indicates that, on average, weighted by letter frequency, the characters are too narrow, while a positive number indicates they are too wide. The closer to 0 (zero) this number is, the better.

This table illustrates that if you use the font Calibri at 11 point and on a 96 DPI screen, on average the advance widths will be rendered 2.98% too wide. Calibri is the default font in Word since 2007, 11 point is the default type size, and 96 DPI is the most frequent screen resolution. As far as I understand, Word 2007 uses “natural” advance widths (cf ), and so does the above table (column “Vista SP1 natural widths”).

This table also illustrates that if the font “hinting” tries to work around unnecessary compounding of rounding operations, at 11 point said error can be reduced to 1.04% (column “integer ppem sizes & widths”). The ppem size is still rounded to the nearest integer, but the unnecessary re-rounding to the nearest 1/16 has been “by-passed” (cf ).

This table moreover illustrates that if the font “hinting” accepts to handle fractional ppem sizes (cf ), at 11 point said error can be reduced to 0.32% even before committing to fractional pixel positioning (column “fractional ppem sizes, integer widths”). This has the advantage that individual characters can have their stem positions optimized for maximum rendering contrast (“sharp” stems).

Last but not least, this table illustrates that, when forcing integer ppem sizes, fractional advance widths (cf ) don’t necessarily fare much better than integer advance widths (column “Vista SP1 fractional widths”). In fact, the 2.40% weighted average error for “innocent” fractional pixel positioning fares worse than the 0.32% weighted average error for “educated” full-pixel positioning. Surprisingly, at −0.39%, even educated fractional pixel positioning fares worse—if only slightly so—than educated full-pixel positioning.

Following are a few more observations in no particular order:

As you can see, “hinting” can help, and sometimes quite significantly so, but it cannot guarantee perfect scalability. Rounding, whether to the nearest full-pixel or to the nearest sample, continues to harbor an element of chance. For the best end-user experience, the layout software will have to factor in these residual advance width errors.

The irony of all the technical advances and rounding errors is that, in some sense, we have not made any progress since Gutenberg printed the first book in Europe. Many websites continue to make assumptions about the page size, the aspect ratio, the font size, and the resolution (DPI) they are “designing” for. Select a minimum font size in Firefox—a feature much appreciated by yours truly—and be ready for broken layouts!

HTML, XML, CSS, or what have you: There is a need for an advanced concept that may look like “hinting” but has the goal to describe or formalize the design intent of page layout. The intent is not to position some “side-bar” with w×h in pixels at x,y in pixels. The intent is to position it “on the side,” maybe taking up such-and-such percentage of the text column, or fitting it within the height of your or my screen such that you or I can read it on-screen in its entirety without panning—nor printing.

Up to this point we have looked at a comprehensive—if somewhat abstract—definition of “hinting,” along with a plethora of illustrated examples. “Hinting” can do this, should do that, may be able to do so-and-so and what not, and it had better do so on any device, at any point size, for any end-user’s preferences, etc.

If you are in the business of making fonts, which usually includes some form of “hinting,” you may wonder how many delta exception instructions (cf ) this is going to take. The sheer combinatorial multiplicity may seem overwhelming. You may think that nobody is going to pay you for this Sisyphean task.

Therefore, in this section we will discuss ways to go beyond “delta-hinting.”

Consider, for example, the common “hinting” task of positioning a horizontal stroke or crossbar between two reference lines like the baseline and the cap height. A simple and often seen approach to do so is to “interpolate” one of the crossbar’s edges, followed by “linking” across the crossbar, as illustrated below.

A conventional “hinting” approach to an UC ‘B’ (Bodoni, pixels shown for 12 pt at 96 DPI)
⇒ Hover your mouse over the above illustration to add the “hints” (not shown: CVTs and minimum distances; not used: Deltas)

Be sure to hover your mouse over the above illustration to see how the “hinting” is added. Following is the code responsible for positioning the crossbar, along with its graphical representation:

VTT Talk
(Typeman Talk)
/* Anchor (round) points 19 & 4
   on baseline and cap height */
YAnchor(19)
YAnchor(4)

/* Interpolate & round point 33
   on bottom edge of crossbar */
YIPAnchor(19,33,4)

/* Link to point 24
   on top edge of crossbar */
YLink(33,24)
TrueType
SVTCA[Y]         /* constrain in the y-direction */
MDAP[R], 19      /* anchor and round point 19 (baseline) */
MDAP[R], 4       /* anchor and round point 4 (cap height) */
SRP2[], 19       /* second reference point = 19 */
IP[], 33         /* interpolate point 33 between points 4 and 19 */
MDAP[R], 33      /* round the interpolated point 33 to the grid */
MDRP[m>RBl], 24  /* “link” point 24 to point 33 (minimum distance, round, black) */

A conventional “hinting” approach to positioning a crossbar, taken from the previous illustration.
Left: Visual representation; top: VTT Talk; bottom: TrueType source code. Notice that to simplify the argument no CVTs were used
⇒ Hover your mouse over the visual representation to show the YAnchor, the YIPAnchor, and the YLink VTT Talk commands.

The above snippet of code looks like a straightforward approach—until you see the results on a range of point sizes:

Results of a conventional “hinting” approach to positioning a crossbar (“bottom-up interpolate-and-link,” Bodoni UC ‘B,’ rendered in ClearType at 6 to 24 pt on 96 DPI)
⇒ Hover your mouse over the illustration to enlarge it to 200%

Notice how—at 6, 8, 10, 13, 15, 17, 22, and 24 pt—the crossbar appears to be “pushed up” above the position at which it was designed. This is not good. At 6 pt this is obvious to any end-user, while even at 22 pt this is very noticeable. Be sure to hover your mouse over the above illustration to enlarge it, and compare the position of the crossbar of 22 pt with that of 23 pt: the proportions by which the crossbar partitions the ‘B’ don’t seem to match the design.

Let’s try this again. The previous example “interpolated” the bottom edge of the crossbar, followed by “linking” to the top edge (“bottom-up interpolate-and-link” approach). The following example will reverse these roles: “Interpolate” the top edge, followed by “linking” to the bottom edge (“top-down interpolate-and-link” approach).

Without repeating the respective snippets of code, here are the results on the same range of point sizes.

Results of a conventional “hinting” approach to positioning a crossbar (“top-down interpolate-and-link,” Bodoni UC ‘B,’ rendered in ClearType at 6 to 24 pt on 96 DPI)
⇒ Hover your mouse over the illustration to enlarge it to 200%

Notice how—at 7, 9, 11, and 12 pt—the crossbar appears to be “pushed down” below the position at which it was designed. This is not all that much better than what we had before.

At this point in the process, if you are in the business of “hinting” fonts on a “deadline,” you may not have the luxury to analyze what is really going wrong here. Instead, you may add delta exceptions to the offending ppem sizes (cf ), and move on to the next character. The end-users won’t know how you got there.

Now imagine sitting in the boardroom, across [the short side of] the table from Bill Gates, giving a quick and casual demo of the graphical user-interface of VTT. At the time I was “hinting” an uppercase ‘B’ because it happens to be the first letter of my first name, as well. Once I got to the “interpolate-and-link” part for the crossbar, Bill interrupted: “Wouldn’t it be smarter if […]?”

At this point in the process I did not have the luxury of not having a good answer ready. But since I had done my homework—which we will get to shortly—the remainder of the demo remained “uneventful.” The thought that crossed my mind was that, while it is safe to assume he never “hinted” any fonts for a living, Bill Gates got the gist of what’s wrong with “hinting” within a minute or two. Scary—in more than one way.

So let’s talk about the homework. I had previously analyzed how the ad-hoc method of interpolating and linking accumulates rounding errors. The linked edge is subject to a minimum distance constraint and gets rounded on top of the rounding error already incurred by the “rounded interpolation.” Feel free to double-check this analysis, but we’ll need a more systematic strategy regardless.
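
For readers who prefer to double-check with numbers rather than algebra, here is a minimal sketch of that chain of rounding errors. It is written in Python rather than TrueType, and the font-unit coordinates are made up solely for illustration (they merely stand in for the actual Bodoni outlines): the interpolated edge gets rounded to the grid, and the linked edge gets rounded, and clamped to a minimum distance, on top of that.

import math

def nearest(x):                        # round-to-nearest, ties rounding up
    return math.floor(x + 0.5)

def interpolate_and_link(cap_du, bar_bottom_du, bar_top_du, ppem, upem=2048):
    scale = ppem / upem                # design (font) units -> pixels
    cap_px = nearest(cap_du * scale)   # YAnchor: baseline at 0, cap height rounded
    # YIPAnchor: interpolate the bottom edge between the two anchors, then round
    bottom_px = nearest(cap_px * bar_bottom_du / cap_du)
    # YLink: round the crossbar weight, subject to a 1 px minimum distance
    weight_px = max(nearest((bar_top_du - bar_bottom_du) * scale), 1)
    return bottom_px, bottom_px + weight_px, cap_px

# Hypothetical values: cap height 1400, crossbar from 730 to 790 design units
for ppem in (8, 13, 17, 22):
    bottom, top, cap = interpolate_and_link(1400, 730, 790, ppem)
    ideal = 730 / 1400 * cap           # proportional position before any rounding
    print(f"{ppem:2d} ppem: crossbar {bottom}..{top} px, unrounded bottom {ideal:.2f} px")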

For this strategy, consider this: The crossbar partitions the space between the baseline and the cap height. For reasons of type design beyond my qualifications to explain, the two partitions don’t have to be equal. For instance, for the above uppercase ‘B’ of Bodoni, the bottom partition, from the baseline to the bottom edge of the crossbar, amounts to about 52.4% of said space, while the top partition, from the top edge of the crossbar to the cap height, gets the remaining 47.6%. The crossbar is “a little bit above the middle.”

Developing a systematic strategy for positioning a crossbar: The crossbar partitions the space between the baseline and the cap height into a bottom and a top “counterform” following the proportions as annotated in the above illustration for Bodoni

Granted, Giambattista Bodoni himself hardly reasoned about his font design in terms of “partitioning percentages.” He designed the uppercase ‘B’ the way he saw fit, with the middle “hairline” positioned at the “visual center” of the baseline and the cap height. I’m obliged to assume that this was his design intent.

But how much is this in terms of pixels? That’s where the percentages come in. For instance, at 10 pt and 96 DPI, there are 9 pixels between the baseline and the cap height. Subtract 1 pixel for the middle “hairline” and there are 8 pixels left to partition. The bottom partition gets 52.4% thereof, which amounts to 4.19 pixels before rounding. Once rounded to 4 pixels, this leaves the remaining 4 pixels for the top partition.

Developing a systematic strategy for positioning a crossbar: Proportional allocation of pixels for the counterforms positions the crossbar more closely to the design intent
(Bodoni UC ‘B,’ rendered in ClearType at 10 pt and 96 DPI)
⇒ Hover your mouse over the above illustration to see the results of the conventional ad-hoc “bottom-up interpolate-and-link” approach

Be sure to hover your mouse over the above illustration to see what this looked like with the ad-hoc “interpolate-and-link” method. I think the new strategy comes closer to the design intent.

Likewise, at 9 pt and 96 DPI, there are 8 pixels between the baseline and the cap height. After subtracting 1 pixel for the middle “hairline,” this leaves 7 pixels to partition. Again, the bottom partition gets 52.4% thereof, or 3.67 pixels. Once rounded to 4 pixels, this leaves the remaining 3 pixels for the top partition.

Developing a systematic strategy for positioning a crossbar: Proportional allocation of pixels for the counterforms positions the crossbar more closely to the design intent
(Bodoni UC ‘B,’ rendered in ClearType at 9 pt and 96 DPI)
⇒ Hover your mouse over the above illustration to see the results of the conventional ad-hoc “top-down interpolate-and-link” approach

Once again be sure to hover your mouse over the above illustration to see what this looked like with the ad-hoc “interpolate-and-link” method. I think the new strategy again comes closer to the design intent. Bodoni designed the “hairline” to be positioned “a little bit above the middle,” not below.

This begins to look like a promising strategy. To use this strategy in a TrueType font, it needs to be translated into—well—TrueType code. But before rushing into coding, recall the idiosyncrasies of the TrueType rasterizer as discussed in . We’ll want to minimize any “surprises” in algebra first.

To that end, remember that 52.4% means 52.4/100, which is equivalent to part/total. The part we are looking at is the bottom partition, or height of the bottom counterform space, while the total is the sum of the bottom and top partitions, or the sum of the respective counterform heights.

In turn, this gets me back to the concept of the constraint: We’ll want the positioning of the crossbar to be constrained such that, in theory,

bottomPartition/sumOfPartitions = bottomPartitionPixels/sumOfPartitionsPixels.

The left-hand side of this constraint represents a ratio or proportion of distances in design coordinates (font design units), while the right-hand side represents the equivalent in device coordinates (pixels). In TrueType, the best we can do for the ratio of design coordinates is to use TrueType’s “original” coordinates (cf ), while the sumOfPartitionsPixels can be determined as discussed above (“measure” the number of pixels from the baseline to the cap height, and subtract the [number of] pixel[s] used for the crossbar).

This leaves bottomPartitionPixels as the only dependent variable, which we can determine by cross-multiplication, followed by rounding. Thus, in [pixel-] practice:

bottomPartitionPixels
= Round(bottomPartition × sumOfPartitionsPixels/sumOfPartitions).
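
As a quick sanity check of the arithmetic, here is the same constraint written out in Python (not the shipping TrueType code). The 52.4%/47.6% split is the one quoted above for the Bodoni ‘B,’ and the pixel counts correspond to 10 pt and 9 pt at 96 DPI:

def bottom_partition_pixels(bottom_part, top_part, sum_of_partitions_px):
    """Allocate pixels to the bottom counterform by cross-multiplication,
    rounded to the nearest pixel; the remainder goes to the top counterform."""
    unrounded = bottom_part * sum_of_partitions_px / (bottom_part + top_part)
    return int(unrounded + 0.5), unrounded

# 8 px to partition (10 pt at 96 DPI) and 7 px (9 pt at 96 DPI); 52.4 and 47.6
# stand in for the corresponding distances measured in design coordinates.
for px in (8, 7):
    bottom_px, raw = bottom_partition_pixels(52.4, 47.6, px)
    print(f"{px} px to partition: bottom {raw:.2f} -> {bottom_px} px, top {px - bottom_px} px")

This reproduces the 4.19 → 4 and 3.67 → 4 figures of the preceding paragraphs.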

In the following “first cut” of the TrueType code, the above variables are abbreviated as follows: bottomPartition becomes b, sumOfPartitions becomes T, sumOfPartitionsPixels becomes T′, and bottomPartitionPixels becomes b′ (cf the stack comments in the code).

Also, “partitions” are referred to as “counterforms” or “cforms” for short, while the concept of a “crossbar” is abbreviated as “xbar.” Lengthy, “verbose” variable names—even if only used in comments—aren’t always as expressive as one might think. Shorter names can help to see complexities at one glance. Here is the code:

FDEF[], 101
/* CALL[],bl,bc,tc,ch,101
bl: baseline point
bc: bottom xbar point
tc: top xbar point
ch: cap height point
Function constrains xbar
between baseline and cap-
height while respecting
cform proportions */

#BEGIN
#PUSHOFF

/* S: bl,bc,tc,ch
get orig cform props */
SVTCA[Y]
DUP[]
#PUSH, 3
CINDEX[]
MD[O]
#PUSH, 4
CINDEX[]
#PUSH, 6
CINDEX[]
MD[O]
DUP[]
ROLL[]
ADD[]

/* S: bl,bc,tc,ch,b,T
get tot #px for cforms */
ROLL[]
#PUSH, 6
CINDEX[]
MD[N]
#PUSH, 5
CINDEX[]
DUP[]
SRP0[]
#PUSH, 5
CINDEX[]
DUP[]
MDRP[m>RBl]
MD[N]
ADD[]

/* S: bl,bc,tc,b,T,T′
calc #px for bot cform */
ROLL[]
MUL[]
SWAP[]
DIV[]
ROUND[Bl]

/* S: bl,bc,tc,b′
calc offs & adj xbar */
#PUSH, 3
CINDEX[]
#PUSH, 5
MINDEX[]
MD[N]
SUB[]
DUP[]
ROLL[]
SWAP[]
SHPIX[]
SHPIX[]

#PUSHON
#END
ENDF[]

“First cut” of a TrueType function implementing a systematic strategy to constrain a crossbar between a pair of “reference lines,” such as the “baseline” and the “cap height”

While this is what I would consider a “first cut,” or a “version 0.0,” make no mistake: this is working code! Both preceding illustrations involving an improved crossbar positioning strategy for Bodoni use this code. And so does the following illustration:

Deploying function 101 above for systematically positioning crossbars (Bodoni UC ‘B,’ rendered in ClearType at 6 to 24 pt on 96 DPI)
⇒ Hover your mouse over the illustration to enlarge it to 200%

Below is an enlarged copy of the above illustration of my positioning strategy, combined with the original ad-hoc “interpolate-and-link” method “underneath:”

Deploying function 101 above for systematically positioning crossbars (same as previous illustration except already enlarged to 200%)
⇒ Hover your mouse over the illustration to revert to the ad-hoc “bottom-up interpolate-and-link” method and observe the differences

Be sure to hover your mouse over the above illustration to see the difference. I think my positioning strategy comes as close to the design intent as the size of the pixels allows. None of the crossbars appear unnaturally “pushed up” or “pushed down,” yet none of the above ppem sizes required any delta exception to do so. The strategy is the rule, not the exception.

Now, while this code works well for the Bodoni UC ‘B’ from 6 to 24 pt, to turn it into “industrial strength” working code, there are a few aspects missing. I left them out for didactic purposes, to avoid making the above argument more complicated than necessary. But at least 4 improvements come to mind—not necessarily in this order:

As the textbooks from the world of academia would say, generalizing the above algorithm is left as “an exercise for the reader.”

Whether or not you’re inclined to turn the above prototype into an “industrial strength” solution is up to you. With a solution like that, positioning crossbars can become routine—free of any worries about having to add delta exceptions. You’ll no longer need the time, money, and memory “footprint” for a number of individual deltas.

At the same time, with a solution like that, you can start to free yourself of any worries about the use of fractional ppem sizes (cf ). If you don’t need deltas to get your “hinting” to do what you mean, switching to fractional ppem sizes can become a “no-brainer.” As the analysis in has illustrated, this can improve the scalability of text and reduce the chances of “dashing” characters in scalable layouts.

If, in the past, you have looked at “hinting” as a way of “reacting” to all kinds of raster tragedies, this may come as somewhat of a surprise. “Hinting,” in the sense of constraining the scaling of outline fonts, is all about acting, not “reacting.” You’ll want to know upfront what you want to target, think of a strategy to meet this target, and then implement said strategy in code.

This applies to TrueType in particular. Like it or not, TrueType is a programming language. Use it for encoding constraint behavior, but don’t use it for “pixel-popping.” Seriously! If you are using deltas for every pixel you don’t like, you’re not “hinting.” You are using a technophiliac disguise for “pixel-popping”—a job which you could get done faster with a bitmap editor.

Consider, for instance, a “round character” like an uppercase ‘O.’ Depending on how the control points “fall” on the grid, you will get extraneous pixels and generally asymmetric pixel patterns—the “usual” raster tragedies. Granted, the ‘O’ may not be designed with perfect symmetry, but at sufficiently small sizes the coarseness of the pixels simply may not permit rendering any asymmetries.

Beyond “Delta-Hinting:” Applying “basic hints” to a UC ‘O’ leaves quite a few unlikable pixels “behind” for subsequent “pixel cleanup” (Calibri UC ‘O,’ bi-level rendering, 8 to 18 ppem, enlarged to 200%. Notice that this is not the RTM version of Calibri)

This looks sub-optimal. There are both pixels that look “out-of-place” and pixels that throw the character “out-of-symmetry.” That’s quite a few unlikable pixels.

At the same time there are not that many different ways to render a “quarter turn” with a limited number of pixels. Figure out what configuration of control points gets you which pixel pattern. Determine which pixel patterns are acceptable, and which aren’t. Now translate this strategy into [TrueType] code, and you’re done with the “clean-up” of the ‘O,’ for the rest of your “hinting career”.
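
To make the flavor of such a rule concrete, here is one conceivable rule sketched in Python, with an arbitrary threshold. This is not the actual function used for Calibri; it merely illustrates the kind of decision such a function encodes: measure how far the grid-fitted apex of a round stroke reaches into its topmost pixel row and, if it barely grazes that row, pull the extremum back so that no lone, extraneous pixel is left to chance.

def clean_up_extremum(apex_y_px, threshold=0.5):
    """apex_y_px: grid-fitted y of a round extremum, in (fractional) pixels.
    If the apex only grazes its topmost pixel row (reach below the threshold),
    snap it back to the row boundary; otherwise leave it alone."""
    reach = apex_y_px % 1.0            # how far the apex reaches into its row
    if 0.0 < reach < threshold:
        return apex_y_px - reach       # drop the barely-touched row
    return apex_y_px                   # keep the row (it is well covered)

for y in (7.10, 7.45, 7.60, 8.00):
    print(f"apex at {y:.2f} px -> {clean_up_extremum(y):.2f} px")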

Beyond “Delta-Hinting:” Adding a systematic strategy—implemented as a TrueType function—to the “basic hints” for a UC ‘O’ nicely “cleans up” all extraneous pixels (else same as above. Notice, however, that once more this is not the RTM version of Calibri)
⇒ Hover your mouse over the above illustration to revert to the version without “pixel cleanup”

That’s quite a difference! Yet none of the above ppem sizes uses any delta exception whatsoever. In fact, nothing in the code to produce the above illustration is ppem specific. Said code does not contain any of the following instructions: DELTAPn[], DELTACn[], and MPPEM[].

How does this strategy compare to the RTM version of Calibri? Following is a juxtaposition:

Beyond “Delta-Hinting”—or not: Comparing the RTM version of Calibri, using individual delta exceptions and embedded bitmaps, with a systematic strategy for “pixel cleanup” (same as above, except this time around it is the RTM version of Calibri)
⇒ Hover your mouse over the above illustration to revert to the systematic strategy for “pixel cleanup” and be sure to observe the differences

Notice that the RTM version appears to “miss” a few extraneous pixels—most likely because no delta exceptions were applied, or because they weren’t deemed necessary at the respective ppem sizes. When you have to “delta-hint” every size individually, it may be a trade-off between getting every pixel right and not spending the extra time, money, and memory on seemingly uncommon ppem sizes.

By contrast, when you have a strategy for “cleaning-up” pixels around the round strokes, and when you have this strategy implemented in your fpgm template, all it takes is a single function call, and you’re done with the “clean-up,” including the seemingly uncommon ppem sizes. With some of today’s applications allowing you to “zoom” to 91.67% and similar, there are hardly any “unused” ppem sizes—including fractional ppem sizes (cf ).

Now consider this: The code for the ‘O’ is a mere 51 bytes. This includes all the “pixel clean-up” illustrated above, and it’s the same 51 bytes for the uppercase ‘O,’ the lowercase ‘o,’ and the figure ‘0’ (the “zero”). For the purpose of my set of strategies, these glyphs are close enough to use the same code pattern, and hence yield the same code size.

Granted, the two-storey ‘g’ will be a bit more involved to “hint.” Accordingly, it will require more function calls to the fpgm and add more bytes to the code. But hopefully the following table illustrates the “trend” in font “hinting” I would prefer to see. It compares the code size of the UC ‘O,’ the lc ‘o,’ and the figure ‘0’ (zero) of a number of popular fonts.

Font (Vista SP1 RTM)      UC ‘O’ (bytes)    lc ‘o’ (bytes)    fig ‘0’ (bytes)
Arial                                127               193               181
Times New Roman                      177               344               186
Courier New                          508               328               429
Verdana                              190               250               135
Tahoma                                69                56                88
Georgia                              141               164               104
Palatino Linotype                    661               682               185
Calibri                              124               168               234

Memory “footprint” of various fonts when “hinted” by eye.
Compare to the 51 bytes it takes for a suitable set of strategies

These numbers speak for themselves. “Hinting” by strategy can lead to dramatic reductions of code size. While this may not be your primary concern for fonts used on your desktop or laptop computer, consider this:

By this point you may have started to wonder what seems to be the obsession with “pixel clean-up,” now that the “trend” in font rendering is to use some form of anti-aliasing. The simple reason is that I do not consider it safe to assume that every end-user is comfortable with anti-aliasing. This may include any of the anti-aliasing schemes we have discussed in chapter .

At the same time I wanted to illustrate that with appropriate strategies, implemented as functions in the fpgm template, “pixel clean-up” can reduce to a “routine” call to a [set of] function[s], as opposed to obsessive-compulsive “delta-hinting.”

Caring for pixel pattern optimizations in bi-level rendering does not have to exclude or impede optimal rendition in any of the anti-aliasing schemes. For instance, the following opportunities discussed in chapter are applicable to the uppercase ‘O:’

Rest assured that all of the above are included in the aforementioned 51 bytes! Getting the pixels right in bi-level rendering is “just icing on the cake.” Following is a table that illustrates what 51 bytes worth of function calls can do to the uppercase ‘O’ (labeled as “experimental”) when compared to more “traditional hinting methods” (labeled as “Vista SP1 RTM”):

[Table of illustrations: four pairs of columns, each labeled “Vista SP1 RTM” and “experimental”]

Seizing opportunities without “bloating code:” Compare the RTM versions of Calibri, “hinted by eye,” with the experimental versions, “hinted” by a set of strategies implemented as functions in TrueType’s font program (Calibri UC ‘O,’ 6 to 24 pt at 96 DPI)
⇒ Hover your mouse over any of the above illustrations to enlarge it to 200%

As you can see, the following opportunities have been “seized”—at least by the “hinting” method labeled as “experimental:”

Moreover, the above 51 bytes also include a call to a function that fixes what can be fixed in the chain of rounding errors discussed in (corresponds to column “integer ppem sizes & widths” in the table of ). Together with the absence of delta instructions and other ppem specific code, this allows using fractional ppem sizes (cf ). In turn, as illustrated at the end of the previous section, this generally improves the scalability of text used in scalable text layouts.

You may have noticed by now that in some instances my strategy yields an uppercase ‘O’ with proportions that slightly differ from the RTM version of Calibri that ships with Vista SP1, particularly in bi-level rendering. As we have previously discussed in and , this is a matter of prioritizing black-body width, side-bearings, and advance width.

In my strategy, I have tried to “balance” the black-body width with the side-bearings while giving priority to the advance width. The choice to prioritize the advance width reflects the expected usage scenario: As discussed in the previous section, chances are that the font is expected to scale. Hence I apply the aforementioned function to “fix the advance width” and then take it from there.

But—and this takes me to the core of my approach—this priority, and the associated strategy for balancing the black-body width with the side-bearings, is not “hard-wired” into each character’s “hinting” code. There are no deltas that happen to be there because I thought that at one ppem size it would be better to “tweak” the black-body width in favor of the side-bearings, while at another I may have decided otherwise.

Instead of being implied in the result of applying a set of delta instructions, I made the strategy explicit. Take the advance width as “budget” and then “allocate” a number of pixels for the black-body width and the side-bearings following a cascade of decisions. All these decisions are explicitly implemented in a [set of] TrueType function[s].
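
A minimal sketch of that “budget” idea, in Python and with placeholder rounding and tie-breaking rules (they are not my actual cascade of decisions), might look like this: fix the advance width first, allocate the black-body width against that budget, and split whatever is left between the two side-bearings.

def allocate_advance_width(advance_du, black_body_du, upem, ppem):
    """Fix the advance width first, then allocate pixels to the black-body
    width and split the remainder between the side-bearings (placeholder
    rule: the left side-bearing gets the smaller half)."""
    scale = ppem / upem
    advance_px = max(round(advance_du * scale), 1)
    black_px = min(max(round(black_body_du * scale), 1), advance_px)
    bearings_px = advance_px - black_px
    lsb_px = bearings_px // 2
    rsb_px = bearings_px - lsb_px
    return advance_px, lsb_px, black_px, rsb_px

# Hypothetical ‘O’-like metrics in a 2048 upem font
for ppem in (9, 11, 13):
    print(ppem, allocate_advance_width(advance_du=1300, black_body_du=1100,
                                       upem=2048, ppem=ppem))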

If you are in the business of making fonts, think about this: What would you do if tomorrow your client asked you to “re-hint” their corporate font but optimized for a reflowable layout, as opposed to the scalable layout they asked for initially? And what would your client do if you offered them a mere 20% discount compared to the original “hinting job?” Because, after all, you may have to “re-delta” quite a few of the characters.

Likewise, if you are “hinting” fonts, recall the table in , where we looked at a range of “hinting” methods combined with current rendering methods. Guess how many times I had to “re-hint” these characters to produce the illustrations? None!

The “first cut” is flexible enough that I didn’t have to “re-hint” it for strategies 2 through 8. All I had to do was to toggle a couple of “switches,” “dial” different rounding “granularities,” or “tweak” the minimum distances—all strategically centralized in the prep (TrueType’s “preparation program”).
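
In spirit, those “switches” amount to a central table of parameters that every strategy function consults. Here is a minimal sketch of the idea in Python, with hypothetical names and values; the real thing lives in the prep as storage locations and control values rather than in a dictionary:

# Hypothetical, centralized settings consulted by every strategy function;
# toggling them re-targets the whole font without re-"hinting" any glyph.
STRATEGY_SETTINGS = {
    "rounding_granularity": 1.0,        # 1.0 = full pixel, 0.5 = half pixel
    "min_distance_px": 1.0,             # minimum stroke weight in pixels
    "prioritize_advance_width": True,
    "optimize_strokes_for": "contrast", # or "fidelity" to the outlines
}

def round_to_granularity(x, settings=STRATEGY_SETTINGS):
    g = settings["rounding_granularity"]
    return round(x / g) * g

print(round_to_granularity(4.19))       # 4.0 at full-pixel granularity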

This is what a programming language like TrueType can do for you when you use it for encoding constraint behavior, instead of “pixel-popping.” Hence, while I am striving to avoid unnecessary “code bloat,” to me the fundamental advantages of “hinting” by strategies are:

Continuing in the above direction of thinking, I am beginning to question why I have to preempt some of the decisions in my strategies, even if these decisions are centralized in my fpgm and prep templates. For instance, why do I have to assume that the prevailing usage scenario of my font is going to be a scalable layout? Why can’t the application communicate this fact with my font?

Today the application (or the operating system) tells the rasterizer that it wants, for example, an 11 point Calibri at 96 DPI, to be rendered with one of the rendering methods discussed in chapter . Moreover—in case the application (or the operating system) wants ClearType—it gets to tell the rasterizer if this should use “natural” advance widths and whether the sub-pixels come in RGB or BGR order, or if it should rather use fractional advance widths and positioning.

In turn, the font gets to query the rasterizer about these parameters and, if need be, act upon them. For instance, if the font is to be rendered in ClearType, and if the font decides to optimize stroke positioning for maximum rendering contrast (cf ), it can discover whether it should assume the sub-pixels to come in RGB or BGR order, as the two orders may call for different optimization strategies.

However, the font cannot discover if the end-user prefers to have the stroke positioning optimized for maximum rendering contrast or for maximum faithfulness to the outlines. Likewise, it cannot discover if the application implements a reflowable (resolution dependent) or a scalable (resolution independent) layout. Generally, your or my strategies to implement certain “hinting” tasks may or may not be compatible with the expectations of the applications that use the font, or may or may not represent the preferences of the end-users.

Now, with “a little bit of plumbing” in the rasterizer, these decisions could be exposed to the applications and the end-users. In turn, this would empower the end-users to choose if they want their stems optimized for maximum rendering contrast or maximum faithfulness to the outline. Likewise, it would empower the applications to choose a reflowable or a scalable “version” of one and the same font.

Wishful thinking? Maybe—maybe not. It’s no “rocket science” to do the “plumbing.” The overriding problem may be how to expose these preferences to the end-users. “Advanced settings” come to mind—an eponymous button that tends to give me a bit of a queasy feeling—except for those rare cases where I think I know what I’m getting into… Hence, a set of defaults—reflecting “typical” end-user preferences—combined with an illustrated “advanced settings wizard” may be the answer.

But before we get the control panel for qualified typophiles, day-to-day “hinting” may first have to prove it qualifies for such sophistication. Like the webpages “optimized” for 1024×768 pixels, most “hinting” I’ve seen is rather “static.” It doesn’t begin to adapt to different layout scenarios or user-preferences. It is what it is—frozen in time—the pixels the author of the “hints” wanted to see at that time…

*****

To conclude this chapter, I’ll repeat the table shown at the end of chapter , summarizing the 4 rendering methods, but I’ll add the properties we have discussed in the meantime:

Rendering Method →                              bi-level          full-pixel        sub-pixel         hybrid sub-pixel
↓ Property                                      rendering         anti-aliasing     anti-aliasing     anti-aliasing
smoothing (x-direction)                         no                yes               yes               yes
smoothing (y-direction)                         no                yes               no                yes
exploits LCD sub-pixel structure                no                no                yes               yes
“bleeding” (x-direction)                        no                no(a)             yes               yes
“bleeding” (y-direction)                        no                no(a)             no                yes(b)
need to “clean-up” pixel patterns               yes               no                no (x), (?) (y)   no
can render fractional stems (x-direction)       no                yes               yes               yes
can render 1 px stems in black                  yes               yes(a)            no                no
max W3C contrast for 1 px stems                 21.00             21.00             12.16…13.27(c)    12.16…13.27(c)
min ppem(d) for “black core” stems              11(e)             18…20(f)          19…21(f)          19…21(f)
can render fractional crossbars (y-direction)   no                yes               no                yes
can render 1 px crossbars in black              yes               yes(a)            yes               no
max W3C contrast for 1 px crossbars             21.00             21.00             21.00             15.09
min ppem(d) for “black core” crossbars          12…14(e,f)        21…24(f)          12…14(e,f)        26…30(f)
can render fractional under- and overshoots     no                yes               no                yes
can render fractional serif “weights”           no                yes               yes (x), no (y)   yes
can render stroke design contrast faithfully    no…kinda(g)       no…yes(g)         no…kinda(g)       no…yes(g)
can use fractional advance widths               no                no…yes(h)         yes               yes
can use fractional ppem sizes                   yes               yes               yes               yes
can render text in color on color               yes               yes               kinda             kinda
sensitivity to gamma correction                 no                yes               yes               yes
sensitivity to sub-pixel structure              no                no                yes               yes
can display text at ±90°                        yes               yes               kinda             kinda…yes(i)
bits per pixel for cache or processing          1                 4                 8                 32

Opportunities: Properties of 4 different rendering methods revisited

  (a) Assuming a box filter
  (b) Assuming a sinc filter
  (c) Assuming a 1 pixel wide stroke strictly aligned with pixel boundaries (like bi-level “hinting”), the lower figure represents the contrast between the “purplish core” and the “bleeding” on the right edge, while the higher figure represents the same for the left edge. For any other [fractional] stroke position (cf ), and for any amount of software γ-correction (cf ), expect lower figures.
  (d) Assuming a “typical” font like Arial, the same as in
  (e) Assuming stems and crossbars are not constrained to a minimum distance of 1 pixel; otherwise this figure could be as low as the smallest ppem size at which TrueType instructions are “on.”
  (f) Lower figure applies to uppercase Arial, while higher figure applies to lowercase (because the lowercase stems and crossbars are thinner than the uppercase ones).
  (g) This depends on the actual “hinting:” With appropriate “hinting,” the property “stroke design contrast,” for example, can be rendered in full-pixel anti-aliasing as faithfully as the “granularity” of the samples permits. When “hinting” strokes to full-pixel boundaries without correlating horizontal to vertical stroke weights, this can render “stroke design contrast” about as unfaithfully as naïve bi-level rendering.
  (h) In theory this should be possible—somehow anyway—but I never got around to trying it out. If fractional advance widths look a little bit blurry at small type sizes and low resolutions when using or , I would expect them to look even blurrier at these sizes when using ; cf also the “Schillerstein” example in .
  (i) This depends on both the “hinting” and the rendering respecting the actual LCD sub-pixel structure (cf ).

As you can see, if you’re in the business of making fonts, you’ve got opportunities. Seize them!
