The Raster Tragedy at Low-Resolution Revisited:
Opportunities and Challenges beyond “Delta-Hinting”

In the preceding chapters we have looked at anti-aliasing methods that enable opportunities, and at the rasterizer that turns some of these opportunities into challenges. The anti-aliasing opportunities require a clear concept of what we can do when we constrain outlines to overcome the pixel size of low-resolution devices. The rasterizer challenges require deep insight into its inner workings in order to know what we have to do when we program this obscure assembly language called TrueType.

Yet, however involved some of the TrueType coding may be, we are in control of pretty much any aspect of the constraint system. We can discover if a font is about to be rendered in bi-level or in one of the anti-aliasing methods. We can round the positions and widths of strokes according to the discovered rendering method. And, if we have to use them, we can restrict delta instructions to the rendering method they apply to.

By contrast, once we have coded up all the constraints, or once the rasterizer has interpreted all the encoded instructions, we are no longer in control. For instance, we cannot control how the end users have set up their computers. This presents a number of obstacles that can make the most legible font hard to read.

Therefore, in this chapter we will look at what happens to the “perfectly” constrained outlines on the path from generating the pixels all the way to the end user “experience.” May this compilation be useful to the font makers, the software developers, and most importantly to the end users: It all has to work together!

In earlier chapters we have discussed fractional stroke positions and weights as an opportunity to reduce exaggerations and distortions, and we have expounded the problems of rounding to sample boundaries. Ideally, the “rounding fractions” of the stroke positions and weights used in “hinting” should be “in sync” with the sample boundaries implemented by the operating system’s rasterizer.

The following table compiles the typical (as used in “hinting”) and optimal (for compatibility with Windows) pixel fractions and their corresponding oversampling rates used in the process:

Rendering Method →            Bi-Level    Full-Pixel       Asymmetric Sub-Pixel   Hybrid Sub-Pixel
Pixel Fraction &                          Anti-Aliasing    Anti-Aliasing          Anti-Aliasing
↓ Oversampling Rate

x dir   Typical Fraction      1           1                1/16                   1/16
        Typical Rate          1×          1×               16×                    16×
        Optimal Fraction      1           1/4              1/6                    1/6
        Optimal Rate          1×          4×               6×                     6×

y dir   Typical Fraction      1           1                1                      1
        Typical Rate          1×          1×               1×                     1×
        Optimal Fraction      1           1/4              1                      1/5
        Optimal Rate          1×          4×               1×                     5×

Typical and optimal pixel fractions and corresponding oversampling rates used by 4 different rendering methods for stroke positions and weights

For instance, to make the most out of full-pixel anti-aliasing, fonts ideally round stroke positions and weights to the nearest 1/4 of a pixel, corresponding to the 4× oversampling rate of that method. But most fonts actually round the stroke positions and weights to the nearest full pixel. This is not necessarily wrong—it trades intermediate stroke positions and weights for “sharp” strokes with maximum rendering contrast (cf and ).

Likewise, to make the most out of anti-aliasing as implemented for the y-direction in ClearType, fonts ideally round stroke positions and weights to the nearest 1/5 of a pixel. As discussed in , this allows e.g. the under- and overshoots to be “phased in” gradually (cf ), and it permits a much more faithful rendition of stroke design contrast than bi-level rendering (cf ). But most fonts forego these opportunities. As mentioned in , this is extremely tedious to implement in TrueType.

All the above rounding fractions have corresponding oversampling rates. For instance, rounding to the nearest 1/4 of a pixel corresponds to an oversampling rate of 4×. Seen this way, most anti-aliasing methods implement an optimal oversampling rate that is an integer multiple of the typical oversampling rate used for “hinting” stroke positions and weights.

As long as the optimal oversampling rate is an integer multiple of the actual oversampling rate, there is nothing wrong per se. It merely forfeits some opportunities for the benefit of other priorities. However, early development decisions in ClearType led to a pair of oversampling rates that does not fulfill this criterion: 6× is not an integer multiple of 16×. The associated “undersampling” is a problem that we will look at next.
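
To make this criterion concrete before moving on, here is a minimal sketch in Python (the helper names are mine, not part of any rasterizer or font tool): it converts a “rounding fraction” to its oversampling rate and checks whether a pair of rates fulfills the “integer multiple” criterion.

from fractions import Fraction

def oversampling_rate(rounding_fraction):
    # Rounding to the nearest 1/n of a pixel corresponds to an n-times oversampling rate.
    return int(1 / Fraction(rounding_fraction))

def rates_compatible(typical_rate, optimal_rate):
    # "Hinted" positions land on sample boundaries only if the optimal (rasterizer)
    # rate is an integer multiple of the typical (hinting) rate.
    return optimal_rate % typical_rate == 0

print(oversampling_rate("1/4"))   # 4
print(rates_compatible(1, 4))     # True:  full-pixel anti-aliasing (full-pixel hinting, 4x sampling)
print(rates_compatible(16, 6))    # False: ClearType x-direction (16x hinting vs 6x sampling)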

In an earlier chapter we have already looked at what happens when the sampling rate is too low for a given level of detail. “Undersampling” led to aliasing. Equal stems got rendered with unequal sample counts. Subsequently we have looked at all kinds of workarounds to make small features “sampleable” and to sample like features with like sample counts. In a way, with these workarounds or constraints we have implemented a kind of anti-aliasing filter.

In the case of ClearType, the 6× oversampling rate used to generate pixels is lower than the 16× rate typically used for rounding stroke positions and weights in “hinting.” This is again a case of “undersampling” and thus it also leads to aliasing. To illustrate the point I have used VTT to create a simple set of constraints for the Arial lc ‘m.’ Below is the outline with control point numbers, followed by the source code of the constraints.

Outlines of the Arial lc ‘m’

VTT Talk (Typeman Talk):

/* Y direction */
YAnchor(0,10)
YAnchor(1,6)
YAnchor(6,7)
YLink(6,30,85)
YAnchor(11,7)
YLink(11,21,85)
YAnchor(16,10)
YAnchor(26,10)
YInterpolate(26,9,3,6)

/* X direction */
XIPAnchor(36,0,26,16,37)
XLink(0,35,81)
XDist(35,2)
XLink(26,25,81)
XInterpolate(25,9,2)
XLink(16,15,81)

Smooth()

TrueType:

SVTCA[Y]
MIAP[R], 0, 10
MIAP[R], 1, 6
MIAP[R], 6, 7
MIRP[m>RBl], 30, 85
MIAP[R], 11, 7
MIRP[m>RBl], 21, 85
MIAP[R], 16, 10
MIAP[R], 26, 10
SRP2[], 6
IP[], 9
IP[], 3
SRP1[], 36
SRP2[], 37
SVTCA[X]
SLOOP[], 3
IP[], 0, 26, 16
MDAP[R], 0
MDAP[R], 26
MDAP[R], 16
SRP0[], 0
MIRP[M>RBl], 35, 81
MDRP[m<RGr], 2
SRP0[], 26
MIRP[m>RBl], 25, 81
SRP1[], 2
IP[], 9
SRP0[], 16
MIRP[m>RBl], 15, 81
IUP[Y]
IUP[X]

VTT Talk and corresponding TrueType source code used to constrain the Arial lc ‘m’

Rendered in ClearType, “Natural” Advance Widths, at 12 point on a 120 DPI device, the above code generates the set of pixels below.

Arial lc ‘m,’ rendered in ClearType using “natural” advance widths (12 pt, 120 DPI). Due to a mismatch between the actual and ideal “rounding fractions” (1/16 and 1/6 respectively) the middle and right stems are not rendered with equal sample counts
⇒ Hover your mouse over the above illustration to see it corrected for equal sample counts

Notice the left edge of the middle stem is rendered with a darker shade of orange than the left edge of the right stem, while the respective right edges are rendered equally. This indicates that 2 different sample counts are in use for two stems constrained to equal weights.

To verify that this is a case of “mis-sampling” I have inspected the x-coordinates of all the stems’ edges and “re-rounded” them from the nearest 1/16 of a pixel to the nearest 1/6 of a pixel. The following table compiles this data, along with the corresponding stem weights.

property            left stem                 middle stem               right stem
                    left edge    right edge   left edge    right edge   left edge    right edge

x-coords in 1/16    1 5/16       3 1/16       7 8/16       9 4/16       13 10/16     15 6/16
weights  in 1/16           1 12/16                   1 12/16                    1 12/16
x-coords in 1/6     1 2/6        3            7 3/6        9 2/6        13 4/6       15 2/6
weights  in 1/6            1 4/6                     1 5/6                      1 4/6

Arial lc m, ClearType “Natural” Advance Widths, 12 pt, 120 DPI, x-coordinates of stems’ edges in 1/16 and 1/6 of a pixel, along with the corresponding stem weights

In terms of 1/16 of a pixel, all 3 stems have the same weight. This is the expected result from using a CVT. By contrast, in terms of 1/6 of a pixel, the middle stem is indeed wider than the other two stems.

One way to look at this is to translate the stem width of 1 12/16 pixels to the nearest 1/6 of a pixel. The exact translation would be something like “1 4.5/6” pixels, which is not possible. There are no “half samples.” The fractional part will have to be either 4/6 or 5/6 pixels. This may serve to explain the diverging stem widths.

The logical way to look at this is to effectively “re-round” every x-coordinate to the nearest 1/6 of a pixel. As with translating the stem widths, this causes “edge” cases: 4/16 would translate to “1.5/6” pixels and 12/16 would translate to “4.5/6” pixels.

These “edge” cases correspond to sample centers that are “dead-on” the outline. TrueType defines such samples to be “in.” Hence, on a left edge the corresponding x-coordinate rounds down, while on a right edge it rounds up. In turn, this explains why the right edge of the middle stem rounds up, and along with that, why the middle stem gets rendered with a higher sample count than the other 2 stems.
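
This “re-rounding” can be sketched as follows (Python, purely for illustration; the tie-breaking rule models TrueType’s definition of “dead-on” sample centers as “in,” so left edges round down and right edges round up):

from fractions import Fraction

def reround(coord, is_right_edge, samples_per_pixel=6):
    # coord: x-coordinate in pixels, e.g. Fraction(7*16 + 8, 16) for 7 8/16
    scaled = coord * samples_per_pixel          # position in 1/6 of a pixel
    lower = scaled // 1
    upper = lower + 1
    frac = scaled - lower
    if frac == Fraction(1, 2):                  # sample center "dead-on" the edge
        chosen = upper if is_right_edge else lower
    else:
        chosen = lower if frac < Fraction(1, 2) else upper
    return Fraction(chosen, samples_per_pixel)

# Middle stem of the 'm' above: left edge 7 8/16, right edge 9 4/16
left  = reround(Fraction(7*16 + 8, 16), is_right_edge=False)
right = reround(Fraction(9*16 + 4, 16), is_right_edge=True)
print(left, right, right - left)   # 15/2 28/3 11/6, i.e. 7 3/6, 9 2/6, and a weight of 1 5/6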

In technical terms, rounding the outlines to the nearest 1/16 of a pixel but sampling them at the rate of 6 samples per pixel amounts to downsampling without an anti-aliasing filter. Aliasing is the expected consequence. Accordingly, this technical faux-pas should be avoided.

Up until now we have discussed various methods for rendering text. All these examples and explanations focused on rendering black text against a white background, even though for sub-pixel anti-aliasing the idea of black text may have been a bit of a stretch. At least conceptually, it was always black text. Hence in this section we will have a look at rendering text in colors and against arbitrary backgrounds.

Rendering text in color is easiest understood in bi-level rendering. A pixel is either “on” or “off.” Accordingly, when rendering black text against a white background, the “on” pixel is black, while the “off” pixel is white. If we were to render red text against a blue background, the “on” pixel would be red, while the “off” pixel would be blue.

For full-pixel anti-aliasing, recall that we related the pixel coverage to a level of gray. Full coverage meant the pixel was “fully on,” while no coverage meant the pixel was “fully off.” Partial coverage translated to an intermediate shade of gray—somewhere between “on” and “off.”

Accordingly, repeating the above choice of colors, “fully on” would translate to red, while “fully off” would translate to blue. Anything in between would translate to some mix of red and blue, ranging from red through purple to blue along the RGB color wheel, as illustrated below:

Anti-aliasing red text against a blue background: intermediate shades of gray translate to intermediate blends of red and blue

In technical terms, this is called alpha blending. Conceptually, alpha blending uses the values that represent pixel coverage to blend the text and background colors. For our example this means that a larger pixel coverage translates to a higher share of red combined with a lower share of blue, and vice-versa for a smaller pixel coverage.
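
Per color channel the blend can be written as result = coverage × text + (1 − coverage) × background. A minimal sketch (Python, ignoring the non-linear screen response until the next section):

def alpha_blend(coverage, text_rgb, background_rgb):
    # coverage: pixel coverage from 0.0 (fully "off") to 1.0 (fully "on")
    return tuple(round(coverage * t + (1.0 - coverage) * b)
                 for t, b in zip(text_rgb, background_rgb))

red, blue = (255, 0, 0), (0, 0, 255)
print(alpha_blend(1.0, red, blue))   # (255, 0, 0)   fully covered -> text color
print(alpha_blend(0.5, red, blue))   # (128, 0, 128) half covered  -> purple
print(alpha_blend(0.0, red, blue))   # (0, 0, 255)   not covered   -> background color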

Analogous to full-pixel anti-aliasing, sub-pixel anti-aliasing uses sub-pixel coverage to alpha blend the three primary colors red, green, and blue individually. This is the theory behind sub-pixel anti-aliased text in color, rendered against a background in any color. In practice, this may look a little confusing at first (CAUTION: To properly view the following illustration, be sure to double-check all your settings as per the check-list introduced in ):

Asymmetric sub-pixel anti-aliased text, rendered in black, red, green, blue, and white against black, red, green, blue, and white backgrounds (Verdana lc ‘n’ 15 ppem)

All 25 examples use the exact same character, the same constraints (“hints”), and the same positioning to arrange them into a table. All that changes from cell to cell are the text and background colors.

Yet, for instance in the black row at the top, the red ‘n’ appears to have much heavier stems than the green and blue ‘n.’ In the red row below, the green and blue ‘n’ appear to have a bit of a shadow on the right, as if to render some “3D-effect.” Right next to it, the white ‘n’ looks about as pixilated as if it were rendered in bi-level.

There are 2 main reasons behind the poor performance of sub-pixel anti-aliasing when rendering text in color against a background in color:

  1. Pixel coverage is partially left out of consideration by most color combinations.
  2. Screens respond non-linearly to the linear input that represents pixel coverage.
First, let’s look at pixel coverage in the presence of a nominal text rendering color other than black or white (we will come back to the non-linear screen response in the next section, as it is a problem common to all anti-aliasing methods).

Recall that in sub-pixel anti-aliasing, downsampling is performed separately on the individual sub-pixels (cf ). Conceptually, each sub-pixel gets to represent 1/3 of the full-pixel coverage. Accordingly, if text is rendered in one of the 3 primary colors red, green, or blue, only 1/3 of the actual pixel coverage is used to represent an entire pixel.

The remaining 2/3 of the pixel coverage are completely left out of consideration! This is easiest to see when looking at the sub-pixels of one of the above ‘n’ rendered in one of the primary colors against a black background:

Asymmetric sub-pixel anti-aliased text, rendered in red, green, and blue against a black background (Verdana lc ‘n’ 15 ppem, showing the individual sub-pixels)
⇒ Hover your mouse over the above illustration to see the sub-pixel grid

Rendering sub-pixel anti-aliased text in one of the 3 primary colors against a black (or white) background effectively “throws away” 2/3 of the samples before downsampling! This explains some of the pixilated appearance in the preceding illustration. If it weren’t for ClearType’s 6× oversampling rate (cf ), it would be the same (green) or substantially the same (red or blue) as bi-level text rendered in the respective primary color.
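
A sketch of the idea (Python; the function name and the RGB striping assumption are mine): each sub-pixel’s coverage alpha-blends one color channel, so for pure red text on black the green and blue channels stay at zero no matter how much of the pixel those sub-pixels cover.

def blend_subpixels(coverages, text_rgb, background_rgb):
    # coverages: (left, center, right) sub-pixel coverages of an RGB-striped pixel,
    # each from 0.0 to 1.0; together they represent the full-pixel coverage.
    return tuple(round(c * t + (1.0 - c) * b)
                 for c, t, b in zip(coverages, text_rgb, background_rgb))

coverages = (0.0, 1.0, 1.0)   # left sub-pixel not covered, middle and right fully covered
black = (0, 0, 0)

print(blend_subpixels(coverages, (255, 255, 255), black))  # white text: (0, 255, 255)
print(blend_subpixels(coverages, (255, 0, 0),     black))  # red text:   (0, 0, 0)
# For the red text, 2/3 of the pixel is covered, yet the pixel stays black:
# the green and blue sub-pixel coverages leave no trace at all.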

Accordingly, to obtain smooth text with sub-pixel anti-aliasing rendered in color against a colored background, the primary colors red, green, and blue should be avoided. Likewise, combinations of 2 primary colors, that is, yellow, magenta (aka fuchsia), and cyan (aka aqua) are probably best avoided, too. To minimize pixilation as a result of color-on-color text rendition, we really need a combination of all 3 primary colors.

Theoretically, the best results use equal amounts of the 3 primary colors. In other words, sub-pixel anti-aliasing should restrict text rendering to levels of gray only, including black and white. Think of it this way: any set of unequal amounts of red, green, and blue will yield unequal fractional representations of the respective sub-pixels, and hence lead to issues like the ones illustrated above.

Practically, some “off-white” color used as a background is not going to make sub-pixel anti-aliasing fail altogether. Rather, it will deteriorate gradually as hues are saturated away from neutral gray. The above examples, which render both text and background in primary colors, illustrate where color saturation leads once the colors are already being used in an attempt to increase the perceived resolution (DPI) of the LCD device.

In the era of CRTs, luminous dots were produced by beaming electrons at the fluorescent back of the glass screen inside the evacuated tube. More electrons caused the fluorescent material to glow brighter, while fewer electrons made it less bright. In turn, more electrons required a higher voltage to drive the electron gun, and fewer electrons a lower voltage.

Varying the voltage between 0 and the maximum voltage that the CRT was designed for varied the luminosity from darkness to the maximum brightness that the CRT could achieve. But the luminosity didn’t follow the voltage linearly or proportionally. Instead, it followed a curve similar to the one below:

Non-linear response of a typical CRT: The luminosity L output by the CRT is not proportional to the voltage V supplied to the CRT

In the above diagram, voltage V is in x-direction, while luminosity L is in y-direction. For instance, if the voltage applied to the electron gun is about 50% of the maximum, luminosity is only about 25% of the maximum. It takes approximately 70% of the maximum voltage to obtain 50% of the luminosity. As far as I understand, LCDs show a similarly non-linear response, or were designed to do so for compatibility reasons.

This creates a problem for representing intermediate pixel coverage by shades of gray. Assume we want to render a half covered pixel as “50% gray” (middle gray). The computer sends 50% of the maximum voltage to the monitor. But the monitor renders this at 25% luminosity (“brightness”), or maybe even darker. This is not good.

The compensation for the monitor’s non-linear response is called gamma correction. In theory, this is a fairly simple operation: Determine the non-linear response curve of the individual monitor (the color profile), “flip” the curve about the 45° axis (the linear response curve), and tabulate the result. Then use this table as a lookup for replacing “gray values” by “compensated gray values.”

Correcting the non-linear response of a typical CRT: Conceptually, the voltage used to control the CRT is increased to compensate the “lagging” response of the CRT
⇒ Hover your mouse over the above illustration to see the original response of the CRT. Notice how the correction curve is “flipped” about the 45° axis (the linear response curve)

In practice, the non-linear response curve is often approximated by a simple power function, since few users aside from serious digital photographers and professional graphic designers will be spending the time and money on the special hardware that can determine the actual response curve of their monitor(s):

L = V^γ

In the above equation, L is the luminosity emitted by the monitor (normalized to the interval [0…1]), V is the voltage driving the monitor (equally normalized), and γ is the number (gamma value) that determines the “curvature” of the non-linear response (with typical values in the range of 1 to 3). For many usage scenarios, approximating the monitor’s response curve by the above power function and a suitable gamma value may be a tolerable compromise, especially for CRTs.
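
Under this power-function approximation, the “flipped” correction curve amounts to applying the inverse exponent 1/γ before the values reach the monitor, typically tabulated once into a lookup table. A minimal sketch (Python, assuming an 8-bit channel and a single assumed gamma value):

GAMMA = 2.2   # assumed display response L = V**GAMMA

# Tabulate the "flipped" curve once: coverage/gray value -> compensated value
gamma_lut = [round(255 * (v / 255) ** (1.0 / GAMMA)) for v in range(256)]

def correct(gray):
    return gamma_lut[gray]

# A half-covered pixel (50% gray) must be driven at roughly 73% voltage so that
# the monitor's response (0.73 ** 2.2 is about 0.5) ends up at 50% luminosity.
print(correct(128))            # ~186 out of 255
print((186 / 255) ** GAMMA)    # ~0.50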

Unfortunately, in reality there are a number of factors that complicate matters, and these factors can combine to compromise the viewing experience of text and photos. Following is a list of issues in no particular order and without claiming to be exhaustive.

  1. Your computer may not have provisions for gamma correction, or it is not easily discoverable. For instance, I had to dig through a custom control panel applet that came with the “premium” graphics chip on my laptop. By default, it was set to “50%.” This doesn’t tell me much of anything.
  2. Once you have discovered gamma correction, the applet doesn’t give you an objective way to do the correction visually. It shows you an image and you are probably supposed to adjust to taste, ignoring the artistic intent of the photographer.
  3. If you are using a flat panel screen, be aware that the non-linear response of your screen changes significantly with slight variations of your viewing angle. Simply look at a typical family photo with plenty of light, medium, and dark shades, and tilt the screen of your laptop back and forth. The midtones will change. Similarly, when looking at your laptop or flat panel screen at an angle (left or right), the midtones will change.
  4. Ambient light may change the subjective response of your screen. This may explain why some television sets come with premium features that attempt to compensate for varying ambient light. Serious digital photographers and professional graphic designers may work in darkened rooms with color temperature controlled artificial light but no natural light.
  5. Gamma correction is “cumulative.” If it is done correctly, it only needs to be done once between the pixel data and the monitor. If it is done more than once, photos and anti-aliased text can get “washed out.” If it is not done at all, photos can appear too dark and anti-aliased text can seem rendered with lower resolution or “broken hints.”
  6. In Internet Explorer on Windows, images stored as Portable Network Graphics (“.png”) files appear to be rendered with extra gamma correction. This is easy to observe: Take the “gray ramp” from the illustration below, represent it as both “.png” and “.bmp,” stack the images in a simple “.html” document, and view the results in both Internet Explorer 7 and Mozilla Firefox 3 side-by-side (but cf #3 above!). Therefore, if you are reading this web site in IE, some illustrations may look too light in their midtones even though your gamma correction may be spot-on.
  7. In Windows, text rendered in full-pixel anti-aliasing is displayed with a gamma correction done in software and using a gamma value of about 2.3. This gamma correction appears to be in addition to what my premium graphics chip wants to do. If I “under-correct” gamma on my graphics chip, all my photos look too dark in their midtones, but text rendered in full-pixel anti-aliasing looks more natural. If I “correct” gamma for my photos, text rendered in full-pixel anti-aliasing looks “washed out.” Windows (GDI) does not have any provisions to adjust or defeat the software gamma correction of full-pixel anti-aliasing (“standard font smoothing”).
  8. In Windows, text rendered with sub-pixel anti-aliasing uses software gamma correction, too. But unlike full-pixel anti-aliasing, it can be adjusted or defeated. To do so, use the ClearType Control Panel applet and follow the simple steps to “tune” ClearType to your taste and for your monitor.
Granted, in my daily computer use to read e-mail and surf the net I don’t retreat into a color temperature controlled studio. But it is helpful to understand the obstacles between the origin of the pixels and your viewing experience.

To illustrate the effect of gamma correction, I have created a somewhat abstract font with a single “character.” The sole purpose of this “character” is to render every single shade of gray implemented by Windows’ “Standard Font Smoothing” (full-pixel anti-aliasing). The outlines look like this:

Rendering a “gray scale” with a somewhat abstract TrueType font: This set of outlines represents a single “character”
⇒ Hover your mouse over the above “character” to see it rendered in full-pixel anti-aliasing

Be sure to hover your mouse over the above illustration to see how this “character” renders a “gray scale.” Notice that this “gray scale” is not gamma corrected by software (but cf #6 above about “.png” files in IE). If your monitor is compensated for any non-linear response it may have, you should see 17 clearly distinguishable shades of gray, including black and white.

If you don’t, or if the progression appears “skewed” towards dark or light shades, then your color profile may be “off.” To check if your profile may need an adjustment, I have deliberately under- and overcorrected the above “gray scale” with a range of gamma values. This may not give you the exact gamma correction required to profile your monitor, but at least an idea in which direction it may need an adjustment.

Gamma correction of a plain “gray scale:” Be sure to try the buttons above to see if your computer’s gamma correction needs an adjustment

Be sure to try these buttons. Ideally, at least for one or two of them, you should see a clear difference between “almost black” and black on the left, and between “almost white” and white on the right, and subjectively these differences should appear to be about the same.

On a plain “gray scale” the effect of gamma under- or over-correction is easy to see: under-correction (γ number too low) skews the midtones towards the dark end while over-correction (γ number too high) skews them towards the light end. Following is the effect of gamma under- or over-correction on colors.

Gamma correction and alpha-blending: Be sure to try the buttons above to see if your computer’s gamma correction needs an adjustment

Once more be sure to try these buttons. When alpha-blending red with blue, as in the above illustration, gamma correction affects the “blends.” For instance, for the “fifty-fifty blend” in the middle, under-correction skews the purple towards a Tyrian purple, while over-correction skews it towards deep magenta. The “pure” red and blue at either end remain the same.
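
To see why only the “blends” move while the “pure” endpoints stay put, here is a sketch (Python; the single net_gamma exponent is my stand-in for the combined, possibly erroneous, end-to-end response of correction plus monitor, and is not the same number as the γ buttons above):

def apply_net_gamma(rgb, net_gamma):
    # net_gamma == 1.0 means correction and display response cancel exactly;
    # other values model a net under- or over-correction of the whole chain.
    return tuple(round(255 * (c / 255) ** net_gamma) for c in rgb)

fifty_fifty = (128, 0, 128)   # the alpha-blended "fifty-fifty" purple from above

print(apply_net_gamma(fifty_fifty, 1.0))   # (128, 0, 128) blend preserved
print(apply_net_gamma(fifty_fifty, 2.0))   # (64, 0, 64)   net under-correction darkens it (towards a Tyrian purple)
print(apply_net_gamma(fifty_fifty, 0.5))   # (181, 0, 181) net over-correction lightens it (towards magenta)
print(apply_net_gamma((255, 0, 0), 2.0))   # (255, 0, 0)   pure red is unaffected either way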

In the context of sub-pixel anti-aliased text, erroneous gamma correction can impede the correct impression of stem weights or “help” the impression of “3D-shadows.” To illustrate this point, I’ll repeat an illustration from the previous section, but this time around with the option to change the gamma correction.

Gamma correction and ClearType: Be sure to try the buttons above to see if your computer’s ClearType settings need to be tuned (Verdana lc ‘n’ 15 ppem)

Once again be sure to try the various gamma correction buttons. Notice how deliberately under-correcting the gamma response appears to embolden text rendered against a white background (try γ = 0.5) while over-correcting it appears to embolden text rendered against a black background (try γ = 4.0). For white or black backgrounds, gamma correction appears to enhance or reduce stroke rendering contrast (cf ).

For combinations of color-on-color, it seems intractable to find the gamma correction that avoids both the impression of “3D-shadows” and excessive pixilation. Intuitively, this is not surprising: We have already used the colors to try to increase the perceived device resolution (DPI, cf ). We can’t claim that same “degree of freedom” again to render text in color against a colorful background.

Logically, recall the “cores” of black pixels introduced in . In our example, the stems aren’t wide enough for a “black core” of at least 1 pixel (or 3 sub-pixels); there are merely 2 sub-pixels of “solid core.” But only those 2 “solid core” sub-pixels render the nominal color; the “fringe” sub-pixels render the transition to the background color.

For instance, the red ‘n’ against the black background happens to have both stems positioned such that its 2 “solid core” sub-pixels are aligned with the green and blue sub-pixels. This leaves the “fringes” of the stems to render the nominal color red. Put another way, when rendering this ‘n’ in red, its stems are all “fringe” and no “core” (!)

Now recall that gamma correction makes pixel brightness proportional to pixel coverage and that it affects the “blends” only, not the “cores.” In turn, erroneous gamma correction yields a misrepresentation of pixel coverage and therefore of the “fringe” sub-pixels. This explains why the ‘n’ rendered against a black background gets its stems all but obliterated by severe gamma under-correction (γ = 0.5), while severe gamma over-correction (γ = 4.0) emboldens these stems, as can be experienced in the illustration below.

Gamma correction and ClearType: Under-correcting gamma pushes the “fringes” towards the black background while over-correcting gamma pushes the “fringes” towards the nominal rendering color. The red ‘n’ in this example is most sensitive to gamma correction because it is all “fringe” and no “core.” The green and blue ‘n’ in this example are less sensitive to gamma correction because one of their “solid core” sub-pixels renders their nominal color green or blue (Verdana lc ‘n’ 15 ppem, showing the individual sub-pixels)
⇒ Hover your mouse over the above illustration to see the sub-pixel grid

Be sure to try both the γ = 0.5 and γ = 4.0 buttons in the illustration above. Severely under-correcting gamma pushes the “fringes” close to the black background while severely over-correcting gamma pushes the “fringes” towards the nominal rendering color.

In the above example, the red ‘n’ must render its nominal color (red) with its “fringe” sub-pixels, since its “solid core” sub-pixels align with the colors green and blue. This makes it very sensitive to erroneous gamma correction. While still affected by the latter, the green and blue ‘n’ fare better, because at least one of their “solid core” sub-pixels aligns with the nominal rendering color.

Notice that it is not generally the green and blue characters that fare better. It is merely this particular combination of fractional stem weights (cf ), fractional stem positions (cf ), and nominal rendering color that happens to align the “solid core” sub-pixels with the physical sub-pixels of the corresponding colors. Given another fractional stem position, red can fare better, while for heavier stems with 3 sub-pixels of “solid core,” any of the three primary colors red, green, and blue will perform equally well (or equally poorly).

Thus, as you can see, gamma correction is important, but it is only one piece of the font rendering puzzle: Gamma correction makes pixel brightness proportional to [sub-]pixel coverage, but it does not undo the damage caused by sub-pixel coverage that has been left out of consideration (cf ), nor eliminate the impression of “color fringing” (cf ), let alone augment insufficient stem widths (cf ). Together, “solid cores,” colors, and gamma correction once more represent a lateral shift of compromises.

Recall the numbered “shutters” introduced in . Each of the “shutters” represents exactly 1 pixel or 3 sub-pixels. When software wants to light a particular pixel with a particular color, it “tells” the graphics chip the location or coordinates of the pixel (e.g. 123 pixels over and 456 pixels down) and the RGB value of the color (e.g. #FF0000 or 255,0,0 for red). But how does the graphics chip communicate with the monitor?

Basically, there are two different methods, digital and analog. If you are using a laptop, the communication or video interface will be digital. Conceptually, the graphics chip simply relays the numbers for the pixel’s coordinates and its RGB value. The monitor then sets the corresponding “shutter” triplet to the requested color combination. The result will be that the exact spot corresponding to the targeted pixel will light up in red (cf ).

By contrast, an analog interface will have to produce a signal (voltage) to control the 3 individual electron guns. Conceptually, it will have to serialize all the pixels and their color values that are supposed to be shown on the monitor such that the monitor’s electron guns can repeatedly “paint” a picture on the fluorescent coating on the inside of the evacuated tube.

This comes with an inherent problem: The serialization effectively amounts to a digital-to-analog conversion, which is not exact. The voltages produced from the pixels may be close, but not necessarily exact. In turn, the monitor will have to use the voltages to modulate the electron guns, making sure that the picture is assembled correctly. CRTs used to have knobs to adjust the width and height of the picture in case the control mechanism that deflects the electron beams made them “overshoot” their targets.

Now, for CRTs this level of imprecision may not be a problem. CRTs don’t really have pixels, hence if the electron beams paint a slightly wider or narrower picture, this may go unnoticed.

But if you are using an LCD with an analog interface, the monitor doesn’t have any electron guns. It does, after all, have real pixels. Accordingly, the monitor will have to convert the received analog voltages back to digital to reconstruct the pixel numbers and color values, which is again inexact.

This is an unnecessary and potentially detrimental detour. The analog voltages are already inexact and the conversion back to digital will make this even more inexact because it has to round analog voltages back to integer numbers. Imagine if this lit up a pixel adjacent to the targeted one, or if this used the pixel’s red value to light up the adjacent green sub-pixel.

Gamma correction surely will not correct these color fringes! Loosely put, think of printing out a document and then scanning it back in. This doesn’t make sense, either.

Thus, if you use a CRT, you will be using an analog video interface, and sub-pixel anti-aliasing may not be the most suitable text rendering method. If you use an LCD with an analog interface, consider switching to a digital video interface (DVI). Many flat panel screens and graphics cards come with both analog and digital interfaces and it may be a mere matter of switching cables. If you use a laptop, you are all set.

In the days of CRTs, you may have switched your monitor to a lower resolution if the text size appeared too small to read comfortably. For instance, you may have switched it from 1280×1024 to 1024×768, followed by adjusting the knobs for horizontal and vertical “size.” This appeared to increase the pixel size. If text was formed by a certain number of pixels, the same text now looked larger.

Laptops and desktop computers with LCDs inherited this concept—probably for backwards compatibility reasons or to ease the transition for the end-users. But unlike CRTs, LCDs do have individually addressable pixels. Accordingly, these pixels have a fixed size and cannot be enlarged—at least not physically. Hence, to keep the concept of switching resolutions, hardware manufacturers introduced a way to make fewer pixels somehow fit the entire screen. They may call it Flat Panel Scaling or similar.

For the above example we thus have 1024×768 pixels that somehow must be “redistributed” to use up all of the 1280×1024 available pixels. This is not going to be exact. Think of a single pixel as covering a small area (1×1 pixels). For the purpose of this “redistribution,” said pixel will now have to cover a larger area (1 1/4 × 1 1/3 pixels).

Seen this way the “enlarged pixel” now also covers 1/4 of the pixel on the right, 1/3 of the pixel below, and 1/12 of the pixel below the right pixel. This could be represented by commensurate levels of gray, as illustrated below:

Simulation of flat panel scaling applied to an individual pixel: Pixels must be “redistributed” to cover larger areas
⇒ Hover your mouse over the above illustration for an idea how this could be achieved. The blue outline represents the area to be covered
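
As a rough numerical sketch of that “redistribution” (Python; this merely computes geometric overlap and is not any manufacturer’s actual scaling algorithm): one source pixel, stretched by 1280/1024 horizontally and 1024/768 vertically, overlaps a 2×2 block of target pixels with exactly the fractions quoted above.

from fractions import Fraction

def overlap(scale_x, scale_y):
    # One source pixel, anchored at a target-pixel corner and stretched to
    # scale_x by scale_y, overlaps the 2x2 block of target pixels as follows:
    fx, fy = scale_x - 1, scale_y - 1   # the parts spilling to the right and below
    return [[1,  fx     ],
            [fy, fx * fy]]

cov = overlap(Fraction(1280, 1024), Fraction(1024, 768))
print([[str(f) for f in row] for row in cov])
# [['1', '1/4'], ['1/3', '1/12']]: the full pixel, 1/4 of the right neighbor,
# 1/3 of the pixel below, and 1/12 of the diagonal neighbor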

But whatever the actual method of flat panel scaling is, it effectively amounts to downsampling with an unknown filter. It may be slightly better than the inadvertent unfiltered downsampling (cf ), but it is certainly less than optimal.

Simulation of flat panel scaling applied to a “test image:” If your LCD is set to its native resolution, and if your browser is set to 100% “Zoom” (“Ctrl+0” tends to reset the “Zoom” to 100% on modern browsers), you should see a set of sharply defined horizontal (left) and vertical (right) lines, along with a checker board pattern (center), and complemented by text rendered in pixilated bi-level (below). Each pixel is either black or white.
⇒ Hover your mouse over the above illustration to see how this “test image” can be scaled (using “bi-cubic interpolation” in Photoshop—likely as good if not better than any of the flat panel scaling algorithms). Notice how this adds intermediate shades of gray, rendering the checker board pattern as if it were some kind of “Tartan,” and rendering the bi-level text with some kind of “font smoothing”

As a font-maker, think of it this way: With a particular set of constraints or “hints” you may have found your individual balance between rendering sharply contrasted stems and crossbars and smoothly rendered round and diagonal parts. You have toned down the level of blur to your level of tolerance, and now the flat panel scaling increases the blur!

Therefore, leave the flat panel screen at its native resolution. You may have to dig around in the control panel applet of your graphics chip or the user manual of your computer to discover the native resolution. You may also have to look up a cryptic abbreviation such as “WUXGA” to learn that the native resolution of your LCD is 1920×1200 pixels. Use this information to double-check if the actual resolution is set to the native resolution.

Last but not least, let’s address the text size that appeared too small to read comfortably. Strictly speaking, referring to the monitor’s resolution as 1280×1024 or 1024×768 pixels is wrong. The number of pixels merely indicates that the entire width or height of the monitor is partitioned into 1280 or 768 pixels. It does not specify the resolution.

The resolution of your monitor is the number of pixels (or dots) per inch (or DPI for short). If you don’t know the DPI of your monitor, take a measuring tape and measure the visible part of the width of your screen in inches (convert accordingly if you use centimeters). Then divide the number of dots corresponding to the width of your screen by the width in inches, and you’ll get dots per inch.

Don’t worry if the number is not exact. Windows works best with 2 discrete resolutions, 96 and 120 DPI. Pick the one that is closer, or the one that works better for you—120 DPI will get you larger text. Use this value to “Adjust font size (DPI)” in Windows Vista’s “Personalization.” This should get you the text size you were looking for without incurring blurry artifacts caused by scaling your flat panel to a non-native “resolution.”
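
As a worked example of the measurement and the choice (Python; the panel width used here is just an assumed figure for illustration):

def monitor_dpi(pixels_across, visible_width_inches):
    return pixels_across / visible_width_inches

def pick_windows_dpi(measured_dpi):
    # Windows works best with 96 or 120 DPI; pick whichever is closer.
    return min((96, 120), key=lambda candidate: abs(candidate - measured_dpi))

dpi = monitor_dpi(1280, 13.1)   # e.g. 1280 pixels across an assumed 13.1 inch wide panel
print(round(dpi))               # ~98
print(pick_windows_dpi(dpi))    # 96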

Up until now, whenever we have looked at how LCD pixels are structured into sub-pixels, it looked like this:

A theoretical representation of a color LCD pixel, illustrating the relative locations of the red, green, and blue sub-pixels

While this is a frequently used arrangement of the colors red, green, and blue into vertical stripes, other arrangements are possible. Manufacturers sometimes install the color filters back-to-front such that the left sub-pixel lights up in blue, while the right one gets red. Handheld devices may have the sub-pixels oriented in horizontal stripes, or users may rotate their screens from landscape into portrait orientation, both on desktop devices and on tablet PCs.

     

4 ways to arrange red, green, and blue sub-pixel stripes into a color LCD pixel

In theory, this allows for a total of 12 combinations, but the above 4 should cover all practical cases. Sub-pixel anti-aliasing must know the exact arrangement of the sub-pixels (LCD sub-pixel structure). If it doesn’t, or if it doesn’t respect that knowledge, then the rendering will be compromised.

To see this, recall the concept of pixel coverage. In a way, sub-pixel anti-aliasing determines pixel coverage in terms of the individual sub-pixels. For instance, if this determination yields that the left sub-pixel is “fully on,” it will need to encode this fact by a corresponding color component. Assuming the left sub-pixel is the red sub-pixel, it will encode this fact in the red color component.

But if the sub-pixels are arranged in BGR order, as opposed to RGB order, this will not affect the left sub-pixel. Instead, it will affect the right sub-pixel. In turn, this result will not be a good representation of pixel coverage. Likewise, if the sub-pixels are arranged horizontally, this may affect the top or the bottom sub-pixel. Again, this will not be a good representation of pixel coverage. The results can look distractingly blurry, as illustrated below (CAUTION: To properly view the following illustration, be sure to double-check all your settings as per the check-list introduced in ):

Matching or mismatching the LCD sub-pixel structure: If the sub-pixel structure doesn’t match, the rendered result can look distractingly blurry.
⇒ Be sure to try the above buttons: Start with the UC ‘H’ and compare RGB vs BGR. If the sub-pixel structure doesn’t match, it may look as if you had “double vision” or as if there were a shadow of the ‘H’ to the right and/or left of the ‘H.’ Then progress to the remaining buttons to identify further issues (Times New Roman UC ‘A’ and ‘H’ 12 pt at 96 DPI)

The difference between BGR and RGB is easy to address. The rendering software simply has to know what color is on the left and on the right of a pixel. With that single bit of information it can correctly translate partial pixel coverage to the corresponding color components.

But it must know. If you suspect your screen is in BGR while the rendering software assumes RGB, double-check with the ClearType control panel applet. This will let you correct the renderer’s assumption if applicable.
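
A sketch of that translation (Python; the function and parameter names are made up): the single bit of information is whether the leftmost stripe is red or blue.

def subpixel_coverages_to_rgb(left, center, right, structure="RGB"):
    # left/center/right: sub-pixel coverages (0.0 to 1.0), in screen order;
    # structure: the physical order of the color filters on the panel.
    if structure == "RGB":
        r, g, b = left, center, right
    elif structure == "BGR":
        b, g, r = left, center, right
    else:
        raise ValueError("unsupported sub-pixel structure")
    return tuple(round(255 * c) for c in (r, g, b))

# Left sub-pixel fully covered ("fully on"), the other two not covered:
print(subpixel_coverages_to_rgb(1.0, 0.0, 0.0, "RGB"))  # (255, 0, 0): encoded in the red component
print(subpixel_coverages_to_rgb(1.0, 0.0, 0.0, "BGR"))  # (0, 0, 255): encoded in the blue component
# If the renderer assumes RGB but the panel is really BGR, the (255, 0, 0) above
# lights up the panel's rightmost stripe instead of the intended leftmost one.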

Rotation from landscape into portrait orientation is harder to address. For the rasterizer, it is easy to interchange the oversampling rates that apply to the x- and y-directions. But the result of doing so looks rather odd. The vertical stems are now horizontal bars, rendered with no oversampling at all. They look like they have been rendered in “black-and-white” (CAUTION: To properly view the following illustrations, be sure to double-check all your settings as per the check-list introduced in ):

 

Rotating from landscape into portrait orientation: the vertical stems are now horizontal bars rendered in “black-and-white” (Arial lc ‘n’ 11 pt at 96 DPI)

The arch of the lc ‘n’ above remains anti-aliased (“blurry”) while the stems become aliased (“sharp”). Notice that anti-aliasing in y-direction doesn’t help much unless the weights and positions of the stems are constrained to sample boundaries—which they are not:

 

Rotating from landscape into portrait orientation: merely switching to hybrid sub-pixel anti-aliasing won’t do (Arial lc ‘n’ 11 pt at 96 DPI)
⇒ Hover your mouse over the above illustration to revert to the lc ‘n’ rendered without hybrid sub-pixel anti-aliasing

We have covered the difficulties of rounding stroke weights and positions to sample boundaries, instead of pixel boundaries, and along with that the chances to “upgrade” the fonts.

Accordingly, at the time ClearType was first released, horizontal orientation of the sub-pixels could not be addressed. GDI simply renders text as if the sub-pixels were vertical. This looks blurrier than necessary, but maybe not quite as “odd” as rendering aliased stems with anti-aliased arches as illustrated above.

Of course this is not a solution, merely a workaround. Constraints (“hints”) really have to round stroke weights and positions to the respective sample boundaries, as illustrated below:

 

Rotating from landscape into portrait orientation requires fractional stroke weights (cf ) and positions (cf ) (Arial lc ‘n’ 11 pt at 96 DPI)
⇒ Hover your mouse over the above illustration to revert to the lc ‘n’ rendered without fractional stroke weights and positions

The above example does respect the LCD sub-pixel structure. This avoids introducing unnecessary blur, avoids the oddities illustrated previously, and generally improves consistency between text rendered horizontally vs vertically.

One of the practical obstacles of respecting the LCD sub-pixel structure is TrueType’s limited rounding functionality (cf ). But even if rounding to sample boundaries were possible with SROUND[], we wouldn’t know the number of samples per pixel in x- and y-direction for sure. This information is not discoverable at run-time. When constraining (“hinting”) a font, we have to assume we know the correct sample counts and that they don’t change (but cf ).
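
In effect, the constraints bake in an assumed samples-per-pixel count. A sketch of the rounding they would have to perform (Python, outside of TrueType, purely for illustration; the stroke weight is a made-up value):

def round_to_sample_boundary(coord, samples_per_pixel):
    # Round a coordinate (in pixels) to the nearest 1/samples_per_pixel of a pixel.
    return round(coord * samples_per_pixel) / samples_per_pixel

stem_weight = 1.45   # a hypothetical stroke weight in pixels

print(round_to_sample_boundary(stem_weight, 1))   # 1.0  bi-level, or the y-direction of asymmetric ClearType
print(round_to_sample_boundary(stem_weight, 6))   # 1.5  x-direction of ClearType
print(round_to_sample_boundary(stem_weight, 5))   # 1.4  y-direction of hybrid sub-pixel anti-aliasing
# The three results differ, which is why the constraints must know -- or assume --
# the sample counts that the rasterizer will actually use.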

This may work fine until a client asks if such-and-such font is optimized for their RGBW screens (the ‘W’ represents a white sub-pixel). Following are 2 examples of RGBW sub-pixel structures:

 

Examples of RGBW sub-pixel structures: replacing 1 of the 2 green sub-pixels of a “Bayer” pattern (left), or extending the RGB striping pattern by a white stripe (right)

This requires the operating system, the rasterizer, and the constraints to work together. The operating system will have to identify the LCD sub-pixel structure, either through communication with the graphics chip, or through a setup “wizard” or similar. The correct LCD sub-pixel structure then needs to be communicated to the rasterizer and, in turn, to the constraints. The constraints, for their part, will have to adapt to different oversampling rates, and potentially add new strategies to optimize rendering contrast in the presence of limited screen resolutions (cf and ).

Assume we have the perfect set of new fonts, with all of the above challenges and obstacles out of the way as far as possible. But despite all due care of the font makers and end-users, this doesn’t guarantee that these fonts will be seen as perfectly as intended. There is yet another kind of challenge. The operating system and the applications sometimes don’t honor the user’s settings and preferences.

From its inception in GDI, one of the primary objections to ClearType has been its lack of anti-aliasing in y-direction. ClearType implements asymmetric sub-pixel anti-aliasing. For body text at 96 DPI resolution, combined with suitable constraints to optimize rendering contrast (cf ), this may not be all that noticeable, at least in Latin-based text displayed on LCDs with a compatible sub-pixel structure (cf ). But as you are moving up the “size ramp,” the absence of anti-aliasing in the direction parallel to the long side of the sub-pixels becomes increasingly objectionable.

Therefore, sometime in late 2000 I spent an exhaustingly exciting weekend to prototype a combination of sub-pixel anti-aliasing in x-direction with full-pixel anti-aliasing in y-direction, using a suitable downsampling filter provided by MS Research. After some adjustments to gradually “phase-in” full-pixel anti-aliasing in y-direction, after some “jury rigs” to the rasterization (cf ), and after some performance optimizations, this implementation of hybrid sub-pixel anti-aliasing performed as desired. A variant of the first prototype now ships with WPF (Windows Presentation Foundation).

But, for reasons that I don’t know, it didn’t make its way into GDI. In turn, this left MS Office with a problem. With their code base built on GDI, Office understandably didn’t want to forfeit font “smoothing” in both x- and y-direction—particularly for PowerPoint presentations. Accordingly, Office applications switch to “standard font smoothing” for sufficiently large type sizes and/or device resolutions.

The threshold ppem size of this transition seems to depend on the individual fonts. For instance, at 12 point and 120 DPI (20 ppem), Word renders Verdana bold in “standard font smoothing” while Verdana regular uses ClearType.

Transitioning from asymmetric sub-pixel anti-aliasing to [symmetric] full-pixel anti-aliasing at intractable ppem sizes: The top line is rendered in “ClearType” while the bottom line is rendered in “standard font smoothing” within one and the same Word document (Verdana 12 pt at 120 DPI)
⇒ Hover your mouse over the above illustration to see how this renders in VTT’s “Text String” at the top of VTT’s “Main Window.” Both layouts use the “lay down one character after another” method

This looks about as odd as the rotation from landscape into portrait orientation illustrated in .

The primary difference I notice right away is not the presence or absence of colors. Rather, it is the two different ways to constrain outlines. In ClearType, Verdana rounds stroke weights and positions to fractional pixel positions (due to the respective “jury rig” in the rasterizer) while in “standard font smoothing” it continues to round to full pixel positions (in absence of “suitable jury rigs” in the rasterizer, or in absence of more suitable constraints or “hints”).

But just in case you are suspecting the “evil WYSIWYG algorithm:” No, it’s not! I took the above screen shots in Word’s “Web Layout Mode” which implements a reflowable layout (cf ), substantially “laying down one character after another.” Be sure to hover your mouse over the above illustration to see the same examples rendered with VTT’s “Text String” at the top of its “Main Window.” I know for sure that this uses the “lay down one character after another” strategy.

Now, if you create an Office document using the new default font Calibri, you may be surprised to find that at 18 pt and 120 DPI (30 ppem) and above text looks a bit “washed out” or “fuzzy” compared to text at smaller point sizes. The reason behind this is twofold. On the one hand, 30 ppem appears to be the threshold for Office to switch to “standard font smoothing,” at least for Calibri regular.

On the other hand, for reasons that I don’t know, Calibri’s constraints (“hints”) are switched off if Calibri is rendered in “standard font smoothing” and at 20 ppem (12 pt at 120 DPI) or above. Combined with the two different approaches to gamma correction (cf ), this explains the “washed out” or “fuzzy” appearance of text at 18 pt at 120 DPI (30 ppem) and above.

There are a few “hoops” to jump through if you would rather turn off ClearType. I can understand why ClearType is “on” by default in Windows Vista. By the time Vista was released, LCDs outsold CRTs, and as far as I understand, a majority of users prefer ClearType over pixilated “black-and-white” rendering. However, there are users who don’t, and they may have valid reasons.

ClearType seems easy to turn off in Vista once you know where to “set about.” Just go to personalize “Appearance and Sounds,” select “Windows Color And Appearance,” click “Effects,” and change or uncheck the “method to smooth edges of screen fonts,” and you should be all set. Frankly, I didn’t know that good type is a special effect, but then again, I don’t know the first thing about marketing. Just be sure to click “Ok” and “Apply” and ClearType is off.

Once you start browsing the internet with Internet Explorer (IE 7), ClearType comes back on—probably kind of uninvitedly so. Turns out that IE 7 has its own set of user preferences. Go to “Tools,” then “Internet Options,” and then “Advanced.” Now scroll down to “Multimedia” and uncheck “Always use ClearType for HTML.” So it’s not a special effect anymore, it’s now an advanced multimedia option. Be sure to restart IE and you should be all set.

Similarly, as far as I read, when you use Office 2007 on Windows XP, ClearType is back on—again. According to the on-line documentation, turning off ClearType is not recommended. Rest assured though that this is perfectly safe. Click the “Office Button” to go to “Word Options,” choose “Popular,” and uncheck “Always use ClearType.” After special effects and advanced multimedia, ClearType is now a popular option. Just remember to restart Office and you should be ready to type without ClearType.

If nothing else, the above behavior appears consistent with the user experience I get from Vista elsewhere. For instance, as far as I can tell, there is no simple way to configure all current and future folders to look the same for ease of navigation. Instead, sometimes you have to “Customize” the “Properties” of a folder to merely show all your files as plain “Documents”—only to have said customization spontaneously revert back to “Pictures and Videos” right in front of your eyes because some random spacer.gif ended up in that folder…

*****

In this chapter we have looked at a number of challenges and obstacles on the path from the pixel data to the light appearing on screen. Most of these challenges may be the same or at least fairly similar for any computer, operating system, or software you may be using.

Some may be specific to Windows, such as the unfiltered downsampling (cf ) and the inconsistent user control of gamma correction (cf ) in the presence of anti-aliased text, or the seemingly erratic honoring of user settings and preferences (cf ).

There may be valid reasons for any of the individual challenges and obstacles discussed in this chapter, but one thing should be clear:

It all has to work together for the best end-user experience!

After covering a range of opportunities and challenges, the next chapter will finally get to discuss font rendering in broader terms.
