Before moving forward with graphics, we should discuss some basics about how we generate graphics at the hardware level so you can truly understand what is going on. To begin with, there are two types of graphics: raster and vector. A 2D raster graphic consists of a 2-dimensional addressable matrix of color values. The smallest element of this matrix is known as a pixel. A pixel can be only one color. The following image is an example of a raster graphic; an enlarged version is shown on the right with a grid to highlight the discrete pixels of the image. Notice that the pixels never contain more than a single color.
The width and height of a raster graphic are known as its resolution. Thus, the graphic above has a resolution of 32x32 pixels. Each pixel's color in a raster graphic is represented by a number, which is almost always an integer. The size of the integer used to represent a color is known as the image's color depth. On most modern systems, 32-bit integers are used to represent colors, so we say such images have 32-bit color depth, as in the following image:
Older systems often used 16-bit or 8-bit integers, which means that they often could not achieve photorealistic graphics (e.g., the very first Nintendo). (N.b., the exception to this would be images that use color palettes with 8- or 16-bit colors on modern systems, but palettes are beyond the scope of this unit.) The following image is the image above reduced to 256 colors (8-bit):
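To make this concrete, here is a minimal sketch of a raster image as a grid of integers, one per pixel (written in Python purely for illustration; the names and values are ours, and this is not the FTGraphics library used later in this book):

```python
# A raster image is just a 2-dimensional grid of integers, one per pixel.
# Here every pixel is a 32-bit value, so the color depth is 32 bits.

WIDTH, HEIGHT = 32, 32                      # resolution: 32x32 pixels

# Start with every pixel set to 0 (black).
image = [[0 for _ in range(WIDTH)] for _ in range(HEIGHT)]

# A pixel holds exactly one color: set the pixel in column 3, row 5
# to some arbitrary 32-bit color value.
image[5][3] = 0xFF00FF00

print(len(image), "rows by", len(image[0]), "columns")   # 32 rows by 32 columns
```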
Given this description of raster graphics, how are they implemented in hardware? The primary graphical hardware of a computer is obviously the monitor, which in all modern systems is raster hardware by design. Both older cathode ray tube (CRT) monitors and modern flat panel monitors seek to achieve the same end: a 2-dimensional addressable matrix of colored pixels. But how do we achieve millions of different colors? First we have to understand a little about the physics of light and the biology of the human eye. The best model for understanding color and visible light is the wave theory of light. The wave theory posits that light travels in waves. The type of light is defined by its wavelength, which is the distance between two peaks of a wave. The closer the peaks, the shorter the wavelength. Radio waves and x-rays are both forms of light, but what we know as visible light is light with wavelengths between about 400 and 700 nanometers (1 billion nanometers equals 1 meter).
The human eye is trichromatic because it contains three different color receptors (known as cones) that respond to a range of different wavelengths. The three types of cones are known as short, medium, and long after the length of the wavelengths they detect (i.e., 700 nm is longer than 400 nm). As you can see, these ranges overlap.
Due to the wiring of the brain, stimulation of short wavelength cones produces the perception of blueish hues (a hue is the shade of a color: though red and maroon are different colors, they have the same hue and differ only in brightness). Stimulation of medium wavelength cones corresponds to yellowish-green hues, and stimulation of long wavelength cones corresponds to reddish hues. The reason we say blueish and reddish instead of blue and red is that cones respond to ranges of hues and not just distinct hues. Looking at the chart above, you can see that if light with a wavelength of 700 nm falls on the retina (the part of the eye that detects light), long wavelength cones will be stimulated with almost no stimulation of short or medium cones. However, what happens with light of 600 nm? Both long and medium cones are stimulated. Or with light of 450 nm? Short cones are predominantly stimulated, but long and medium cones are stimulated to a lesser extent as well. If multiple cones are stimulated, they compete. This competition yields the range of thousands of different hues we detect. The full visible spectrum in the figure above is the result of these hues. For example, you can see how the brain produces yellow by interpolating between red and green. You can also see why we say white light is a combination of all colors: if all of the cones are equally stimulated, the brain interprets the color as white.

So we can assign a hue to every wavelength of light. But how can we implement this at the level of computer monitor hardware? Can every pixel produce any wavelength of light? Obviously this would produce all the colors we need, but it is actually quite tricky to implement physically. Of course, since we understand how the eye works, we can trick the brain into perceiving any color we want simply by stimulating the three cones. For example, if the eye receives light at a wavelength of 580 nm, it will perceive it as yellow due to how the long and medium cones are stimulated. However, if it receives light composed of two wavelengths, 650 nm (red) and 550 nm (green), it will again simply compare the responses of the long and medium cones and also come up with yellow. The light composed of red and green that falls on the retina isn't really monochromatic (single-wavelength) yellow light as defined by physicists, but the brain cannot tell the difference. We can use what is basically an optical illusion to our advantage in monitor hardware.

All we need to do is define 3 primary colors (if humans had 4 cones, we would have to define 4 primary colors). Essentially every computer monitor in use today relies on 3 primary colors (red, green, and blue), which together are known as the RGB color model. Although different monitor hardware produces slightly different wavelengths, these colors have wavelengths around 650 nm for red, 550 nm for green, and 450 nm for blue.

If a pixel is small enough, the eye physically resolves that location on a screen as a single point (how small a pixel has to be depends, of course, on how far away the viewer is from the screen; in other words, pixels on digital billboards can be much larger than screen pixels because the viewer is further from the billboard). Although the eye resolves a pixel as a point, it is actually composed of three subpixels which can produce only red, green, and blue at different intensities. Of course, the eye sees all three subpixels as one and thus tries to interpret them as a single color. So if we want yellow, we shut off the blue subpixel and turn on the red and green subpixels to full intensity. This technique applies to both flat panels and older CRT monitors, but you should be aware that the technology behind each is very different. More importantly, older hardware has different standards for what wavelengths are defined for red, green, and blue. This means that a color on one monitor can appear as a different color on another monitor (that's why most modern monitors adhere to similar standards).
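As a quick illustration, and assuming the common convention of one byte (0 to 255) of intensity per subpixel, a few colors might be mixed like this (a Python sketch for illustration only; the names and values are ours, not part of any particular graphics library):

```python
# Each pixel is really three subpixels (red, green, blue), each driven at
# some intensity.  Assuming one byte (0-255) of intensity per subpixel:

colors = {
    "red":    (255,   0,   0),   # only the red subpixel is on
    "green":  (  0, 255,   0),
    "blue":   (  0,   0, 255),
    "yellow": (255, 255,   0),   # red and green fully on, blue off
    "white":  (255, 255, 255),   # all three cones stimulated equally
    "black":  (  0,   0,   0),   # all subpixels off
}

for name, (r, g, b) in colors.items():
    print(f"{name:>6}: R={r:3} G={g:3} B={b:3}")
```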

Given that the hardware needs to know what intensity each subpixel should be set to, how do we implement this in software? The answer is in the integers we mentioned earlier that define colors. As we said, almost all color mappings today use 32-bit integers. This means there are 4 bytes per color. However, we use only 3 bytes for color: one each for red, green, and blue. The 4th byte is generally used to specify transparency (i.e., the degree to which the background image should be blended with the new image). Using 3 bytes means we have 2^24 = 16,777,216, or just over 16 million, possible colors.
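A minimal sketch of how such a 32-bit color might be packed and unpacked (again in Python for illustration; the byte order shown, alpha-red-green-blue, is just one common convention, and real systems also use RGBA, BGRA, and others):

```python
# Pack four 8-bit channels into one 32-bit integer and back again.
# The layout here (alpha in the high byte, then red, green, blue) is only
# one common convention; real systems also use RGBA, BGRA, and others.

def pack_argb(a, r, g, b):
    return (a << 24) | (r << 16) | (g << 8) | b

def unpack_argb(color):
    return ((color >> 24) & 0xFF,    # alpha (transparency)
            (color >> 16) & 0xFF,    # red
            (color >> 8)  & 0xFF,    # green
            color         & 0xFF)    # blue

yellow = pack_argb(255, 255, 255, 0)    # fully opaque yellow
print(hex(yellow))                      # 0xffffff00
print(unpack_argb(yellow))              # (255, 255, 255, 0)

# Three color bytes give 2**24 possible colors.
print(2 ** 24)                          # 16777216
```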

Of course, there are a lot of steps between a 32-bit number in software and a colored pixel on monitor hardware. This is where we benefit from virtualization and encapsulation. There are many layers of software between the programmer and the hardware. The software layer closest to the hardware is known as a driver. The software layer closest to the programmer is generally known as a graphics engine. While monitor hardware deals with raster graphics, graphics engines can be either raster-based or vector-based. Raster graphics engines allow the drawing of basic points, lines, and shapes whose locations are defined in terms of pixels (or at least some whole-number grid). In raster graphics engines, the exact color of each pixel is the responsibility of the programmer. For example, if a red line is drawn from point A to point B, every pixel along that line is red and all other pixels are left unchanged:
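A minimal sketch of this all-or-nothing behavior (in Python for illustration; it uses a simple stepping approach rather than the Bresenham algorithm a real raster engine would likely use):

```python
# Color every pixel along the line; leave all other pixels unchanged.
# A simple stepping (DDA-style) approach; coordinates are whole pixels.

def draw_line_raster(pixels, x0, y0, x1, y1, color):
    steps = max(abs(x1 - x0), abs(y1 - y0), 1)
    for i in range(steps + 1):
        t = i / steps
        x = round(x0 + (x1 - x0) * t)
        y = round(y0 + (y1 - y0) * t)
        pixels[y][x] = color          # each pixel is either full color or untouched

RED = 0xFFFF0000                      # opaque red in the ARGB packing shown earlier
WIDTH = HEIGHT = 16
pixels = [[0] * WIDTH for _ in range(HEIGHT)]
draw_line_raster(pixels, 2, 3, 13, 9, RED)
```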
Vector graphics engines also allow drawing points, lines, and shapes, except that the locations are defined in terms of some other units (such as inches, millimeters, or points) which do not necessarily relate directly to pixels. Thus, in vector graphics engines, the programmer specifies only that a line should run between points A and B, and the engine determines the colors of the pixels in between:
Notice that the line from the vector image appears smoother than in the raster image. This is because even though pixels are small enough to be perceived as single points, they are not small enough that every point on a line falls exactly on a single pixel. To understand this, we will plot a line from the start and end points of the raster image on the zoomed grid:
If a line crosses only a tiny bit of a pixel, the raster engine colors that pixel full red and ignores all other pixels. On the other hand, consider the vector-engine line:
Notice that the vector engine starts plotting the line from the center of the start and end pixels and weights pixels by how much of the line crosses through each of them. In other words, a pixel that the line merely clips at one corner is colored light red, while a pixel the line crosses from its top-left corner to its bottom-right corner is colored full red. The reason the vector line appears smoother is that the brain perceives the partially shaded pixels as if the line crossed them at subpixel resolution. Thus, vector engines produce visually smoother graphics. The other advantage of vector graphics is that they can be zoomed to any size or rotated without losing image quality. The reason for this is that shapes are defined not in terms of pixels but in terms of coordinates, which can easily be scaled and rotated.
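A minimal sketch of this weighting idea (in Python for illustration; it approximates coverage by the line's distance from each pixel row, in the spirit of Xiaolin Wu's line algorithm, and assumes a shallow left-to-right line rather than handling every case a real engine would):

```python
# Shade pixels by how close the ideal line passes to each pixel center,
# splitting the "ink" between the row above and the row below the line.
# Simplified, Wu-style sketch: assumes x0 < x1 and a slope between 0 and 1.

def draw_line_weighted(shade, x0, y0, x1, y1):
    slope = (y1 - y0) / (x1 - x0)
    for x in range(x0, x1 + 1):
        y = y0 + slope * (x - x0)       # exact height of the line in this column
        row = int(y)
        frac = y - row                  # how far the line sits into the next row
        shade(x, row, 1.0 - frac)       # most of the ink goes to the nearer row
        shade(x, row + 1, frac)         # the remainder goes to the row below

# Record the weights instead of drawing, just to see what the engine would do.
weights = {}
def record(x, y, weight):
    if weight > 0:
        weights[(x, y)] = round(weight, 2)

draw_line_weighted(record, 0, 0, 6, 3)
print(weights)   # e.g. {(0, 0): 1.0, (1, 0): 0.5, (1, 1): 0.5, ...}
```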

Since all monitors expect to receive raster graphics commands, the vector graphics engine handles the rasterization of the image internally just before sending it to the screen to be displayed. Thus, the difference between vector and raster graphics is yet another example of abstraction in programming. Vector graphics engines remove the need for the programmer to worry about pixels and allow him/her to focus on design (abstracting away the details of producing an image at the hardware level just as high-level programming languages abstract away the production of machine code for the processor). Vector graphic images or movies are stored in a vector graphics format like SVG (Scalable Vector Graphics) or Adobe Flash (SWF). Raster graphic images or movies are stored in raster graphics formats like PNG, JPEG, GIF, and BMP. Raster graphic formats are generally better known, but it is very important to understand how to use vector formats for things like web pages (where scaling can be very important for readability) and posters/presentations (where screen or paper sizes at the final presentation stage differ from those at the design stage).
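To sketch how a vector engine might keep coordinates in real-world units and convert them to pixels only at this final rasterization step (in Python for illustration; the class and method names are hypothetical and are not the API of FTGraphics or any other real engine):

```python
# A vector engine accepts coordinates in real-world units and only converts
# them to pixels at the rasterization step.  Class and method names here are
# hypothetical, not the API of FTGraphics or any other real engine.

MM_PER_INCH = 25.4

class VectorCanvas:
    def __init__(self, dpi=96):
        self.dpi = dpi                  # pixels per inch of the target display
        self.shapes = []                # shapes stored in millimeters, not pixels

    def draw_line(self, x0_mm, y0_mm, x1_mm, y1_mm):
        # The programmer thinks in millimeters; nothing is rasterized yet.
        self.shapes.append(("line", x0_mm, y0_mm, x1_mm, y1_mm))

    def to_pixels(self, mm):
        # Rasterization step: convert a physical length to device pixels.
        return mm / MM_PER_INCH * self.dpi

canvas = VectorCanvas(dpi=96)
canvas.draw_line(0.0, 0.0, 50.0, 25.0)   # a line 50 mm across and 25 mm down
print(canvas.to_pixels(50.0))            # ~188.98 pixels at 96 DPI
```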

In this book we cover both raster and vector graphics. The FTGraphics library is currently a vector-raster hybrid graphics engine (i.e., you specify general coordinates instead of exact pixels). The reason it is not purely a vector engine is that effects such as smooth line drawing have not yet been added (however, this will change in the future).