Autostereogram

Updated on Apr 25, 2026

An autostereogram is a single-image stereogram (SIS), designed to create the visual illusion of a three-dimensional (3D) scene from a two-dimensional image. In order to perceive 3D shapes in these autostereograms, one must overcome the normally automatic coordination between accommodation (focus) and horizontal vergence (angle of one's eyes). The illusion is one of depth perception and involves stereopsis: depth perception arising from the different perspective each eye has of a three-dimensional scene, called binocular parallax.

The simplest type of autostereogram consists of horizontally repeating patterns (often separate images) and is known as a wallpaper autostereogram. When viewed with proper convergence, the repeating patterns appear to float above or below the background. The well-known Magic Eye books feature another type of autostereogram called a random dot autostereogram. One such autostereogram is illustrated above right. In this type of autostereogram, every pixel in the image is computed from a pattern strip and a depth map. A hidden 3D scene emerges when the image is viewed with the correct convergence.

Autostereograms are similar to normal stereograms except they are viewed without a stereoscope. A stereoscope presents 2D images of the same object from slightly different angles to the left eye and the right eye, allowing us to reconstruct the original object via binocular disparity. When viewed with the proper vergence, an autostereogram does the same, the binocular disparity existing in adjacent parts of the repeating 2D patterns.

There are two ways an autostereogram can be viewed: wall-eyed and cross-eyed. Most autostereograms (including those in this article) are designed to be viewed in only one way, which is usually wall-eyed. Wall-eyed viewing requires that the two eyes adopt a relatively parallel angle, while cross-eyed viewing requires a relatively convergent angle. An image designed for wall-eyed viewing will, if viewed correctly, appear to pop out of the background; if viewed cross-eyed, it will instead appear as a cut-out behind the background and may be difficult to bring entirely into focus.

History

In 1838, the British scientist Charles Wheatstone published an explanation of stereopsis (binocular depth perception) arising from differences in the horizontal positions of images in the two eyes. He supported his explanation by showing pictures with such horizontal differences, stereograms, separately to the left and right eyes through a stereoscope he invented based on mirrors. When people looked at these flat, two-dimensional pictures, they experienced the illusion of three-dimensional depth.

Between 1849 and 1850, David Brewster, a Scottish scientist, improved the Wheatstone stereoscope by using lenses instead of mirrors, thus reducing the size of the device.

Brewster also discovered the "wallpaper effect". He noticed that staring at repeated patterns in wallpapers could trick the brain into matching pairs of them as coming from the same virtual object on a virtual plane behind the walls. This is the basis of wallpaper-style "autostereograms" (also known as single-image stereograms).

In 1939, Boris Kompaneysky published the first random dot stereogram, containing an image of the face of Venus and intended to be viewed with a device.

In 1959, Bela Julesz, a vision scientist, psychologist and MacArthur Fellow, invented the random dot stereogram while working at Bell Laboratories on recognizing camouflaged objects from aerial pictures taken by spy planes. At the time, many vision scientists still thought that depth perception occurred in the eye itself, whereas now it is known to be a complex neurological process. Julesz used a computer to create a stereo pair of random-dot images which, when viewed under a stereoscope, caused the brain to see 3D shapes. This proved that depth perception is a neurological process.

Japanese designer Masayuki Ito, following Julesz, created a single image stereogram in 1970 and Swiss painter Alfons Schilling created a handmade single-image stereogram in 1974, after creating more than one viewer and meeting with Julesz. Having experience with stereo imaging in holography, lenticular photography, and vectography, he developed a random-dot method based on closely spaced vertical lines in parallax.

In 1979, Christopher Tyler of Smith-Kettlewell Institute, a student of Julesz and a visual psychophysicist, combined the theories behind single-image wallpaper stereograms and random-dot stereograms (the work of Julesz and Schilling) to create the first black-and-white "random-dot autostereogram" (also known as single-image random-dot stereogram) with the assistance of computer programmer Maureen Clarke using Apple II and BASIC. This type of autostereogram allows a person to see 3D shapes from a single 2D image without the aid of optical equipment. In 1991 computer programmer Tom Baccei and artist Cheri Smith created the first color random-dot autostereograms, later marketed as Magic Eye.

A computer procedure that recovers the hidden geometry from an autostereogram image was described by Ron Kimmel. In addition to classical stereo matching, it adds smoothness as an important assumption in the surface reconstruction.

Simple wallpaper

Stereopsis, or stereo vision, is the visual blending of two similar but not identical images into one, with resulting visual perception of solidity and depth. In the human brain, stereopsis results from complex mechanisms that form a three-dimensional impression by matching each point (or set of points) in one eye's view with the equivalent point (or set of points) in the other eye's view. Using binocular disparity, the brain derives the points' positions in the otherwise inscrutable z-axis (depth).

When the brain is presented with a repeating pattern like wallpaper, it has difficulty matching the two eyes' views accurately. By looking at a horizontally repeating pattern, but converging the two eyes at a point behind the pattern, it is possible to trick the brain into matching one element of the pattern, as seen by the left eye, with another (similar looking) element, beside the first, as seen by the right eye. With the typical wall-eyed viewing, this gives the illusion of a plane bearing the same pattern but located behind the real wall. For a given viewer and viewing distance, the distance at which this plane lies behind the wall depends only on the spacing between identical elements.

Autostereograms use this dependence of depth on spacing to create three-dimensional images. If, over some area of the picture, the pattern is repeated at smaller distances, that area will appear closer than the background plane. If the distance of repeats is longer over some area, then that area will appear more distant (like a hole in the plane).
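The dependence of depth on spacing can be made concrete with simple ray geometry. The sketch below is not taken from the article: it assumes two pinhole eyes looking straight at a flat picture, wall-eyed fusion of neighboring elements, and similar triangles. The function name perceived_distance and the default values (65 mm eye separation, 400 mm viewing distance) are illustrative assumptions.

```python
def perceived_distance(spacing, eye_separation=65.0, viewing_distance=400.0):
    """Distance from the eyes to the fused virtual plane (wall-eyed viewing).

    All lengths share one unit (millimetres in the defaults).
    The left eye looks at one element and the right eye at its neighbor,
    `spacing` to the right; the two lines of sight cross at
    viewing_distance * eye_separation / (eye_separation - spacing).
    """
    if not 0 <= spacing < eye_separation:
        raise ValueError("wall-eyed fusion needs 0 <= spacing < eye_separation")
    return viewing_distance * eye_separation / (eye_separation - spacing)


# Smaller spacing gives a nearer plane, larger spacing a more distant one.
for spacing in (20.0, 25.0, 30.0):
    print(spacing, round(perceived_distance(spacing), 1))
```

Under these assumptions a spacing of zero places the plane on the picture itself, and the plane recedes without limit as the spacing approaches the eye separation.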

People who have never been able to perceive 3D shapes hidden within an autostereogram find it hard to understand remarks such as, "the 3D image will just pop out of the background, after you stare at the picture long enough", or "the 3D objects will just emerge from the background". It helps to illustrate how 3D images "emerge" from the background from a second viewer's perspective. If the virtual 3D objects reconstructed by the autostereogram viewer's brain were real objects, a second viewer observing the scene from the side would see these objects floating in the air above the background image.

The 3D effects in the example autostereogram are created by repeating the tiger rider icons every 140 pixels on the background plane, the shark rider icons every 130 pixels on the second plane, and the tiger icons every 120 pixels on the highest plane. The closer a set of icons is packed horizontally, the higher it is lifted from the background plane. This repeat distance is referred to as the depth or z-axis value of a particular pattern in the autostereogram. The depth value is also known as the Z-buffer value.

The brain is capable of almost instantly matching hundreds of patterns repeated at different intervals in order to recreate correct depth information for each pattern. An autostereogram may contain some 50 tigers of varying size, repeated at different intervals against a complex, repeated background. Yet, despite the apparent chaotic arrangement of patterns, the brain is able to place every tiger icon at its proper depth.

Depth maps

Autostereograms where patterns in a particular row are repeated horizontally with the same spacing can be read either cross-eyed or wall-eyed. In such autostereograms, both types of reading will produce similar depth interpretation, with the exception that the cross-eyed reading reverses the depth (images that once popped out are now pushed in).

However, icons in a row do not need to be arranged at identical intervals. An autostereogram with varying intervals between icons across a row presents these icons at different depth planes to the viewer. The depth for each icon is computed from the distance between it and its neighbor at the left. These types of autostereograms are designed to be read in only one way, either cross-eyed or wall-eyed. All autostereograms in this article are encoded for wall-eyed viewing, unless specifically marked otherwise. An autostereogram encoded for wall-eyed viewing will produce inverse patterns when viewed cross-eyed, and vice versa. Most Magic Eye pictures are also designed for wall-eyed viewing.

The wall-eyed depth map example autostereogram to the right encodes 3 planes across the x-axis. The background plane is on the left side of the picture. The highest plane is shown on the right side of the picture. There is a narrow middle plane in the middle of the x-axis. Starting with a background plane where icons are spaced at 140 pixels, one can raise a particular icon by shifting it a certain number of pixels to the left. For instance, the middle plane is created by shifting an icon 10 pixels to the left, effectively creating a spacing consisting of 130 pixels. The brain does not rely on intelligible icons which represent objects or concepts. In this autostereogram, patterns become smaller and smaller down the y-axis, until they look like random dots. The brain is still able to match these random dot patterns.
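The shifting rule above can be sketched in a few lines of Python. This is only an illustration: the function name place_icons and its arguments are hypothetical, and it assumes each icon's depth is encoded purely by its distance to the neighbor on its left, with a background spacing of 140 pixels.

```python
def place_icons(shifts, base_spacing=140, first_x=0):
    """Return x positions for a row of icons.

    `shifts` lists, for every icon after the first, how many pixels it is
    moved to the left relative to the background spacing. A shift of 0
    keeps the icon on the background plane; larger shifts raise it.
    """
    positions = [first_x]
    for shift in shifts:
        if not 0 <= shift < base_spacing:
            raise ValueError("shift must be in [0, base_spacing)")
        # Spacing to the left neighbor is the base spacing minus the shift.
        positions.append(positions[-1] + base_spacing - shift)
    return positions


# Background (140 px), middle plane (130 px), highest plane (120 px).
print(place_icons([0, 10, 20]))  # [0, 140, 270, 390]
```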

The distance relationship between any pixel and its counterpart in the equivalent pattern to the left can be expressed in a depth map. A depth map is simply a grayscale image which represents the distance between a pixel and its left counterpart using a grayscale value between black and white. By convention, the shorter this distance (and so the closer the point appears to the viewer), the brighter the value.

Using this convention, a grayscale depth map for the example autostereogram can be created with black, gray and white representing shifts of 0 pixels, 10 pixels and 20 pixels, respectively, as shown in the grayscale example autostereogram. A depth map is the key to the creation of random-dot autostereograms.
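As a rough illustration of this convention, the following sketch converts an 8-bit gray value into a pixel shift and the resulting repeat interval. The linear mapping, the function names, and the 8-bit range are assumptions made for the example; only the figures 0, 10, 20 and 140 pixels come from the text above.

```python
def gray_to_shift(gray, max_shift=20, max_gray=255):
    """Map a gray value (0 = black, max_gray = white) to a shift in pixels."""
    if not 0 <= gray <= max_gray:
        raise ValueError("gray value out of range")
    return round(gray * max_shift / max_gray)


def repeat_interval(gray, background_spacing=140, max_shift=20):
    """Spacing between a pixel and its left counterpart for a gray value."""
    return background_spacing - gray_to_shift(gray, max_shift)


for name, gray in (("black", 0), ("gray", 128), ("white", 255)):
    print(name, gray_to_shift(gray), repeat_interval(gray))
# black 0 140
# gray 10 130
# white 20 120
```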

Random-dot

A computer program can take a depth map and an accompanying pattern image to produce an autostereogram. The program tiles the pattern image horizontally to cover an area whose size is identical to the depth map. Conceptually, at every pixel in the output image, the program looks up the grayscale value of the equivalent pixel in the depth map image, and uses this value to determine the amount of horizontal shift required for the pixel.

One way to accomplish this is to make the program scan every line in the output image pixel-by-pixel from left to right. It seeds the first series of pixels in a row from the pattern image. Then it consults the depth map to retrieve appropriate shift values for subsequent pixels. For every pixel, it subtracts the shift from the width of the pattern image to arrive at a repeat interval. It uses this repeat interval to look up the color of the counterpart pixel to the left and uses its color as the new pixel's own color.
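The left-to-right scan described above might look like the following in Python. This is a minimal sketch rather than the code of any particular stereogram program: images are plain nested lists, the depth map is assumed to hold 8-bit gray values, and the names make_autostereogram, random_dot_pattern and max_shift are chosen for the example.

```python
import random


def random_dot_pattern(width, height, seed=0):
    """Build a small black-and-white random-dot pattern strip."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]


def make_autostereogram(depth_map, pattern, max_shift):
    """Render an autostereogram from a depth map and a pattern strip.

    depth_map: rows of gray values in 0..255 (brighter = closer).
    pattern:   rows of pixel values; tiled vertically if it is shorter.
    max_shift: shift in pixels assigned to white (255).
    """
    pattern_width = len(pattern[0])
    if not 0 <= max_shift < pattern_width:
        raise ValueError("max_shift must be smaller than the pattern width")
    output = []
    for y, depth_row in enumerate(depth_map):
        pattern_row = pattern[y % len(pattern)]
        out_row = []
        for x, gray in enumerate(depth_row):
            if x < pattern_width:
                # Seed the first strip of the row directly from the pattern.
                out_row.append(pattern_row[x])
            else:
                # Repeat interval = pattern width minus the depth shift.
                shift = round(gray * max_shift / 255)
                interval = pattern_width - shift
                # Copy the color of the counterpart pixel to the left.
                out_row.append(out_row[x - interval])
        output.append(out_row)
    return output


# A raised square on a flat background, rendered with random dots.
depth = [[255 if 30 <= x < 70 and 5 <= y < 15 else 0 for x in range(100)]
         for y in range(20)]
image = make_autostereogram(depth, random_dot_pattern(20, 20), max_shift=4)
print("\n".join("".join(".#"[p] for p in row) for row in image))
```

This simple version copies strictly from the left, so depth values under the first pattern strip are ignored, and sharp depth edges can leave short repeated runs that more elaborate algorithms try to hide.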

Unlike the simple depth planes created by simple wallpaper autostereograms, subtle changes in spacing specified by the depth map can create the illusion of smooth gradients in distance. This is possible because the grayscale depth map allows individual pixels to be placed on one of 2ⁿ depth planes, where n is the number of bits used by each pixel in the depth map. In practice, the total number of depth planes is determined by the number of pixels used for the width of the pattern image. Each grayscale value must be translated into pixel space in order to shift pixels in the final autostereogram. As a result, the number of depth planes must be smaller than the pattern width.
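The interaction between bit depth and pixel shifts can be checked numerically. The sketch below assumes gray values are mapped linearly onto whole-pixel shifts from 0 to max_shift, which is one plausible translation into pixel space and not necessarily the one a given program uses; the function name is hypothetical.

```python
def usable_depth_planes(bits_per_pixel, max_shift):
    """Count distinct pixel shifts reachable from a depth map.

    Every gray level is rounded to a whole-pixel shift in 0..max_shift,
    so several gray levels may collapse onto the same depth plane.
    """
    gray_levels = 2 ** bits_per_pixel
    shifts = {round(g * max_shift / (gray_levels - 1))
              for g in range(gray_levels)}
    return len(shifts)


print(usable_depth_planes(8, 20))  # 21: limited by the shift range
print(usable_depth_planes(2, 20))  # 4: limited by the bit depth
```

With this mapping an 8-bit depth map offers 256 gray levels, but a maximum shift of 20 pixels leaves only 21 distinguishable planes.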

The fine-tuned gradient requires a pattern image more complex than standard repeating-pattern wallpaper, so typically a pattern consisting of repeated random dots is used. When the autostereogram is viewed with the proper viewing technique, a hidden 3D scene emerges. Autostereograms of this form are known as random dot autostereograms.

Smooth gradients can also be achieved with an intelligible pattern, assuming that the pattern is complex enough and does not have big, horizontal patches of uniform color. A big area painted in a single color, without change in hue and brightness, does not lend itself to pixel shifting, as the result of the horizontal shift is identical to the original patch. The following depth map of a shark with smooth gradient produces a perfectly readable autostereogram, even though the 2D image contains small uniform areas; the brain is able to recognize these small gaps and fill in the blanks (illusory contours). Although intelligible, repeated patterns are used instead of random dots, this type of autostereogram is still known by many as a random dot autostereogram, because it is created using the same process.

Animated

When a series of autostereograms are shown one after another, in the same way moving pictures are shown, the brain perceives an animated autostereogram. If all autostereograms in the animation are produced using the same background pattern, it is often possible to see faint outlines of parts of the moving 3D object in the 2D autostereogram image without wall-eyed viewing; the constantly shifting pixels of the moving object can be clearly distinguished from the static background plane. To eliminate this side effect, animated autostereograms often use shifting background in order to disguise the moving parts.

When a regular repeating pattern is viewed on a CRT monitor as if it were a wallpaper autostereogram, it is usually possible to see depth ripples. This can also be seen in the background to a static, random-dot autostereogram. These are caused by the sideways shifts in the image due to small changes in the deflection sensitivity (linearity) of the line scan, which then become interpreted as depth. This effect is especially apparent at the left hand edge of the screen where the scan speed is still settling after the flyback phase. On a TFT LCD, which functions differently, this does not occur and the effect is not present. Higher quality CRT displays also have better linearity and exhibit less or none of this effect.

Mechanisms for viewing

Much advice exists about seeing the intended three-dimensional image in an autostereogram. While some people may quickly see the 3D image in an autostereogram with little effort, others must learn to train their eyes to decouple eye convergence from lens focusing.

Not every person can see the 3D illusion in autostereograms. Because autostereograms are constructed based on stereo vision, persons with a variety of visual impairments, even those affecting only one eye, are unable to see the three-dimensional images.

People with amblyopia (also known as lazy eye) are unable to see the three-dimensional images. Children with poor or dysfunctional eyesight during a critical period in childhood may grow up stereoblind, as their brains are not stimulated by stereo images during the critical period. If such a vision problem is not corrected in early childhood, the damage becomes permanent and the adult will never be able to see autostereograms. It is estimated that some 1 percent to 5 percent of the population is affected by amblyopia.

3D perception

Depth perception results from many monocular and binocular visual cues. For objects relatively close to the eyes, binocular vision plays an important role in depth perception. Binocular vision allows the brain to create a single Cyclopean image and to attach a depth level to each point in it.

The brain uses coordinate shift (also known as parallax) of matched objects to identify depth of these objects. The depth level of each point in the combined image can be represented by a grayscale pixel on a 2D image, for the benefit of the reader. The closer a point appears to the brain, the brighter it is painted. Thus, the way the brain perceives depth using binocular vision can be captured by a depth map (Cyclopean image) painted based on coordinate shift.

The eye operates like a photographic camera. It has an adjustable iris which can open (or close) to allow more (or less) light to enter the eye. As with any camera except pinhole cameras, it needs to focus light rays entering through the pupil (the aperture in a camera) so that they converge on a single point on the retina in order to produce a sharp image. The eye achieves this goal by adjusting a lens behind the cornea to refract light appropriately.

Stereo-vision based on parallax allows the brain to calculate depths of objects relative to the point of convergence. It is the convergence angle that gives the brain the absolute reference depth value for the point of convergence from which absolute depths of all other objects can be inferred.

Simulated 3D perception

The eyes normally focus and converge at the same distance in a process known as accommodative convergence. That is, when looking at a faraway object, the brain automatically flattens the lenses and rotates the two eyeballs for wall-eyed viewing. It is possible to train the brain to decouple these two operations. This decoupling has no useful purpose in everyday life, because it prevents the brain from interpreting objects in a coherent manner. To see a man-made picture such as an autostereogram where patterns are repeated horizontally, however, decoupling of focusing from convergence is crucial.

By focusing the lenses on a nearby autostereogram where patterns are repeated and by converging the eyeballs at a distant point behind the autostereogram image, one can trick the brain into seeing 3D images. If the patterns received by the two eyes are similar enough, the brain will consider these two patterns a match and treat them as coming from the same imaginary object. This type of visualization is known as wall-eyed viewing, because the eyeballs adopt a wall-eyed convergence on a distant plane, even though the autostereogram image is actually closer to the eyes. Because the two eyeballs converge on a plane farther away, the perceived location of the imaginary object is behind the autostereogram. The imaginary object also appears bigger than the patterns on the autostereogram because of foreshortening.

The following autostereogram shows three rows of repeated patterns. Each pattern is repeated at a different interval to place it on a different depth plane. The two non-repeating lines can be used to verify correct wall-eyed viewing. When the autostereogram is correctly interpreted by the brain using wall-eyed viewing, and one stares at the dolphin in the middle of the visual field, the brain should see two sets of flickering lines, as a result of binocular rivalry.

While there are six dolphin patterns in the autostereogram, the brain should see seven "apparent" dolphins on the plane of the autostereogram. This is a side effect of the pairing of similar patterns by the brain. There are five pairs of dolphin patterns in this image. This allows the brain to create five apparent dolphins. The leftmost pattern and the rightmost pattern by themselves have no partner, but the brain tries to assimilate these two patterns onto the established depth plane of adjacent dolphins despite binocular rivalry. As a result, there are seven apparent dolphins, with the leftmost and the rightmost ones appearing with a slight flicker, not dissimilar to the two sets of flickering lines observed when one stares at the 4th apparent dolphin.

Because of foreshortening, the difference in convergence needed to see repeated patterns on different planes causes the brain to attribute different sizes to patterns with identical 2D sizes. In the autostereogram of three rows of cubes, while all cubes have the same physical 2D dimensions, the ones on the top row appear bigger, because they are perceived as farther away than the cubes on the second and third rows.

Viewing techniques

If one has two eyes, fairly healthy eyesight, and no neurological conditions which prevent the perception of depth, then one is capable of learning to see the images within autostereograms. "Like learning to ride a bicycle or to swim, some pick it up immediately, while others have a harder time."

As with a photographic camera, it is easier to make the eye focus on an object when there is intense ambient light. With intense lighting, the eye can constrict the pupil, yet allow enough light to reach the retina. The more the eye resembles a pinhole camera, the less it depends on focusing through the lens. In other words, the degree of decoupling between focusing and convergence needed to visualize an autostereogram is reduced. This places less strain on the brain. Therefore, it may be easier for first-time autostereogram viewers to "see" their first 3D images if they attempt this feat with bright lighting.

Vergence control is important in being able to see 3D images. Thus it may help to concentrate on converging/diverging the two eyes to shift images that reach the two eyes, instead of trying to see a clear, focused image. Although the lens adjusts reflexively in order to produce clear, focused images, voluntary control over this process is possible. The viewer alternates instead between converging and diverging the two eyes, in the process seeing "double images" typically seen when one is drunk or otherwise intoxicated. Eventually the brain will successfully match a pair of patterns reported by the two eyes and lock onto this particular degree of convergence. The brain will also adjust eye lenses to get a clear image of the matched pair. Once this is done, the images around the matched patterns quickly become clear as the brain matches additional patterns using roughly the same degree of convergence.

When one moves one's attention from one depth plane to another (for instance, from the top row of the chessboard to the bottom row), the two eyes need to adjust their convergence to match the new repeating interval of patterns. If the level of change in convergence is too high during this shift, sometimes the brain can lose the hard-earned decoupling between focusing and convergence. For a first-time viewer, therefore, it may be easier to see the autostereogram, if the two eyes rehearse the convergence exercise on an autostereogram where the depth of patterns across a particular row remains constant.

In a random dot autostereogram, the 3D image is usually shown in the middle of the autostereogram against a background depth plane (see the shark autostereogram). It may help to establish proper convergence first by staring at either the top or the bottom of the autostereogram, where patterns are usually repeated at a constant interval. Once the brain locks onto the background depth plane, it has a reference convergence degree from which it can then match patterns at different depth levels in the middle of the image.

The majority of autostereograms, including those in this article, are designed for divergent (wall-eyed) viewing. One way to help the brain concentrate on divergence instead of focusing is to hold the picture in front of the face, with the nose touching the picture. With the picture so close to their eyes, most people cannot focus on the picture. The brain may give up trying to move eye muscles in order to get a clear picture. If one slowly pulls back the picture away from the face, while refraining from focusing or rotating eyes, at some point the brain will lock onto a pair of patterns when the distance between them matches the current convergence degree of the two eyeballs.

Another way is to stare at an object behind the picture in an attempt to establish proper divergence, while keeping part of the eyesight fixed on the picture to convince the brain to focus on the picture. A modified method has the viewer focus on their reflection on a reflective surface of the picture, which the brain perceives as being located twice as far away as the picture itself. This may help persuade the brain to adopt the required divergence while focusing on the nearby picture.

For cross-eyed autostereograms, a different approach needs to be taken. The viewer may hold one finger between their eyes and move it slowly towards the picture, maintaining focus on the finger at all times, until they are correctly focused on the spot that will allow them to view the illusion.

Terminology

Stereogram and autostereogram

The term stereogram was originally used to describe a pair of 2D images used in a stereoscope to present a 3D image to viewers. The "auto" in autostereogram describes an image that does not require a stereoscope. The term stereogram is now often used interchangeably with autostereogram. Dr. Christopher Tyler, inventor of the autostereogram, consistently refers to single image stereograms as autostereograms to distinguish them from other forms of stereograms.

Random dot stereogram (RDS)

A random dot stereogram is a pair of 2D images containing random dots which, when viewed with a stereoscope, produce a 3D image. The term is now often used interchangeably with random dot autostereogram.

Single image stereogram (SIS)

A single image stereogram (SIS) differs from earlier stereograms in its use of a single 2D image instead of a stereo pair and is viewed without a device. Thus, the term is often used as a synonym of autostereogram. When the single 2D image is viewed with proper eye convergence, it causes the brain to fuse different patterns perceived by the two eyes into a virtual 3D image, hidden within the 2D image, without the aid of any optical equipment. SIS images are created using a repeating pattern. Programs for their creation include Mathematica.

Random dot autostereogram/hidden image stereogram

A random dot autostereogram is also known as a single image random dot stereogram (SIRDS). The term also refers to autostereograms where the hidden 3D image is created using a random pattern of dots within one image, shaped by a depth map within a dedicated stereogram rendering program.

Wallpaper autostereogram/object array stereogram/texture offset stereogram

Wallpaper autostereogram is a single 2D image where recognizable patterns are repeated at various intervals to raise or lower each pattern's perceived 3D location in relation to the display surface. Despite the repetition, these are a type of single image autostereogram.

Single image random text stereogram (SIRTS)

A single image random text stereogram (SIRTS) is an alternative to SIRDS that uses random ASCII text instead of dots to produce a 3D form of ASCII art.
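A text stereogram can be sketched by applying the left-to-right copying procedure from the Random-dot section to characters instead of pixels. The example below is an assumption about how such an image could be produced, not a description of any specific SIRTS tool: the function name make_sirts, the digit-based depth rows, and the strip width of 12 characters are all made up for illustration.

```python
import random
import string


def make_sirts(depth_rows, strip_width=12, seed=0):
    """Render a random-text stereogram as a list of strings.

    depth_rows: strings of digits; each digit is the shift, in characters,
                applied at that column (0 = background, larger = closer).
    """
    if strip_width <= 9:
        raise ValueError("strip_width must exceed the largest digit shift")
    rng = random.Random(seed)
    lines = []
    for depth_row in depth_rows:
        line = []
        for x, digit in enumerate(depth_row):
            if x < strip_width:
                # Seed the first strip with random lowercase letters.
                line.append(rng.choice(string.ascii_lowercase))
            else:
                # Copy the character one repeat interval to the left.
                line.append(line[x - (strip_width - int(digit))])
        lines.append("".join(line))
    return lines


# A raised band in the middle rows, flat background elsewhere.
flat = "0" * 48
band = "0" * 18 + "1" * 12 + "0" * 18
for text_line in make_sirts([flat, band, band, flat]):
    print(text_line)
```

Viewed wall-eyed in a monospaced font, the columns marked 1 should appear slightly nearer than the rest, although a one-character shift is a coarse depth step.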

Map textured stereogram

In a map textured stereogram, "a fitted texture is mapped onto the depth image and repeated a number of times" resulting in a pattern where the resulting 3D image is often partially or fully visible before viewing.

References

Autostereogram Wikipedia

(Text) CC BY-SA
