Basic information for depth

  

Summary: Non-scalar or relative distance perception is often simply referred to as "depth" perception. It involves some of the factors that allow us to perceive the separations of objects in our three-dimensional world. Many different kinds of information (sometimes called "cues") contribute to the appearance of depth. The present article provides a basic summary of such information in the context of a realistic scene.

Many of the non-scalar factors that aid in seeing a realistic, everyday scene – with some objects appearing nearer and others appearing farther – arise from the images presented to us by the scene. We also use information from the registration of how our eye muscles respond to a particular scene, as well as information based on differences between the images reaching our left and right eyes. Over any realistic period, positional changes among moving images, and changes in the state of our eye muscles over time, also contribute to seeing depth. Finally, there are certain built-in characteristics of our perceptual system that also affect how things appear.

Although this article does not include an exhaustive list of all cues for distance, the primary omissions involve factors that provide not only a basis for perceiving the relative positions of objects in depth, but also contribute a sense of scale.  Such "scalar" information provides a sense of how far away an object is from us, in behaviorally meaningful terms.  These factors are discussed in the article Basic information for scalar distance.  Furthermore, many of the depth cues included in the present article deserve individual elaboration; for example, see the article Do smaller things appear farther away?

Examples using a natural scene

The photograph of a scene from a game of American football provides several excellent examples of different non-scalar depth cues. Specifically, the photo in Figure 1 shows the players of two teams awaiting the start of a play (or "down"). Notice that the players clearly appear spread across the field, some seeming much farther away than others.

Figure 1

Changes in angular size

One important factor in creating the apparent depth in Figure 1 involves what is often called "relative size." Note that the images of the players in the bottom left-hand corner of the display are much larger than the images of players elsewhere in the display. This is equally true whether one measures the images on the photo itself or calculates the angular size of each player's image. Other things being equal, the larger images of similar objects will appear closer to you.
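As a rough illustration of why the same-sized player produces a smaller image when farther away, the short Python sketch below computes the visual angle subtended by an object viewed head-on. The player height and the two viewing distances are hypothetical values chosen only for illustration; they are not measurements taken from Figure 1.

```python
import math

def angular_size_deg(object_height_m: float, distance_m: float) -> float:
    """Visual angle, in degrees, subtended by an object of the given
    height when viewed head-on from the given distance."""
    return math.degrees(2.0 * math.atan(object_height_m / (2.0 * distance_m)))

# Hypothetical values: a player 1.9 m tall seen from 10 m and from 40 m.
PLAYER_HEIGHT_M = 1.9
near_angle = angular_size_deg(PLAYER_HEIGHT_M, 10.0)
far_angle = angular_size_deg(PLAYER_HEIGHT_M, 40.0)

print(f"near player: {near_angle:.2f} degrees")
print(f"far player:  {far_angle:.2f} degrees")
print(f"ratio of angular sizes: {near_angle / far_angle:.2f}")
```

Under these assumptions, quadrupling the distance shrinks the angular size to roughly one quarter, which is the geometric basis of the relative size cue.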

Related to what has been called the relative size cue is the "linear perspective cue." Note the lines on the field. The angular separations of the lines in the lower portion of the picture are clearly greater than the angular separations of the same lines near the upper edge of the display. This continuous change in angular separation also creates a sense of increasing depth within the display.

Although perhaps not as noticeable, it is possible to see more detail in the texture of the ground surface in the lower part of the display than in the upper portion.  Such systematic changes are often called "texture gradients."  They are another – although related – factor that contributes to our sense of depth.

Other factors in creating apparent depth

Another important type of information for perceiving relative depth comes from the way that some portions of the image interrupt other portions. For example, the image of the upper body of the player standing on the "0" at the bottom left-hand corner of the scene interrupts the contours that form the lower leg of the player to his immediate left. This type of "interposition" provides an important cue that the first player should appear closer to the observer than should the second. (Note the difficulty involved in our description. We must find a way to describe the information that serves as the cue. To say simply that the first player appears closer than the second, because he appears in front of the second, is a circular argument and is not helpful; "in front of" and "closer" translate to the same thing.)

The automatic operation of our eye muscles provides at least two additional sources of information concerning which objects should appear as closer or farther from us.  These are factors that may be referred to as "relative accommodation" and "relative (con)vergence."  The first involves the way in which a young, healthy eye adjusts its focus for objects at different optical distances.  (The mechanism for this adjustment will be discussed in a separate article under the category Seeing.)  The second involves the way in which the two eyes move, so that they point simultaneously toward an object of interest.  By swinging toward one another, the two eyes point at closer objects; by swinging away from each other – and more toward a state of being parallel in their lines of sight – the eyes point at more distant objects.  To the extent that the brain registers these oculomotor changes, we should expect that we are more likely to perceive differences in the distances of actual objects in a real scene.
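The geometry behind the vergence cue can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the target is taken to lie straight ahead of the observer, and the separation between the eyes is set to 0.063 m, an assumed typical adult value that does not come from this article.

```python
import math

# Assumed separation between the two eyes, in metres (illustrative only).
INTEROCULAR_M = 0.063

def vergence_angle_deg(distance_m: float,
                       interocular_m: float = INTEROCULAR_M) -> float:
    """Angle, in degrees, between the two lines of sight when both eyes
    point at a target straight ahead at the given distance."""
    return math.degrees(2.0 * math.atan(interocular_m / (2.0 * distance_m)))

# Hypothetical viewing distances, from arm's length to far down the field.
for distance_m in (0.5, 2.0, 10.0, 50.0):
    angle = vergence_angle_deg(distance_m)
    print(f"target at {distance_m:5.1f} m -> vergence angle {angle:.3f} degrees")
```

The angle shrinks toward zero as the target recedes, which corresponds to the lines of sight becoming nearly parallel. It also suggests why vergence changes are likely to be most informative for relatively near objects.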

Shadowing/shading is another factor that can add a sense of solidity to how objects appear.  In the broader context, this implies that shadowing can provide a relative depth cue.  For example, if parts of a player in Figure 1 appear differently illuminated than other parts, we may very well see the player as existing in depth (that is, the player will appear to be a three-dimensional individual, not a flat cardboard cutout).  Such apparent solidity is certainly an aspect of perceiving depth.

One eye versus two

It is also important to consider binocular or stereoscopic information. Because our eyes are set a few inches apart on our face, they will not have identical views of an actual scene (as opposed to a flat photograph). Although an oversimplification, the constant separation of the eyes means that objects in the foreground of an actual scene will generally result in larger differences between the images in the two eyes. These discrepancies in the left-eye and right-eye views provide a strong cue for apparent depth.

The overall process of using the differences between the eyes for perceiving depth is referred to as stereopsis. The depth cue itself has traditionally been called "binocular disparity," although "stereoscopic cue" might be better (for reasons that we may ignore for our present purposes).
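One simplified way to put a number on the difference between the two eyes' views is to compare the angles that two objects subtend at the eyes. The Python sketch below does this for objects lying straight ahead. The eye separation of 0.063 m and all distances are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

# Assumed separation between the two eyes, in metres (illustrative only).
INTEROCULAR_M = 0.063

def subtended_angle_rad(distance_m: float) -> float:
    """Angle, in radians, that the two eyes subtend at a point straight
    ahead at the given distance."""
    return 2.0 * math.atan(INTEROCULAR_M / (2.0 * distance_m))

def disparity_arcmin(near_m: float, far_m: float) -> float:
    """Difference between the angles for a nearer and a farther point,
    expressed in minutes of arc."""
    difference_rad = subtended_angle_rad(near_m) - subtended_angle_rad(far_m)
    return math.degrees(difference_rad) * 60.0

# The same 1 m separation in depth, placed in the foreground and far away.
foreground = disparity_arcmin(2.0, 3.0)
background = disparity_arcmin(20.0, 21.0)
print(f"1 m of depth at 2 m:  {foreground:.2f} arcmin")
print(f"1 m of depth at 20 m: {background:.2f} arcmin")
```

Under these assumptions the same physical separation in depth yields a far larger difference in the foreground than in the distance, in line with the oversimplified statement above.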

As an aside, when necessary, we will refer to factors that are available to the perceiver with even a single eye as "monocular" cues, and to factors that require information from both eyes as "binocular" cues. Although stereopsis (a binocular factor) may sometimes be the strongest influence upon how we perceive depth, the information represented by the assorted monocular cues does not go away just because we have both eyes open. Thus, although stereoscopic information can produce quite dramatic appearances of depth – such as in a child's toy View-Master® or a modern 3-D movie – it is incorrect to equate "depth perception" with "binocular vision." Perception of depth can certainly result from various combinations of the cues described in this article, especially if the scene is spread over a wide screen, such as in an IMAX® theater.

Additional considerations

Although the preceding material offers the reader a summary of the more important cues, it is not a complete listing.  Two things in particular must be added.

FIRST, people do not typically spend their time viewing stationary scenes.  The game captured at the start of play in Figure 1 is merely one moment in time.  Very shortly, the ball will be moved from its starting position and the players on both sides will move forward, backward or to the sides, as their leaders have directed.  Consequently, we can add relative motions as one more source in the generation of information for perceiving depth. 

Consider a situation in which a group of players scattered across the field begin to run in a similar direction.  If we consider the images of those players, some images will move across our field-of-view at different rates than the others.  The cue of "motion parallax" refers to the apparent depth created by the relative directions and rates of motion.  For example, if we follow a particular player with our eyes, the image of that player remains approximately centered in our vision.  He is said to be temporarily our point of fixation.  The images of players who are closer or farther will move differently across our eyes, providing an additional cue that those players should in fact be perceived at different distances.  (The details of this cue deserve their own article and one is contemplated.)
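A minimal Python sketch can illustrate the geometry of this cue. It assumes a stationary observer, players who all run sideways (perpendicular to the line of sight) at the same speed, and a moment at which each player is directly in front of the observer. The speed and distances are hypothetical, and the function names are illustrative rather than taken from any standard source.

```python
import math

def angular_velocity_deg_s(speed_m_s: float, distance_m: float,
                           lateral_offset_m: float = 0.0) -> float:
    """Angular velocity, in degrees per second, of a point moving sideways
    at speed_m_s along a line at perpendicular distance distance_m, at the
    moment its sideways offset from straight ahead is lateral_offset_m."""
    rate_rad_s = speed_m_s * distance_m / (distance_m ** 2 + lateral_offset_m ** 2)
    return math.degrees(rate_rad_s)

def drift_relative_to_fixation(speed_m_s: float, distance_m: float,
                               fixation_distance_m: float) -> float:
    """Image drift, in degrees per second, of a player when the eyes track
    another player running at the same speed at fixation_distance_m."""
    return (angular_velocity_deg_s(speed_m_s, distance_m)
            - angular_velocity_deg_s(speed_m_s, fixation_distance_m))

# Hypothetical scene: everyone runs at 6 m/s; we track the player at 20 m.
RUN_SPEED_M_S = 6.0
FIXATION_M = 20.0
for distance_m in (10.0, 20.0, 40.0):
    drift = drift_relative_to_fixation(RUN_SPEED_M_S, distance_m, FIXATION_M)
    print(f"player at {distance_m:4.1f} m drifts {drift:+.2f} deg/s relative to fixation")
```

In this simplified case, the image of the nearer player drifts in the direction of running and the image of the farther player drifts the opposite way, while the fixated player's image stays put. The sign and size of the drift therefore carry information about relative distance.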

We may also consider the contributions of changes over time in relative image size, in interposition, in the oculomotor adjustments and in stereoscopic differences. Although these various types of change have not been examined to the same degree as motion parallax itself, it seems very likely that they play a role in the depth perception of real-world scenes.

SECOND – and the final point for this article – we should consider whether there are any "built-in" (fancy term:  autochthonous) factors that influence how we will see a scene in depth.  One such factor is Walter Gogel's Equidistance Tendency.  This factor arises neither from external image information nor from oculomotor adjustments.  Instead, it appears to result from the basic operation of our perceptual system.  As such, it serves as a built-in bias, rather than as a cue.  It affects perceived depth, but does not itself contribute actual information about the scene.

The Equidistance Tendency states that we tend to see external objects as if they were at or near the same distance. This tendency exerts a stronger influence when information to the contrary is more limited. Consider the difficulty in backing your car into a parallel parking space. From the driver's position (looking out the rear window), the available room between your own car and the one behind may appear small. However, when you get out and walk around the back of your car, you may be surprised by how much separation there actually is. For many more examples, see the related article The Equidistance Tendency.