Jason S. McCarley (1), Raechel N. Soicher (2), & Jannah R. Moussaoui (1)
(1: Oregon State University, 2: Massachusetts Institute of Technology)
*Note: For the version with figures and additional resources included, please follow this link: https://www.dropbox.com/s/81a6jalwql5uy4y/March_McCarley_et_al.pdf?dl=0
The Gestalt principles of perceptual organization are a staple of undergraduate psychology. Examples like those in Figure 1 are common in Intro Psych, Cognitive, and Sensation & Perception textbooks, and discussion of the psychology behind them is important for a number of reasons. Historically, the Gestalt movement was an enormously influential school of thought (Rock & Palmer, 1990; Wagemans et al., 2012). Practically, the Gestalt principles are useful for designing displays, graphs, and lecture slides (Kosslyn, 2006; Moore & Fitz, 1993; Wickens et al., 2022). And pedagogically, the Gestalt phenomena reveal perceptual processes that students might normally take for granted.
Figure 1. Visual demonstrations of three familiar Gestalt principles.
Discussion of Gestalt grouping typically focuses on visual processes, like those illustrated in the figure. But perceptual organization isn’t exclusive to vision; Gestalt processes are also necessary to organize messy sensory inputs in other senses, including touch (Gallace & Spence, 2011) and hearing (Bregman, 1990). In hearing, specifically, Gestalt processes help turn soundwaves crashing on the eardrum into a mental representation of the sound sources around us. Bregman (1990) used the term auditory scene analysis to describe the perceptual organization of sound, and auditory streams to denote the output of this analysis. An auditory stream is thus the analogue of a visual object or group.
To accompany his book on auditory scene perception, Bregman provided demonstrations of auditory Gestalt effects. These are useful classroom demonstrations in two ways. First, they establish a unifying principle, showing students that perceptual organization operates in similar ways across different senses. Second, they make Gestalt phenomena accessible to students with visual disabilities.
Below, we present three of Bregman’s auditory examples that can be used as classroom demonstrations in discussions of Gestalt grouping. For each one, we describe and illustrate the analogous visual effect, and explain the correspondence between the auditory and visual phenomena. We also provide links to downloadable files, made available by Bregman, that demonstrate the auditory phenomena.
A larger set of examples is available on Dr. Bregman’s website.
Grouping by Similarity
In vision, the Gestalt principle of similarity says that items that look alike tend to group with one another. In theory, for example, we could see the dots in Figure 1A as forming columns, arbitrary clusters, or no pattern at all. Instead, we tend to group the dots by color, into rows.
In his demonstration of auditory grouping by similarity, Bregman manipulates pitch to gradually segregate a series of notes into two streams. Figure 2 provides a schematic illustration. The stimulus is a well-known melody interleaved with random distractor tones. To begin, the melody and distractors are within the same pitch range, and the melody is camouflaged. As it plays repeatedly, the melody gradually moves into a higher pitch range. Eventually, it segregates from the distractor tones and becomes recognizable. Pitch here plays the same role as color does in Figure 1A: sounds of similar pitch are grouped into a distinct auditory stream, standing out from sounds of dissimilar pitch.
Figure 2. Auditory grouping by similarity of pitch. A: When a melody is embedded amongst distractor tones from the same pitch range, it is effectively camouflaged. Here, the notes outlined in black represent the melody and the notes without outlines represent the distractors. B: When the melody and distractors are in different pitch ranges, the melody stands out and is easy to recognize.
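For instructors who would like to experiment with a stimulus of this kind, the following Python sketch shows one way the note sequence could be built. It is not Bregman's procedure or his actual stimulus: the placeholder melody, the distractor range, the size of each upward shift, and every function name are assumptions chosen for illustration. Pitches are expressed in semitones relative to A4 (440 Hz).

```python
import math
import random
import struct
import wave

SAMPLE_RATE = 22050  # samples per second (assumed value)

def semitones_to_hz(semitones_from_a4):
    """Convert a semitone offset from A4 into a frequency in Hz."""
    return 440.0 * 2 ** (semitones_from_a4 / 12)

def tone(freq_hz, dur_s, amp=0.3):
    """Sine tone as a list of floats, with 5 ms linear ramps to soften clicks."""
    n = round(SAMPLE_RATE * dur_s)
    ramp = max(1, round(0.005 * SAMPLE_RATE))
    samples = []
    for i in range(n):
        env = min(1.0, i / ramp, (n - 1 - i) / ramp)
        samples.append(amp * env * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
    return samples

def interleave_with_distractors(melody, shift, distractor_range, rng):
    """Alternate melody notes (raised by `shift` semitones) with random distractors."""
    low, high = distractor_range
    sequence = []
    for note in melody:
        sequence.append(note + shift)            # melody note
        sequence.append(rng.randint(low, high))  # distractor note
    return sequence

def render(sequence, note_s=0.1):
    """Turn a list of semitone offsets into audio samples."""
    samples = []
    for semitones in sequence:
        samples.extend(tone(semitones_to_hz(semitones), note_s))
    return samples

def write_wav(path, samples):
    """Save samples (floats in [-1, 1]) as a 16-bit mono WAV file."""
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(SAMPLE_RATE)
        f.writeframes(b"".join(struct.pack("<h", int(s * 32767)) for s in samples))

# Hypothetical values: a short placeholder melody and a distractor range
# that overlaps it, so the melody starts out camouflaged.
MELODY = [0, 0, 7, 7, 9, 9, 7]
DISTRACTOR_RANGE = (-2, 11)
SHIFTS = (0, 6, 12, 18)  # semitones added to the melody on each pass

rng = random.Random(0)  # fixed seed so the distractors are reproducible
passes = [interleave_with_distractors(MELODY, s, DISTRACTOR_RANGE, rng) for s in SHIFTS]
```

To listen to the result, each pass can be rendered and saved, for example with `write_wav("pass.wav", render(passes[0]))`. On the first pass the melody shares a pitch range with the distractors; by the last pass every melody note lies above the highest possible distractor, which is the condition under which the source describes the melody becoming recognizable.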
Grouping by Proximity
The principle of proximity holds that items near one another are grouped together. In vision, proximity is spatial. In Figure 1B, for instance, the vertical separation between dots is smaller than the horizontal separation, and as a result, we perceive the dots as forming columns.
In hearing, proximity is temporal. Bregman’s demonstration, illustrated in Figure 3, interleaves a series of three descending tones with a series of three ascending tones. The descending tones are in a higher pitch range than the ascending tones. To begin, the tones are played slowly, and we hear a single series of notes jumping back and forth between pitch ranges. Next, the tones are played quickly. Now, temporal proximity and similarity combine to segregate the ascending and descending series into two distinct streams that seem to run simultaneously. Once tones of similar pitch fall near enough to one another in time, they group together.
Figure 3. Auditory grouping by proximity. A: When interleaved high and low tones are played slowly, we hear a single sequence that jumps between pitch ranges. B: When the interleaved tones are played quickly, notes of similar pitch group together. We perceive two simultaneous streams, one high-pitched and one low-pitched.
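A rough version of this slow-versus-fast comparison can be sketched in Python as follows. The specific frequencies, tone durations, and number of repetitions are assumptions made for illustration and are not taken from Bregman's recordings; the function names are likewise hypothetical.

```python
import math

SAMPLE_RATE = 22050  # samples per second (assumed value)

def tone(freq_hz, dur_s, amp=0.3):
    """Sine tone as a list of floats, with 5 ms linear ramps to soften clicks."""
    n = round(SAMPLE_RATE * dur_s)
    ramp = max(1, round(0.005 * SAMPLE_RATE))
    samples = []
    for i in range(n):
        env = min(1.0, i / ramp, (n - 1 - i) / ramp)
        samples.append(amp * env * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
    return samples

def interleave(high_hz, low_hz):
    """Alternate tones from the high series and the low series."""
    sequence = []
    for high, low in zip(high_hz, low_hz):
        sequence.extend([high, low])
    return sequence

def render(freqs_hz, tone_s, repeats=4):
    """Play the cycle `repeats` times, each tone lasting `tone_s` seconds."""
    samples = []
    for _ in range(repeats):
        for freq in freqs_hz:
            samples.extend(tone(freq, tone_s))
    return samples

# Hypothetical frequencies: three descending high tones, three ascending low tones.
HIGH_DESCENDING = [2500.0, 2000.0, 1600.0]
LOW_ASCENDING = [350.0, 430.0, 550.0]

cycle = interleave(HIGH_DESCENDING, LOW_ASCENDING)
slow = render(cycle, tone_s=0.4)  # slow presentation
fast = render(cycle, tone_s=0.1)  # same tones, four times faster
```

The only difference between `slow` and `fast` is the tone duration, so the sketch isolates the temporal manipulation described above. The tempo at which listeners actually report two streams depends on the frequency separation and on the listener, so these durations are starting points to adjust by ear. The samples can be saved with Python's standard `wave` module.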
Grouping by Connectedness
The principle of connectedness (Rock & Palmer, 1990) holds that items connected to one another are grouped together. Figure 1C shows the influence of visual connectedness. Dots alternate color from top to bottom, and are closer together horizontally than vertically. But because they are linked by thin vertical lines, the dots perceptually group to form columns. Here, connectedness overpowers similarity and proximity.
Bregman’s demonstration of grouping by connectedness shows an equally powerful effect. Figure 4 illustrates. In the unconnected condition, a series of tones alternates between high (“beep”) and low (“boop”) pitch. The impression is of two distinct streams, one high-pitched (“Beep. Beep. Beep…”) and one low-pitched (“Boop. Boop. Boop…”). In the connected condition, a smoothly rising and falling tone is interposed between the low- and high-pitched tones. Now, we hear a single stream of sound, smoothly modulating between high and low (“Beeeeooooeeeeoooo…”). Just as in vision, connectedness transforms isolated fragments into a unified perceptual object.
Figure 4. Auditory grouping by connectedness. A: When interleaved high and low tones are unconnected, we hear separate high- and low-pitched streams. B: When high and low tones are connected by a rising and falling tone, we hear a single, undulating auditory stream.
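The two conditions can be approximated with the Python sketch below, which builds a per-sample frequency contour and then synthesizes it with a running phase so that the glides join the steady tones without discontinuities. This is an illustrative assumption about how such a stimulus might be constructed, not a description of Bregman's method; the frequencies, durations, and function names are all hypothetical.

```python
import math

SAMPLE_RATE = 22050  # samples per second (assumed value)

def contour(high_hz, low_hz, steady_s, gap_s, cycles, connected):
    """Per-sample frequency contour; None marks a silent sample.

    Steady tones alternate between high and low. The gap after each tone
    is either silent (unconnected) or filled with a glide toward the next
    tone's frequency (connected).
    """
    n_steady = round(SAMPLE_RATE * steady_s)
    n_gap = round(SAMPLE_RATE * gap_s)
    track = []
    current, upcoming = high_hz, low_hz
    for _ in range(2 * cycles):
        track.extend([current] * n_steady)
        for i in range(n_gap):
            if connected:
                # Glide on a logarithmic frequency scale (equal pitch steps).
                frac = (i + 1) / (n_gap + 1)
                track.append(current * (upcoming / current) ** frac)
            else:
                track.append(None)
        current, upcoming = upcoming, current
    return track

def render(track, amp=0.3):
    """Synthesize a contour, accumulating phase so frequency changes are smooth."""
    phase = 0.0
    samples = []
    for freq in track:
        if freq is None:
            samples.append(0.0)
            continue
        samples.append(amp * math.sin(phase))
        phase += 2 * math.pi * freq / SAMPLE_RATE
    return samples

# Hypothetical values for the high ("beep") and low ("boop") tones.
HIGH_HZ, LOW_HZ = 1000.0, 400.0

unconnected = render(contour(HIGH_HZ, LOW_HZ, 0.1, 0.1, cycles=4, connected=False))
connected = render(contour(HIGH_HZ, LOW_HZ, 0.1, 0.1, cycles=4, connected=True))
```

For brevity the sketch applies no onset or offset ramps, so the unconnected version may contain faint clicks at tone boundaries. Both versions have identical timing and identical steady tones; they differ only in whether the gaps are silent or filled with glides, which mirrors the contrast between the two panels of Figure 4.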
The Gestalt principles are foundational knowledge for psych undergrads. Teaching them with an exclusive focus on vision, though, can limit their accessibility and give students an unduly narrow view of the role they play in our mental life. Auditory demonstrations give us a way to expand the reach and impact of our lessons on perceptual organization.
Bregman, A. S. (1990). Auditory scene analysis: The perceptual organization of sound. Cambridge, MA: MIT Press.
Gallace, A., & Spence, C. (2011). To what extent do Gestalt grouping principles influence tactile perception? Psychological Bulletin, 137(4), 538–561. https://doi.org/10.1037/a0022335
Kosslyn, S. M. (2006). Graph design for the eye and mind. New York, NY: Oxford University Press.
Moore, P., & Fitz, C. (1993). Using Gestalt theory to teach document design and graphics. Technical Communication Quarterly, 2(4), 389–410. https://doi.org/10.1080/10572259309364549
Rock, I., & Palmer, S. (1990). The legacy of Gestalt psychology. Scientific American, 263(6), 84–90. https://doi.org/10.1038/scientificamerican1290-84
Wagemans, J., Elder, J. H., Kubovy, M., Palmer, S. E., Peterson, M. A., Singh, M., & von der Heydt, R. (2012). A century of Gestalt psychology in visual perception: I. Perceptual grouping and figure–ground organization. Psychological Bulletin, 138(6), 1172–1217. https://doi.org/10.1037/a0029333
Wickens, C. D., Helton, W. S., Hollands, J. G., & Banbury, S. (2022). Engineering psychology and human performance (5th ed.). New York, NY: Routledge.