Visual Working Memory: Capacity, Resolution, and Expertise

[First published on Nathen’s Miraculous Escape, July 20, 2009.]


Nathen B. Lester

For PSY 435, Dr. Awh

University of Oregon

June 9, 2008

Ideas in psychology change and develop much as they do in a conversation, one that takes place over years, primarily in the form of detailed written accounts of hypotheses, experiments, and results. It may be slow and technical, but it has the same form: Assertions are made, evidence is presented, mistakes are pointed out, and new assertions are made. Incrementally, knowledge grows.

One such conversation that is ongoing in the scientific literature is about the capacity of visual working memory and how it may be affected by the perceptual expertise of the viewer.

Because working memory capacity is correlated with scholastic ability, attentional control, and scores on intelligence tests (Cowan, Elliott, Saults, Morey, Mattox, Hismjatullina, & Conway, 2005), it is a topic of considerable interest among researchers in cognitive psychology. The following sections describe three articles that exemplify an exchange between researchers on this topic, one that has produced new knowledge and raised new questions.

The Capacity of Visual Working Memory for Features and Conjunctions

In their 1997 article, Luck and Vogel argued for a visual working memory capacity of approximately four objects, regardless of how complex those objects are. They found mounting evidence for this in a series of experiments using Phillips’ (1974) change-detection paradigm, in which subjects are shown an initial array of objects, a pause, and then a test array, and must indicate whether the two arrays are identical or different. They conducted several experiments using arrays of visual objects which varied in their number and types of features, including a single color only, two colors for each object, object orientation only, color and orientation together, and finally objects which varied in their color, orientation, size, and the presence or absence of a gap. In all conditions of all experiments, subjects could accurately detect changes in about four of the objects in an array. This is strong evidence for an object-based working memory capacity and against a feature-based working memory capacity: Four objects were consistently remembered, whether those objects’ combined features totaled four, eight, or even 16.

Several other variations were used to rule out possible alternate explanations. To test for the possibility that verbal working memory was aiding subjects in the task, subjects were given a verbal load of two digits to remember, with no effect on their performance. To rule out the possibility that capacity estimates were being limited by the very brief presentation of the initial array, that presentation was increased from 100 ms to 500 ms, with no change in the results. To address the fact that more decisions had to be made when viewing larger test arrays, which could lead to more errors, another condition had subjects indicate whether one randomly chosen object had changed. This also did not affect the results. Finally, the condition in which each object had two colors was run to test the possibility that there are separate working memory systems for each kind of feature. Subjects in this condition could remember about four objects whether that meant remembering four colors or eight colors, evidence against a feature-based working memory system for color distinct from the system that remembers the other types of features.
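To make concrete how change-detection accuracy becomes a statement like "about four objects," here is a minimal sketch of two capacity formulas widely used in this literature. They are usually attributed to Cowan (for tests that probe a single item) and to Pashler (for tests that show the whole array). Neither formula is spelled out in the articles summarized here, so treat their use as an assumption of this illustration. The hit and false-alarm rates are hypothetical numbers chosen for the example, not data from Luck and Vogel (1997).

```python
def cowan_k(set_size, hit_rate, false_alarm_rate):
    """Single-probe estimate: K = N * (H - FA)."""
    return set_size * (hit_rate - false_alarm_rate)


def pashler_k(set_size, hit_rate, false_alarm_rate):
    """Whole-display estimate: K = N * (H - FA) / (1 - FA)."""
    if false_alarm_rate >= 1.0:
        raise ValueError("false_alarm_rate must be below 1.0")
    return set_size * (hit_rate - false_alarm_rate) / (1.0 - false_alarm_rate)


# Hypothetical (hit rate, false-alarm rate) pairs per set size.
# These are illustrative values only, not results from any study.
hypothetical_rates = {
    4: (0.95, 0.05),
    8: (0.55, 0.10),
    12: (0.40, 0.10),
}

for set_size, (hits, false_alarms) in hypothetical_rates.items():
    k_single = cowan_k(set_size, hits, false_alarms)
    k_whole = pashler_k(set_size, hits, false_alarms)
    print(f"set size {set_size:2d}: Cowan K = {k_single:.1f}, "
          f"Pashler K = {k_whole:.1f}")
```

With these made-up rates, accuracy falls as the array grows while the estimated number of stored objects stays roughly constant, which is the pattern an object-based capacity limit predicts.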

A Visual Short-Term Memory Advantage for Faces

Curby and Gauthier’s 2007 article was an attempt to show the effects of holistic processing on visual working memory capacity, using a variation of the change-detection paradigm. Because of the greater efficiency of holistic processing, they reasoned, objects like upright faces, with which subjects have a lot of expertise, will be stored more efficiently in working memory. They hypothesized that this should result in a larger working memory capacity for faces than for other complex objects. Their experiments resulted in three basic findings: (a) At 500 ms of encoding time, subjects were less accurate in detecting changes between faces than they were in detecting changes between cars or watches. (b) At 2500 ms encoding time, subjects’ accuracy was equivalent for all categories. (c) At 4000 ms encoding time, subjects were more accurate with the faces than they were with the other categories of objects.

Curby and Gauthier’s (2007) explanation of these results was that for complex objects, perceptual encoding processes cause a bottleneck for creating representations in visual working memory. At the shortest encoding time, this results in fewer objects in memory from an array of faces than from arrays of less complex objects such as cars or watches. At the 2500 ms encoding time, the benefits of efficient, holistic processing brought the number of faces encoded up to the number of other objects. By 4000 ms of encoding time, those benefits allowed more faces to be stored in visual working memory than any other kind of object tested. In other words, given enough time, the benefits of the more holistic processing of faces outweigh the disadvantage of their greater complexity. Finally, based on this, they reasoned that the limits on the storage of complex objects in working memory hypothesized by Alvarez and Cavanagh (2004) are ameliorated to some degree by this efficient processing of faces.

Perceptual Expertise Enhances the Resolution but Not the Number of Representations in Working Memory

Scolari, Vogel, and Awh’s 2008 article was largely a correction and clarification of the meaning of Curby and Gauthier’s (2007) results: The benefit of expertise is not in the number but in the resolution of objects in working memory. The difference in subjects’ ability to detect changes in a face out of an array of faces compared to changes in, for example, a car out of an array of cars, was actually a measurement of comparison errors made between the memories formed and the test display, not the number of objects held in working memory. That is, when Curby and Gauthier (2007) thought they were measuring the quantity of objects in working memory, they were actually measuring their quality.

Scolari et al. (2008) managed to show this in one experiment using four categories of objects: faces, inverted faces, shaded cubes, and colored ovals. Within-category changes between the initial display and the test display replicated Curby and Gauthier’s (2007) results. In other words, subjects were more likely to detect a change from one upright face to another, as compared to a change from one cube to another or one inverted face to another, possibly reflecting the benefits of holistic processing for faces. Changes across categories, however, which eliminated the possibility of comparison errors, replicated Luck and Vogel’s (1997) results. In other words, when the changes were big, a face changing into a cube, for example, working memory capacity estimates were at about four objects, regardless of the complexity of those objects.
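The logic of this argument can be sketched with a toy model. It is not the analysis Scolari et al. (2008) reported, and every parameter name and value below is a hypothetical chosen for illustration. The model assumes a subject stores a fixed number of objects, notices a change to a stored object only with some probability (standing in for memory resolution), and guesses when the probed object was not stored. Capacity is then estimated with the single-probe formula K = N * (H - FA), which is itself an assumption here.

```python
def apparent_capacity(true_capacity, set_size, p_detect_if_stored,
                      guess_rate=0.3):
    """Capacity estimate K = N * (H - FA) under a simple toy model.

    true_capacity:      number of objects actually stored (hypothetical)
    p_detect_if_stored: chance that a change to a stored object is noticed;
                        values below 1.0 represent comparison errors
    guess_rate:         chance of answering "changed" when the probed
                        object was not stored
    """
    p_stored = min(true_capacity, set_size) / set_size
    # Change trials: detect via memory, or guess if the item was not stored.
    hit_rate = p_stored * p_detect_if_stored + (1 - p_stored) * guess_rate
    # No-change trials: "changed" responses come only from guessing.
    false_alarm_rate = (1 - p_stored) * guess_rate
    return set_size * (hit_rate - false_alarm_rate)


# Large, cross-category changes: comparison errors are assumed absent.
cross_category = apparent_capacity(4, 8, p_detect_if_stored=1.0)
# Small, within-category changes: some stored changes go unnoticed.
within_category = apparent_capacity(4, 8, p_detect_if_stored=0.6)

print(f"cross-category estimate:  {cross_category:.1f}")
print(f"within-category estimate: {within_category:.1f}")
```

In this sketch the number of stored objects is identical in both conditions. Only the probability of a successful comparison differs, yet the within-category estimate comes out lower, which is how a difference in resolution can look like a difference in capacity.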

Scolari et al. (2008) also found that individual differences in subjects’ performance in the cross-category change-detection tasks were correlated with their performance in a simple color-change detection task. This suggests that performance on these tasks produces a purer estimate of working memory capacity than does performance on within-category tasks. Additionally, the cross-category individual differences were not correlated with the within-category individual differences, indicating that being able to hold a certain number of objects in working memory and being able to store details about those objects are distinct abilities, drawing on different resources.

Two Additional Voices

Two other articles are worth briefly describing to flesh out this conversation. The first of these is Alvarez and Cavanagh’s (2004) “The capacity of visual short-term memory is set both by visual information load and by number of objects.” This article was in part a reply to Luck and Vogel (1997) and set the stage for Curby and Gauthier’s error: Alvarez and Cavanagh measured what they thought was the complexity of the objects they used in their change-detection tasks and found that visual working memory capacity was limited by the complexity, and not just the number, of the objects therein.

The problem was that their operational definition of “complexity,” which was based on visual search rate, was confounded with similarity. That is, Alvarez and Cavanagh (2004) judged categories of objects more complex because they were more difficult to tell apart. This was pointed out by Awh, Barton, and Vogel (2007) in “Visual working memory represents a fixed number of items regardless of complexity.” Setting the stage for Scolari et al. (2008), Awh et al. (2007) showed that it was actually comparison errors caused by object similarity, and not object complexity, that were producing the lower estimates for visual working memory capacity.


Useful conversations rely on careful logic, clearly defined terms, and up-to-date information. In one way, the exchange here can be seen as corrective of errors in these areas.

Luck and Vogel (1997) used sound and thorough reasoning in combination with a series of very straightforward experiments to present evidence for an object-based visual working memory capacity of about four items, and, simultaneously, evidence against the idea that visual working memory is limited by the number of features each object has. Alvarez and Cavanagh (2004) presented what they thought was evidence for the complexity of objects being an additional limit to capacity, but Awh et al. (2007) showed that apparent limit to be an artifact of the difficulty of distinguishing highly similar items from each other. Almost certainly unaware of Awh and colleagues’ work, Curby and Gauthier (2007) set out to refine the work of Alvarez and Cavanagh (2004), stating that the complexity of objects may limit capacity, but expertise can overcome that limit to some degree. Then Scolari et al. (2008) pointed out that Curby and Gauthier (2007), relying on Alvarez and Cavanagh’s reasoning, had made the same error: Their results did not mean what they had thought they meant. When comparison errors are eliminated, it is clear that the number of objects stored in working memory is not affected by those objects’ complexity.

While in some ways side-tracks based on faulty reasoning, the works of Alvarez and Cavanagh (2004) and Curby and Gauthier (2007) have also been useful in extending our knowledge. What we mean by the word “capacity,” for example, has been refined. “Capacity” has been used in a variety of ways, especially in Curby and Gauthier (2007), who used it to mean variously the total number of “slots” available in working memory, the number of slots that happened to have been filled in a certain experiment, an amount of total information, or a kind of rate-based encoding capacity, in the vein of “objects encoded per second.” It is now clear to those with up-to-date information that when discussing visual working memory, “capacity” should refer to the number of slots available for visual objects.

It should also be clear that we currently have a model for visual working memory that has at least two factors: capacity for storing objects and the resolution of those objects. Further, the information-load bottleneck may be a real phenomenon, but it is not about the time it takes to store complex items; it appears to be about the time it takes to build representations of sufficient resolution to avoid comparison errors. Finally, expertise, which may result in more holistic processing of visual stimuli, does seem to increase subjects’ ability to encode high-resolution memories, even if it does not increase the number of objects which can be stored.

Many good questions have been raised as well. Since search-rate-based measurements have been ruled out, what are good operational definitions of “complexity” and “information load” in regard to visual objects? Is there a kind of “resolution capacity,” and how would it relate to expertise, given a good operational definition of complexity? How does this relate to the “consolidation time” estimates between 50 ms and 500 ms in Curby and Gauthier (2007, p. 627)? What is the role of expertise in the resolution of memories of objects other than faces? The most interesting question follows from the correlation among working memory capacity, intelligence scores, and academic achievement (Cowan et al., 2005): Since individual differences in the resolution factor are not correlated with individual differences in the capacity factor (Scolari et al., 2008), what is the relationship between the resolution of visual working memory and intelligence?


Alvarez, G. A., & Cavanagh, P. (2004). The capacity of visual short-term memory is set both by visual information load and by number of objects. Psychological Science, 15, 106-111.

Awh, E., Barton, B., & Vogel, E. K. (2007). Visual working memory represents a fixed number of items regardless of complexity. Psychological Science, 18, 622-628.

Cowan, N., Elliott, E. M., Saults, J. S., Morey, C. C., Mattox, S., Hismjatullina, A., & Conway, A. R. A. (2005). On the capacity of attention: Its estimation and its role in working memory and cognitive aptitudes. Cognitive Psychology, 51, 42-100.

Curby, K. M., & Gauthier, I. (2007). A visual short-term memory advantage for faces. Psychonomic Bulletin & Review, 14, 620-628.

Luck, S. J., & Vogel, E. K. (1997). The capacity of visual working memory for features and conjunctions. Nature, 390, 279-281.

Phillips, W. A. (1974). On the distinction between sensory storage and short-term visual memory. Perception & Psychophysics, 16, 283-290.

Scolari, M., Vogel, E. K., & Awh, E. (2008). Perceptual expertise enhances the resolution but not the number of representations in working memory. Psychonomic Bulletin & Review, 15, 215-222.