Retrieval practice, as its name suggests, is a simple strategy that involves retrieving the target information one or more times prior to testing. It is not the same as repetition or rehearsal! The idea is not to simply repeat the correct information, but to try and retrieve it. Feedback as to the correct answer may or may not follow.
The keyword mnemonic is the most studied mnemonic strategy, and it has proven effective for learning vocabulary, most particularly when measured against rote repetition or “use your own methods”, but also when compared with the popular context method (in which students experience the word to be learned in several different meaningful contexts; they may or may not have to guess the meaning from the context). It has also been used effectively to learn artists’ styles, taxonomic information, attribute information, and the main points in text passages.
Results from using the keyword method have been quite dramatic. For example, in a classic study from the researchers that developed this strategy (Atkinson & Raugh 1975), over a third of the 120 words were remembered more than 80% of the time in the keyword condition, compared to only one item in the control condition (glaz for eye — a mnemonic link so obvious I am sure most of the control participants used it). Moreover, only seven words were remembered less than half the time in the keyword condition, compared to 70 in the control (“use your own method”) condition! Overall, the keyword group recalled 72% of the words when they were tested on the day following the three study days (40 words were studied each day), compared to 46% by the control group. When they were (without warning) tested again six weeks later, the keyword group remembered 43% compared to the control group’s 28%.
As you see, the benefits of the method are quite clear.
That makes it all the more impressive that, in a study that compared the two, retrieval practice resulted in the same, and in some cases better, performance than the keyword method.
In this 2007 study [1], two lab experiments involving university students compared the learning of German words using either the keyword mnemonic, retrieval practice, or rote repetition. They found no difference in performance between the two experimental groups, both of which performed significantly better than the rote repetition group. This was followed by an experiment involving 56 secondary school pupils, comparing German words learned in four different ways (that is, all the pupils were given the same instruction; groups of words were presented in different ways).
In the first section of the instruction booklet, each English word with its German translation was presented with an elaborating sentence (for example, “The German for SHARP is SCHARF, scharf also means hot (as in spicy).”; “The German for LIGHTHOUSE is LEUCHTTURM, Leuchtturm consists of the two words for shine and tower.”) — this was the elaboration strategy. In the next section (retrieval practice), the English and German words were read out when first presented, and on the following pages the students were required to retrieve the German word on seeing the English word. There were filler pages in between each retrieval attempt on the expanding schedule of 1-3-5-7 (that is, one intervening filler item before the first attempt, three items before the second attempt, and so on). In the third, keyword, section, the English and German words were presented with a description of a suggested image (e.g., “The German for SHARP is SCHARF. Imagine cutting a German flag with SHARP scissors.” “The German for LIGHTHOUSE is LEUCHTTURM. Imagine people LOITERING near a lighthouse.”). In the last section, a strategy combining both the keyword and retrieval practice was employed.
The time allowed for each page was controlled, and was only a few seconds.
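The expanding 1-3-5-7 schedule described above can be sketched in a few lines of Python. This is only an illustration of the spacing arithmetic, not the study's actual materials: the function names, the page labels, and the filler words are all hypothetical.

```python
from typing import List, Sequence


def retrieval_positions(gaps: Sequence[int] = (1, 3, 5, 7)) -> List[int]:
    """Return the page index of each retrieval attempt.

    Page 0 is the initial presentation of the word pair. Each gap is the
    number of filler pages that come before the next retrieval attempt.
    """
    positions = []
    page = 0
    for gap in gaps:
        page += gap + 1  # skip `gap` filler pages, then land on the attempt
        positions.append(page)
    return positions


def build_pages(target: str, fillers: Sequence[str],
                gaps: Sequence[int] = (1, 3, 5, 7)) -> List[str]:
    """Lay out study, filler, and retrieval pages for one target word."""
    if len(fillers) < sum(gaps):
        raise ValueError("not enough filler items for this schedule")
    pages = [f"STUDY {target}"]
    remaining = iter(fillers)
    for gap in gaps:
        for _ in range(gap):
            pages.append(f"FILLER {next(remaining)}")
        pages.append(f"RETRIEVE {target}")
    return pages


# Hypothetical usage: 16 filler items are needed for gaps of 1, 3, 5 and 7.
print(retrieval_positions())  # [2, 6, 12, 20]
pages = build_pages("SHARP", [f"word{i}" for i in range(16)])
print(len(pages))  # 21 pages: 1 study + 16 fillers + 4 retrieval attempts
```

With these gaps, the four retrieval attempts fall on pages 2, 6, 12 and 20, so each attempt comes after a longer stretch of intervening material than the one before.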
There were two tests: recalling the English meaning on seeing the German words, and giving the German words when presented with the English meaning. The tests were given twice — immediately, and one week later. For the easier task (giving the English in response to the German), words learned using the elaboration strategy were significantly more poorly remembered, and results from the other three strategies were not significantly different in the immediate test, but after a week, the words learned by the combined method were significantly better remembered than those learned by the others. Words learned by the retrieval practice strategy were slightly, but not quite significantly, better remembered than those learned by the keyword method.
For the harder task (remembering the German), the difference between retrieval practice and keyword mnemonic reached statistical significance.
The big advantage of retrieval practice is of course that it is a very simple, easily learned technique. It also requires much less cognitive effort than the keyword mnemonic, which puts off many people because of the difficulty of finding good keywords, and the effort (which is greater for some than for others) of creating images.
There are two aspects of the retrieval practice strategy, as it was used here, that should be noted. One is the basic principle that retrieval is always better than rehearsal, because retrieval is the task you should be practicing for, and because rehearsal gives you no feedback as to how well you have learned, and retrieval does. That is why testing is so valuable — more valuable as a learning tool than as an assessment tool. Testing teaches; even pretesting (before the student even knows the information to be learned) improves learning. (Two studies on this are reported in a Scientific American article at https://www.scientificamerican.com/article.cfm?id=getting-it-wrong )
The second aspect is that the retrieval occurred on a distributed schedule.
I have talked before about the importance of spacing your learning (rehearsal; practice). So now I’ll just add one thing, from a recent (2009) study [2].
Interleaving practice is a related strategy that has (mostly in the area of motor skills, but of wider applicability) been shown to improve learning. With interleaved practice, a lesson is followed by practice problems relating to many earlier lessons, ordered so that no consecutive problems are of the same type. As is readily apparent, interleaving naturally involves distributed practice, so it’s not clear whether interleaving is on its own, separate from the effects of distribution, of benefit. This new study managed to disentangle interleaving from spacing, and found that, even when spacing was held constant, interleaving more than doubled test scores (77% vs 38%).
However, and this is perhaps the really interesting part, it did so despite impairing performance during practice. That is, not unexpectedly, performance was poorer during the learning period when practice was interleaved.
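For anyone wanting to build an interleaved problem set, the ordering rule mentioned above (no two consecutive problems of the same type) can be sketched as follows. This is a minimal illustration under my own assumptions, not the procedure used in the study: the function name, the (type, label) representation of a problem, and the greedy "most remaining first" approach are all hypothetical choices.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Problem = Tuple[str, str]  # (problem_type, label), a hypothetical format


def interleave(problems: List[Problem]) -> List[Problem]:
    """Order problems so that no two neighbours share a problem type.

    Greedy approach: always take a problem from the type with the most
    problems remaining, excluding the type that was used on the last step.
    """
    by_type: Dict[str, List[Problem]] = {}
    for problem in problems:
        by_type.setdefault(problem[0], []).append(problem)

    # heapq is a min-heap, so counts are negated to pop the largest first.
    heap = [(-len(items), ptype) for ptype, items in by_type.items()]
    heapq.heapify(heap)

    ordered: List[Problem] = []
    held: Optional[Tuple[int, str]] = None  # type just used, kept out one turn
    while heap:
        neg_count, ptype = heapq.heappop(heap)
        ordered.append(by_type[ptype].pop())
        if held is not None:
            heapq.heappush(heap, held)
        held = (neg_count + 1, ptype) if neg_count + 1 < 0 else None

    if held is not None:
        # One type has too many problems to be separated by the others.
        raise ValueError("cannot avoid consecutive problems of the same type")
    return ordered


# Hypothetical usage with three problem types.
practice_set = [("fractions", "f1"), ("fractions", "f2"), ("fractions", "f3"),
                ("ratios", "r1"), ("ratios", "r2"),
                ("percent", "p1"), ("percent", "p2")]
for ptype, label in interleave(practice_set):
    print(ptype, label)
```

If one type dominates the set (say, three problems of one type and only one of another), no such ordering exists, and the sketch raises an error rather than silently repeating a type.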
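And here we bring in a concept that is also of relevance in discussing the value of testing for learning: the idea of desirable difficulty (a term devised by Robert Bjork and colleagues). The idea is that conditions that make practice harder, and performance during learning poorer, can nevertheless produce better long-term learning.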
In these days of trying not to damage students’ self-esteem by having them experience failure, it is well to remember this concept.
(I have summarized this material in a 7-minute video.)

