
Desirable difficulty for effective learning

When we are presented with new information, we try to connect it to information we already hold. This is automatic. Sometimes the new information fits in easily; other times the fit is more difficult — perhaps because some of our old information is wrong, or perhaps because we lack some of the knowledge needed to fit them together.

When we're confronted by contradictory information, our first reaction is usually surprise. But if the surprise continues, with the contradictions perhaps increasing, or at any rate becoming no closer to being resolved, then our emotional reaction turns to confusion.

Confusion is very common in the learning process, despite most educators thinking that effective teaching is all about minimizing, if not eliminating, confusion.

But recent research has suggested that confusion is not necessarily a bad thing. Indeed, in some circumstances, it may be desirable.

I see this as an example of the broader notion of ‘desirable difficulty’, which is the subject of my current post. But let’s look first at this recent study on confusion for learning.

In the study, students engaged in ‘trialogues’ involving themselves and two animated agents. The trialogues discussed possible flaws in a scientific study, and the animated agents took the roles of a tutor and a student peer. To get the student thinking about what makes a good scientific study, the agents disagreed with each other on certain points, and the student had to decide who was right. On some occasions, the agents made incorrect or contradictory statements about the study.

In the first experiment, involving 64 students, there were four opportunities for contradictions during the discussion of each research study. Because the overall levels of student confusion were quite low, a second experiment, involving 76 students, used a delayed manipulation, where the animated agents initially agreed with each other but eventually started to express divergent views. In this condition, students were sometimes then given a text to read to help them resolve their confusion. It was thought that, given their confusion, students would read the text with particular attention, and so improve their learning.

In both experiments, students did significantly better on the final test on those trials where the contradiction between the two agents had genuinely confused them.

A side-note: students’ self-reports of confusion were not very sensitive; their responses to forced-choice questions following the contradictions were a better index of confusion. This is a reminder that students are not necessarily good judges of their own confusion!

The idea behind all this is that, when there’s a mismatch between new information and prior knowledge, we have to explore the contradictions more deeply — make an effort to explain the contradictions. Such deeper processing should result in more durable and accessible memory codes.

Such a mismatch can occur in many, quite diverse contexts — not simply in the study situation. For example, unexpected feedback, anomalous events, obstacles to goals, or interruptions of familiar action sequences, all create some sort of mismatch between incoming information and prior knowledge.

However, not all instances of confusion are useful for learning and memory. The confusion needs to be relevant to the activity, and of course the individual needs to have the means to resolve it.

As I said, I see a relationship between this idea of the right level and type of confusion enhancing learning, and the idea of desirable difficulty. I’ve talked before about the ‘desirable difficulty’ effect (see, for example, Using 'hard to read' fonts may help you remember more). Both of these ideas, of course, connect to a much older and more fundamental idea: that of levels of processing. The idea that we can process information at varying levels, and that deeper levels of processing improve memory and learning, dates back to a paper written in 1972 by Craik and Lockhart (although it has been developed and modified over the years), and underpins (usually implicitly) much educational thinking.

But it’s not so much this fundamental notion (that deeper processing helps memory and learning, and that certain desirable difficulties encourage deeper processing) that interests me, as the idea of getting the level right.

Too much confusion is usually counterproductive; so is too much difficulty.

Getting the difficulty level right is something I have talked about in connection with flow. On the face of it, confusion would seem to be counterproductive for achieving flow, and yet ... it rather depends on the level of confusion, don't you think? If the student has clear paths to follow to resolve the confusion, the information flow doesn't need to stop.

This idea also, perhaps, has connections to effective practice principles — specifically, what I call the ‘Just-in-time rule’. This is the principle that the optimal spacing for your retrieval practice depends on you retrieving the information just before you would have forgotten it. (That’s not as occult as it sounds! But I’m not here to discuss that today.)

It seems to me that another way of thinking about this is that you want to find that moment when retrieval of that information is at the ‘right’ level of difficulty — neither too easy, nor too hard.
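To make the idea concrete, here is a minimal sketch of an expanding-interval review scheduler in Python (my illustration, not a tool from any of the studies discussed; the multipliers and starting interval are assumptions chosen only to show the shape of the idea). A successful retrieval pushes the next review further out, aiming to catch the item when retrieval is at that ‘right’ level of difficulty; a failure pulls it back in:

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

@dataclass
class Item:
    prompt: str
    answer: str
    interval_days: float = 1.0                    # gap before the next review
    due: date = field(default_factory=date.today)

def review(item: Item, recalled: bool) -> Item:
    """Expanding-interval scheduling: lengthen the gap after a successful
    retrieval, shorten it after a failure. The 2.5 and 0.5 multipliers
    are illustrative assumptions, not empirically derived values."""
    if recalled:
        item.interval_days *= 2.5                 # was retrievable: make it harder
    else:
        item.interval_days = max(1.0, item.interval_days * 0.5)  # too hard: ease off
    item.due = date.today() + timedelta(days=round(item.interval_days))
    return item
```

The point of the multiplicative update is exactly the just-in-time intuition: each successful retrieval is evidence that the memory will survive a longer gap, so the next test is scheduled near the edge of what you can still manage.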

Successful teaching is about shaping the information flow so that the student experiences it — moment by moment — at the right level of difficulty. This is, of course, impossible in a factory-model classroom, but the mechanics of tailoring the information flow to the individual are now made possible by technology.

But technology isn't the answer on its own. To achieve optimal results, it helps if the individual student is aware that the success of their learning depends on managing the information flow (or will at least be more effective for it; some students will succeed regardless of the inadequacy of the instruction). Which means they need to provide honest feedback; they need to be able to monitor their learning and recognize when they have ‘got’ something and when they haven’t; and they need to understand that if one approach to a subject isn’t working for them, they need to try a different one.

Perhaps this provides a different perspective for some of you. I'd love to hear of any thoughts or experiences teachers and students have had that bear on these issues.

References

D’Mello, S., Lehman, B., Pekrun, R., & Graesser, A. (submitted). Confusion can be beneficial for learning. Learning and Instruction.

Practice counts! So does talent

The thing to remember about Ericsson’s famous expertise research, showing us the vital importance of deliberate practice in making an expert, is that it was challenging the long-dominant view that natural-born talent is all-important. But Gladwell’s popularizing of Ericsson’s “10,000 hours” overstates the case, and of course people are only too keen to believe that any height is achievable if you just work hard enough.

The much more believable story is that, yes, practice is vital — a great deal of the right sort of practice — but we can’t dismiss “natural” abilities entirely.

Last year I reported on an experiment in which 57 pianists with a wide range of deliberate practice (from 260 to more than 31,000 hours) were compared on their ability to sight-read. Number of hours of practice did indeed predict much of the difference in performance (nearly half) — but not all. Working memory capacity also had a statistically significant impact on performance, although this impact was much smaller (accounting for only about 7% of the performance difference). Nevertheless, there’s a clear consequence: given two players who have put in the same amount of effective practice, the one with the higher WMC is likely to do better. Why should WMC affect sight-reading? Perhaps by affecting how many notes a player can look ahead as she plays — this is a factor known to affect sight-reading performance.
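Since “percent of variance” figures like these can feel abstract, here’s a tiny illustrative simulation in Python (the weights echo the proportions reported above, but the data are invented; this is not the study’s data or analysis). With two uncorrelated standardized predictors, the squared correlation of each with the outcome approximates its share of the variance:

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n = 10_000  # large n so the sample shares come out close to the target values

practice = rng.standard_normal(n)   # standardized hours of practice (invented)
wmc = rng.standard_normal(n)        # standardized working memory capacity (invented)

# Weights chosen so practice explains ~45% of variance and WMC ~7%,
# with the remainder left to noise -- assumptions for illustration only.
score = (np.sqrt(0.45) * practice + np.sqrt(0.07) * wmc
         + np.sqrt(0.48) * rng.standard_normal(n))

for name, x in [("practice", practice), ("WMC", wmc)]:
    r = np.corrcoef(x, score)[0, 1]
    print(f"{name}: explains ~{r**2:.0%} of score variance")
```

Run it and the printed shares come out near 45% and 7%, which makes the paper’s point tangible: both predictors matter, but one matters much more, and (because they’re independent here) the small one still separates two players with identical practice histories.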

Interestingly, the effect of working memory capacity was quite independent of practice, and hours of practice apparently had no effect on WMC, although it’s possible (the study was too small to tell) that a lot of practice at an early age might affect WMC. After all, music training has been shown to increase IQ in children.

So, while practice is certainly the most important factor in developing expertise, other factors, some of them less amenable to training, have a role to play too.

But do general abilities such as WMC or intelligence matter once you’ve put in the requisite hours of good practice? It may be that ability becomes less important once you achieve expertise in a domain.

The question of whether WMC interacts with domain knowledge in this way has been studied by Hambrick and his colleagues in a number of experiments. One study used a memory task in which participants listened to fictitious radio broadcasts of baseball games and tried to remember major events and information about the players. Baseball knowledge had a very strong effect on performance, and WMC had a much smaller effect, but there was no interaction between the two. Similarly, in two poker tasks (one in which players assessed the likelihood of drawing a winning card, the other in which they remembered hands during a game of poker), both poker knowledge and WMC affected performance, but again there was no interaction between domain knowledge and WMC.

Another study took a different tack. Participants were asked to remember the movements of spaceships flying from planet to planet in the solar system. What they didn’t know was that the spaceships flew in a pattern that matched the way baseball players run around a baseball diamond. They were then given the same task, this time with baseball players running around a diamond. Baseball knowledge only helped performance in the task in which the baseball scenario was explicit — activating baseball knowledge. But activation of domain knowledge had no effect on the influence of WMC.

Although these various studies fail to show an interaction between domain knowledge and WMC, this doesn’t mean that domain knowledge never interacts with basic abilities. The same researchers recently found such an interaction in a geological bedrock mapping task, in which the geological structure of a mountainous area had to be inferred. Visuospatial ability predicted performance only at low levels of geological knowledge; geological experts were not affected by their visuospatial abilities. Unfortunately, that study is not yet published, so I don’t know the details. But I assume they mean visuospatial working memory capacity.

It’s possible that general intelligence or WMC are most important during the first stages of skill acquisition (when attention and working memory capacity are so critical), and become far less important once the skill has been mastered.

Similarly, Ericsson has argued that deliberate practice allows performers to circumvent limits on working memory capacity. This is, indeed, related to the point I often make about how to functionally increase your working memory capacity — if you have a great amount of well-organized and readily accessible knowledge on a particular topic, you can effectively expand how much your working memory can hold by keeping a much larger amount of information ‘on standby’ in what has been termed long-term working memory.

Proponents of deliberate practice don’t deny that ‘natural’ abilities have some role, but they restrict it to motivation and general activity levels (plus physical attributes such as height, where relevant). But surely these would only affect the number of hours. Clearly the ability to keep yourself on task, to motivate and discipline yourself, impinges on your ability to keep your practice up. And the general theory makes sense — that if you show some interest in something, such as music or chess, when you’re young, your parents or teachers usually encourage you in that direction; this encouragement and these rewards lead you to spend more time and energy in that domain, and if you have enough persistence, enough dedication, then lo and behold, you’ll get better and better. And your parents will say, well, it was obvious from an early age that she was talented that way.

But is it really the case that attributes such as intelligence make no difference? Is it really as simple as “10,000 hours of deliberate practice = expert”? Is it really the case that each hour has the same effect on any one of us?

A survey of 104 chess masters found that, while all the players that became chess masters had practiced at least 3,000 hours, the amount of practice it took to achieve that mastery varied considerably. Although, consistent with the “10,000 hour rule”, average time to achieve mastery was around 11,000 hours, time ranged from 3,016 hours to 23,608 hours. The difference is even more extreme if you only consider individual practice (previous research has pointed to individual practice being of more importance than group practice): a range from 728 hours to 16,120 hours! And some people practiced more than 20,000 hours and still didn't achieve master level.

Moreover, a comparison of titled masters and untitled international players found that the two groups had practiced the same number of hours in the first three years of their serious dedication to chess, and yet there were significant differences in their ratings. Is this because of some subtle difference in the practice, making it less effective? Or is it that some people benefit more from practice?

A comparison of various degrees of expertise in terms of starting age is instructive. While the average age of starting to play seriously was around 18 for players without an international rating, it was around 14 for players with an international rating, and around 11 for masters. But the amount of variability within each group varies considerably. For players without an international rating, the age range within one standard deviation of the mean is over 11 years, but for those with an international rating, FIDE masters, and international masters, the range is only 2-3 years, and for grand masters, the range is less than a year. [These numbers are all approximate, from my eyeball estimates of a bar graph.]

It has been suggested that the younger starting age of chess masters and expert musicians is simply a reflection of the greater amount of practice achieved with a young start. But a contrary suggestion is that there might be other advantages to learning a skill at an early age, reflecting what might be termed a ‘sensitive period’. This study found that the association between skill and starting age was still significant after amount of practice had been taken account of.

Does this have to do with the greater plasticity of young brains? Expertise “grows” brains — in the brain regions involved in that specific domain. Given that younger brains are much more able to create new neurons and new connections, it would hardly be a surprise that it’s easier for them to start building up the dense structures that underlie expertise.

This is surely easier if the young brain is also one with particular characteristics that are useful for that domain. For music, that might relate to perceptual and motor abilities. In chess, it might have more to do with processing speed, visuospatial ability, and capacious memory.

Several studies have found higher cognitive ability in chess-playing children, but the evidence among adults has been less consistent. This may reflect the growing importance of deliberate practice. (Or perhaps it simply reflects the fact that chess is a difficult skill, for which children, lacking the advantages that longer education and training have given adults, need greater cognitive skills.)

Related to all this, there’s a popular idea that once you get past an IQ of around 120, ‘extra’ IQ really makes no difference. But in a study involving over 2,000 gifted young people, those who scored in the 99.9 percentile on the math SAT at age 13 were eighteen times more likely to go on to earn a doctorate in a STEM discipline (science, technology, engineering, math) compared to those who were only(!) in the 99.1 percentile.

Overall, it seems that while practice can take you a very long way, at the very top, ‘natural’ ability is going to sort the sheep from the goats. And ‘natural’ ability may be most important in the early stages of learning. But what do we mean by ‘natural ability’? Is it simply a matter of unalterable genetics?

Well, palpably not! Because if there’s one thing we now know, it’s that nature and nurture are inextricably entwined. It’s not about genes; it’s about the expression of genes. So let me remind you that aspects of the prenatal, infant, and childhood environments affect that ‘natural’ ability. We know that these environments can affect IQ; the interesting question is what we can do, at each and any of these stages, to improve basic processes such as speed of processing, WMC, and inhibitory control. (Although I should say here that I am not a fan of the whole baby-Einstein movement! Nor is there evidence that many of those practices work.)

Bottom line:

  • talent still matters
  • effective practice is still the most important factor in developing expertise
  • individuals vary in how much practice they need
  • individual abilities do put limits on what’s achievable (but those limits are probably higher than most people realize).


References

Campitelli, G., & Gobet, F. (2011). Deliberate practice: Necessary but not sufficient. Current Directions in Psychological Science, 20(5), 280–285.

Campitelli, G., & Gobet, F. (2008). The role of practice in chess: A longitudinal study. Learning and Individual Differences, 18, 446–458.

Gobet, F., & Campitelli, G. (2007). The role of domain-specific practice, handedness and starting age in chess. Developmental Psychology, 43, 159–172.

Hambrick, D. Z., & Meinz, E. J. (2011). Limits on the Predictive Power of Domain-Specific Experience and Knowledge in Skilled Performance. Current Directions in Psychological Science, 20(5), 275 –279. doi:10.1177/0963721411422061

Hambrick, D.Z., & Engle, R.W. (2002). Effects of domain knowledge, working memory capacity and age on cognitive performance: An investigation of the knowledge-is-power hypothesis. Cognitive Psychology, 44, 339–387.

Hambrick, D.Z., Libarkin, J.C., Petcovic, H.L., Baker, K.M., Elkins, J., Callahan, C., et al. (2011). A test of the circumvention-of-limits hypothesis in geological bedrock mapping. Journal of Experimental Psychology: General, Published online Oct 17, 2011.

Hambrick, D.Z., & Oswald, F.L. (2005). Does domain knowledge moderate involvement of working memory capacity in higher level cognition? A test of three models. Journal of Memory and Language, 52, 377–397.

Meinz, E. J., & Hambrick, D. Z. (2010). Deliberate Practice Is Necessary but Not Sufficient to Explain Individual Differences in Piano Sight-Reading Skill. Psychological Science, 21(7), 914–919. doi:10.1177/0956797610373933

 

Attributes of effective practice

One of my perennial themes is the importance of practice, and in the context of developing expertise, I have talked of ‘deliberate practice’ (a concept articulated by the well-known expertise researcher K. Anders Ericsson). A new paper in the journal Psychology of Music reports on an interesting study that shows how the attributes of music practice change as music students develop in expertise. Music is probably the most studied domain in expertise research, but I think we can gain some general insight from this analysis. Here’s a summary of the findings.

[Some details about the U.K. study for those interested: the self-report study involved 3,325 children aged 6-19, ranging from beginner to Grade 8 level, covering a variety of instruments, with violin the most common at 28%, and coming from a variety of musical settings: junior conservatoires, youth orchestras, Saturday music schools, comprehensive schools.]

For a start, and unsurprisingly, amount of practice (both in terms of amount each day, and number of days in the week) steadily increases as expertise develops. Interestingly, there is a point where it plateaus (around grade 5-6 music exams) before increasing more sharply (presumably this reflects a ‘sorting the sheep from the goats’ effect — that is, after grade 6, it’s increasingly only the really serious ones that continue).

It should not be overlooked, however, that there was huge variability between individuals in this regard.

More interesting are the changes in the attributes of their practice.

 

These attributes became less frequent as the players became more expert:

Practicing strategies:

  • Practicing pieces from beginning to end without stopping
  • Going back to the beginning after a mistake

Analytic strategies:

  • Working things out by looking at the music without actually playing it
  • Trying to find out what a piece sounds like before trying to play it
  • Analyzing the structure of a piece before learning it

Organization strategies:

  • Making a list of what to practice
  • Setting targets for each session.

 

These attributes became more frequent as the players became more expert:

Practicing strategies:

  • Practicing small sections
  • Getting recordings of a piece that is being learned
  • Practicing things slowly
  • Knowing when a mistake has been made
  • When making a mistake, practicing a section slowly
  • When something is difficult, playing it over and over again
  • Marking things on the part
  • Practicing with a metronome
  • Recording practice and listening to the tapes

Analytic strategies:

  • Identifying difficult sections
  • Thinking about how to interpret the music

Organization strategies:

  • Doing warm-up exercises
  • Starting practice with studies
  • Starting practice with scales.

 

Somewhat surprisingly, levels of concentration and distractibility didn’t vary significantly as a function of level of expertise. The researchers suggest that this may reflect the reliance on self-reported data rather than reality. But, also somewhat surprisingly, enjoyment of practice didn’t change as a function of expertise either.

Interestingly (but perhaps not so surprisingly once you think about it), the adoption of systematic practicing strategies followed a U-shaped curve rather than a linear trend. Those who had passed Grade 1 scored relatively high on this, but those who had most recently passed Grade 2 scored more poorly, and those with Grade 3 were worst of all. After that, it begins to pick up again, achieving the same level at Grade 6 as at Grade 1.

Organization of practice, on the other hand, while it varied with level of expertise, showed no systematic relationship (if anything, it declined with expertise, though erratically).

The clearest result was the very steady and steep decline in the use of ineffective strategies. These include:

  • Practicing pieces from beginning to end without stopping;
  • Going back to the beginning after a mistake;
  • Immediate correction of errors.

It should be acknowledged that these strategies might well be appropriate at the beginning, but they are not effective with longer and more complex pieces. It’s suggested that the dip at Grade 3 probably reflects the need to change strategies, and the reluctance of some students to do so.

But of course grade level in itself is only part of the story. Analysis on the basis of how well the students did on their most recent exam (in terms of fail, pass, commended, and highly commended) reveals that organization of practice, and making use of recordings and a metronome, were the most important factors (in addition to the length of time they had been learning).

The strongest predictor of expertise, however, was the avoidance of ineffective strategies.

This is a somewhat discouraging conclusion, since it implies that the most important thing to learn (or teach) is what not to do, rather than what to do. But I think a codicil to this is also implicit. Given the time spent practicing (which steadily increases with expertise), the reduction in time wasted on ineffective strategies means that, perforce, time is being spent on effective strategies. The fact that no specific strategies can be unequivocally pointed to suggests that (as I have repeatedly said) effective strategies are specific to the individual.

This doesn’t mean that identifying effective strategies and their parameters is a pointless activity! Far from it. You need to know what strategies work to know what to choose from. But you cannot assume that because something is the best strategy for your best friend, it is going to be equally good for you.

Notwithstanding this, the adoption of systematic practice strategies was significantly associated with expertise, accounting for the largest chunk of the variance between individuals — some 11%.

Similarly, organization of practice (accounting for nearly 8% of variance), making use of recordings and a metronome (nearly 8%), and analytic strategies (over 7%) were important factors in developing expertise in music, and it seems likely that many if not most individuals would benefit from these.

It’s also worth noting that playing straight through the music was the strongest predictor of expertise — as a negative factor.

So what general conclusions can we draw from these findings?

The wide variability in practice amount is worth noting — practice is hugely important, but it’s a mistake to have hard-and-fast rules about the exact number of hours that is appropriate for a given individual.

Learning which strategies are a waste of time is very important (and a lesson many students never learn — witness the continuing popularity of rote repetition as a method of learning).

Organization — in respect of structuring your learning sessions — is perhaps one of those general principles that doesn’t necessarily apply to every individual, and certainly the nature and extent of organization is likely to vary by individual. Nevertheless, given its association with better performance, it is certainly worth trying to find the level of organization that is best for you (or your student). The most important factors in this category were starting practice with scales (for which appropriate counterparts are easily found for other skills being practiced, including language learning, although perhaps less appropriate for other forms of declarative learning), and making a list of what needs to be practiced.

Having expert models/examples/case studies (as appropriate), and appropriate levels of scaffolding, are very helpful (in the case of music, this is instantiated by the use of recordings, both listening to others and self-feedback, and use of a metronome).

Identifying difficult aspects, and dealing with them by tackling them on their own, using a slow and piecemeal process, is usually the most helpful approach. (Of the practice strategies, the most important were practicing sections slowly when having made a mistake, practicing difficult sections over and over again, slow practice, gradually speeding up when learning fast passages, and recognizing errors.)

Preparing for learning is also a generally helpful strategy. In music this is seen in the most effective analytic strategies: trying to find out what a piece sounds like before trying to play it, and getting an overall idea of a piece before practicing it. In declarative learning (as opposed to skill learning), this can be seen in such strategies as reading the Table of Contents, advance organizers and summaries (in the case of textbooks), or doing any required reading before a lecture, and (in both cases) thinking about what you expect to learn from the book or lecture.


References

Hallam, S., Rinta, T., Varvarigou, M., Creech, A., Papageorgi, I., Gomes, T., & Lanipekun, J. (2012). The development of practising strategies in young people. Psychology of Music, 40(5), 652–680. doi:10.1177/0305735612443868

The value of intensive practice

Let’s talk about the cognitive benefits of learning and using another language.

In a recent news report, I talked about the finding that intensive learning of a very novel language significantly grew several brain regions, of which two were positively associated with language proficiency. These regions were the right hippocampus and the left superior temporal gyrus. Growth of the first of these probably reflects the learning of a great many new words, and the second may reflect heavy use of the phonological loop (a part of working memory).

There are several aspects to this study that are worth discussing in the context of using language learning as a means of protecting against age-related cognitive decline.

First of all, let me start with a general reminder. We now know that, analogous to muscles, we can ‘grow’ specific brain regions by working them. But an adult brain is confined by the skull — growth in one part is generally at the expense of another part. So, unlike body-building, you can’t just grow your whole brain!

This suggests that it pays to think about the areas you want to improve (which goes right back to the first chapter of The Memory Key: it’s no good talking about improving ‘your memory’ — rather, you should pick the memory tasks you want to improve).

One of the big advantages of growing the parts of the brain involved in language is that language is so utterly critical to our intellectual ability. Most of us use language to think and to communicate. There’s a reason why so many studies of older adults’ cognitive performance use verbal fluency as the measure!

But, in the same way that the increase in London cab drivers’ right posterior hippocampus appears to be at the expense of the anterior hippocampus, the growth in the right hippocampus may be at the expense of other functions (perhaps spatial navigation).

Is this a reason for not learning? Certainly not! But it is perhaps a reminder that we should be aiming for two things in preventing cognitive decline. The first is in ‘growing’ brain tissue: making new neurons, and new connections. This is to counteract the shrinkage (brain atrophy) that tends to occur with age.

The second concerns flexibility. Retaining the brain’s plasticity is a vital part of fighting cognitive decline, even more vital, perhaps, than retaining brain tissue. To keep this plasticity, we need to keep the brain changing.

Here’s a question we don’t yet know the answer to: how much age-related cognitive decline is down to people steadily experiencing fewer and fewer novel events, learning less, thinking fewer new thoughts?

But we do know it matters.

So let’s go back to our intensive language learners growing parts of their brain. Does the growth in the right hippocampus (unfortunately we don’t know how much that growth was localized within the right hippocampus) mean that it will now remain that size, at the expense, presumably, of some other area (and function)?

No, it doesn’t. As far as language is concerned, the hippocampus is primarily a short-term processor. As those new words are consolidated, they’ll move into long-term memory, in the language network across the cortex. Once the interpreters stop acquiring new vocabulary at this rate, I would expect to see this region reduce. Indeed (and I am speculating here), I would expect this to happen once a solid ‘semantic network’ for the new language was established in long-term memory. At this point, new vocabulary will be more and more encoded in terms of that network, and reliance on the short-term processes of the hippocampus will become less (although still important!).

I think that intensity is important. Intensity by its very nature is rarely maintained. People at the top of their field — champion sportspeople, top-ranking musicians, ‘geniuses’, and so on — have to maintain that intensity as long as they want to stay at the top, and I would expect their brains to show more enduring changes (that is, particular regions that are unusually large, and others that are smaller than average). For the rest of us, any enduring changes are less marked.

But making those changes is important!

In recent years, research has come to suggest that, although regular moderate exercise is highly beneficial for physical and mental health, short bouts of intense activity have their own specific benefits above and beyond that. I think the same might be true for mental activity.

This may be particularly (or differently) true as we get older, when it does tend to get harder to learn — making (relatively) short bouts of intensive study/learning/activity so vital. We need that concentrated practice more than we did when we were young and learning came easier. And concentrated practice may be exactly the way to produce significant change in our brains.

But we don’t need to worry about becoming ‘muscle-bound’ — if we learn thousands of new words in a few months (an excellent step in acquiring a new language), we will then go on to acquire grammar and practice reading and writing whole sentences. The words will consolidate; different language skills will build different parts of the brain; those areas no longer being intensively worked will diminish (a little).

Moreover, it’s not only about growing particular regions, it’s also very much about building new or stronger connections between regions — building new networks. Because language learning involves so many regions, it may be especially good for that aspect too (see, for example, another recent news report, on how language learning grows white matter and reorganizes brain structures).

The important thing is that your brain is changing; the important thing is that your brain keeps changing. I think intensive periods of new learning are the way to achieve this, interspersed with consolidation periods.

As I’ve said before, variety is key. By providing variety in learning and experiences across tasks and domains, you can keep your brain flexible. By providing intense focus for a period, you can better build specific ‘mental muscles’.


Variety is the key to learning

On a number of occasions I have reported on studies showing that people with expertise in a specific area show larger gray matter volume in relevant areas of the brain. Thus London taxi drivers (who are required to master “The Knowledge” — all the ways and byways of London) have been found to have an increased volume of gray matter in the posterior hippocampus (involved in spatial navigation). Musicians have greater gray matter volume in Broca’s area.

Other research has found that gray matter increases in specific areas can develop surprisingly quickly. For example, when 19 adults learned to match made-up names against four similar shades of green and blue in five 20-minute sessions over three days, the areas of the brain involved in color vision and perception increased significantly.

This is unusually fast, mind you; previous research has pointed to the need for training to extend over several weeks. The speed with which these changes were achieved may be because of the type of learning — that of new categories — or because of the training method used. In the first two sessions, participants heard each new word as they viewed the relevant color, had to give the name on seeing the color, and had to respond appropriately when a color and name were presented together. In the next three sessions, they continued with the naming and matching tasks. In both cases, immediate feedback was always given.

But how quickly brain regions may re-organize themselves to optimize learning of a specific skill is not the point I want to make here. Some new research suggests our ideas of cortical plasticity need to be tweaked.

In my book on note-taking, I commented on how emphasis of some details (for example by highlighting) improves memory for those details but reduces memory of other details. In the same way, increase of one small region of the brain is at the expense of others. If we have to grow an area for each new skill, how do we keep up our old skills, whose areas might be shrinking to make up for it?

A rat study suggests the answer. While substantial expertise (such as our London cab-drivers and our professional musicians) is apparently underpinned by permanent regional increase, the mere learning of a new skill does not, it seems, require the increase to endure. When rats were trained on an auditory discrimination task, relevant sub-areas of the auditory cortex grew in response to the new discrimination. However, after 35 days the changes had disappeared — but the rats retained their new perceptual abilities.

What’s particularly interesting about this is what the finding tells us about the process of learning. It appears that the expansion of bits of the cortex is not the point of the process; rather it is a means of generating a large and varied set of neurons that are responsive to newly relevant stimuli, from which the most effective circuit can be selected.

It’s a culling process.

This is the same as what happens with children. When they’re young, neurons grow with dizzying profligacy. As they get older, these are pruned. Gone are the neurons that would allow them to speak French with a perfect accent (assuming French isn’t a language in their environment); gone are the neurons that would allow them to finely discriminate the faces of races other than those around them. They’ve had their chance. The environment has been tested; the needs have been winnowed; the paths have been chosen.

In other words, the answer’s not: “more” (neurons/connections); the answer is “best” (neurons/connections). What’s most relevant; what’s needed; what’s the most efficient use of resources.

This process of throwing out lots of trials and seeing what wins, echoes other findings related to successful learning. We learn a skill best by varying our practice in many small ways. We learn best from our failures, not our successes — after all, a success is a stopper. If you succeed without sufficient failure, how will you properly understand why you succeeded? How will you know there aren’t better ways of succeeding? How will you cope with changes in the situation and task?

Mathematics is an area in which this process is perhaps particularly evident. As a student or teacher, you have almost certainly come across a problem that you (or the student) couldn’t understand when it was expressed in one way, and maybe not even when it was expressed in several different ways. Until, at some point, for no clear reason, understanding ‘clicks’. And it’s not necessarily that this last way of expressing or representing it is the ‘right’ one — if it had been presented first, it might not have had that effect. The effect is cumulative — the result of trying several different paths and picking something useful from each of them.

In a recent news item I reported on a finding that people who learned new sequences more quickly in later sessions were those whose brains had displayed more 'flexibility' in the earlier sessions — that is, different areas of the brain linked with different regions at different times. And most recently, I reported on a finding that training on a task that challenged working memory increased fluid intelligence in those who improved at the working memory task. But not everyone did. Those who improved were those who found the task challenging but not overwhelming.

Is it too much of a leap to surmise that this response goes hand in hand with flexible processing, with strategizing? Is this what the ‘sweet spot’ in learning really reflects — a level of challenge and enjoyability that stimulates many slightly different attempts? We say ‘Variety is the spice of life’. Perhaps we should add: ‘Variety is the key to learning’.


References

Kwok, V., Niu, Z., Kay, P., Zhou, K., Mo, L., Jin, Z., et al. (2011). Learning new color names produces rapid increase in gray matter in the intact adult human cortex. Proceedings of the National Academy of Sciences.

The most effective learning balances same and different context

I recently reported on a finding that memories are stronger when the pattern of brain activity is more closely matched on each repetition, a finding that might appear to challenge the long-standing belief that it’s better to learn in different contexts. Because these two theories are very important for effective learning and remembering, I want to talk more about this question of encoding variability, and how both theories can be true.

First of all, let’s quickly recap the relevant basic principles of learning and memory (I discuss these in much more detail in my books The Memory Key, now out-of-print but available from my store as a digital download, and its revised version Perfect Memory Training, available from Amazon and elsewhere):

network principle: memory consists of links between associated codes

domino principle: the activation of one code triggers connected codes

recency effect: a recently retrieved code will be more easily found

priming effect: a code will be more easily found if linked codes have just been retrieved

frequency (or repetition) effect: the more often a code has been retrieved, the easier it becomes to find

spacing effect: repetition is more effective if repetitions are separated from each other by other pieces of information, with increasing advantage at greater intervals.

matching effect: a code will be more easily found the more the retrieval cue matches the code

context effect: a code will be more easily found if the encoding and retrieval contexts match

Memory is about two processes: encoding (the way you shape the memory when you put it in your database, which includes the connections you make with other memory codes already there) and retrieving (how easy it is to find in your database). So making a ‘good’ memory (one that is easily retrieved) is about forming a code that has easily activated connections.

The recency and priming effects remind us that it’s much easier to follow a memory trace (by which I mean the path to it as well as the code itself) that has been activated recently, but that’s not a durable strength. Making a memory trace more enduringly strong requires repetition (the frequency effect). This is about neurobiology: every time neurons fire in a particular sequence, it becomes a little easier for them to fire in that way again.

Now the spacing effect (which is well-attested in the research) seems at odds with this most recent finding, but clearly the finding is experimental evidence of the matching and context effects. Context at the time of encoding affects the memory trace in two ways, one direct and one indirect. It may be encoded with the information, thus providing additional retrieval cues, and it may influence the meaning placed on the information, thus affecting the code itself.

It is therefore not at all surprising that the closer the contexts, and the closer the match between what was encoded and what you’re looking for, the more likely you are to remember. The thing to remember is that the spacing effect does not say that spacing makes the memory trace stronger. In fact, most of the benefit of spacing occurs with as little as two intervening items between repetitions — probably because you’re not going to benefit from repeating a pattern of activation if you don’t give the neurons time to reset themselves.

But repeating the information at increasing intervals does produce better learning, measured by your ability to easily retrieve the information after a long period of time (see my article on …), and it does this (it is thought) not because the memory trace is stronger, but because the variations in context have given you more paths to the code.

This is the important thing about retrieving: it’s not simply about having a strong path to the memory. It’s about getting to that memory any way you can.

Let’s put it this way. You’re at the edge of a jungle. From where you stand, you can see several paths into the dense undergrowth. Some of the paths are well-beaten down; others are not. Some paths are closer to you; others are not. So which path do you choose? The most heavily trodden? Or the closest?

If the closest is the most heavily trodden, then the choice is easy. But if it’s not, you have to weigh up the quality of the paths against their distance from you. You may or may not choose correctly.

I hope the analogy is clear. The strength of the memory trace is the width and smoothness of the path. The distance from you reflects the degree to which the retrieval context (where you are now) matches the encoding context (where you were when you first input the information). If they match exactly, the path will be right there at your feet, and you won’t even bother looking around at the other options. But the more time has passed since you encoded the information, the less chance there is that the contexts will match. However, if you have many different paths that lead to the same information, your chances of being close to one of them obviously increases.

In other words, yes, the closer the match between encoding and retrieval context, the easier it will be to remember (retrieve) the information. And the more different contexts you have encoded with the information, the more likely it is that one of those contexts will match your current retrieval context.

A concrete example might help. I’ve been using a spaced retrieval program to learn the basic 2200-odd Chinese characters. It’s an excellent program, and groups similar-looking characters together to help you learn to distinguish them. I am very aware that every time a character is presented, it appears after another character, which may or may not be the same one it appeared after on an earlier occasion. The character that appeared before provides part of the context for the new character. How well I remember it depends in part on how often I have seen it in that same context.

I would ‘learn’ them more easily if they always appeared in the same order, in that the memory trace would be stronger, and I would more easily and reliably recall them on each occasion. However in the long-term, the experience would be disadvantageous, because as soon as I saw a character in a different context I would be much less likely to recall it. I can observe this process as I master these characters — with each different retrieval context, my perception of the character deepens as I focus attention on different aspects of it.

More about motor memory

I don’t often talk about motor or skill memory — that is, the memory we use when we type or drive a car or play the piano. It’s one of the more mysterious domains of memory. We all know, of course, that this is a particularly durable kind of memory. It’s like riding a bicycle, we say — meaning that it’s something we’re not likely to have forgotten, something that will come back to us very readily, even if it’s been a very long time since we last used the skill.

For several decades there’s been argument over where motor memory is created. Now at last the dispute has apparently been settled, in favor of both contenders. What we needed to clarify the evidence was to realize that short-term motor memory is a quite different animal from long-term motor memory, and the two are created in different places.

The differences between short- and long-term motor memory have important implications, so let’s take a look at them.

First of all, it appears that short-term motor memory is created in the Purkinje cells of the cerebellar cortex, while long-term motor memory is transferred to the vestibular nucleus (axons from the Purkinje cells extend from the cerebellum to the vestibular nucleus in the medulla oblongata).

A similar process occurs of course in other types of memory. Most memory (for experiences, for information) is created in the hippocampus, and later passed on to regions in the cerebral cortex for long-term storage. However, that process of consolidation and transfer takes weeks. Motor memory moves from short-term to long-term much more quickly — within as little as a few hours, in some cases, or a few days at most.

There’s another important way in which motor memory differs from ‘ordinary’ memory. Again, it’s not qualitatively different, but an extension of the normal process. We don’t usually remember everything. Long-term memory is more a memory of gist than precision. Details are lost; what we remember for the most part are the broad strokes on the canvas. Similarly (though rather more markedly perhaps), short-term motor memory is quickly lost, passing on only the rough shape of the process to long-term memory.

For example, in the mouse experiments that demonstrated all this, the mice were taught to follow the movement of an object by moving their eyes in a particular way. With practice they got better at this particular eye movement, and if they practiced the task on a daily basis for several days, they were able to maintain this skill. It had been established in long-term memory.

However, this is a simple skill. When monkeys were taught a more complex skill — to follow a moving ball as its speed increased for a fifth to a tenth of a second — although they usually mastered the task quite quickly, it was also forgotten just as quickly. The researchers say such “sophisticated” motor memory is easily lost in just 10 to 15 minutes.

A more human example is how a baseball batter can learn to hit a curve ball after the movement of the ball has been observed several times and memorized. It’s an advantage to pick this information up quickly, but the price seems to be that it is also forgotten quickly.

Riding a bicycle is the archetypal example of the durability of motor memory, but there’s also always a caveat: with just a little practice, we say, you’ll pick it up again. But you need that practice, and to get as skilled as you were in your heyday, you need more practice. Motor memory may be durable, but it’s only the broad outlines of the procedure that are ‘locked in’.

Of course, what constitutes the ‘broad outlines’ is clearly something that must change with practice. A concert pianist who’s been in retirement for five years and someone who learned the piano as a child are not starting off on the same foot! The ‘broad outlines’ the concert pianist has salted away must be considerably more sophisticated than those of the childhood pianist. It would be interesting to see the differences between experts and novices explored.

But in the meantime, there are two useful lessons we can take from these studies. The first is the need to brush up your skills before expecting them to be at their best (the researchers suggest that even professional musicians, accustomed to playing every day, need to ‘remind’ themselves of their skill before a concert). The second is one connected to the speed with which short-term motor memory transfers to long-term memory.

The researchers found that the animals learned more quickly when their training was broken into shorter intervals with breaks — for example, dividing an hour-long training session into four 15-minute exercises with intervals of 30 minutes between them. For this to be true, however, the cerebellar cortex needed to be active. This implies that something happens in this part of the brain during periods of inactivity that’s important for creating long-term memory. I’m reminded here of other recent research pointing to the importance of “quiet time” for consolidating new learning.

None of this contradicts what we already know about how to learn and practice a skill, but it does add to our understanding, and it reinforces the idea that it’s better to practice a skill regularly in small bites than in lengthy sessions. (I’m not decrying the long sessions a musician, say, puts in on a daily basis. But the recommendation would be not to practice one specific thing for too long at one go — better to move on to something else, and, repeatedly, come back to it.)

For more about how to practice, check out Learning a new skill, Spacing your learning, and Acquiring expertise through deliberate practice.

 

Acquiring expertise through deliberate practice

K. Anders Ericsson, the guru of research into expertise, makes a very convincing case for the absolutely critical importance of what he terms “deliberate practice”, and the minimal role of what is commonly termed “talent”. I have written about this question of talent and also about the principles of expertise. Here I would like to talk briefly about Ericsson’s concept of deliberate practice.

Most people, he suggests, spend very little (if any) time engaging in deliberate practice even in those areas in which they wish to achieve some level of expertise. Experts, on the other hand, only achieve their expertise after several years (at least ten, in general) of maintaining high levels of regular deliberate practice.

What distinguishes deliberate practice from less productive practice? Ericsson suggests several factors are of importance:

The acquisition of expert performance needs to be broken down into a sequence of attainable training tasks.

  • Each of these tasks requires a well-defined goal.
  • Feedback for each step must be provided.
  • Repetition is needed — but that repetition is not simple; rather the student should be provided with opportunities that gradually refine his performance.
  • Attention is absolutely necessary — it is not enough to simply mechanically “go through the motions”.
  • The aspiring expert must constantly and attentively monitor her progress, adjusting and correcting her performance as required.

For these last two reasons, deliberate practice is limited in duration. Whatever the particular field of endeavor, there seems a remarkable consistency in the habits of elite performers suggesting that 4 to 5 hours of deliberate practice per day is the maximum that can be maintained. This, of course, cannot all be done at one time without resting. When concentration flags, it is time to rest — most probably after about an hour. But the student must train himself up to this level; the length of time he can concentrate will increase with practice.

Higher levels of concentration are often associated with longer sleeping, in particular in the form of day-time naps.

Not all practice is, or should be, deliberate practice. Deliberate practice is effortful and rarely enjoyable. Some practice is, however, what Ericsson terms “playful interaction”, and presumably provides a motivational force — it should not be despised!

In general, experts reduce the amount of time they spend on deliberate practice as they age. It seems that, once a certain level of expertise has been achieved, it is not necessary to force yourself to continue the practice at the same level in order to maintain your skill. However, as long as you wish to improve, a high level of deliberate practice is required.

This article first appeared in the Memory Key Newsletter for November 2005


References

Ericsson, K.A. (1996). The acquisition of expert performance: An introduction to some of the issues. In K.A. Ericsson (Ed.), The road to excellence: The acquisition of expert performance in the arts and sciences, sports, and games. Mahwah, NJ: Lawrence Erlbaum.

The most effective way of spacing your learning

We don’t deliberately practice our memories of events — not as a rule, anyway. But we don’t need to — because just living our life is sufficient to bring about the practice. We remember happy, or unpleasant, events to ourselves, and we recount our memories to other people. Some will become familiar stories that we re-tell again and again. But facts, the sort of information we learn in formal settings such as school and university, these are not something we tend to repeatedly recount to ourselves or others — not for pleasure anyway! (Unless you’re a teacher, and that’s part of the reason teaching is such a good way of learning!)

So, this is one of the big issues in learning: how to get the repetition we need to fix something in our brain. Simple repetition — the sort of drill we deplore in pre-modern schools — is not a great answer. Not simply because it’s boring, but because boring tasks are not particularly effective means of getting the brain to do things. Our brains respond much better to the surprising, the novel, the emotional, the interesting.

Teachers today are of course aware of this, and do try (or I hope they do!) to provide as much variety, and interest, as they can. But there is another aspect to repetition that is less widely understood, and that is the spacing between repetitions. Now the basic principle has been known for some time: spaced repetition is better than massed practice. But research has been somewhat lacking as to what constitutes the optimal spacing for learning. Studies have tended to use quite short intervals. But now a new study has finally given us something to work with.

For a start, the study was much bigger than the usual such study — over 1350 people took part — increasing the faith we can have in the findings. And, crucially, the interval between the initial learning session and the second review session ranged from several minutes to 3.5 months (specifically, 3 minutes; one day; 2 days; 4 days; 7 days; 11 days; 14 days; 21 days; 35 days; 70 days; 105 days). The time until test also covered more ground — up to nearly a year (more specifically: 7 days; 35 days; 70 days; 350 days). The initial learning session involved the participants learning 32 obscure facts to a criterion level of one perfect recall for each fact. The review session involved the participants being tested twice on each fact. They were then shown the correct answer. Testing included both a recall test and a recognition (multi-choice) test. The participants, by the way, ranged in age from 18 to 72 years, with an average of 34 (the study was done using the internet; so nice to get away from the usual undergraduate fodder).

So there we are, a very systematic study, made possible by having such a large pool of participants (the benefits of the internet!). What was found? Well, first of all, the benefits of spacing review were quite significant, much larger than had been seen in earlier research when shorter intervals had been used. Given a fixed amount of study time, the optimal gap, compared to no gap (i.e. 3 minutes), improved recall by 64% and recognition by 26%.

Secondly, at any given test delay, longer intervals between initial study session and review session first improved test performance, then gradually reduced it. In other words, there was an optimal interval between study and review. This optimal gap increased as test delay increased — that is, the longer you want to remember the information, the more you should spread the gap between study and review (this simplifies the situation of course — if you’re serious about study, you’re going to review it more than once!). So, for those remembering for a week, the optimal gap was one day; for remembering for a month, it was 11 days; for 2 months (70 days) it was 3 weeks, and similarly for remembering for a year. Extrapolating, it seems likely that if you’re wanting to remember information for several years, you should review it over several months.

Note that the general rule is absolute rather than relative: when measured as a proportion of test delay, the optimal gap declined from about 20 to 40% of a 1-week test delay to about 5 to 10% of a 1-year test delay. In other words, although the optimal gap between study and review increases as the length of time you want to remember for increases, the ratio of gap to that length of time will decrease. Which seems very commonsensical.
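If you want a rough rule of thumb from these numbers, here is a small Python sketch that linearly interpolates the optima reported above (a 1-day gap for a 1-week test delay; 11 days for 35 days; about three weeks for 70 and 350 days). The interpolation between the reported points is my own convenience, not something the study tested:

```python
import bisect

# (desired retention in days, reported optimal study-review gap in days),
# taken from the findings described above
OPTIMA = [(7, 1), (35, 11), (70, 21), (350, 21)]

def suggested_gap(retention_days: float) -> float:
    """Interpolate the reported optima to suggest a first review gap."""
    delays = [d for d, _ in OPTIMA]
    if retention_days <= delays[0]:
        return OPTIMA[0][1]
    if retention_days >= delays[-1]:
        return OPTIMA[-1][1]
    i = bisect.bisect_left(delays, retention_days)
    (d0, g0), (d1, g1) = OPTIMA[i - 1], OPTIMA[i]
    return g0 + (g1 - g0) * (retention_days - d0) / (d1 - d0)

print(suggested_gap(180))   # want to remember for ~6 months -> ~21-day gap
```

Note how flat the right-hand end of the table is, consistent with the point above: the optimal gap grows in absolute terms but shrinks as a proportion of the test delay.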

As the researchers point out (and as has been said before), “the interaction of gap and test delay implies that many educational practices are highly inefficient”, concentrating topics tightly into short periods of time. This practice is likely to give misleadingly high levels of immediate mastery (as shown in tests given at the end of this time) — performance which is unlikely to be sustained over longer periods of time.

It’s also worth noting that the costs of using a gap that is longer than the optimal gap are decidedly less than the costs of using a shorter gap — in other words, better to err on the side of spacing your reviews too widely than too narrowly.

This article first appeared in the Memory Key Newsletter for December 2008


Flashcards

Flashcards are cards with a word (or phrase) on one side and its translation on the other. You can buy ready-made flashcards, and these can certainly be helpful, particularly if you're inexperienced at learning another language. However, it is more effective to make them yourself. Not only will the cards be customized to your own use, but the activity of selecting words and writing them down helps you learn them.

A standard way of using flashcards is simply to go through a set number each day, separating out those you have trouble with, so you can review them more often. Keep these ones handy so that you can go through them at odd moments during the day when you're waiting for something.
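That routine (work through a set number each day, and see the troublesome ones more often) is essentially the classic Leitner box system. Here is a minimal Python sketch of it; the box intervals are illustrative assumptions, not a prescription:

```python
BOX_INTERVALS = [1, 2, 4, 8, 16]   # box index -> review every N days (assumed)

class Flashcard:
    def __init__(self, front: str, back: str):
        self.front, self.back = front, back
        self.box = 0               # new and troublesome cards live in box 0

    def mark(self, correct: bool) -> None:
        # Known cards migrate to rarely reviewed boxes; misses drop back to box 0
        self.box = min(self.box + 1, len(BOX_INTERVALS) - 1) if correct else 0

def due_today(cards: list, day_number: int) -> list:
    """A card in box b comes up every BOX_INTERVALS[b] days."""
    return [c for c in cards if day_number % BOX_INTERVALS[c.box] == 0]
```

The cards you keep getting wrong stay in box 0 and come up daily (the "keep these ones handy" pile), while well-known cards surface only occasionally, which is the same expanding-interval logic discussed in the spacing article above.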

Use the flashcards as a handy way to group words in different ways. Deal out the cards and move them around, looking for connections.

If you have word-family flashcards (recommended) - e.g., cards with various related forms of a word - you can make different sentences with your cards. You could also play cards with them, if you have others to play with. You could play a version of rummy, for example, where the sets are infinitive, present tense, future tense, past perfect. Use your imagination!

A bingo game with flashcards is another fun way to practice. Construct bingo cards (large cards divided into a certain number of spaces the same size as your flashcards) with the native-language words on them. While this is better played with others, you can at a pinch play by yourself, simply picking a flashcard from the pile and seeing how quickly you can match it with its counterpart.

Learning words in isolation will not help you much in dealing with words in context. You do need to practice reading/writing/speaking/listening sentences. But flashcards are a useful means of memorizing vocabulary.

Flashcard software

VTrain (Vocabulary Trainer) is flashcard software apparently used in the language labs of 40 universities and hundreds of high schools. It's shareware, and free for educational establishments.