Marshmallow Test Experiment and Delayed Gratification

Take-home Messages

  • The marshmallow test is an experimental design that measures a child’s ability to delay gratification. The child is given the option of waiting a bit to get their favourite treat, or if not waiting for it, receiving a less-desired treat. The minutes or seconds a child waits measures their ability to delay gratification.
  • The original marshmallow test showed that preschoolers’ delay times were significantly affected by the experimental conditions, like the physical presence/absence of expected treats.
  • The original test sample was not representative of the preschooler population, thereby limiting the study’s predictive ability. (Preschool participants were all recruited from Stanford University’s Bing Nursery School, which was then largely patronized by children of Stanford faculty and alumni.)
  • A 2018 study on a large, representative sample of preschoolers sought to replicate the statistically significant correlations between early-age delay times and later-age life outcomes, like SAT scores, which had been previously found using data from the original marshmallow test. The replication study found only weak statistically significant correlations, which disappeared after controlling for socio-economic factors.
  • However, the 2018 study did find statistically significant differences in both early-age delay times and later-age life outcomes between children from high-SES families and children from low-SES families, suggesting that socio-economic factors play a more significant role than early-age self-control in important life outcomes.

In a 1970 paper, Walter Mischel, a professor of psychology at Stanford University, and his graduate student, Ebbe Ebbesen, had found that preschoolers waiting 15 minutes to receive their preferred treat (a pretzel or a marshmallow) waited much less time when either treat was within sight than when neither treat was in view.

Children with treats present waited 3.09 ± 5.59 minutes; children with neither treat present waited 8.90 ± 5.26 minutes.
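
To get a rough sense of how large this difference is, the reported means and standard deviations can be converted into a standardized effect size (Cohen's d). The sketch below assumes the two groups were of equal size, which is not stated here, so the result is illustrative only; the function name `cohens_d` is a hypothetical helper.

```python
import math

def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d with a pooled SD that assumes equal group sizes (an assumption)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_b - mean_a) / pooled_sd

# Waiting times in minutes, as reported above:
# group a = treats present, group b = neither treat present.
d = cohens_d(mean_a=3.09, sd_a=5.59, mean_b=8.90, sd_b=5.26)
print(round(d, 2))
```

Under that assumption, d comes out a little above 1, which is conventionally read as a large effect.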

The study had suggested that gratification delay in children involved suppressing rather than enhancing attention to expected rewards. For instance, some children who waited with both treats in sight would stare at a mirror, cover their eyes, or talk to themselves, rather than fixate on the pretzel or marshmallow.

Mischel, Ebbesen, and Antonette Zeiss, a visiting faculty member at the time, set out to investigate whether attending to rewards cognitively made it more difficult for children to delay gratification.

The Stanford Marshmallow Experiments

Mischel, Ebbesen and Zeiss (1972) designed three experiments to investigate, respectively, the effect of overt activities, cognitive activities, and the lack of either, on preschoolers’ gratification delay times.

Experiment 1

Fifty-six children from the Bing Nursery School at Stanford University were recruited. To build rapport with the preschoolers, two experimenters spent a few days playing with them at the nursery.


Children were randomly assigned to one of five groups (A – E).

The children were individually escorted to a room where the test would take place. Each child was taught to ring a bell to signal for the experimenter to return to the room if the experimenter ever stepped out.

Treat vs. No Treats Condition

Children in groups A, B, C were shown two treats (a marshmallow and a pretzel) and asked to choose their favourite.

They were then told that the experimenter would soon have to leave for a while, but that they’d get their preferred treat if they waited for the experimenter to come back without signalling for them to do so.

They were also explicitly allowed to signal for the experimenter to come back at any point in time, but told that if they did, they’d only get the treat they hadn’t chosen as their favourite. Both treats were left in plain view in the room.

Children in groups D and E were given no such choice or instructions.

Children in groups A, B, or C who waited the full 15 minutes were allowed to eat their favoured treat. Those in groups A, B, or C who didn’t wait the 15 minutes were allowed to have only their non-favoured treat.

Children in groups D and E weren’t given treats. All children got to play with toys with the experimenters after waiting the full 15 minutes or after signalling.

Distraction vs. No Entertainment Condition

Children in groups A and D were given a slinky and were told they had permission to play with it.

Children in groups B and E were asked to “think of anything that’s fun to think of” and were told that some fun things to think of included singing songs and playing with toys.


Each child’s comprehension of the instructions was tested. Six children didn’t seem to comprehend, and were excluded from the test. The remaining 50 children were included.

All 50 were told that whether or not they rang the bell, the experimenter would return, and when he did, they would play with toys.

Waiting time was scored from the moment the experimenter shut the door. The experimenter returned either as soon as the child signalled or after 15 minutes, if the child did not signal.
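
The scoring rule described above can be sketched in a few lines. The function `score_trial` and its parameter names are hypothetical and only restate the rule as described here; this is not the authors' own procedure.

```python
MAX_WAIT_MIN = 15.0  # full waiting period in Experiment 1, in minutes

def score_trial(signal_time_min=None, max_wait=MAX_WAIT_MIN):
    """Return (waiting_time_in_minutes, waited_full_period) for one child.

    signal_time_min is the time at which the child rang the bell, measured
    from the moment the door was shut, or None if the child never signalled.
    """
    if signal_time_min is None or signal_time_min >= max_wait:
        return max_wait, True
    return float(signal_time_min), False

print(score_trial(4.5))   # child rang the bell after 4.5 minutes
print(score_trial(None))  # child never signalled
```

A child who waited the full period is scored at the 15-minute ceiling, so waiting times in these experiments cannot exceed that value.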


[Figure: marshmallow test results, treat vs. no-treat condition]

The results suggested that children were much more willing to wait longer when they were offered a reward for waiting (groups A, B, C) than when they weren’t (groups D, E).

The results also showed that children waited much longer when they were given tasks that distracted or entertained them during their waiting period (playing with a slinky for group A, thinking of fun things for group B) than when they weren’t distracted (group C).

Experiment 2

This test differed from the first only in the following ways:

  1. Thirty-eight children were recruited, with six lost due to incomplete comprehension of instructions.
  2. Thirty-two children were randomly assigned to three groups (A, B, C).
  3. All children were given a choice of treats, and told they could wait without signalling to have their favourite treat, or simply signal to have the other treat but forfeit their favoured one.
  4. In all cases, both treats were left in plain view.
  5. Children in group A were asked to think of fun things, as before.
  6. Those in group B were asked to think of sad things, and likewise given examples of such things.
  7. Those in group C were asked to think of the treats.


[Figure: marshmallow test results, distracted vs. not-distracted condition]

The results suggested that children who were given distracting tasks that were also fun (thinking of fun things for group A) waited much longer for their treats than children who were given tasks that either didn’t distract them from the treats (group C, asked to think of the treats) or didn’t entertain them (group B, asked to think of sad things).

Experiment 3

This test differed from the first only in the following ways:

  1. Sixteen children were recruited, and none excluded.
  2. Children were randomly assigned to three groups (A, B, C).
  3. All children were given a choice of treats, and told they could wait without signalling to have their favourite treat, or simply signal to have the other treat but forfeit their favoured one.
  4. In all cases, both treats were obscured from the children with a tin cake cover (which children were told would keep the treats fresh).
  5. Children in group A were asked to think about the treats.
  6. Those in group B were asked to think of fun things, as before.
  7. Those in group C were given no task at all.


[Figure: marshmallow test results, distracted vs. not-distracted condition]

The results suggested that when treats were obscured (by a cake tin, in this case), children who were given no distracting or fun task (group C) waited just as long for their treats as those who were given a distracting and fun task (group B, asked to think of fun things).

On the other hand, when the children were given a task which didn’t distract them from the treats (group A, asked to think of the treats), having the treats obscured did not increase their delay time as opposed to having them unobscured (as in the second test).

Final Conclusions

The studies convinced Mischel, Ebbesen and Zeiss that children’s successful delay of gratification significantly depended on their cognitive avoidance or suppression of the expected treats during the waiting period, e.g., by not having the treats within sight, or by thinking of fun things.

Children, they reasoned, could wait a relatively long time if they –

  1. Believed they really would get their favoured treat if they waited (e.g., by trusting the experimenter, or by having the treats remain in the room, whether obscured or in plain view).
  2. Shifted their attention away from the treats.
  3. Occupied themselves with non-frustrating or pleasant internal or external stimuli (e.g., thinking of fun things, playing with toys).

Critical Evaluation

  • Sample size determination was not disclosed.
  • The study population (Stanford’s Bing Nursery School) was not characterised, and so may differ in relevant respects from the general human population, or even the general preschooler population. (In fact, the school was mostly attended by middle-class children of faculty and alumni of Stanford.)
  • The findings might also not extend to voluntary delay of gratification (where the option of having either treat immediately is available, in addition to the studied option of having only the non-favoured treat immediately).

Longitudinal Studies Using Stanford Data

Delayed Gratification and SAT Scores

In 1990, Yuichi Shoda, a graduate student at Columbia University, Walter Mischel, now a professor at Columbia University, and Philip Peake, a graduate student at Smith College, examined the relationship between preschoolers’ delay of gratification and their later SAT scores.


Six hundred and fifty-three preschoolers at the Bing School at Stanford University participated at least once in a series of gratification delay studies between 1968 and 1974.

Four hundred and four of their parents received follow-up questionnaires. One hundred and eighty-five responded. Ninety-four parents supplied their children’s SAT scores.
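
The attrition implied by these counts can be worked out directly. The short calculation below uses only the numbers given above; the variable names are illustrative.

```python
participants = 653         # preschoolers who took part at least once
questionnaires_sent = 404  # parents who received a follow-up questionnaire
parents_responded = 185
sat_scores_supplied = 94

response_rate = parents_responded / questionnaires_sent
sat_share_of_original = sat_scores_supplied / participants

print(f"Questionnaire response rate: {response_rate:.1%}")
print(f"Original participants with SAT scores: {sat_share_of_original:.1%}")
```

Fewer than half of the contacted parents responded, and SAT scores were available for only about one in seven of the original participants.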

Children were divided into four groups depending on whether a cognitive activity (e.g., thinking of fun things) had been suggested before the delay period or not, and on whether the expected treats had remained within sight throughout the delay period or not.


The difference in the mean waiting time of the children of parents who responded and that of the children of parents who didn’t respond was not statistically significant (p = 0.09, n = 653).


[Figure: marshmallow test results, delayed gratification and future SAT scores]

Preschoolers’ delay times correlated positively and significantly with their later SAT scores when no cognitive task had been suggested and the expected treats had remained in plain sight.

Other correlations were not significant.


Shoda, Mischel and Peake (1990) urged caution in extrapolating their findings, since their samples were uncomfortably small.

Delayed Gratification and Positive Functioning

In a 2000 paper, Ozlem Ayduk, at the time a postdoctoral researcher at Columbia, and colleagues, explored the role that preschoolers’ ability to delay gratification played in their later self-worth, self-esteem, and ability to cope with stress.


Five hundred and fifty preschoolers’ ability to delay gratification in Prof. Mischel’s Stanford studies between 1968 and 1974 was scored.

Each preschooler’s delay score was taken as the difference between the child’s individual delay time and the mean delay time of the experimental group the child had been assigned to.
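
A minimal sketch of this group-relative scoring follows. The sign convention (child's time minus group mean, so that positive scores mean waiting longer than the group average) is an assumption, and the data and the helper name `delay_scores` are made up for illustration.

```python
from statistics import mean

def delay_scores(delay_times_by_group):
    """Map each child to (own delay time minus the mean of the child's group).

    delay_times_by_group: {group_name: {child_id: minutes_waited}}.
    """
    scores = {}
    for group, times in delay_times_by_group.items():
        group_mean = mean(times.values())
        for child_id, minutes in times.items():
            scores[child_id] = minutes - group_mean
    return scores

# Made-up delay times in minutes, for illustration only.
example = {
    "treats_visible": {"c1": 2.0, "c2": 4.0},
    "treats_hidden": {"c3": 9.0, "c4": 13.0},
}
print(delay_scores(example))
```

Scoring relative to the group mean makes children from different experimental conditions comparable, since conditions such as hidden treats lengthen waiting times for everyone in that group.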

Between 1993 and 1995, 444 parents of the original preschoolers were mailed questionnaires for themselves and their now adult-aged children. One hundred and eighty-seven parents and 152 children returned them.

The questionnaires measured, through nine-point Likert-scale items, the children’s self-worth, self-esteem, and ability to cope with stress. The scores on these items were standardized to derive a positive functioning composite.
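
One common way to build such a composite is to convert each item to z-scores and average them per respondent. The sketch below assumes that approach, which may differ from the exact procedure used in the 2000 paper; the ratings and function names are invented for illustration.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize ratings to mean 0 and SD 1 (population SD; assumes SD > 0)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def composite(items):
    """Average the z-scored items for each respondent.

    items: {item_name: [rating for each respondent, in the same order]}.
    """
    standardized = [z_scores(ratings) for ratings in items.values()]
    return [mean(person) for person in zip(*standardized)]

# Made-up nine-point ratings for three respondents, for illustration only.
ratings = {
    "self_worth": [3, 5, 7],
    "self_esteem": [2, 5, 8],
    "coping_with_stress": [4, 5, 6],
}
scores = composite(ratings)
print(scores)
```

Standardizing first keeps an item with a wider spread of ratings from dominating the composite.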


The positive functioning composite, derived either from self-ratings or parental ratings, was found to correlate positively with delay of gratification scores.

Preschoolers who were better able to delay gratification were more likely to exhibit higher self-worth, higher self-esteem, and a greater ability to cope with stress during adulthood than preschoolers who were less able to delay gratification.

Delayed Gratification and Body Mass Index

In a 2013 paper, Tanya Schlam, a doctoral student at the University of Wisconsin, and colleagues, explored a possible association between preschoolers’ ability to delay gratification and their later Body Mass Index.


Prof. Mischel’s data were again used. Of the 653 individuals who had participated in his studies as preschoolers, the researchers sent mailers to all those for whom they had valid addresses (n = 306) in December 2002 / January 2003 and again in May 2004.

Of these, 146 individuals responded with their weight and height. Individual delay scores were derived as in the 2000 study.


Preschoolers’ ability to delay gratification accounted for a significant portion of the variance in BMI seen in the sample (p < 0.01, n = 146).

Specifically, each additional minute a preschooler delayed gratification predicted a 0.2-point reduction in BMI in adulthood.
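
Read as a linear association, the reported figure translates into predicted BMI differences as sketched below. This is only an arithmetic illustration of the reported slope, not a causal claim or the study's full regression model; the names are hypothetical.

```python
# Reported association: points of adult BMI per extra minute of preschool delay.
BMI_CHANGE_PER_MINUTE = -0.2

def predicted_bmi_difference(extra_minutes):
    """Predicted difference in adult BMI for a given difference in delay time."""
    return BMI_CHANGE_PER_MINUTE * extra_minutes

# A child who waited 5 minutes longer than another child.
print(predicted_bmi_difference(5))
```

On this reading, a five-minute difference in delay time corresponds to a predicted difference of about one BMI point in adulthood.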

Marshmallow Test Replication Study

In a 2018 paper, Tyler Watts, an assistant professor and postdoctoral researcher at New York University, Greg Duncan, a professor at UC Irvine, and Haonan Quan, a doctoral student at UC Irvine, set out to replicate longitudinal studies based on Prof. Mischel’s data.

Data on 918 individuals, from a longitudinal, multi-centre study on children by the National Institute of Child Health and Human Development (an institute in the NIH), were used for the study.


The sample was split into two groups –

  1. Data on children of mothers who had not completed college by the time their child was one month old (n = 552);
  2. Data on children of mothers who had completed college by that time (n = 366).

The first group (children of mothers without degrees) was more comparable to a nationally representative sample (from the Early Childhood Longitudinal Survey—Kindergarten by the National Center for Education Statistics). Even so, Hispanic children were underrepresented in the sample.


A variant of the marshmallow test was administered to children when they were 4.5 years old. An interviewer presented each child with treats based on the child’s own preferences.

Children were then told they would play the following game with the interviewer –

  1. The interviewer would leave the child alone with the treat;
  2. If the child waited 7 minutes, the interviewer would return, and the child would then be able to eat the treat plus an additional portion as a reward for waiting;
  3. If the child did not want to wait, they could ring a bell to signal the interviewer to return early, and the child would then be able to eat the treat without an additional portion.

Delay of gratification was recorded as the number of minutes the child waited.

Academic achievement was measured at grade 1 and age 15. Measures included mathematical problem solving, word recognition and vocabulary (only in grade 1), and textual passage comprehension (only at age 15). Scores were normalized to have a mean of 100 and a standard deviation of 15 points.

Behavioral functioning was measured at age 4.5, grade 1 and age 15. Mothers were asked to score their child’s depressive and anti-social behaviors on 3-point Likert-scale items.


For intra-group regression analyses, the following socio-economic variables, measured at or before age 4.5, were controlled for –

  1. Demographic characteristics like gender, race, birth weight, mother’s age at child’s birth, mother’s level of education, family income, mother’s score in a measure-of-intelligence test;
  2. Cognitive functioning characteristics like sensory-perceptual abilities, memory, problem solving, verbal communication skills; and
  3. Home environment characteristics known to support positive cognitive, emotional and behavioral functioning (the HOME inventory by Bradley & Caldwell, 1984).


[Figure: marshmallow test replication results]

  • Watts, Duncan and Quan (2018) did find statistically significant correlations between early-stage ability to delay gratification and later-stage academic achievement, but the association was weaker than that found by researchers using Prof. Mischel’s data.
  • In addition, the significance of these bivariate associations disappeared after controlling for socio-economic and cognitive variables.
  • There were no statistically significant associations, even without controlling for confounding variables, between early gratification delay and later behavioral functioning at age 15.
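
How a bivariate association can vanish once a shared background factor is controlled for can be illustrated with simulated data. The sketch below is not a reproduction of the 2018 analysis, which used many covariates and real data; it simulates a single background variable that drives both delay and achievement, and all names and parameters are assumptions.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def residuals(y, x):
    """Residuals of y after a simple least-squares regression on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]

rng = random.Random(0)  # fixed seed so the simulation is repeatable
n = 5000
background = [rng.gauss(0, 1) for _ in range(n)]          # simulated shared factor
delay = [s + rng.gauss(0, 1) for s in background]         # depends on background only
achievement = [s + rng.gauss(0, 1) for s in background]   # depends on background only

raw_r = pearson(delay, achievement)
adjusted_r = pearson(residuals(delay, background), residuals(achievement, background))
print(f"raw r = {raw_r:.2f}, r after controlling for background = {adjusted_r:.2f}")
```

In this simulation delay has no direct effect on achievement, yet the raw correlation is clearly positive; it shrinks to roughly zero once the background factor is regressed out of both variables.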


These results further complicated the relation between early delay ability and later life outcomes.

Prof. Mischel’s findings, from a small, non-representative cohort of mostly middle-class preschoolers at Stanford’s Bing Nursery School, were not replicated in a larger, more representative sample of preschool-aged children.

Increasing Delayed Gratification

The following factor has been found to increase a child’s gratification delay time –

Trust in rewarders:

Children who trust that they will be rewarded for waiting are significantly more likely to wait than those who don’t. Kidd, Palmeri and Aslin (2013), replicating Prof. Mischel’s marshmallow study, tested 28 four-year-olds twice.

In the first test, half of the children didn’t receive the treat they’d been promised. In the second test, the children who’d been tricked before were significantly less likely to delay gratification than those who hadn’t been tricked.

The following factors may increase an adult’s gratification delay time –

Knowledge of time-to-reward:

Individuals who know how long they must wait for an expected reward are more likely to continue waiting for said reward than those who don’t.

McGuire and Kable (2012) tested 40 adult participants. One group was given known reward times, while the other was not. The first group was significantly more likely to delay gratification.

Probability of the expected reward materialising:

When the individuals delaying their gratification are the same ones setting their reward (for example, someone going on a diet to achieve a desired weight), those who set realistic rewards are more likely to continue waiting for their reward than those who set unrealistic or improbable rewards.

Gelinas et al. (2013) studied the association between unrealistic weight loss expectations and weight gain before a weight-loss surgery in 219 adult participants.

The correlation coefficient r = 0.377 was statistically significant at p < 0.008 for male (n = 53) but not female (n = 166) participants.
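
As a plausibility check on the reported significance for male participants, the p-value of a correlation can be approximated with the Fisher z-transform. This is a normal approximation, not necessarily the test the authors used; the helper name is hypothetical.

```python
import math

def fisher_z_p_value(r, n):
    """Approximate two-sided p-value for a Pearson r via the Fisher z-transform."""
    z = math.atanh(r) * math.sqrt(n - 3)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return z, p

z, p = fisher_z_p_value(r=0.377, n=53)  # values reported for male participants
print(f"z = {z:.2f}, approximate p = {p:.4f}")
```

The approximate p-value falls below the reported threshold of 0.008, so the stated significance level is consistent with the reported r and n.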


References

Ayduk, O., Mendoza-Denton, R., Mischel, W., Downey, G., Peake, P. K., & Rodriguez, M. (2000). Regulating the interpersonal self: Strategic self-regulation for coping with rejection sensitivity. Journal of Personality and Social Psychology, 79 (5), 776.

Bradley, R. H., & Caldwell, B. M. (1984). The HOME Inventory and family demographics. Developmental Psychology, 20 (2), 315.

Gelinas, B. L., Delparte, C. A., Hart, R., & Wright, K. D. (2013). Unrealistic weight loss goals and expectations among bariatric surgery candidates: The impact on pre- and postsurgical weight outcomes. Bariatric Surgical Patient Care, 8 (1), 12-17.

Kidd, C., Palmeri, H., & Aslin, R. N. (2013). Rational snacking: Young children’s decision-making on the marshmallow task is moderated by beliefs about environmental reliability. Cognition, 126 (1), 109-114.

McGuire, J. T., & Kable, J. W. (2012). Decision makers calibrate behavioral persistence on the basis of time-interval experience. Cognition, 124 (2), 216-226.

Mischel, W., & Ebbesen, E. B. (1970). Attention in delay of gratification. Journal of Personality and Social Psychology, 16 (2), 329.

Mischel, W., Ebbesen, E. B., & Raskoff Zeiss, A. (1972). Cognitive and attentional mechanisms in delay of gratification. Journal of Personality and Social Psychology, 21 (2), 204.

Schlam, T. R., Wilson, N. L., Shoda, Y., Mischel, W., & Ayduk, O. (2013). Preschoolers’ delay of gratification predicts their body mass 30 years later. The Journal of Pediatrics, 162 (1), 90-93.

Shoda, Y., Mischel, W., & Peake, P. K. (1990). Predicting adolescent cognitive and self-regulatory competencies from preschool delay of gratification: Identifying diagnostic conditions. Developmental Psychology, 26 (6), 978.

Watts, T. W., Duncan, G. J., & Quan, H. (2018). Revisiting the marshmallow test: A conceptual replication investigating links between early delay of gratification and later outcomes. Psychological Science, 29 (7), 1159-1177.

Saul Mcleod, PhD


Angel E. Navidad
