Recent replication studies indicate the classic marshmallow test is fundamentally flawed because children's performance depends on trust and social context [1].
For decades, the Stanford University study was viewed as a definitive measure of a child's innate self-discipline and a predictor of future success. However, these new findings suggest that the test may actually measure a child's perception of their environment rather than their internal willpower [1, 2].
In the original experiment, children were asked to wait 10 to 15 minutes [1] to receive a second treat. New research conducted in U.S. university labs shows that behavior changes significantly based on the reliability of the adult in the room. When children were told that the experimenter would keep their promise, approximately 70% of them waited [2].
"The marshmallow test, long held up as a litmus test for self‑control, is more a measure of trust than willpower," the Atlantic editorial team said [1].
Social dynamics also play a critical role in the results. When a second child was added to the room, the waiting rate dropped from roughly 70% to about 30% [3]. This shift suggests that the presence of peers alters how children decide whether to wait.
"Adding a second child to the room changes the dynamics entirely, calling into question the test’s individual‑focused conclusions," Megan R. said [3].
These findings contradict the long-held belief that passing the test reflects a child's inherent self-discipline. Instead, researchers found that children were far more likely to wait when they were assured the experimenter would keep the promise [2].
This reinterpretation of the marshmallow test shifts the focus of developmental psychology from innate personality traits to environmental factors. If a child's ability to delay gratification depends on trust and social stability, then "self-control" is less a fixed internal asset than a response to a reliable environment. That would undermine the use of the test as a tool for predicting long-term life achievement.


