Nobody likes to make a mistake — but a recent study suggests there may be a bright side to failure, so long as we want there to be one.

Researchers from the University of Southern California began by noting that in past decades, significant advances have been made toward understanding the computational and neural bases of reward-based learning and decision making. Reward-based learning is exactly what it sounds like: the brain feels rewarded for reaching the right answer. There’s also avoidance learning, in which the brain is punished so it doesn’t make the same mistake twice.

Since less is known about the underlying mechanisms of punishment-based learning, the researchers recruited 28 volunteers to undergo fMRI brain scans as they performed instrumental learning tasks that involved learning how to maximize rewards and minimize punishments; their answers either earned or lost them money. This way, researchers could observe participants’ neural activity in the brain’s valuation system and see whether it encoded information as relative (comparative) or absolute (non-comparative) value.

The tasks at hand prompted a person’s brain to respond to getting the wrong answer with both punishment- and reward-based thinking. Afterwards, participants completed a post-learning assessment that gave them the chance to reflect on and understand their mistakes. The results showed that performance in punishment avoidance matched performance in reward seeking, a result that couldn’t be explained by absolute value learning. The post-learning assessment, in turn, displayed “significant biases” that could be explained by relative value learning. For the most part, participants responded positively, thus activating their brain's reward circuit.

"We show that, in certain circumstances, when we get enough information to contextualize the choices, then our brain essentially reaches towards the reinforcement mechanism, instead of turning toward avoidance," Giorgio Coricelli, a USC-Dornsife associate professor of economics and psychology, said in a press release. Put another way: that time to reflect on their mistakes is what helped participants focus more on reward, less on punishment.

Punishment avoidance is computationally challenging, Coricelli and his team noted; “how can the instrumental response (avoid a punishment) be maintained despite the absence of further extrinsic reinforcement (punishment)?” Absolute value learning, it turns out, just isn’t capable of coping with this problem.
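To make that problem concrete, here is a minimal sketch in Python. It is not the model from the study: the function names, the learning rate, and the fixed context value of -0.5 are all assumptions chosen for illustration. It compares a learner that tracks the raw outcome of a choice with one that first measures the outcome against how good or bad the situation usually is.

```python
def absolute_update(q, outcome, alpha):
    """Absolute value learning: move the option's value toward the raw outcome."""
    return q + alpha * (outcome - q)


def relative_update(q, outcome, context_value, alpha):
    """Relative value learning: re-center the outcome on the context's value first."""
    return q + alpha * ((outcome - context_value) - q)


def simulate_avoidance(n_trials=10, context_value=-0.5, alpha=0.3):
    """Hypothetical punishment context in which the choice always avoids the loss.

    context_value is an assumed, fixed estimate of the typical outcome in this
    context (a loss); a full model would have to learn it from experience.
    """
    avoided = 0.0  # successful avoidance: nothing is lost, nothing is gained
    q_abs = 0.0
    q_rel = 0.0
    for _ in range(n_trials):
        q_abs = absolute_update(q_abs, avoided, alpha)
        q_rel = relative_update(q_rel, avoided, context_value, alpha)
    return q_abs, q_rel


if __name__ == "__main__":
    q_abs, q_rel = simulate_avoidance()
    print("absolute learner:", q_abs)
    print("relative learner:", q_rel)
```

Under these assumptions, the absolute learner's estimate never moves from zero, because a successfully avoided punishment delivers no outcome to learn from. The relative learner's estimate climbs toward +0.5, because "nothing happened" counts as a good result in a context where losses are the norm.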

In the brain, “value updating” is originally implemented by two different systems for reward and punishment: the ventral striatum and the anterior insula, respectively. As a result of their positive reflections, participants were able to suppress avoidance and instead mimic the brain's reward-based learning response. Which is all to say that if we make a mistake, it's worthwhile to take the time to understand what happened. It’s not unlike working through feelings of regret.

"With regret, for instance, if you have done something wrong, then you might change your behavior in the future," Coricelli said.

Source: Palminteri S. et al. Contextual modulation of value signals in reward and punishment learning. Nature Communications. 2015.