Evidence that positive rewards are learnt faster than negative rewards?

Question

It's folk psychology wisdom that it's easier to reward positive behavior than to punish negative behavior (see, e.g., any book on parenting or dog training), but is there any evidence in the cognitive science literature that this is indeed the case? And if so, have possible mechanisms for this phenomenon been proposed?

For some context, I'm asking this because coming from a computational perspective this doesn't really make sense. The typical reinforcement learning algorithm tries to maximize the expected discounted cumulative reward of behavior, and it makes no difference if that entails avoiding negative rewards or seeking out positive rewards.

I think, didactically speaking, positive reinforcement is better, especially for kids who don't really grasp negative reinforcement. However, in certain experimental lab conditions, negative reinforcement may work a lot faster and have longer-lasting effects :) I think it depends heavily on the context. — AliceD, Commented Jan 15, 2015 at 2:21
Seem to be two different questions: "Positive rewards learned faster?" and "Easier to reward positive behavior?" The first does not take into account difficulty. As for the first one, the negativity bias would seem to be against it. — dwn, Commented Jan 15, 2015 at 3:24
It is. As for why: People are not machines. If you are beaten, you will despise your teacher. This hinders learning. If you are smiled at, you will love your teacher. This makes you want to learn. Or in cogsci terms: you have a confounding variable that comes from the differing emotional value of positive and negative rewards. (This is a good example for why a computing background can hinder psychological theorizing. You need to let go of the human = computer metaphor and look at people from a human perspective.) — user3116, Commented Jan 15, 2015 at 8:28
@what This is precisely the sort of folk intuition that I'd like to back up with empirical evidence. Obviousness is a terrible replacement for true experimentation. — zergylord, Commented Jan 20, 2015 at 3:14
@what What is the obvious statement that doesn't need to be shown? That beating children is less effective at getting them to learn than smiling at them? Seems like a non-obvious statement to me, especially given the history of education. I've definitely heard the sentiment you've expressed before, and it does seem to be a central part of modern perspectives in education, at least in the West. But dismissing zergylord with a story or apples-and-oranges is not conducive to learning, nor is his caution unwarranted given how notorious educational psychology is for not doing well-controlled studies. — Artem Kaznatcheev, Commented Jan 20, 2015 at 22:11

Arnon Weinberg · Accepted Answer · 2015-02-01 04:45:54Z

This is a great question.

Short answer: No, the evidence does not suggest that positive reinforcement is universally more effective than negative reinforcement or punishment. However, there are still good reasons to focus on rewards over punishment in real-life training/learning situations.

Long answer:

The trouble for folk psychology began with Skinner's somewhat unfortunate choice of terminology... While Skinner advocated positive reinforcement over negative reinforcement or punishment, it seems to have remained an industry secret that what he meant by those terms is not what the general public thinks he meant.

The modern view of positive and negative reinforcement is that they are essentially synonyms. They are different ways of looking at the same thing, like describing a glass of water based on how full or how empty it is. Computationally, as you say, learning algorithms that assign positive values to targets or negative values to non-targets are mathematically equivalent.
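As a minimal sketch of that computational point, the snippet below runs value iteration on a tiny, made-up 3-state MDP under two framings: "+1 for being at the target, 0 otherwise" and "0 for being at the target, -1 otherwise". The MDP, the discount factor of 0.9, and all names are assumptions chosen for illustration, not taken from any study cited here.

```python
# Hypothetical continuing (never-terminating) MDP with deterministic transitions.
# NEXT_STATE[s][a] is the state reached by taking action a in state s.
GAMMA = 0.9  # assumed discount factor

NEXT_STATE = [
    [0, 1],  # state 0: action 0 stays, action 1 moves to state 1
    [0, 2],  # state 1: action 0 moves back, action 1 moves to the target
    [2, 0],  # state 2 (target): action 0 stays, action 1 leaves
]

# "Positive" framing: +1 for staying at the target, 0 for everything else.
REWARD_POSITIVE = [
    [0.0, 0.0],
    [0.0, 0.0],
    [1.0, 0.0],
]


def shift_rewards(rewards, offset):
    """Add the same constant to every reward."""
    return [[r + offset for r in row] for row in rewards]


def q_values(next_state, rewards, gamma, values, s):
    """One-step lookahead values of each action in state s."""
    return [
        rewards[s][a] + gamma * values[next_state[s][a]]
        for a in range(len(next_state[s]))
    ]


def value_iteration(next_state, rewards, gamma, sweeps=500):
    """Repeatedly apply the Bellman optimality backup to every state."""
    values = [0.0] * len(next_state)
    for _ in range(sweeps):
        values = [
            max(q_values(next_state, rewards, gamma, values, s))
            for s in range(len(next_state))
        ]
    return values


def greedy_policy(next_state, rewards, gamma, values):
    """Pick the action with the highest one-step lookahead value in each state."""
    policy = []
    for s in range(len(next_state)):
        q = q_values(next_state, rewards, gamma, values, s)
        policy.append(q.index(max(q)))
    return policy


# "Negative" framing: 0 for staying at the target, -1 for everything else.
REWARD_NEGATIVE = shift_rewards(REWARD_POSITIVE, -1.0)

values_pos = value_iteration(NEXT_STATE, REWARD_POSITIVE, GAMMA)
values_neg = value_iteration(NEXT_STATE, REWARD_NEGATIVE, GAMMA)
policy_pos = greedy_policy(NEXT_STATE, REWARD_POSITIVE, GAMMA, values_pos)
policy_neg = greedy_policy(NEXT_STATE, REWARD_NEGATIVE, GAMMA, values_neg)

print(policy_pos, policy_neg)
```

Both framings should yield the same greedy policy, with every state value differing by the same constant, 1 / (1 - GAMMA). Note that this holds for a continuing, discounted task like the one sketched here; in episodic tasks where the agent can end the episode early, shifting all rewards by a constant can change which behaviour is optimal.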

Although they are often confused with positive and negative reinforcement, rewards and aversives are different terms with different meanings. Testing whether one is more effective than the other is tricky, as in practice they are usually qualitatively different. For example, is ice cream or spanking more effective for getting your kid to do their homework? The answer is: It depends - how much ice cream, how much spanking...? Surely we can find a ratio of ice cream to spanking at which they are equally effective. The type of research that examines this question is interested in determining where that boundary is (here is an example). Thought experiment: How can qualitatively different feedback mechanisms be applied in machine learning?

Research more likely to answer your question compares positive and negative feedback that is arguably qualitatively equivalent. For example, one can compare gaining money with avoiding the loss of money, or reducing risk with avoiding an increase in risk; it is even possible to compare these effects in animals trained in a token economy. In recent years, such research has called into question the folk psychology idea that positive feedback is universally superior to negative feedback. A few examples:

Michael Perone reviews research demonstrating undesirable effects of positive reinforcement.
Comparing "well done!" to "got it wrong this time", Eveline Crone et al. found that before the age of twelve, children perform better with positive feedback, but older children and adults do better with negative feedback.
A study comparing "keep the word Good on the screen" to "keep the word Bad off the screen" found that negative reinforcement may be more effective for some types of learning.
Ayelet Fishbach and Stacey Finkelstein review research that suggests that negative feedback is more effective for experts, and motivates goal pursuit when it signals insufficient progress.
Lots of research into Loss Aversion suggests that avoiding the loss of money is a more powerful motivator than gaining an equivalent amount of money. However, this phenomenon may not translate to an effect on learning.
Despite all this, note that the evidence in favour of positive reinforcement is much greater - these are just some notable exceptions.

Neurological reasons for the difference in effectiveness between positive and negative feedback have been proposed - for example by Eveline Crone in the study cited above.

Training using rewards is preferred for animal trainers, parents, and teachers in most practical situations:

There is usually a much wider range of undesirable behaviour to punish than desirable behaviour to reward. Animals and young children have difficulty determining what the desirable behaviour is with only clues about what behaviours are undesirable. Adults are easier to work with because you can explain the desirable behaviour in words.
Punishment must be applied to all undesirable behaviour to be effective, while reward need only be applied to desirable behaviour, and even then only intermittently, to be effective.
Due to classical conditioning, subjects may attribute punishment to factors unrelated to their behaviour, such as the trainer or the classroom. They may learn to avoid the trainer, or avoid getting caught, rather than the desirable behaviour.
Notice how none of these points apply to machine learning, where the process is structured, constrained, and automated.

John Maag summarizes reasons to promote positive reinforcement in schools: namely, that teachers often find it convenient and effective (in the short term) to administer punishment, and are overly reliant on it in many situations where positive reinforcement would be more effective and desirable.

Nice video on the topic.
