9
$\begingroup$

It's folk psychology wisdom that it's easier to reward positive behavior than to punish negative behavior (see, e.g., any book on parenting or dog training), but is there any evidence in the cognitive science literature that this is indeed the case? And if so, have possible mechanisms for this phenomenon been proposed?

For some context, I'm asking this because coming from a computational perspective this doesn't really make sense. The typical reinforcement learning algorithm tries to maximize the expected discounted cumulative reward of behavior, and it makes no difference if that entails avoiding negative rewards or seeking out positive rewards.

$\endgroup$
7
  • $\begingroup$ I think, didactically speaking, positive reinforcement is better, especially in kids who don't really grasp negative reinforcement. However, in certain experimental lab conditions, negative reinforcement may work a lot faster and have longer-lasting effects :) I think it depends heavily on the context. $\endgroup$
    – AliceD
    Commented Jan 15, 2015 at 2:21
  • $\begingroup$ Seem to be two different questions: "Positive rewards learned faster?" and "Easier to reward positive behavior?" The first does not take into account difficulty. As for the first one, the negativity bias would seem to be against it. $\endgroup$
    – dwn
    Commented Jan 15, 2015 at 3:24
  • 1
    $\begingroup$ It is. As for why: People are not machines. If you are beaten, you will despise your teacher. This hinders learning. If you are smiled at, you will love your teacher. This makes you want to learn. Or in cogsci terms: you have a confounding variable that comes from the differing emotional value of positive and negative rewards. (This is a good example of why a computing background can hinder psychological theorizing. You need to let go of the human = computer metaphor and look at people from a human perspective.) $\endgroup$
    – user3116
    Commented Jan 15, 2015 at 8:28
  • 2
    $\begingroup$ @what This is precisely the sort of folk intuition that I'd like to back up with empirical evidence. Obviousness is a terrible replacement for true experimentation. $\endgroup$
    – zergylord
    Commented Jan 20, 2015 at 3:14
  • 2
    $\begingroup$ @what what is the obvious statement that doesn't need to be shown? That beating children is less effective at getting them to learn than smiling at them? Seems like a non-obvious statement to me, especially given the history of education. I've definitely heard the sentiment you've expressed before, and it does seem to be a central part of modern perspectives in education, at least in the west. But dismissing zergylord with a story or apples-and-oranges is not conducive to learning, nor is his caution unwarranted given how notorious educational-psych is for not doing well-controlled studies. $\endgroup$ Commented Jan 20, 2015 at 22:11

1 Answer

8
+50
$\begingroup$

This is a great question.

Short answer: No, the evidence does not suggest that positive reinforcement is universally more effective than negative reinforcement or punishment. However, there are still good reasons to focus on rewards over punishment in real-life training/learning situations.

Long answer:

The trouble for folk psychology began with Skinner's somewhat unfortunate choice of terminology... While Skinner advocated positive reinforcement over negative reinforcement or punishment, it seems to have remained an industry secret that what he meant by those terms is not what the general public thinks he meant.

The modern view of positive and negative reinforcement is that they are essentially synonyms. They are different ways of looking at the same thing, like describing a glass of water based on how full or how empty it is. Computationally, as you say, learning algorithms that assign positive values to targets or negative values to non-targets are mathematically equivalent.
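To make the computational half of that point concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the three-state MDP, the reward tables, the discount factor `GAMMA`, and the helper names (`shift_rewards`, `value_iteration`, `greedy_policy`). It runs value iteration twice, once with rewards that are all non-negative ("seek rewards") and once with the same table shifted down by a constant so that every entry is negative ("avoid punishments"), and then compares the resulting greedy policies.

```python
"""Sketch: shifting every reward by a constant leaves the greedy policy unchanged.

The MDP below is hypothetical and exists only for illustration.
"""

GAMMA = 0.9  # discount factor (assumed value)

# TRANSITIONS[state][action] is the next state (deterministic for simplicity).
TRANSITIONS = [
    [1, 2],  # from state 0
    [0, 2],  # from state 1
    [2, 0],  # from state 2
]

# REWARDS[state][action] in a "seek rewards" framing: every entry is >= 0.
REWARDS = [
    [0.0, 1.0],
    [0.0, 2.0],
    [0.5, 0.0],
]


def shift_rewards(rewards, offset):
    """Return a copy of the reward table with `offset` added to every entry."""
    return [[r + offset for r in row] for row in rewards]


def action_values(rewards, values, state):
    """One-step lookahead: the value of each action available in `state`."""
    return [
        rewards[state][a] + GAMMA * values[TRANSITIONS[state][a]]
        for a in range(len(rewards[state]))
    ]


def value_iteration(rewards, sweeps=1000):
    """Approximate the optimal state values by repeated Bellman backups."""
    values = [0.0] * len(rewards)
    for _ in range(sweeps):
        values = [max(action_values(rewards, values, s)) for s in range(len(rewards))]
    return values


def greedy_policy(rewards, values):
    """Pick the highest-valued action in each state."""
    policy = []
    for s in range(len(rewards)):
        q = action_values(rewards, values, s)
        policy.append(q.index(max(q)))
    return policy


if __name__ == "__main__":
    # "Avoid punishments" framing: the same table, shifted so every entry is < 0.
    punishments = shift_rewards(REWARDS, -10.0)

    values_reward = value_iteration(REWARDS)
    values_punish = value_iteration(punishments)

    print("policy (rewards):    ", greedy_policy(REWARDS, values_reward))
    print("policy (punishments):", greedy_policy(punishments, values_punish))
    print("value shift per state:",
          [round(p - r, 6) for r, p in zip(values_reward, values_punish)])
```

In this continuing (never-terminating) setup, adding a constant c to every reward shifts every state value by c / (1 - GAMMA) and leaves the ranking of actions untouched, so the two framings yield the same policy. The equivalence is not unconditional: in episodic tasks where the agent's actions influence when the episode ends, a constant shift can change the incentive to end the episode early, so the sketch should be read as an illustration under the stated assumptions rather than a general proof.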

Although they are often confused with positive and negative reinforcement, rewards and aversives are different terms with different meanings. Testing whether one is more effective than the other is tricky, as in practice they are usually qualitatively different. For example, is ice cream or spanking more effective for getting your kid to do their homework? The answer is: It depends - how much ice cream, how much spanking...? Surely we can find a ratio of ice cream to spanking at which they are equally effective. The type of research that examines this question is interested in determining where that boundary is (here is an example). Thought experiment: How can qualitatively different feedback mechanisms be applied in machine learning?

Research more likely to answer your question compares positive and negative feedback that is arguably qualitatively equivalent. For example, one can compare gaining money with avoiding the loss of money, or reducing risk with avoiding an increase in risk; it is even possible to make such comparisons in animals that are trained in a token economy. In recent years, the folk psychology idea that positive feedback is universally superior to negative feedback has been called into question by such research. A few examples:

Neurological reasons for the difference in effectiveness between positive and negative feedback have been proposed - for example by Eveline Crone in the study cited above.

Training using rewards is preferred by animal trainers, parents, and teachers in most practical situations:

  • There is usually a much wider range of undesirable behaviour to punish than desirable behaviour to reward. Animals and young children have difficulty determining what the desirable behaviour is with only clues about what behaviours are undesirable. Adults are easier to work with because you can explain the desirable behaviour in words.
  • Punishment must be applied to all undesirable behaviour to be effective, while reward need only be applied to desirable behaviour, and even then only intermittently, to be effective.
  • Due to classical conditioning, subjects may attribute punishment to factors unrelated to their behaviour, such as the trainer or the classroom. They may learn to avoid the trainer, or to avoid getting caught, rather than learning the desirable behaviour.
  • Notice how none of these points apply to machine learning, where the process is structured, constrained, and automated.

John Maag summarizes reasons to promote positive reinforcement in schools: namely, that teachers often find it convenient and effective (in the short term) to administer punishment, and rely on it too heavily in many situations where positive reinforcement would be more effective and desirable.

Nice video on the topic.

$\endgroup$
0
