Avoid Extinction by Not Withholding

One of the newer ideas being tossed around in the positive reinforcement training crowd is the mantra “Do not withhold”. By this they mean: do not withhold reinforcement when an animal makes an error or stops responding in training.

Background

When I use positive reinforcement to reward the behaviours I like, I naturally withhold the reward if the animal offers me something I don’t want. When I first learned how to train behaviours using shaping, this was what I was taught to do. In general, my dogs have taken this procedure in stride. Or at least I thought they did. I realize now that they often did become a little frustrated, showing this by barking or whining at me. But this was as far as it went.

That is, of course, until my most recent dog, Nimbus. Nimbus is teaching me that ignoring those big feelings causes big problems in living with, loving and teaching dogs.

Like Nimbus, many dogs are profoundly affected by this commonly used extinction procedure. For some dogs I’ve worked with, the choice to withhold reinforcement causes them to disengage from the training and seek something with a clearer path to reinforcement. Here I see behaviours like sniffing or running zoomies. I’ve also seen highly social breeds leave work to greet people at ringside. With other dogs, I’ve seen the choice to withhold make them frustrated and even angry, unable to think or learn. I’ve even seen this result in aggression toward the handler.

What can we do about it?

Conscientious dog trainers have looked to the work with other species for ideas. Many large mammals, such as horses and orcas, are dangerous to work with, especially if the animal has any big feelings about our training procedures. We found that with many other species, successful positive reinforcement trainers do not withhold reinforcement when an animal makes an error.

I know. Mind bender. Just let that sit for a minute.

Successful positive reinforcement trainers do not withhold reinforcement when an animal makes an error…

I don’t know about you, but this seems so counter-intuitive. Withholding after an ‘error’ has been foundational to how I have worked with my dogs. How does my learner know which behaviour is right if they get a reward for both correct and incorrect behaviours? Let’s look at a few strategies and theories to help us understand how using reinforcement instead of withholding it can help us move our training plan forward.

Differential Reinforcement

This procedure makes the most sense to me because it is the closest I get to withholding without actually withholding. In this procedure I have two different values of reward, for example, kibble and roasted chicken. When the dog performs the incorrect behaviour, I give kibble. When they perform the correct behaviour, I give roasted chicken. Over several repetitions, the dog learns that performing the correct behaviour has a better payoff.

Jackpots

While the science behind the effectiveness of jackpots is mixed, they are widely used in dog training to reinforce one behaviour more notably than another. Basically, this is similar to differential reinforcement, except that the value of the reward is increased not by changing the type of reward but by changing the amount, or by adding an additional reinforcer, such as verbal praise, into the procedure.

Reset Cookie

This procedure was a little harder to wrap my head around because, with this one, I use the same reinforcement the dog would have received for doing the behaviour correctly to reset the dog for the next repetition.

An easy way to set the dog up for a successful repetition is with a reset cookie. I can use the placement of that cookie to almost ensure that the next rep is what I want by making the correct choice very easy for the dog. The “successful loop” aspect cannot be forgotten. It is essential. If I can get the dog back into a pattern of successful loops, “the matching law” suggests that the behaviour earning the larger share of the reinforcement will be the one that occurs most often.

Focus on the ‘Big Picture’

Once an error has occurred, there is nothing I can do to fix that error. It has happened. It is in the past. I have already written in a previous post that Reinforcement BUILDS behaviour, so if I want to increase the likelihood of a different behaviour I have to use reinforcement. My quickest pathway to BUILDING behaviour after an error is to quickly set the animal back up for a successful loop. 

As long as I ensure that I don’t get too many errors compared to correct behaviours, the matching law suggests that the desired behaviour will still increase. In addition, I avoid the frustration behaviours and any other emotional baggage that come with extinction from withholding.
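
For anyone who likes to see the numbers, here is a rough sketch of the matching law in its simplest (strict) form, where the share of responses a behaviour gets is expected to match the share of reinforcement it earns. The function name and the treat counts are made up for illustration; they are not data from a real session, and real dogs will not match this neatly.

```python
def predicted_share(reinforcers_correct, reinforcers_error):
    """Strict matching law: B1 / (B1 + B2) = R1 / (R1 + R2).

    Returns the predicted share of responses that will be the
    correct behaviour, given how many reinforcers each behaviour earned.
    """
    total = reinforcers_correct + reinforcers_error
    if total == 0:
        raise ValueError("no reinforcement was delivered")
    return reinforcers_correct / total


# Hypothetical session: 18 treats followed correct reps,
# 2 treats were reset cookies given after errors.
share = predicted_share(18, 2)
print(f"Predicted share of correct responses: {share:.0%}")
```

The point of the sketch is only that a couple of reset cookies barely move the balance when most of the reinforcement is still landing on the behaviour I want.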

So, while using a reset cookie after an error is counter-intuitive on a single repetition, I find it is beneficial to the overall shift in behaviour I am looking for and helps my learner maintain an enjoyable experience throughout the session. My dogs are always keen to keep working as their job is very clear.

So next time you are designing a training plan, think about what you will do when your dog makes an error. How will you maintain their good feelings about the training session and training in general? Consider one of the above options.

Let me know your thoughts and how “not withholding” sits with you and your dogs.

Until next time, 

Love That Dog

Responses

  1. Again, another excellent article, Heather. Like Nimbus, Riley has taught me lots, and now I understand why she zoomed.
    Kibbles and high value treats work for her and she understands the difference.
    Thanks Heather and Riley 🐶