22
\$\begingroup\$

What is the need for an external watchdog timer for a microcontroller?

Most microcontrollers are designed with an internal watchdog timer. However, some circuits built around such microcontrollers (the PIC16F1824, for example) still use an external watchdog timer.

\$\endgroup\$

7 Answers

15
\$\begingroup\$

Some products must meet safety requirements, either set by the manufacturer or imposed by international safety standards such as IEC 60730-1, or the older UL1998, which is still in use in the US. The internal watchdog functionality in any given microcontroller may or may not be adequate for that purpose. An external WDT may be used in combination with the internal WDT in some cases.

Certain microcontrollers, such as TI's Hercules series, take safety-critical applications very seriously and are more likely to meet strict requirements; however, they may not be appropriate for cost-sensitive applications.

Typically the WDT is one of a number of ways to reduce the likelihood of a failure causing catastrophic damage to property, injury, or loss of life. Other measures, such as memory protection to detect unexpected access to MCU memory or program fetches from unused memory, are usually used in conjunction with a WDT.

Examples of inexpensive products that perform safety-critical functions are automotive subsystems, garage door controllers, and gas (natural gas or propane) ignition controllers used in furnaces, dryers and water heaters. Of course many medical and aerospace products are also safety-critical, but there may be sufficient room for redundancy and other approaches. In some cases there may be no easily reachable safe state, for example in an aircraft.

Ideally the watchdog timer is very simple and independent of the MCU (for example, it should have its own clock source and perhaps a clock monitor). Software must not be able to set it to a time-out longer than the time it would take for a fault to cause damage, whatever the software error. If it is not "petted" on time, it should render the system into a safe state, either on a simple time-out or in a windowed fashion so that petting which comes too frequently is also detected. For example, a WDT in a thermal control application might be set to a few seconds because no damage is possible if the microcontroller locks up for that length of time.
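To make the "cannot be set to a longer time" point concrete, here is a minimal sketch of a time-out watchdog modelled in plain C. Everything in it is an assumption for illustration: the `wdt_*` names, the struct, and the 3000 ms ceiling are hypothetical and do not correspond to any particular device or vendor API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed hard ceiling, fixed at design time from the time-to-damage. */
#define WDT_MAX_TIMEOUT_MS 3000u

/* Hypothetical software model of a simple time-out watchdog. */
typedef struct {
    uint32_t timeout_ms;  /* configured time-out, never above the ceiling */
    uint32_t elapsed_ms;  /* time since the last pet */
    bool     tripped;     /* latched once the safe state has been forced */
} wdt_t;

/* Configure the watchdog; requests of 0 or above the ceiling are clamped. */
void wdt_init(wdt_t *w, uint32_t requested_ms)
{
    assert(w != NULL);
    if (requested_ms == 0u || requested_ms > WDT_MAX_TIMEOUT_MS) {
        requested_ms = WDT_MAX_TIMEOUT_MS;
    }
    w->timeout_ms = requested_ms;
    w->elapsed_ms = 0u;
    w->tripped = false;
}

/* Pet the watchdog. A tripped watchdog stays tripped. */
void wdt_pet(wdt_t *w)
{
    assert(w != NULL);
    if (!w->tripped) {
        w->elapsed_ms = 0u;
    }
}

/* Advance time by dt_ms; trip when the time-out is reached. */
void wdt_tick(wdt_t *w, uint32_t dt_ms)
{
    assert(w != NULL);
    if (w->tripped) {
        return;
    }
    /* elapsed_ms < timeout_ms holds here, so the subtraction cannot wrap. */
    if (dt_ms >= w->timeout_ms - w->elapsed_ms) {
        w->tripped = true;
    } else {
        w->elapsed_ms += dt_ms;
    }
}
```

In this model a buggy call such as `wdt_init(&w, 60000u)` still ends up with a 3000 ms time-out, and once `tripped` is set no later pet can clear it, which mirrors the idea that only the safe-state path (not the faulty software) gets to decide what happens next.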

The WDT is most useful as a part of a system-level approach to reliability and safety.

\$\endgroup\$
35
\$\begingroup\$

A watchdog timer can guard against hardware bugs in a buggy piece of ... cutting-edge microcontroller. One that we recently used, from a famous brand, had I/O pins that occasionally missed their interrupts; it sometimes did not start up correctly, and its integrated watchdog sometimes failed to reset the system in a known-good state.

This did not show up until we started with long time reliability testing, and it was easier to add an external watchdog than to change the microcontroller.

If you have more than one IC on the PCB you may also need an external reset-IC or voltage monitor to make everything boot up reliably. Many of these can also serve as a watchdog.

\$\endgroup\$
5
  • 14
    \$\begingroup\$ "integrated watchdog sometimes failed to reset the system in a known-good state" - this is functionally equivalent to "there is no internal watchdog". \$\endgroup\$ Commented Nov 16, 2018 at 12:18
  • \$\begingroup\$ @DmitryGrigoryev -- "there is no internal watchdog" ... Yes, but do you then disable the internal watchdog? I'm proposing that a malfunctioning internal watchdog may be worse than turning the internal watchdog off (and replacing it with an external watchdog) because of the need to limit the exponential growth of complexity. To reiterate, do we keep the internal watchdog on because it might be helpful, or keep it off because it is unreliable? \$\endgroup\$ Commented Jun 20, 2023 at 19:31
  • 1
    \$\begingroup\$ @MicroservicesOnDDD In a commercial product with safety requirements, an unreliable watchdog would be enough of a reason to pick a different MCU. In a toy-grade product you can of course live with a buggy watchdog, or without a watchdog altogether for that matter. \$\endgroup\$ Commented Jun 22, 2023 at 8:59
  • \$\begingroup\$ @DmitryGrigoryev -- Sorry that I was not more specific. This answer by 'pipe' said, "This did not show up until we started with long time reliability testing, and it was easier to add an external watchdog than to change the microcontroller." So, in that case, your company has already chosen not to "pick a different MCU". Therefore, in this situation, do we keep the internal watchdog on because it might be helpful, or keep it off because it is unreliable? (All other things being out of our control.) \$\endgroup\$ Commented Jun 22, 2023 at 12:46
  • 1
    \$\begingroup\$ @MicroservicesOnDDD I missed that bold part, sorry. I would disable the internal watchdog if an external one is added. Keeping it enabled is a pure liability: you need to maintain the watchdog software functions, test them, etc. but you still can't rely on them. \$\endgroup\$ Commented Jun 22, 2023 at 16:15
35
\$\begingroup\$

It is hard to argue that the internal clock of the internal watchdog is actually independent of all the other clocks and always running like it should.

So for certification it is usually much easier to place an external watchdog on the board and say: look, there is our watchdog. It must be triggered by the MCU at that interval, which is shorter than our time to failure, so our device is safe as we defined it.


To address some of the comments:

"and always running like it should" - Good point. It may be harder to prove that your software correctly initializes the internal watchdog under all circumstances than just employing a watchdog chip and refer to its datasheet.

This is usually proven by a fault insertion test, which you present to the certification body. You show them the code where your initialization happens and where the triggering of the watchdog happens. They usually ask you to modify the code so that the triggering of the watchdog stops after a certain time has elapsed, and then check whether the controller is reset correctly.
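As a rough illustration of what such a fault insertion looks like, here is a small host-side simulation in C. It is only a sketch under assumptions: `sim_wdt_t`, `run_fault_insertion`, and the 5-tick time-out are made up for this example, and on real hardware the "watchdog" would of course be the chip, not a struct.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed: the simulated watchdog fires after 5 ticks without a kick. */
#define WDT_TIMEOUT_TICKS 5u

/* Hypothetical stand-in for the watchdog hardware. */
typedef struct {
    uint32_t since_kick;   /* ticks since the last kick */
    bool     reset_fired;  /* set once the watchdog has reset the system */
} sim_wdt_t;

static void sim_wdt_kick(sim_wdt_t *w)
{
    w->since_kick = 0u;
}

static void sim_wdt_tick(sim_wdt_t *w)
{
    w->since_kick++;
    if (w->since_kick >= WDT_TIMEOUT_TICKS) {
        w->reset_fired = true;
    }
}

/*
 * Run the "main loop" for total_ticks iterations. The inserted fault stops
 * all kicking once more than fault_after ticks have passed (0 = no fault).
 * Returns the tick at which the watchdog reset fired, or -1 if it never did.
 */
int32_t run_fault_insertion(uint32_t total_ticks, uint32_t fault_after)
{
    sim_wdt_t w = { 0u, false };

    for (uint32_t t = 1u; t <= total_ticks; ++t) {
        bool fault_active = (fault_after != 0u) && (t > fault_after);

        if (!fault_active) {
            sim_wdt_kick(&w);   /* normal operation: kick every iteration */
        }
        sim_wdt_tick(&w);

        if (w.reset_fired) {
            return (int32_t)t;
        }
    }
    return -1;
}
```

With no fault inserted the function returns -1 (no reset). With `fault_after = 10` the last kick happens in tick 10 and the reset fires in tick 14, which is the kind of "stop kicking, observe the reset within the expected time" evidence the test is meant to produce.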

Or to prove that your code doesn't contain a bug that accidentally disables the internal watchdog.

At least on some controllers the watchdog is called independent: it has its own clock source and cannot be disabled by software, so only a reset of the controller will disable the watchdog. At least in theory. It's easy to show that you cannot stop it by software, but hard to prove that the clock is truly independent and will not stop under EMI.

Or to prove that your code doesn't run wild continuously resetting the external watchdog as fast as it can. Problem solved. ;-)

For that case you use a window watchdog, which has to be triggered within a certain interval; if you fail to do so (trigger it too often or too rarely), it will reset the circuit. The STM32 parts I'm working with have an internal window watchdog, but it runs from PCLK1, which is derived from the main clock, so I don't think it is as useful as an external watchdog with its own clock source.
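The window behaviour can be sketched as a small software model in C. This is not the STM32 WWDG register interface or any vendor API; the `wwdt_*` names and the open/close times are hypothetical, chosen only to show the "too early or too late both reset" rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a window watchdog. */
typedef struct {
    uint32_t window_open_ms;   /* a kick before this time is too early */
    uint32_t window_close_ms;  /* no kick by this time is too late */
    uint32_t elapsed_ms;       /* time since the last accepted kick */
    bool     reset_requested;  /* latched once a violation is seen */
} window_wdt_t;

void wwdt_init(window_wdt_t *w, uint32_t open_ms, uint32_t close_ms)
{
    assert(w != NULL);
    assert(open_ms < close_ms);
    w->window_open_ms = open_ms;
    w->window_close_ms = close_ms;
    w->elapsed_ms = 0u;
    w->reset_requested = false;
}

/* Advance time; running past the end of the window requests a reset. */
void wwdt_tick(window_wdt_t *w, uint32_t dt_ms)
{
    assert(w != NULL);
    if (w->reset_requested) {
        return;
    }
    /* elapsed_ms < window_close_ms holds here, so this cannot wrap. */
    if (dt_ms >= w->window_close_ms - w->elapsed_ms) {
        w->reset_requested = true;   /* too late */
    } else {
        w->elapsed_ms += dt_ms;
    }
}

/* Kick the watchdog; a kick before the window opens requests a reset. */
void wwdt_kick(window_wdt_t *w)
{
    assert(w != NULL);
    if (w->reset_requested) {
        return;
    }
    if (w->elapsed_ms < w->window_open_ms) {
        w->reset_requested = true;   /* too early: code may be running wild */
    } else {
        w->elapsed_ms = 0u;
    }
}
```

For a 20 ms to 50 ms window, a kick at 30 ms is accepted, a second kick only 10 ms later is treated as a fault, and so is silence for 50 ms. Code that runs wild and hammers the kick line therefore gets caught just like code that hangs.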

Or that some genius doesn't put the watchdog service routine inside a timer ISR, so the main code can crash but the interrupt keeps firing & servicing the watchdog perfectly...

That certainly is true, but hopefully a review will put that genius back in his chair. But hey, when I started out, that was my first idea as well :D. During the certification processes I've been part of, they always had a look at the watchdog part of the software.
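One common way to avoid the "kick from a timer ISR" trap is to have every important task check in and to kick the watchdog only when all of them have done so. The sketch below assumes three made-up tasks and a stand-in `hw_watchdog_kick()` that just counts calls; none of these names come from a real SDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical task identifiers, one bit per task that must stay alive. */
#define TASK_SENSOR   (1u << 0)
#define TASK_CONTROL  (1u << 1)
#define TASK_COMMS    (1u << 2)
#define ALL_TASKS     (TASK_SENSOR | TASK_CONTROL | TASK_COMMS)

static volatile uint32_t alive_flags;  /* bits set by tasks as they run */
static uint32_t kick_count;            /* how often the watchdog was kicked */

/* Stand-in for toggling the real watchdog pin or writing its register. */
static void hw_watchdog_kick(void)
{
    kick_count++;
}

/* Called by each task after it has completed a full, healthy iteration. */
void task_checkin(uint32_t task_bit)
{
    alive_flags |= task_bit;
}

/*
 * Called from the main loop, never from a free-running timer ISR.
 * Kicks the watchdog only if every task has checked in since the last kick.
 */
bool watchdog_service(void)
{
    if ((alive_flags & ALL_TASKS) != ALL_TASKS) {
        return false;            /* someone is stuck: let the watchdog expire */
    }
    alive_flags = 0u;            /* every task must prove itself again */
    hw_watchdog_kick();
    return true;
}

uint32_t watchdog_kick_count(void)
{
    return kick_count;
}
```

If the comms task hangs, `watchdog_service()` stops kicking even though interrupts and the other tasks keep running. On a real target with preemption, the read-modify-write on `alive_flags` would need to be made atomic; that detail is left out here.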

\$\endgroup\$
6
  • 4
    \$\begingroup\$ "and always running like it should" - Good point. It may be harder to prove that your software correctly initializes the internal watchdog under all circumstances than just employing a watchdog chip and refer to its datasheet. \$\endgroup\$
    – JimmyB
    Commented Nov 15, 2018 at 13:20
  • 4
    \$\begingroup\$ @JimmyB Or to prove that your code doesn't contain a bug that accidentally disables the internal watchdog. \$\endgroup\$
    – TripeHound
    Commented Nov 15, 2018 at 16:57
  • 2
    \$\begingroup\$ @TripeHound Or to prove that your code doesn't run wild continuously resetting the external watchdog as fast as it can. Problem solved. ;-) \$\endgroup\$
    – JimmyB
    Commented Nov 15, 2018 at 17:15
  • 2
    \$\begingroup\$ Or that some genius doesn't put the watchdog service routine inside a timer ISR, so the main code can crash but the interrupt keeps firing & servicing the watchdog perfectly... \$\endgroup\$
    – John U
    Commented Nov 15, 2018 at 18:00
  • \$\begingroup\$ @JohnU, that, however, would not seem like something an external IC would help with. \$\endgroup\$
    – ilkkachu
    Commented Nov 15, 2018 at 21:04
12
\$\begingroup\$

The watchdogs built into microcontrollers have particular properties that mean they themselves can fail in ways that a different external watchdog might not.

For example, a common design is to use a watchdog timer running from a low power RC oscillator. That oscillator can fail. An external watchdog based on capacitor discharge rather than an oscillator could still reset the microcontroller in many cases.

Another reason is that the external watchdog can be more robust. A microcontroller might only operate reliably over a certain voltage range, and being a complex device may be subject to latching up in a way that makes its own internal watchdog ineffective. An external watchdog may have a wider acceptable supply range and be less prone to problems when subjected to electrical noise.

External watchdogs often offer a much wider range of time-out values too. A microcontroller I use often, the XMEGA, has a maximum time-out of around 7 seconds. For one product I added an additional external watchdog with a time-out of 2 hours. That allowed me to wake the microcontroller once an hour rather than once every few seconds, reducing power consumption in a battery-powered device.
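The saving is easy to put numbers on. The helper names below are made up for this back-of-the-envelope sketch, and it assumes the device wakes purely to service the watchdog at a fixed interval.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECONDS_PER_DAY 86400u

/* A wake interval only works if it is shorter than the watchdog time-out. */
bool wake_interval_is_safe(uint32_t wake_interval_s, uint32_t wdt_timeout_s)
{
    return wake_interval_s > 0u && wake_interval_s < wdt_timeout_s;
}

/* Whole wake-ups per day when waking every wake_interval_s seconds. */
uint32_t wakeups_per_day(uint32_t wake_interval_s)
{
    if (wake_interval_s == 0u) {
        return 0u;  /* guard against division by zero */
    }
    return SECONDS_PER_DAY / wake_interval_s;
}
```

Waking every 7 seconds gives `wakeups_per_day(7)` = 12342 wake-ups a day, and in practice more, since the wake interval has to stay below the time-out. Waking hourly against a 2-hour time-out gives `wakeups_per_day(3600)` = 24, roughly 500 times fewer.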

External watchdogs sometimes have multiple functions, such as a timer and a voltage monitor/reset control. Again, these can be lower power than a microcontroller's built-in system too.

One other interesting advantage of an external watchdog is that it can be used to reset devices other than the microcontroller. For example, it might control the enable pin of a voltage regulator, de-powering an entire circuit to reset multiple devices at once. Using some simple logic, the watchdog servicing signals from multiple sources can be combined, so that the watchdog requires several devices to keep resetting it continually.

\$\endgroup\$
2
  • \$\begingroup\$ Based on my experience of external wdogs, I'd say they are far less robust than internal ones, simply because another external circuit is another thing that can break. Soldering, EMI, ESD and so on. If you manage to short the clock input to something, then the external wdog is effectively disabled. So I don't really buy the increased safety aspect. If you have both at once, then sure, you'll increase safety ever so slightly. \$\endgroup\$
    – Lundin
    Commented Nov 19, 2018 at 11:59
  • \$\begingroup\$ @Lundin most of them use an internal RC oscillator or capacitor discharge, rather than an external clock. In fact I can't think of any off hand that use an external clock. Also failures are not additive, if you have an external and internal watchdog and one fails that's clearly still better than having just one. \$\endgroup\$
    – user
    Commented Nov 19, 2018 at 15:02
3
\$\begingroup\$

Certain certifications, such as UL, may require protection from two points of failure. An external watchdog timer would be considered protection from a first point of failure, the microcontroller.

\$\endgroup\$
2
\$\begingroup\$

A watchdog is really no different in this regard from any other built-in peripheral you find in an MCU. MCUs come with timers, RTCs, ADCs, EEPROM and reset controllers, yet all these functions also exist as separate ICs. If the available built-in blocks don't meet your requirements, you have to use external ones. Or you can try to find an MCU with all the right blocks, which may not exist, may be too expensive, or may be hard to port your code to.

\$\endgroup\$
1
\$\begingroup\$

A watchdog is a timer that activates its output when it has not received an input pulse for a set period.

It is a building block that can be used in any application, for example to change data routing in a fail-safe mode. When the microcontroller clock fails, the microcontroller itself can no longer disable critical outputs.

An external watchdog does not depend on the complex clock domains of the microcontroller; some have their own analog RC charge timing or their own internal clock.

Under radiation, digital circuits may trigger their outputs when charge hits their flip-flops. Some analog circuits are safer because they integrate charge in a capacitor, so a single charge event has little effect on the timing.

\$\endgroup\$
