
We know that opening /dev/watchdog activates the watchdog, and that writing a character to it within a minute resets its timer. The instructions are here.
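To make the behaviour described above concrete, here is a minimal sketch of a userspace program that opens the device and keeps writing to it. The function name `pet_watchdog` and its parameters are hypothetical, chosen only for illustration. The final `V` write assumes the driver supports the "magic close" convention of the Linux watchdog API; if it does not, or if the driver was built with nowayout, the watchdog keeps running after the file is closed.

```python
import os
import time


def pet_watchdog(path="/dev/watchdog", interval_s=10.0, count=3, sleep=time.sleep):
    """Open the watchdog device and write to it `count` times.

    Opening the device starts the timer; each write resets it, so
    `interval_s` must be shorter than the watchdog timeout.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        for i in range(count):
            os.write(fd, b"\0")  # any byte resets the countdown
            if i < count - 1:
                sleep(interval_s)
        # Assumption: the driver honours the "magic close" character,
        # which asks it to stop the watchdog when the file is closed.
        os.write(fd, b"V")
    finally:
        os.close(fd)
```

If the process dies or the system hangs between two writes, nothing resets the timer and the hardware reboots the board once the timeout expires. That only holds if some program actually opened the device in the first place.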

The AM335x processor used on the BBB enables its internal watchdog by default. But when U-Boot or Ubuntu starts, this watchdog is disabled, and only after the OS has booted can /dev/watchdog be used.

I want to ensure that the watchdog works even when U-Boot or the kernel can't boot. How can this be done?

  • The kernel and U-Boot should not disable the watchdog timer.
  • The default watchdog timeout should be more than a minute before U-Boot starts the kernel, so that the OS can boot up completely.

I should mention that changing parts of the U-Boot or Linux kernel code is acceptable, but an external watchdog is not an option.

  • From what I've experienced, when the OS stops (a kernel oops) while booting, the watchdog won't reset the system. So how is it enabled? Maybe you mean that it is enabled for the system to use through /dev/watchdog? – hmojtaba Sep 08 '16 at 16:31
  • The watchdog is enabled in U-Boot, see include/configs/ti_am335x_common.h – Tom Rini Sep 08 '16 at 16:29

1 Answer


After investigating, I found that the Linux kernel does not disable the watchdog on boot; it uses a timer to reset the watchdog. When the kernel oopses or panics, it is still resetting the watchdog, so the watchdog timer never overflows and the system does not restart.

Following this answer to this question, from man proc:

/proc/sys/kernel/panic

This file gives read/write access to the kernel variable panic_timeout. If this is zero, the kernel will loop on a panic; if nonzero, it indicates that the kernel should autoreboot after this number of seconds.

So a nonzero value should be written to this file. According to this answer, to set /proc/sys/kernel/panic we should modify /etc/sysctl.conf and add the parameter kernel.panic = 3, which makes the kernel wait 3 seconds before restarting after a panic.

But that did not fix my problem. While investigating other panic-related issues, I found this in man proc:

/proc/sys/kernel/panic_on_oops (since Linux 2.5.68)

This file controls the kernel's behavior when an oops or BUG is encountered. If this file contains 0, then the system tries to continue operation. If it contains 1, then the system delays a few seconds (to give klogd time to record the oops output) and then panics. If the /proc/sys/kernel/panic file is also nonzero then the machine will be rebooted.

My problem was not a panic, but a kernel oops! Adding kernel.panic_on_oops = 1 to /etc/sysctl.conf sets the /proc/sys/kernel/panic_on_oops flag to 1, and now whenever the kernel stops, it restarts after 3 seconds.
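To confirm that both settings actually took effect after a reboot, a small check like the following can help. This is only a sketch: the helper name `reboots_on_oops` and the `proc_dir` parameter are hypothetical, and the rule it applies is just the one quoted from man proc above (panic nonzero and panic_on_oops equal to 1).

```python
from pathlib import Path


def reboots_on_oops(proc_dir="/proc/sys/kernel"):
    """Return True if an oops leads to a panic and the panic to a reboot."""
    base = Path(proc_dir)
    # kernel.panic: seconds to wait before rebooting; 0 means loop forever.
    panic_timeout = int((base / "panic").read_text().strip())
    # kernel.panic_on_oops: 1 turns an oops or BUG into a panic.
    panic_on_oops = int((base / "panic_on_oops").read_text().strip())
    return panic_timeout != 0 and panic_on_oops == 1
```

With kernel.panic = 3 and kernel.panic_on_oops = 1 in /etc/sysctl.conf, calling `reboots_on_oops()` on the target should return True. If it returns False, the sysctl settings were probably not loaded.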

  • I found your information very helpful. What I found on my unit running an old kernel is that nothing was opening and using the watchdog on boot, and that was why my unit was not rebooting. I confirmed this by issuing echo t > /proc/sysrq-trigger. The system hung and the watchdog did not reboot it. Touching /dev/watchdog and then issuing the same command caused the watchdog to reboot the system as expected after 60s. The lines added to sysctl.conf are helpful for rebooting after an obvious issue, but they wouldn't protect against a deadlock that leaves the kernel non-functional. – Matt Minga Aug 03 '22 at 21:20