Red Hat Bugzilla – Bug 132719
watchdog i8xx_tco causing machine to reboot
Last modified: 2015-01-04 17:09:44 EST
Description of problem: Install rawhide (0911 tree) onto an x86 box,
with a GIGABYTE '8I845GVM-RZ' i845GV motherboard, and find that it
reboots in under 2 minutes during the startup process
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install Fedora
2. Let it boot
Actual results: It reboots during the boot process, in under 2 minutes
Expected results: Fedora Core boots up and is stable
So the solution is to boot into single user mode, rmmod i8xx_tco
(before it hits the 2m mark), Ctrl+d, and then I get a usable Fedora
Is there a way to blacklist i8xx_tco from loading?
The motherboard is a run-of-the-mill cheapish one, but is very popular
*** Bug 133209 has been marked as a duplicate of this bug. ***
*** Bug 133321 has been marked as a duplicate of this bug. ***
Alan/Arjan, can this be fixed by removing the one line in the
kernel that says MODULE_DEVICE_TABLE() in this kernel driver?
Should still allow the module to work, but not autoload it due
to certain pci ids being present.
that is the wrong fix.
What we need is a blacklist in kudzu to make sure certain modules are
never autoloaded. We need the same for the firmware flashing modules
etc etc etc.
Disagree. I had a talk with rusty a long time ago and we came to the
conclusion that there is probably a case for
hints in the kernel. That is where they belong so there is only one
I would suspect the behaviour for our user space tools between removing
the MODULE_DEVICE_TABLE() line and adding a MODULE_NO_AUTOLOAD would
be exactly the same.
Given this was discussed "a long time ago" who is fixing this hopefully
I have changed rc.sysinit to remove this module for "other", but the
number of hacks in initscripts is really big.
Well, there are two sorts of autoload we want to avoid:
- the detrimental to autoload (i8xx_tco, etc)
- the 'impolite' to autoload (*fb)
*fb modules are easy to exclude b/c of their PCI class. But excluding
all the 'other' to get rid of i8xx_tco means you lose the hwrandom
driver and other random assorted things.
I've done something similar to the following in rc.sysinit:
other=`echo $other | sed 's/i8xx_tco//'`
Can probably be done nicer and could then also cover *fb.
This would fix this in the short term as I suspect the
MODULE_NO_AUTOLOAD discussion will take quite some time
until it is finished and all players agree on a common plan here?
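The rc.sysinit filtering discussed above could be sketched roughly as follows. This is only an illustration, not the actual initscripts change: filter_modules is a hypothetical helper, and the module list passed to it is made up for the example. It compares whole module names, so a module whose name merely contains a blacklisted string is left alone.

```sh
# Hypothetical helper (not from initscripts): print the module names in $1
# that do not appear in the space-separated exclude list $2.
filter_modules() {
    modules=$1
    exclude=$2
    kept=""
    for mod in $modules; do
        skip=0
        for bad in $exclude; do
            if [ "$mod" = "$bad" ]; then
                skip=1
                break
            fi
        done
        if [ "$skip" -eq 0 ]; then
            # Append with a single separating space, no leading space.
            kept="${kept:+$kept }$mod"
        fi
    done
    printf '%s\n' "$kept"
}

# Illustrative list only; the real contents of $other come from kudzu.
other="hw_random i8xx_tco"
other=$(filter_modules "$other" "i8xx_tco")
echo "$other"
```

The exclude list could hold several names, which is one way the same helper might be extended to the framebuffer modules.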
For PCI class detection: the one line removal in the kernel also
removes pci ids. They currently show up in
"modprobe -c | grep i8xx_tco", so maybe the *fb problems are
the same type of items as the i8xx_tco? (??)
Worthwhile to note that as notting mentioned on IRC (in case folks
want the quick fix), adding the following to /etc/modprobe.conf is a
install i8xx_tco /bin/true
*** Bug 133606 has been marked as a duplicate of this bug. ***
*** Bug 134016 has been marked as a duplicate of this bug. ***
initscripts-7.86-1 will read the normal hotplug /etc/hotplug/blacklist
file. (Which, as of hwdata-0.140-1, will have i8xx_tco in it.)
There is a much more serious unanswered question. When the i810_tco
driver was loaded - who opened the file. Someone has to open the file
for the timer to start and if someone did every other watchdog is
going to fall foul of this if deployed (eg softdog for telco boxes).
That I don't understand. I have a box here that loads i8xx_tco, and it
On those that are seeing the problem, if you:
a) boot without it loaded
b) load it by hand
does it then reboot 30 seconds later?
If so, can you do those steps, and run 'lsof | grep /dev/watchdog'
at some point before it reboots?
Booting with 'init s' (while still having it autoloaded) and
performing the same lsof will also work. Assuming it boots to single
user mode fast enough. :)
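If lsof is not available in single user mode, a rough equivalent is to walk the file descriptor links under /proc. A minimal sketch, assuming a Linux /proc layout and a readlink binary; find_openers is a hypothetical helper, and it has to run as root to see other processes' descriptors:

```sh
# Hypothetical helper: print the PID of every process under proc root $2
# (default /proc) that has an open file descriptor pointing at path $1.
find_openers() {
    target=$1
    procroot=${2:-/proc}
    for fd in "$procroot"/[0-9]*/fd/*; do
        # Skip unmatched glob patterns and anything that is not a symlink.
        [ -L "$fd" ] || continue
        if [ "$(readlink "$fd" 2>/dev/null)" = "$target" ]; then
            pid=${fd#"$procroot"/}
            echo "${pid%%/*}"
        fi
    done
}

find_openers /dev/watchdog
```

No output means no process visible to the caller holds the device open. A PID is printed once per matching descriptor.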
I booted into single user mode and the module hadn't loaded as I had
it aliased in modprobe.conf. Undid this, modprobed the module (and
confirmed with lsmod that it had now loaded), and _60_ seconds later
it rebooted spontaneously.
The whole time while waiting, the results of 'lsof | grep
/dev/watchdog' showed no match.
In my case something in a gnome session start/login appears to be
tickling the watchdog. The module would load on startup, but the
machine would be stable until you login (gdm graphic login), or until
you issue startx from a text console login. 30 seconds later the box
I have not retried this in the last couple of weeks and have applied
current rawhide updates. Will try and reproduce later.
Nigel - would be interested to know in your case if before you start
the gnome stuff you do the following
then see if you get the same reboot behaviour.
Followup to Alan's query in comment #19.
Using the then current kernel (590), my laptop still reboots 30
seconds after starting gnome if i810_tco is loaded.
If I do not load i810_tco, and load softdog instead, gnome does not
cause a reboot. The default timeout for softdog is 60, that for
i810_tco is 30 seconds, so I repeated this with the softdog timeout
explicitly set to 30 seconds - no reboot.
I cannot identify anything kicking /dev/watchdog
Is there some other means that could set i810_tco off?
I guess I could try instrumenting the module a little and retrying it
- see if I can find who/what is kicking it.
Cool, it's not the kernel directly and it's not us opening /dev/watchdog in
error (that was my big concern). Looks like an i81x X server bug, must
be touching something related ??
X folks (and if X folks can't see it to fix it then we need to
I'm not running any sort of X (see my posting above) and yet I still
have the problem of spontaneous reboots when i8xx_tco is loaded..
Further testing on my system goes as follows:
If I load i8xx_tco manually, after the system has booted (whether in X
or with init s), no spontaneous reboots occur. No output from lsof |
If I comment "#" the i8xx_tco entry in /etc/hotplug/blacklist, the
watchdog module is still not loaded. Hmm, load_module() in rc.sysinit
ain't that smart. OK whatever, I delete the entry. Reboot, now the
module is loaded during the boot.
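The behaviour above (a commented-out entry still blocking the load) would be consistent with a check that matches the module name anywhere in the file, though the actual load_module() code is not shown here. A minimal sketch of a comment-aware check, assuming one module name per line and "#" comments; is_blacklisted is a hypothetical helper, not the rc.sysinit code:

```sh
# Hypothetical helper: succeed (exit 0) only if module $1 is listed in
# blacklist file $2 outside of a "#" comment.
is_blacklisted() {
    [ -r "$2" ] || return 1
    # Drop comments, put each remaining word on its own line, then
    # require an exact whole-line match on the module name.
    sed -e 's/#.*//' "$2" | tr -s ' \t' '\n\n' | grep -qxF -- "$1"
}

# Demonstration against a throwaway file rather than /etc/hotplug/blacklist.
blacklist=$(mktemp)
printf '%s\n' '# i8xx_tco' 'radeonfb' > "$blacklist"
if is_blacklisted i8xx_tco "$blacklist"; then
    echo "skip i8xx_tco"
else
    echo "load i8xx_tco"
fi
rm -f "$blacklist"
```

With the entry commented out as in the demonstration file, the check reports that the module may be loaded.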
If the module i8xx_tco is loaded automatically in rc.sysinit during
the boot, (whether in X or with init s), the system WILL spontaneously
reboot. Still no output from lsof | grep /dev/watchdog prior to it
Hmm, if I boot with init s, then rmmod i8xx_tco right away, no
spontaneous rebooting occurs. I can then edit /etc/hotplug/blacklist
to put the i8xx_tco back in.
This above tested with kernel-2.6.8-1.541 and kernel-2.6.8-1.603, both
yield same results.
What can be gleaned from this?
Created attachment 104997 [details]
Dmesg output for compaq laptop which reboots after gnome start
Created attachment 104998 [details]
lspci -v output for compaq laptop which reboots after gnome start
Re Alan's Comment #21
The laptop that's showing the reboot-after-gnome-start problem uses
a radeon card - it's not all 8xx based stuff. The dmesg & lspci output
is attached above.
Normally when a watchdog driver is loaded, it makes sure that the
watchdog driver isn't active. It's only after you open /dev/watchdog
(and thus tickle the watchdog) that you really start/activate/kick the
watchdog. From then on it counts down
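The open/tickle cycle described above can be sketched from user space. This is an illustration only: pet_watchdog is a hypothetical helper, and the final write of "V" assumes the driver honours the "magic close" convention from the kernel watchdog API and was loaded with nowayout=0. Do not point it at a real /dev/watchdog on a box you care about.

```sh
# Hypothetical helper: open watchdog device $1, tickle it $3 times at
# $2-second intervals, then try to disarm it before closing.
pet_watchdog() {
    dev=$1 interval=$2 count=$3
    {
        # The open itself (the 3> redirection below) is what arms the timer.
        i=0
        while [ "$i" -lt "$count" ]; do
            printf '.' >&3      # any write counts as a keepalive
            sleep "$interval"
            i=$((i + 1))
        done
        printf 'V' >&3          # magic close: ask the driver to stop
    } 3> "$dev"
}

# Example (commented out on purpose):
# pet_watchdog /dev/watchdog 10 6
```

The interval has to stay below the driver's heartbeat (30 seconds in the dmesg line quoted in this report), otherwise the timer expires between writes.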
Also: I checked the attachment of comment #24: it doesn't contain the
load of the watchdog module (which should be visible in dmesg like
i.e. "i8xx TCO timer: initialized (0x0460). heartbeat=30 sec
I don't think that it is the watchdog module itself that is causing
the problem. I'll create a debug version/patch tomorrow evening so
that we can easily read the timer's value and see whether or not it is
really the TCO that is causing the reboot.
Hmm, the module is indeed not behaving like I thought it would...
after initialization it does a tco_timer_keepalive() instead of what I
would expect: tco_timer_stop()...
I'll do some testing together with Reuben Farrelly...
Created attachment 105233 [details]
i8xx_tco.c debug version
Attached is the debug version of i8xx_tco.c.
(Note: I patched the initialization so that it stops the watchdog.)
When running the debug version I get this:
i8xx TCO timer: initialized (0x1060). heartbeat=30 sec (nowayout=0)
... and the problem has gone away. The module is still loaded and
running, and after 7 mins uptime the box has not rebooted, nor has
anything been logged. This looks good.
I see a fix has been committed to the mainline kernel:
ChangeSet 1.1988.69.31, 2004/10/17 20:35:47+02:00, firstname.lastname@example.org
[WATCHDOG] v2.6.9-rc3 i8xx_tco.c-stop_reboot-patch
Fix for Bugzilla Bug 132719: "watchdog i8xx_tco causing machine to reboot"
Is it safe to close this bugzilla report as an UPSTREAM fixed?
Yup. And if it went into 2.6.9-rc3, please test with 2.6.9-1.640.
The patch went in post 2.6.9 release so it won't be in there.. :(