Bug 110678

Summary: Kudzu causes problems with the realtime clock.
Product: [Fedora] Fedora
Reporter: Sam Varshavchik <mrsam>
Component: kudzu
Assignee: Bill Nottingham <notting>
Status: CLOSED DEFERRED
QA Contact: David Lawrence <dkl>
Severity: medium
Priority: medium
Version: 1
CC: p.van.egdom, rvokal
Hardware: All
OS: Linux
Doc Type: Bug Fix
Last Closed: 2005-04-28 17:58:12 UTC
Attachments:
Description      Flags
lspci.txt        none
kernelboot.txt   none

Description Sam Varshavchik 2003-11-23 03:07:29 UTC
Description of problem:

I'm pointing my accusatory finger at kudzu in FC1 for causing several
instabilities on the hardware described in the attachments to this
bug. Basically, all of the following symptoms were resolved simply by
disabling kudzu.

Kudzu in RH9 did not cause the following results on the same hardware.

After yet another uneventful upgrade from RH9 to FC1 (no different
than similar previous upgrade runs on different hardware), the machine
hung after rebooting, at the "Configuring kernel parameters" prompt.

An investigation showed that hwclock, being invoked from rc.sysinit,
was hanging here:

open("/etc/adjtime", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=46, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xbf370000
read(3, "0.069092 1069550582 0.000000\n106"..., 4096) = 46
close(3)                                = 0
munmap(0xbf370000, 4096)                = 0
open("/dev/rtc", O_RDONLY|O_LARGEFILE)  = 3
ioctl(3, RTC_UIE_ON, 0)                 = 0
read(3,

The read() was never coming back from the kernel.
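
For anyone trying to reproduce this without wedging the boot, the same sequence can be probed with a timeout instead of a bare blocking read(). The following is only a rough sketch, not what hwclock itself does: the function names are made up for illustration, the ioctl numbers are the x86 encodings of RTC_UIE_ON and RTC_UIE_OFF from <linux/rtc.h> (other architectures may encode them differently), and it assumes the RTC driver supports select() on /dev/rtc.

```python
import fcntl
import os
import select

# ioctl request numbers from <linux/rtc.h>, as encoded on x86.
# Assumption: other architectures may encode _IO('p', n) differently.
RTC_UIE_ON = 0x7003
RTC_UIE_OFF = 0x7004


def wait_readable(fd, timeout_s):
    """Return True if fd becomes readable within timeout_s seconds."""
    ready, _, _ = select.select([fd], [], [], timeout_s)
    return bool(ready)


def probe_rtc_update_interrupt(path="/dev/rtc", timeout_s=5.0):
    """Enable RTC update interrupts and wait for one tick without hanging.

    Hypothetical helper: returns True if a tick arrived, False on timeout.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, RTC_UIE_ON, 0)
        try:
            if not wait_readable(fd, timeout_s):
                return False  # the interrupt never fired: the hang seen above
            os.read(fd, 8)  # consume the interrupt status word
            return True
        finally:
            fcntl.ioctl(fd, RTC_UIE_OFF, 0)
    finally:
        os.close(fd)
```

Run as root, probe_rtc_update_interrupt() should return True within about a second on a healthy clock. A False result would correspond to the state described here, where the update interrupt is enabled but never delivered.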

hwclock when invoked from the halt script suffered the same fate.

After commenting out hwclock, the machine booted successfully, but one
of the two NICs (3c59x.o, the 3c900 combo Boomerang) was refusing to
play nice:

Nov 22 14:42:14 brimstone kernel: eth0: Transmit error, Tx status register d0.
Nov 22 14:42:14 brimstone kernel:   Flags; bus-master 1, dirty 1(1) current 1(1)

[ ... ]

Nov 22 14:42:59 brimstone kernel: eth0: Host error, FIFO diagnostic register 2000.
Nov 22 14:42:59 brimstone kernel: eth0: PCI bus error, bus status 008b0029

Booting an RH9 kernel did not fix some of these symptoms, so I began
suspecting a hardware failure that happened to occur at the same time as
an upgrade to FC1.

After some further investigation, I accidentally discovered that the
following two steps resolved all symptoms:

1) Disabling kudzu, and
2) Doing a complete cold reset, as in "Arctic Circle" cold reset:
removing the power cord from the power supply, waiting, and
plugging it back in.

This brought the hardware clock and the misbehaving NIC back up to
speed, and returned the server to 100% production mode.

Basically, running Kudzu just once permanently screws up this hardware
I have here.  Even an ordinary power off, using the power button, does
not fix the situation.  I must remove all power from the machine, by
physically disconnecting it from the power supply, in order to restore
it.  This is completely reproducible.

Comment 1 Sam Varshavchik 2003-11-23 03:08:25 UTC
Created attachment 96141 [details]
lspci.txt

lspci output

Comment 2 Sam Varshavchik 2003-11-23 03:12:09 UTC
Created attachment 96142 [details]
kernelboot.txt

Here's what a successful kernel boot looks like.

Comment 3 Bill Nottingham 2003-11-24 04:10:17 UTC
hwclock is something different, almost certainly.

The 3c59x is already in bugzilla.

*** This bug has been marked as a duplicate of 107389 ***

Comment 4 Bill Nottingham 2003-11-24 04:11:24 UTC
Reopening for the hwclock issue.

Is this reproducible as well? Or is just the 3c59x problem reproducible?

Comment 5 Sam Varshavchik 2003-11-24 12:06:27 UTC
hwclock funniness is also reproducible.


Comment 6 Bill Nottingham 2005-04-28 17:58:12 UTC
Closing bugs on older, no longer supported, releases. Apologies for any lack of
response. Please open a new bug if it persists on current releases.