Bug 141141

Summary: Hard lock up with kernels 2.6.9-1.681 and kernel-2.6.9-1.678
Product: [Fedora] Fedora
Component: kernel
Version: 3
Hardware: i686
OS: Linux
Status: CLOSED CANTFIX
Severity: high
Priority: medium
Reporter: Gérard Milmeister <gemi>
Assignee: Dave Jones <davej>
CC: barryn, fedora, pfrields, wtogami
Doc Type: Bug Fix
Last Closed: 2005-10-03 00:36:26 UTC

Description Gérard Milmeister 2004-11-29 16:27:06 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3)
Gecko/20041020 Galeon/1.3.18

Description of problem:
After a while (a few hours), the system locks up with Caps Lock and
Scroll Lock blinking. Only a hardware reset helps. There are no such
problems with kernel-2.6.9-1.667.

Version-Release number of selected component (if applicable):
kernel-2.6.9-1.681_FC3

How reproducible:
Always

Steps to Reproduce:
 

Additional info:

Comment 1 Doug Brott 2004-11-29 23:22:07 UTC
I didn't get the additional Caps Lock / Scroll Lock information that
you received, but I too am having this exact same issue.

Additional information specific to my case:

Fails (Single or Dual CPU)
  Kernel 2.6.9-1.678
  Kernel 2.6.9-1.681

Fails (Dual CPU)
  Kernel 2.6.9-1.667

Works (Single CPU Mode)
  Kernel 2.6.9-1.667

I always receive a Hard-Lock with little to no information for
recovery.  Previous kernel versions (2.6.8) worked fine in Dual CPU mode.

I have a Dual Athlon 2100+ system, 3.5GB RAM, (2)180GB HDD

Comment 2 Doug Brott 2004-12-08 17:50:36 UTC
I've turned off ACPI on the kernel line and disabled Plug&Play in the
BIOS, and things actually seem to be working very well now (SMP and
all).  It's only been up for about 30 minutes, but that's about 29
minutes longer than I could ever get it to run in SMP since the 2.6.9 kernel.

I believe my problem was actually with a linuxant modem driver that
was having trouble getting an IRQ for the modem card I have installed.
This caused something in the kernel to go whacko.  It's very likely
that turning off PNP was all I needed to do.

You can take me off the list of folks having the lock-up problems.

Comment 3 Gérard Milmeister 2004-12-13 16:02:52 UTC
I added "acpi=off" to the "kernel /vmlinuz-2.6.9-1.698_FC3" line in grub.
Now the system doesn't lock up anymore. So it seems acpi was the
reason. It would be useful to find out what exactly goes wrong here
(and not in .667), but I don't know how to do that.
Is acpi useful on NON-Laptop and NON-HT machines?
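
For anyone trying the same workaround, here is a minimal sketch of the edit in Python. It only transforms text; it assumes a GRUB legacy config (typically /boot/grub/grub.conf on Fedora Core 3) whose boot entries contain a line starting with `kernel`. The function name `add_kernel_option` and the sample stanza, including its `root=` value, are made up for illustration.

```python
def add_kernel_option(grub_conf: str, option: str) -> str:
    """Append `option` to every GRUB legacy `kernel` line that lacks it."""
    out = []
    for line in grub_conf.splitlines():
        tokens = line.split()
        # Only touch lines whose first word is `kernel`;
        # comments, `title`, `root` and `initrd` lines pass through unchanged.
        if tokens and tokens[0] == "kernel" and option not in tokens[1:]:
            line = line.rstrip() + " " + option
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical grub.conf stanza; the root= value is an assumption.
sample = (
    "title Fedora Core (2.6.9-1.698_FC3)\n"
    "\troot (hd0,0)\n"
    "\tkernel /vmlinuz-2.6.9-1.698_FC3 ro root=LABEL=/ rhgb quiet\n"
    "\tinitrd /initrd-2.6.9-1.698_FC3.img\n"
)
print(add_kernel_option(sample, "acpi=off"))
```

Running the function twice leaves the file unchanged the second time, so the option is never duplicated. Keeping a backup copy of the original config before writing anything back is advisable.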

Comment 4 Gérard Milmeister 2004-12-13 18:47:35 UTC
Well, that was a little premature. This time the lockup occurred after 30
hours. Back to .667 again.

Comment 5 Graham TerMarsch 2004-12-21 19:02:18 UTC
I'm experiencing similar lockups on my Tyan Tiger MPX S2462
motherboard, with twin Athlon MP 1600s; each night the machine locks
up between 1am and 2am, and I get the flashing num+scroll locks.

I've had this trouble with all of the SMP kernels that have been
released for FC3, but also experienced similar lockups on FC2 with
2.6.9 kernel releases (although in FC2 I didn't get the flashing
num+scroll locks).

Like yourself I've tried "acpi=off", which helps but doesn't solve the
problem; it locks up every other day instead of every day.  Note that
I've not tried any of the UP kernels; I've only used SMP kernels on
this machine.

Comment 6 Gérard Milmeister 2005-01-04 07:34:30 UTC
Same with kernel-2.6.9-1.724_FC3.
I am worried about this. If this problem persists into FC4, I won't be
able to upgrade.

Comment 7 Graham TerMarsch 2005-01-04 19:45:08 UTC
Am having success with 2.6.9-698_FC3 and "acpi=off"; my machine has 
now been up for 13 days without locking up. 
 
FYI, I have not yet tried leaving ACPI on. 

Comment 8 Gérard Milmeister 2005-01-05 16:53:29 UTC
I just tried kernel-2.6.10-1.727_FC3, which doesn't even boot
correctly. The system hangs while processing the script in rhgb. Back
again to 2.6.9-1.667.
If it helps here are some specifications of my system:
ASUS P4PE Pentium 2400 MHz
ASUS GeForce 6800
Soundblaster Live! Platinum
Two connected monitors
3 HDDs
1 Plextor DVD-Rewriter
no floppy drive
1 Hauppauge TV-card
Quite a few USB devices (Mouse, Cardreader, Wacom tablet, Epson
printer, USB Bluetooth, Psion, USB hub)

If it is possible to make a post-mortem analysis, I would be eager to
help out.

Comment 9 Barry K. Nathan 2005-01-15 00:39:45 UTC
Try removing "rhgb" from the kernel boot options and see if there's
anything more descriptive on the screen when it hangs (e.g. what's the
last line that gets printed to the screen before it freezes?).
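
As a rough illustration of that change, the Python sketch below strips chosen options from a single GRUB `kernel` line. The helper name `drop_kernel_options` and the sample line (in particular its `root=` value) are hypothetical. Also dropping `quiet`, as the example call does, is an extra assumption on top of the suggestion above; it makes the kernel print more boot messages.

```python
def drop_kernel_options(kernel_line: str, unwanted=("rhgb",)) -> str:
    """Return a GRUB `kernel` line with the unwanted boot options removed."""
    # Preserve the leading indentation used in grub.conf.
    indent = kernel_line[: len(kernel_line) - len(kernel_line.lstrip())]
    tokens = kernel_line.split()
    # tokens[0] is the word `kernel` and tokens[1] the image path; keep both.
    kept = [t for i, t in enumerate(tokens) if i < 2 or t not in unwanted]
    return indent + " ".join(kept)


# Hypothetical kernel line; the root= value is an assumption.
line = "\tkernel /vmlinuz-2.6.10-1.727_FC3 ro root=LABEL=/ rhgb quiet"
print(drop_kernel_options(line, unwanted=("rhgb", "quiet")))
```

The same effect can be had for a single boot by editing the kernel line in the GRUB menu before booting, which avoids touching the config file at all.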

Comment 10 Warren Togami 2005-01-15 07:22:36 UTC
We can sympathize with those suffering from these kernel issues, but
the symptoms reported here most likely describe more than one
issue.  Also, there is nothing the kernel developers can do without
console dmesg, oops or panic dumps.

Gérard's original report mentioned blinking scroll and caps lock.
This means a kernel panic has happened.  In such an occurrence it is
most likely possible to capture console output from the serial port
or netconsole.

Because it seems to be multiple failures, please attempt to get
console dumps of these failures.  Hopefully a useful backtrace or some
other info goes there.  Otherwise there is nothing anybody can do to
fix or even diagnose the problem.
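
For the netconsole route, a second machine on the same network has to receive the messages. netconsole sends kernel log lines as plain UDP datagrams, so a small listener is enough. The Python sketch below is one way to do that, assuming the default target port of 6666; the function names `open_listener` and `capture` are illustrative. On the failing machine the module takes a parameter of the form `netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]`, with addresses specific to your network.

```python
import socket


def open_listener(host: str = "0.0.0.0", port: int = 6666) -> socket.socket:
    """Bind a UDP socket on which netconsole datagrams can be received."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def capture(sock: socket.socket, log_path: str, max_packets=None) -> int:
    """Append each received datagram to log_path; return the packet count.

    With max_packets=None the loop runs until interrupted.
    """
    count = 0
    with open(log_path, "ab") as log:
        while max_packets is None or count < max_packets:
            data, _addr = sock.recvfrom(65535)
            log.write(data)
            # Flush after every packet so the last lines before a lockup
            # are on disk even if the listener is killed later.
            log.flush()
            count += 1
    return count
```

To leave it running, call `capture(open_listener(), "netconsole.log")` on the receiving machine and read the log file after the next lockup.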

Comment 11 Dave Jones 2005-02-13 04:39:45 UTC
Gérard, what modules do you have in use? (lsmod)

Comment 12 Gérard Milmeister 2005-02-13 11:57:55 UTC
Modules (2.6.9-1.667):
Module                  Size  Used by
snd_seq                56785  0
parport_pc             24705  1
lp                     11565  0
parport                41737  2 parport_pc,lp
eeprom                  8545  0
asb100                 20061  0
i2c_sensor              3521  2 eeprom,asb100
i2c_i801                7757  0
rfcomm                 36701  1
l2cap                  25285  5 rfcomm
iptable_nat            23045  0
iptable_mangle          2753  0
ipt_REJECT              6465  1
ipt_state               1857  6
ip_conntrack           40693  2 iptable_nat,ipt_state
iptable_filter          2753  1
ip_tables              16193  5 iptable_nat,iptable_mangle,ipt_REJECT,ipt_state,iptable_filter
dm_mod                 54741  0
button                  6481  0
battery                 8517  0
ac                      4805  0
hci_usb                15041  2
bluetooth              46917  7 rfcomm,l2cap,hci_usb
snd_usb_audio          59809  2
snd_usb_lib            12097  1 snd_usb_audio
pwc                    77684  0
joydev                  8705  0
wacom                  11585  0
sd_mod                 16961  0
usb_storage            61321  0
scsi_mod              118417  2 sd_mod,usb_storage
uhci_hcd               31449  0
ehci_hcd               31557  0
tuner                  18413  0
tvaudio                20961  0
msp3400                21353  0
bttv                  150541  2
video_buf              21701  1 bttv
i2c_algo_bit            8521  1 bttv
v4l2_common             5953  1 bttv
btcx_risc               4425  1 bttv
i2c_core               22081  9 eeprom,asb100,i2c_sensor,i2c_i801,tuner,tvaudio,msp3400,bttv,i2c_algo_bit
videodev                9664  4 pwc,bttv
snd_bt87x              13449  2
emu10k1_gp              3649  0
nvidia               3473436  12
snd_intel8x0           34829  2
gameport                4801  2 emu10k1_gp,snd_intel8x0
snd_mpu401_uart         8769  1 snd_intel8x0
snd_emu10k1            93769  3
snd_rawmidi            26725  3 snd_usb_lib,snd_mpu401_uart,snd_emu10k1
snd_pcm_oss            47609  0
snd_mixer_oss          17217  6 snd_pcm_oss
snd_pcm                97993  5 snd_usb_audio,snd_bt87x,snd_intel8x0,snd_emu10k1,snd_pcm_oss
snd_timer              29765  2 snd_seq,snd_pcm
snd_seq_device          8137  3 snd_seq,snd_emu10k1,snd_rawmidi
snd_ac97_codec         64401  2 snd_intel8x0,snd_emu10k1
snd_page_alloc          9673  4 snd_bt87x,snd_intel8x0,snd_emu10k1,snd_pcm
snd_util_mem            4801  1 snd_emu10k1
snd_hwdep               9413  1 snd_emu10k1
snd                    54053  22 snd_seq,snd_usb_audio,snd_bt87x,snd_intel8x0,snd_mpu401_uart,snd_emu10k1,snd_rawmidi,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer,snd_seq_device,snd_ac97_codec,snd_hwdep
soundcore               9889  6 snd
b44                    22213  0
mii                     4673  1 b44
ext3                  116809  2
jbd                    74969  1 ext3
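
A listing like this is easier to compare between a working and a failing kernel once it is parsed. The Python sketch below is one possible way to do that; the names `parse_lsmod` and `only_in` are made up for illustration, and it assumes the usual three or four column lsmod layout. It also tolerates a dependents list that has wrapped onto its own line, as can happen when output is pasted into a form.

```python
def parse_lsmod(text: str) -> dict:
    """Map module name -> (size, use_count, [dependents]) from lsmod output."""
    modules = {}
    last = None
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] == "Module":
            continue  # blank line or the column header
        if len(fields) == 1 and last is not None:
            # A lone field is a dependents list wrapped from the previous line.
            modules[last][2].extend(fields[0].split(","))
            continue
        if len(fields) < 3:
            continue  # not a module row
        deps = fields[3].split(",") if len(fields) > 3 else []
        modules[fields[0]] = (int(fields[1]), int(fields[2]), deps)
        last = fields[0]
    return modules


def only_in(first: dict, second: dict) -> list:
    """Names of modules loaded in `first` but absent from `second`."""
    return sorted(set(first) - set(second))


# Three rows taken from the listing above, one of them wrapped.
sample = """\
Module                  Size  Used by
ip_tables              16193  5
iptable_nat,iptable_mangle,ipt_REJECT,ipt_state,iptable_filter
mii                     4673  1 b44
ext3                  116809  2
"""
mods = parse_lsmod(sample)
print(mods["ip_tables"][2])
```

Saving `lsmod` output under each kernel and passing both parses to `only_in` shows which modules differ between the two boots.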

Comment 13 Gérard Milmeister 2005-02-17 20:55:35 UTC
I finally found out which hardware configuration causes the problem.
I have 3 HD and 1 CDROM on IDE: 2 HD on ide1, and 1 HD and 1 CDROM on ide2.
I had always noticed that the HD on ide2 generated errors about DMA, but it
never actually stopped working. Now I find that 3 HD, or 2 HD and 1 CDROM, work
fine, but not all 4. Probably the system can't take 4 devices on IDE.

Comment 14 Dave Jones 2005-07-15 19:55:12 UTC
An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which
may contain a fix for your problem.   Please update to this new kernel, and
report whether or not it fixes your problem.

If you have updated to Fedora Core 4 since this bug was opened, and the problem
still occurs with the latest updates for that release, please change the version
field of this bug to 'fc4'.

Thank you.

Comment 15 Dave Jones 2005-10-03 00:36:26 UTC
This bug has been automatically closed as part of a mass update.
It had been in NEEDINFO state since July 2005.
If this bug still exists in current errata kernels, please reopen this bug.

There are a large number of inactive bugs in the database, and this is the only
way to purge them.

Thank you.