Bug 141141
| Summary: | Hard lock up with kernels 2.6.9-1.681 and kernel-2.6.9-1.678 | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Gérard Milmeister <gemi> |
| Component: | kernel | Assignee: | Dave Jones <davej> |
| Status: | CLOSED CANTFIX | Severity: | high |
| Priority: | medium | Version: | 3 |
| Hardware: | i686 | OS: | Linux |
| CC: | barryn, fedora, pfrields, wtogami | Doc Type: | Bug Fix |
| Last Closed: | 2005-10-03 00:36:26 UTC | | |
Description
Gérard Milmeister
2004-11-29 16:27:06 UTC
I didn't get the additional Caps Lock / Scroll Lock information that you received, but I too am having this exact same issue. Additional information specific to my case:

Fails (Single or Dual CPU)
    Kernel 2.6.9-1.678
    Kernel 2.6.9-1.681
Fails (Dual CPU)
    Kernel 2.6.9-1.667
Works (Single CPU Mode)
    Kernel 2.6.9-1.667

I always receive a hard lock with little to no information for recovery. Previous kernel versions (2.6.8) worked fine in Dual CPU mode. I have a Dual Athlon 2100+ system, 3.5GB RAM, (2) 180GB HDD.

I've turned off ACPI on the kernel line and disabled Plug&Play in the BIOS and things actually seem to be working very well now (SMP and all). It's only been up for about 30 minutes now, but that's about 29 minutes longer than I could ever get it to run in SMP since the 2.6.9 kernel.

I believe my problem was actually with a linuxant modem driver that was having trouble getting an IRQ for the modem card I have installed. This caused something in the kernel to go whacko. It's very likely that turning off PNP was all I needed to do. You can take me off the list of folks having the lock-up problems.

I added "acpi=off" to the "kernel /vmlinuz-2.6.9-1.698_FC3" line in grub. Now the system doesn't lock up anymore. So it seems ACPI was the reason. It would be useful to find out what exactly goes wrong here (and not in .667), but I don't know how to do that. Is ACPI useful on non-laptop and non-HT machines?

Well, that was a little premature. This time the lock occurred after 30 hours. Back to .667 again.

I'm experiencing similar lockups on my Tyan Tiger MPX S2462 motherboard, with twin Athlon MP 1600s; each night the machine locks up between 1am and 2am, and I get the flashing num+scroll locks. I've had this trouble with all of the SMP kernels that have been released for FC3, but also experienced similar lockups on FC2 with 2.6.9 kernel releases (although in FC2 I didn't get the flashing num+scroll locks).
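The "acpi=off" workaround mentioned above amounts to appending an option to the "kernel" line of the GRUB configuration (typically /boot/grub/grub.conf on Fedora of that era). The following is only a minimal sketch of that text edit, not a tool from this report: the function name `edit_kernel_line` is hypothetical, the `root=LABEL=/` value in the example line is made up for illustration, and the function works on a single line of text without touching any file.

```python
def edit_kernel_line(line, add=(), remove=()):
    """Return a GRUB 'kernel' line with some options removed and others appended.

    Lines that are not 'kernel' lines are returned unchanged.
    The input is expected to be one line without a trailing newline.
    """
    stripped = line.strip()
    if not stripped.startswith("kernel "):
        return line
    # Preserve the leading indentation used in grub.conf stanzas.
    indent = line[: len(line) - len(line.lstrip())]
    # Drop unwanted options, keep everything else in its original order.
    tokens = [t for t in stripped.split() if t not in remove]
    # Append new options only if they are not already present.
    for opt in add:
        if opt not in tokens:
            tokens.append(opt)
    return indent + " ".join(tokens)


# Hypothetical example line; only the kernel file name comes from this report.
example = "\tkernel /vmlinuz-2.6.9-1.698_FC3 ro root=LABEL=/ rhgb quiet"
print(edit_kernel_line(example, add=["acpi=off"], remove=["rhgb"]))
```

The `remove` argument is there because dropping an option such as "rhgb" (the graphical boot screen) is the same kind of edit; back up the configuration file before changing it by hand or by script.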
Like yourself I've tried "acpi=off", which helps but doesn't solve the problem; it locks up every other day instead of every day. Note that I've not tried any of the UP kernels; I've only used SMP kernels on this machine.

Same with kernel-2.6.9-1.724_FC3. I am worried about this. If this problem persists into FC4, I won't be able to upgrade.

Am having success with 2.6.9-698_FC3 and "acpi=off"; my machine has now been up for 13 days without locking up. FYI, I have not yet tried leaving ACPI on.

I just tried kernel-2.6.10-1.727_FC3, which doesn't even boot correctly. The system hangs while processing the script in rhgb. Back again to 2.6.9-1.667. If it helps, here are some specifications of my system:

ASUS P4PE
Pentium 2400 MHz
ASUS GeForce 6800
Soundblaster Live! Platinum
Two connected monitors
3 HDDs
1 Plextor DVD-Rewriter
no floppy drive
1 Hauppauge TV-card
Quite a few USB devices (Mouse, Cardreader, Wacom tablet, Epson printer, USB Bluetooth, Psion, USB hub)

If it is possible to make a post-mortem analysis, I would be eager to help out.

Try removing "rhgb" from the kernel boot options and see if there's anything more descriptive on the screen when it hangs (e.g. what's the last line that gets printed to the screen before it freezes?).

We can sympathize with those suffering from these kernel issues, but the symptoms reported here are most likely describing more than one issue. Also, there is nothing the kernel developers can do without console dmesg, oops or panic dumps.

Gerard's original report mentioned blinking scroll and caps lock. This means a kernel panic has happened. Most likely it would have been possible to capture console output from the serial port or netconsole in such an occurrence.

Because it seems to be multiple failures, please attempt to get console dumps of these failures. Hopefully a useful backtrace or some other info goes there. Otherwise there is nothing anybody can do to fix or even diagnose the problem.

Gérard, what modules do you have in use?
(lsmod)

Modules (2.6.9-1.667):

    Module Size Used by
    snd_seq 56785 0
    parport_pc 24705 1
    lp 11565 0
    parport 41737 2 parport_pc,lp
    eeprom 8545 0
    asb100 20061 0
    i2c_sensor 3521 2 eeprom,asb100
    i2c_i801 7757 0
    rfcomm 36701 1
    l2cap 25285 5 rfcomm
    iptable_nat 23045 0
    iptable_mangle 2753 0
    ipt_REJECT 6465 1
    ipt_state 1857 6
    ip_conntrack 40693 2 iptable_nat,ipt_state
    iptable_filter 2753 1
    ip_tables 16193 5 iptable_nat,iptable_mangle,ipt_REJECT,ipt_state,iptable_filter
    dm_mod 54741 0
    button 6481 0
    battery 8517 0
    ac 4805 0
    hci_usb 15041 2
    bluetooth 46917 7 rfcomm,l2cap,hci_usb
    snd_usb_audio 59809 2
    snd_usb_lib 12097 1 snd_usb_audio
    pwc 77684 0
    joydev 8705 0
    wacom 11585 0
    sd_mod 16961 0
    usb_storage 61321 0
    scsi_mod 118417 2 sd_mod,usb_storage
    uhci_hcd 31449 0
    ehci_hcd 31557 0
    tuner 18413 0
    tvaudio 20961 0
    msp3400 21353 0
    bttv 150541 2
    video_buf 21701 1 bttv
    i2c_algo_bit 8521 1 bttv
    v4l2_common 5953 1 bttv
    btcx_risc 4425 1 bttv
    i2c_core 22081 9 eeprom,asb100,i2c_sensor,i2c_i801,tuner,tvaudio,msp3400,bttv,i2c_algo_bit
    videodev 9664 4 pwc,bttv
    snd_bt87x 13449 2
    emu10k1_gp 3649 0
    nvidia 3473436 12
    snd_intel8x0 34829 2
    gameport 4801 2 emu10k1_gp,snd_intel8x0
    snd_mpu401_uart 8769 1 snd_intel8x0
    snd_emu10k1 93769 3
    snd_rawmidi 26725 3 snd_usb_lib,snd_mpu401_uart,snd_emu10k1
    snd_pcm_oss 47609 0
    snd_mixer_oss 17217 6 snd_pcm_oss
    snd_pcm 97993 5 snd_usb_audio,snd_bt87x,snd_intel8x0,snd_emu10k1,snd_pcm_oss
    snd_timer 29765 2 snd_seq,snd_pcm
    snd_seq_device 8137 3 snd_seq,snd_emu10k1,snd_rawmidi
    snd_ac97_codec 64401 2 snd_intel8x0,snd_emu10k1
    snd_page_alloc 9673 4 snd_bt87x,snd_intel8x0,snd_emu10k1,snd_pcm
    snd_util_mem 4801 1 snd_emu10k1
    snd_hwdep 9413 1 snd_emu10k1
    snd 54053 22 snd_seq,snd_usb_audio,snd_bt87x,snd_intel8x0,snd_mpu401_uart,snd_emu10k1,snd_rawmidi,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer,snd_seq_device,snd_ac97_codec,snd_hwdep
    soundcore 9889 6 snd
    b44 22213 0
    mii 4673 1 b44
    ext3 116809 2
    jbd 74969 1 ext3

I finally found out the hardware configuration that causes the
problem. I have 3 HD and 1 CDROM on IDE: 2 HD on ide1, and 1 HD and 1 CDROM on ide2. I have always noticed that the HD on ide2 generated DMA errors, though it never actually stopped working. Now I find that 3 HD, or 2 HD and 1 CDROM, work fine, but not all 4. Probably the system can't take 4 devices on IDE.

An update has been released for Fedora Core 3 (kernel-2.6.12-1.1372_FC3) which may contain a fix for your problem. Please update to this new kernel, and report whether or not it fixes your problem. If you have updated to Fedora Core 4 since this bug was opened, and the problem still occurs with the latest updates for that release, please change the version field of this bug to 'fc4'. Thank you.

This bug has been automatically closed as part of a mass update. It had been in NEEDINFO state since July 2005. If this bug still exists in current errata kernels, please reopen this bug. There are a large number of inactive bugs in the database, and this is the only way to purge them. Thank you.
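For anyone comparing module lists between a working and a failing kernel, lsmod output like the listing posted earlier in this report can be turned into a small lookup table. This is a rough sketch only: `parse_lsmod` is a hypothetical helper name, the sample is a handful of rows copied from the listing in this report, and the parser assumes the usual three or four whitespace-separated columns (name, size, use count, optional comma-separated users).

```python
def parse_lsmod(text):
    """Parse lsmod-style text into {module: {"size", "refcount", "users"}}."""
    modules = {}
    # The first line is the "Module Size Used by" header, so skip it.
    for line in text.strip().splitlines()[1:]:
        parts = line.split()
        if len(parts) < 3:
            continue  # ignore blank or malformed lines
        users = parts[3].split(",") if len(parts) > 3 else []
        modules[parts[0]] = {
            "size": int(parts[1]),
            "refcount": int(parts[2]),
            "users": users,
        }
    return modules


# A few rows taken from the module listing in this report.
sample = """Module Size Used by
parport_pc 24705 1
lp 11565 0
parport 41737 2 parport_pc,lp
nvidia 3473436 12
b44 22213 0
mii 4673 1 b44
"""

mods = parse_lsmod(sample)
largest = max(mods, key=lambda name: mods[name]["size"])
print("largest module:", largest)
print("users of parport:", mods["parport"]["users"])
```

Parsing the listing from two kernels and comparing the resulting key sets is one way to spot modules, out-of-tree ones included, that are loaded in only one of the two configurations.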