Bug 18510
Summary: | kernel 2.2.16-22 freeze | ||
---|---|---|---|
Product: | [Retired] Red Hat Linux | Reporter: | Bernhard Ege <bme> |
Component: | kernel | Assignee: | Arjan van de Ven <arjanv> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Brian Brock <bbrock> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 7.0 | CC: | mw, turchi, waananen |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i386 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2003-06-05 21:51:07 UTC | Type: | --- |
I probably had exactly the same problem on my computer. When I left it idle for long enough, it just froze with a blank screen and nothing in the logs. It happened only with the shipped 2.2.16-22 kernel; after I upgraded to 2.4.0-test9, the problem seems to be gone. My CPU is a Pentium III 500MHz (Katmai) on an MSI 6163 motherboard (440BX) with TNT2 video (nv driver; the agpgart module was loaded too, I think). Other hardware is a 3c905B and an sb PCI 128, all working well. I thought it was somehow related to power management, because the system froze only after being idle, but adding Option "NoPM" "true" to XF86Config didn't change anything. If necessary I can restore the old kernel and perform more testing.
Mike

I've seen the same problems with USB. Looks like a bug in the USB backport. Unloading the USB modules stopped the oopses here.
-Thomas

My system is an AMD Athlon 600MHz with an nvidia TNT2-Vanta and the stock kernel from 7.0. One additional remark: my system freezes even if I disable the onboard USB support in the BIOS. This happens 2 minutes after starting X. When I had USB enabled and the usb module loaded, my system froze as soon as kudzu started, so I could never boot my system.
Mate

It seems that if I add append="x86_serial_nr=1" to lilo.conf, my system is fine (previously, it froze 2 minutes after starting X). On the other hand, if I enable USB in the BIOS, my system freezes immediately when I start X.
Mate
I have had the kernel freeze on me several times with no indication why in the log. Then I had the idea that I should leave virtual console 1 displayed to see whether I still got the kernel crash. Well, I did, and I put the result through scripts/ksymoops to clarify it a bit:

    unable to handle kernel paging request at virtual address ffffffff
    current->tss.cr3 = 00101000, %cr3 = 00101000
    *pde = 00000000
    oops: 0000
    cpu: 0
    eip: 0010:[<ffffffff>]
    eflags: 00210286
    eax: 0000000f ebx: c7e471c0 ecx: 00000000 edx: 00000001
    esi: c885b9bc edi: c022c3a4 ebp: c0247f4c esp: c0247f18
    ds: 0018 es: 0018 ss: 0018
    stack: 00000001 c7e471c0 c01127ae c7e471c0 00000001 c027dea0 00259eae c022c3a4
           00000001 c010ae3a 00000000 00000000 00000000 c0247f60 c01196d9 00000000
           c0246000 c010b19a 00000e00 c010ae60 00000000 c0246000 00000000 c0246000
    call trace: [<c01127ae>] [<c010ae3a>] [<c01196d9>] [<c010b19a>] [<c010ae60>] [<c01088dd>] [<c0106000>] [<c0108900>] [<c010a06c>] [<c0106000>] [<c0106077>] [<c0106000>] [<c0100175>]
    code: bad eip value.
    Warning: trailing garbage ignored on Code: line
      Text: 'code: bad eip value.'
      Garbage: 'ip value.'
    Oops_code_values invalid value 0xbad in Code line, not a multiple of 2 digits, value ignored
    Oops_code_values invalid value 0xe in Code line, not a multiple of 2 digits, value ignored
    >>EIP: ffffffff <END_OF_CODE+37649e23/???
    Trace: c01127ae <timer_bh+2be/404>
    Trace: c010ae3a <do_8259A_IRQ+9a/a8>
    Trace: c01196d9 <do_bottom_half+49/70>
    Trace: c010b19a <do_IRQ+3a/3c>
    Trace: c010ae60 <common_interrupt+18/20>
    Trace: c01088dd <cpu_idle+5d/6c>
    Trace: c0106000 <get_options+0/70>
    Trace: c0108900 <sys_idle+14/20>
    Code: ffffffff <END_OF_CODE+37649e23/??? 00000000 <_EIP>: <===
    aiee, killing interrupt handler
    kernel panic: attempted to kill the idle task!
    in swapper task - not syncing
    1737 warnings and 5 errors issued. Results may not be reliable.
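For anyone who wants to look up the call trace addresses by hand (for example against a System.map), the bracketed addresses can be pulled out of the raw oops text first. The following is only a minimal sketch: the function name trace_addresses is made up for illustration, and it does nothing more than extract the addresses; it does not resolve symbols the way ksymoops does.

```shell
#!/bin/sh
# Hypothetical helper: read oops text on stdin and print every standalone
# bracketed address token ([<c01127ae>] etc.), one per line.
# The "eip: 0010:[<...>]" token is skipped because it does not start with "[".
trace_addresses() {
    tr ' ' '\n' | sed -n 's/^\[<\([0-9a-f]*\)>]$/\1/p'
}

# Example using the first three entries of the call trace above.
printf '%s\n' 'call trace: [<c01127ae>] [<c010ae3a>] [<c01196d9>]' | trace_addresses
```

Because every whitespace-separated token is checked, a call trace that wraps over several lines is handled the same way as a single-line one.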
As suggested by a third party, I should be able to find the offending module (if it is a module) this way:

    #!/bin/sh
    # Disassemble every module object and look for a symbol whose offset ends in 9bc.
    cd /lib/modules/2.2.16-22
    for i in `find . -name '*.o'`; do
        echo $i
        objdump --disassemble-all --reloc $i | grep '^0.*9bc <'
    done

The 9bc comes from the ESI register (esi: c885b9bc; it is used by a function call, as it was explained to me), and the only valid match was this:

    ./misc/agpgart.o
    000009bc <agp_generic_remove_memory>:

This is the AGP part of the /var/log/messages file:

    Oct 5 15:01:11 overmind kernel: Linux agpgart interface v0.99 (c) Jeff Hartmann
    Oct 5 15:01:11 overmind kernel: agpgart: Maximum main memory to use for agp memory: 96M
    Oct 5 15:01:11 overmind kernel: agpgart: Detected AMD Irongate chipset
    Oct 5 15:01:11 overmind kernel: agpgart: AGP aperture is 64M @ 0xe0000000

The strange thing is that my system is much more stable (if not completely stable; hard to say without waiting 14 days) if I disable USB in the BIOS, which causes usb-ohci and usbcore not to be loaded. If they are loaded, 1-3 freezes a day occur.

I am using the nvidia drivers for XFree86, but have seen the kernel freeze even with the nv driver (XF86 driver). I have disabled AGP usage in XF86Config-4, which lsmod confirms: it shows agpgart loaded but not used by anything.

Output from lspci:

    00:00.0 Host bridge: Advanced Micro Devices [AMD] AMD-751 [Irongate] System Controller (rev 23)
    00:01.0 PCI bridge: Advanced Micro Devices [AMD] AMD-751 [Irongate] AGP Bridge (rev 01)
    00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-756 [Viper] ISA (rev 01)
    00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-756 [Viper] IDE (rev 03)
    00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-756 [Viper] ACPI (rev 03)
    00:07.4 USB Controller: Advanced Micro Devices [AMD] AMD-756 [Viper] USB (rev 06)
    00:08.0 Multimedia audio controller: Yamaha Corporation YMF-724F [DS-1 Audio Controller] (rev 03)
    00:0a.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8139 (rev 10)
    01:05.0 VGA compatible controller: nVidia Corporation NV11 (rev a1)

I am not using the kernel YMF724 driver (which causes kernel hangs as well, but that bug is reported by someone else). In this particular kernel freeze, the nvidia driver and the usb driver were both using the same interrupt (they were the only devices sharing an interrupt).

regards,
Bernhard Ege
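The module search described above boils down to taking the last three hex digits of a register value and matching them against symbol offsets in the objdump output. A minimal sketch of that step follows, assuming the register value is passed as a plain hex string. The helper names suffix_pattern and match_symbols, and the second sample symbol, are made up for illustration. A three-digit suffix match is only a heuristic and can easily hit unrelated symbols.

```shell
#!/bin/sh
# Hypothetical helper: build the grep pattern from a register value by
# keeping its last three hex digits (c885b9bc -> 9bc).
suffix_pattern() {
    suffix=$(printf '%s' "$1" | sed 's/.*\(...\)$/\1/')
    printf '^0.*%s <\n' "$suffix"
}

# Hypothetical helper: filter objdump-style symbol header lines on stdin
# with the pattern derived from the given register value.
match_symbols() {
    grep "$(suffix_pattern "$1")"
}

# Example: the first line is the match reported above; the second symbol
# is invented purely to show a non-matching line.
printf '%s\n' '000009bc <agp_generic_remove_memory>:' \
              '00000a10 <some_other_symbol>:' | match_symbols c885b9bc
```

In a real run, the input to match_symbols would be the output of objdump --disassemble-all --reloc for each module object, as in the original loop.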