Bug 229603
Summary: | NMI/freezes with e1000 connected; seems stable with e1000 'off' | | |
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Tom London <selinux> |
Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
Status: | CLOSED RAWHIDE | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Docs Contact: | |
Priority: | medium | | |
Version: | rawhide | CC: | bmr, bos, david.graham, dhollis, dm, jdelvare, jesse.brandeburg, jwboyer, konradr, pcfe, redhat-bugzilla, rjb, tcallawa, wtogami |
Target Milestone: | --- | | |
Target Release: | --- | | |
Hardware: | All | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2007-04-23 15:57:59 UTC | Type: | --- |
Description
Tom London
2007-02-21 23:59:03 UTC
I can 'replicate' this behavior: Booting with e1000 connected to cable/network - system crashes within 5-30 minutes. Booting with e1000 disconnected (i.e., no cable plugged in), ipw3945 connects, and system runs stably. Booting with .2925 with e1000 connected, system seems stable.

Reviewing /var/log/messages, it appears that the e1000 is 'recognized/activated' a bit earlier than in older logs. Any possible connection? Am I just cursed?

System with e1000 enabled remains unstable with .2940

Is there some way I can get debugging output? System either generates NMI and freezes, or freezes without NMI....

Continues with .2942

Don't suppose this has anything to do with this....: http://www.gossamer-threads.com/lists/engine?do=post_view_flat;post=735847;page=1;mh=-1;list=linux;sb=post_latest_reply;so=ASC

Still getting this with .2953.....

Hard freeze with .2960. Works great with RJ-45 unplugged.

Still getting hard freeze with .2966

Still freezing with .2967

.2975 lasted almost 4 hours before producing:
Mar 8 12:40:02 localhost ntpd[2717]: time reset +0.187384 s
Mar 8 12:40:59 localhost kernel: Uhhuh. NMI received for unknown reason b0 on CPU 0.
Mar 8 12:40:59 localhost kernel: You have some hardware problem, likely on the PCI bus.
Mar 8 12:40:59 localhost kernel: Dazed and confused, but trying to continue
Mar 8 12:48:04 localhost syslogd 1.4.2: restart.

I can confirm this (Thinkpad X60s). It seems to go away if the CPU does not switch frequencies. It seems to be triggered by CPU frequency switching or massive traffic on the ethernet interface. I can keep the system running all day by not transferring large amounts of traffic (<10MBit)

OK, scratch that low traffic comment, it just hung doing that. A quite surefire way to trigger this for me is to transfer data using scp or rsync over ssh. My testcases get 3-4MB/sec, that is sufficient.

More datapoints.
Locking the CPU on max (1.67GHz) hangs the machine during a rsync/ssh copy, locking it on min (1GHz) makes the operation succeed.

Even more datapoints: I installed the latest FC6 update kernel on my RH system (which actually seems to work) and conducted my rsync-ssh test (copy a 350MB file via rsync/ssh to a remote system). Three tests were done for each kernel/frequency pair.

kernel 2.6.19-1.2911.fc6PAE:
1.0GHz: all succeed
1.66GHz: all succeed

kernel 2.6.20-1.2981.fc7PAE:
1.0GHz: all succeed
1.66GHz: all fail (machine hangs)

Not sure it's relevant, but got the following just around the time of the freeze:
Mar 9 10:28:27 localhost avahi-daemon[3135]: Invalid legacy unicast query packet.
Mar 9 10:28:27 localhost last message repeated 2 times
Mar 9 10:28:28 localhost avahi-daemon[3135]: Recieved repsonse with invalid source port 1190 on interface 'eth1.0'
Mar 9 10:28:31 localhost last message repeated 3 times
Mar 9 10:28:32 localhost avahi-daemon[3135]: Invalid legacy unicast query packet.
Mar 9 10:28:34 localhost last message repeated 3 times
Mar 9 10:28:35 localhost avahi-daemon[3135]: Recieved repsonse with invalid source port 1190 on interface 'eth1.0'
Mar 9 10:28:36 localhost avahi-daemon[3135]: Invalid legacy unicast query packet.
Mar 9 10:28:43 localhost avahi-daemon[3135]: Recieved repsonse with invalid source port 1190 on interface 'eth1.0'
Mar 9 10:29:31 localhost last message repeated 2 times
Mar 9 10:30:35 localhost avahi-daemon[3135]: Recieved repsonse with invalid source port 1190 on interface 'eth1.0'
Mar 9 10:32:43 localhost avahi-daemon[3135]: Recieved repsonse with invalid source port 1190 on interface 'eth1.0'
Mar 9 10:36:59 localhost avahi-daemon[3135]: Recieved repsonse with invalid source port 1190 on interface 'eth1.0'
Mar 9 10:38:27 localhost ntpd[2855]: synchronized to 63.200.199.38, stratum 2
Mar 9 10:44:18 localhost ntpd[2855]: synchronized to 62.112.194.64, stratum 2
Mar 9 11:15:16 localhost syslogd 1.4.2: restart.
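The syslog excerpts quoted above share one pattern: either an "NMI received for unknown reason" line or nothing at all, followed by a `syslogd ... restart.` line once the machine has been power-cycled. Below is a minimal sketch, assuming plain syslog text in the format shown in this report, that pulls those two kinds of events out of a log so freezes can be counted per kernel build. The function name `find_freeze_events` and the `SAMPLE` list are illustrative only and not part of any Fedora tool.

```python
import re

# Patterns modelled on the log lines quoted in this report.
NMI_RE = re.compile(r"NMI received for unknown reason ([0-9a-f]{2})")
RESTART_RE = re.compile(r"syslogd [\d.]+: restart\.")
# Classic syslog prefix: "Mar 8 12:40:59 host rest-of-line"
PREFIX_RE = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(\S+)\s+(.*)$")


def find_freeze_events(lines):
    """Return (timestamp, kind, detail) tuples for NMI and restart lines."""
    events = []
    for line in lines:
        m = PREFIX_RE.match(line.strip())
        if not m:
            continue  # not a syslog-formatted line
        stamp, _host, rest = m.groups()
        nmi = NMI_RE.search(rest)
        if nmi:
            events.append((stamp, "nmi", nmi.group(1)))
        elif RESTART_RE.search(rest):
            events.append((stamp, "restart", ""))
    return events


# Sample lines copied from the .2975 report above.
SAMPLE = [
    "Mar 8 12:40:02 localhost ntpd[2717]: time reset +0.187384 s",
    "Mar 8 12:40:59 localhost kernel: Uhhuh. NMI received for unknown reason b0 on CPU 0.",
    "Mar 8 12:40:59 localhost kernel: You have some hardware problem, likely on the PCI bus.",
    "Mar 8 12:48:04 localhost syslogd 1.4.2: restart.",
]

if __name__ == "__main__":
    for event in find_freeze_events(SAMPLE):
        print(event)
```

A restart event with no preceding NMI event would correspond to the "freezes without NMI" case described earlier in the thread.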
Attaching 'ethtool/ethtool -e' output. Also, changing summary text....

[root@localhost ~]# ethtool eth1
Settings for eth1:
    Supported ports: [ TP ]
    Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full
    Supports auto-negotiation: Yes
    Advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full
    Advertised auto-negotiation: Yes
    Speed: 100Mb/s
    Duplex: Full
    Port: Twisted Pair
    PHYAD: 1
    Transceiver: internal
    Auto-negotiation: on
    Supports Wake-on: umbg
    Wake-on: g
    Current message level: 0x00000007 (7)
    Link detected: yes
[root@localhost ~]#
[root@localhost ~]# ethtool -e eth1
Offset Values
------ ------
0x0000 00 16 d3 26 85 6f 30 0b b2 ff 51 00 ff ff ff ff
0x0010 53 00 03 02 6b 02 7e 20 aa 17 9a 10 86 80 df 80
0x0020 00 00 00 20 54 7e 00 00 14 00 da 00 04 00 00 07
0x0030 c9 6c 50 31 3e 07 0b 04 8b 29 00 00 00 f0 02 0f
0x0040 08 10 00 00 04 0f ff 7f 01 4d ff ff ff ff ff ff
0x0050 14 00 1d 00 14 00 1d 00 af aa 1e 00 00 00 1d 00
0x0060 00 01 00 40 1f 12 07 40 ff ff ff ff ff ff ff ff
0x0070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff a4 11
[root@localhost ~]#

Also, ethtool -i:

[root@localhost ~]# ethtool -i eth1
driver: e1000
version: 7.4.27-NAPI
firmware-version: 0.5-1
bus-info: 0000:02:00.0
[root@localhost ~]#

Non-freezing 7.3.15:

Linux lain.camperquake.de 2.6.20-1.2981.fc7PAE #1 SMP Thu Mar 8 19:35:55 EST 2007 i686 i686 i386 GNU/Linux
Settings for eth0:
    Supported ports: [ TP ]
    Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full
    Supports auto-negotiation: Yes
    Advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full
    Advertised auto-negotiation: Yes
    Speed: 100Mb/s
    Duplex: Full
    Port: Twisted Pair
    PHYAD: 0
    Transceiver: internal
    Auto-negotiation: on
    Supports Wake-on: umbg
    Wake-on: g
    Current message level: 0x0000ffff (65535)
    Link detected: yes
driver: e1000
version: 7.3.15-NAPI
firmware-version: 0.5-1
bus-info: 0000:02:00.0
Offset Values
------ ------
0x0000 00 16 d3 32 bc a6 30 0b b2 ff 51 00 ff ff ff ff
0x0010 53 00 03 02 6b 02 7e 20 aa 17 9a 10 86 80 df 80
0x0020 00 00 00 20 54 7e 00 00 14 00 da 00 04 00 00 27
0x0030 c9 6c 50 31 3e 07 0b 04 8b 29 00 00 00 f0 02 0f
0x0040 08 10 00 00 04 0f ff 7f 01 4d ff ff ff ff ff ff
0x0050 14 00 1d 00 14 00 1d 00 af aa 1e 00 00 00 1d 00
0x0060 00 01 00 40 1f 12 07 40 ff ff ff ff ff ff ff ff
0x0070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff 6d ae

Freezing 7.3.20 (kernel default):

Linux lain.camperquake.de 2.6.20-1.2981.fc7PAE #1 SMP Thu Mar 8 19:35:55 EST 2007 i686 i686 i386 GNU/Linux
Settings for eth0:
    Supported ports: [ TP ]
    Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full
    Supports auto-negotiation: Yes
    Advertised link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Full
    Advertised auto-negotiation: Yes
    Speed: 100Mb/s
    Duplex: Full
    Port: Twisted Pair
    PHYAD: 0
    Transceiver: internal
    Auto-negotiation: on
    Supports Wake-on: umbg
    Wake-on: g
    Current message level: 0x0000ffff (65535)
    Link detected: yes
driver: e1000
version: 7.3.20-k2-NAPI
firmware-version: 0.5-1
bus-info: 0000:02:00.0
Offset Values
------ ------
0x0000 00 16 d3 32 bc a6 30 0b b2 ff 51 00 ff ff ff ff
0x0010 53 00 03 02 6b 02 7e 20 aa 17 9a 10 86 80 df 80
0x0020 00 00 00 20 54 7e 00 00 14 00 da 00 04 00 00 27
0x0030 c9 6c 50 31 3e 07 0b 04 8b 29 00 00 00 f0 02 0f
0x0040 08 10 00 00 04 0f ff 7f 01 4d ff ff ff ff ff ff
0x0050 14 00 1d 00 14 00 1d 00 af aa 1e 00 00 00 1d 00
0x0060 00 01 00 40 1f 12 07 40 ff ff ff ff ff ff ff ff
0x0070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff 6d ae

Ralf, Tom:

Have you tried running with nmi_watchdog=0 ? What is the result with the test-cases? Thank you.

(In reply to comment #17)
> Ralf, Tom:
>
> Have you tried running with nmi_watchdog=0 ? What is the result with the
> test-cases? Thank you.

[root@localhost kernel]# cat nmi_watchdog
0
[root@localhost kernel]#

Don't believe I've ever changed it.....
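The `ethtool -e` dumps quoted above can be compared mechanically rather than by eye. The sketch below assumes the dump text uses the offset-plus-16-bytes row layout shown in this report. It parses two dumps, lists the byte offsets that differ, and checks the EEPROM checksum. The checksum rule used here (the 16-bit sum of the first 64 little-endian words equals 0xBABA) is my understanding of what the e1000 driver validates, so treat it as an assumption. The helper names are illustrative.

```python
def parse_dump(text):
    """Parse 'ethtool -e' rows ('0x0010 53 00 ...') into a bytes object."""
    data = bytearray()
    for row in text.strip().splitlines():
        fields = row.split()
        if not fields or not fields[0].startswith("0x"):
            continue  # skip the 'Offset Values' header and separator rows
        data.extend(int(b, 16) for b in fields[1:])
    return bytes(data)


def diff_offsets(a, b):
    """Byte offsets at which two equally sized dumps differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def checksum(data):
    """16-bit sum of the first 64 little-endian words (128 bytes)."""
    words = (data[i] | (data[i + 1] << 8) for i in range(0, 128, 2))
    return sum(words) & 0xFFFF


# eth1 dump from the original reporter's machine (copied from this report).
DUMP_A = """
0x0000 00 16 d3 26 85 6f 30 0b b2 ff 51 00 ff ff ff ff
0x0010 53 00 03 02 6b 02 7e 20 aa 17 9a 10 86 80 df 80
0x0020 00 00 00 20 54 7e 00 00 14 00 da 00 04 00 00 07
0x0030 c9 6c 50 31 3e 07 0b 04 8b 29 00 00 00 f0 02 0f
0x0040 08 10 00 00 04 0f ff 7f 01 4d ff ff ff ff ff ff
0x0050 14 00 1d 00 14 00 1d 00 af aa 1e 00 00 00 1d 00
0x0060 00 01 00 40 1f 12 07 40 ff ff ff ff ff ff ff ff
0x0070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff a4 11
"""

# eth0 dump from lain.camperquake.de (copied from this report).
DUMP_B = """
0x0000 00 16 d3 32 bc a6 30 0b b2 ff 51 00 ff ff ff ff
0x0010 53 00 03 02 6b 02 7e 20 aa 17 9a 10 86 80 df 80
0x0020 00 00 00 20 54 7e 00 00 14 00 da 00 04 00 00 27
0x0030 c9 6c 50 31 3e 07 0b 04 8b 29 00 00 00 f0 02 0f
0x0040 08 10 00 00 04 0f ff 7f 01 4d ff ff ff ff ff ff
0x0050 14 00 1d 00 14 00 1d 00 af aa 1e 00 00 00 1d 00
0x0060 00 01 00 40 1f 12 07 40 ff ff ff ff ff ff ff ff
0x0070 ff ff ff ff ff ff ff ff ff ff ff ff ff ff 6d ae
"""

if __name__ == "__main__":
    a, b = parse_dump(DUMP_A), parse_dump(DUMP_B)
    print("differing offsets:", [hex(i) for i in diff_offsets(a, b)])
    print("checksums:", hex(checksum(a)), hex(checksum(b)))
```

Apart from the MAC address bytes and the trailing checksum word, the two machines differ only in the byte at offset 0x2f. The sketch does not attempt to interpret that byte.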
Same behaviour, ThinkPad X60 (type 1706-GMG). Just got the machine, so most tests run with 2.6.20-1.2986.fc7 but some were run with 2.6.20-1.2970.2.1.fc7.jwltest.4

rsyncing large amounts of data (/home from my old laptop to this machine) will repeatedly give me (with more or less default BIOS settings, see below for work-around)
[start]
Uhhuh. NMI received for unknown reason b0.
You have some hardware problem, likely on the PCI bus.
Dazed and confused, but trying to continue
[end]
after a while (well under an hour, mostly within 10 minutes). At this stage machine is completely unresponsive, not even SysRq will work.

All latest tests I have done were in runlevel 1 where I manually brought up eth0, started sshd and had both machines connected with a crossover cable. Both machines are on AC power.

On my side I have tried:
- disabling 'PCI Bus Power Management' in the BIOS (version 2.07)
  + that did not help.
- disabling 'CPU Power Management' and setting SpeedStep on AC to 'Disabled'
  + that allowed me to push my /home through rsync over ssh (40Gigs)

Definitely not a solution, but a temporary work-around. Hopefully this also helps you in debugging this further.

I will wipe this box now and put RHEL5 on it and will open a separate BZ if it happens there too (will probably be a few days though, quite busy)

FWIW: this bug does not seem to happen on RHEL5 x86_64 (kernel 2.6.18-8.1.1.el5)
Installed it on my box yesterday, no freeze yet (at least not this one)

I continue to get freezes with 2.6.20-1.2997.fc7PAE

Seeing this consistently on a Thinkpad T60 widescreen. Oddly, I only see it while using yum. Heavy ssh + rsync and firefox hasn't triggered it at all, while yum triggered it 10 times in the past few hours for me.

Warren, can you mail <auke-jan.h.kok>, <jesse.brandeburg> or <john.ronciak> with your instructions on how to reproduce this on a T60?
As far as I know they have been unable to make a T60 hang so far (and they seem to have no X60)

In addition to Comment #19
this was with HDD firmware MBZIC60R
updated this to MBZIC65R today, but machine now installed with RHEL5, so unable to quickly test. In any case, nothing in the firmware changelog indicates that the behaviour would change with new firmware.

(side note: if you update the drive firmware and after that your grub does not get past the loading stage; you may want to know that for me entering the BIOS, setting the same options for AHCI and boot order, saving and exit solved the issue, no idea why though)

(In reply to comment #25)
> In addition to Comment #19
> this was with HDD firmware MBZIC60R
> updated this to MBZIC65R today, but machine now installed with RHEL5, so unable
> to quickly test. In any case, nothing in the firmware changelog indicates that
> the behaviour would change with new firmware.

RHEL 5 has an older kernel and likely won't exhibit this issue. I don't see the problem on anything other than rawhide kernels which are 2.6.21-rcN based.

Further testing info:
- Locking cpufreq to 1GHz does seem to avoid a kernel deadlock. 1.33GHz could not avoid the lockup while using the e1000. My BIOS is set to "Battery Optimized" when running off of battery. Lockup happened for me when I accidentally unplugged the AC power, perhaps indicating that e1000 is having trouble with simultaneous cpufreq changes?
- Lockups happen similarly with one or two cores active, so SMP doesn't seem to be an issue.

good news! we reproduced this here at Intel (finally). It appears if we're running at 1Gb speeds it is not as reproducible. :-) Once we switched to 100Mb it reproduced immediately.

Is there also a way to get people with cpu frequency changing knowledge involved?

Not sure it's relevant, but disabling 'cpuspeed' and rebooting still freezes (with azureus running).
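Several of the tests above depend on whether the CPU frequency was actually pinned while the transfer ran. Below is a minimal sketch, assuming the standard Linux cpufreq sysfs layout (`/sys/devices/system/cpu/cpuN/cpufreq/` with `scaling_min_freq`, `scaling_max_freq` and `scaling_cur_freq`, values in kHz), that reports for each CPU whether the frequency is locked, meaning min equals max. The function names and the `root` parameter are illustrative, and the attribute files may be absent on kernels built without cpufreq support.

```python
from pathlib import Path


def read_khz(path):
    """Read one cpufreq attribute (kHz); return None if missing or unreadable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def cpufreq_status(root="/sys/devices/system/cpu"):
    """Map CPU name -> dict with min/max/cur kHz and a 'locked' flag."""
    status = {}
    # 'cpu[0-9]*' matches cpu0, cpu1, ... but not cpufreq/ or cpuidle/.
    for cpu_dir in sorted(Path(root).glob("cpu[0-9]*")):
        freq_dir = cpu_dir / "cpufreq"
        if not freq_dir.is_dir():
            continue
        lo = read_khz(freq_dir / "scaling_min_freq")
        hi = read_khz(freq_dir / "scaling_max_freq")
        cur = read_khz(freq_dir / "scaling_cur_freq")
        status[cpu_dir.name] = {
            "min_khz": lo,
            "max_khz": hi,
            "cur_khz": cur,
            # Locked means the governor has no room to change frequency.
            "locked": lo is not None and lo == hi,
        }
    return status


if __name__ == "__main__":
    for cpu, info in cpufreq_status().items():
        print(cpu, info)
```

Running this before and after a test transfer would show whether both cores stayed pinned, which matters because one report above only saw the hang go away with the frequency held at the minimum.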
Indeed, it seems that preventing cpu frequency changes only reduces the incidence of the deadlock, but does not eliminate it completely.

has anyone had any luck in figuring out what version of the kernel this problem started? We are also going to look into that.

The last working kernel for me was .2925; started failing with .2932 for sure.

.2925 says "2.6.20" in changelog. (Feb 04 2007)

No longer have .2932, but the 'next' report in the changelog says: 2.6.20-git10 (Feb 14 2007)

Also, this one is 'close':
- 2.6.20-git14 (Now tickless on 32bit x86).

For people that don't know how, here's the workaround I'm using with .3040, which has been stable for almost 2 days:

sudo /sbin/service cpuspeed stop
sudo cpufreq-selector -c 0 -f 1000
sudo cpufreq-selector -c 1 -f 1000

Note, this was done on a Thinkpad T60p which is a Core Duo. If you don't have dual cores, only one call to cpufreq-selector is needed.

This did not work for me on Thinkpad X60. The only thing I changed was the frequency setting from 1000->1830. After doing the above and starting several torrent downloads I got the freeze.

Still freezes with .3054

Seeing the same problems with 3054, also on an X60.

can someone experiencing this try some -rcX kernels from 2.6.21-rcX (kernel.org kernels) to see if this goes away in any of those versions? If you don't get to it this weekend I'll be looking into it more on monday.

What git commits in linus' tree do .2925 and .2932 correspond to? Where can we get a copy of these kernels srpms to confirm this?

I still saw this on a T60p with .3045 and .3054. Those correspond to 2.6.21-rc5-git12 and 2.6.21-rc6-git2 respectively.

Not sure it's relevant, but if I have 'rhythmbox' playing when it 'freezes', the system appears to constantly repeat about .25-.5 second audio 'loop'.

(In reply to comment #40)
> Not sure it's relevant, but if I have 'rhythmbox' playing when it 'freezes', the
> system appears to constantly repeat about .25-.5 second audio 'loop'.

I've heard this too.
My guess would be that the buffer the sound card uses is no longer being updated because the kernel is hung. So you just get to hear the last thing left in the buffer over and over. I'm not a sound card expert though, so the above could be complete crack.

It is not uncommon for sound cards to work this way.

we are reproducing this reliably now with 'ssh T60 ls -laR /', with link at 100Mb/Full autonegotiated.

I am experiencing similar hard lockups with 2.6.21-rcX kernels, and found that reverting this patch:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=60cba200f11b6f90f35634c5cd608773ae3721b7
fixes the problem for me.

(In reply to comment #44)
> I am experiencing similar hard lockups with 2.6.21-rcX kernels, and found that
> reverting this patch:
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=60cba200f11b6f90f35634c5cd608773ae3721b7
> fixes the problem for me.
>

Thanks for this. I've reverted this from 7.4.35-NAPI and am now 'banging' (testing) the system with heavy bittorrent (about 400-500KByte down, 30-50KByte up), mixed with HTTPS, music download, etc. Stayed up for about 90 minutes so far....

FYI, Linus today reverted that "e1000: fix NAPI performance on 4-port adapters" patch in mainline:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=46fcc86dd71d70211e965102fb69414c90381880

Which kernel build carries this fix?

2.6.20-1.3079.fc7 or greater. Though for the fix that went into upstream, you'll want .3094 or greater I believe.

I'm having a similar problem even on the latest releases. I know the Intel guys are CC'd on this. Please have a look and let me know what you think.

System: Fedora Core 7 test 4, latest patches. Default kernel 2.6.21-1.3175.fc7 installed.

Problem: Intel e1000 network driver kernel panics either upon initial bootup or on soft reboot. Can be reproduced every time.
Hang happens when the system runs /etc/init.d/network start and not upon initial driver load. Kernel module loads successfully. Machine can be booted by using interactive mode and bypassing eth0 initialization. Even manual "ifconfig eth0 up" hangs with this error if that is run after booting the system.

Environment: Connected via new manufactured CAT6 cable to a Cisco 2950 10/100 switch, port is configured for auto/auto negotiation as is the network card.

DMESG identifier messages on driver load:
e1000: 0000:00:08.0: e1000_probe: (PCI:33MHz:32-bit) 00:0e:0c:d8:69:ab
e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection

Panic message is:
<lots of stuff I can't capture yet due to lack of serial cable>
Call Trace:
[<c042290d>] run_rebalance_domains+0x6a/0x332
[<c05a62d1>] net_rx_action+0x94/0x185
[<c042b505>] __do_softirq+0x59/0xb1
[<c04071b7>] local_bh_enable_ip+0x35/0x40
[<c042b371>] dev_open+0x44/0x62
[<c05a42c1>] dev_change_flags+0x47/0xe4
[<c05e121b>] devnet_ioctl+0x250/0x56a
[<c04e8b30>] copy_to_user+0x3c/0x50
[<c059b293>] sock_ioctl+0x1a2/0x1c1
[<c059b0f1>] sock_ioctl+0x0/0x1c1
[<c048044f>] do_ioctl+0x1f/0x62
[<c04806d6>] vfs_ioctl+0x244/0x256
[<c0480734>] sys_ioctl+0x4c/0x64
[<c0404f70>] syscall_call+0x7/0xb
[<c0600000>] wext_handle_ioctl+0x1bd/0x370
=================
Code: 00 00 00 01 00 e9 91 fe ff ff 8b 96 e4 00 00 00 bf a0 0f 00 00 e9 ec fe ff ff b8 c4 00 00 00 e8 ad 47 00 00 89 c1 e9 0a ff ff ff <0f> 0b eb fe 66 90 83 ec 0c 89 1c 24 8d 98 00 05 00 00 89 74 24
EIP: [<f8e14bea>] e1000_clean+0x33a/0x340 [e1000] SS:ESP 0068:c0762f94
Kernel panic - not syncing: Fatal exception in interrupt
===========================
Motherboard: Biostar U8668-D v7.X, latest BIOS (U8668R41, 4/7/2006)
http://www.biostar.com.tw/products/mainboard/board.php?name=U8668-D%20v7.x
CPU: Intel Pentium 4 2.8 Northwood 2.8GHz 512KB L2 Cache Socket 478 Processor
No overclocking now or ever.
Part #RK80532PE072512
Memory: Kingston 1GB 184-Pin DDR SDRAM DDR 266 (PC 2100) Part # KVR266X64C2/1G
Network Card: Intel PWLA8391GT
Using 33MHz PCI slot (only PCI card in the system)
============================
Complete DMESG output, note that eth0 is the on-board chip. eth1 is the PCI Intel gigabit card that is having the issues. I stopped it from starting the network rc script so I could capture this.

Linux version 2.6.21-1.3175.fc7 (kojibuilder.phx.redhat.com) (gcc version 4.1.2 20070502 (Red Hat 4.1.2-12)) #1 SMP Mon May 21 11:35:59 EDT 2007 BIOS-provided physical RAM map: sanitize start sanitize end copy_e820_map() start: 0000000000000000 size: 000000000009fc00 end: 000000000009fc00 type: 1 copy_e820_map() type is E820_RAM copy_e820_map() start: 000000000009fc00 size: 0000000000000400 end: 00000000000a0000 type: 2 copy_e820_map() start: 00000000000f0000 size: 0000000000010000 end: 0000000000100000 type: 2 copy_e820_map() start: 0000000000100000 size: 000000007f6f0000 end: 000000007f7f0000 type: 1 copy_e820_map() type is E820_RAM copy_e820_map() start: 000000007f7f0000 size: 0000000000003000 end: 000000007f7f3000 type: 4 copy_e820_map() start: 000000007f7f3000 size: 000000000000d000 end: 000000007f800000 type: 3 copy_e820_map() start: 00000000fec00000 size: 0000000001400000 end: 0000000100000000 type: 2 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable) BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved) BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000007f7f0000 (usable) BIOS-e820: 000000007f7f0000 - 000000007f7f3000 (ACPI NVS) BIOS-e820: 000000007f7f3000 - 000000007f800000 (ACPI data) BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved) 1143MB HIGHMEM available. 896MB LOWMEM available. 
found SMP MP-table at 000f5520 Using x86 segment limits to approximate NX protection Entering add_active_range(0, 0, 522224) 0 entries of 256 used Zone PFN ranges: DMA 0 -> 4096 Normal 4096 -> 229376 HighMem 229376 -> 522224 early_node_map[1] active PFN ranges 0: 0 -> 522224 On node 0 totalpages: 522224 DMA zone: 40 pages used for memmap DMA zone: 0 pages reserved DMA zone: 4056 pages, LIFO batch:0 Normal zone: 2200 pages used for memmap Normal zone: 223080 pages, LIFO batch:31 HighMem zone: 2859 pages used for memmap HighMem zone: 289989 pages, LIFO batch:31 DMI 2.3 present. Using APIC driver default ACPI: RSDP 000F6D50, 0014 (r0 VIAP4X) ACPI: RSDT 7F7F3000, 002C (r1 VIAP4X AWRDACPI 42302E31 AWRD 0) ACPI: FACP 7F7F3040, 0074 (r1 VIAP4X AWRDACPI 42302E31 AWRD 0) ACPI: DSDT 7F7F30C0, 54B0 (r1 VIAP4X AWRDACPI 1000 MSFT 100000D) ACPI: FACS 7F7F0000, 0040 ACPI: APIC 7F7F8580, 005C (r1 VIAP4X AWRDACPI 42302E31 AWRD 0) ACPI: PM-Timer IO Port: 0x408 ACPI: Local APIC address 0xfee00000 ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) Processor #0 15:2 APIC version 20 ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] disabled) ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0]) IOAPIC[0]: apic_id 2, version 3, address 0xfec00000, GSI 0-23 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level) ACPI: IRQ0 used by override. ACPI: IRQ2 used by override. ACPI: IRQ9 used by override. Enabling APIC mode: Flat. Using 1 I/O APICs Using ACPI (MADT) for SMP configuration information Allocating PCI resources starting at 80000000 (gap: 7f800000:7f400000) Built 1 zonelists. Total pages: 517125 Kernel command line: ro root=/dev/VolGroup00/LogVol00 mapped APIC to ffffd000 (fee00000) mapped IOAPIC to ffffc000 (fec00000) Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. 
Initializing CPU#0 CPU 0 irqstacks, hard=c0782000 soft=c0762000 PID hash table entries: 4096 (order: 12, 16384 bytes) Detected 2806.547 MHz processor. Console: colour VGA+ 80x25 Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) Memory: 2058608k/2088896k available (2079k kernel code, 29052k reserved, 1103k data, 240k init, 1171392k highmem) virtual kernel memory layout: fixmap : 0xffc56000 - 0xfffff000 (3748 kB) pkmap : 0xff800000 - 0xffc00000 (4096 kB) vmalloc : 0xf8800000 - 0xff7fe000 ( 111 MB) lowmem : 0xc0000000 - 0xf8000000 ( 896 MB) .init : 0xc0721000 - 0xc075d000 ( 240 kB) .data : 0xc0607dc7 - 0xc071bcb4 (1103 kB) .text : 0xc0400000 - 0xc0607dc7 (2079 kB) Checking if this processor honours the WP bit even in supervisor mode... Ok. Calibrating delay using timer specific routine.. 5617.43 BogoMIPS (lpj=2808716) Security Framework v1.0.0 initialized SELinux: Initializing. SELinux: Starting in permissive mode selinux_register_security: Registering secondary module capability Capability LSM initialized as secondary Mount-cache hash table entries: 512 CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000 CPU: Trace cache: 12K uops, L1 D cache: 8K CPU: L2 cache: 512K CPU: Hyper-Threading is disabled CPU: After all inits, caps: bfebf3ff 00000000 00000000 00003080 00004400 00000000 00000000 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#0. CPU0: Intel P4/Xeon Extended MCE MSRs (12) available CPU0: Thermal monitoring enabled Checking 'hlt' instruction... OK. SMP alternatives: switching to UP code Freeing SMP alternatives: 12k freed ACPI: Core revision 20070126 CPU0: Intel(R) Pentium(R) 4 CPU 2.80GHz stepping 09 Total of 1 processors activated (5617.43 BogoMIPS). 
ENABLING IO-APIC IRQs ..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1 Brought up 1 CPUs sizeof(vma)=84 bytes sizeof(page)=40 bytes sizeof(inode)=420 bytes sizeof(dentry)=144 bytes sizeof(ext3inode)=596 bytes sizeof(buffer_head)=56 bytes sizeof(skbuff)=176 bytes sizeof(task_struct)=1408 bytes Time: 23:04:44 Date: 04/22/107 NET: Registered protocol family 16 ACPI: bus type pci registered PCI: PCI BIOS revision 2.10 entry at 0xfb3a0, last bus=1 PCI: Using configuration type 1 Setting up standard PCI resources ACPI: Interpreter enabled ACPI: (supports S0 S1 S4 S5) ACPI: Using IOAPIC for interrupt routing ACPI: PCI Root Bridge [PCI0] (0000:00) PCI: Probing PCI hardware (bus 00) PCI quirk: region 0400-047f claimed by vt8235 PM PCI quirk: region 0500-050f claimed by vt8235 SMB Boot video device is 0000:01:00.0 ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 6 7 10 *11 12) ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 6 7 10 11 12) *0, disabled. ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 6 7 10 11 12) *0, disabled. ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 6 7 10 11 12) *0, disabled. ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 6 7 10 11 12) *0, disabled. ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 6 7 10 11 12) *0, disabled. ACPI: PCI Interrupt Link [LNK0] (IRQs 3 4 6 7 10 11 12) *0, disabled. ACPI: PCI Interrupt Link [LNK1] (IRQs 3 4 6 7 10 11 12) *0, disabled. ACPI: PCI Interrupt Link [ALKA] (IRQs *20), disabled. ACPI: PCI Interrupt Link [ALKB] (IRQs *21), disabled. ACPI: PCI Interrupt Link [ALKC] (IRQs *22), disabled. ACPI: PCI Interrupt Link [ALKD] (IRQs *23) Linux Plug and Play Support v0.97 (c) Adam Belay pnp: PnP ACPI init pnp: PnP ACPI: found 12 devices usbcore: registered new interface driver usbfs usbcore: registered new interface driver hub usbcore: registered new device driver usb PCI: Using ACPI for IRQ routing PCI: If a device doesn't work, try "pci=routeirq". 
If it helps, post a report NetLabel: Initializing NetLabel: domain hash size = 128 NetLabel: protocols = UNLABELED CIPSOv4 NetLabel: unlabeled traffic allowed by default pnp: 00:00: iomem range 0xcd000-0xcffff has been reserved Time: tsc clocksource has been installed. pnp: 00:00: iomem range 0xf0000-0xf7fff could not be reserved pnp: 00:00: iomem range 0xf8000-0xfbfff could not be reserved pnp: 00:00: iomem range 0xfc000-0xfffff could not be reserved pnp: 00:02: ioport range 0x400-0x47f has been reserved pnp: 00:02: ioport range 0x500-0x50f has been reserved PCI: Bridge: 0000:00:01.0 IO window: disabled. MEM window: e8000000-e9ffffff PREFETCH window: e0000000-e7ffffff PCI: Setting latency timer of device 0000:00:01.0 to 64 NET: Registered protocol family 2 IP route cache hash table entries: 32768 (order: 5, 131072 bytes) TCP established hash table entries: 131072 (order: 9, 3145728 bytes) TCP bind hash table entries: 65536 (order: 8, 1310720 bytes) TCP: Hash tables configured (established 131072 bind 65536) TCP reno registered checking if image is initramfs... it is Freeing initrd memory: 3451k freed apm: BIOS version 1.2 Flags 0x07 (Driver version 1.16ac) apm: overridden by ACPI. audit: initializing netlink socket (disabled) audit(1179875084.169:1): initialized highmem bounce pool size: 64 pages Total HugeTLB memory allocated, 0 VFS: Disk quotas dquot_6.5.1 Dquot-cache hash table entries: 1024 (order 0, 4096 bytes) SELinux: Registering netfilter hooks ksign: Installing public key data Loading keyring - Added public key 5FA80F9AE132683E - User ID: Red Hat, Inc. (Kernel Module GPG key) io scheduler noop registered io scheduler anticipatory registered io scheduler deadline registered io scheduler cfq registered (default) pci_hotplug: PCI Hot Plug PCI Core version: 0.5 ACPI: Fan [FAN] (on) ACPI Exception (processor_core-0783): AE_NOT_FOUND, Processor Device is not present [20070126] ACPI: Thermal Zone [THRM] (40 C) isapnp: Scanning for PnP cards... 
Switched to high resolution mode on CPU 0 isapnp: No Plug & Play device found Real Time Clock Driver v1.12ac Non-volatile memory driver v1.2 Linux agpgart interface v0.102 (c) Dave Jones agpgart: Detected VIA P4M266x/P4N266 chipset agpgart: AGP aperture is 4M @ 0xeb000000 Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A 00:09: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A RAMDISK driver initialized: 16 RAM disks of 16384K size 4096 blocksize input: Macintosh mouse button emulation as /class/input/input0 usbcore: registered new interface driver libusual usbcore: registered new interface driver hiddev usbcore: registered new interface driver usbhid drivers/usb/input/hid-core.c: v2.6:USB HID core driver PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12 serio: i8042 KBD port at 0x60,0x64 irq 1 serio: i8042 AUX port at 0x60,0x64 irq 12 mice: PS/2 mouse device common for all mice input: AT Translated Set 2 keyboard as /class/input/input1 TCP bic registered Initializing XFRM netlink socket NET: Registered protocol family 1 NET: Registered protocol family 17 Using IPI No-Shortcut mode Magic number: 7:600:100 drivers/rtc/hctosys.c: unable to open rtc device (rtc0) Freeing unused kernel memory: 240k freed Write protecting the kernel read-only data: 826k USB Universal Host Controller Interface driver v3.0 ohci_hcd: 2006 August 04 USB 1.1 'Open' Host Controller (OHCI) Driver SCSI subsystem initialized libata version 2.20 loaded. 
pata_via 0000:00:11.1: version 0.2.1 ACPI: PCI Interrupt Link [ALKA] disabled and referenced, BIOS bug ACPI: PCI Interrupt Link [ALKA] enabled at IRQ 20 ACPI: PCI Interrupt 0000:00:11.1[A] -> Link [ALKA] -> GSI 20 (level, low) -> IRQ 16 ata1: PATA max UDMA/133 cmd 0x000101f0 ctl 0x000103f6 bmdma 0x0001e000 irq 14 ata2: PATA max UDMA/133 cmd 0x00010170 ctl 0x00010376 bmdma 0x0001e008 irq 15 scsi0 : pata_via ata1.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968 ata1.00: ATA-6: WDC WD2000JB-00EVA0, 15.05R15, max UDMA/100 ata1.00: 390721968 sectors, multi 16: LBA48 ata1.00: ata_hpa_resize 1: sectors = 390721968, hpa_sectors = 390721968 ata1.00: configured for UDMA/100 scsi1 : pata_via input: ImPS/2 Generic Wheel Mouse as /class/input/input2 ata2.00: ATAPI, max UDMA/33 ata2.00: configured for UDMA/33 scsi 0:0:0:0: Direct-Access ATA WDC WD2000JB-00E 15.0 PQ: 0 ANSI: 5 SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB) sda: Write Protect is off sda: Mode Sense: 00 3a 00 00 SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB) sda: Write Protect is off sda: Mode Sense: 00 3a 00 00 SCSI device sda: write cache: enabled, read cache: enabled, doesn't support DPO or FUA sda: sda1 sda2 sd 0:0:0:0: Attached scsi disk sda scsi 1:0:0:0: CD-ROM TOSHIBA CD-ROM XM-6702B 1007 PQ: 0 ANSI: 5 device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: dm-devel EXT3-fs: INFO: recovery required on readonly filesystem. EXT3-fs: write access will be enabled during recovery. kjournald starting. Commit interval 5 seconds EXT3-fs: recovery complete. EXT3-fs: mounted filesystem with ordered data mode. SELinux: Disabled at runtime. 
SELinux: Unregistering netfilter hooks audit(1179875092.168:2): selinux=0 auid=4294967295 input: PC Speaker as /class/input/input3 sd 0:0:0:0: Attached scsi generic sg0 type 0 scsi 1:0:0:0: Attached scsi generic sg1 type 5 Floppy drive(s): fd0 is 1.44M FDC 0 is a post-1991 82077 via-rhine.c:v1.10-LK1.4.3 2007-03-06 Written by Donald Becker via-rhine: Broken BIOS detected, avoid_D3 enabled. ACPI: PCI Interrupt Link [ALKD] enabled at IRQ 23 ACPI: PCI Interrupt 0000:00:12.0[A] -> Link [ALKD] -> GSI 23 (level, low) -> IRQ 17 eth0: VIA Rhine II at 0xeb441000, 00:11:5b:a8:af:0a, IRQ 17. eth0: MII PHY found at address 1, status 0x7869 advertising 05e1 Link 41e1. NET: Registered protocol family 23 Intel(R) PRO/1000 Network Driver - version 7.5.5-NAPI Copyright (c) 1999-2007 Intel Corporation. ACPI: PCI Interrupt 0000:00:08.0[A] -> GSI 16 (level, low) -> IRQ 18 e1000: 0000:00:08.0: e1000_probe: (PCI:33MHz:32-bit) 00:0e:0c:d8:69:ab e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection sr0: scsi3-mmc drive: 48x/48x cd/rw xa/form2 cdda tray Uniform CD-ROM driver Revision: 3.20 sr 1:0:0:0: Attached scsi CD-ROM sr0 loop: loaded (max 8 devices) sonypi: Sony Programmable I/O Controller Driver v1.26. No dock devices found. input: Power Button (FF) as /class/input/input4 ACPI: Power Button (FF) [PWRF] input: Power Button (CM) as /class/input/input5 ACPI: Power Button (CM) [PWRB] input: Sleep Button (CM) as /class/input/input6 ACPI: Sleep Button (CM) [SLPB] ibm_acpi: ec object not found EXT3 FS on dm-0, internal journal kjournald starting. Commit interval 5 seconds EXT3 FS on sda1, internal journal EXT3-fs: mounted filesystem with ordered data mode. Adding 2064376k swap on /dev/VolGroup00/LogVol01. Priority:-1 extents:1 across: 2064376k audit(1179900320.509:3): audit_pid=1390 old=0 by auid=4294967295 it87: Found IT8705F chip at 0x290, revision 3 it87-isa 9191-0290: Detected broken BIOS defaults, disabling PWM interface eth0: link up, 100Mbps, full-duplex, lpa 0x41E1 ACPI: PCI interrupt for device 0000:00:08.0 disabled

Enabling or disabling ACPI in the BIOS does not seem to have any impact. It still locks with the same symptoms, same function calls in the trace.

In the BIOS Power Management -> IRQ/Event Activity Detect -> PCI Master was off. I turned it on and no change in behavior. Same lock.

In the BIOS Power Management -> IRQ/Event Activity Detect -> IRQs Activity Monitoring -> Primary INTR was on and I turned it off. Same lock.

I can't see any other relevant settings but I thought I'd pass this info along.

Robert, your problem doesn't seem to have anything to do with the original bug report. Please open a separate bug report.