Description of problem:
Memory corruption or a paging error occurs while the NFS connectathon test suite is running.

Version-Release number of selected component (if applicable):
2.6.18-185.el5

How reproducible:
100%

Steps to Reproduce:
1. Install the NFS connectathon testsuite
2. Edit the testsuite to only run against the Solaris 10 NFS server
3. Run the test suite

[NOTE: on the intel-piketon-02.lab.bos.redhat.com system, this issue is reproducible within _minutes_ on 185.el5 and self built 185.el5 kernels]

Actual results:
The transmitter hangs,

SELinux: initialized (dev 0:1b, type nfs), uses genfs_contexts
eth0: Detected Tx Unit Hang:
TDH <d1> TDT <d2> next_to_use <d2> next_to_clean <d0>
buffer_info[next_to_clean]: time_stamp <fffe3619> next_to_watch <d1> jiffies <fffe41ae> next_to_watch.status <0>
nfs: server sol10-nfs not responding, still trying
eth0: Detected Tx Unit Hang:
TDH <d1> TDT <d2> next_to_use <d2> next_to_clean <d0>
buffer_info[next_to_clean]: time_stamp <fffe3619> next_to_watch <d1> jiffies <fffe4959> next_to_watch.status <0>
eth0: Detected Tx Unit Hang:
TDH <d1> TDT <d2> next_to_use <d2> next_to_clean <d0>
buffer_info[next_to_clean]: time_stamp <fffe3619> next_to_watch <d1> jiffies <fffe5129> next_to_watch.status <0>
eth0: Detected Tx Unit Hang:
TDH <d1> TDT <d2> next_to_use <d2> next_to_clean <d0>
buffer_info[next_to_clean]: time_stamp <fffe3619> next_to_watch <d1> jiffies <fffe58fa> next_to_watch.status <0>
NETDEV WATCHDOG: eth0: transmit timed out

and the card is reset.
During the down, we hit this panic: NETDEV WATCHDOG: eth0: transmit timed out general protection fault: 0000 [1] SMP last sysfs file: /devices/pci0000:00/0000:00:00.0/resource CPU 0 Modules linked in: autofs4 hidp nfs fscache nfs_acl rfcomm l2cap bluetooth lockd sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 xfrm_nalgo crypto_api cpufreq_ondemand acpi_cpufreq freq_table dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac lp joydev snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc sr_mod snd_hwdep shpchp cdrom i2c_i801 e1000e parport_serial parport_pc parport sg snd i2c_core pcspkr soundcore dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 14, comm: events/0 Not tainted 2.6.18-185.el5debug #1 RIP: 0010:[<ffffffff8000d257>] [<ffffffff8000d257>] put_page+0x0/0x2e RSP: 0018:ffff810137899db8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8101361514b0 RCX: 0000000000000002 RDX: ffff810129782848 RSI: 0000000129783000 RDI: 000880484fdb0b00 RBP: 0000000000000001 R08: ffff810129784000 R09: 0000000000000000 R10: ffff810137a270b8 R11: 00000000000000f8 R12: ffff810134156e88 R13: ffff810137a270b8 R14: ffff810136758680 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffffffff8043e000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 00002aac08939000 CR3: 0000000120e33000 CR4: 00000000000006e0 Process events/0 (pid: 14, threadinfo ffff810137898000, task ffff810137896440) Stack: ffffffff80239dc4 ffff810132b95560 ffff8101361514b0 ffffc2000011a078 ffffffff8002a2e7 ffff8101338fb668 ffffffff8821c52e 0000000000000003 ffff810136758680 ffff810136758718 ffff8101076ba188 
0000000000000282
Call Trace:
[<ffffffff80239dc4>] skb_release_data+0x5f/0x99
[<ffffffff8002a2e7>] __kfree_skb+0x11/0x1a
[<ffffffff8821c52e>] :e1000e:e1000_clean_rx_ring+0xde/0x1b7
[<ffffffff8821e566>] :e1000e:e1000_reset_task+0x0/0xc
[<ffffffff8821c96e>] :e1000e:e1000e_reinit_locked+0x3d/0x50
[<ffffffff80050582>] run_workqueue+0x9a/0xf4
[<ffffffff8004cd33>] worker_thread+0x0/0x122
[<ffffffff8004ce23>] worker_thread+0xf0/0x122
[<ffffffff80091598>] default_wake_function+0x0/0xe
[<ffffffff80034d73>] kthread+0xfe/0x132
[<ffffffff80067fea>] trace_hardirqs_on_thunk+0x35/0x37
[<ffffffff80061079>] child_rip+0xa/0x11
[<ffffffff800688ed>] _spin_unlock_irq+0x24/0x27
[<ffffffff800606a8>] restore_args+0x0/0x30
[<ffffffff80034c75>] kthread+0x0/0x132
[<ffffffff8006106f>] child_rip+0x0/0x11
Code: 8b 07 f6 c4 40 74 05 e9 15 34 02 00 8b 47 08 85 c0 75 0a 0f
RIP [<ffffffff8000d257>] put_page+0x0/0x2e
RSP <ffff810137899db8>
<0>Kernel panic - not syncing: Fatal exception

Expected results:
No panic

Additional info:
We've seen different versions of this panic. It is *usually* preceded by a TX hang; in some cases, however, that doesn't appear to have happened. The key point is that the panics appear to be the same -- the e1000e device is being reset, and during the cleanup of the rx buffers a panic occurs due to the slab debug or an invalid paging request.

This has been recently reported upstream against the e1000 (NOT _e_) in the 2.6.33-rc5 kernel here:

http://marc.info/?l=linux-netdev&m=126426149306083&w=4

The panic is *very* similar to that of the e1000e. Of interest in this panic:

a) The card is being reset and the rx buffers are being cleaned:

[<c1276970>] ? e1000_clean_rx_ring+0x80/0x150
[<c127c743>] ? e1000_down+0x1b3/0x1d0
[<c127cf60>] ? e1000_reset_task+0x0/0x10
[<c127cd3b>] ? e1000_reinit_locked+0x4b/0x70

and b) it is reported against a UP kernel -- so a concurrency issue is not possible.

P.
An example in which the slab corruption detector kicked in: <transmit hang info snipped> NETDEV WATCHDOG: eth0: transmit timed out slab error in verify_redzone_free(): cache `size-2048': memory outside object was overwritten Call Trace: [<ffffffff80033633>] cache_free_debugcheck+0x106/0x217 [<ffffffff8000b94c>] kfree+0xcc/0x25d [<ffffffff8002a2e7>] __kfree_skb+0x11/0x1a [<ffffffff881f652e>] :e1000e:e1000_clean_rx_ring+0xde/0x1b7 [<ffffffff881f8566>] :e1000e:e1000_reset_task+0x0/0xc [<ffffffff881f696e>] :e1000e:e1000e_reinit_locked+0x3d/0x50 [<ffffffff80050582>] run_workqueue+0x9a/0xf4 [<ffffffff8004cd33>] worker_thread+0x0/0x122 [<ffffffff8004ce23>] worker_thread+0xf0/0x122 [<ffffffff80090ab8>] default_wake_function+0x0/0xe [<ffffffff80034d73>] kthread+0xfe/0x132 [<ffffffff80067fea>] trace_hardirqs_on_thunk+0x35/0x37 [<ffffffff80061079>] child_rip+0xa/0x11 [<ffffffff800688ed>] _spin_unlock_irq+0x24/0x27 [<ffffffff800606a8>] restore_args+0x0/0x30 [<ffffffff80034c75>] kthread+0x0/0x132 [<ffffffff8006106f>] child_rip+0x0/0x11 ffff81012fff7530: redzone 1:0x170f208e, redzone 2:0x170fc2a5. 
Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP: [<ffffffff8000d257>] put_page+0x0/0x2e PGD 121130067 PUD 121193067 PMD 0 Oops: 0000 [1] SMP last sysfs file: /devices/pci0000:00/0000:00:19.0/irq CPU 0 Modules linked in: autofs4(U) hidp(U) nfs(U) fscache(U) nfs_acl(U) rfcomm(U) l2cap(U) bluetooth(U) lockd(U) sunrpc(U) ip_conntrack_netbios_ns(U) ipt_REJECT(U) xt_state(U) ip_conntrack(U) nfnetlink(U) iptable_filter(U) ip_tables(U) ip6t_REJECT(U) xt_tcpudp(U) ip6table_filter(U) ip6_tables(U) x_tables(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) cpufreq_ondemand(U) acpi_cpufreq(U) freq_table(U) dm_multipath(U) scsi_dh(U) video(U) backlight(U) sbs(U) power_meter(U) hwmon(U) i2c_ec(U) dell_wmi(U) wmi(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) lp(U) snd_hda_intel(U) snd_seq_dummy(U) snd_seq_oss(U) snd_seq_midi_event(U) snd_seq(U) joydev(U) snd_seq_device(U) snd_pcm_oss(U) snd_mixer_oss(U) snd_pcm(U) snd_timer(U) sr_mod(U) snd_page_alloc(U) cdrom(U) parport_serial(U) snd_hwdep(U) parport_pc(U) i2c_i801(U) snd(U) e1000e(U) parport(U) shpchp(U) i2c_core(U) sg(U) soundcore(U) pcspkr(U) dm_raid45(U) dm_message(U) dm_region_hash(U) dm_mem_cache(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_log(U) dm_mod(U) ahci(U) libata(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U) Pid: 14, comm: events/0 Tainted: G 2.6.18.4 #8 RIP: 0010:[<ffffffff8000d257>] [<ffffffff8000d257>] put_page+0x0/0x2e RSP: 0018:ffff810137899db8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff810115e32a00 RCX: 0000000000000002 RDX: ffff8101317198f8 RSI: 000000013171a000 RDI: 0000000000000000 RBP: 0000000000000001 R08: ffff81013171b000 R09: 0000000000000000 R10: ffff810137a0f0b8 R11: 00000000000000f8 R12: ffff810131cc8988 R13: ffff810137a0f0b8 R14: ffff810133f7c680 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffffffff8043c000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 0000000000000000 CR3: 000000012113e000 
CR4: 00000000000006e0 Process events/0 (pid: 14, threadinfo ffff810137898000, task ffff810137896440) Process events/0 (pid: 14, threadinfo ffff810137898000, task ffff810137896440) Stack: ffffffff802392d5 ffff8101310fa458 ffff810115e32a00 ffffc2000011a938 ffffffff8002a2e7 ffff8101310fafb0 ffffffff881f652e 000000000000003b ffff810133f7c680 ffff810133f7c718 ffff8101375a4930 0000000000000282 Call Trace: [<ffffffff802392d5>] skb_release_data+0x5f/0x99 [<ffffffff8002a2e7>] __kfree_skb+0x11/0x1a [<ffffffff881f652e>] :e1000e:e1000_clean_rx_ring+0xde/0x1b7 <snip> P.
Boot with "time" argument and Paging request panic: [ 1286.693799] time_stamp <1000e2218> [ 1286.693800] next_to_watch <67> [ 1286.693802] jiffies <1000e3f49> [ 1286.693803] next_to_watch.status <0> [ 1287.291723] NETDEV WATCHDOG: eth0: transmit timed out [ 1287.378043] Unable to handle kernel paging request at ffffffffffffd786 RIP: [ 1287.382793] [<ffffffff8023925f>] skb_drop_list+0xb/0x22 [ 1287.390783] PGD 203067 PUD 1075d8067 PMD 0 [ 1287.395119] Oops: 0000 [1] SMP [ 1287.398373] last sysfs file: /devices/pci0000:00/0000:00:00.0/irq [ 1287.404641] CPU 0 [ 1287.406720] Modules linked in: autofs4(U) hidp(U) nfs(U) fscache(U) nfs_acl(U) rfcomm(U) l2cap(U) bluetooth(U) lockd(U) sunrpc(U) ip_conntrack_netbios_ns(U) ipt_REJECT(U) xt_state(U) ip_conntrack(U) nfnetlink(U) iptable_filter(U) ip_tables(U) ip6t_REJECT(U) xt_tcpudp(U) ip6table_filter(U) ip6_tables(U) x_tables(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) cpufreq_ondemand(U) acpi_cpufreq(U) freq_table(U) dm_multipath(U) scsi_dh(U) video(U) backlight(U) sbs(U) power_meter(U) hwmon(U) i2c_ec(U) dell_wmi(U) wmi(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) lp(U) joydev(U) snd_hda_intel(U) snd_seq_dummy(U) snd_seq_oss(U) snd_seq_midi_event(U) snd_seq(U) snd_seq_device(U) snd_pcm_oss(U) snd_mixer_oss(U) snd_pcm(U) shpchp(U) snd_timer(U) snd_page_alloc(U) parport_serial(U) sr_mod(U) snd_hwdep(U) i2c_i801(U) i2c_core(U) e1000e(U) parport_pc(U) parport(U) cdrom(U) snd(U) soundcore(U) sg(U) pcspkr(U) dm_raid45(U) dm_message(U) dm_region_hash(U) dm_mem_cache(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_log(U) dm_mod(U) ahci(U) libata(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U) [ 1287.508755] Pid: 14, comm: events/0 Tainted: G 2.6.18.4 #8 [ 1287.514843] RIP: 0010:[<ffffffff8023925f>] [<ffffffff8023925f>] skb_drop_list+0xb/0x22 [ 1287.523087] RSP: 0018:ffff810137899db0 EFLAGS: 00010282 [ 1287.528550] RAX: ffffffffffffd786 RBX: ffff810107788710 RCX: 0000000000000002 [ 1287.535887] 
RDX: ffff8101332ce500 RSI: 0000000115caf000 RDI: ffff810115caeff0 [ 1287.543225] RBP: 0000000000000000 R08: ffff810115cb0000 R09: 0000000000000000 [ 1287.550560] R10: ffff810137a270b8 R11: 00000000000000f8 R12: ffff8101340f3d58 [ 1287.557899] R13: ffff810137a270b8 R14: ffff81013674c680 R15: 0000000000000000 [ 1287.565236] FS: 0000000000000000(0000) GS:ffffffff8043c000(0000) knlGS:0000000000000000 [ 1287.573555] CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b [ 1287.579465] CR2: ffffffffffffd786 CR3: 00000001236ac000 CR4: 00000000000006e0 [ 1287.586804] Process events/0 (pid: 14, threadinfo ffff810137898000, task ffff810137896440) [ 1287.595302] Stack: ffff810107788710 ffffffff802392fb ffff8101332ce5b8 ffff810107788710 [ 1287.603597] ffffc2000011a190 ffffffff8002a2e7 ffff8101332ce140 ffffffff8821052e [ 1287.611249] 000000000000000a ffff81013674c680 ffff81013674c718 ffff8101375a4930 [ 1287.618717] Call Trace: [ 1287.621424] [<ffffffff802392fb>] skb_release_data+0x85/0x99 [ 1287.627250] [<ffffffff8002a2e7>] __kfree_skb+0x11/0x1a [ 1287.632640] [<ffffffff8821052e>] :e1000e:e1000_clean_rx_ring+0xde/0x1b7 [ 1287.639550] [<ffffffff88212566>] :e1000e:e1000_reset_task+0x0/0xc [ 1287.645918] [<ffffffff8821096e>] :e1000e:e1000e_reinit_locked+0x3d/0x50 [ 1287.652811] [<ffffffff80050582>] run_workqueue+0x9a/0xf4 [ 1287.658366] [<ffffffff8004cd33>] worker_thread+0x0/0x122 [ 1287.663922] [<ffffffff8004ce23>] worker_thread+0xf0/0x122 [ 1287.669569] [<ffffffff80090ab8>] default_wake_function+0x0/0xe [ 1287.675673] [<ffffffff80034d73>] kthread+0xfe/0x132 [ 1287.680782] [<ffffffff80067fea>] trace_hardirqs_on_thunk+0x35/0x37 [ 1287.687239] [<ffffffff80061079>] child_rip+0xa/0x11 [ 1287.692349] [<ffffffff800688ed>] _spin_unlock_irq+0x24/0x27 [ 1287.698170] [<ffffffff800606a8>] restore_args+0x0/0x30 [ 1287.703555] [<ffffffff80034c75>] kthread+0x0/0x132 [ 1287.708576] [<ffffffff8006106f>] child_rip+0x0/0x11 [ 1287.713687] [ 1287.715219] [ 1287.715220] Code: 48 8b 18 48 89 c7 e8 65 ff ff 
ff 48 85 db 74 05 48 89 d8 eb [ 1287.724430] RIP [<ffffffff8023925f>] skb_drop_list+0xb/0x22 [ 1287.730262] RSP <ffff810137899db0> [ 1287.733848] CR2: ffffffffffffd786 [ 1287.737257] <0>Kernel panic - not syncing: Fatal exception [ 1287.742996] P.
Panic reported by gospo, not very reproducible on e1000: Failure on 2.6.31-rc4: log moved to: '/tmp/tmp.W17147' BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff8108eaa0>] put_page+0x4/0xca PGD 0 Oops: 0000 [#1] SMP last sysfs file: /sys/devices/pci0000:ff/0000:ff:02.3/irq CPU 1 Modules linked in: nfs nfs_acl auth_rpcgss autofs4 hidp rfcomm l2cap bluetooth rfkill lockd sunrpc ipv6 cpufreq_ondemand acpi_cpufreq dm_multipath sbs sbshc battery acpi_memhotplug ac lp snd_hda_codec_intelhdmi snd_hda_codec_realtek snd_hda_intel sg snd_hda_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device sr_mod snd_pcm_oss cdrom video snd_mixer_oss rtc_cmos output rtc_core snd_pcm rtc_lib snd_timer button snd parport_serial parport_pc parport i2c_i801 e1000e soundcore i2c_core snd_page_alloc shpchp pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod ahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd [last unloaded: microcode] Pid: 0, comm: swapper Not tainted 2.6.31-rc4 #6 To be filled by O.E.M. 
RIP: 0010:[<ffffffff8108eaa0>] [<ffffffff8108eaa0>] put_page+0x4/0xca RSP: 0018:ffff880028069d60 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88012a83e480 RCX: 0000000000000011 RDX: ffff8800af130640 RSI: 00000000640000e0 RDI: 0000000000000000 RBP: 0000000000000001 R08: ffffffff81b3ce00 R09: 0000000000000002 R10: 0000000000000000 R11: ffffffff81272de9 R12: ffffc90014717af0 R13: ffff88012a83e480 R14: ffff88012f916460 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff880028066000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 0000000000000000 CR3: 0000000001001000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process swapper (pid: 0, threadinfo ffff880137b1a000, task ffff880137b19560) Stack: ffff88012a83e480 0000000000000001 ffffc90014717af0 ffffffff81249720 <0> ffff88012a83e480 ffff88012a83e480 ffff88012a83e480 ffffffff81249474 <0> ffff8800af130030 ffffffff81272ad4 0000000000000000 000000000000018f Call Trace: <IRQ> [<ffffffff81249720>] ? skb_release_data+0x65/0xaa [<ffffffff81249474>] ? __kfree_skb+0x9/0x6f [<ffffffff81272ad4>] ? ip_rcv_finish+0x3a0/0x3b0 [<ffffffffa0152ab3>] ? e1000_clean_rx_irq+0x22e/0x2cd [e1000e] [<ffffffffa01518cf>] ? e1000_clean+0x6e/0x21d [e1000e] [<ffffffff81253778>] ? net_rx_action+0xa9/0x17d [<ffffffff8104153e>] ? __do_softirq+0xc5/0x182 [<ffffffff8100ca3c>] ? call_softirq+0x1c/0x28 [<ffffffff8100ddd2>] ? do_softirq+0x2c/0x68 [<ffffffff8100d452>] ? do_IRQ+0xa0/0xb6 [<ffffffff8100c2d3>] ? ret_from_intr+0x0/0xa <EOI> [<ffffffff8100c42e>] ? apic_timer_interrupt+0xe/0x20 [<ffffffff811a852f>] ? acpi_safe_halt+0x27/0x39 [<ffffffff811a861f>] ? acpi_idle_enter_c1+0x6f/0xc7 [<ffffffff81231396>] ? ladder_select_state+0x2b/0x135 [<ffffffff812309e5>] ? cpuidle_idle_call+0x7f/0xbe [<ffffffff8100aa1d>] ? cpu_idle+0x40/0x5e
Is this problem seen with our stand-alone versions of the e1000 and e1000e drivers? Maybe a patch didn't get picked up during backport efforts. This would narrow that down. We also don't have other reports of this happening either upstream or from our stand-alone drivers.
(In reply to comment #4)
> Is this problem seen with our stand-alone versions of the e1000 and e1000e
> drivers? Maybe a patch didn't get picked up during backport efforts. This
> would narrow that down. We also don't have other reports of this happening
> either upstream or from our stand-alone drivers.

John,

This does happen upstream on the e1000.

http://marc.info/?l=linux-netdev&m=126426149306083&w=4

(Of course that is a UP kernel ... but the panic is the same)

P.
NFS: (0:18/6248) revalidation complete NFS: (0:18/6248) data cache invalidated NFS: readdir_search_pagecache() searching for cookie 1536 NFS: find_dirent_page: searching page 0 for target 1536 NFS: nfs_readdir_filler: reading cookie 0 into page 0 NFS: find_dirent: examining cookie 12 NFS: find_dirent: examining cookie 512 NFS: find_dirent_page: returns -523 NFS: readdir_search_pagecache: returns -523 NFS: uncached_readdir() searching for cookie 1536 NFS: uncached_readdir: returns -523 <e1000e tx unit hang messages> slab error in verify_redzone_free(): cache `size-2048': memory outside object was overwritten Call Trace: [<ffffffff80033633>] cache_free_debugcheck+0x106/0x217 [<ffffffff8000b94c>] kfree+0xcc/0x25d [<ffffffff8002a2e7>] __kfree_skb+0x11/0x1a [<ffffffff881d052e>] :e1000e:e1000_clean_rx_ring+0xde/0x1b7 [<ffffffff881d2566>] :e1000e:e1000_reset_task+0x0/0xc [<ffffffff881d096e>] :e1000e:e1000e_reinit_locked+0x3d/0x50 [<ffffffff80050582>] run_workqueue+0x9a/0xf4 [<ffffffff8004cd33>] worker_thread+0x0/0x122 [<ffffffff8004ce23>] worker_thread+0xf0/0x122 [<ffffffff80091598>] default_wake_function+0x0/0xe [<ffffffff80034d73>] kthread+0xfe/0x132 [<ffffffff80067fea>] trace_hardirqs_on_thunk+0x35/0x37 [<ffffffff80061079>] child_rip+0xa/0x11 [<ffffffff800688ed>] _spin_unlock_irq+0x24/0x27 [<ffffffff800606a8>] restore_args+0x0/0x30 [<ffffffff80034c75>] kthread+0x0/0x132 [<ffffffff8006106f>] child_rip+0x0/0x11 ffff81010fcb0e18: redzone 1:0x170f0000, redzone 2:0x170fc2a5. 
slab error in verify_redzone_free(): cache `size-2048': memory outside object was overwritten Call Trace: [<ffffffff80033633>] cache_free_debugcheck+0x106/0x217 [<ffffffff8000b94c>] kfree+0xcc/0x25d [<ffffffff8002a2e7>] __kfree_skb+0x11/0x1a [<ffffffff881d052e>] :e1000e:e1000_clean_rx_ring+0xde/0x1b7 [<ffffffff881d2566>] :e1000e:e1000_reset_task+0x0/0xc [<ffffffff881d096e>] :e1000e:e1000e_reinit_locked+0x3d/0x50 [<ffffffff80050582>] run_workqueue+0x9a/0xf4 [<ffffffff8004cd33>] worker_thread+0x0/0x122 [<ffffffff8004ce23>] worker_thread+0xf0/0x122 [<ffffffff80091598>] default_wake_function+0x0/0xe [<ffffffff80034d73>] kthread+0xfe/0x132 [<ffffffff80067fea>] trace_hardirqs_on_thunk+0x35/0x37 [<ffffffff80061079>] child_rip+0xa/0x11 [<ffffffff800688ed>] _spin_unlock_irq+0x24/0x27 [<ffffffff800606a8>] restore_args+0x0/0x30 [<ffffffff80034c75>] kthread+0x0/0x132 [<ffffffff8006106f>] child_rip+0x0/0x11 ffff810116f72e18: redzone 1:0x170f0000, redzone 2:0x170fc2a5. general protection fault: 0000 [1] SMP last sysfs file: /devices/pci0000:00/0000:00:00.0/irq CPU 0 Modules linked in: nfs fscache nfs_acl autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc ipv6 xfrm_nalgo crypto_api cpufreq_ondemand acpi_cpufreq freq_table dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac lp joydev snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc sr_mod parport_serial cdrom snd_hwdep snd i2c_i801 parport_pc sg shpchp e1000e i2c_core parport soundcore pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 14, comm: events/0 Not tainted 2.6.18-185.el5debug #1 RIP: 0010:[<ffffffff80239d4e>] [<ffffffff80239d4e>] skb_drop_list+0xb/0x22 RSP: 0018:ffff810137897db0 EFLAGS: 00010206 RAX: 000e0000003c0000 RBX: ffff8101191de710 RCX: 
0000000000000002 RDX: ffff81013061be00 RSI: 000000012d253000 RDI: ffff81012d252858 RBP: 0000000000000000 R08: ffff81012d254000 R09: 0000000000000000 R10: ffff810137a0f0b8 R11: 00000000000000f8 R12: ffff81012f01c8f0 R13: ffff810137a0f0b8 R14: ffff8101325e0680 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffffffff8043e000(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 0000000000439520 CR3: 0000000125d41000 CR4: 00000000000006e0 Process events/0 (pid: 14, threadinfo ffff810137896000, task ffff810137894440) Stack: ffff8101191de710 ffffffff80239dea ffff81013061be50 ffff8101191de710 ffffc200000eda00 ffffffff8002a2e7 ffff81013061b9d8 ffffffff881d052e 0000000000000040 ffff8101325e0680 ffff8101325e0718 ffff8101375a8b60 Call Trace: [<ffffffff80239dea>] skb_release_data+0x85/0x99 [<ffffffff8002a2e7>] __kfree_skb+0x11/0x1a [<ffffffff881d052e>] :e1000e:e1000_clean_rx_ring+0xde/0x1b7 [<ffffffff881d2566>] :e1000e:e1000_reset_task+0x0/0xc [<ffffffff881d096e>] :e1000e:e1000e_reinit_locked+0x3d/0x50 [<ffffffff80050582>] run_workqueue+0x9a/0xf4 [<ffffffff8004cd33>] worker_thread+0x0/0x122 [<ffffffff8004ce23>] worker_thread+0xf0/0x122 [<ffffffff80091598>] default_wake_function+0x0/0xe [<ffffffff80034d73>] kthread+0xfe/0x132 [<ffffffff80067fea>] trace_hardirqs_on_thunk+0x35/0x37 [<ffffffff80061079>] child_rip+0xa/0x11 [<ffffffff800688ed>] _spin_unlock_irq+0x24/0x27 [<ffffffff800606a8>] restore_args+0x0/0x30 [<ffffffff80034c75>] kthread+0x0/0x132 [<ffffffff8006106f>] child_rip+0x0/0x11 Code: 48 8b 18 48 89 c7 e8 65 ff ff ff 48 85 db 74 05 48 89 d8 eb RIP [<ffffffff80239d4e>] skb_drop_list+0xb/0x22 RSP <ffff810137897db0> <0>Kernel panic - not syncing: Fatal exception
In the middle of the <e1000e tx unit hang messages> I did notice one of these: nfs: server sol10-nfs not responding, still trying
Hard to understand why the NFS version would matter much here... A question: are you running the NFSv3 runs over UDP as well?
(In reply to comment #10)
> Hard to understand why the NFS version would matter much here...

If it was easy, we would have already figured it out. :)

> A question: are you running the NFSv3 runs over UDP as well?

Yes.
Ok, I got on the box and reproduced the panic: ffff810132749330: redzone 1:0x170f0000, redzone 2:0x170fc2a5. ffff81011f6b6b18: redzone 1:0x170fc2a5, redzone 2:0x82a0e000046208e. ffff810133eb0700: redzone 1:0x170fc2a5, redzone 2:0x208e0e0000aa208e. In the latter two cases, rz2 was clobbered so it doesn't seem to be *just* rz1.
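To make the rz1/rz2 comparison above repeatable across dumps, here is a minimal sketch that scans log text for the redzone lines and reports which side was overwritten. It assumes 0x170fc2a5 is the intact marker (it is the value the comment above treats as "not clobbered", and it matches RED_ACTIVE in the kernel's slab poison definitions); the function and constant names are made up for this example.

```python
import re

# Assumption: this is the intact "object in use" redzone marker.
RED_ACTIVE = 0x170FC2A5

# One slab-debug line looks like:
#   ffff810132749330: redzone 1:0x170f0000, redzone 2:0x170fc2a5.
REDZONE_LINE = re.compile(
    r"([0-9a-f]{16}): redzone 1:0x([0-9a-f]+), redzone 2:0x([0-9a-f]+)\."
)


def clobbered_redzones(log_text):
    """Return (object address, [names of overwritten redzones]) per log line."""
    results = []
    for addr, rz1, rz2 in REDZONE_LINE.findall(log_text):
        bad = []
        if int(rz1, 16) != RED_ACTIVE:
            bad.append("rz1")  # marker before the object was overwritten
        if int(rz2, 16) != RED_ACTIVE:
            bad.append("rz2")  # marker after the object was overwritten
        results.append((addr, bad))
    return results


if __name__ == "__main__":
    sample = (
        "ffff810132749330: redzone 1:0x170f0000, redzone 2:0x170fc2a5. "
        "ffff81011f6b6b18: redzone 1:0x170fc2a5, redzone 2:0x82a0e000046208e. "
        "ffff810133eb0700: redzone 1:0x170fc2a5, redzone 2:0x208e0e0000aa208e."
    )
    for addr, bad in clobbered_redzones(sample):
        print(addr, ",".join(bad) or "intact")
```

Run against the three lines above, it flags rz1 for the first object and rz2 for the other two, which is the same reading as in this comment.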
I think we need to look at why the rx unit is still receiving into buffers that the driver had freed. The tx hang debug we also need to work on, but I think the rx cleanup panic can be fixed, probably by being a lot more paranoid about making sure RX is stopped before reset continues.
Created attachment 387208 [details] make sure rx unit is stopped before reset patch applies against 2.6.33-rc5 and was load/unload tested. We don't have a repro here, so please give us feedback if this stops the panic. will also attach the dump code.
Created attachment 387209 [details] e1000e dump code

This dump code will dump registers and descriptor rings upon a tx hang. If possible, please consider running with only 80 rx and tx descriptors:

ethtool -G eth0 rx 80 tx 80

The patch is against 2.6.33-rc5.
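When comparing hang reports (with or without the dump patch), it can help to see how long the descriptor at next_to_clean has been pending. A rough sketch that pulls time_stamp and jiffies out of the "Detected Tx Unit Hang" messages quoted in the description and subtracts them; the helper names are invented for this example, and HZ=1000 is an assumption about the kernel configuration, not something stated in this bug.

```python
import re

# Pairs up the time_stamp and jiffies values of each hang report.
# re.S lets the pair span line breaks (e.g. when booted with "time").
HANG_PAIR = re.compile(
    r"time_stamp <([0-9a-f]+)>.*?jiffies <([0-9a-f]+)>", re.S
)


def stalled_jiffies(log_text):
    """Return jiffies - time_stamp for every hang report in log_text."""
    return [
        int(now, 16) - int(stamp, 16)
        for stamp, now in HANG_PAIR.findall(log_text)
    ]


def stalled_seconds(log_text, hz=1000):
    """Same as stalled_jiffies(), scaled by an assumed tick rate."""
    return [ticks / hz for ticks in stalled_jiffies(log_text)]


if __name__ == "__main__":
    sample = (
        "buffer_info[next_to_clean]: time_stamp <fffe3619> next_to_watch <d1> "
        "jiffies <fffe41ae> next_to_watch.status <0> "
        "buffer_info[next_to_clean]: time_stamp <fffe3619> next_to_watch <d1> "
        "jiffies <fffe4959> next_to_watch.status <0>"
    )
    print(stalled_jiffies(sample))  # [2965, 4928]
```

In the original report time_stamp stays at fffe3619 while jiffies keeps advancing, so the same buffer remains stuck from one report to the next.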
(In reply to comment #29)
> Created an attachment (id=387208) [details]
> make sure rx unit is stopped before reset
>
> patch applies against 2.6.33-rc5 and was load/unload tested. We don't have a
> repro here, so please give us feedback if this stops the panic.
>
> will also attach the dump code.

Jesse,

I tried both patches on a system here and I still get the same panic.

FYI,

P.
did you capture the panic? were there any messages from the driver before the crash?
(In reply to comment #35)
> did you capture the panic? were there any messages from the driver before the
> crash?

I have the panic, and will attach it shortly. Panic done with

sysctl -w kernel.printk=8

... There are no messages from the driver, and the panic is pretty much the same as the previous panics reported in this BZ.

P.
Created attachment 387411 [details] panic with Jesse's patches
For fun we tried the e1000e-1.1.2 driver from sourceforge and the system fails just like it does when using the backported e1000e driver.
I'm trying to reproduce. Do you have any non piketon (like ich9 or ich10R) systems that you can reproduce on too? My immediately accessible systems are ich9/10. Please be absolutely sure your piketon has the latest bios, they were changing the bios every day on those machines, right up to production. I'm in connectathon build hell, no libsocket, I'll probably figure it out just after sending this.
please attach dmidecode output, and any other information regarding the system you're running on. Is it a production piketon? I've got a piketon CRB in my office we used for development, installing 5.5 pre now.
Attached is my tests.init.

I've started two cthon04 runs, with two machines connected via the e1000e ports to a switch. Each machine is serving nfs; one is configured sync in exports, the other is not.

ich9:
00:19.0 Ethernet controller: Intel Corporation 82566DC-2 Gigabit Network Connection (rev 02)

ich10r:
00:19.0 Ethernet controller: Intel Corporation 82567LM-2 Gigabit Network Connection

I started cthon like so:

./server -N 1000

Apparently there are some comments in this bug that I cannot see, so if you want to fill me in, feel free. I'll run with the piketon tomorrow.
Created attachment 387475 [details] tests.init file configuration
Created attachment 387564 [details] dmidecode output from piketon-02

(In reply to comment #47)
> please attach dmidecode output, and any other information regarding the system
> you're running on. Is it a production piketon?
>

Hi Jesse -- here is the dmidecode output. AFAIK, this piketon box is a pre-production box; however, we've seen this error crop up on a few other production systems (HP z400 for example) but with much less reproducibility. The panic on those other systems is *identical* (I cannot stress that enough -- the panics are *exactly the same*) as the panic in this BZ.

More information on reproducibility to follow shortly,

P.
Jesse, given the nature of this BZ I'm going to send you the reproducer details in a private email and cc a few others as well. P.
Created attachment 387644 [details] untested first draft of my slab overrun trapper

It's untested and only works on x86_64, but this is a first draft of my slab trap patch. It appends a page to the 2048 slab, aligns the actual object to the end of the first page, and marks the second page read-only. This should, when working properly, trap the corruption of the 2048 slab when it occurs, in the context of the corruptor. I still need to validate its functionality, but this is pretty close.
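For anyone following along without reading the patch, the layout it describes can be sketched as plain address arithmetic. This is only an illustration of the idea (object flush against the end of the first page, read-only guard page right behind it); it is not the patch itself, the names are hypothetical, and a 4096-byte page is assumed.

```python
PAGE_SIZE = 4096  # assumption: x86_64 with 4 KiB pages


def guard_layout(slab_base, obj_size=2048, page_size=PAGE_SIZE):
    """Return (object start, guard page start) for a two-page trap slab."""
    if slab_base % page_size:
        raise ValueError("slab_base must be page aligned")
    if obj_size > page_size:
        raise ValueError("object must fit in the first page")
    guard_start = slab_base + page_size   # second page, marked read-only
    obj_start = guard_start - obj_size    # object ends exactly at the guard
    return obj_start, guard_start


def would_trap(slab_base, write_addr, write_len,
               obj_size=2048, page_size=PAGE_SIZE):
    """True if a write of write_len bytes at write_addr touches the guard page."""
    _, guard_start = guard_layout(slab_base, obj_size, page_size)
    guard_end = guard_start + page_size
    return write_addr < guard_end and write_addr + write_len > guard_start


if __name__ == "__main__":
    base = 0x10000
    obj, guard = guard_layout(base)
    print(hex(obj), hex(guard))
    print(would_trap(base, obj, 2048))  # write stays inside the object
    print(would_trap(base, obj, 2049))  # one byte past the end hits the guard
```

The point of this placement is that any overrun past the end of the 2048-byte object faults immediately, in the context of whoever did the write, instead of being noticed later when the object is freed.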
Created attachment 387745 [details] new version of trap patch

Here you go. I've still got to verify it, but this will let you boot. I'm going to try to do a version 2 (I think I can clean it up significantly), but we should try rolling with this for a bit to see if we catch anything.
In response to comment #71: that's odd, it's booting fine for me on hp-windsor. Try the above patch and see if anything happens.
Hmmm...still not working correctly. It now boots a bit further, but eventually hangs after trying to start up udev: --------------------[snip]------------------------ Switching to new root and running init. unmounting old /dev unmounting old /proc unmounting old /sys type=1404 audit(1264871459.791:2): enforcing=1 old_enforcing=0 auid=4294967295 ses=4294967295 BUG: warning at mm/slab.c:2333/check_irq_off() (Not tainted) Call Trace: [<ffffffff80017ff0>] cache_grow+0x463/0x4e2 [<ffffffff8005d12e>] cache_alloc_refill+0x184/0x1d4 [<ffffffff800dbdcb>] __kmalloc+0xbd/0xd7 [<ffffffff800cc14e>] __kzalloc+0x9/0x21 [<ffffffff8013861e>] security_get_bools+0x74/0x1b5 [<ffffffff80130a22>] sel_write_load+0x130/0x324 [<ffffffff80016da1>] vfs_write+0xce/0x174 [<ffffffff8001766e>] sys_write+0x45/0x6e [<ffffffff8005e116>] system_call+0x7e/0x83 BUG: warning at mm/slab.c:2333/check_irq_off() (Not tainted) Call Trace: [<ffffffff8005d074>] cache_alloc_refill+0xca/0x1d4 [<ffffffff800dbdcb>] __kmalloc+0xbd/0xd7 [<ffffffff800cc14e>] __kzalloc+0x9/0x21 [<ffffffff8013861e>] security_get_bools+0x74/0x1b5 [<ffffffff80130a22>] sel_write_load+0x130/0x324 [<ffffffff80016da1>] vfs_write+0xce/0x174 [<ffffffff8001766e>] sys_write+0x45/0x6e [<ffffffff8005e116>] system_call+0x7e/0x83 type=1403 audit(1264871460.166:3): policy loaded auid=4294967295 ses=4294967295 INIT: version 2.86 booting Welcome to Red Hat Enterprise Linux Server Press 'I' to enter interactive startup. Setting clock (localtime): Sat Jan 30 17:11:02 EST 2010 [ OK ] Starting udev: --------------------[snip]------------------------ ...it hung in that spot for >5 mins and the console stopped responding. Sysrq doesn't seem to work correctly through the serial console here, so I can't tell exactly what's wrong.
Got it to boot after a few attempts and started up the test. It took a while and never quite panic'ed in the way we expected: BUG: scheduling while atomic: swapper/0x10000100/0 Call Trace: <IRQ> [<ffffffff8006343d>] __sched_text_start+0x7d/0xbd6 [<ffffffff8008d6a7>] enqueue_task+0x41/0x56 [<ffffffff8008d712>] __activate_task+0x56/0x6d [<ffffffff8008f56b>] __cond_resched+0x1c/0x44 [<ffffffff800640cb>] cond_resched+0x37/0x42 [<ffffffff800a3794>] down_write+0x9/0x12 [<ffffffff8008363a>] change_page_attr_addr+0x43/0xf2 [<ffffffff800dc2d3>] slab_destroy+0x101/0x19f [<ffffffff800dc50c>] free_block+0x19b/0x1db [<ffffffff800dc697>] cache_flusharray+0x8b/0xba [<ffffffff8000b82c>] kfree+0x1fa/0x215 [<ffffffff80029187>] __kfree_skb+0x11/0x1a [<ffffffff8022c86c>] skb_free_datagram+0xc/0x31 [<ffffffff885425b2>] :sunrpc:xs_udp_data_ready+0x17c/0x193 [<ffffffff802297ca>] sock_queue_rcv_skb+0x15f/0x16b [<ffffffff8025dcc5>] __udp_queue_rcv_skb+0x9/0x5d [<ffffffff80053f69>] udp_queue_rcv_skb+0x283/0x2d6 [<ffffffff80053b9e>] udp_rcv+0x3e5/0x52d [<ffffffff8003540b>] ip_local_deliver+0x19d/0x263 [<ffffffff80036579>] ip_rcv+0x539/0x57c [<ffffffff80020faa>] netif_receive_skb+0x470/0x49f [<ffffffff881b139c>] :e1000e:e1000_receive_skb+0x1b5/0x1d6 [<ffffffff881b5af7>] :e1000e:e1000_clean_rx_irq+0x283/0x322 [<ffffffff881b3b8c>] :e1000e:e1000_clean+0x7c/0x29a [<ffffffff8000cbfb>] net_rx_action+0xac/0x1e0 [<ffffffff881b3a1c>] :e1000e:e1000_intr_msi+0xd6/0xe0 [<ffffffff80012762>] __do_softirq+0x89/0x133 [<ffffffff8005f2fc>] call_softirq+0x1c/0x28 [<ffffffff8006dba8>] do_softirq+0x2c/0x85 [<ffffffff8006da30>] do_IRQ+0xec/0xf5 [<ffffffff8005e615>] ret_from_intr+0x0/0xa <EOI> [<ffffffff8019d5aa>] acpi_processor_idle_simple+0x17d/0x30e [<ffffffff8019cce3>] acpi_safe_halt+0x25/0x36 [<ffffffff8019d50d>] acpi_processor_idle_simple+0xe0/0x30e [<ffffffff8019d42d>] acpi_processor_idle_simple+0x0/0x30e [<ffffffff8004a19e>] cpu_idle+0x95/0xb8 [<ffffffff800786bc>] start_secondary+0x495/0x4a4 NMI Watchdog 
detected LOCKUP on CPU 0 CPU 0 Modules linked in: nfs fscache nfs_acl autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc ipv6 xfrm_nalgo crypto_api cpufreq_ondemand acpi_cpufreq freq_table dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac lp joydev snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm i2c_i801 snd_timer snd_page_alloc snd_hwdep i2c_core snd parport_serial parport_pc parport shpchp soundcore sr_mod cdrom e1000e pcspkr sg dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 2817, comm: automount Not tainted 2.6.18-185.el5.slabtrap.1 #4 RIP: 0010:[<ffffffff80065bfc>] [<ffffffff80065bfc>] .text.lock.spinlock+0x2/0x30 RSP: 0018:ffff810125dedd90 EFLAGS: 00000086 RAX: 0000000000000000 RBX: 0000000000000246 RCX: 0000000000000000 RDX: ffff81010448ef40 RSI: 00000000000000d0 RDI: ffff81010448e680 RBP: 00000000000000d0 R08: ffff810125dec000 R09: 00000000000000d0 R10: 0000000000000000 R11: ffff810133a33860 R12: ffff81010448ef40 R13: ffff81010448e640 R14: 0000000000000004 R15: ffff810104499380 FS: 00000000415c9940(0063) GS:ffffffff803ca000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000003a9989a3a0 CR3: 00000001287c4000 CR4: 00000000000006e0 Process automount (pid: 2817, threadinfo ffff810125dec000, task ffff810133a33860) Stack: ffffffff8005d01d 000000d000000010 0000000000000246 00000000000000d0 ffff810104499380 ffffffff8004a7ee 00000000424a49d0 0000000000010800 ffffffff8000ae35 ffff8101349d79d8 ffffffff80316e70 0000000000000002 Call Trace: [<ffffffff8005d01d>] cache_alloc_refill+0x73/0x1d4 [<ffffffff8004a7ee>] audit_alloc+0x70/0x123 [<ffffffff8000ae35>] kmem_cache_alloc+0x86/0xa3 [<ffffffff8004a7ee>] audit_alloc+0x70/0x123 [<ffffffff8012dbbf>] 
selinux_task_alloc_security+0x1e/0x55 [<ffffffff8001fab1>] copy_process+0x562/0x15eb [<ffffffff8009f459>] alloc_pid+0x1f6/0x292 [<ffffffff80031c00>] do_fork+0x69/0x1c1 [<ffffffff8005e28d>] tracesys+0xd5/0xe0 [<ffffffff8005e427>] ptregscall_common+0x67/0xac Code: 83 3f 00 7e f9 e9 7f fe ff ff f3 90 83 3f 00 7e f9 e9 f9 fe Kernel panic - not syncing: nmi watchdog BUG: warning at kernel/panic.c:137/panic() (Not tainted) Call Trace: <NMI> [<ffffffff80092913>] panic+0x1da/0x1eb [<ffffffff8006caef>] _show_stack+0xdb/0xea [<ffffffff8006cbe2>] show_registers+0xe4/0x100 [<ffffffff800662c5>] die_nmi+0x66/0xa3 [<ffffffff80066a0b>] nmi_watchdog_tick+0x157/0x1d3 [<ffffffff80066629>] default_do_nmi+0x81/0x225 [<ffffffff80066896>] do_nmi+0x43/0x61 [<ffffffff80065eef>] nmi+0x7f/0x88 [<ffffffff80065bfc>] .text.lock.spinlock+0x2/0x30 <<EOE>> [<ffffffff8005d01d>] cache_alloc_refill+0x73/0x1d4 [<ffffffff8004a7ee>] audit_alloc+0x70/0x123 [<ffffffff8000ae35>] kmem_cache_alloc+0x86/0xa3 [<ffffffff8004a7ee>] audit_alloc+0x70/0x123 [<ffffffff8012dbbf>] selinux_task_alloc_security+0x1e/0x55 [<ffffffff8001fab1>] copy_process+0x562/0x15eb [<ffffffff8009f459>] alloc_pid+0x1f6/0x292 [<ffffffff80031c00>] do_fork+0x69/0x1c1 [<ffffffff8005e28d>] tracesys+0xd5/0xe0 [<ffffffff8005e427>] ptregscall_common+0x67/0xac
Yeah, this is confusing to me. I'm actually growing a bit concerned about this check_irq_off bug. I started from scratch and made a better implementation of the guard page patch, and despite it being significantly different, I still get the same check_irq_off warning. I'm starting to wonder if we have an unbalanced local_irq_enable call somewhere that is allowing a double dequeue of an object, which leads to this sort of corruption. Since I can trip the check_irq_off warning here on my development machine, I'm further instrumenting to track hardirq disables/enables and report who the last enabler was. If I am tripping over another problem, that should reveal it. I'll post here when I have something.
Created attachment 387872 [details] cleaner version of slab trap Ok, here's a cleaned up implementation of the slab trap patch. It turns out the page protection altering code did some locking that seemed to be messing up the slab debug checking; that's fixed now. Some notes: 1) This version of the patch requires that CONFIG_DEBUG_SLAB be enabled 2) It explicitly disables Red Zone checking for the slabs. With this patch, ideally that shouldn't matter, and it solves a number of problems with alignment when adding a guard page. I've tested this and verified that it catches overwrites beyond 2048 bytes (the object is set 1/2 page into the first of two pages, so any overwrite is trapped). Jeff, not sure if you're still working on this over the weekend now, but if you have time to run it through your test that would be great. Otherwise I'll take it over Monday.
Created attachment 388024 [details] enhancement to the slab trap patch Same patch as before, but enhanced. It writes 0xcc to the first 128 bytes of the guard page before write protecting it. This should catch any DMA overruns on it.
Created attachment 388175 [details] patch to write protect all data pages for all skbs
Created attachment 388317 [details] client side nfs capture w/ tcpdump So, with my most recent slab trap patch, I'm not getting crashes, but part of the problem is certainly reproducing, in that the NIC hangs up and needs to be ifdowned/ifuped to restore function. Since we're seeing part of the problem, I did a client side tcpdump capture during the reproducer, attached here. Not sure what to make of it yet, but even during the NIC hang, we seem to receive frames. This suggests that the card is having a problem with transmission rather than receive (which might explain why my write protection changes on the receive side aren't catching anything, I'm really not sure). Regardless, I note that the sol-10 server seems to be sending _lots_ of duplicate replies to frames we received many, many frames ago. That shouldn't be problematic, but it's definitely something to look at and explain.
I'm now running against a Solaris 10 x86 nfs server, and I ran through 941 iterations of ./server -N 10000 before getting the message: ===== check for lost reply on non-idempotent requests 100 tries test exclusive create. test negative seek, you should get: read: Invalid argument or lseek: Invalid argument lseek: Invalid argument test rename test truncate truncate succeeded test holey file support Holey file test ok second check for lost reply on non-idempotent requests testing 50 idempotencies in directory "testdir" rmdir 1: Directory not empty special tests failed Tests failed, leaving /mnt/sol-dell1.jf.intel.com mounted ===== as it says above the directory is full of files. I still have not seen any slab corruption reported, could the network that this machine is plugged into have jumbo frame traffic on it? I'm wondering if this is somehow related to the LPE (long packet enable) problems we've had at redhat on some of the other piketon systems (that we cannot reproduce) Neil, will you make sure that your e1000e driver is running in non-jumbo mode with this patch applied? commit 6a9a79019524ed1fd4e2036e947b93077b2d7aff Author: Jesse Brandeburg <jesse.brandeburg> Date: Tue Feb 2 09:42:49 2010 -0800 e1000e: don't use LPE to limit buffer size Signed-off-by: Jesse Brandeburg <jesse.brandeburg> diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c index a042aa0..61a511a 100644 --- a/drivers/net/e1000e/netdev.c +++ b/drivers/net/e1000e/netdev.c @@ -4639,12 +4639,6 @@ static int e1000_change_mtu(struct net_device *netdev, int new_mtu) else adapter->rx_buffer_len = 4096; - /* adjust allocation if LPE protects us, and we aren't using SBP */ - if ((max_frame == ETH_FRAME_LEN + ETH_FCS_LEN) || - (max_frame == ETH_FRAME_LEN + VLAN_HLEN + ETH_FCS_LEN)) - adapter->rx_buffer_len = ETH_FRAME_LEN + VLAN_HLEN - + ETH_FCS_LEN; - if (netif_running(netdev)) e1000e_up(adapter); else
(In reply to comment #89) > > Neil, will you make sure that your e1000e driver is running in non-jumbo mode > with this patch applied? > > commit 6a9a79019524ed1fd4e2036e947b93077b2d7aff > Author: Jesse Brandeburg <jesse.brandeburg> > Date: Tue Feb 2 09:42:49 2010 -0800 > > e1000e: don't use LPE to limit buffer size > > Signed-off-by: Jesse Brandeburg <jesse.brandeburg> > > diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c > index a042aa0..61a511a 100644 > --- a/drivers/net/e1000e/netdev.c > +++ b/drivers/net/e1000e/netdev.c > @@ -4639,12 +4639,6 @@ static int e1000_change_mtu(struct net_device *netdev, > int new_mtu) > else > adapter->rx_buffer_len = 4096; > > - /* adjust allocation if LPE protects us, and we aren't using SBP */ > - if ((max_frame == ETH_FRAME_LEN + ETH_FCS_LEN) || > - (max_frame == ETH_FRAME_LEN + VLAN_HLEN + ETH_FCS_LEN)) > - adapter->rx_buffer_len = ETH_FRAME_LEN + VLAN_HLEN > - + ETH_FCS_LEN; > - > if (netif_running(netdev)) > e1000e_up(adapter); > else I can't say for sure if there are any jumbo frames on the network, but we are not running that patch. The bits are still in place.
(In reply to comment #90) > (In reply to comment #89) > > > > Neil, will you make sure that your e1000e driver is running in non-jumbo mode > > with this patch applied? > > > > commit 6a9a79019524ed1fd4e2036e947b93077b2d7aff > > Author: Jesse Brandeburg <jesse.brandeburg> > > Date: Tue Feb 2 09:42:49 2010 -0800 > > > > e1000e: don't use LPE to limit buffer size > > > > Signed-off-by: Jesse Brandeburg <jesse.brandeburg> > > > > diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c > > index a042aa0..61a511a 100644 > > --- a/drivers/net/e1000e/netdev.c > > +++ b/drivers/net/e1000e/netdev.c <snip> > I can't say for sure if there are any jumbo frames on the network, but we are > not running that patch. The bits are still in place. Removing it now and testing ... fyi. P.
> <snip> > > > I can't say for sure if there are any jumbo frames on the network, but we are > > not running that patch. The bits are still in place. > > Removing it now and testing ... fyi. > > P. Kernel still panics (in exactly the same way as previously noted). P.
Did some more testing today. Tried some page protection strategies on the actual data buffer for the rx skb, and got the "server not responding" messages, but never trapped any corruption. Tried poisoning some slack space in the data buffer to catch scribbling by a DMA transfer and never detected any of that either. Gospo had an interesting theory that it was just the skb_shared_info that was getting corrupted previously, which would fit with previous observations as well (given the corruption of rz2). So we're trying some approaches with just that area getting moved away.
Created attachment 388391 [details] interesting-skb-data-alignment.patch While the patch Neil referenced sounds like a great idea -- something about it doesn't work. I'll have to look at it tomorrow. It is attached for those that are curious.
To clarify, the original patch I posted in comment 85 works fine: it properly write protects the data page and places the skb_shared_info on the writeable second page, it just doesn't catch any corruption. It's only the variant above that doesn't boot. I _think_ it's because we're adding a guard page between the data buffer and the skb_shared_info struct. A 3 page allocation will require that we alloc from the order 2 slab, which is 16k of contiguous data. That's pretty hard to get in normal conditions, and if netlink or some other early use protocol is allocating skbs during boot, we could run out of available 16k chunks quickly. Given that my patch in comment 85 works, but doesn't trap any writes or detect any corruption via DMA (if I use poisoning on the slack space), I think we can fix this by reducing the allocation to 2 pages, and pushing the data buffer size such that the end pointer of the skb is PAGE_SIZE-sizeof(skb_shared_info) from the end of the page. That way if someone is only corrupting skb_shared_info (as the patch in comment 94 hypothesizes), we can use the second of 2 pages as a guard page here and achieve the same goal.
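The allocation arithmetic in the comment above can be checked with a small userspace sketch. It assumes 4 KiB pages; `SKETCH_PAGE_SIZE`, `order_for` and `data_end_offset` are hypothetical names made up for illustration, not kernel symbols, and the layout follows one reading of the proposal (skb_shared_info ending exactly at the first page boundary, so the second page can act as the guard page).

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages, as on the x86_64 box in this report. */
#define SKETCH_PAGE_SIZE 4096UL

/* Smallest order such that (page size << order) covers `bytes`.
 * Power-of-two allocators round a request up to this size. */
static unsigned int order_for(size_t bytes)
{
    unsigned int order = 0;

    while ((SKETCH_PAGE_SIZE << order) < bytes)
        order++;
    return order;
}

/* Offset inside the first page where the data buffer would have to end
 * so that a trailing structure of `shinfo_size` bytes finishes exactly
 * at the page boundary. */
static size_t data_end_offset(size_t shinfo_size)
{
    return SKETCH_PAGE_SIZE - shinfo_size;
}
```

With these helpers, a 3 page request rounds up to order 2 (16k contiguous), while a 2 page request stays at order 1 (8k), which is the difference the comment relies on.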
So, I just gave the above a shot, and got similar results. I observed tx-side hangs, and an eventual complete hang of the NIC, which resulted in loss of connectivity. An ifdown/ifup of the interface brought it back to life. Given that I'm seeing dumps of the tx ring during these operations, I strongly feel we need to start investigating the tx path in the driver. I'm specifically wondering if perhaps the nfs code is sending down a packet with a data field that is causing the tx logic in the NIC to fall over and create unpredictable scenarios. I'm going to further instrument the tx path tomorrow.
I tried to do some isolated redzone checking today. I reproduced the issue on a stock -185 kernel, then added a private redzone variable at the end of the skb_shared_info structure, then checked it after we get a tx complete interrupt. As soon as I added the redzone variable, I stopped oopsing the system. I still got the tx non-responsiveness, but no oops. Need to investigate this further, checking the private redzone on free.
Created attachment 389102 [details] debug patch 2 -- add routine to check redzone integrity in slabcache This is a later iteration of the patch I was using to check the validity of the slabcaches in the RPC code. I just sprinkled some of these calls around the RPC code: objp = verify_kmalloc_cache(2048, GFP_KERNEL); ...and did some other printk's if that returned a non-NULL, to indicate when the corruption was detected. This version also does a hexdump of 256 bytes around rz2 (which is where the corruption was almost always detected). This patch bails out when it detects the first corrupted object, but it might be interesting too to keep on scanning and see whether there might be contiguous corrupted objects.
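For readers who have not looked at slab redzoning, here is a minimal userspace sketch of the idea behind the check described above. It is not the attached patch: `rz_alloc`, `rz_check`, `rz_free` and the `RZ_MAGIC` value are hypothetical names chosen for illustration, and the layout is simply a marker word placed directly before (rz1) and directly after (rz2) the payload.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define RZ_MAGIC 0x170FC2A5u /* illustrative marker value */

/* Layout: [rz1][payload of `size` bytes][rz2]. Returns the payload. */
static unsigned char *rz_alloc(size_t size)
{
    uint32_t magic = RZ_MAGIC;
    unsigned char *base = malloc(size + 2 * sizeof(magic));

    if (!base)
        return NULL;
    memcpy(base, &magic, sizeof(magic));                        /* rz1 */
    memcpy(base + sizeof(magic) + size, &magic, sizeof(magic)); /* rz2 */
    return base + sizeof(magic);
}

/* Returns 0 if both markers are intact, 1 if rz1 is damaged,
 * 2 if rz1 is intact but rz2 is damaged. The offsets must match
 * rz_alloc() exactly; being off by one byte gives false results. */
static int rz_check(const unsigned char *obj, size_t size)
{
    uint32_t magic = RZ_MAGIC;

    if (memcmp(obj - sizeof(magic), &magic, sizeof(magic)) != 0)
        return 1;
    if (memcmp(obj + size, &magic, sizeof(magic)) != 0)
        return 2;
    return 0;
}

static void rz_free(unsigned char *obj)
{
    free(obj - sizeof(uint32_t));
}
```

An overrun of even one byte past a 2048 byte payload lands in rz2, which is the marker the thread reports as almost always being the damaged one.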
Created attachment 389110 [details] debug patch 3 -- more comprehensive patch -- scan and dump all corrupt objects The other patch had a bug: if it detected problems around rz2, it didn't dump_mem the data around that area, but rather at the head of the allocation. This should fix that, and also change the patch to dump out info on any corrupt redzone in the slabcache. Note: not even compile tested. If it's broken, let me know (or send me a patch ;).
So, I decided to take a more layton-esque approach this weekend, given that our page protection approaches, while functional, modified the system enough to prevent these oopses from occurring. Instead I did some lightweight instrumentation that validated the affected redzone in multiple places. I've trapped the violation in netif_receive_skb, which is very early in the receive path, so that's good. The odd part is that I did some checking earlier in the driver that did not trigger, and I also put in some printks that should always trigger but do not, even though the function they're in appears in the backtrace when we do catch the violation. It's causing me to question my understanding of the rx path.
Interesting results today. Based on my last comments, I sprinkled BUG() traps throughout the code in the e1000e driver's rx path to ensure I would hit one of them, but when I tested I didn't. Yet, I still managed to hit a BUG() trap in netif_receive_skb(). So I tried booting with eth0 set to ONBOOT=no, and I _still_ managed to trap on the BUG() halt (with e1000e functions on the stack, perhaps erroneously, despite its being down). I'm beginning to wonder whether the common code is the loopback driver, or something that passes data through the loopback driver. I need to test with lo disabled to see if that avoids the problem. I hope to have results tomorrow. If that's the case, I'll continue by checking SKBs there in the loopback xmit path.
Yeah, so I configured eth0 to not come up during boot, and added some redzone checking to the loopback tx/rx routine, loopback_xmit. I trapped several bad frames there during boot that originated from rpc.statd (not sure if the app has any relevance). I validated that the e1000e driver wasn't loaded at all. It looks like there might be some sort of overrun on frames going through loopback. I'm going to instrument back up the stack that I get from these.
Sigh, bad news. After digging through the redzone code, I finally found that my check was off by a byte, hence the xmit failures. Going back to instrumenting the rx path...
Created attachment 389887 [details] my current debugging patch Ok, so good news, bad news. Bad news is that the bug in my patch was causing the erroneous results above. I was using the wrong offset when checking redzones. With that bug fixed, we're back to the original behavior. The good news is that this patch is working properly now. With it I managed to get this backtrace: Kernel BUG at drivers/net/e1000e/netdev.c:219 invalid opcode: 0000 [1] SMP last sysfs file: /devices/pci0000:00/0000:00:00.0/irq CPU 3 Modules linked in: nfs(U) fscache(U) nfs_acl(U) autofs4(U) hidp(U) rfcomm(U) l2cap(U) bluetooth(U) lockd(U) sunrpc(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) cpufreq_ondemand(U) acpi_cpufreq(U) freq_table(U) dm_multipath(U) scsi_dh(U) video(U) backlight(U) sbs(U) power_meter(U) hwmon(U) i2c_ec(U) dell_wmi(U) wmi(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) lp(U) joydev(U) snd_hda_intel(U) snd_seq_dummy(U) snd_seq_oss(U) snd_seq_midi_event(U) snd_seq(U) snd_seq_device(U) snd_pcm_oss(U) snd_mixer_oss(U) snd_pcm(U) snd_timer(U) snd_page_alloc(U) snd_hwdep(U) parport_serial(U) parport_pc(U) snd(U) soundcore(U) parport(U) sr_mod(U) cdrom(U) e1000e(U) i2c_i801(U) i2c_core(U) shpchp(U) sg(U) pcspkr(U) dm_raid45(U) dm_message(U) dm_region_hash(U) dm_mem_cache(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_log(U) dm_mod(U) ahci(U) libata(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U) Pid: 0, comm: swapper Tainted: G 2.6.18-prep #2 RIP: 0010:[<ffffffff881e6745>] [<ffffffff881e6745>] :e1000e:e1000_alloc_rx_buffers+0x20e/0x278 RSP: 0018:ffff81010777fdc0 EFLAGS: 00010287 RAX: 0000000000000000 RBX: ffffc200000eee50 RCX: 0000000000000026 RDX: ffff8101072c7da0 RSI: ffff81013214f1f0 RDI: ffff8101075e8340 RBP: ffff810130844c20 R08: ffff810132150000 R09: 0000000000000001 R10: 0000000000000046 R11: ffffffff800c5604 R12: 00000000000000c2 R13: ffff810136040680 R14: ffff810133fcd1d0 R15: ffff810137a650b8 FS: 0000000000000000(0000)
GS:ffff8101075f7c28(0000) knlGS:0000000000000000 CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b CR2: 0000003a9987a7d0 CR3: 0000000124cc4000 CR4: 00000000000006e0 Process swapper (pid: 0, threadinfo ffff81010777a000, task ffff810107778400) Stack: 000000018008fcbd ffff810136040000 000005f40777fee8 a210724b06c10800 00000000170f9b0f 0000000000000001 ffff810136040680 ffff810130844a00 ffffc200000eeea0 ffff8101315f4c40 ffffffff80468340 ffffffff881e6c7d Call Trace: <IRQ> [<ffffffff881e6c7d>] :e1000e:e1000_clean_rx_irq+0x2e8/0x322 [<ffffffff881e4bfb>] :e1000e:e1000_clean+0x7c/0x2b3 [<ffffffff8000d111>] net_rx_action+0xb6/0x1fc [<ffffffff80012eb6>] __do_softirq+0x94/0x152 [<ffffffff80070c49>] do_softirq+0x35/0xa0 [<ffffffff80070c0b>] do_IRQ+0xfb/0x104 [<ffffffff80060652>] ret_from_intr+0x0/0xf <EOI> [<ffffffff80065ff3>] __sched_text_start+0xc03/0xc3e [<ffffffff801aaa1e>] acpi_processor_idle_simple+0x193/0x324 [<ffffffff801aaa14>] acpi_processor_idle_simple+0x189/0x324 [<ffffffff801aa88b>] acpi_processor_idle_simple+0x0/0x324 [<ffffffff801aa88b>] acpi_processor_idle_simple+0x0/0x324 [<ffffffff8004c06a>] cpu_idle+0x9a/0xbd [<ffffffff8007b8e9>] start_secondary+0x498/0x4a7 I've also seen a failed check on receive in the driver which I didn't record. That tells me that the error is independent of whether or not the skb is on the rx ring, which in turn supports the theory that a cpu based access is causing this corruption. Based on this I have an idea: Given that, with redzoning, size-2048 slabs occupy 1 page per object, I can simply write protect that single page in the following conditions: 1) between the time the skb is placed on the rx ring and the time it is removed by a napi poll 2) when it's passed to kfree, until it's recalled with a kmalloc, or a slab_destroy_object. Since the two conditions above constitute periods when the cpu should not ever write to an object, we should be safe.
As long as the corruptor doesn't move due to timing (it won't move due to size changes, as we observed before, since we're not changing the object size), with any luck we can trap the corruptor when it happens. I'll start work in the morning.
Created attachment 390115 [details] today's update Update notes for today are in this text file. I drew some ASCII art that doesn't display well in Firefox, so I figured I'd just upload it here. Interesting stuff...
(In reply to comment #107) > Created an attachment (id=390115) [details] > todays update > > Update notes for today are in this text file. I drew some ascii art that > doesn't display well in firefox, so I figured I'd just upload it here. > interesting stuff.... Interesting, Neil. Have we seen tx-timeouts when not using the debug kernel and our standard reproducer?
I'm not sure, I'll try that. What's also interesting is that when I added the 'dispose page-spanning skbs' bits to the e1000e tx path, it began corrupting stuff almost immediately in exactly the same way (and _very_ quickly). Not sure if I did something wrong, but I'm going to follow up on that too.
Notes: Last night I realized that this box didn't have an AMD iommu in it (not sure why I ever thought it did). But I added some tracing code and found that it always used the swiotlb library for DMA transfers. Looking at that library's map_single method, I found that we were falling into a clause in which we just tried direct memory-to-memory DMAs, rather than bounce buffering, because the e1000e card's dma_mask was set to 64 bits. Just for curiosity's sake, I set the mask to 32 bits, and the system has been running all night (the card stopped sending/receiving data after a while, but there have been no hangs as of yet). Now, it makes perfectly good sense for this card to have a 64 bit dma mask, so I don't see why this should do anything to prevent the problem, but I thought it was worth noting.
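The direct-versus-bounce decision described above comes down to whether the buffer's physical address fits under the device's DMA mask. The following is a simplified userspace sketch of that test, not the swiotlb source; `needs_bounce` and the two mask macros are hypothetical names, and address wraparound is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_DMA_MASK_64 0xffffffffffffffffULL
#define SKETCH_DMA_MASK_32 0x00000000ffffffffULL

/* Returns 1 when the last byte of the buffer lies above what the device
 * can address, so a bounce buffer in low memory would be needed;
 * returns 0 when the device can DMA to the buffer directly. */
static int needs_bounce(uint64_t phys, size_t len, uint64_t dma_mask)
{
    uint64_t last;

    if (len == 0)
        return 0;
    last = phys + (len - 1);
    return (last & ~dma_mask) != 0;
}
```

With a 64 bit mask no address ever needs bouncing, which matches the observation that the card was doing direct DMA; dropping the mask to 32 bits forces every buffer above 4GB through a bounce buffer.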
Does this system end up with memory addresses > 4GB? We have seen the occasional issue with previous e1000 hardware in client systems with > 4GB RAM. I wonder if something in the BIOS or the swiotlb code is off. Sometimes the system has 4GB RAM, but the BIOS has to do strange things to remap some of the memory in use by video adapters and io ranges. Also, does the driver you're running have the skb_dma_map stuff still in it? That code was removed upstream because it was buggy (multiple mappings of a single tx buffer did bad things). Also, what happens if you turn on the hardware MMU (VT-d)/DMAR? Do you get a different fault? PLEASE GET A RECENT BIOS!! From the dmidecode: + Release Date: 08/13/2009 It should probably be November or newer.
In response to your questions, Jesse: Yes, the system has > 4GB of RAM and many of the skbs we allocate wind up above the 4G limit. In fact I had noticed that if I set the dma_mask of the adapter to 32 bits, the memory corruption problem goes away. Tomorrow on my list of things to try is booting with mem=4G too. I had also thought that the swiotlb library might have a problem, and limiting RAM to 4GB should effectively make it a pass through operation. We no longer have the skb_dma_map bits in e1000 or e1000e (I will double check this though, as I only looked very quickly into this question). I had not thought of disabling the DMAR. I assume you can do that in the BIOS? I will look into that at the same time that I look into disabling the BMC. Regarding the dmidecode information, we should have a later BIOS than that, but IIRC it was dated October, not November; we'll look into a later BIOS. I know you're out the rest of the week, but I'll update with results of the outstanding questions here as soon as I'm able, in case you're keeping tabs while you're out.
Quick note: just tried booting with mem=4G, still failed.
Updated BIOS to the 10.29.09 version, disabled ME, failures still occur.
Please try mem = 3G too
I've gone all the way down to mem=2G today, with no change in behavior, unfortunately. It did occur to me however that the BIOS might be mapping RAM oddly in such a way that low virtual addresses are still located in physical locations above 4GB, so I'm currently hacking up the network stack so that, in addition to setting the device dma_mask to be 32 bits (for transmit bounce buffering), calls to netdev_alloc_skb pass GFP_DMA as an allocation flag, which should guarantee low memory allocations (I know I should use GFP_DMA32, but that flag was very new with the 2.6.18 kernel and doesn't work quite right with the slab allocation code). GFP_DMA should be overkill, and ok for the narrow reproducer we have. I'll post results shortly.
Got some time on Jesse B.'s piketon box today. Set up the reproducer there and I think I was able to make the problem reproduce. Asking Jesse to look at it more closely to confirm.
I made some good progress on this bug today. I made a debug driver which shows that the driver is processing !EOP packets (which should never happen on a network with this MTU) with length 0x800, and with the copybreak code disabled too. This explains the memory corruption of the skb_shinfo(skb) area, I think. Next I will prepare a driver that prints the bad skbs, and allocates a much larger area to DMA into. Neil, this sounds much like the other problem we had on the Z200 system where you were getting longer than expected DMA (aka the LPE bit didn't work right). I'm suspecting that the NIC hardware is getting confused by something in the packets coming in from the NFS server. The fact that we are getting 0x800 length packets without EOP set means there is likely a hardware bug.
Jesse, I agree completely. What you describe is definitely similar to what we saw on the z200 systems, and explains why bumping up the rx_buf_size in the driver avoided the problem at hand. Just so that I can be sure I understand: you're seeing packets arrive at the driver with a length of 2048 bytes, but no EOP bit set, yes? Can you send me the patch that you're currently working with? I'd like to put that in place here to see if I can get you any additional info that might be helpful in diagnosing the problem. If you have any suggestions/requests for stuff that you'd be interested in seeing, I'll get on putting it together.
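As a rough illustration of the condition being reported, the sketch below flags a completed receive descriptor whose end-of-packet bit is clear. It is a userspace model, not the debug driver: `struct rx_desc_sketch` and `is_unexpected_fragment` are hypothetical, and the two status bit values are local definitions assumed to mirror the usual e1000 DD/EOP encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed status bits: descriptor done, and end of packet. */
#define SKETCH_RXD_STAT_DD  0x01
#define SKETCH_RXD_STAT_EOP 0x02

/* Simplified stand-in for a receive descriptor. */
struct rx_desc_sketch {
    uint16_t length; /* bytes the hardware wrote into this buffer */
    uint8_t status;  /* SKETCH_RXD_STAT_* bits */
};

/* Returns 1 when the hardware has finished with the descriptor but
 * says the frame continues in the next buffer. With a standard MTU
 * every frame should fit in one buffer, so this should never fire. */
static int is_unexpected_fragment(const struct rx_desc_sketch *d)
{
    return (d->status & SKETCH_RXD_STAT_DD) &&
           !(d->status & SKETCH_RXD_STAT_EOP);
}
```

A descriptor with length 0x800 and only the done bit set is the pattern described above: the buffer was filled completely and the hardware still expects more data for the same frame.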
(In reply to comment #121) > Just so that I can be sure I understand what you're seeing packets arrive at > the driver with a length of 2048 bytes, but no EOP bit set, yes? Can you send Yes, it appears that simultaneously with the NFS server losing connection, we see corrupt (long) packets make their way to memory, the redzone logic kicks in, and the driver continues to receive long packets. The next step is to hexdump the packet data, and possibly capture with tcpdump the data that is being sent from the sol-10 server (for comparison). > me the patch that you're currently working with. I'd like to put that in place > here so see if I can get you any additional info that might be helpful in > diagnosing the problem. If you have any suggestions/requests for stuff that > you'd be interested in seeing, I'll get on putting it together. I'll attach the patch to my driver that I'm using for debug code. It makes the driver last much longer before a panic occurs. I guess since we are getting kdumps that we can probably print a few of the ring variables (buffer_info start, tx and rx descriptor ring start addresses, and such) at open time, and be able to quickly browse memory in the crash tool.
Created attachment 395964 [details] patch to e1000e for debug This patch is to our out of tree driver, and as such may not apply cleanly to your source. The changes are pretty simple however, and I feel it is actually good that the issue reproduces on all drivers we've tried.
Hey, Jesse, thanks for the debug patch. Just as a quick update, I was talking with gospo about this, and he noted some bits in the rfctl register which he thought might be germane. Specifically he asked me to test with the NFSW, NFSR and ISCSI filtering bits set to disable those filters. Unfortunately, even with those bits set, we still encountered the same failure as before. Don't know if that's meaningful to you, but I figured it would be worth sharing, just in case. I'm going to set up your debug patch this afternoon.
I have an updated debug patch, and (last night before bed) I set the RFCTL bits DIS_NFSR and DIS_NFSW, and your failing test ran all night with no issues, and no !EOP packets. So, at this point I believe we have a solid workaround and I am working with the hardware team to figure out exactly why it fixes the issue and what the root cause is.
Created attachment 396127 [details] 2nd debug patch with proposed workaround this patch sets RFCTL 000000c0 which for me allowed the failing test to run for > 10 hours so far.
I was so close. The patch in comment #126 is almost exactly what I did after reading the programmers guide this morning. It looks like I just put those bits in the wrong spot initially since the hardware Neil was testing doesn't appear to evaluate to true for this statement: if (!(adapter->flags & FLAG_IS_ICH) && (pages <= 3) && (PAGE_SIZE <= 16384) && (rctl & E1000_RCTL_LPE)) This was the patch Neil tested. diff --git a/drivers/net/e1000e/defines.h b/drivers/net/e1000e/defines.h index db05ec3..e301e26 100644 --- a/drivers/net/e1000e/defines.h +++ b/drivers/net/e1000e/defines.h @@ -320,6 +320,8 @@ #define E1000_RXCSUM_IPPCSE 0x00001000 /* IP payload checksum enable */ /* Header split receive */ +#define E1000_RFCTL_NFSW_DIS 0x00000040 +#define E1000_RFCTL_NFSR_DIS 0x00000080 #define E1000_RFCTL_ACK_DIS 0x00001000 #define E1000_RFCTL_EXTEN 0x00008000 #define E1000_RFCTL_IPV6_EX_DIS 0x00010000 diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c index 88d54d3..d6eba1c 100644 --- a/drivers/net/e1000e/netdev.c +++ b/drivers/net/e1000e/netdev.c @@ -2398,6 +2398,12 @@ static void e1000_setup_rctl(struct e1000_adapter *adapter) */ rfctl |= (E1000_RFCTL_IPV6_EX_DIS | E1000_RFCTL_NEW_IPV6_EXT_DIS); + /* + * disable packet split for nfs write request headers and + * nfs read reply headers + */ + rfctl |= (E1000_RFCTL_NFSW_DIS | + E1000_RFCTL_NFSR_DIS); ew32(RFCTL, rfctl);
Yeah, that's looking _way_ better! I'm running the updated patch on piketon-02, and it's working smoothly. I'm going to let this run overnight to be sure, but it's definitely looking good. So, looking forward, Jesse, how shall we handle this? Do you see disabling the NFS filters universally in e1000e as the right solution, or can it be narrowed down to certain revisions? Or will this not be the final fix at all (opting instead for an eeprom/firmware change to modify the behavior of whatever it is the silicon is doing when these filters are on)? This all assumes of course we don't hit more problems, but this is looking good so far. Thanks for all your hard work on this!
Created attachment 396144 [details] e1000e-nfs-packet-split-disable2.patch Jesse, this is the patch Neil has been running successfully for a while. It's like yours, but compiles. ;-) Anyway, do you think this will go upstream, or are you likely to try and fix the actual problem quickly rather than posting this in the interim? We would like to ship something that works, so we are inclined to ship this patch since we literally need to decide *today* what will be done for 5.5.
To be conservative, I would limit setting those bits to FLAG_IS_ICH parts. It *should* be safe to have them set for all e1000e parts, but there are a lot fewer parts to test if we just do the change within a FLAG_IS_ICH section. Otherwise, Andy, your patch looks fine, and thanks for fixing it up. :-) @Neil: The driver fix is the only way to go at this point.
@Andy, if it's ok with you, I'll make & test Jesse's requested modification tonight, by adding an adapter->flags & FLAG_IS_ICH conditional around your rfctl setting, so we're ready to do something with this in time for snap4 tomorrow. @Jesse, how do you want to handle upstream, given that the driver fix is our way forward? Shall Andy or I just post it, or do you want to handle upstream after you roll it around internally for a bit?
Created attachment 396206 [details] updated e1000e patch Here's the updated patch I'm running. It's Andy's patch, wrapped in a conditional check for FLAG_IS_ICH. I've had it running well for an hour under our reproducer. I'm going to leave it running overnight.
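The shape of the workaround discussed in the last few comments can be modelled in a few lines. This is a userspace sketch rather than the attached patch: the two RFCTL bit values are the ones from the diff quoted earlier in this thread, while `SKETCH_FLAG_IS_ICH` is a placeholder bit (the driver's real FLAG_IS_ICH value is not shown here) and `apply_nfs_filter_workaround` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as given in the patch quoted earlier in this thread. */
#define E1000_RFCTL_NFSW_DIS 0x00000040
#define E1000_RFCTL_NFSR_DIS 0x00000080

/* Placeholder for the adapter flag; not the driver's actual value. */
#define SKETCH_FLAG_IS_ICH (1u << 0)

/* Returns the RFCTL value to program: the NFS write-request and
 * read-reply filters are disabled only on ICH-family parts, and all
 * other bits are passed through unchanged. */
static uint32_t apply_nfs_filter_workaround(uint32_t rfctl,
                                            unsigned int adapter_flags)
{
    if (adapter_flags & SKETCH_FLAG_IS_ICH)
        rfctl |= E1000_RFCTL_NFSW_DIS | E1000_RFCTL_NFSR_DIS;
    return rfctl;
}
```

The two bits together give 0xc0, which is the RFCTL value mentioned with the second debug patch.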
I'm going to propose a slightly different patch with the same functionality, which we'll maintain copyright on, and modifies ich8lan.c. We'll post and test it via our kernel patch process and jeff kirsher.

also, it is worth mentioning the details that you sent me in email to reproduce this.

solaris-10 nfs server
RHEL5 client

on RHEL5 client,
mount -t nfs -o udp,nfsvers=2,rw 192.168.10.10:/export/home/connectathon /mnt

in that nfs directory, put these two files.

[jbrandeb@jbrandeb-piketon nhorman]$ cat fail.sh
#!/bin/sh -x
while true; do
date
./make_dir
sleep 2;
rm -rf rm-test/
done
<EOF>

[jbrandeb@jbrandeb-piketon nhorman]$ cat make_dir
#!/bin/sh
echo create
mkdir rm-test
cd rm-test
touch `seq 1 50`
cd ..
<EOF>
Created attachment 396650 [details] alternate code location for change Here is the patch, similar to the one we'll push upstream.
Thanks, Jesse. That seems reasonable to me.
I confirmed on Friday that I could not reproduce this issue at all on ICH10.
Jesse, I cannot reproduce on ICH10 either. I tried a McCreary with 82567LM-3 (8086:10de) and it ran for around an hour with no problems.
Jesse, I also tried another PCH system with a X3440 (Lynnfield) processor while running a debug kernel and cannot get it to fail there either.
Jesse, have you had any luck recreating this (it works best with a -debug kernel) on any system other than the piketon system?
(In reply to comment #144) > Jesse, have you had any luck recreating this (it works best with a -debug > kernel) on any system other than the piketon system? Short answer: no. That's a good thing, and we can move forward with the workaround I attached as it won't hurt any of the other systems (especially since they don't use packet split).
*** Bug 541424 has been marked as a duplicate of this bug. ***
in kernel-2.6.18-191.el5 You can download this test kernel from http://people.redhat.com/jwilson/el5 Please update the appropriate value in the Verified field (cf_verified) to indicate this fix has been successfully verified. Include a comment with verification details.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2010-0178.html