Bug 691086

Summary: A kernel oops related to network
Product: Red Hat Enterprise Linux 6
Component: kernel
Version: 6.0
Hardware: x86_64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Target Milestone: rc
Reporter: colyli
Assignee: Neil Horman <nhorman>
QA Contact: Red Hat Kernel QE team <kernel-qe>
CC: eguan, nhorman
Doc Type: Bug Fix
Last Closed: 2011-09-21 17:15:38 UTC

Description colyli 2011-03-26 16:29:13 UTC
Description of problem:
After running the 2.6.32-71.18.2 kernel for 42+ hours on a squid server, we observed a kernel oops related to networking.

Version-Release number of selected component (if applicable):
2.6.32-71.18.2 on x86_64. We rebuilt this kernel from the SRPM and run it on a RHEL5 installation.

How reproducible:
Not sure how to reproduce. We just ran a squid workload on the server and, after 42+ hours, observed this oops in the log. We did not find a similar oops on other machines.

Steps to Reproduce:
1. Run a squid workload
2. Wait around 42+ hours
3. Observed only once
  
Actual results:
So far the system seems to keep working.

Expected results:
The kernel oops should not occur.

Additional info:
The oops output is below. Please note that the kernel is running on a RHEL5 installation.

[152286.466138] ------------[ cut here ]------------
[152286.472442] WARNING: at net/ipv4/tcp_input.c:2919 tcp_ack+0xc9b/0x168d() (Not tainted)
[152286.482689] Hardware name: CS24-TY
[152286.488837] Modules linked in: ext2 bonding ipv6 dm_mirror dm_multipath video output sbs sbshc power_meter hwmon acpi_pad parport sg igb serio_raw i7core_edac iTCO_wdt dcdbas iTCO_vendor_support ahci i2c_i801 edac_core ioatdma dca dm_region_hash dm_log dm_mod megaraid_sas shpchp mptsas mptscsih mptbase scsi_transport_sas uhci_hcd ohci_hcd ehci_hcd
[152286.531076] Pid: 0, comm: swapper Not tainted 2.6.32-71.18.2.el5.x86_64 #1
[152286.541169] Call Trace:
[152286.544388]  <IRQ>  [<ffffffff81405e32>] ? tcp_ack+0xc9b/0x168d
[152286.552189]  [<ffffffff8105b94b>] warn_slowpath_common+0x8d/0xa6
[152286.560651]  [<ffffffff8105b97e>] warn_slowpath_null+0x1a/0x1c
[152286.568640]  [<ffffffff81405e32>] tcp_ack+0xc9b/0x168d
[152286.575413]  [<ffffffff81407aa0>] tcp_rcv_established+0xcd/0x566
[152286.583163]  [<ffffffff8140df6b>] tcp_v4_do_rcv+0x196/0x352
[152286.591026]  [<ffffffff8106260a>] ? local_bh_enable+0x12/0x14
[152286.599189]  [<ffffffff8140f628>] tcp_v4_rcv+0x423/0x631
[152286.606234]  [<ffffffff813f3831>] ? ip_local_deliver_finish+0x152/0x1fa
[152286.614716]  [<ffffffff813f3831>] ip_local_deliver_finish+0x152/0x1fa
[152286.623599]  [<ffffffff813f3c2c>] ip_local_deliver+0x72/0x7d
[152286.630793]  [<ffffffff813f365d>] ip_rcv_finish+0x371/0x38b
[152286.638677]  [<ffffffff8140ce7f>] ? tcp4_gro_receive+0x9b/0xa4
[152286.646744]  [<ffffffff813f3b7b>] ip_rcv+0x2a2/0x2e1
[152286.653612]  [<ffffffff813cb9c8>] netif_receive_skb+0x448/0x47f
[152286.661247]  [<ffffffff813cba9b>] napi_skb_finish+0x2b/0x43
[152286.668123]  [<ffffffff813cbf14>] napi_gro_receive+0x2f/0x34
[152286.675784]  [<ffffffffa00ea52d>] igb_poll+0x83f/0xba3 [igb]
[152286.681994]  [<ffffffff810181f0>] ? read_tsc+0xd/0x25
[152286.687392]  [<ffffffff81082c9e>] ? timekeeping_get_ns+0x1b/0x3d
[152286.693952]  [<ffffffff8104b406>] ? __enqueue_entity+0x79/0x7b
[152286.700285]  [<ffffffff810624cd>] ? local_bh_enable_ip+0xe/0x10
[152286.706609]  [<ffffffff813ced7c>] net_rx_action+0xc6/0x1c3
[152286.712561]  [<ffffffff81062a2d>] __do_softirq+0xd2/0x194
[152286.718323]  [<ffffffff810b646c>] ? handle_IRQ_event+0x66/0x120
[152286.724614]  [<ffffffff81012e8c>] call_softirq+0x1c/0x30
[152286.730296]  [<ffffffff81014757>] do_softirq+0x46/0x87
[152286.735789]  [<ffffffff810628b5>] irq_exit+0x3b/0x7a
[152286.741110]  [<ffffffff814668e1>] do_IRQ+0x99/0xb0
[152286.746259]  [<ffffffff81012693>] ret_from_intr+0x0/0x11
[152286.751927]  <EOI>  [<ffffffff812ba90c>] ? acpi_idle_enter_bm+0x232/0x267
[152286.759123]  [<ffffffff812ba905>] ? acpi_idle_enter_bm+0x22b/0x267
[152286.765697]  [<ffffffff813a1b75>] ? menu_select+0x15a/0x228
[152286.771655]  [<ffffffff813a0d55>] cpuidle_idle_call+0x87/0xe2
[152286.777799]  [<ffffffff81010c68>] cpu_idle+0xa5/0xd4
[152286.783111]  [<ffffffff8145c26f>] ? start_secondary+0x1ea/0x237
[152286.789402]  [<ffffffff8145c27d>] start_secondary+0x1f8/0x237
[152286.795536] ---[ end trace 23becf1ca0e310bc ]---

Comment 3 RHEL Program Management 2011-04-04 02:46:51 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 4 Neil Horman 2011-09-21 10:52:03 UTC
Is this happening on the latest kernel?  This seems like a problem we've seen and corrected already; I'll try to find the exact bug we fixed.

Comment 5 colyli 2011-09-21 16:00:55 UTC
We built the 6.1 kernel from the SRPM and run it on several machines. So far we have had no similar report for more than a month.

Comment 6 Neil Horman 2011-09-21 17:15:38 UTC
Copy that, I'll close this then, and post the patch that fixed it here when I find it.