
Bug 702508

Summary: TCP traffic to IPv6 causes 32 bit Linux OS to reboot
Product: Red Hat Enterprise Linux 6
Reporter: Atita <atita.shirwaikar>
Component: kernel
Assignee: Jiri Olsa <jolsa>
Status: CLOSED ERRATA
QA Contact: Botu Sun <bosun>
Severity: high
Docs Contact:
Priority: unspecified
Version: 6.0
CC: agospoda, alexander.h.duyck, bosun, dtian, haliu, jesse.brandeburg, kmcmartin, nhorman
Target Milestone: rc
Target Release: ---
Hardware: i686
OS: Linux
Whiteboard:
Fixed In Version: kernel-2.6.32-164.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-12-06 13:22:17 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Atita 2011-05-05 22:10:26 UTC
Description of problem:
The panic seems to be triggered simply by the SUT sending IPv6 neighbor discovery packets. We tried a file transfer using scp, and that panicked the kernel. The size of the file did not seem to matter. The problem does not happen with IPv4, and the bug is not seen in the previous version (RHEL 5.6).


Version-Release number of selected component (if applicable):
Linux Redhat 2.6.32-71.el6.i686

How reproducible:
Reproduces easily on any Intel 82599-based NIC connected back-to-back (B2B).


Steps to Reproduce:
1. Install the OS.
2. Load the out-of-tree ixgbe-3.3.9 driver. [*]
3. Bring up the network.
4. Transfer a file with scp over IPv6.
  
Actual results:
The system reboots immediately due to a panic.

Expected results:
File transfer should have completed


Additional info:
[*] RHEL does not support out-of-tree drivers. However, this driver change would likely be backported to the RHEL driver, so the problem needs to be fixed in a future kernel release.
The RHEL kernel passes a corrupted sk_buff to the driver: the transport header field is zero. The kernel panics when that invalid memory location is read.


BUG: unable to handle kernel NULL pointer dereference at 0000000d
IP: [<f852c108>] ixgbe_xmit_frame_ring+0xa18/0xdf0 [ixgbe]
*pdpt = 0000000026abb001 *pde = 000000036d718067
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
Modules linked in: netconsole configfs nfsd lockd nfs_acl auth_rpcgss exportfs autofs4 sunrpc cpufreq_ondemand acpi_cpufreq ipv6 sr_mod cdrom dm_mirror dm_region_hash dm_log sg microcode ghes hed i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core ixgbe(U) igb dca ext3 jbd mbcache sd_mod crc_t10dif usb_storage ahci dm_mod [last unloaded: scsi_wait_scan]

Pid: 0, comm: swapper Not tainted (2.6.32-131.0.5.el6.i686 #1) S5520UR
EIP: 0060:[<f852c108>] EFLAGS: 00010246 CPU: 10
EIP is at ixgbe_xmit_frame_ring+0xa18/0xdf0 [ixgbe]
EAX: f0236890 EBX: 00000009 ECX: dd9c7740 EDX: c1692d40
ESI: 0000dd86 EDI: 00000000 EBP: dd9c7740 ESP: f71c7aec
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process swapper (pid: 0, ti=f71c6000 task=f71a9ab0 task.ti=f71c6000)
Stack:
 00000246 fa5ffc41 f71c7b28 00000000 00000004 ebb1ccc0 ef98e5a0 c1781e40
<0> 00000000 0000017f fa883000 f71c7c8c ef897c90 dd9c7740 ef897c8c 0000000a
<0> 00000004 c0bed5c0 00000004 ef9a5840 fa5ffc86 f71c7c70 00000000 c0a5ef60
Call Trace:
 [<fa5ffc41>] ? ip6_pol_route+0x2a1/0x2d0 [ipv6]
 [<fa5ffc86>] ? ip6_pol_route_output+0x16/0x20 [ipv6]
 [<c077f03e>] ? dev_hard_start_xmit+0x18e/0x3e0
 [<fa5ffc70>] ? ip6_pol_route_output+0x0/0x20 [ipv6]
 [<c0795873>] ? sch_direct_xmit+0x113/0x180
 [<c079137c>] ? fib_rules_lookup+0x8c/0xc0
 [<c0783355>] ? dev_queue_xmit+0x375/0x4c0
 [<fa5f2178>] ? ip6_output_finish+0x58/0xd0 [ipv6]
 [<fa5f38a8>] ? ip6_xmit+0x3e8/0x490 [ipv6]
 [<fa5ffbc0>] ? ip6_pol_route+0x220/0x2d0 [ipv6]
 [<fa61566d>] ? tcp_v6_send_response+0x3bd/0x460 [ipv6]
 [<fa616a0c>] ? tcp_v6_rcv+0x35c/0x7e0 [ipv6]
 [<fa5f5a2f>] ? ip6_input_finish+0x10f/0x390 [ipv6]
 [<fa5f54fd>] ? ip6_rcv_finish+0x2d/0x30 [ipv6]
 [<c077e899>] ? __netif_receive_skb+0x339/0x5e0
 [<c078046f>] ? netif_receive_skb+0x3f/0x50
 [<c078054f>] ? napi_skb_finish+0x2f/0x40
 [<c0782885>] ? napi_gro_receive+0x25/0x40
 [<f852a8e8>] ? ixgbe_poll+0x718/0x1520 [ixgbe]
 [<c0473eeb>] ? autoremove_wake_function+0x1b/0x40
 [<c043a2c7>] ? __wake_up_common+0x47/0x70
 [<c0470000>] ? schedule_on_each_cpu+0x10/0x130
 [<c078297e>] ? net_rx_action+0xde/0x280
 [<c0459bbf>] ? __do_softirq+0x8f/0x1b0
 [<c04b5491>] ? move_native_irq+0x11/0x50
 [<c0459d1d>] ? do_softirq+0x3d/0x50
 [<c0459e75>] ? irq_exit+0x65/0x70
 [<c040b1c0>] ? do_IRQ+0x50/0xc0
 [<c040a030>] ? common_interrupt+0x30/0x38
 [<c045007b>] ? unshare_files+0xb/0xa0
 [<c0640c8f>] ? intel_idle+0xaf/0x140
 [<c0752c12>] ? cpuidle_idle_call+0x72/0x100
 [<c04089a4>] ? cpu_idle+0x94/0xd0
 [<c081e2a9>] ? start_secondary+0x20d/0x252
Code: 20 20 31 db 8b 6c 24 34 0f b7 41 24 8b 8d 98 00 00 00 e9 8c f9 ff ff 80 78 09 06 0f 85 3a fa ff ff 8b 4c 24 34 8b b9 94 00 00 00 <0f> b6 4f 0d f6 c1 01 0f 85 23 fa ff ff 80 e1 02 75 0d 8b 6c 24
EIP: [<f852c108>] ixgbe_xmit_frame_ring+0xa18/0xdf0 [ixgbe] SS:ESP 0068:f71c7aec
CR2: 000000000000000d
---[ end trace 430ec7cb37718f4b ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 0, comm: swapper Tainted: G      D    ----------------   2.6.32-131.0.5.el6.i686 #1
Call Trace:
 [<c0821d7e>] ? panic+0x42/0xf9
 [<c0825b88>] ? oops_end+0xc8/0xd0
 [<c0432472>] ? no_context+0xc2/0x190
 [<c043269f>] ? bad_area_nosemaphore+0xf/0x20
 [<c0432b18>] ? __do_page_fault+0x2d8/0x420
 [<c07957c6>] ? sch_direct_xmit+0x66/0x180
 [<c07830df>] ? dev_queue_xmit+0xff/0x4c0
 [<fa5f2178>] ? ip6_output_finish+0x58/0xd0 [ipv6]
 [<c082751a>] ? do_page_fault+0x2a/0x90
 [<c08274f0>] ? do_page_fault+0x0/0x90
 [<c0824f67>] ? error_code+0x73/0x78
 [<f852c108>] ? ixgbe_xmit_frame_ring+0xa18/0xdf0 [ixgbe]
 [<fa5ffc41>] ? ip6_pol_route+0x2a1/0x2d0 [ipv6]
 [<fa5ffc86>] ? ip6_pol_route_output+0x16/0x20 [ipv6]
 [<c077f03e>] ? dev_hard_start_xmit+0x18e/0x3e0
 [<fa5ffc70>] ? ip6_pol_route_output+0x0/0x20 [ipv6]
 [<c0795873>] ? sch_direct_xmit+0x113/0x180
 [<c079137c>] ? fib_rules_lookup+0x8c/0xc0
 [<c0783355>] ? dev_queue_xmit+0x375/0x4c0
 [<fa5f2178>] ? ip6_output_finish+0x58/0xd0 [ipv6]
 [<fa5f38a8>] ? ip6_xmit+0x3e8/0x490 [ipv6]
 [<fa5ffbc0>] ? ip6_pol_route+0x220/0x2d0 [ipv6]
 [<fa61566d>] ? tcp_v6_send_response+0x3bd/0x460 [ipv6]
 [<fa616a0c>] ? tcp_v6_rcv+0x35c/0x7e0 [ipv6]
 [<fa5f5a2f>] ? ip6_input_finish+0x10f/0x390 [ipv6]
 [<fa5f54fd>] ? ip6_rcv_finish+0x2d/0x30 [ipv6]
 [<c077e899>] ? __netif_receive_skb+0x339/0x5e0
 [<c078046f>] ? netif_receive_skb+0x3f/0x50
 [<c078054f>] ? napi_skb_finish+0x2f/0x40
 [<c0782885>] ? napi_gro_receive+0x25/0x40
 [<f852a8e8>] ? ixgbe_poll+0x718/0x1520 [ixgbe]
 [<c0473eeb>] ? autoremove_wake_function+0x1b/0x40


Assembly 

0xfce20144 <ixgbe_xmit_frame_ring+2564>:        cmpb   $0x6,0x9(%eax)
0xfce20148 <ixgbe_xmit_frame_ring+2568>:        jne    0xfce1fb88
0xfce2014e <ixgbe_xmit_frame_ring+2574>:        mov    0x34(%esp),%ecx
0xfce20152 <ixgbe_xmit_frame_ring+2578>:        mov    0x94(%ecx),%edi
0xfce20158 <ixgbe_xmit_frame_ring+2584>:        movzbl 0xd(%edi),%ecx

edi is 0, so the load from 0xd(%edi) faults at address 0x0000000d (the CR2 value in the oops) and the kernel panics.

Comment 2 Neil Horman 2011-05-06 00:36:38 UTC
Just to clarify, this bug only happens if you use the out of tree driver, correct?  Can you reproduce it with the shipping driver?

Comment 3 Atita 2011-05-06 00:42:26 UTC
Correct. It does not happen with the shipped driver.

Comment 4 Neil Horman 2011-05-06 01:05:12 UTC
Well, as you note we don't support the out of tree driver, and the panic is clearly occurring inside that code.  We might backport whatever change is causing this, but only after you get it fixed.  What exactly are you asking us to do here?

Comment 5 Andy Gospodarek 2011-05-06 02:36:13 UTC
If the patch is already upstream, or if you have the patch that you plan to post upstream that will likely cause this to happen with a backported driver, please attach it to the bug so we can test.  Thanks!

Comment 6 RHEL Program Management 2011-05-06 06:01:01 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 7 Alexander Duyck 2011-05-06 17:34:53 UTC
This panic has been root caused to the fact that the driver is being passed IPv6 SKBs for transmit that have an uninitialized transport header.  We do not see the same problem with the linux-2.6 kernel and are still trying to determine what patch resolved this issue.

Comment 8 Alexander Duyck 2011-05-06 18:01:42 UTC
I believe the kernel is likely missing the patch below, and its absence is causing the issues we are seeing:

From: Herbert Xu <herbert.org.au>
Date: Wed, 21 Apr 2010 07:47:15 +0000 (-0700)
Subject: ipv6: Fix tcp_v6_send_response transport header setting.
X-Git-Tag: v2.6.34-rc6~30^2~18
X-Git-Url: http://gitlad.jf.intel.com/git/?p=torvalds%2Flinux-2.6%2F.git;a=commitdiff_plain;h=6651ffc8e8bdd5fb4b7d1867c6cfebb4f309512c

ipv6: Fix tcp_v6_send_response transport header setting.

My recent patch to remove the open-coded checksum sequence in
tcp_v6_send_response broke it as we did not set the transport
header pointer on the new packet.

Actually, there is code there trying to set the transport
header properly, but it sets it for the wrong skb ('skb'
instead of 'buff').

This bug was introduced by commit
a8fdf2b331b38d61fb5f11f3aec4a4f9fb2dedcb ("ipv6: Fix
tcp_v6_send_response(): it didn't set skb transport header")

Signed-off-by: Herbert Xu <herbert.org.au>
Signed-off-by: David S. Miller <davem>
---

diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index c92ebe8..075f540 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1015,7 +1015,7 @@ static void tcp_v6_send_response(struct sk_buff *skb, u32 seq, u32 ack, u32 win,
 	skb_reserve(buff, MAX_HEADER + sizeof(struct ipv6hdr) + tot_len);
 
 	t1 = (struct tcphdr *) skb_push(buff, tot_len);
-	skb_reset_transport_header(skb);
+	skb_reset_transport_header(buff);
 
 	/* Swap the send and the receive. */
 	memset(t1, 0, sizeof(*t1));

Comment 9 Jiri Olsa 2011-05-25 09:39:48 UTC
I understand the issue is not reproducible without the out-of-tree ixgbe
driver. I ported the patch from Comment 8, and rpms are published here:

http://people.redhat.com/jolsa/702508/

Could you please test them and let me know?

thanks

Comment 11 Alexander Duyck 2011-06-08 22:51:39 UTC
We added a workaround to the out-of-tree driver that essentially masks the issue but doesn't solve it.  We check for a NULL pointer, which is okay on 32-bit systems, but it doesn't resolve the issue on 64-bit systems.

On 64-bit systems this issue doesn't cause a panic. Instead, an invalid hash is generated and issued to the hardware, where it has no effect.  The fix I mentioned in comment 8 resolves the issue on both 32-bit and 64-bit systems.

Comment 12 RHEL Program Management 2011-06-16 12:50:23 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 13 Kyle McMartin 2011-06-29 21:49:28 UTC
Patch(es) available on kernel-2.6.32-164.el6

Comment 18 errata-xmlrpc 2011-12-06 13:22:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2011-1530.html