Bug 525223 - Kernel crash when using UDP multicast with bridges on RHEL 5.4
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.4
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Stanislaw Gruszka
QA Contact: Red Hat Kernel QE team
Reported: 2009-09-23 17:43 UTC by Bruno Cornec
Modified: 2010-02-03 13:17 UTC

Doc Type: Bug Fix
Last Closed: 2010-02-03 13:17:46 UTC


Attachments
kernel crash capture (62.23 KB, image/jpeg)
2009-09-23 17:43 UTC, Bruno Cornec

Description Bruno Cornec 2009-09-23 17:43:11 UTC
Created attachment 362334
kernel crash capture

Description of problem:

Kernel crash when using UDP multicast with bridges on RHEL 5.4 on an HP BL 460 G6 using the bnx2x driver. We use HP telecommunication software on RHEL 5.4, and when we launch the part that generates multicast UDP traffic with bridges enabled, the kernel crashes as shown in the attached capture.



Version-Release number of selected component (if applicable):
5.4

How reproducible:
Every time we launch that part of our application. We have not yet been able to reproduce it in a more standard setup, which would make it easy to reproduce outside our software context.

Steps to Reproduce:
1. setup bridges
2. launch the ocmp-bre part
3. look at crash in the iLO
  
Actual results:
kernel crash

Expected results:
kernel works ;-)

Additional info:

Comment 1 Stanislaw Gruszka 2009-10-01 09:48:07 UTC
Note to myself:

(gdb) l *(br_nf_pre_routing_finish+0x20)
0x540e is in br_nf_pre_routing_finish (net/bridge/br_netfilter.c:246).
241     {
242             struct net_device *dev = skb->dev;
243             struct iphdr *iph = skb->nh.iph;
244             struct nf_bridge_info *nf_bridge = skb->nf_bridge;
245
246             if (nf_bridge->mask & BRNF_PKT_TYPE) {
247                     skb->pkt_type = PACKET_OTHERHOST;
248                     nf_bridge->mask ^= BRNF_PKT_TYPE;
249             }
250             nf_bridge->mask ^= BRNF_NF_BRIDGE_PREROUTING;

Comment 2 Stanislaw Gruszka 2009-10-01 13:08:18 UTC
So far I cannot find the cause of the kernel crash. From my code analysis, it looks like skb->nf_bridge cannot be NULL, because br_nf_pre_routing_finish() is always called from br_nf_pre_routing() after a successful allocation and assignment to skb->nf_bridge. I must have missed something, but so far I don't know what.
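
Illustration (not part of the original analysis): the call pattern described above can be modelled in user space as follows. This is only a minimal sketch under stated assumptions. The names packet_model, bridge_info_model, model_pre_routing and model_finish are hypothetical stand-ins, not the kernel's actual code, and the sketch ignores everything else the real functions do. It only shows why the dereference in the finish step looks safe if the finish step is reached solely through the allocating caller.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures named in the comment. */
struct bridge_info_model {
    unsigned int mask;
};

struct packet_model {
    struct bridge_info_model *nf_bridge;
};

#define MODEL_PKT_TYPE 0x01u /* illustrative flag value, not the kernel's */

/* Models the dereference seen at line 246 of the gdb listing:
 * it is only safe when nf_bridge is non-NULL. */
void model_finish(struct packet_model *pkt)
{
    assert(pkt != NULL && pkt->nf_bridge != NULL);
    if (pkt->nf_bridge->mask & MODEL_PKT_TYPE)
        pkt->nf_bridge->mask ^= MODEL_PKT_TYPE;
}

/* Models the caller: allocate, bail out on failure, and only then
 * hand the packet to the finish step.
 * Returns 0 on success, -1 if the allocation failed. */
int model_pre_routing(struct packet_model *pkt)
{
    assert(pkt != NULL);
    pkt->nf_bridge = calloc(1, sizeof(*pkt->nf_bridge));
    if (pkt->nf_bridge == NULL)
        return -1;
    model_finish(pkt);
    return 0;
}
```

Under this model a NULL nf_bridge in the finish step would require some path that reaches it without going through the allocating caller, or something that clears the pointer in between, which is the open question in this comment.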

Comment 3 Stanislaw Gruszka 2009-10-01 13:11:30 UTC
Bruno, 

is it possible to set up kdump (http://kbase.redhat.com/faq/docs/DOC-6039), capture a crash dump and provide the image to me?

Comment 4 Stanislaw Gruszka 2009-10-14 12:01:44 UTC
Bruno,

I have a couple of questions:

Could you provide a crash dump? If not, why not: is there any problem with kdump?
Is the bug reproducible with a network driver other than bnx2x?
Is the bug reproducible on RHEL 5.3?
Do you have steps to reproduce it on standard systems without the special HP applications?

Thanks in advance.

Comment 5 Bruno Cornec 2009-10-14 13:44:03 UTC
Hello Stanislaw,

Sorry for the delay, we've had other activities to pursue in order to provide as stable a setup as possible for our environment, and haven't had time to work on this until now.

FYI, we have put a new infrastructure in place, on which Olivier Renault, one of your colleagues in copy, is currently working on a demo with partners. We will give you remote access to this platform so that you can also look directly at the issue.

We will reproduce the problem on that platform, hopefully tomorrow once the demo is done, and come back with the kdump info.

RHEL 5.3 is out of scope as we want KVM.

It's difficult for now to reproduce the symptom without the application (we tried in the meantime without much luck).
The problem is not seen when we use another server (DL380 with different NICs bnx2 or e1000e)

We also see very strange performance and bandwidth behavior with the bnx2x driver.

Thanks for your patience on this issue.

Comment 6 Stanislaw Gruszka 2009-10-15 06:57:32 UTC
(In reply to comment #5)
> FYI, we have put a new infrastructure in place, on which Olivier Renault, one
> of your colleagues in copy, is currently working on a demo with partners. We will
> give you remote access to this platform so that you can also look directly at
> the issue.

Sounds excellent.

> We also see very strange performance and bandwidth behavior with the bnx2x driver.

I'm going to investigate this as well, thanks.

Comment 7 Olivier Renault 2009-10-15 15:10:18 UTC
There is a crash dump available on the ftp server qumranet.redhat.com in the folder engineers/monolive.

FYI, I have played with the setup. I am not able to reproduce the crash without the application, but the network performance is really poor. Copying an ISO image from the hypervisor or another host to the VM runs at around 4-5k, 10k at most.

Olivier

Comment 8 Olivier Renault 2009-10-19 06:35:18 UTC
My mistake, the FTP server is

ftp.qumranet.com 

Olivier

Comment 9 Stanislaw Gruszka 2009-10-20 10:11:54 UTC
From the crash dump we have a different oops:

Kernel BUG at drivers/net/tun.c:476
invalid opcode: 0000 [1] SMP 
last sysfs file: /class/net/eth0/operstate
CPU 14 
Modules linked in: nfs fscache nfs_acl dm_round_robin ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic cxgb3i libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi bonding tun lockd sunrpc ipt_REJECT xt_state ip_conntrack nfnetlink xt_multiport iptable_filter ip_tables xt_physdev bridge ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 xfrm_nalgo crypto_api uio cxgb3 8021q dm_multipath scsi_dh ksm(U) kvm_intel(U) kvm(U) shpchp sg bnx2x squashfs dm_snapshot ext3 jbd dm_mod sd_mod qla2xxx ehci_hcd cciss scsi_transport_fc uhci_hcd loop sr_mod scsi_mod cdrom
Pid: 13344, comm: qemu-kvm Tainted: G      2.6.18-162.el5 #1
RIP: 0010:[<ffffffff8867e7d9>]  [<ffffffff8867e7d9>] :tun:tun_chr_readv+0x2b1/0x3a6
RSP: 0018:ffff810312dede48  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff810312dede98 RCX: 0000000050104555
RDX: ffff8105f14f1700 RSI: ffff810312dede9e RDI: ffff810312dede92
RBP: 0000000000010ff6 R08: 0000000000000000 R09: 0000000000000001
R10: ffff810312dede94 R11: 0000000000000048 R12: ffff8105f14cdbc0
R13: ffff81031f93d500 R14: 0000000000000000 R15: ffff810312dedef8
FS:  00002af795605080(0000) GS:ffff81061ff9d4c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fff34077838 CR3: 00000005e8e2c000 CR4: 00000000000026e0
Process qemu-kvm (pid: 13344, threadinfo ffff810312dec000, task ffff8102f804b7a0)
Stack:  ffff81061df5ae20 ffff81061babe580 0000000000000000 ffff8102f804b7a0
 ffffffff8008be55 ffff81031f93d528 ffff81031f93d528 ffff81061e06b4b0
 000005ea05ea0000 0000000000000000 00000076034a1a00 0000000000000000
Call Trace:
 [<ffffffff8008be55>] default_wake_function+0x0/0xe
 [<ffffffff8867e8e8>] :tun:tun_chr_read+0x1a/0x1f
 [<ffffffff8000b695>] vfs_read+0xcb/0x171
 [<ffffffff80011b72>] sys_read+0x45/0x6e
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0


Code: 0f 0b 68 c8 f4 67 88 c2 dc 01 f6 42 0a 08 74 0c 80 4c 24 41 
RIP  [<ffffffff8867e7d9>] :tun:tun_chr_readv+0x2b1/0x3a6
 RSP <ffff810312dede48>

I think this is the same bug as the one reported here; this oops in the tun driver simply leads to the later machine crash in the bridging code.

Comment 10 Stanislaw Gruszka 2009-10-20 10:30:33 UTC
We have a few other bug reports about a bug in the tun driver, one is here:

https://bugzilla.redhat.com/show_bug.cgi?id=503851

The suggested short-term fix is to disable LRO for the driver, however we have no LRO support in bnx2x. Perhaps we should try using the module parameter disable_tpa (?).

There are patches which should fix tun driver oops:
http://people.redhat.com/agospoda/rhel5/0049-lro-add-check-to-warn-if-forwarding-on-devices-that.patch
http://people.redhat.com/agospoda/rhel5/0131-tun-fix-LRO-crash.patch

Kernel RPMs with these patches are here:
http://people.redhat.com/agospoda/#rhel5

Olivier, could you try one of them? Or perhaps you could give me remote access to the machine.

The long-term fix would eventually be to convert the bnx2x driver to GRO. But first we have to check whether the patches help.

Comment 11 Stanislaw Gruszka 2009-10-20 10:56:51 UTC
(In reply to comment #10)

> There are patches which should fix tun driver oops:
> http://people.redhat.com/agospoda/rhel5/0049-lro-add-check-to-warn-if-forwarding-on-devices-that.patch
> http://people.redhat.com/agospoda/rhel5/0131-tun-fix-LRO-crash.patch
> 
> Kernel RPMs with these patches are here:
> http://people.redhat.com/agospoda/#rhel5

These patches are now included in the "official" RHEL5 kernels; you can pick up an RPM from here if you prefer less experimental packages:

http://people.redhat.com/dzickus/el5/

Comment 12 Olivier Renault 2009-10-20 11:13:43 UTC
Hi, 

I have tested with both kernels at the host level, and the network performance within the guest is still bloody awful (5k). I have also tested them at both the host and VM level and I get the same results.

When grabbing the kernel at the host level, I am downloading at 500k. When I download the same kernel from within the guest, I get between 3 and 7k.

For example, copying a file from a physical host to a VM on a G network:
[root@rhn-satellite ~]# scp 10.3.118.84:/rhev/9511fbed-e07e-403e-89d8-871a736b7e17/images/11111111-1111-1111-1111-111111111111/redhat-rhn-satellite-5.3-server-x86_64-5-embedded-oracle.iso .
root@10.3.118.84's password: 
redhat-rhn-satellite-5.3-server-x86_64-5-embedded-oracle.iso                                                                                                  0%  156KB   7.1KB/s 25:52:20 ET

Bruno, could you provide access to Stanislaw ?

Olivier

Comment 13 Stanislaw Gruszka 2009-10-20 11:49:55 UTC
(In reply to comment #12)
> I have tested with both kernels at the host level, and the network performance
> within the guest is still bloody awful (5k).

This is the speed between two real hosts, right?

Did the crashes disappear? Was the test run with the HP-specific software?

Comment 14 Olivier Renault 2009-10-20 12:11:03 UTC
No, I have not run the test with the application, so I do not know if it solved
the crash issue. Bruno, Jean-Marc, could you run it?

No, the copy was between a VM and a physical host. The performance at the host level is good.

Comment 15 Herbert Xu 2009-10-20 13:32:14 UTC
(In reply to comment #10)
> We have a few other bug reports about a bug in the tun driver, one is here:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=503851
> 
> The suggested short-term fix is to disable LRO for the driver, however we have no LRO
> support in bnx2x. Perhaps we should try using the module parameter disable_tpa (?).

I just did a grep on RHEL5 -155:

$ grep -ri lro drivers/net/bnx*
drivers/net/bnx2x_main.c:MODULE_PARM_DESC(disable_tpa, " Disable the TPA (LRO) feature");
drivers/net/bnx2x_main.c:               bp->dev->features &= ~NETIF_F_LRO;
drivers/net/bnx2x_main.c:               bp->dev->features |= NETIF_F_LRO;
drivers/net/bnx2x_main.c:       if ((data & ETH_FLAG_LRO) && bp->rx_csum) {
drivers/net/bnx2x_main.c:               if (!(dev->features & NETIF_F_LRO)) {
drivers/net/bnx2x_main.c:                       dev->features |= NETIF_F_LRO;
drivers/net/bnx2x_main.c:       } else if (dev->features & NETIF_F_LRO) {
drivers/net/bnx2x_main.c:               dev->features &= ~NETIF_F_LRO;
drivers/net/bnx2x_main.c:               rc = bnx2x_set_flags(dev, (flags & ~ETH_FLAG_LRO));
$ 

So yes we should disable TPA.  This also explains the poor performance as all those packets would be dropped until the remote end sends exactly one packet at a time which essentially disables TPA.

Comment 16 Stanislaw Gruszka 2009-10-20 13:46:16 UTC
(In reply to comment #15)
> I just did a grep on RHEL5 -155:
> 
> $ grep -ri lro drivers/net/bnx*
> drivers/net/bnx2x_main.c:MODULE_PARM_DESC(disable_tpa, " Disable the TPA (LRO)
> feature");
> drivers/net/bnx2x_main.c:               bp->dev->features &= ~NETIF_F_LRO;
> drivers/net/bnx2x_main.c:               bp->dev->features |= NETIF_F_LRO;
> drivers/net/bnx2x_main.c:       if ((data & ETH_FLAG_LRO) && bp->rx_csum) {
> drivers/net/bnx2x_main.c:               if (!(dev->features & NETIF_F_LRO)) {
> drivers/net/bnx2x_main.c:                       dev->features |= NETIF_F_LRO;
> drivers/net/bnx2x_main.c:       } else if (dev->features & NETIF_F_LRO) {
> drivers/net/bnx2x_main.c:               dev->features &= ~NETIF_F_LRO;
> drivers/net/bnx2x_main.c:               rc = bnx2x_set_flags(dev, (flags &
> ~ETH_FLAG_LRO));
> $ 

All of them are in #if 0 ... #endif brackets.

> So yes we should disable TPA.  This also explains the poor performance as all
> those packets would be dropped until the remote end sends exactly one packet at
> a time which essentially disables TPA.  

We have to check it.

Comment 17 Herbert Xu 2009-10-20 14:12:58 UTC
(In reply to comment #16)
>
> All of them are in #if 0 ... #endif brackets.

It doesn't matter.  As long as TPA is enabled it'll merge packets and produce packets with a non-zero gso_size and a zero gso_type, which triggers either a crash or causes the packet to be dropped.
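
Illustration (not from the original comment): the inconsistent packet state described above can be expressed as a simple predicate. The following is a minimal user-space C sketch. The struct pkt_model and the function pkt_is_inconsistent are hypothetical names introduced only for this example; the kernel keeps gso_size and gso_type in its own skb data structures, and its actual checks are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the two fields named above. */
struct pkt_model {
    unsigned short gso_size; /* non-zero: packet was built from merged segments */
    unsigned short gso_type; /* zero: no segmentation type recorded */
};

/* Returns true for the state described in the comment: a packet that
 * carries a segment size but no segmentation type. */
bool pkt_is_inconsistent(const struct pkt_model *p)
{
    assert(p != NULL);
    return p->gso_size != 0 && p->gso_type == 0;
}
```

A packet with both fields zero (an ordinary, unmerged packet) or with both fields set is not flagged by this predicate; only the mixed case is, which is the case the comment attributes to TPA.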

Comment 18 Andy Gospodarek 2009-10-21 19:55:42 UTC
One should also check /var/log/messages and dmesg output.

The system is probably littered with messages to disable LRO/TPA on the driver.

Comment 21 Stanislaw Gruszka 2009-10-22 07:33:48 UTC
We have a bug report about bridging on bnx2x here: 

https://bugzilla.redhat.com/show_bug.cgi?id=483646

So this issue is fixed, and disable_tpa=1 is the solution for the poor performance (Olivier, can you confirm?).

I'm not sure about original crash in br_nf_pre_routing_finish() when HP software is running. Bruno, could you test the latest kernel i.e. 2.6.18-170 and confirm/deny issue is fixed in your system ?

Comment 22 Bruno Cornec 2009-10-22 09:23:42 UTC
(In reply to comment #21)
> I'm not sure about original crash in br_nf_pre_routing_finish() when HP
> software is running. Bruno, could you test the latest kernel i.e. 2.6.18-170
> and confirm/deny issue is fixed in your system ?  

I will not be able to do it before next week, but we will work on this.

Comment 23 Olivier Renault 2009-10-23 15:13:41 UTC
I can confirm that disable_tpa=1 fixed the issue of performance for the VMs. 

Thanks a lot,
Olivier

Comment 24 Andy Gospodarek 2009-10-23 15:22:50 UTC
Olivier, did you check to see if dmesg or /var/log/messages was filled with messages indicating that LRO should be disabled when there was poor performance?

If not, can you?

Comment 25 Olivier Renault 2009-10-23 15:38:24 UTC
You are right, it was full of:
 eth1: received packets cannot be forwarded while LRO is enabled

Thanks,
Olivier
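
Illustration (not from the original comment): one way to check a saved log for this symptom is to count how often the message appears. The following is a minimal C sketch, assuming the log text has already been read into a NUL-terminated buffer. count_lro_warnings is a hypothetical helper, and the message string is copied from the line quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Message text as quoted in the comment above. */
#define LRO_WARNING "received packets cannot be forwarded while LRO is enabled"

/* Counts non-overlapping occurrences of the warning in a
 * NUL-terminated log buffer. */
size_t count_lro_warnings(const char *log_text)
{
    size_t count = 0;
    const char *p;

    assert(log_text != NULL);
    for (p = strstr(log_text, LRO_WARNING); p != NULL;
         p = strstr(p + strlen(LRO_WARNING), LRO_WARNING))
        count++;
    return count;
}
```

A non-zero count on a host that bridges traffic to guests would point at the same condition reported here.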

Comment 26 Andy Gospodarek 2009-10-23 15:53:33 UTC
Would those messages be more valuable if they came out as KERN_CRIT (always on the console)?

I'm glad we have them, but I fear many users won't notice them.

Comment 27 Stanislaw Gruszka 2010-01-22 16:10:19 UTC
(In reply to comment #22)
> (In reply to comment #21)
> > I'm not sure about original crash in br_nf_pre_routing_finish() when HP
> > software is running. Bruno, could you test the latest kernel i.e. 2.6.18-170
> > and confirm/deny issue is fixed in your system ?  
> 
> I will not be able to do it before next week, but we will work on this.

Hi Bruno, any news?

Comment 28 Stanislaw Gruszka 2010-01-22 16:11:55 UTC
Please note that new public kernel releases are now here:
http://people.redhat.com/jwilson/el5/

Comment 29 Jean-Marc ANDRE 2010-02-03 09:19:53 UTC
Hi,

We were not able to reproduce the bug.
We did some configuration changes on the platform, installed the latest kernel...
I don't know if the problem is actually fixed but everything seems to work correctly now.

Comment 30 Stanislaw Gruszka 2010-02-03 13:17:46 UTC
Since we have no clear statement on whether the bug is fixed, but we can't reproduce it, I'm closing with the CANTFIX resolution. Please reopen if you encounter this problem again. Thanks.

