Description of problem:
I'm adding patches to automatically disable LRO on bnx2x. With these new patches, cnic oopses on the xen kernel when I enable bridges.

Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
 [<ffffffff8840d50f>] :uio:uio_event_notify+0x1/0x31
PGD 157c4b067 PUD 15395e067 PMD 0
Oops: 0000 [1] SMP
last sysfs file: /class/net/lo/ifindex
CPU 0
Modules linked in: ipt_MASQUERADE iptable_nat ip_nat bridge autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables be2iscsi ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp bnx2i cnic ipv6 xfrm_nalgo crypto_api uio cxgb3i cxgb3 libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi loop dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec i2c_core dell_wmi wmi button battery asus_acpi ac parport_pc lp parport sg shpchp hpilo pcspkr bnx2x 8021q serial_core serio_raw dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 14, comm: events/0 Not tainted 2.6.18-194.el5.bnx2x_v3xen #1
RIP: e030:[<ffffffff8840d50f>] [<ffffffff8840d50f>] :uio:uio_event_notify+0x1/0x31
RSP: e02b:ffffffff80684e88 EFLAGS: 00010297
RAX: ffff880151c05191 RBX: ffff8801590fd0a8 RCX: 0000000000000000
RDX: ffff88015b135191 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff88015a990500 R08: ffff88015e44e000 R09: 0000000000000001
R10: ffff88015a990500 R11: 00000000000000c8 R12: 0000000000000001
R13: 0000000000000001 R14: ffff88015e44fcf8 R15: ffff88015e44fcf8
FS: 00002b22f5712260(0000) GS:ffffffff805d2000(0000) knlGS:0000000000000000
CS: e033 DS: 0000 ES: 0000
Process events/0 (pid: 14, threadinfo ffff88015e44e000, task ffff88015e44d7a0)
Stack: ffff8801590fd0a8 ffffffff884a301f 0000000000000002 ffffffff881cdcba ffffffff80684ea0
 0000000000000000 ffff88015ba62c40 0000000000000015 0000000000000000 ffffffff8021152a
Call Trace:
 <IRQ> [<ffffffff884a301f>] :cnic:cnic_service_bnx2x+0x69/0x6d
 [<ffffffff881cdcba>] :bnx2x:bnx2x_interrupt+0x19a/0x211
 [<ffffffff8021152a>] handle_IRQ_event+0x55/0xae
 [<ffffffff802b3642>] __do_IRQ+0xa4/0x103
 [<ffffffff80290528>] _local_bh_enable+0x61/0xc5
 [<ffffffff8026df62>] do_IRQ+0xe7/0xf5
 [<ffffffff803b3b8f>] evtchn_do_upcall+0x13b/0x1fb
 [<ffffffff802608d6>] do_hypervisor_callback+0x1e/0x2c
 <EOI> [<ffffffff881c7116>] :bnx2x:bnx2x_release_hw_lock+0x8d/0xe0
 [<ffffffff881c717d>] :bnx2x:bnx2x_release_phy_lock+0x14/0x21
 [<ffffffff881d7d0a>] :bnx2x:bnx2x_nic_load+0x108d/0x1371
 [<ffffffff80299a8a>] queue_delayed_work+0x75/0x7e
 [<ffffffff881d0220>] :bnx2x:bnx2x_nic_unload+0x836/0x844
 [<ffffffff881d9f9f>] :bnx2x:bnx2x_reset_task+0x0/0x31
 [<ffffffff881d9fca>] :bnx2x:bnx2x_reset_task+0x2b/0x31
 [<ffffffff8024fa5f>] run_workqueue+0x94/0xe4
 [<ffffffff8024c318>] worker_thread+0x0/0x122
 [<ffffffff8024c408>] worker_thread+0xf0/0x122
 [<ffffffff8028906a>] default_wake_function+0x0/0xe
 [<ffffffff80233e47>] kthread+0xfe/0x132
 [<ffffffff80260b2c>] child_rip+0xa/0x12
 [<ffffffff80233d49>] kthread+0x0/0x132
 [<ffffffff80260b22>] child_rip+0x0/0x12

Version-Release number of selected component (if applicable):
2.6.18-194.el5 + new bnx2x patches.

How reproducible:
Always

Steps to Reproduce:
1. Boot into xen domain 0 with libvirtd and xend disabled
2. Run /etc/init.d/libvirtd start
Created attachment 409446
proposed workaround/fix

This patch prevents the oops, but I'm pretty sure it is only a workaround and not the right fix.
Created attachment 409449
packed additional patches

Patches for the 2.6.18-194.el5 kernel that make this bug reproducible.
Michael,

The problem is that we get an interrupt when the cnic driver is not ready for it. On xen we use the legacy INT# interrupt. This happens when we reset the device during bnx2x_nic_load(). Do you have any ideas for a better fix?
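To illustrate the failure mode described above (an interrupt reaching the cnic service path before cnic has finished its setup), here is a minimal userspace sketch in C. All names (demo_cnic_dev, demo_uio_info, demo_service_irq, demo_event_notify) are hypothetical and are not taken from the cnic, bnx2x, or uio sources. It is neither the attached workaround nor the upstream fix, only a sketch of guarding the notify call on a "device is up" state, assuming the NULL pointer comes from state that is not yet initialized.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-device notification state. */
struct demo_uio_info {
    int events;                    /* number of notifications delivered */
};

/* Hypothetical stand-in for the offload driver's device structure. */
struct demo_cnic_dev {
    bool up;                       /* set only after setup has completed */
    struct demo_uio_info *uinfo;   /* NULL until setup has completed */
};

/* Dereferences its argument unconditionally, so a NULL pointer
 * passed here would fault, as in the oops above. */
static void demo_event_notify(struct demo_uio_info *info)
{
    info->events++;
}

/* Interrupt-path entry point. Returns true if the event was handled,
 * false if it was ignored because the device is not ready yet. */
static bool demo_service_irq(struct demo_cnic_dev *dev)
{
    if (!dev->up || dev->uinfo == NULL)
        return false;              /* early interrupt: drop it safely */

    demo_event_notify(dev->uinfo);
    return true;
}
```

With this guard, an interrupt that arrives while the device is being reset or reloaded is ignored instead of dereferencing a NULL pointer. A real driver would also need proper synchronization between the interrupt path and the setup path, which this sketch leaves out.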
I think this upstream patch should fix it:

http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.git;a=commit;h=94824f3dbe0d3f62470603bbb18efb5510aaf07c

We saw a similar issue during an MTU change and fixed it with the above patch.

Thanks.
Yes, the patch fixes the problem. Thanks, Michael.
This patch was applied with the patch series for bug 582367.

*** This bug has been marked as a duplicate of bug 582367 ***