Bug 854777

Summary: driver_attach called module probe with spin_lock_irqX held produces WARN_ON(irqs_disabled()) with pci_free_consistent
Product: Red Hat Enterprise Linux 6 Reporter: minh <minh.tran>
Component: kernel Assignee: Prarit Bhargava <prarit>
Status: CLOSED WORKSFORME QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.3 CC: jayamohan.kallickal, jbenc, rkhan
Target Milestone: rc Keywords: Reopened
Target Release: ---   
Hardware: i386   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-09-24 17:53:55 UTC Type: Bug
Attachments:
Description Flags
rhel6.3 i386 installation log none

Description minh 2012-09-05 23:58:45 UTC
Created attachment 610124 [details]
rhel6.3 i386 installation log

Description of problem:  We are getting a kernel warning trace with Emulex's out-of-box be2iscsi driver during rhel6.3 i386 installation.  The problem occurs when the driver loads.  irqs_disabled() is already true when our pci_dev_probe entry point is invoked.  This suggests one of two things: either a spin_lock_irqX or a disable_irq() has been called somewhere earlier in the __driver_attach stack.

Note: this problem has been observed only with the rhel6.3 i386 installation kernel.


Version-Release number of selected component (if applicable):


How reproducible: The problem should be reproducible with any driver that calls pci_free_consistent while loading during OS installation.


Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:
02:34:14,684 WARNING kernel:------------[ cut here ]------------
02:34:14,684 WARNING kernel:WARNING: at /usr/src/kernels/2.6.32-268.el6.i686/arch/x86/include/asm/dma-mapping.h:154 se_pci_free_consistent+0xa5/0xb0 [be2iscsi]() (Not tainted)
02:34:14,684 WARNING kernel:Hardware name: PowerEdge T310
02:34:14,684 WARNING kernel:Modules linked in: be2iscsi(+)(U) bnx2 ipv6 iscsi_ibft iscsi_boot_sysfs pcspkr edd sg sd_mod crc_t10dif iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi squashfs cramfs usb_storage [last unloaded: cdrom]
02:34:14,684 WARNING kernel:Pid: 708, comm: modprobe Not tainted 2.6.32-279.el6.i686 #1
02:34:14,684 WARNING kernel:Call Trace:
02:34:14,684 WARNING kernel: [<c0455c11>] ? warn_slowpath_common+0x81/0xc0
02:34:14,684 WARNING kernel: [<f8149cd5>] ? se_pci_free_consistent+0xa5/0xb0 [be2iscsi]
02:34:14,684 WARNING kernel: [<f8149cd5>] ? se_pci_free_consistent+0xa5/0xb0 [be2iscsi]
02:34:14,684 WARNING kernel: [<c0455c6b>] ? warn_slowpath_null+0x1b/0x20
02:34:14,684 WARNING kernel: [<f8149cd5>] ? se_pci_free_consistent+0xa5/0xb0 [be2iscsi]
02:34:14,684 WARNING kernel: [<f814ea60>] ? se_free_internal_ioctl_data+0x70/0xa0 [be2iscsi]
02:34:14,684 WARNING kernel: [<f81564c5>] ? se_get_boot_target_info+0xe5/0x190 [be2iscsi]
02:34:14,684 WARNING kernel: [<f814cfbf>] ? se_pci_dev_probe+0x7bf/0xaf0 [be2iscsi]
02:34:14,684 WARNING kernel: [<c05ff51e>] ? rb_insert_color+0xce/0x100
02:34:14,684 WARNING kernel: [<c058c85b>] ? __sysfs_add_one+0x5b/0x90
02:34:14,684 WARNING kernel: [<c0615bbb>] ? local_pci_probe+0xb/0x10
02:34:14,684 WARNING kernel: [<c06169a1>] ? pci_device_probe+0x61/0x80
02:34:14,684 WARNING kernel: [<c06c1197>] ? driver_probe_device+0x87/0x290
02:34:14,684 WARNING kernel: [<c0615c92>] ? pci_match_device+0x12/0xa0
02:34:14,684 WARNING kernel: [<c06c1419>] ? __driver_attach+0x79/0x80
02:34:14,684 WARNING kernel: [<c06c13a0>] ? __driver_attach+0x0/0x80
02:34:14,684 WARNING kernel: [<c06c0592>] ? bus_for_each_dev+0x52/0x80
02:34:14,684 WARNING kernel: [<c06c0f86>] ? driver_attach+0x16/0x20
02:34:14,684 WARNING kernel: [<c06c13a0>] ? __driver_attach+0x0/0x80
02:34:14,684 WARNING kernel: [<c06c093f>] ? bus_add_driver+0x1cf/0x320
02:34:14,684 WARNING kernel: [<c06168e0>] ? pci_device_remove+0x0/0x40
02:34:14,684 WARNING kernel: [<c06c169f>] ? driver_register+0x5f/0x110
02:34:14,684 WARNING kernel: [<c04bd3af>] ? tracepoint_module_notify+0x1f/0x30
02:34:14,684 WARNING kernel: [<f7e3f000>] ? init_module+0x0/0x62 [be2iscsi]
02:34:14,684 WARNING kernel: [<c0616bbd>] ? __pci_register_driver+0x3d/0xb0
02:34:14,684 WARNING kernel: [<c040303f>] ? do_one_initcall+0x2f/0x1c0
02:34:14,684 WARNING kernel: [<c0492514>] ? sys_init_module+0xb4/0x220
02:34:14,684 WARNING kernel: [<c052dee1>] ? sys_read+0x41/0x70
02:34:14,684 WARNING kernel: [<c0409a9f>] ? sysenter_do_call+0x12/0x28
02:34:14,684 WARNING kernel:---[ end trace 18c78d9781995049 ]---

Comment 2 Prarit Bhargava 2012-09-06 17:44:38 UTC
>02:34:14,684 WARNING kernel:Modules linked in: be2iscsi(+)(U) bnx2 ipv6 

You've loaded an unsigned kernel module and that's where the errors are coming from.

We do not support this module.

P.

Comment 3 minh 2012-09-06 17:55:19 UTC
Prarit,
This has nothing to do with the module, it's still a kernel bug which needs to be fixed.  Any module that calls pci_free_consistent will hit the problem during rhel6.3 i386 installation.

-Minh

Comment 4 Prarit Bhargava 2012-09-10 11:00:43 UTC
(In reply to comment #3)
> Prarit,
> This has nothing to do with the module, it's still a kernel bug which needs
> to be fixed.  Any module that calls pci_free_consistent will hit the problem
> during rhel6.3 i386 installation.
> 

Hmm ... okay, I'll take a look but I'm pretty sure we have 32-bit drivers that call pci_free_consistent() and don't exhibit a problem.

P.

Comment 5 Prarit Bhargava 2012-09-10 14:35:39 UTC
(In reply to comment #3)
> Prarit,
> This has nothing to do with the module, it's still a kernel bug which needs
> to be fixed.  Any module that calls pci_free_consistent will hit the problem
> during rhel6.3 i386 installation.
> 
> -Minh

Sorry, I don't see this behaviour.  I created a simple module that, on module load, allocated an 8-byte chunk of memory (via pci_alloc_consistent) for the root bridge.  I then freed the 8-byte chunk (via pci_free_consistent) and didn't see any problems.

This is a bug in the upstream be2iscsi driver.

P.

Comment 6 Prarit Bhargava 2012-09-10 14:39:55 UTC
(In reply to comment #5)
> (In reply to comment #3)
> > Prarit,
> > This has nothing to do with the module, it's still a kernel bug which needs
> > to be fixed.  Any module that calls pci_free_consistent will hit the problem
> > during rhel6.3 i386 installation.
> > 
> > -Minh
> 
> Sorry, I don't see this behaviour.  I created a simple module that allocated
> (via pci_alloc_consistent) a 8 byte chunk of memory to the root bridge on
> module load.  I then free'd the memory (via pci_free_consistent) the 8 byte
> chunk of memory and don't see any problems.
> 

Oh sorry -- I forgot the spin_lock_irq() part.

Yeah ... that still isn't a kernel bug.  The problem is that you shouldn't have interrupts disabled while calling pci_free_consistent.  The kernel cannot handle that, i.e., this is still a bug in the be2iscsi driver.

P.
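
For illustration, here is a minimal sketch of the pattern comment 6 describes: detach the buffer while holding the lock, then call pci_free_consistent() only after the lock is released. The structure and function names (demo_dev, demo_release_buf) are hypothetical and are not taken from be2iscsi; the sketch assumes the 2.6.32-era PCI DMA API.

```c
#include <linux/pci.h>
#include <linux/spinlock.h>

/* Hypothetical per-device state; not taken from the be2iscsi driver. */
struct demo_dev {
	struct pci_dev *pdev;
	spinlock_t lock;
	void *buf;		/* CPU address from pci_alloc_consistent() */
	dma_addr_t buf_dma;	/* bus address of the same buffer */
	size_t buf_len;
};

/* Detach the buffer under the lock, then free it after unlocking. */
static void demo_release_buf(struct demo_dev *d)
{
	unsigned long flags;
	void *buf;
	dma_addr_t dma;
	size_t len;

	spin_lock_irqsave(&d->lock, flags);
	buf = d->buf;
	dma = d->buf_dma;
	len = d->buf_len;
	d->buf = NULL;
	spin_unlock_irqrestore(&d->lock, flags);

	/* The free happens outside this function's irqsave section. */
	if (buf)
		pci_free_consistent(d->pdev, len, buf, dma);
}
```

Note that spin_unlock_irqrestore() only restores the interrupt state that was saved on entry. If the caller already had interrupts disabled, the warning in the x86 dma-mapping.h free path would still fire, which is the situation the reporter describes.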

Comment 8 minh 2012-09-10 17:14:45 UTC
Prarit,
Sorry about the confusion; I did not clearly explain how to reproduce it.  The problem is that interrupts are already disabled when the kernel calls the be2iscsi driver's module probe.  My educated guess is that interrupts were disabled via spin_lock_irqsave(), but it's also possible that disable_irq() was called.  To show that it's a kernel issue, here is how to reproduce it:

1) compile a driver in rhel6.2 i386 with WARN_ON(irqs_disabled()) added immediately at the driver pci_dev_probe entry point.
2) make a DUD.iso to load during installation.
3) boot with rhel6.3 i386 OS installation CD then add driver in DUD.iso
4) after driver loads, look at /tmp/syslog or dmesg to see kernel trace

Note: this situation only happens with the rhel6.3 i386 installation CD, during __driver_attach context.
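
As a rough sketch of step 1, a probe entry point instrumented this way could look like the module below. Everything named demo_* and the PCI vendor/device IDs are placeholders (assumptions for illustration only); it is written against the 2.6.32-era driver API and is not the be2iscsi code.

```c
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/pci.h>
#include <linux/irqflags.h>
#include <linux/hardirq.h>

/* Placeholder IDs (hypothetical); substitute the device under test. */
static const struct pci_device_id demo_ids[] = {
	{ PCI_DEVICE(0x1234, 0x5678) },
	{ 0, }
};
MODULE_DEVICE_TABLE(pci, demo_ids);

static int demo_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
	/* Fires if local interrupts are masked on this CPU at probe entry. */
	WARN_ON(irqs_disabled());

	/*
	 * A different check: fires only in hard or soft interrupt context.
	 * It stays silent in process context even when interrupts are masked.
	 */
	WARN_ON(in_interrupt());

	return pci_enable_device(pdev);
}

static void demo_remove(struct pci_dev *pdev)
{
	pci_disable_device(pdev);
}

static struct pci_driver demo_driver = {
	.name		= "irqstate_demo",
	.id_table	= demo_ids,
	.probe		= demo_probe,
	.remove		= demo_remove,
};

static int __init demo_init(void)
{
	return pci_register_driver(&demo_driver);
}

static void __exit demo_exit(void)
{
	pci_unregister_driver(&demo_driver);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");
```

The two checks are not interchangeable: irqs_disabled() reports the CPU's local interrupt flag, which spin_lock_irqsave() or local_irq_disable() clears, while in_interrupt() reports whether the code is running in interrupt context. Also, disable_irq() masks a single IRQ line and, as far as I know, does not by itself make irqs_disabled() return true.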

Comment 10 Prarit Bhargava 2012-09-24 17:53:55 UTC
(In reply to comment #8)
> Prarit,
> Sorry about the confusion, I've not clearly tell you how to reproduce it. 
> The problem is that interrupts is already disabled when it calls the
> be2iscsi driver module probe.  My educated guess is interrupt has been
> disabled via spin_lock_irqsave() but it's also possible disable_irq() has
> been called.  To prove it's a kernel issue, here is how you do it:
> 
> 1) compile a driver in rhel6.2 i386 with WARN_ON(irqs_disabled()) added
> immediately at the driver pci_dev_probe entry point.
> 2) make a DUD.iso to load during installation.
> 3) boot with rhel6.3 i386 OS installation CD then add driver in DUD.iso
> 4) after driver loads, look at /tmp/syslog or dmesg to see kernel trace
> 
> Note: this situation only happens with the rhel6.3 i386 installation CD,
> during __driver_attach context.

I did the following on a system which required the be2net driver.

1.  Modified the source with

diff --git a/drivers/net/benet/be_main.c b/drivers/net/benet/be_main.c
index 6da26f4..0800f5d 100644
--- a/drivers/net/benet/be_main.c
+++ b/drivers/net/benet/be_main.c
@@ -3534,6 +3534,32 @@ static int __devinit be_probe(struct pci_dev *pdev,
        struct be_adapter *adapter;
        struct net_device *netdev;
 
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       printk(KERN_EMERG "LOADING MODIFIED BE2NET DRIVER\n");
+       WARN_ON(in_interrupt());
+
        status = pci_enable_device(pdev);
        if (status)
                goto do_none;


2.  built dud disk, burned to CD.

3.  started install with "dd" option

4.  Inserted driver disk when prompted to

5.  Hit ALT-F4 to watch kernel messages.

6.  I see "LOADING MODIFIED BE2NET DRIVER" (like I expect to from the above printks), but do not see a stack trace from the WARN_ON()



This is likely a bug in your proprietary driver.

P.