| Summary: | [RHEL6] Kernel crash in bdi_remove_from_list with EMC PowerPath | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Hidehiko Matsumoto <hmatsumo> |
| Component: | kernel | Assignee: | Ewan D. Milne <emilne> |
| kernel sub component: | Storage | QA Contact: | Storage QE <storage-qe> |
| Status: | CLOSED DUPLICATE | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | coughlan, djeffery, hideki.miyajima, kearnan_keith, nyamashi, revers, tatsu-ab1 |
| Version: | 6.4 | Flags: | hmatsumo: needinfo? (djeffery) |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-11-14 18:45:52 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description   Hidehiko Matsumoto   2016-10-03 14:20:05 UTC
The crash and warnings look likely to be side effects of how PowerPath grafts itself into the SCSI stack in a non-standard and racy way.

The backing_dev_info structure which triggered the crash is part of a request_queue at 0xffff882feed924f8. This request_queue was for the scsi_device for sdhx/5:0:1:21. Each points to the other as expected:

crash> p ((struct scsi_device *)0xffff882fda0bd800)->request_queue
$20 = (struct request_queue *) 0xffff882feed924f8
crash> p ((struct scsi_device *)0xffff882fda0bd800)->request_queue->queuedata
$21 = (void *) 0xffff882fda0bd800

However, this isn't the request_queue as found by sdhx's gendisk:

crash> dev -d | grep sdhx
  134 ffff882feb6eac00 sdhx   ffff88160556a278       0     0     0     0

The gendisk for sdhx shows a request_queue of 0xffff88160556a278. This is because of PowerPath: it is injecting its own request_queue structures into the normal SCSI gendisk structures, which is not part of the SCSI and block layer design. Swapping out the request queues is racy, which may also be the cause of the WARNING messages seen in the logs. PowerPath changing a request_queue while a task is accessing a SCSI dev node can create race conditions and could break reference counting.

In addition to the crash-triggering task, two more tasks, PIDs 399 and 406, were also manipulating request_queue and backing_dev_info structures from PowerPath functions. EMC would need to examine how they can interact with the storage stack in a non-racy way.

<snip>
BDI registration happens first in add_disk in fixed kernels:
/* Register BDI before referencing it from bdev */
bdi = &disk->queue->backing_dev_info;
bdi_register_dev(bdi, disk_devt(disk));
blk_register_region(disk_devt(disk), disk->minors, NULL,
exact_match, exact_lock, disk);
register_disk(disk);
blk_register_queue(disk);
The likely cause of the warnings is PowerPath. Code in the emcp module appears to be doing things like unregistering a bdi and then registering a bdi, from emcp_reenable_io, while the disk is live:
crash> bt
PID: 413 TASK: ffff882feecfd500 CPU: 6 COMMAND: "scsi_wq_5"
#0 [ffff882fec7b14b0] machine_kexec at ffffffff81035d6b
#1 [ffff882fec7b1510] crash_kexec at ffffffff810c0e22
#2 [ffff882fec7b15e0] oops_end at ffffffff81511cb0
#3 [ffff882fec7b1610] die at ffffffff8100f19b
#4 [ffff882fec7b1640] do_general_protection at ffffffff815117b2
#5 [ffff882fec7b1670] general_protection at ffffffff81510f85
[exception RIP: bdi_remove_from_list+47]
RIP: ffffffff8113c1bf RSP: ffff882fec7b1720 RFLAGS: 00010282
RAX: dead000000200200 RBX: ffff882feed92658 RCX: 0000000000000158
RDX: ffff8814de8bef88 RSI: 0000000000000246 RDI: ffffffff81fb7520
RBP: ffff882fec7b1730 R8: 0000000000000000 R9: 0000000000000000
R10: 00000000beefdead R11: 0000000000000000 R12: 0000000000000246
R13: ffff882feed927c0 R14: ffff882feb6eac00 R15: 0000000008600070
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#6 [ffff882fec7b1738] bdi_unregister at ffffffff8113c2a1
#7 [ffff882fec7b1768] emcp_bdi_unregister at ffffffffa02580ee [emcp]
#8 [ffff882fec7b1778] emcp_reenable_io at ffffffffa0261fff [emcp]
#9 [ffff882fec7b1818] emcp_map_device at ffffffffa026ce3c [emcp]
#10 [ffff882fec7b18a8] emcp_add at ffffffffa026ec3e [emcp]
#11 [ffff882fec7b1918] emcp_chg_device_notify at ffffffffa0270f35 [emcp]
#12 [ffff882fec7b1938] notifier_call_chain at ffffffff81513cb5
#13 [ffff882fec7b1978] __blocking_notifier_call_chain at ffffffff8109cfba
#14 [ffff882fec7b19c8] blocking_notifier_call_chain at ffffffff8109cff6
#15 [ffff882fec7b19d8] driver_bound at ffffffff8135f26c
#16 [ffff882fec7b19f8] driver_probe_device at ffffffff8135f39f
#17 [ffff882fec7b1a28] __device_attach at ffffffff8135f693
#18 [ffff882fec7b1a48] bus_for_each_drv at ffffffff8135e5c4
#19 [ffff882fec7b1a88] device_attach at ffffffff8135f774
#20 [ffff882fec7b1ab8] bus_probe_device at ffffffff8135e36d
#21 [ffff882fec7b1ac8] device_add at ffffffff8135c677
#22 [ffff882fec7b1b48] scsi_sysfs_add_sdev at ffffffff8137fe59
#23 [ffff882fec7b1b88] scsi_probe_and_add_lun at ffffffff8137d2a0
#24 [ffff882fec7b1cc8] __scsi_scan_target at ffffffff8137ddfc
#25 [ffff882fec7b1db8] scsi_scan_target at ffffffff8137e6c5
#26 [ffff882fec7b1e08] fc_scsi_scan_rport at ffffffffa008a8ad [scsi_transport_fc]
#27 [ffff882fec7b1e38] worker_thread at ffffffff81090be0
#28 [ffff882fec7b1ee8] kthread at ffffffff81096a36
#29 [ffff882fec7b1f48] kernel_thread at ffffffff8100c0ca
</snip>
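
For reference, here is a minimal sketch, not taken from the dump and with a hypothetical helper name, of the linkage the crash> commands in the description were checking. In the stock sd/block-layer design, the scsi_device, its request_queue, and the gendisk all refer to the same queue; in this vmcore, sdhx's gendisk instead pointed at a separate, PowerPath-provided queue (0xffff88160556a278 vs. 0xffff882feed924f8).

#include <linux/blkdev.h>
#include <linux/genhd.h>
#include <scsi/scsi_device.h>

/* Hypothetical helper: true when the queue wiring matches what the SCSI
 * midlayer and sd set up on their own, i.e. what the crash output above
 * was verifying for sdhx. */
static bool queue_linkage_is_consistent(struct scsi_device *sdev,
                                        struct gendisk *disk)
{
	struct request_queue *q = sdev->request_queue;

	/* The midlayer points queuedata back at the owning scsi_device
	 * (matches $21 above). */
	if (q->queuedata != sdev)
		return false;

	/* sd's gendisk is expected to use that same queue; in the dump
	 * it did not, because PowerPath substituted its own. */
	return disk->queue == q;
}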
Hi Tom,

This seems like it could be similar to the issue from https://access.redhat.com/solutions/1212373. Could this be related? I have no visibility into https://bugzilla.redhat.com/show_bug.cgi?id=1111683.

-Keith

The systems in https://access.redhat.com/solutions/1212373 are exhibiting similar behavior during device remove. In this case, it's not remove but device add. We appear to be getting the WARNINGs and the crash from PowerPath removing and adding bdi connections while a block device and its gendisk are already active and visible. It's not a race on delete but a violation of the block layer's expectation that the block device and its bdi will be stable while the device exists.

Yes, this panic happened during device add, not removal. But isn't the behavior similar in both cases? In this case too, bdi_unregister was called, which in turn called bdi_remove_from_list to remove a BDI which had already been removed. From the dump, BDI_pending is set on the bdi, which means that when line 612 was executed, BDI_pending was not set; by the time line 618, bdi_remove_from_list(), was executing, BDI_pending had been set by some other thread. From the code, the BDI_pending bit is set in only one place, in bdi_add_default_flusher_task() by the default flusher task. Isn't this issue due to a race between bdi_unregister and the bdi-default thread, as in BZ 1111683? Should we ask the customer to upgrade to a kernel version which has the fix for BZ 1111683? Appreciate your comments.

602 static void bdi_wb_shutdown(struct backing_dev_info *bdi)
603 {
609         /*
610          * If setup is pending, wait for that to complete first
611          */
612         wait_on_bit(&bdi->state, BDI_pending, bdi_sched_wait,
613                     TASK_UNINTERRUPTIBLE);
614
615         /*
616          * Make sure nobody finds us on the bdi_list anymore
617          */
618         bdi_remove_from_list(bdi);

501 static void bdi_add_default_flusher_task(struct backing_dev_info *bdi)
502 {
..
518         if (!test_and_set_bit(BDI_pending, &bdi->state)) {
519                 list_del_rcu(&bdi->bdi_list);   // Removed from the list here
520
521                 /*
522                  * We must wait for the current RCU period to end before
523                  * moving to the pending list. So schedule that operation
524                  * from an RCU callback.
525                  */
526                 call_rcu(&bdi->rcu_head, bdi_add_to_pending);

Yes, the behavior in BZ 1111683 is similar to the crash, and its fix may correct enough of the issue. It will not stop the WARN messages from removing the bdi of an active and accessible device, which creates other problems I am not sure BZ 1111683 would fully protect against. Once available, testing such a kernel may not be a bad idea. Patches for BZ 1111683 have already been pulled once for creating a new issue.

We asked for more explanation on the panic, as the last comment by David Jeffery at 2016-10-25 16:32:33 EDT ("the behavior in BZ1111683 is similar to the crash.") is not very clear. Here is what we got from Red Hat Japan: "This has nothing to do with that EMC bug. The EMC PowerPath module appears to be doing something that is completely unsupportable and broken." I think the *something* refers to the WARNINGs, but can you please tell us what that something is? Is this about the WARNINGs or the panic? As for the panic, which causes it, the kernel or PowerPath?

The warnings look to be PowerPath related. PowerPath appears to be unregistering and registering bdi devices on a live, fully accessible device. This is not intended behavior and can create races which will trigger the warnings. The crash isn't as clear. There is a similar crash which is being fixed in RHEL. However, PowerPath's methods of injecting itself into SCSI disk structures as part of it claiming a disk are not standard, from its unregister/register of the bdi device to creating alternate gendisks and request_queues. Without a full view of what PowerPath is doing, we cannot tell if it is the same issue or just a similar one triggered by PowerPath's methods of inserting itself into the disk stack in ways that were never intended by the SCSI/block layer's design.

Regarding these two statements:

- powerpath's methods of injecting itself into SCSI disk structures as part of it claiming a disk are not standard, from its unregister/register of bdi device to creating alternate gendisks and request_queues.
- powerpath's methods of inserting itself into the disk stack in ways that were never intended by the SCSI/block layer's design.

Can you tell us what makes you think PowerPath is behaving this way? I would appreciate it if you could show the stack or any other evidence based on the core. Thank you.

Closing as a duplicate of BZ 1111683. There are fixes being made to correct issues with device removal which, according to our analysis of the crash dumps, appear to be the cause of the crashes. EMC appears to concur with this analysis. There is a pre-beta test kernel available for partners, but not for customer use; see BZ 1111683 for details. Please provide any testing feedback if it is used. If for some reason the test kernel and/or RHEL 6.9 GA does not solve this issue for the customer, then re-open this bug with further details at that point.

*** This bug has been marked as a duplicate of bug 1111683 ***
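
For reference, the following is a minimal userspace sketch of the interleaving described in the comments above. It uses stand-in names and a forced, deterministic ordering rather than real kernel code, so it is an illustration of the analysis, not a reproducer: the bdi_wb_shutdown() side passes its BDI_pending check, the bdi-default side then sets BDI_pending and removes the bdi from the list, and the shutdown side finally deletes an entry whose prev pointer has already been poisoned, which corresponds to the GPF in bdi_remove_from_list with RAX = dead000000200200 (LIST_POISON2).

/*
 * Build with: gcc -o bdi_race_sketch bdi_race_sketch.c   (x86_64)
 * "flusher" mimics bdi_add_default_flusher_task(); "shutdown" mimics
 * bdi_wb_shutdown().  The flag check and the list deletion are not one
 * atomic step, so the same entry can be deleted twice.
 */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define LIST_POISON2 ((struct list_head *)0xdead000000200200UL)

struct list_head { struct list_head *next, *prev; };

static struct list_head bdi_list = { &bdi_list, &bdi_list };
static struct list_head bdi_entry;
static bool bdi_pending;

/* Same shape as the kernel's list_del_rcu(): unlink, then poison prev. */
static void list_del_rcu_like(struct list_head *entry, const char *who)
{
	if (entry->prev == LIST_POISON2) {
		/* In the kernel, this is the write through LIST_POISON2
		 * that faulted at bdi_remove_from_list+47. */
		printf("%s: double delete, prev is already poisoned\n", who);
		exit(1);
	}
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->prev = LIST_POISON2;
}

int main(void)
{
	/* Put the entry on the list, as registration would. */
	bdi_entry.next = &bdi_list;
	bdi_entry.prev = &bdi_list;
	bdi_list.next  = &bdi_entry;
	bdi_list.prev  = &bdi_entry;

	/*
	 * Shutdown side (line 612): BDI_pending is clear, so wait_on_bit()
	 * returns at once and the task heads for bdi_remove_from_list().
	 *
	 * Before it gets there, the flusher side (lines 518-519) wins
	 * the race:
	 */
	if (!bdi_pending) {			/* test_and_set_bit(BDI_pending) */
		bdi_pending = true;
		list_del_rcu_like(&bdi_entry, "flusher");
	}

	/* Shutdown side resumes (line 618) and deletes the same entry. */
	list_del_rcu_like(&bdi_entry, "shutdown");

	return 0;
}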