Bug 1623601 - Cannot execute multipath -F

Status: VERIFIED
Product: Red Hat Ceph Storage
Classification: Red Hat
Component: iSCSI
Version: 3.1
Hardware/OS: Unspecified / Linux
Priority: high
Severity: high
Target Milestone: z1
Target Release: 3.1
Assigned To: Mike Christie
QA Contact: Manohar Murthy
Docs Contact: Bara Ancincova
Duplicates: 1626836
Depends On:
Blocks: 1584264

Reported: 2018-08-29 13:46 EDT by Yevhenii Shapovalov
Modified: 2018-10-24 06:47 EDT
CC: 13 users
Fixed In Version: tcmu-runner-1.4.0-0.3.el7cp
Doc Type: Bug Fix
Doc Text:
.The DM-Multipath device's path no longer bounces between the failed and active states, causing I/O failures, hangs, and performance issues _[fixed by 3.1z1]_

In Red Hat Enterprise Linux 7.5, the kernel's ALUA layer reduced the number of times an initiator retries the SCSI sense code `ALUA State Transition`. This code is returned from the target side by the `tcmu-runner` service when taking the RBD exclusive lock during a failover or failback scenario and during device discovery. As a consequence, the maximum number of retries was reached before the discovery process completed, and the SCSI layer returned a failure to the multipath I/O layer. The multipath I/O layer tried the next available path, and the same problem occurred. This behavior caused a loop of path checking, resulting in failed I/O and management operations on the multipath device. In addition, the logs on the initiator node printed messages about devices being removed and then re-added. This bug has been fixed, and the aforementioned operations no longer fail.
Clones: 1624040
Type: Bug
Flags: vakulkar: automate_bug+, mkasturi: needinfo+




External Trackers:
GitHub open-iscsi/tcmu-runner/pull/471 (last updated 2018-09-12 11:42 EDT)

Description Yevhenii Shapovalov 2018-08-29 13:46:13 EDT
Description of problem:
The iSCSI device is constantly held busy by systemd-udevd. The issue can also be reproduced on Ceph 3.0.

Version-Release number of selected component (if applicable):
RHEL 7.5 (all nodes), kernel 3.10.0-862.el7.x86_64, Ceph 3.0 or 3.1

How reproducible:
20-30%

Steps to Reproduce:
1. Deploy Ceph storage.
2. Configure the iSCSI target using gwcli.
3. Configure the iSCSI initiator.
4. Connect the initiator to the target.
5. Create an ext4 file system on the iSCSI disks.
6. Mount the disks on the initiator.
7. Run some I/O operations.
8. Unmount all disks.
9. Execute multipath -F. (A scripted sketch of steps 5-9 follows.)
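
A minimal scripted sketch of steps 5-9, assuming a single multipath device and mount point; /dev/mapper/mpatha and /mnt/test are hypothetical names used for illustration only:

#!/bin/bash
# Hedged repro sketch for steps 5-9; device and mount point are assumed names.
DEV=/dev/mapper/mpatha   # hypothetical multipath device
MNT=/mnt/test            # hypothetical mount point
mkfs.ext4 -F "$DEV"                                            # step 5: ext4 fs
mkdir -p "$MNT" && mount "$DEV" "$MNT"                         # step 6: mount
dd if=/dev/zero of="$MNT/io.bin" bs=1M count=512 oflag=direct  # step 7: some I/O
umount "$MNT"                                                  # step 8: unmount
multipath -F                                                   # step 9: flush maps
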
Actual results:
Flushing fails; at least one of the disks is still busy:
[root@ceph-yshap-1535553643452-node4-iscsi-clients log]# fuser -vam /dev/mapper/mpatha
                     USER        PID ACCESS COMMAND
/dev/dm-0:           root      29785 f.... systemd-udevd


Expected results:
Clean up multipath

Additional info:
Environment topology: 4 OSD nodes, 1 installer+MON+MGR node, 1 iSCSI initiator node

[root@ceph-yshap-1535553643452-node6-osd ~]# gwcli
/> ls /
o- / ......................................................................................................................... [...]
  o- clusters ........................................................................................................ [Clusters: 1]
  | o- ceph .......................................................................................................... [HEALTH_WARN]
  |   o- pools .......................................................................................................... [Pools: 1]
  |   | o- rbd ............................................................................. [(x3), Commit: 16G/59G (27%), Used: 9G]
  |   o- topology ............................................................................................... [OSDs: 12,MONs: 1]
  o- disks ......................................................................................................... [16G, Disks: 8]
  | o- rbd.test_image9350 .................................................................................... [test_image9350 (2G)]
  | o- rbd.test_image9351 .................................................................................... [test_image9351 (2G)]
  | o- rbd.test_image9352 .................................................................................... [test_image9352 (2G)]
  | o- rbd.test_image9353 .................................................................................... [test_image9353 (2G)]
  | o- rbd.test_image9354 .................................................................................... [test_image9354 (2G)]
  | o- rbd.test_image9355 .................................................................................... [test_image9355 (2G)]
  | o- rbd.test_image9356 .................................................................................... [test_image9356 (2G)]
  | o- rbd.test_image9357 .................................................................................... [test_image9357 (2G)]
  o- iscsi-target ..................................................................................................... [Targets: 1]
    o- iqn.2003-01.com.redhat.iscsi-gw:ceph-igw ...................................................................... [Gateways: 4]
      o- gateways ............................................................................................ [Up: 4/4, Portals: 4]
      | o- ceph-yshap-1535553643452-node2-osd ................................................................ [172.16.115.126 (UP)]
      | o- ceph-yshap-1535553643452-node3-osd ................................................................. [172.16.115.49 (UP)]
      | o- ceph-yshap-1535553643452-node5-osd ................................................................ [172.16.115.141 (UP)]
      | o- ceph-yshap-1535553643452-node6-osd ................................................................ [172.16.115.159 (UP)]
      o- host-groups .................................................................................................. [Groups : 0]
      o- hosts .......................................................................................................... [Hosts: 1]
        o- iqn.1994-05.com.redhat:a0c1ceb63ca1 .............................................. [LOGGED-IN, Auth: CHAP, Disks: 8(16G)]
          o- lun 0 ............................................. [rbd.test_image9350(2G), Owner: ceph-yshap-1535553643452-node2-osd]
          o- lun 1 ............................................. [rbd.test_image9351(2G), Owner: ceph-yshap-1535553643452-node5-osd]
          o- lun 2 ............................................. [rbd.test_image9352(2G), Owner: ceph-yshap-1535553643452-node6-osd]
          o- lun 3 ............................................. [rbd.test_image9353(2G), Owner: ceph-yshap-1535553643452-node3-osd]
          o- lun 4 ............................................. [rbd.test_image9354(2G), Owner: ceph-yshap-1535553643452-node2-osd]
          o- lun 5 ............................................. [rbd.test_image9355(2G), Owner: ceph-yshap-1535553643452-node5-osd]
          o- lun 6 ............................................. [rbd.test_image9356(2G), Owner: ceph-yshap-1535553643452-node6-osd]
          o- lun 7 ............................................. [rbd.test_image9357(2G), Owner: ceph-yshap-1535553643452-node3-osd]


After flushing:
[root@ceph-yshap-1535553643452-node4-iscsi-clients log]# multipath -ll
mpatha (36001405c882304bed544a8ea7277cf3d) dm-0 LIO-ORG ,TCMU device     
size=2.0G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='queue-length 0' prio=50 status=enabled
| `- 3:0:0:0 sdj  8:144  failed ready running
|-+- policy='queue-length 0' prio=10 status=enabled
| `- 2:0:0:0 sda  8:0    failed ready running
|-+- policy='queue-length 0' prio=10 status=enabled
| `- 4:0:0:0 sdi  8:128  failed ready running
`-+- policy='queue-length 0' prio=10 status=active
  `- 5:0:0:0 sdk  8:160  active ready running
mpathg (360014050d58eba023ec4cb2a61b1972b) dm-6 LIO-ORG ,TCMU device     
size=2.0G features='1 queue_if_no_path' hwhandler='1 alua' wp=rw
|-+- policy='queue-length 0' prio=50 status=active
| `- 4:0:0:1 sdae 65:224 failed ready running
|-+- policy='queue-length 0' prio=10 status=enabled
| `- 3:0:0:1 sdad 65:208 failed ready running
|-+- policy='queue-length 0' prio=10 status=enabled
| `- 5:0:0:1 sdaf 65:240 failed ready running
`-+- policy='queue-length 0' prio=10 status=enabled
  `- 2:0:0:1 sdh  8:112  failed ready running
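
In the multipath -ll output above, "failed ready running" means the device-mapper path state is failed while the path checker reports the path ready, so multipathd keeps reinstating paths that immediately fail again. A quick way to watch the paths bounce between states (a hedged sketch; multipathd's -k interactive mode is standard on RHEL 7):

watch -n1 "multipathd -k'show paths'"   # refresh path states every second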

[root@ceph-yshap-1535553643452-node4-iscsi-clients log]# cat  messages | grep systemd-udevd
Aug 29 10:44:11 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: starting version 219
Aug 29 10:44:12 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: starting version 219
Aug 29 11:19:03 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [11525] /devices/virtual/block/dm-7 is taking a long time
Aug 29 11:19:51 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [11523] /devices/virtual/block/dm-2 is taking a long time
Aug 29 11:20:53 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [11525] /devices/virtual/block/dm-7 is taking a long time
Aug 29 11:22:19 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [11526] /devices/virtual/block/dm-3 is taking a long time
Aug 29 11:24:52 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [11525] /devices/virtual/block/dm-7 is taking a long time
Aug 29 11:26:41 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [11523] /devices/virtual/block/dm-5 is taking a long time
Aug 29 11:29:57 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [11523] /devices/virtual/block/dm-7 is taking a long time
Aug 29 11:46:45 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29079] /devices/virtual/block/dm-6 is taking a long time
Aug 29 11:52:25 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29785] /devices/virtual/block/dm-0 is taking a long time
Aug 29 11:53:28 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29885] /devices/virtual/block/dm-6 is taking a long time
Aug 29 11:54:10 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29785] /devices/virtual/block/dm-0 is taking a long time
Aug 29 11:56:25 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29785] /devices/virtual/block/dm-0 is taking a long time
Aug 29 11:56:25 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29885] /devices/virtual/block/dm-6 is taking a long time
Aug 29 12:01:48 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29785] /devices/virtual/block/dm-0 is taking a long time
Aug 29 12:03:13 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: timeout '/sbin/multipath -T 1535555852: -c /dev/dm-0'
Aug 29 12:03:13 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: slow: '/sbin/multipath -T 1535555852: -c /dev/dm-0' [31158]
Aug 29 12:03:13 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: timeout '/sbin/kpartx -an /dev/dm-0'
Aug 29 12:03:13 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: slow: '/sbin/kpartx -an /dev/dm-0' [31159]
Aug 29 12:04:27 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29785] /devices/virtual/block/dm-0 is taking a long time
Aug 29 12:04:27 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29885] /devices/virtual/block/dm-6 is taking a long time
Aug 29 12:06:06 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29885] /devices/virtual/block/dm-6 timeout; kill it
Aug 29 12:06:06 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: seq 4925 '/devices/virtual/block/dm-6' killed
Aug 29 12:06:17 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29885] terminated by signal 9 (Killed)
Aug 29 12:06:57 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29785] /devices/virtual/block/dm-0 is taking a long time
Aug 29 12:11:15 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29967] /devices/virtual/block/dm-6 is taking a long time
Aug 29 12:13:50 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29785] /devices/virtual/block/dm-0 is taking a long time
Aug 29 12:13:50 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29967] /devices/virtual/block/dm-6 is taking a long time
Aug 29 12:15:21 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29967] /devices/virtual/block/dm-6 is taking a long time
Aug 29 12:22:30 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29785] /devices/virtual/block/dm-0 is taking a long time
Aug 29 12:22:54 ceph-yshap-1535553643452-node4-iscsi-clients systemd-udevd: worker [29967] /devices/virtual/block/dm-6 is taking a long time


[root@ceph-yshap-1535553643452-node4-iscsi-clients log]# fuser -vam /dev/mapper/mpatha
                     USER        PID ACCESS COMMAND
/dev/dm-0:           root      29785 f.... systemd-udevd
[root@ceph-yshap-1535553643452-node4-iscsi-clients log]# ps -aux | grep 29785
root      5390  0.0  0.0 112704   980 pts/1    S+   12:45   0:00 grep --color=auto 29785
root     29785  0.0  0.1  48408  4296 ?        D    11:50   0:02 /usr/lib/systemd/systemd-udevd
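
One way to confirm that the udev worker is stuck in uninterruptible sleep on the dm device (a hedged diagnostic sketch; PID 29785 comes from the fuser output above and must be substituted):

PID=29785                               # from the fuser output above
ps -o pid,stat,wchan:32,cmd -p "$PID"   # STAT "D" = uninterruptible sleep
cat /proc/"$PID"/stack                  # kernel stack of the blocked task (root)
ls -l /proc/"$PID"/fd | grep dm-        # does it hold the dm device open?
udevadm settle --timeout=10 || echo "udev event queue did not settle"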
Comment 4 Vasu Kulkarni 2018-08-29 14:17:36 EDT
The complete sequence of logs can be viewed in ReportPortal (log in as ceph/ceph):

http://cistatus.ceph.redhat.com/ui/#cephci/launches/New_filter%7Cpage.page=1&page.size=50&filter.cnt.name=iscsi&page.sort=start_time%2Cnumber%2CDESC/5b841e83b3cb1f000162d78d?page.page=1&page.size=50&page.sort=start_time%2CASC


Click on a test case to find a logs link showing the exact steps run, for example: set up the cluster using ceph-ansible -> create the iSCSI gateway using gwcli -> issue mpath commands and exercise I/O.
Comment 7 Mike Christie 2018-08-29 17:52:36 EDT
How can I get the target-side and initiator-side logs for these?

I was going to say it looks like the recurring multipathd udev/systemd hang issue, but in the multipath -ll output you pasted, one of the devices has all paths down (APD) and the other device has only one path up.

Are you doing path-down tests during this run? Did multipathd crash by any chance?
Comment 8 Mike Christie 2018-08-29 17:59:49 EDT
(In reply to Yevhenii Shapovalov from comment #5)
> jenkins run with same issue
> https://ceph-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/RHCS%203.
> x%20RHEL%207.5/job/ceph-iscsi-sanity-3.x-rhel7.5/31/consoleFull

This is a different issue, right? BZ:

https://bugzilla.redhat.com/show_bug.cgi?id=1623650
Comment 9 Mike Christie 2018-08-29 18:05:36 EDT
(In reply to Yevhenii Shapovalov from comment #5)
> jenkins run with same issue
> https://ceph-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/RHCS%203.
> x%20RHEL%207.5/job/ceph-iscsi-sanity-3.x-rhel7.5/31/consoleFull

This looks like it is BZ:

https://bugzilla.redhat.com/show_bug.cgi?id=1623650
Comment 10 Vasu Kulkarni 2018-08-29 18:36:53 EDT
Mike,

The failover test happens after this one; we purge the cluster and recreate the setup for the failover test, but the IPs remain the same. I have given you the iSCSI client IP and login details along with the Ceph cluster IP details; we can keep the system in this state for debugging.

Thanks
Comment 11 Mike Christie 2018-08-29 21:11:47 EDT
It looks like this is caused by a RHEL 7.5 change to the alua driver:

commit a553c7f7a6001c17d6575f182772add9a113ca92
Author: Mike Snitzer <snitzer@redhat.com>
Date:   Fri Oct 27 16:11:45 2017 -0400

    [scsi] scsi_dh_alua: Recheck state on unit attention
    
    Message-id: <1509120727-26247-37-git-send-email-snitzer@redhat.com>
    Patchwork-id: 195279
    O-Subject: [RHEL7.5 PATCH v2 36/58] scsi_dh_alua: Recheck state on unit attention
    Bugzilla: 1499107
    RH-Acked-by: Jerry Snitselaar <jsnitsel@redhat.com>
    RH-Acked-by: Ewan Milne <emilne@redhat.com>
    
    BZ: 1499107
    
    commit 2b35865e7a290d313c3d156c0c2074b4c4ffaf52
    Author: Hannes Reinecke <hare@suse.de>
    Date:   Fri Feb 19 09:17:13 2016 +0100
    
        scsi_dh_alua: Recheck state on unit attention
    
        When we receive a unit attention code of 'ALUA state changed'
        we should recheck the state, as it might be due to an implicit
        ALUA state transition. This allows us to return NEEDS_RETRY
        instead of ADD_TO_MLQUEUE, allowing to terminate the retries
        after a certain time.
        At the same time a workqueue item might already be queued, which
        should be started immediately to avoid any delays.
    
        Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
        Reviewed-by: Christoph Hellwig <hch@lst.de>
        Signed-off-by: Hannes Reinecke <hare@suse.com>
        Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
    
    Signed-off-by: Rafael Aquini <aquini@redhat.com>



The problem is that we return the "ALUA state transition" sense code while grabbing the lock, which can take a second or two. In RHEL 7.4 the ALUA layer handled this error by retrying for up to around 5 minutes. With the change above we get 5 retries, but they are burned very quickly, within hundreds of milliseconds.

So what happens is: the I/O is failed, the path is failed, other paths are tried, and this repeats. multipathd then gets backed up handling path re-enablement, and this can go on for the entire test.

I can fix this with a one-line patch in tcmu-runner for 3.1.z.

I am also going to work on an upstream/RHEL fix.
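
To observe the loop described above from the initiator, and to verify the fix on the gateway nodes, something like the following can be used (a hedged sketch; the grep patterns are assumptions based on typical RHEL 7 multipath/SCSI logging, and the version check refers to the Fixed In Version field above):

grep -Ei 'alua|sense|reinstated|failing path' /var/log/messages | tail -n 50
multipathd -k'show paths'   # current path and checker states
rpm -q tcmu-runner          # on gateways; fixed in tcmu-runner-1.4.0-0.3.el7cp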
Comment 14 Mike Christie 2018-09-05 01:31:15 EDT
*** Bug 1623650 has been marked as a duplicate of this bug. ***
Comment 15 Harish NV Rao 2018-09-05 03:28:02 EDT
@Mike, should this be in the 3.1 Release notes as a known issue?
Comment 17 Mike Christie 2018-09-05 16:31:05 EDT
*** Bug 1623650 has been marked as a duplicate of this bug. ***
Comment 38 Vikhyat Umrao 2018-10-02 13:37:00 EDT
*** Bug 1626836 has been marked as a duplicate of this bug. ***
