Bug 1351484

Summary: ceph-disk should timeout when a lock cannot be acquired
Product: [Red Hat Storage] Red Hat Ceph Storage    Reporter: Martin Kudlej <mkudlej>
Component: Ceph-Disk    Assignee: Loic Dachary <ldachary>
Status: CLOSED ERRATA QA Contact: Ramakrishnan Periyasamy <rperiyas>
Severity: medium Docs Contact:
Priority: low    
Version: 2.0    CC: flucifre, hnallurv, icolle, kdreyer, mkudlej, uboppana
Target Milestone: rc   
Target Release: 2.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: RHEL: ceph-10.2.3-2.el7cp Ubuntu: ceph_10.2.3-3redhat1xenial Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-22 19:28:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Martin Kudlej 2016-06-30 07:39:57 UTC
Description of problem:
I have these disks:
$ lsblk -a -t
NAME        ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED    RQ-SIZE   RA WSAME
sda                 0    512 126976     512     512    1 deadline     128 4096   32M
└─sda1              0    512 126976     512     512    1 deadline     128 4096   32M
sdb                 0    512 126976     512     512    1 deadline     128 4096   32M
vda                 0    512      0     512     512    1              128 4096    0B
├─vda1              0    512      0     512     512    1              128 4096    0B
└─vda2              0    512      0     512     512    1              128 4096    0B
vdb                 0    512      0     512     512    0              128 4096    0B
└─vdb1              0    512      0     512     512    0              128 4096    0B
vdc                 0    512      0     512     512    1              128 4096    0B
loop0               0    512      0     512     512    1              128 4096    0B
└─bad_disk          0    512      0     512     512    1              128 4096    0B
loop1               0    512      0     512     512    1              128 4096    0B
└─bad_disk2         0    512      0     512     512    1              128  128    0B

$ lsblk -a -t -S
NAME ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED    RQ-SIZE   RA WSAME NAME HCTL       TYPE VENDOR   MODEL             REV TRAN
sda          0    512 126976     512     512    1 deadline     128 4096   32M sda  9:0:0:0    disk LIO-ORG  block1           4.0  iscsi
sdb          0    512 126976     512     512    1 deadline     128 4096   32M sdb  10:0:0:0   disk LIO-ORG  block2           4.0  iscsi

$  lsblk -a 
NAME        MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda           8:0    0   5G  0 disk 
└─sda1        8:1    0   5G  0 part /var/lib/ceph/osd/cluster1-0
sdb           8:16   0   5G  0 disk 
vda         253:0    0  20G  0 disk 
├─vda1      253:1    0   2G  0 part 
└─vda2      253:2    0  18G  0 part /
vdb         253:16   0  10G  0 disk 
└─vdb1      253:17   0   5G  0 part 
vdc         253:32   0  10G  0 disk 
loop0         7:0    0   5G  0 loop 
└─bad_disk  252:0    0   5G  0 dm   
loop1         7:1    0   5G  0 loop 
└─bad_disk2 252:1    0   5G  0 dm   

I tried to list information about the disks with ceph-disk, but it got stuck, so I killed it with CTRL-C.

$ ceph-disk list
^CTraceback (most recent call last):
  File "/usr/sbin/ceph-disk", line 9, in <module>
    load_entry_point('ceph-disk==1.0.0', 'console_scripts', 'ceph-disk')()
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4994, in run
    main(sys.argv[1:])
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4945, in main
    args.func(args)
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4157, in main_list
    activate_lock.acquire()  # noqa
  File "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 253, in acquire
    fcntl.lockf(self.fd, fcntl.LOCK_EX)
KeyboardInterrupt

I don't see any errors in the logs.


Version-Release number of selected component (if applicable):
ceph-base-10.2.2-5.el7cp.x86_64
ceph-common-10.2.2-5.el7cp.x86_64
ceph-osd-10.2.2-5.el7cp.x86_64
ceph-selinux-10.2.2-5.el7cp.x86_64
libcephfs1-10.2.2-5.el7cp.x86_64
python-cephfs-10.2.2-5.el7cp.x86_64
[root@dhcp-126-35 log]# rpm -qa | egrep "ceph|python" | sort 
abrt-addon-python-2.1.11-35.el7.x86_64
abrt-python-2.1.11-35.el7.x86_64
audit-libs-python-2.4.1-5.el7.x86_64
ceph-base-10.2.2-5.el7cp.x86_64
ceph-common-10.2.2-5.el7cp.x86_64
ceph-osd-10.2.2-5.el7cp.x86_64
ceph-selinux-10.2.2-5.el7cp.x86_64
dbus-python-1.1.1-9.el7.x86_64
langtable-python-0.0.31-3.el7.noarch
libcephfs1-10.2.2-5.el7cp.x86_64
libreport-python-2.1.11-31.el7.x86_64
libselinux-python-2.2.2-6.el7.x86_64
libsemanage-python-2.1.10-18.el7.x86_64
libstoragemgmt-python-1.2.3-4.el7.noarch
libxml2-python-2.9.1-6.el7_2.3.x86_64
newt-python-0.52.15-4.el7.x86_64
policycoreutils-python-2.2.5-20.el7.x86_64
python-2.7.5-34.el7.x86_64
python-augeas-0.5.0-2.el7.noarch
python-babel-0.9.6-8.el7.noarch
python-backports-1.0-8.el7.x86_64
python-backports-ssl_match_hostname-3.4.0.2-4.el7.noarch
python-cephfs-10.2.2-5.el7cp.x86_64
python-chardet-2.2.1-1.el7_1.noarch
python-configobj-4.7.2-7.el7.noarch
python-configshell-1.1.fb18-1.el7.noarch
python-cpopen-1.4-0.1.el7.x86_64
python-crypto-2.6.1-1.1.el7.x86_64
python-dateutil-1.5-7.el7.noarch
python-decorator-3.4.0-3.el7.noarch
python-dmidecode-3.10.13-11.el7.x86_64
python-ethtool-0.8-5.el7.x86_64
python-gudev-147.2-7.el7.x86_64
python-hwdata-1.7.3-4.el7.noarch
python-iniparse-0.4-9.el7.noarch
python-IPy-0.75-6.el7.noarch
python-jinja2-2.7.2-2.1.el7.noarch
python-kitchen-1.1.1-5.el7.noarch
python-kmod-0.9-4.el7.x86_64
python-libs-2.7.5-34.el7.x86_64
python-lxml-3.2.1-5.el7ost.x86_64
python-magic-5.11-31.el7.noarch
python-markupsafe-0.11-10.el7.x86_64
python-msgpack-0.4.5-1.1.el7.x86_64
python-perf-3.10.0-327.22.2.el7.x86_64
python-pycurl-7.19.0-17.el7.x86_64
python-pyudev-0.15-7.el7_2.1.noarch
python-rados-10.2.2-5.el7cp.x86_64
python-rbd-10.2.2-5.el7cp.x86_64
python-requests-2.7.0-1.el7ost.noarch
python-rhsm-1.15.4-5.el7.x86_64
python-rtslib-2.1.fb57-3.el7.noarch
python-setuptools-0.9.8-4.el7.noarch
python-six-1.9.0-2.el7.noarch
python-slip-0.4.0-2.el7.noarch
python-slip-dbus-0.4.0-2.el7.noarch
python-urlgrabber-3.10-7.el7.noarch
python-urllib3-1.10.4-1.20150503gita91975b.el7ost.noarch
python-urwid-1.1.1-3.el7.x86_64
python-zmq-14.3.1-1.1.el7.x86_64
redhat-support-lib-python-0.9.7-3.el7.noarch
rpm-python-4.11.3-17.el7.x86_64
systemd-python-219-19.el7_2.11.x86_64


How reproducible:
I think 100%

Steps to Reproduce:
1. Create some iSCSI disks.
2. Create OSDs and journals on those disks.
3. Run the 'ceph-disk list' command.

Actual results:
ceph-disk hangs (gets stuck).

Expected results:
'ceph-disk list' times out and prints an error message if the lock cannot be acquired.

Comment 2 Loic Dachary 2016-06-30 08:22:25 UTC
This is because another process is holding the activate lock. Did you verify whether a ceph-disk process is hung?
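
For reference, a minimal diagnostic sketch (not part of ceph-disk itself; the lock file path is an assumption) showing how to probe whether another process currently holds an exclusive lock, using a non-blocking fcntl lock:

# Hypothetical diagnostic sketch: check whether a lock file is already held.
# The path below is an assumption, not necessarily the one ceph-disk uses.
import errno
import fcntl

LOCK_PATH = '/var/lib/ceph/tmp/ceph-disk.activate.lock'  # assumed path

def lock_is_held(path):
    # Return True if another process holds an exclusive lock on `path`.
    fd = open(path, 'a')
    try:
        # LOCK_NB makes lockf fail immediately instead of blocking.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except IOError as e:
        if e.errno in (errno.EACCES, errno.EAGAIN):
            return True
        raise
    else:
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return False
    finally:
        fd.close()

print('lock held by another process: %s' % lock_is_held(LOCK_PATH))

Running this while a ceph-disk command is hung should print True, confirming the scenario above.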

Comment 3 Martin Kudlej 2016-07-01 15:48:42 UTC
Some ceph-disk commands were stuck, so the device locks were still held.
Even so, when the lock is already held, ceph-disk should print an error message and exit after a reasonable timeout.
I think this is the correct behaviour for "ceph-disk list" for sure.
It's up to you whether this should also be the behaviour for other subcommands.
In any case, the command should not hang or exit without a proper error message.
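
A minimal sketch of the behaviour being requested here, not the actual upstream patch: wrap the blocking fcntl.lockf call in a retry loop using LOCK_NB and a deadline, so the command can fail with a clear error instead of hanging (the helper and exception names are hypothetical):

# Hypothetical sketch of lock acquisition with a timeout; illustration only,
# not the change that was eventually merged upstream.
import errno
import fcntl
import time

class LockTimeoutError(Exception):
    pass

def acquire_with_timeout(fd, timeout=120, poll_interval=1):
    # Try to take an exclusive lock on `fd`, giving up after `timeout` seconds.
    deadline = time.time() + timeout
    while True:
        try:
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return
        except IOError as e:
            if e.errno not in (errno.EACCES, errno.EAGAIN):
                raise
            if time.time() >= deadline:
                raise LockTimeoutError(
                    'could not acquire lock within %d seconds; '
                    'another ceph-disk process may be holding it' % timeout)
            time.sleep(poll_interval)

With something like this in place, "ceph-disk list" could report the timeout error and exit non-zero instead of blocking forever in fcntl.lockf as shown in the traceback above.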

Comment 4 Loic Dachary 2016-07-04 09:46:37 UTC
@mkudlej you're right. I created http://tracker.ceph.com/issues/16580 to fix this.

Comment 5 Ken Dreyer (Red Hat) 2016-07-05 16:54:54 UTC
Would you please explain how to induce the backend disk failure? Does it have something to do with the iSCSI environment?

Comment 6 Martin Kudlej 2016-07-08 14:18:37 UTC
Because the Console works only with empty disks and I cannot use partitions, I've created the structure described in comment #1:
$  lsblk -a 
NAME        MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda           8:0    0   5G  0 disk 
└─sda1        8:1    0   5G  0 part /var/lib/ceph/osd/cluster1-0
sdb           8:16   0   5G  0 disk 
vda         253:0    0  20G  0 disk 
├─vda1      253:1    0   2G  0 part 
└─vda2      253:2    0  18G  0 part /
vdb         253:16   0  10G  0 disk 
└─vdb1      253:17   0   5G  0 part 
vdc         253:32   0  10G  0 disk 
loop0         7:0    0   5G  0 loop 
└─bad_disk  252:0    0   5G  0 dm   
loop1         7:1    0   5G  0 loop 
└─bad_disk2 252:1    0   5G  0 dm   

Where:
loop0 is a device created with losetup from a file.
bad_disk is a device created with Device Mapper whose table includes an error mapping target roughly in the middle of the device (see https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/device_mapper.html, section A.1.5).
I then created an iSCSI block device backed by bad_disk.

The second chain is similar: loop1 -> bad_disk2 -> sdb.

I created these disks with bad sectors to simulate failing disks, hoping to see proper error messages in Ceph and the Console. If you would like to look at it, please contact me; I will rebuild that cluster in about a week.
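
For anyone who wants to rebuild a similar "bad disk", here is a rough sketch of the setup (paths, sizes and the exact table layout are assumptions; it drives losetup and dmsetup from Python, and the resulting /dev/mapper device can then be exported over iSCSI, e.g. with LIO):

# Rough reproduction sketch (assumed paths and sizes): create a loop device
# from a sparse file, then a device-mapper device that returns I/O errors for
# a few sectors in the middle, per the error mapping target in appendix A.1.5.
import subprocess

BACKING_FILE = '/var/tmp/bad_disk.img'   # assumed backing file
SIZE_SECTORS = 10485760                  # 5 GiB in 512-byte sectors
BAD_START = SIZE_SECTORS // 2            # first failing sector
BAD_LEN = 8                              # number of failing sectors

def run(cmd, stdin_data=None):
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                         universal_newlines=True)
    out, _ = p.communicate(stdin_data)
    if p.returncode != 0:
        raise RuntimeError('%r failed' % (cmd,))
    return out.strip()

# sparse backing file and loop device
run(['truncate', '-s', '5G', BACKING_FILE])
loop_dev = run(['losetup', '--find', '--show', BACKING_FILE])

# device-mapper table: linear / error / linear
table = '\n'.join([
    '0 %d linear %s 0' % (BAD_START, loop_dev),
    '%d %d error' % (BAD_START, BAD_LEN),
    '%d %d linear %s %d' % (BAD_START + BAD_LEN,
                            SIZE_SECTORS - BAD_START - BAD_LEN,
                            loop_dev, BAD_START + BAD_LEN),
])
run(['dmsetup', 'create', 'bad_disk'], stdin_data=table + '\n')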

Comment 7 Ken Dreyer (Red Hat) 2016-07-12 15:25:11 UTC
timeout PR undergoing review upstream: https://github.com/ceph/ceph/pull/10262

Comment 8 Ken Dreyer (Red Hat) 2016-07-14 19:30:37 UTC
Discussed this with Product Management (Neil) today. Since this is a negative test case where failures are intentionally induced into the loopback devices, we are going to re-target this to RHCS 2.1. (This will provide more time for a ceph-disk solution to stabilize upstream.)

Comment 13 Ramakrishnan Periyasamy 2016-10-14 11:22:13 UTC
Moving the bug to verified state.

Verified with the ceph and RHEL kernel versions below:

[user@node ~]$ ceph -v
ceph version 10.2.3-4.el7cp (852125d923e43802a51f681ca2ae9e721eec91ca)

[user@node ~]$ uname -a
Linux magna109 3.10.0-511.el7.x86_64 #1 SMP Wed Sep 28 12:25:44 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux

RHEL 7.3

[user@node ~]$ lsblk -t -a
NAME   ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE   RA WSAME
sda            0    512      0     512     512    1 cfq       128 4096    0B
└─sda1         0    512      0     512     512    1 cfq       128 4096    0B
sdb            0    512      0     512     512    1 cfq       128 4096    0B
├─sdb1         0    512      0     512     512    1 cfq       128 4096    0B
└─sdb2         0    512      0     512     512    1 cfq       128 4096    0B
sdc            0    512      0     512     512    1 cfq       128 4096    0B
├─sdc1         0    512      0     512     512    1 cfq       128 4096    0B
└─sdc2         0    512      0     512     512    1 cfq       128 4096    0B
sdd            0    512      0     512     512    1 cfq       128 4096    0B
├─sdd1         0    512      0     512     512    1 cfq       128 4096    0B
└─sdd2         0    512      0     512     512    1 cfq       128 4096    0B
[user@node ~]$ lsblk -a
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 931.5G  0 disk 
└─sda1   8:1    0 931.5G  0 part /
sdb      8:16   0 931.5G  0 disk 
├─sdb1   8:17   0 921.5G  0 part /var/lib/ceph/osd/ceph-2
└─sdb2   8:18   0    10G  0 part 
sdc      8:32   0 931.5G  0 disk 
├─sdc1   8:33   0 921.5G  0 part /var/lib/ceph/osd/ceph-5
└─sdc2   8:34   0    10G  0 part 
sdd      8:48   0 931.5G  0 disk 
├─sdd1   8:49   0 921.5G  0 part /var/lib/ceph/osd/ceph-7
└─sdd2   8:50   0    10G  0 part 
[user@node ~]$ sudo ceph-disk list
/dev/sda :
 /dev/sda1 other, ext4, mounted on /
/dev/sdb :
 /dev/sdb2 ceph journal, for /dev/sdb1
 /dev/sdb1 ceph data, active, cluster ceph, osd.2, journal /dev/sdb2
/dev/sdc :
 /dev/sdc2 ceph journal, for /dev/sdc1
 /dev/sdc1 ceph data, active, cluster ceph, osd.5, journal /dev/sdc2
/dev/sdd :
 /dev/sdd2 ceph journal, for /dev/sdd1
 /dev/sdd1 ceph data, active, cluster ceph, osd.7, journal /dev/sdd2

Comment 14 Loic Dachary 2016-10-17 11:41:56 UTC
The corresponding patch that resolves this problem is at https://github.com/ceph/ceph/commit/430ab1b83e67dfb697b034e669b06b7a600bcc6b

Comment 16 errata-xmlrpc 2016-11-22 19:28:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2815.html