Bug 1327098

Summary: [RBD] rbd bench-write hangs if exclusive-lock is enabled or disabled during writes
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Tejas <tchandra>
Component: RBD
Assignee: Jason Dillaman <jdillama>
Status: CLOSED ERRATA
QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 2.0
CC: ceph-eng-bugs, hyelloji, jdillama, kdreyer, nlevine, vakulkar
Target Milestone: rc
Target Release: 2.0
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version: ceph-10.2.0-1.el7cp
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-08-23 19:36:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Tejas 2016-04-14 09:40:59 UTC
Description of problem:
The rbd bench-write utility hangs if the exclusive-lock feature is enabled while writes are in progress.

Version-Release number of selected component (if applicable):
ceph 10.1.1

How reproducible:
Always

Steps to Reproduce:
1. Create a plain format 2 rbd image.
2. Start bench-write on it.
3. While writes are in progress, enable exclusive-lock.
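
The steps above could be scripted roughly as follows. This is only a sketch, not part of the original report: it starts bench-write in the background, enables exclusive-lock after a short delay, and reports whether bench-write exits within a timeout. The helper names (build_commands, run_with_toggle), the 2 second delay, and the 60 second timeout are assumptions chosen for illustration; the rbd command lines are the ones shown in the transcripts in this report, limited to the exclusive-lock feature.

```python
import subprocess
import sys
import time


def build_commands(image):
    """Return the bench-write and feature-enable command lines for `image`."""
    bench = ["rbd", "bench-write", image]
    toggle = ["rbd", "feature", "enable", image, "exclusive-lock"]
    return bench, toggle


def run_with_toggle(bench_cmd, toggle_cmd, delay=2.0, timeout=60.0):
    """Start bench_cmd, run toggle_cmd after `delay` seconds, then wait.

    Returns "completed" if bench_cmd exits within `timeout` seconds after
    the toggle command has run; otherwise kills it and returns "hung".
    The delay and timeout defaults are arbitrary illustrative values.
    """
    bench = subprocess.Popen(
        bench_cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        # Let some writes happen before changing the image features.
        time.sleep(delay)
        subprocess.run(
            toggle_cmd,
            check=False,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        try:
            bench.wait(timeout=timeout)
            return "completed"
        except subprocess.TimeoutExpired:
            return "hung"
    finally:
        # Never leave a stuck benchmark process behind.
        if bench.poll() is None:
            bench.kill()
            bench.wait()
```

On a test cluster, run_with_toggle(*build_commands("Tejas/img2")) would be expected to return "hung" on an affected build and "completed" once the fix is in place. The timeout should be longer than a normal bench-write run on the cluster being tested.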

Actual results:
rbd bench-write hangs.

Expected results:
IO should exit gracefully

Additional info:

[root@magna009 ~]# rbd create Tejas/img2 --size 100G
[root@magna009 ~]# 
[root@magna009 ~]# rbd info Tejas/img2
rbd image 'img2':
	size 102400 MB in 25600 objects
	order 22 (4096 kB objects)
	block_name_prefix: rbd_data.10ab2ae8944a
	format: 2
	features: layering
	flags: 
[root@magna009 ~]# 

[root@magna031 ~]# rbd feature enable Tejas/img2 exclusive-lock,object-map,fast-diff
2016-04-14 07:32:28.571088 7f282ad4bd80 -1 asok(0x7f28362cee60) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/rbd-clients/ceph-client.admin.11309.139810684334224.asok': (2) No such file or directory
[root@magna031 ~]# 
[root@magna031 ~]# 
[root@magna031 ~]# rbd info Tejas/img2
2016-04-14 07:32:34.608944 7fc52ee8cd80 -1 asok(0x7fc539e77e00) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/rbd-clients/ceph-client.admin.11333.140485056757856.asok': (2) No such file or directory
rbd image 'img2':
	size 102400 MB in 25600 objects
	order 22 (4096 kB objects)
	block_name_prefix: rbd_data.10ab2ae8944a
	format: 2
	features: layering, exclusive-lock, object-map, fast-diff
	flags: object map invalid, fast diff invalid
[root@magna031 ~]# 
[root@magna031 ~]# 
[root@magna031 ~]# rbd du Tejas/img2
2016-04-14 07:33:16.874555 7f0edb131d80 -1 asok(0x7f0ee62c3e00) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/run/ceph/rbd-clients/ceph-client.admin.11355.139701967929440.asok': (2) No such file or directory
warning: fast-diff map is invalid for img2. operation may be slow.
NAME PROVISIONED USED 
img2     102400M 464M 
[root@magna031 ~]# 

[root@magna009 ~]# rbd bench-write Tejas/img2
bench-write  io_size 4096 io_threads 16 bytes 1073741824 pattern sequential
  SEC       OPS   OPS/SEC   BYTES/SEC
    1     17034  15905.39  65148488.63
    2     27706  13265.46  54335341.58
    3     41443  13792.57  56494374.03
    4     51952  12827.59  52541800.72   <--- hung for over an hour
^C
[root@magna009 ~]#

Comment 2 Tejas 2016-04-14 10:16:03 UTC
bench-write hangs if exclusive lock is enabled or disabled during writes.

Comment 3 Jason Dillaman 2016-04-14 12:17:03 UTC
Upstream PR: https://github.com/ceph/ceph/pull/8511

Comment 4 Tejas 2016-04-18 10:15:33 UTC
The same issue also occurs when exclusive-lock is enabled during an OS install on QEMU.

Comment 5 Ken Dreyer (Red Hat) 2016-04-26 20:54:24 UTC
The above PR is present in v10.2.0.

Comment 7 Hemanth Kumar 2016-05-31 11:16:31 UTC
Unable to reproduce the issue. Moving to verified state.

Comment 9 errata-xmlrpc 2016-08-23 19:36:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1755.html