Bug 1326575

Summary: Seeing rbd: renaming snap failed: (110) Connection timed out, while renaming snapshot
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Tanay Ganguly <tganguly>
Component: RBD Assignee: Jason Dillaman <jdillama>
Status: CLOSED ERRATA QA Contact: ceph-qe-bugs <ceph-qe-bugs>
Severity: low Docs Contact:
Priority: unspecified    
Version: 2.0 CC: ceph-eng-bugs, jdillama, kdreyer, kurs, nlevine, vakulkar
Target Milestone: rc   
Target Release: 2.0   
Hardware: x86_64   
OS: Unspecified   
Whiteboard:
Fixed In Version: ceph-10.2.0-1.el7cp Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-08-23 19:36:00 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Tanay Ganguly 2016-04-13 06:51:10 UTC
Description of problem:
With IO in progress on the parent volume, renaming a snapshot fails with "Connection timed out", although the snapshot is actually renamed.

Version-Release number of selected component (if applicable):
10.1.1.1

How reproducible:
Most of the time

Steps to Reproduce:
1. Create an image.
2. Create a snapshot, protect it, and clone it.
3. Run IO on the parent image.
4. While step 3 is in progress, rename the snapshot.
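
The steps above could be scripted roughly as follows. This is a sketch, not the reporter's exact procedure: the report does not say how the IO was generated, so `rbd bench-write` is an assumption, and the function name, image size, and all pool, image, clone, and snapshot names passed in are placeholders. It also assumes the default image features include layering, which cloning requires.

```shell
# RBD can be overridden (for example RBD=echo) to print the commands
# instead of running them against a cluster.
RBD="${RBD:-rbd}"

# Hypothetical helper: reproduce_snap_rename POOL IMAGE CLONE OLD_SNAP NEW_SNAP
reproduce_snap_rename() {
    local pool="$1" image="$2" clone="$3" old_snap="$4" new_snap="$5"
    local io_pid rc

    # Step 1: create the parent image (size in MB, arbitrary placeholder).
    "$RBD" create --size 10240 "$pool/$image" || return 1

    # Step 2: snapshot the image, protect the snapshot, and clone it.
    "$RBD" snap create "$pool/$image@$old_snap" || return 1
    "$RBD" snap protect "$pool/$image@$old_snap" || return 1
    "$RBD" clone "$pool/$image@$old_snap" "$pool/$clone" || return 1

    # Step 3: generate write IO on the parent image in the background
    # (assumed workload; the report does not name the tool used).
    "$RBD" bench-write "$pool/$image" &
    io_pid=$!

    # Step 4: rename the snapshot while the IO is still running.
    "$RBD" snap rename "$pool/$image@$old_snap" "$pool/$image@$new_snap"
    rc=$?

    # Wait for the IO to finish, then list snapshots to see the final name.
    wait "$io_pid"
    "$RBD" snap ls "$pool/$image"

    # Report the exit status of the rename itself.
    return "$rc"
}
```

Example call: `reproduce_snap_rename Tanay-RBD BIG_Image1 BUG-CLONE snap_new teee`. Because the problem did not reproduce every time, several runs with fresh names may be needed on an affected build.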

Actual results:
The rename command reports a timeout (error 110), but the snapshot is actually renamed.

Expected results:
The rename should complete without reporting a timeout.

Additional info:

[root@cephqe3 yum.repos.d]# rbd snap rename Tanay-RBD/BIG_Image1@snap_new Tanay-RBD/BIG_Image1@teee
2016-04-13 17:17:29.731355 7f3914ed6d80  5 librbd::AioImageRequestWQ: 0x7f392013af80 : ictx=0x7f39201396f0
2016-04-13 17:17:29.751219 7f38f59aa700  5 librbd::AioImageRequestWQ: block_writes: 0x7f39201396f0, num=1
2016-04-13 17:17:29.751324 7f3914ed6d80  5 librbd::Operations: 0x7f39201368e0 snap_rename: snap_name=snap_new, new_snap_name=teee
2016-04-13 17:17:29.751337 7f3914ed6d80  5 librbd::Operations: 0x7f39201368e0 execute_snap_rename: snap_id=4, new_snap_name=teee
2016-04-13 17:17:29.751345 7f3914ed6d80  5 librbd::SnapshotRenameRequest: 0x7f392013b700 send_rename_snap
2016-04-13 17:17:29.756254 7f38f59aa700  5 librbd::SnapshotRenameRequest: 0x7f392013b700 should_complete: state=0, r=0
rbd: renaming snap failed: (110) Connection timed out
2016-04-13 17:17:34.802279 7f38f59aa700  5 librbd::AioImageRequestWQ: shut_down: in_flight=0
2016-04-13 17:17:34.802343 7f38f51a9700  5 librbd::AioImageRequestWQ: unblock_writes: 0x7f39201396f0, num=0
[root@cephqe3 yum.repos.d]# rbd -p Tanay-RBD --image BUG-CLONE info
2016-04-13 17:17:44.866114 7f07f2c18d80  5 librbd::AioImageRequestWQ: 0x7f07fdf8db90 : ictx=0x7f07fdf8c5d0
2016-04-13 17:17:44.871301 7f07d36ec700  5 librbd::AioImageRequestWQ: 0x7f07ac007390 : ictx=0x7f07ac0032c0
2016-04-13 17:17:44.877007 7f07d2eeb700  5 librbd::AioImageRequestWQ: block_writes: 0x7f07ac0032c0, num=1
2016-04-13 17:17:44.879152 7f07d36ec700  5 librbd::AioImageRequestWQ: unblock_writes: 0x7f07ac0032c0, num=0
rbd image 'BUG-CLONE':
        size 11211 MB in 2803 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.149082eb141f2
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        flags: 
        parent: Tanay-RBD/BIG_Image1@teee
        overlap: 11003 MB
2016-04-13 17:17:44.879707 7f07f2c18d80  5 librbd::AioImageRequestWQ: shut_down: in_flight=0
2016-04-13 17:17:44.879831 7f07d2eeb700  5 librbd::AioImageRequestWQ: shut_down: in_flight=0
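
In the session above, the rename returned error 110, yet the clone's parent field already shows the new snapshot name. A check along these lines could confirm that state on an affected build. It is only a sketch: the helper names `snap_exists` and `rename_and_verify` are hypothetical, and it assumes `rbd snap ls` prints a header row followed by one row per snapshot with the name in the second column.

```shell
# RBD can be overridden to point at a different binary or a test stub.
RBD="${RBD:-rbd}"

# Hypothetical helper: succeeds if snapshot $2 is listed for image spec $1
# (given as pool/image).
snap_exists() {
    local image_spec="$1" snap_name="$2"
    # Match the name column exactly; exit 0 only if a row matched.
    "$RBD" snap ls "$image_spec" |
        awk -v name="$snap_name" '$2 == name { found = 1 } END { exit !found }'
}

# Hypothetical helper: rename a snapshot and, if the command fails,
# check whether the new name exists anyway.
rename_and_verify() {
    local image_spec="$1" old_snap="$2" new_snap="$3" rc
    "$RBD" snap rename "$image_spec@$old_snap" "$image_spec@$new_snap"
    rc=$?
    if [ "$rc" -ne 0 ] && snap_exists "$image_spec" "$new_snap"; then
        echo "rename exited with $rc but snapshot '$new_snap' exists" >&2
    fi
    # Preserve the exit status of the rename command.
    return "$rc"
}
```

Example call: `rename_and_verify Tanay-RBD/BIG_Image1 snap_new teee`.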

Comment 2 Jason Dillaman 2016-04-13 11:36:33 UTC
Upstream PR: https://github.com/ceph/ceph/pull/8543

Comment 3 Ken Dreyer (Red Hat) 2016-04-26 20:52:41 UTC
The above PR is present in v10.2.0.

Comment 5 Tanay Ganguly 2016-05-30 10:03:16 UTC
Verified: 
ceph version 10.2.1-6.el7cp

Comment 7 errata-xmlrpc 2016-08-23 19:36:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1755.html