Bug 1784895

Summary: RBD block devices remained mapped after concurrent "rbd unmap" failures
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Jason Dillaman <jdillama>
Component: RBD
Assignee: Jason Dillaman <jdillama>
Status: CLOSED ERRATA
QA Contact: Harish Munjulur <hmunjulu>
Severity: urgent
Priority: high
Docs Contact:
Version: 4.0
CC: ceph-eng-bugs, gpatta, hgurav, hmunjulu, hyelloji, jbrier, knortema, mkasturi, tchandra, tserlin
Target Milestone: rc
Target Release: 4.1
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: ceph-14.2.8-3.el8, ceph-14.2.8-3.el7
Doc Type: Bug Fix
Doc Text:
.Multiple `rbd unmap` commands can be issued concurrently and the corresponding RBD block devices are unmapped successfully

Previously, issuing concurrent `rbd unmap` commands could result in udev-related event race conditions. The commands would sporadically fail, and the corresponding RBD block devices might remain mapped to their node. With this update, the udev-related event race conditions have been fixed, and the commands no longer fail.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-05-19 17:31:24 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1783415, 1816167

Description Jason Dillaman 2019-12-18 15:42:37 UTC
Description of problem:
Unmapping 200 images concurrently leaves behind 10-20 mappings. This simulates an OCS workload mapping and unmapping RBD PVs on a node.

Version-Release number of selected component (if applicable):
14.2.4-69.el8cp

How reproducible:
100%

Steps to Reproduce:
1. Map hundreds of RBD images to a single node (simulate OCS PVs getting mapped to a pod)
2. Unmap the volumes concurrently
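The steps above can be sketched as a small shell script. This is an illustrative reproducer only: the helper function names, the pool name (`rbd`), the image names (`img-N`), and the parallelism level are assumptions, not taken from the bug report.

```shell
#!/bin/sh
# Sketch of the reproducer: map N images, then unmap them concurrently.
# Assumes images rbd/img-1 .. rbd/img-N already exist in the cluster.

# Map N images to this node, simulating OCS PVs being attached to a pod.
map_images() {
  n=$1
  for i in $(seq 1 "$n"); do
    rbd map "rbd/img-$i"
  done
}

# Unmap the same N images concurrently: one `rbd unmap` process per image,
# up to $2 in parallel. This concurrent unmap is what triggered the
# udev-related event race on the affected builds.
unmap_images_concurrently() {
  n=$1
  jobs=$2
  seq 1 "$n" | xargs -P "$jobs" -I{} rbd unmap "rbd/img-{}"
}

# Count mappings still present after the unmap storm (assumes `rbd showmapped`
# prints a one-line header followed by one row per mapped device).
leftover_mappings() {
  rbd showmapped 2>/dev/null | tail -n +2 | wc -l
}
```

On an affected build, `leftover_mappings` would report a nonzero count after `unmap_images_concurrently 200 32`; on a fixed build it should report 0.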

Actual results:
Several `rbd unmap` commands fail sporadically and their RBD block devices remain mapped

Expected results:
All RBD block devices are unmapped

Comment 2 Jason Dillaman 2020-01-07 14:02:48 UTC
Moving to 4.0.z1 pending QE ack.

Comment 10 Harish Munjulur 2020-04-17 00:01:56 UTC
Issue not reproducible on ceph 4.1 (14.2.8-21.el8)

Comment 11 Harish Munjulur 2020-04-17 00:02:56 UTC
QE: Verified

Comment 15 errata-xmlrpc 2020-05-19 17:31:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:2231