Bug 1899566

Summary: Attempting to background remove in-use image results in apparent stuck progress
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Jason Dillaman <jdillama>
Component: RBD Assignee: Greg Farnum <gfarnum>
Status: CLOSED ERRATA QA Contact: Gopi <gpatta>
Severity: low Docs Contact:
Priority: unspecified    
Version: 4.1 CC: ceph-eng-bugs, kdreyer
Target Milestone: ---   
Target Release: 5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ceph-16.0.0-8633.el8cp Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-08-30 08:27:12 UTC Type: Bug
Bug Depends On:    
Bug Blocks: 1898565    

Description Jason Dillaman 2020-11-19 15:10:49 UTC
Description of problem:
When attempting to background remove an RBD image while it is still in use (i.e. it has a child image attached), the remove correctly fails, but (1) it leaves an apparently stuck progress notification and (2) the MGR log shows the (expected) failure every thirty seconds while it retries.

Note that this is purely a progress status and logging issue. Once the image can be deleted (because the associated child images have been flattened or removed), the removal proceeds correctly.

Version-Release number of selected component (if applicable):
RHCS 4.1

How reproducible:
100%

Steps to Reproduce:
1. Create a parent and child image
2. Run "rbd task add [trash] remove <image-[id-]spec>"
3. 

Actual results:

Progress:

sh-4.4# ceph -s

< .... SNIP ... >
  progress:
    Removing image ocs-storagecluster-cephblockpool/a10a894b726cc from trash
      [..............................]
    Removing image ocs-storagecluster-cephblockpool/a10a87ad46bc5 from trash
      [..............................]
    Removing image ocs-storagecluster-cephblockpool/a10a862e83ce1 from trash
      [..............................]
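
To spot these stuck entries without reading the status output by hand, something like the following could be used. This is only a rough sketch, not part of Ceph: the function name stalled_progress_items is hypothetical, and it assumes that a bar made up entirely of "." characters means no progress has been recorded and that each description line is immediately followed by its bar, as in the output above.

```python
import re

# Matches a progress-bar line such as "      [......]" and captures the bar body.
BAR_LINE = re.compile(r"\s*\[(.*)\]\s*")


def stalled_progress_items(status_text):
    """Return descriptions of progress items whose bar is all dots.

    Assumption: a description line is directly followed by its bar line,
    and a bar consisting only of '.' characters means zero progress.
    """
    stalled = []
    lines = status_text.splitlines()
    # Walk over (description, bar) candidate pairs of adjacent lines.
    for description, bar in zip(lines, lines[1:]):
        match = BAR_LINE.fullmatch(bar)
        if match and set(match.group(1)) == {"."}:
            stalled.append(description.strip())
    return stalled


SAMPLE_STATUS = """\
  progress:
    Removing image ocs-storagecluster-cephblockpool/a10a894b726cc from trash
      [..............................]
    Removing image ocs-storagecluster-cephblockpool/a10a87ad46bc5 from trash
      [..............................]
"""

if __name__ == "__main__":
    for item in stalled_progress_items(SAMPLE_STATUS):
        print(item)
```

An entry that stays in this list across several runs, spaced more than one retry interval apart, would match the symptom described here.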

Logs:

debug 2020-11-17 13:02:54.211 7f44e544e700 -1 librbd::SnapshotRemoveRequest: 0x558a65235a20 should_complete: encountered error: (16) Device or resource busy
debug 2020-11-17 13:02:54.211 7f44e544e700 -1 librbd::image::PreRemoveRequest: 0x558a64e78f20 handle_remove_snapshot: failed to auto-prune snapshot 16: (16) Device or resource busy
debug 2020-11-17 13:02:54.215 7f44e32ca700  0 mgr[rbd_support] execute_task: [errno 39] error deleting image from trash
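
To gauge how often the retry is being logged, the MGR log can be scanned for the execute_task failure line. The snippet below is a minimal sketch under the assumption that the line format matches the excerpt above; the function name count_task_errors and the regular expression are illustrative and do not come from Ceph itself.

```python
import re
from collections import Counter

# Assumed format, taken from the log excerpt in this report:
#   ... mgr[rbd_support] execute_task: [errno 39] error deleting image from trash
TASK_ERROR = re.compile(r"mgr\[rbd_support\] execute_task: \[errno (\d+)\] (.+)$")


def count_task_errors(log_text):
    """Count rbd_support task failures, keyed by (errno, message)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TASK_ERROR.search(line)
        if match:
            key = (int(match.group(1)), match.group(2).strip())
            counts[key] += 1
    return counts


SAMPLE_LOG = """\
debug 2020-11-17 13:02:54.211 7f44e544e700 -1 librbd::SnapshotRemoveRequest: 0x558a65235a20 should_complete: encountered error: (16) Device or resource busy
debug 2020-11-17 13:02:54.211 7f44e544e700 -1 librbd::image::PreRemoveRequest: 0x558a64e78f20 handle_remove_snapshot: failed to auto-prune snapshot 16: (16) Device or resource busy
debug 2020-11-17 13:02:54.215 7f44e32ca700  0 mgr[rbd_support] execute_task: [errno 39] error deleting image from trash
"""

if __name__ == "__main__":
    for (errno_value, message), count in count_task_errors(SAMPLE_LOG).items():
        print(f"errno {errno_value}: {message} (seen {count} time(s))")
```

Run against a longer log, a count that grows by roughly one every thirty seconds per pending image would be consistent with the retry behaviour described in this report.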

Expected results:
Images that cannot be removed due to dependencies should not be listed in the progress output, and we should avoid logging errors for them.

Additional info:

Comment 1 Ken Dreyer (Red Hat) 2021-01-12 23:40:40 UTC
Changes are in master as v16.0.0-7436-g4040082610, so this is available in our downstream Pacific builds.

Comment 5 errata-xmlrpc 2021-08-30 08:27:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.0 bug fix and enhancement), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3294