Bug 1903612 - RBD fast-diff regression introduced due to snapshot-based mirroring changes
Summary: RBD fast-diff regression introduced due to snapshot-based mirroring changes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RBD
Version: 4.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.2
Assignee: Jason Dillaman
QA Contact: Harish Munjulur
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-12-02 13:59 UTC by Jason Dillaman
Modified: 2021-01-12 14:59 UTC

Fixed In Version: ceph-14.2.11-94.el8cp, ceph-14.2.11-94.el7cp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-01-12 14:58:46 UTC
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Ceph Project Bug Tracker | 48412 | 0 | None | None | None | 2020-12-02 13:59:56 UTC
Github | ceph ceph pull 38389 | 0 | None | closed | librbd: fix regression in object map diff request | 2021-01-12 17:21:05 UTC
Github | ceph ceph pull 38539 | 0 | None | closed | librbd/object_map: don't assert if a snapshot doesn't exist | 2021-01-12 17:21:05 UTC
Red Hat Product Errata | RHSA-2021:0081 | 0 | None | None | None | 2021-01-12 14:59:11 UTC

Description Jason Dillaman 2020-12-02 13:59:57 UTC
Description of problem:
The snapshot-based mirroring changes introduced a regression that breaks the fast-diff feature.

"rbd du" not only takes orders of magnitude longer, it can also return incorrect results when the object-map feature is enabled but the fast-diff feature is not.

"rbd-mirror" in snapshot-based mirroring mode cannot use the speed optimization of testing only dirty blocks and instead queries deltas for every data block from the remote OSDs. If the object-map feature is enabled but the fast-diff feature is not, this results in data corruption.
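
The feature combinations described above can be summarized in a short sketch. The bit values are the standard librbd feature flags (object-map = 8, fast-diff = 16). The function name `classify_impact` and its return labels are hypothetical, and the "neither feature enabled" case is an assumption, since this report does not discuss it.

```python
# Standard librbd feature bits (the Python bindings expose the same values
# as rbd.RBD_FEATURE_OBJECT_MAP and rbd.RBD_FEATURE_FAST_DIFF).
RBD_FEATURE_OBJECT_MAP = 1 << 3  # 8
RBD_FEATURE_FAST_DIFF = 1 << 4   # 16


def classify_impact(features: int) -> str:
    """Map an image's feature bitmask to the impact described in this report.

    Hypothetical helper for illustration only; the labels are not Ceph terms.
    """
    has_object_map = bool(features & RBD_FEATURE_OBJECT_MAP)
    has_fast_diff = bool(features & RBD_FEATURE_FAST_DIFF)

    if has_object_map and not has_fast_diff:
        # Reported worst case: wrong "rbd du" output, corruption with rbd-mirror.
        return "incorrect-results"
    if has_fast_diff:
        # Fast-diff is enabled but not used, so every object is queried.
        return "slow"
    # Assumption: without either feature, no fast path was expected anyway.
    return "no-fast-diff-expected"
```

For example, an image with only object-map enabled (`features == 8`) falls into the "incorrect-results" case, while an image with both features enabled falls into the "slow" case.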

Version-Release number of selected component (if applicable):
4.2

How reproducible:
100%

Steps to Reproduce:
1. Create a large image.
2. Run 'rbd du' against the image.

Actual results:
Results can take minutes for large enough images.

Expected results:
The command should complete almost immediately because fast-diff will be used.
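
The reproduction steps can be scripted roughly as follows. This is a minimal sketch: the pool name, image name, and the 1T size are placeholders chosen for illustration (the report only says "a large image"), and the helper names are hypothetical. It relies only on the `rbd create --size` and `rbd du` commands.

```python
import subprocess
import time


def build_repro_commands(pool: str, image: str, size: str = "1T"):
    """Return the argv lists for the two reproduction steps.

    pool, image, and size are placeholders, not values from the report.
    """
    spec = f"{pool}/{image}"
    return [
        ["rbd", "create", "--size", size, spec],  # step 1: create a large image
        ["rbd", "du", spec],                      # step 2: query disk usage
    ]


def time_command(argv):
    """Run a command and return its wall-clock duration in seconds."""
    start = time.monotonic()
    subprocess.run(argv, check=True)
    return time.monotonic() - start
```

Calling `time_command` on each entry returned by `build_repro_commands("rbd", "bigimage")` against a test cluster gives the duration of the 'rbd du' step, which can then be compared before and after the fix.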


Additional info:

Comment 1 Jason Dillaman 2020-12-02 14:12:24 UTC
Note: this might also affect the dashboard, since it calculates RBD image disk usage in the background if the fast-diff feature is enabled, but the librbd API is not correctly using fast-diff.
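
A background usage calculation gated on fast-diff might look like the sketch below. It is an illustration under assumptions, not the dashboard's actual code: `disk_usage` is a hypothetical helper, and `image` is expected to behave like an open `rbd.Image` from the librbd Python bindings, using its `features()`, `size()`, and `diff_iterate()` methods.

```python
RBD_FEATURE_FAST_DIFF = 1 << 4  # standard librbd feature bit (16)


def disk_usage(image):
    """Return used bytes for an rbd.Image-like object.

    Returns None when fast-diff is disabled, on the assumption that a
    full scan is too expensive to run in the background.
    """
    if not image.features() & RBD_FEATURE_FAST_DIFF:
        return None

    used = 0

    def on_extent(offset, length, exists):
        # diff_iterate invokes this callback once per reported extent.
        nonlocal used
        if exists:
            used += length

    # whole_object=True lets librbd answer from the object map (fast-diff)
    # instead of querying every object, when that path works correctly.
    image.diff_iterate(0, image.size(), None, on_extent, whole_object=True)
    return used
```

With the regression described in this bug, a call of this shape would still return, but librbd would not take the fast-diff path, so the calculation would be far slower than intended.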

Comment 5 Veera Raghava Reddy 2020-12-15 17:26:14 UTC
Hi Jason, from the description, this BZ seems to be a regression and doesn't have a workaround. We should include this BZ in 4.2.

Comment 7 Veera Raghava Reddy 2020-12-15 18:18:16 UTC
Hi Jason, Thanks for sharing the details. Proposing the BZ for 4.2 based on the potential impact to OSP customers.

Comment 15 Harish Munjulur 2020-12-23 21:32:27 UTC
Verified on ceph version 14.2.11-95.el8cp

Comment 17 errata-xmlrpc 2021-01-12 14:58:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat Ceph Storage 4.2 Security and Bug Fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0081

