Bug 2259054 - Improve rbd_diff_iterate2() performance in fast-diff mode [5.3z]
Summary: Improve rbd_diff_iterate2() performance in fast-diff mode [5.3z]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Ceph Storage
Classification: Red Hat Storage
Component: RBD
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 5.3z7
Assignee: Ilya Dryomov
QA Contact: Sunil Angadi
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2024-01-18 21:39 UTC by Ilya Dryomov
Modified: 2024-06-26 10:01 UTC
6 users

Fixed In Version: ceph-16.2.10-257.el8cp
Doc Type: Enhancement
Doc Text:
Previously, RBD diff-iterate was not guaranteed to execute locally if exclusive lock was available when diffing against the beginning of time (`fromsnapname == NULL`) in fast-diff mode (`whole_object == true` with `fast-diff` image feature enabled and valid). With this enhancement, `rbd_diff_iterate2()` API performance is improved, thereby increasing the performance for QEMU live disk synchronization and backup use cases, where the `fast-diff` image feature is enabled.
Clone Of: 2258997
Environment:
Last Closed: 2024-06-26 10:01:49 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Ceph Project Bug Tracker 63341 0 None None None 2024-01-18 21:39:51 UTC
Red Hat Issue Tracker RHCEPH-8200 0 None None None 2024-01-18 21:42:15 UTC
Red Hat Product Errata RHSA-2024:4118 0 None None None 2024-06-26 10:01:54 UTC

Description Ilya Dryomov 2024-01-18 21:39:51 UTC
+++ This bug was initially created as a clone of Bug #2258997 +++

Description of problem:
When a user runs `rbd du <pool>/<image>`, the command takes quite long to report the disk usage of the image, irrespective of the image size.
Its performance can be improved.


Version-Release number of selected component (if applicable):
Issue exists in RHCS 5, RHCS 6, and RHCS 7

How reproducible:
Always 

Steps to Reproduce:
1. Deploy Ceph and create pools and images.
2. Run heavy IO against one image and light IO against another.
3. Note the time taken to execute the `rbd du` command for both images.

Actual results:
Irrespective of the image size, the `rbd du` command takes quite long to report disk usage.

Expected results:
The performance of the fast-diff path should be improved so that `rbd du` returns promptly.

Additional info:
https://tracker.ceph.com/issues/63341
https://gitlab.com/qemu-project/qemu/-/issues/1026

Comment 5 Ilya Dryomov 2024-06-05 08:20:10 UTC
(In reply to Sunil Angadi from comment #4)
> Tested using
> ceph version 16.2.10-260.el8cp (b20e1a5452628262667a6b060687917fde010343)
> pacific (stable)

Hi Sunil,

Is this the version of Ceph installed on the client node too (i.e. where the rbdbackup_with_lock.sh script is run)?

> 
> QEMU available for latest rhel8.9 is
> "qemu-kvm-6.2.0-40.module+el8.9.0+20867+9a6a0901.2"

QEMU 6.2 should be affected, both according to my understanding based on the code and the original report at https://gitlab.com/qemu-project/qemu/-/issues/1026.

> Timestamp for "event":"JOB_STATUS_CHANGE" with "status":"running":
> 1717495086.944589
> Timestamp for "event":"BLOCK_JOB_COMPLETED": 1717495131.119845
> Now, subtract the first timestamp from the second:
> 
> 1717495131.119845 - 1717495086.944589 = 44.18 seconds
> 
> this performance result is not the same as on RHEL 9

Have you tried running the same script on a build without the fix?  Is there any difference in performance?

(Again, what matters is the version of Ceph installed on the client node -- you should be able to perform both tests against the same cluster.)
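The elapsed time quoted in comment #4 comes from subtracting the later QMP event timestamp from the earlier one; a quick Python check of that arithmetic:

```python
# QMP event timestamps quoted in comment #4 (seconds since the epoch).
job_running = 1717495086.944589    # "event":"JOB_STATUS_CHANGE", "status":"running"
job_completed = 1717495131.119845  # "event":"BLOCK_JOB_COMPLETED"

# Elapsed time is the later timestamp minus the earlier one.
elapsed = job_completed - job_running
print(f"{elapsed:.2f} seconds")  # 44.18 seconds
```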

Comment 8 errata-xmlrpc 2024-06-26 10:01:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 5.3 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:4118

