Bug 2269335

Summary: [7.0z backport] Improve rbd_diff_iterate2() performance in fast-diff mode
Product: Red Hat Storage (Red Hat Ceph Storage)
Component: RBD
Version: 6.1
Reporter: Bipin Kunal <bkunal>
Assignee: Ilya Dryomov <idryomov>
QA Contact: Manasa <mgowri>
Status: CLOSED DUPLICATE
Severity: high
Priority: unspecified
CC: ceph-eng-bugs, cephqe-warriors, idryomov, sangadi, tserlin
Target Milestone: ---
Target Release: 7.0z2
Hardware: Unspecified
OS: Unspecified
Clone Of: 2258997
Last Closed: 2024-03-13 09:42:02 UTC
Bug Depends On: 2258997
Bug Blocks: 2269336

Description Bipin Kunal 2024-03-13 08:07:41 UTC
+++ This bug was initially created as a clone of Bug #2258997 +++

Description of problem:
When a user runs "rbd du <pool>/<image>", the command takes a long time to report the disk usage of the image, irrespective of the image size. Its performance can be improved.
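
As the summary suggests, rbd du is driven by rbd_diff_iterate2(), which can only take its fast path when the fast-diff image feature (and the underlying object map) is enabled and valid. A quick way to check this, using an illustrative pool/image name not taken from this report:

rbd info rbd/testimage | grep features
# expect something like: features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
rbd feature enable rbd/testimage object-map fast-diff   # only if missing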


Version-Release number of selected component (if applicable):
The issue exists in RHCS 5, RHCS 6, and RHCS 7.

How reproducible:
Always 

Steps to Reproduce:
1. Deploy Ceph and create pools and images.
2. Run heavy IO against one image and little IO against another.
3. Note the time taken to run "rbd du" against both images (a reproduction sketch follows this list).
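
A minimal reproduction sketch for the steps above (pool, image names, and IO sizes are illustrative assumptions, not taken from this report):

#!/bin/bash
# Hedged reproduction sketch -- all names and sizes are illustrative.
set -e
rbd create rbd/du-heavy --size 100G
rbd create rbd/du-light --size 100G
# Heavy IO against one image, light IO against the other.
rbd bench --io-type write --io-size 4M --io-total 10G rbd/du-heavy
rbd bench --io-type write --io-size 4M --io-total 128M rbd/du-light
# Before the fix, both commands take a long time regardless of actual usage.
time rbd du rbd/du-heavy
time rbd du rbd/du-light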

Actual results:
Irrespective of image size, "rbd du" takes a long time to report disk usage.

Expected results:
The performance of fast-diff should be improved so that "rbd du" returns quickly.

Additional info:
https://tracker.ceph.com/issues/63341
https://gitlab.com/qemu-project/qemu/-/issues/1026

--- Additional comment from Ilya Dryomov on 2024-01-24 02:47:07 IST ---

Pushed to ceph-7.1-rhel-patches.

--- Additional comment from errata-xmlrpc on 2024-01-24 09:36:00 IST ---

Bug report changed to ON_QA status by Errata System.
A QE request has been submitted for advisory RHBA-2024:126567-01
https://errata.engineering.redhat.com/advisory/126567

--- Additional comment from errata-xmlrpc on 2024-01-24 09:36:07 IST ---

This bug has been added to advisory RHBA-2024:126567 by Thomas Serlin (tserlin)

--- Additional comment from Sunil Angadi on 2024-02-05 20:04:14 IST ---

QEMU version used: qemu-kvm-8.0.0-16.el9_3.3

[root@ceph-linus-zc4fdq-node4 ~]# cat rbdbackup_with_lock.sh 
#!/bin/bash
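# Usage: rbdbackup_with_lock.sh <qemu-binary> <image-size>
# Runs a full QMP blockdev-backup between two empty RBD images; QEMU's
# librbd client is able to acquire the exclusive lock.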
rbd create emptytestA --size $2 
rbd create emptytestB --size $2
$1 \
    -qmp stdio \
    -drive file=rbd:rbd/emptytestA:conf=/etc/ceph/ceph.conf:id=admin:keyring=/etc/ceph/ceph.client.admin.keyring,if=none,id=driveA,format=raw \
    -drive file=rbd:rbd/emptytestB:conf=/etc/ceph/ceph.conf:id=admin:keyring=/etc/ceph/ceph.client.admin.keyring,if=none,id=driveB,format=raw \
<<EOF
{"execute": "qmp_capabilities"}
{"execute": "blockdev-backup",
     "arguments": { "device": "driveA",
                    "sync": "full",
                    "target": "driveB" } }
EOF
rbd rm emptytestA
rbd rm emptytestB 
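
Both scripts take the QEMU binary and the image size as arguments; a typical timed invocation might look like the following (the qemu-kvm path is an assumption for a RHEL 9 host, not taken from this comment):

time ./rbdbackup_with_lock.sh /usr/libexec/qemu-kvm 100G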



Test results of the fixed ceph version with exclusive lock available
(script used: rbdbackup_with_lock.sh):
----
Image size   Time
100G         0.1s
200G         2.1s
500G         4.7s
1000G        9.12s
5000G        50s
50000G       461s

Unfixed build with exclusive lock available:
---
Image size   Time
10G          1.14s
100G         30s
200G         95.4s
500G         521s


[root@ceph-linus-zc4fdq-node4 ~]# cat rbdbackup_without_lock.sh 
#!/bin/bash
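# Usage: rbdbackup_without_lock.sh <qemu-binary> <image-size>
# Mapping emptytestA with -o exclusive makes the kernel client hold the
# exclusive lock, so QEMU's librbd client cannot acquire it.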
rbd create emptytestA --size $2
sudo rbd device map -o exclusive emptytestA 
rbd create emptytestB --size $2
$1 \
    -qmp stdio \
    -drive file=rbd:rbd/emptytestA:conf=/etc/ceph/ceph.conf:id=admin:keyring=/etc/ceph/ceph.client.admin.keyring,if=none,id=driveA,format=raw \
    -drive file=rbd:rbd/emptytestB:conf=/etc/ceph/ceph.conf:id=admin:keyring=/etc/ceph/ceph.client.admin.keyring,if=none,id=driveB,format=raw \
<<EOF
{"execute": "qmp_capabilities"}
{"execute": "blockdev-backup",
     "arguments": { "device": "driveA",
                    "sync": "full",
                    "target": "driveB" } }
EOF
sudo rbd device unmap emptytestA
rbd rm emptytestA
rbd rm emptytestB

Test results of the fixed ceph version with exclusive lock unavailable
(script used: rbdbackup_without_lock.sh):
---
Image size   Time
100G         6.6s
200G         15.3s
500G         45.8s
1000G        114s
5000G        966s

Unfixed build with exclusive lock unavailable:
---
Image size   Time
10G          0.84s
100G         27s
200G         101s
500G         530s

Sanity tests also passed:
tier-1: http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-FA8XDE/
tier-2: http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-U04G0U/

Verified using
ceph version 18.2.1-10.el9cp (ccf42acecc9e7ec19c8994e4d2ca0180b612ad1e) reef (stable)