Bug 2064202 - VMware LSO - degradation in the multiple CephFS clone creation times (average) on 4.10 vs 4.9
Summary: VMware LSO - degradation in the multiple CephFS clone creation times (average) on 4.10 vs 4.9
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: csi-driver
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: yati padia
QA Contact: Elad
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-03-15 10:12 UTC by Yuli Persky
Modified: 2023-08-09 16:37 UTC (History)
9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-18 03:36:56 UTC
Embargoed:


Attachments (Terms of Use)

Description Yuli Persky 2022-03-15 10:12:17 UTC
Description of problem (please be as detailed as possible and provide log snippets):

There is a degradation in the multiple CephFS clone creation times (average) on 4.10 vs 4.9 on the VMware LSO platform.

The average CephFS clone creation time (average of 512 clones) in 4.9 is: 5.8 sec
The average CephFS clone creation time (average of 512 clones) in 4.10 is: 7.6 sec
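The "around 30%" regression quoted later in this report follows directly from these two averages; a quick sanity check in Python (numbers copied from above):

```python
# Average CephFS clone creation times reported above (seconds).
avg_49 = 5.8   # ODF 4.9, average over 512 clones
avg_410 = 7.6  # ODF 4.10, average over 512 clones

# Relative degradation of 4.10 vs 4.9, as a percentage.
degradation_pct = (avg_410 - avg_49) / avg_49 * 100
print(f"{degradation_pct:.1f}% slower")  # prints "31.0% slower"
```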


The full multiple-clone performance comparison is available on the Performance Dashboard:
http://ocsperf.ceph.redhat.com:8080/index.php?version1=6&build1=46&platform1=2&az_topology1=3&test_name%5B%5D=13&version2=14&build2=37&platform2=2&az_topology2=3&version3=&build3=&version4=&build4=&submit=Choose+options


Version of all relevant components (if applicable):

4.10 cluster: 

OCP: 4.10.0-0.nightly-2022-02-15-041303
ODF: 4.10.0-156
Ceph: 16.2.7-49.el8cp

4.9 cluster: 

OCP: 4.9.21
ODF: 4.9.0-251.ci
Ceph: 16.2.0-146.el8cp


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?


Is there any workaround available to the best of your knowledge?

No


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?

3


Is this issue reproducible?

Yes 

Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:

Yes, this is a regression, as can be seen from the results comparison here: http://ocsperf.ceph.redhat.com:8080/index.php?version1=6&build1=46&platform1=2&az_topology1=3&test_name%5B%5D=13&version2=14&build2=37&platform2=2&az_topology2=3&version3=&build3=&version4=&build4=&submit=Choose+options


Steps to Reproduce:
1. Deploy a VMware LSO cluster.
2. Run test_pvc_multi_clone_performance.py and compare the results to the previous release.
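The per-clone timing that the test averages can be approximated outside the test harness. The sketch below is illustrative only (not the actual test code), assuming each clone is timed from the PVC's creation timestamp until it reports Bound; the function names are hypothetical:

```python
import datetime

def creation_seconds(created_at: str, bound_at: str) -> float:
    """Elapsed seconds between PVC creation and the PVC becoming Bound.

    Timestamps use the Kubernetes/RFC 3339 form, e.g. '2022-02-15T10:38:39Z'.
    """
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    start = datetime.datetime.strptime(created_at, fmt)
    end = datetime.datetime.strptime(bound_at, fmt)
    return (end - start).total_seconds()

def average(durations):
    """Mean over all clones, matching the per-release averages in this report."""
    return sum(durations) / len(durations)

# Example: a clone that took 6 seconds to reach Bound.
print(creation_seconds("2022-02-15T10:38:39Z", "2022-02-15T10:38:45Z"))  # prints 6.0
```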


Actual results:

CephFS multiple clones average time is around 30% longer on 4.10 compared to 4.9. 


Expected results:

The average clone creation time should be the same or shorter in 4.10.


Additional info:

4.10 Relevant Jenkins Job:

https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/10162/testReport/

4.10 must-gather is available here:

http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/ypersky-local10/ypersky-local10_20220215T103839/logs/testcases_1645035276/

4.9 Relevant Jenkins Job: 

https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/10514/

4.9 must-gather is available in one of the links below:
   
http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/ypersky-lso9/ypersky-lso9_20220228T124019/logs/testcases_1646291211/

http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/ypersky-lso9/ypersky-lso9_20220228T124019/logs/testcases_1646313870/

Comment 5 Mudit Agarwal 2022-03-22 13:31:17 UTC
Not a 4.10 blocker based on the analysis done so far, keeping it open as it still awaits some answers.

Comment 6 Yuli Persky 2022-04-14 08:05:53 UTC
@Yati Padia,

I've re-run test_pvc_multi_clone_performance.py on the latest available ODF build.


                4.9 cluster         4.10 cluster
OCP Version     4.9.21              4.10.0-0.nightly-2022-04-12-031100
OCS Version     4.9.0-251.ci        4.10.0-221
Ceph Version    16.2.0-146.el8cp    16.2.7-98.el8cp



The relevant Jenkins run is: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/job/qe-deploy-ocs-cluster/11763/

Must-gather logs are available here: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/ypersky-ls10/ypersky-ls10_20220412T121953/logs/testcases_1649866832/

Performance Dashboard comparison between 4.10.0-221 (the current run) and 4.9 is available here: http://ocsperf.ceph.redhat.com:8080/index.php?version1=6&build1=46&platform1=2&az_topology1=3&test_name%5B%5D=13&version2=14&build2=63&platform2=2&az_topology2=3&version3=&build3=&version4=&build4=&submit=Choose+options


We see that the degradation in the average CephFS multiple clone creation time is NOT reproduced. In fact, the new results show a 21% improvement in 4.10.

As far as I'm concerned, this BZ can be closed.

Comment 7 yati padia 2022-04-18 03:36:56 UTC
Thanks @ypersky, closing this bug.

