Description of problem (please be as detailed as possible and provide log snippets):
PVC attach time has degraded for both RBD and CephFS PVCs in ODF 4.10 compared with ODF 4.9.

Version of all relevant components (if applicable):
ODF 4.10.0.50
Note: additional details are available in the following Jenkins job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/view/Performance/job/qe-trigger-aws-ipi-3az-rhcos-3m-3w-performance/56/

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?
3

Is this issue reproducible?
Yes. I also reproduced this degradation on a cluster deployed with OCP 4.9 and ODF 4.10.

Can this issue be reproduced from the UI?

If this is a regression, please provide more details to justify this:
In OCP 4.9 + ODF 4.9, the average attach times over 10 PVCs were:
RBD: 7.4 sec
CephFS: 6.6 sec
In OCP 4.10 + ODF 4.10, the average attach times over 10 PVCs were:
RBD: 10.2 sec
CephFS: 8.8 sec
In OCP 4.9 + ODF 4.10, the average attach times over 10 PVCs were:
RBD: 12.8 sec
CephFS: 11 sec
The detailed comparison report is available here: https://docs.google.com/document/d/1OJfARHBAJs6bkYqri_HpSNM_N5gchUQ6P-lKe6ujQ6o/edit#

Steps to Reproduce:
1. Run the test_pvc_attachtime.py test.
2. Compare its results (average attach time of 10 samples) to the 4.9 results (available in this report: https://docs.google.com/document/d/1vyufd55iDyvKeYOwoXwKSsNoRK2VR41QNTuH-iERR8s/edit).

Actual results:
Average attach time in ODF 4.10 (with both OCP 4.9 and OCP 4.10) is at least 30% worse than in OCP 4.9 + ODF 4.9, for both RBD and CephFS. Please note that this is an average of 10 samples.

Expected results:
Average attach time should be the same as or shorter than in OCP 4.9 + ODF 4.9.

Additional info:
Relevant Jenkins job: https://ocs4-jenkins-csb-odf-qe.apps.ocp-c1.prod.psi.redhat.com/view/Performance/job/qe-trigger-aws-ipi-3az-rhcos-3m-3w-performance/56/
Comparison report: https://docs.google.com/document/d/1OJfARHBAJs6bkYqri_HpSNM_N5gchUQ6P-lKe6ujQ6o/edit#
Please note that must gather logs are available here: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/j-056ai3c33-p/j-056ai3c33-p_20211230T130122/logs/testcases_1640872857/
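The "at least 30% worse" figure can be checked directly from the averages quoted in this report. The snippet below is only a small arithmetic sketch; the function and variable names are illustrative and are not part of the test suite.

```python
# Average attach times in seconds, copied from the report above.
BASELINE = {"RBD": 7.4, "CephFS": 6.6}  # OCP 4.9 + ODF 4.9
CANDIDATES = {
    "OCP 4.10 + ODF 4.10": {"RBD": 10.2, "CephFS": 8.8},
    "OCP 4.9 + ODF 4.10": {"RBD": 12.8, "CephFS": 11.0},
}


def percent_slower(baseline, measured):
    """Return how much slower `measured` is than `baseline`, in percent."""
    return (measured - baseline) / baseline * 100.0


def degradation_table(baseline, candidates):
    """Map each configuration to its per-interface slowdown, rounded to 0.1%."""
    return {
        config: {
            interface: round(percent_slower(baseline[interface], seconds), 1)
            for interface, seconds in times.items()
        }
        for config, times in candidates.items()
    }


if __name__ == "__main__":
    for config, slowdowns in degradation_table(BASELINE, CANDIDATES).items():
        for interface, pct in slowdowns.items():
            print(f"{config} / {interface}: {pct}% slower than OCP 4.9 + ODF 4.9")
```

The smallest slowdown in the table is for CephFS on OCP 4.10 + ODF 4.10, which is where the 30% lower bound comes from.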
@Rakshith Thank you for pointing out that the `imagePullPolicy` default is Always. We checked our test logs, and this is indeed what is happening, not only in the test_pvc_attachtime test but also in others. This means that the currently reported attach/reattach times include the time spent pulling the image. I've added fixing all 3 tests to the QPAS team work plan at P0 priority. Once the tests are fixed, we will be able to provide more accurate attach/reattach times.
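For illustration, a pod that mounts the PVC without re-pulling the image on every start could be described roughly as follows. This is a minimal sketch and not the actual fix applied to the tests: the helper name, image reference, container name, and mount path are all hypothetical. Kubernetes applies `Always` when the image tag is `:latest` or omitted, so the sketch sets the policy explicitly.

```python
import json


def build_pod_manifest(pod_name, pvc_name, image, pull_policy="IfNotPresent"):
    """Build a pod manifest (as a dict) that mounts an existing PVC.

    Hypothetical helper for illustration only. Setting imagePullPolicy to
    IfNotPresent lets the kubelet reuse an image already cached on the node,
    so a timed pod start is not dominated by the image pull.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "containers": [
                {
                    "name": "perf-container",  # hypothetical container name
                    "image": image,
                    "imagePullPolicy": pull_policy,
                    "volumeMounts": [
                        {"name": "data", "mountPath": "/mnt/data"}  # hypothetical path
                    ],
                }
            ],
            "volumes": [
                {"name": "data", "persistentVolumeClaim": {"claimName": pvc_name}}
            ],
        },
    }


if __name__ == "__main__":
    # The image reference below is a placeholder, not the image used by the tests.
    manifest = build_pod_manifest("attach-test-pod", "attach-test-pvc",
                                  "registry.example.com/perf/image:1.0")
    print(json.dumps(manifest, indent=2))
```

With `IfNotPresent`, the image still has to be present on the node before the timed run (for example from an earlier, untimed pod start), otherwise the first measurement will again include a pull.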
All the performance tests that were using the default pull policy (Always) have been fixed so that they no longer pull the image each time. I will run them on 4.10 and 4.9 and post the results of the comparison here.
Update: I've run the fixed pvc_attachtime.py test (the fix was to stop pulling the image each time) on 4.10.0 build 184 (latest) and 4.9.4 build 7.

The comparison is available here: http://ocsperf.ceph.redhat.com:8080/index.php?version1=17&build1=51&platform1=1&az_topology1=1&test_name%5B%5D=9&version2=14&build2=53&platform2=1&az_topology2=1&version3=&build3=&version4=&build4=&submit=Choose+options
4.9 must gather: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/lr5-ypersky-9aws/lr5-ypersky-9aws_20220309T120256/logs/testcases_1646831255/
4.10 must gather: http://magna002.ceph.redhat.com/ocsci-jenkins/openshift-clusters/lr5-ypersky-10aws/lr5-ypersky-10aws_20220309T120401/logs/testcases_1646831358/

The comparison shows an IMPROVEMENT in 4.10.0.184 in PVC attach time for both RBD (47%) and CephFS (42%).

For reference, the averages reported earlier with the unfixed test were:
In OCP 4.9 + ODF 4.9, the average attach times over 10 PVCs were:
RBD: 7.4 sec
CephFS: 6.6 sec
In OCP 4.10 + ODF 4.10, the average attach times over 10 PVCs were:
RBD: 10.2 sec
CephFS: 8.8 sec
In OCP 4.9 + ODF 4.10, the average attach times over 10 PVCs were:
RBD: 12.8 sec
CephFS: 11 sec

In the newly executed fixed test on OCP 4.10 + ODF 4.10, the average attach times over 10 PVCs are:
RBD: 5.4 sec
CephFS: 5.6 sec

These are the best times measured so far. Therefore, the BZ should be closed.