Bug 1879866
| Field | Value |
| --- | --- |
| Summary | [Tracker for Bug 1881316] FIO results on CephFS are up to 30% degraded from 4.5 |
| Product | [Red Hat Storage] Red Hat OpenShift Container Storage |
| Component | ceph |
| Status | CLOSED NOTABUG |
| Severity | high |
| Priority | unspecified |
| Version | 4.6 |
| Target Milestone | --- |
| Target Release | --- |
| Hardware | Unspecified |
| OS | Unspecified |
| Reporter | Avi Liani <alayani> |
| Assignee | Patrick Donnelly <pdonnell> |
| QA Contact | Raz Tamir <ratamir> |
| CC | assingh, bniver, ekuric, jijoy, madam, muagarwa, ocs-bugs, pdonnell, sostapov |
| Keywords | AutomationTriaged, Performance |
| Doc Type | No Doc Update |
| Story Points | --- |
| Clones | 1881316 (view as bug list) |
| Last Closed | 2020-11-17 15:34:59 UTC |
| Type | Bug |
| Regression | --- |
| Mount Type | --- |
| Documentation | --- |
| Bug Blocks | 1881316 |
Comment 2
Yaniv Kaul
2020-09-17 09:57:32 UTC
All logs and must-gather can be found at: http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/BZ-1879866/

(In reply to Yaniv Kaul from comment #2)
> So trying to understand the important items here:
> 1. It's on VMware LSO - what about other platforms?

Not tested yet.

> 2. What's the difference between the Ceph versions?

Same Ceph version.

> 3. Is it on the same *OCP* versions?

Yes, it is the same OCP; only OCS was upgraded.

> 4. RHCOS version?

```
# oc get node -o wide
NAME              STATUS   ROLES    AGE   VERSION           INTERNAL-IP    EXTERNAL-IP    OS-IMAGE                                                       KERNEL-VERSION                 CONTAINER-RUNTIME
compute-0         Ready    worker   3d    v1.18.3+6c42de8   10.1.160.85    10.1.160.85    Red Hat Enterprise Linux CoreOS 45.82.202008290529-0 (Ootpa)   4.18.0-193.14.3.el8_2.x86_64   cri-o://1.18.3-11.rhaos4.5.gite5bcc71.el8
compute-1         Ready    worker   3d    v1.18.3+6c42de8   10.1.160.105   10.1.160.105   Red Hat Enterprise Linux CoreOS 45.82.202008290529-0 (Ootpa)   4.18.0-193.14.3.el8_2.x86_64   cri-o://1.18.3-11.rhaos4.5.gite5bcc71.el8
compute-2         Ready    worker   3d    v1.18.3+6c42de8   10.1.160.141   10.1.160.141   Red Hat Enterprise Linux CoreOS 45.82.202008290529-0 (Ootpa)   4.18.0-193.14.3.el8_2.x86_64   cri-o://1.18.3-11.rhaos4.5.gite5bcc71.el8
control-plane-0   Ready    master   3d    v1.18.3+6c42de8   10.1.160.88    10.1.160.88    Red Hat Enterprise Linux CoreOS 45.82.202008290529-0 (Ootpa)   4.18.0-193.14.3.el8_2.x86_64   cri-o://1.18.3-11.rhaos4.5.gite5bcc71.el8
control-plane-1   Ready    master   3d    v1.18.3+6c42de8   10.1.160.86    10.1.160.86    Red Hat Enterprise Linux CoreOS 45.82.202008290529-0 (Ootpa)   4.18.0-193.14.3.el8_2.x86_64   cri-o://1.18.3-11.rhaos4.5.gite5bcc71.el8
control-plane-2   Ready    master   3d    v1.18.3+6c42de8   10.1.160.146   10.1.160.146   Red Hat Enterprise Linux CoreOS 45.82.202008290529-0 (Ootpa)   4.18.0-193.14.3.el8_2.x86_64   cri-o://1.18.3-11.rhaos4.5.gite5bcc71.el8
```

> 5. I assume 'append' is the main issue?

You are referring to the smallfile test, while this BZ is for the FIO test one page above in the report.

> How's Ceph doing? (ceph status would be nice to see!)

All logs and must-gather can be found at: http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/BZ-1879866/

(In reply to Avi Liani from comment #3)
> yes - it is on the same OCP, only OCS was upgrade

OCS 4.5 -> 4.6, but the Ceph version did not change?

> All logs and must-gather can be found at :
> http://rhsqe-repo.lab.eng.blr.redhat.com/OCS/ocs-qe-bugs/BZ-1879866/

How can I access this machine to examine the logs? HTTP is unusable for analyzing logs.

If the Ceph version did not change, there's no reason to believe that there is a bug in CephFS (bz1881316).

I looked at: https://docs.google.com/document/d/1thRo0AGK2af2ECUGiLOBQ28UQtzWLX-2iYxeSfdRy9s/edit#

This appears to be data from a single run. There really should be at least three runs to do a proper analysis.

Finally, these are data-path I/Os where the MDS is minimally involved, and the client I/O pattern will be almost the same as RBD. We have done extensive performance testing in the past which affirms this. I suspect this is a transient performance hiccup of some kind.

Assigning to you, Patrick, to track the info you requested (needinfo). If it is not a bug, please close it. Moving it out from 4.6; please bring it back if there is sufficient data.
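The point above about single-run data can be made concrete: with at least three runs per version, the observed delta can be compared against run-to-run noise before concluding there is a real regression. A minimal sketch; the throughput numbers below are hypothetical, not from this BZ:

```python
import statistics

def summarize(runs):
    """Return (mean, coefficient of variation) for a list of FIO throughput results."""
    mean = statistics.mean(runs)
    cv = statistics.stdev(runs) / mean  # relative run-to-run noise (sample stdev)
    return mean, cv

def regression_pct(old_runs, new_runs):
    """Percent change of the new mean relative to the old mean (negative = degradation)."""
    old_mean, _ = summarize(old_runs)
    new_mean, _ = summarize(new_runs)
    return 100.0 * (new_mean - old_mean) / old_mean

# Hypothetical MiB/s results, three runs per OCS version.
ocs45 = [410.0, 402.0, 418.0]
ocs46 = [300.0, 295.0, 310.0]

delta = regression_pct(ocs45, ocs46)
_, noise = summarize(ocs46)
# A delta far larger than the run-to-run noise suggests a real regression
# rather than a transient hiccup.
print(f"delta={delta:.1f}% noise={noise * 100:.1f}%")
```

With the hypothetical numbers above, the ~26% drop dwarfs the few-percent run-to-run noise; a single run cannot make that distinction.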