Bug 1984590
| Summary: | AWS - Degradation of performance of small files CephFS 4 KB file size in 4.8 compared to 4.7 results | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Yuli Persky <ypersky> |
| Component: | ceph | Assignee: | Greg Farnum <gfarnum> |
| Status: | CLOSED DUPLICATE | QA Contact: | Elad <ebenahar> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 4.8 | CC: | alayani, bengland, bniver, khiremat, kramdoss, madam, muagarwa, ocs-bugs, odf-bz-bot, pdonnell, rcyriac, shberry, vshankar |
| Target Milestone: | --- | Keywords: | Performance, Regression |
| Target Release: | --- | Flags: | kramdoss: needinfo+, muagarwa: needinfo? (shberry) |
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-03-14 14:36:44 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Yuli Persky
2021-07-21 17:40:26 UTC
Please note that the platform is AWS, not VMware (as appears by mistake in the description).

Full version list:
HW Platform: AWS
Number of OCS nodes: 3
Number of total OSDs: 3
OSD Size (TiB): 2.00
Total available storage (GiB): 6,140
OCP Version: 4.8.0-0.nightly-2021-07-04-112043
OCS Version: 4.8.0-444.ci
Ceph Version: 14.2.11-183.el8cp

While we collect the MG, CCing the CephFS team here too, as I am not sure whether there were any significant changes in Ceph Core in this area.
14.2.11-184.el8cp

(In reply to Humble Chirammal from comment #7)
> While we collect the MG, Ccing CephFS team here too as I am not sure any
> significant changes in Ceph Core on this area.
>
> 14.2.11-184.el8cp

Below are the Ceph versions mentioned in the doc for the 4.8 and 4.7 tests:
4.8 version: 14.2.11-184.el8cp
4.7 version: 14.2.11-147.el8cp

The numbers in this report show regression for sure, but the difference between throughput and IOPS numbers seems a bit strange. For 4-KiB files (indeed for any file smaller than 1 MiB), IOPS should equal files/sec, and throughput (MB/s) should be files/sec x file size in KiB / 1000. In particular, some of the test1 results do not fit the above formula. I commented on that in the doc. What happened? Let's find out. The raw log data is in a link from the perf doc; perhaps the answer is in there.

Avi, +1, if there are 3 samples, then as long as the %deviation is low (i.e. under 10%), I don't think it's noise in the measurement. What is the %deviation in these measurements?

Was cache dropping used? Remember that smallfile is not using O_DIRECT I/O, unlike fio. Unless you request fsync: y, it does not flush dirty pages. Also, smallfile has no notion of a "prefill" where it preallocates the space. So it's a very different workload.

Smallfile tests don't generate readdirs unless you specifically request that operation.

Can you change the target_size_ratio of the CephFS pool and rerun the smallfile test for CephFS? This will tell Ceph that it should expect most data on the CephFS pool, and it will align the PGs accordingly. Here's how you set it:
    ceph osd pool set ocs-storagecluster-cephfilesystem-data0 target_size_ratio 0.95
and change the RBD pool to 0.05:
    ceph osd pool set ocs-storagecluster-cephblockpool target_size_ratio 0.05
Wait for all the PGs to balance before running the test (pg_num and pgp_num should match in the pool description, ceph osd pool ls detail).

Avi/Yuli, how are the results of 4.9 compared to 4.8?

Actively being looked at by the Ceph folks, changing the component.

@Yaniv, the AWS performance report (4.9 vs 4.8) is available here: https://docs.google.com/document/d/1vyufd55iDyvKeYOwoXwKSsNoRK2VR41QNTuH-iERR8s/edit

Not a 4.9 regression, still being discussed. Moving it out based on the offline discussion with QE
https://bugzilla.redhat.com/show_bug.cgi?id=2015520#c29

Defer to Ben/Venky. (I'm on paternity leave.)

No decision yet on this, not a 4.10 blocker. Moving it out.

Setting NI on Venky based on Patrick's comment.

Closing as dupe based on https://bugzilla.redhat.com/show_bug.cgi?id=2015520#c29 and the follow-on discussions there and here.

*** This bug has been marked as a duplicate of bug 2015520 ***

Clearing my NI. |
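
For reference, the consistency rule stated in the comments above (for files smaller than 1 MiB, IOPS should equal files/sec, and throughput in MB/s should be files/sec x file size in KiB / 1000) could be checked against a result row with a small script. This is only a sketch: the function names and the 5% tolerance are hypothetical and do not come from the smallfile tool or the perf report.

```python
def expected_throughput_mb_s(files_per_sec, file_size_kib):
    """Throughput implied by the rule in the comment: files/sec x KiB / 1000."""
    return files_per_sec * file_size_kib / 1000.0


def is_consistent(files_per_sec, iops, throughput_mb_s, file_size_kib,
                  tolerance=0.05):
    """Return True when a result row fits both parts of the rule.

    tolerance is an assumed relative margin (5%), not a value from the report.
    """
    expected = expected_throughput_mb_s(files_per_sec, file_size_kib)
    # For small files, one file operation should count as one I/O.
    iops_ok = abs(iops - files_per_sec) <= tolerance * files_per_sec
    # Reported throughput should match the value derived from files/sec.
    throughput_ok = abs(throughput_mb_s - expected) <= tolerance * expected
    return iops_ok and throughput_ok
```

With made-up numbers, a 4 KiB run at 1000 files/sec implies 4.0 MB/s, so `is_consistent(1000, 1000, 4.0, 4)` passes, while a row reporting 40.0 MB/s for the same files/sec would be flagged for a closer look at the raw logs.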
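
The "%deviation under 10%" criterion mentioned for the 3-sample runs can be computed as follows. The comments do not define %deviation, so this sketch assumes it means the sample standard deviation as a percentage of the mean; the function name is hypothetical.

```python
import statistics


def percent_deviation(samples):
    """Sample standard deviation as a percentage of the mean.

    Assumption: this is what "%deviation" refers to in the comment above.
    Needs at least two samples and a non-zero mean.
    """
    mean = statistics.mean(samples)
    if mean == 0:
        raise ValueError("mean of samples is zero; deviation is undefined")
    return 100.0 * statistics.stdev(samples) / mean
```

Under that assumption, three runs whose result is below 10 would be treated as a stable measurement, and a gap between 4.7 and 4.8 larger than that spread would be unlikely to be noise.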
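
The "pg_num and pgp_num should match" condition from the target_size_ratio comment could be checked by scanning the text printed by `ceph osd pool ls detail`. The sketch below assumes each pool line has the form `pool <id> '<name>' ... pg_num <n> pgp_num <m> ...`; the sample text is illustrative, not captured from this cluster, and the function name is hypothetical. It only covers the pg_num/pgp_num comparison, not whether the PGs are otherwise healthy.

```python
import re

# Assumed line layout: pool <id> '<name>' ... pg_num <n> pgp_num <m> ...
POOL_LINE = re.compile(
    r"^pool \d+ '(?P<name>[^']+)'.*?\bpg_num (?P<pg>\d+)\b"
    r".*?\bpgp_num (?P<pgp>\d+)\b"
)

# Illustrative sample only; real output carries more fields per pool.
SAMPLE = (
    "pool 1 'ocs-storagecluster-cephblockpool' replicated size 3 "
    "min_size 2 pg_num 32 pgp_num 32 autoscale_mode on\n"
    "pool 2 'ocs-storagecluster-cephfilesystem-data0' replicated size 3 "
    "min_size 2 pg_num 128 pgp_num 64 autoscale_mode on\n"
)


def unbalanced_pools(ls_detail_text):
    """Map pool name -> (pg_num, pgp_num) for pools where the two differ."""
    mismatched = {}
    for line in ls_detail_text.splitlines():
        match = POOL_LINE.match(line.strip())
        if not match:
            continue  # skip blank lines and anything that is not a pool line
        pg, pgp = int(match["pg"]), int(match["pgp"])
        if pg != pgp:
            mismatched[match["name"]] = (pg, pgp)
    return mismatched
```

An empty result from `unbalanced_pools` would suggest the pools have finished adjusting and the smallfile rerun can start; any entry means the test should wait.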