Description of problem:
With a random write workload, RHEL8 delivers approximately 3x the IOPS of the same workload on RHEL9. ESXi also sees this drop in performance.

Version-Release number of selected component (if applicable): 7.0

How reproducible: Every time

Steps to Reproduce:
1. Create RHEL8.8 and RHEL9.2 clients connecting to different nvmeof subsystems (same gateway)
2. Provide 8 namespaces to each client
3. Use fio to run a random write workload at 4KB blocksize across all namespaces concurrently, with each namespace loaded to a queue depth of 128 (see the example invocation below)
4. Review the fio results file

Actual results:
RHEL8 IOPS = 313,924
RHEL9 IOPS = 134,174

Expected results:
Some variance is expected for the same workload, but not to this degree. A normal expectation would be 5-10%.

Additional info:
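For reference, a minimal sketch of an fio command line matching the described workload. The device paths (/dev/nvme0n1 through /dev/nvme0n8), runtime, and ioengine are assumptions, not taken from the actual test harness; options before the first --name are global, and each --name/--filename pair creates one job so every namespace carries its own queue depth of 128:

    fio --ioengine=libaio --direct=1 --rw=randwrite --bs=4k \
        --iodepth=128 --time_based --runtime=300 --group_reporting \
        $(for d in /dev/nvme0n{1..8}; do echo "--name=$(basename $d) --filename=$d"; done)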
To help identify the issue, output from the following commands has been requested (a sketch of how these are typically invoked follows below).

1. During the connect:
   - tcpdump
2. Shortly after the connect:
   - dmesg -T
   - nvmf_get_transports
   - nvmf_get_subsystems
   - rpc_nvmf_subsystem_get_qpairs
   - rpc_nvmf_subsystem_get_controllers
   - rpc_nvmf_subsystem_get_listeners
3. During IO:
   - nvmf_get_stats
   - bdev_get_iostat

Output from these commands is attached as two separate tar files, one for rhel8 and the other for rhel9.
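For context, a sketch of how this output is typically gathered, assuming the gateway exposes the SPDK RPC interface via rpc.py. The RPC socket path, subsystem NQN, capture interface, and port are all placeholder assumptions, not values from this setup:

    # On the client, capture the connect exchange (interface and port are assumptions):
    tcpdump -i eth0 -w connect.pcap 'tcp port 4420'

    # On the gateway, query the SPDK NVMe-oF target over its RPC socket
    # (socket path and NQN are assumptions; adjust to the gateway configuration):
    RPC="rpc.py -s /var/tmp/spdk.sock"
    $RPC nvmf_get_transports                                        > nvmf_get_transports.json
    $RPC nvmf_get_subsystems                                        > nvmf_get_subsystems.json
    $RPC nvmf_subsystem_get_qpairs      nqn.2016-06.io.spdk:cnode1  > qpairs.json
    $RPC nvmf_subsystem_get_controllers nqn.2016-06.io.spdk:cnode1  > controllers.json
    $RPC nvmf_subsystem_get_listeners   nqn.2016-06.io.spdk:cnode1  > listeners.json

    # While IO is running:
    $RPC nvmf_get_stats   > nvmf_get_stats.json
    $RPC bdev_get_iostat  > bdev_get_iostat.json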
I have tried different parameters on the connect command in RHEL9 (-i 8 -Q 1024, -W 8 -Q 1024), and also different tuned profiles (network-latency, throughput-performance). See the example commands below.
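For reference, a sketch of what those attempts look like on the client. The gateway address, port, and subsystem NQN are placeholder assumptions; -i is --nr-io-queues, -W is --nr-write-queues, -Q is --queue-size in nvme-cli:

    # Connect with more IO/write queues and a deeper queue size:
    nvme connect -t tcp -a 192.168.0.10 -s 4420 \
        -n nqn.2016-06.io.spdk:cnode1 \
        -i 8 -W 8 -Q 1024

    # Switching tuned profiles on the client:
    tuned-adm profile network-latency       # or: throughput-performance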
Created attachment 1996497 [details] tcpdumps from rhel8 and rhel9 during connect
I looked at the tcpdumps with Wireshark, and when I applied the nvme-tcp filter to the RHEL9 tcpdump, nothing was shown! With RHEL8 it just worked, and I can see the nvme/tcp packets. Although the "problem" client is already RHEL9.2, I registered the server and ran an update, which pulled in some interesting packages: kernel (from 5.14.0-284.11.1 to 5.14.0-284.30.1), nvme-cli, and libnvme. However, a repeat of the test run did NOT close the gap with the RHEL8 IOPS result.
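For reference, a way to check the captures from the command line. The capture filename and port are assumptions; if the gateway listens on a non-default port, the dissector may simply not be applied, which would explain an empty filter result:

    # Count NVMe/TCP PDUs recognised by the dissector in each capture:
    tshark -r rhel9-connect.pcap -Y nvme-tcp | wc -l

    # If a non-default port is in use, force the dissector on it
    # (decode-as name assumed to match the nvme-tcp display filter):
    tshark -r rhel9-connect.pcap -d tcp.port==4420,nvme-tcp -Y nvme-tcp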
Created attachment 1996499 [details] ceph nvmeof gw conf file
Created attachment 1996502 [details] output from get_bdevs
Created attachment 1996503 [details] nvme list output from rhel9
I think we need to try and reproduce it in 7.1?
I don't have any free hardware in the Scalelab to test this - everything is ESX8 or RHEL9. I didn't think RHEL8 was going to be supported anyway, so perhaps this issue should just move to the upstream backlog? @aviv.caro what do you think?
Per Paul's comment at https://ibm-systems-storage.slack.com/archives/C05AM6G7ZF1/p1716200930999259 these can be closed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Critical: Red Hat Ceph Storage 7.1 security, enhancements, and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2024:3925
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days