Bug 2247140
| Summary: | RHEL9 random write performance significantly less than RHEL8 |
|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage |
| Component: | NVMeOF |
| Version: | 7.0 |
| Target Release: | 7.1 |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Priority: | unspecified |
| Severity: | medium |
| Status: | CLOSED ERRATA |
| Reporter: | Paul Cuzner <pcuzner> |
| Assignee: | Aviv Caro <acaro> |
| QA Contact: | Paul Cuzner <pcuzner> |
| Docs Contact: | ceph-doc-bot <ceph-doc-bugzilla> |
| CC: | aviv.caro, cephqe-warriors, idryomov, owasserm, rlepaksh, tserlin |
| Fixed In Version: | ceph-18.2.1-157.el9cp |
| Doc Type: | If docs needed, set a value |
| Type: | Bug |
| Last Closed: | 2024-06-13 14:22:40 UTC |
**Description** (Paul Cuzner, 2023-10-30 21:58:43 UTC)
To help identify the issue, output from the following commands has been requested:

1. During the connect:
   - tcpdump
2. Shortly after the connect:
   - dmesg -T
   - nvmf_get_transports
   - nvmf_get_subsystems
   - rpc_nvmf_subsystem_get_qpairs
   - rpc_nvmf_subsystem_get_controllers
   - rpc_nvmf_subsystem_get_listeners
3. During I/O:
   - nvmf_get_stats
   - bdev_get_iostat

Output from these commands is attached as two separate tar files, one for RHEL8 and the other for RHEL9.

I have tried different parameters on the connect command in RHEL9:

- -i 8 -Q 1024
- -W 8 -Q 1024

and also different tuned profiles:

- network-latency
- throughput-performance

Created attachment 1996497 [details]: tcpdumps from rhel8 and rhel9 during connect
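The connect-time variants tried above can be sketched as a dry-run shell snippet. The gateway address, port, and subsystem NQN are placeholders, not values from this report; the `echo` keeps it a dry run, since `nvme connect` needs root and a reachable gateway:

```shell
#!/bin/sh
# Placeholder values -- not from this report; substitute your own gateway.
GW_ADDR="10.0.0.1"
GW_PORT="4420"
NQN="nqn.2016-06.io.spdk:cnode1"

# Connect-parameter variants tried on the RHEL9 initiator:
#   -i  nr-io-queues, -W nr-write-queues, -Q queue-size
for OPTS in "-i 8 -Q 1024" "-W 8 -Q 1024"; do
    echo nvme connect -t tcp -a "${GW_ADDR}" -s "${GW_PORT}" -n "${NQN}" ${OPTS}
done

# tuned profiles tried (root required to actually switch):
for PROFILE in network-latency throughput-performance; do
    echo tuned-adm profile "${PROFILE}"
done
```

Remove the `echo` prefixes to actually run the commands on a host with nvme-cli and tuned installed.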
I looked at the tcpdumps with Wireshark, and when I applied the nvme-tcp filter to the RHEL9 tcpdump, nothing was shown! With RHEL8 it just worked, and I can see the nvme/tcp packets.

Although the "problem" client is already RHEL9.2, I registered the server and ran an update, which pulled in some interesting packages: kernel (from 5.14.0-284.11.1 to 5.14.0-284.30.1), nvme-cli, and libnvme. However, a repeat of the test run did NOT close the gap with the RHEL8 IOPS result.

Created attachment 1996499 [details]: ceph nvmeof gw conf file
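The Wireshark observation can also be reproduced from the command line with tshark and the same nvme-tcp display filter. The capture path below is a placeholder; the snippet skips cleanly when tshark or the file is absent:

```shell
#!/bin/sh
PCAP="rhel9-connect.pcap"   # placeholder; point at the attached tcpdump capture
if command -v tshark >/dev/null 2>&1 && [ -f "${PCAP}" ]; then
    # Count frames matched by Wireshark's nvme-tcp dissector.
    # A count of 0 would reproduce the "nothing was shown" symptom on RHEL9.
    FRAMES=$(tshark -r "${PCAP}" -Y nvme-tcp | wc -l)
    echo "nvme-tcp frames: ${FRAMES}"
else
    echo "tshark or ${PCAP} not available; skipping"
fi
```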
Created attachment 1996502 [details]: output from get_bdevs
Created attachment 1996503 [details]: nvme list output from rhel9
I think we need to try and reproduce it in 7.1? I don't have any free hardware in the Scalelab to test this - everything is ESX8 or RHEL9. I didn't think RHEL8 was going to be supported anyway, so perhaps this issue should just move to the upstream backlog? @aviv.caro what do you think?

Per Paul's comment at https://ibm-systems-storage.slack.com/archives/C05AM6G7ZF1/p1716200930999259 these can be closed.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Critical: Red Hat Ceph Storage 7.1 security, enhancements, and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:3925

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.