Bug 2237038
| Summary: | [RHOSP 17.1 Hackfest] Attempt to connect to RHCS 5.3 cluster from RHEL 9.2 using v1 protocol causes client core dump | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Andrew Austin <aaustin> |
| Component: | RADOS | Assignee: | Radoslaw Zarzynski <rzarzyns> |
| Status: | CLOSED ERRATA | QA Contact: | Harsh Kumar <hakumar> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 5.3 | CC: | bhubbard, ceph-eng-bugs, cephqe-warriors, ngangadh, nojha, sostapov, tserlin, vumrao |
| Target Milestone: | --- | | |
| Target Release: | 7.1 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | ceph-18.2.1-61.el9cp | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2024-06-13 14:21:07 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Please specify the severity of this bug. Severity is defined here: https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity

This seems to be related to https://bugzilla.redhat.com/show_bug.cgi?id=2235738. After reading that bug, I found that one of my mons had weight = 10 while the other two had weight = 0. Setting the odd mon's weight to 0 resolved the issue.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Critical: Red Hat Ceph Storage 7.1 security, enhancements, and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:3925
Created attachment 1986778 [details]
core dump from RHCS 5 client

Description of problem:
Executing ceph client commands from a RHEL 9.2 host against an RHCS 5.3 cluster using messenger v1 causes the client to crash with a core dump. Commands succeed when the client is limited to v2; however, RHOSP always configures the v1 port, so this breaks integration between RHOSP 17.1 and an external RHCS 5.3 cluster.

Version-Release number of selected component (if applicable):
Tested with RHEL 9.2 (kernel 5.14.0-284.25.1.el9_2.x86_64) and both RHCS 5 (16.2.10-208.el9cp) and RHCS 6 (17.2.6-100.el9cp) clients. The cluster tested was version 16.2.10-187.el8cp. Connecting to an RHCS 6.1 cluster worked fine.

How reproducible:
Always. From a fresh RHEL 9.2 machine with ceph-common installed, configure the ceph client to communicate with the RHCS cluster over the v1 protocol, then run ceph df or ceph status to trigger the crash.

Steps to Reproduce:
1. Configure a RHEL 9.2 machine with a ceph client pointing to an RHCS 5.3 cluster with only v1 ports
2. Run ceph df
3. Observe crash and core dump

Actual results:
When the v1 protocol is used (forced or fallback), the client aborts with a core dump and the message below.

Expected results:
The command should return the normal ceph df output.

Additional info:
stderr text on crash:

/usr/include/c++/11/bits/random.tcc:2667: void std::discrete_distribution<_IntType>::param_type::_M_initialize() [with _IntType = int]: Assertion '__sum > 0' failed.
Aborted (core dumped)

Workaround for users of the ceph CLI: configure only v2 endpoints in ceph.conf. Since OpenStack always configures libvirt to use port 6789, there does not appear to be a workaround for RHOSP 17.1 integration.
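The CLI workaround above can be sketched as a ceph.conf fragment. This is a minimal illustration, not taken from the affected environment: the monitor addresses below are placeholders, and the v2 port 3300 is assumed to be reachable on each mon.

```ini
[global]
# List only v2 (port 3300) monitor endpoints; omitting the v1
# port 6789 keeps the client off the v1 handshake that crashes.
# Placeholder addresses -- substitute your cluster's mons.
mon_host = [v2:192.0.2.10:3300,v2:192.0.2.11:3300,v2:192.0.2.12:3300]
```

This only helps direct ceph CLI users; as noted above, RHOSP 17.1 configures libvirt with the v1 port 6789 itself, so the fragment does not address the integration case.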