Bug 2170310
| Summary: | 5.x client with 4.x cluster : RBD IO failed saying rbd: symbol lookup error: rbd: undefined symbol | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Vasishta <vashastr> |
| Component: | RADOS | Assignee: | Brad Hubbard <bhubbard> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Pawan <pdhiran> |
| Severity: | low | Docs Contact: | |
| Priority: | low | | |
| Version: | 5.3 | CC: | bhubbard, ceph-eng-bugs, cephqe-warriors, ngangadh, vumrao |
| Target Milestone: | --- | Keywords: | Automation |
| Target Release: | 6.1z1 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-06-29 00:44:29 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Vasishta
2023-02-16 05:42:59 UTC
Comment 1
Brad Hubbard

$ echo _ZN8librados7v14_2_05IoCtx7notify2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERN4ceph6buffer7v14_2_04listEmPSD_ | c++filt
librados::v14_2_0::IoCtx::notify2(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ceph::buffer::v14_2_0::list&, unsigned long, ceph::buffer::v14_2_0::list*)

Can you post the output of the following?

# ldd /usr/bin/rbd
# rpm -qf /usr/bin/rbd
# nm -gD /usr/bin/rbd
# nm -gD /usr/lib/librados.so.2
# rpm -qf /usr/lib/librados.so.2

This assumes the librados in the ldd output of the first command is /usr/lib/librados.so.2 (it should be). If that is not the case, please use whatever path ldd returns for the last two commands.

Comment 2
Vasishta

(In reply to Vasishta from comment #0)
> Additional info:
> Found similar issues in hammer to jewel upgrade upstream suite:
> http://pastebin.test.redhat.com/1091748

Sorry for the incorrect link; it is https://tracker.ceph.com/issues/17809

Comment 3
Vasishta

(In reply to Brad Hubbard from comment #1)

Hi Brad,

I tried a couple of times to reproduce this but did not hit the above issue. The most recent attempt used the same 4.x version (14.2.22-128.el8cp) with a recent 5.x build:

# rpm -qf /usr/lib64/librados.so.2
librados2-16.2.10-137.el8cp.x86_64

I tried using the same automation suite (downstream cephci). In this suite we upgrade packages and clients in parallel with IOs, so the iterations can happen in a slightly random order.

Test which failed:
https://github.com/red-hat-storage/cephci/blob/master/suites/pacific/upgrades/tier-1_upgrade_test-4x-to-5x-rpm.yaml#L101-L158

Implementation of the parallelism:
https://github.com/red-hat-storage/cephci/blob/master/tests/parallel/test_parallel.py#L38-L41
https://github.com/red-hat-storage/cephci/blob/master/ceph/parallel.py

Does this seem to be a corner-case issue?

Comment 4
Brad Hubbard

We can either wait until you can gather all the information requested in comment #1, or we can close this as having insufficient data.
At the moment I don't believe the combination of binaries that caused this would actually be supported in practice, but I need confirmation of the exact binaries involved and the symbol data to make that call. Let me know whether you want to leave it open or just reopen it later when you have the requested data.
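For reference, the diagnosis requested in comment #1 can be sketched as a small shell helper. This is a minimal sketch, assuming binutils (nm, c++filt) and ldd are available; the /usr/bin/rbd path and the commented-out invocation are examples only and would need to be run on the affected client. It demangles the reported symbol and checks whether the librados the binary actually loads exports that symbol.

```shell
#!/bin/sh
# Sketch only: verify whether the symbol rbd reports as undefined is
# exported by the librados that the binary resolves at runtime.

# The mangled symbol from the rbd error in this bug:
SYM=_ZN8librados7v14_2_05IoCtx7notify2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERN4ceph6buffer7v14_2_04listEmPSD_

# Human-readable form of the mangled name:
echo "$SYM" | c++filt

# check_symbol BIN SYM: find the librados BIN links against (via ldd) and
# ask nm whether that library defines SYM. If it does not, the installed
# librados is older than the binary expects.
check_symbol() {
    bin=$1; sym=$2
    lib=$(ldd "$bin" | awk '/librados/ {print $3; exit}')
    if nm -gD --defined-only "$lib" | grep -qF "$sym"; then
        echo "defined in $lib"
    else
        echo "MISSING from $lib"
    fi
}

# Example invocation on the affected client (not executed here):
# check_symbol /usr/bin/rbd "$SYM"
```

If the final check prints MISSING, the mix of rbd and librados packages is inconsistent, which matches the "undefined symbol" failure mode described above.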