Bug 2170310

Summary: 5.x client with 4.x cluster : RBD IO failed saying rbd: symbol lookup error: rbd: undefined symbol
Product: [Red Hat Storage] Red Hat Ceph Storage
Reporter: Vasishta <vashastr>
Component: RADOS
Assignee: Brad Hubbard <bhubbard>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Pawan <pdhiran>
Severity: low
Docs Contact:
Priority: low
Version: 5.3
CC: bhubbard, ceph-eng-bugs, cephqe-warriors, ngangadh, vumrao
Target Milestone: ---
Keywords: Automation
Target Release: 6.1z1
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-06-29 00:44:29 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Vasishta 2023-02-16 05:42:59 UTC
Description of problem:
While using a 5.x client with a 4.x cluster, an rbd command failed with:
rbd: symbol lookup error: rbd: undefined symbol: _ZN8librados7v14_2_05IoCtx7notify2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERN4ceph6buffer7v14_2_04listEmPSD_, version LIBRADOS_14.2.0

Version-Release number of selected component (if applicable):
16.2.10-133

How reproducible:
Not yet reproduced because of some dependency errors; will update later.

Steps to Reproduce:
1. Configure a 4.x cluster
2. Upgrade the client packages to 5.x ceph
3. Run rbd commands (we tried rbd resize)

Actual results:
rbd: symbol lookup error: rbd: undefined symbol: _ZN8librados7v14_2_05IoCtx7notify2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERN4ceph6buffer7v14_2_04listEmPSD_, version LIBRADOS_14.2.0
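
For triage, the error line can be split into the failing program, the object that needed the symbol, the mangled symbol, and the required symbol version. The following is only a minimal sketch: the function name parse_lookup_error and the regular expression are illustrative assumptions based on the single message shown in this report, not on any loader specification.

```python
import re

# Assumed shape of the loader message, taken from the one example in this report:
#   <prog>: symbol lookup error: <object>: undefined symbol: <symbol>, version <version>
# The ", version ..." suffix is treated as optional.
LOOKUP_ERROR = re.compile(
    r"^(?P<prog>[^:]+): symbol lookup error: (?P<obj>[^:]+): "
    r"undefined symbol: (?P<symbol>[^,\s]+)(?:, version (?P<version>\S+))?$"
)


def parse_lookup_error(line):
    """Return a dict with prog, obj, symbol and version, or None if the line does not match."""
    match = LOOKUP_ERROR.match(line.strip())
    return match.groupdict() if match else None
```

For the message above this should yield rbd as both the program and the object, and LIBRADOS_14.2.0 as the required version, i.e. the rbd binary that ran expected a librados exporting 14.2.0-versioned symbols.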

Expected results:
No errors

Additional info:
Found a similar issue in the upstream hammer-to-jewel upgrade suite:
http://pastebin.test.redhat.com/1091748

Comment 1 Brad Hubbard 2023-02-16 06:17:06 UTC
$ echo _ZN8librados7v14_2_05IoCtx7notify2ERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERN4ceph6buffer7v14_2_04listEmPSD_|c++filt 
librados::v14_2_0::IoCtx::notify2(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, ceph::buffer::v14_2_0::list&, unsigned long, ceph::buffer::v14_2_0::list*)

Can you post the output of the following?

# ldd /usr/bin/rbd
# rpm -qf /usr/bin/rbd
# nm -gD /usr/bin/rbd
# nm -gD /usr/lib/librados.so.2
# rpm -qf /usr/lib/librados.so.2

This assumes the librados in the ldd output of the first command is
/usr/lib/librados.so.2 (it should be). If that is not the case then use whatever
path ldd returns for the last two commands please.
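
The checks above could be scripted roughly as follows. This is a sketch under assumptions: the helper names resolve_library and exported_symbols are hypothetical, and the parsing assumes the usual "name => path (address)" layout of ldd output and the "address type name" layout of nm -gD output, with an optional @VERSION suffix on the symbol name.

```python
import re


def resolve_library(ldd_output, soname):
    """Return the path ldd resolved for soname, or None if it is absent or 'not found'."""
    for line in ldd_output.splitlines():
        match = re.match(r"\s*(\S+) => (\S+)", line)
        if match and match.group(1) == soname and match.group(2).startswith("/"):
            return match.group(2)
    return None


def exported_symbols(nm_output):
    """Return the names of defined symbols in `nm -gD` output, without any @VERSION suffix."""
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined symbols carry an address, so they split into three fields
        # (address, type, name); undefined ones have only type and name.
        if len(fields) == 3:
            names.add(fields[2].split("@", 1)[0])
    return names
```

Feeding resolve_library the output of ldd /usr/bin/rbd gives the librados path to pass to nm and rpm -qf; the mangled notify2 symbol from the error can then be looked up in exported_symbols for that library.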

Comment 2 Vasishta 2023-02-16 11:39:47 UTC
(In reply to Vasishta from comment #0)

> Additional info:
> Found similar issues in hammer to jewel upgrade upstream suite:
> http://pastebin.test.redhat.com/1091748

Sorry for the incorrect link; the correct one is https://tracker.ceph.com/issues/17809

Comment 3 Vasishta 2023-02-20 17:46:13 UTC
(In reply to Brad Hubbard from comment #1)

Hi Brad,


I tried a couple of times to reproduce this but did not hit the above issue.
The most recent attempt used the same 4.x version (14.2.22-128.el8cp) with a recent 5.x build.

# rpm -qf /usr/lib64/librados.so.2
librados2-16.2.10-137.el8cp.x86_64

I used the same downstream automation suite (cephci). In this suite we upgrade packages and clients in parallel with IOs,
so the order of operations can vary slightly between runs.
Test which failed: https://github.com/red-hat-storage/cephci/blob/master/suites/pacific/upgrades/tier-1_upgrade_test-4x-to-5x-rpm.yaml#L101-L158
Implementation of parallelism:
https://github.com/red-hat-storage/cephci/blob/master/tests/parallel/test_parallel.py#L38-L41 -> https://github.com/red-hat-storage/cephci/blob/master/ceph/parallel.py

Does this seem to be a corner-case issue?
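
If the failure comes from IO running while packages are only partly upgraded, one way to catch it would be to compare the package that owns /usr/bin/rbd with the package that owns librados at the moment of the failure. The sketch below is only an illustration: parse_nvra and mixed_major are hypothetical helpers, and the ceph-common package name in the usage note is an assumption, not something confirmed in this bug.

```python
def parse_nvra(pkg):
    """Split an RPM name-version-release.arch string into its four parts."""
    nvr, arch = pkg.rsplit(".", 1)
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release, arch


def mixed_major(pkg_a, pkg_b):
    """Return True if the two packages carry different major versions."""
    major_a = parse_nvra(pkg_a)[1].split(".")[0]
    major_b = parse_nvra(pkg_b)[1].split(".")[0]
    return major_a != major_b
```

For example, mixed_major("ceph-common-14.2.22-128.el8cp.x86_64", "librados2-16.2.10-137.el8cp.x86_64") returns True, which would flag a 14.x binary paired with a 16.x library. The inputs are meant to be the rpm -qf results requested in comment #1.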

Comment 4 Brad Hubbard 2023-02-20 22:13:22 UTC
We can either wait until you can gather all the information in comment #1 or we
can close this as having insufficient data. At the moment I don't believe the
combination of binaries that caused this would actually be supported in practice
but I need confirmation of the exact binaries involved and the symbol data to
make that call. Let me know whether you want to leave it open or just reopen it
later when you have the requested data.