Description of problem:

cephfs_mirror: fix a crash in update_fs_mirrors() when calling FSMirror::is_failed().

Version-Release number of selected component (if applicable):

RHCS 6

How reproducible:

This cannot be reproduced manually. Run the tests in test_mirroring.py; the cephfs-mirror daemon crashes even after all the tests have passed. The ceph-client.mirror logs show the following backtrace:

   -16> 2023-12-28T17:51:26.774+0000 7fdc39f63700 10 monclient: tick
   -15> 2023-12-28T17:51:26.900+0000 7fdc3d76a700 10 cephfs::mirror::Utils connect: connected to cluster=ceph using client=client.mirror
   -14> 2023-12-28T17:51:26.901+0000 7fdc3d76a700 20 cephfs::mirror::Utils mount: filesystem={fscid=56, fs_name=cephfs}
   -13> 2023-12-28T17:51:26.908+0000 7fdc3d76a700 10 cephfs::mirror::Utils mount: mounted filesystem={fscid=56, fs_name=cephfs}
   -12> 2023-12-28T17:51:26.908+0000 7fdc3d76a700 10 cephfs::mirror::FSMirror init: rados addrs=172.21.15.115:0/449872678
   -11> 2023-12-28T17:51:26.908+0000 7fdc3d76a700 20 cephfs::mirror::FSMirror init_instance_watcher
   -10> 2023-12-28T17:51:26.908+0000 7fdc3d76a700 20 cephfs::mirror::InstanceWatcher init
    -9> 2023-12-28T17:51:26.908+0000 7fdc3d76a700 20 cephfs::mirror::InstanceWatcher create_instance
    -8> 2023-12-28T17:51:26.908+0000 7fdc3d76a700 20 cephfs::mirror::Mirror handle_enable_mirroring: filesystem={fscid=54, fs_name=cephfs}, peers=, r=-2
    -7> 2023-12-28T17:51:26.908+0000 7fdc3ff6f700 -1 asok(0x5651f2796000) AdminSocket: error writing response length (32) Broken pipe
    -6> 2023-12-28T17:51:26.910+0000 7fdc31752700 20 cephfs::mirror::InstanceWatcher handle_create_instance: r=0
    -5> 2023-12-28T17:51:26.910+0000 7fdc31752700 20 cephfs::mirror::InstanceWatcher register_watcher
    -4> 2023-12-28T17:51:26.910+0000 7fdc31752700 20 cephfs::mirror::Watcher register_watch
    -3> 2023-12-28T17:51:26.911+0000 7fdc31f53700 20 cephfs::mirror::Watcher handle_register_watch: r=0
    -2> 2023-12-28T17:51:26.911+0000 7fdc31f53700 20 cephfs::mirror::InstanceWatcher handle_register_watcher: r=0
    -1> 2023-12-28T17:51:26.911+0000 7fdc31f53700 20 cephfs::mirror::FSMirror handle_init_instance_watcher: r=0
     0> 2023-12-28T17:51:26.912+0000 7fdc3cf69700 -1 *** Caught signal (Segmentation fault) **
 in thread 7fdc3cf69700 thread_name:safe_timer

 ceph version 16.2.14-417-gc5564c79 (c5564c7988cbaadc3382253af5843a8595347c2d) pacific (stable)
 1: /lib64/libpthread.so.0(+0x12ce0) [0x7fdc4476ace0]
 2: __pthread_mutex_lock()
 3: (std::mutex::lock()+0x17) [0x5651f0710357]
 4: (cephfs::mirror::Mirror::update_fs_mirrors()+0x827) [0x5651f070e7d7]
 5: (Context::complete(int)+0xd) [0x5651f070f6ed]
 6: (CommonSafeTimer<std::mutex>::timer_thread()+0x10f) [0x7fdc4563d65f]
 7: (CommonSafeTimerThread<std::mutex>::entry()+0x11) [0x7fdc4563e9f1]
 8: /lib64/libpthread.so.0(+0x81cf) [0x7fdc447601cf]
 9: clone()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Steps to Reproduce:
1.
2.
3.

Actual results:

The ceph-client.mirror logs show the above backtrace.

Expected results:

Additional info:
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 6.1 security, bug fix, and enhancement updates), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:5960