
Bug 2280662

Summary: cephfs_mirror: fix crash in update_fs_mirrors()
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Jos Collin <jcollin>
Component: CephFS Assignee: Jos Collin <jcollin>
Status: CLOSED NEXTRELEASE QA Contact: Hemanth Kumar <hyelloji>
Severity: medium Docs Contact:
Priority: high    
Version: 7.0 CC: ceph-eng-bugs, cephqe-warriors, gfarnum, hyelloji, ngangadh, vshankar
Target Milestone: ---   
Target Release: 7.0z4   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: FSMirror::is_failed() calls the member functions InstanceWatcher::is_failed() and MirrorWatcher::is_failed() through the m_instance_watcher and m_mirror_watcher pointers, respectively, while those pointers are null.
Consequence: Crash
Fix: The patch checks that m_instance_watcher and m_mirror_watcher are non-null before using them.
Result:
Story Points: ---
Clone Of: 2280636 Environment:
Last Closed: 2025-02-05 14:47:47 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2280636, 2280665    
Bug Blocks:    

Description Jos Collin 2024-05-15 15:11:24 UTC
+++ This bug was initially created as a clone of Bug #2280636 +++

Description of problem:
cephfs_mirror: fix crash in update_fs_mirrors(), when calling FSMirror::is_failed()

Version-Release number of selected component (if applicable):
RHCS6

How reproducible:
This cannot be reproduced manually. Run the tests in test_mirroring.py; the crash occurs even after all tests have passed. The ceph-client.mirror logs show the backtrace:

   -16> 2023-12-28T17:51:26.774+0000 7fdc39f63700 10 monclient: tick
   -15> 2023-12-28T17:51:26.900+0000 7fdc3d76a700 10 cephfs::mirror::Utils connect: connected to cluster=ceph using client=client.mirror
   -14> 2023-12-28T17:51:26.901+0000 7fdc3d76a700 20 cephfs::mirror::Utils mount: filesystem={fscid=56, fs_name=cephfs}
   -13> 2023-12-28T17:51:26.908+0000 7fdc3d76a700 10 cephfs::mirror::Utils mount: mounted filesystem={fscid=56, fs_name=cephfs}
   -12> 2023-12-28T17:51:26.908+0000 7fdc3d76a700 10 cephfs::mirror::FSMirror init: rados addrs=172.21.15.115:0/449872678
   -11> 2023-12-28T17:51:26.908+0000 7fdc3d76a700 20 cephfs::mirror::FSMirror init_instance_watcher
   -10> 2023-12-28T17:51:26.908+0000 7fdc3d76a700 20 cephfs::mirror::InstanceWatcher init
    -9> 2023-12-28T17:51:26.908+0000 7fdc3d76a700 20 cephfs::mirror::InstanceWatcher create_instance
    -8> 2023-12-28T17:51:26.908+0000 7fdc3d76a700 20 cephfs::mirror::Mirror handle_enable_mirroring: filesystem={fscid=54, fs_name=cephfs}, peers=, r=-2
    -7> 2023-12-28T17:51:26.908+0000 7fdc3ff6f700 -1 asok(0x5651f2796000) AdminSocket: error writing response length (32) Broken pipe
    -6> 2023-12-28T17:51:26.910+0000 7fdc31752700 20 cephfs::mirror::InstanceWatcher handle_create_instance: r=0
    -5> 2023-12-28T17:51:26.910+0000 7fdc31752700 20 cephfs::mirror::InstanceWatcher register_watcher
    -4> 2023-12-28T17:51:26.910+0000 7fdc31752700 20 cephfs::mirror::Watcher register_watch
    -3> 2023-12-28T17:51:26.911+0000 7fdc31f53700 20 cephfs::mirror::Watcher handle_register_watch: r=0
    -2> 2023-12-28T17:51:26.911+0000 7fdc31f53700 20 cephfs::mirror::InstanceWatcher handle_register_watcher: r=0
    -1> 2023-12-28T17:51:26.911+0000 7fdc31f53700 20 cephfs::mirror::FSMirror handle_init_instance_watcher: r=0
     0> 2023-12-28T17:51:26.912+0000 7fdc3cf69700 -1 *** Caught signal (Segmentation fault) **
 in thread 7fdc3cf69700 thread_name:safe_timer

 ceph version 16.2.14-417-gc5564c79 (c5564c7988cbaadc3382253af5843a8595347c2d) pacific (stable)
 1: /lib64/libpthread.so.0(+0x12ce0) [0x7fdc4476ace0]
 2: __pthread_mutex_lock()
 3: (std::mutex::lock()+0x17) [0x5651f0710357]
 4: (cephfs::mirror::Mirror::update_fs_mirrors()+0x827) [0x5651f070e7d7]
 5: (Context::complete(int)+0xd) [0x5651f070f6ed]
 6: (CommonSafeTimer<std::mutex>::timer_thread()+0x10f) [0x7fdc4563d65f]
 7: (CommonSafeTimerThread<std::mutex>::entry()+0x11) [0x7fdc4563e9f1]
 8: /lib64/libpthread.so.0(+0x81cf) [0x7fdc447601cf]
 9: clone()
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Steps to Reproduce:
1.
2.
3.

Actual results:
The ceph-client.mirror logs show the backtrace above.

Expected results:


Additional info:
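The guard described in the Doc Text can be illustrated with a minimal C++ sketch. This is not the actual Ceph patch: Watcher and FSMirrorSketch are hypothetical, simplified stand-ins for InstanceWatcher/MirrorWatcher and FSMirror, and the helper methods are assumptions made for illustration. The sketch assumes the watchers are created one after another during init, so a periodic caller such as update_fs_mirrors() may run is_failed() while one or both pointers are still null.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for InstanceWatcher / MirrorWatcher.
class Watcher {
public:
  bool is_failed() {
    // Locking a member mutex dereferences `this`; calling this through a
    // null pointer is undefined behavior and typically segfaults here.
    std::lock_guard<std::mutex> locker(m_lock);
    return m_failed;
  }
  void set_failed() {
    std::lock_guard<std::mutex> locker(m_lock);
    m_failed = true;
  }

private:
  std::mutex m_lock;
  bool m_failed = false;
};

// Hypothetical stand-in for FSMirror; only the parts needed for the guard.
class FSMirrorSketch {
public:
  // Assumed init steps: the watchers are created one at a time.
  void init_instance_watcher() {
    std::lock_guard<std::mutex> locker(m_lock);
    m_instance_watcher = std::make_unique<Watcher>();
  }
  void init_mirror_watcher() {
    std::lock_guard<std::mutex> locker(m_lock);
    m_mirror_watcher = std::make_unique<Watcher>();
  }
  // Test helper (assumption): mark the mirror watcher as failed if it exists.
  void fail_mirror_watcher() {
    std::lock_guard<std::mutex> locker(m_lock);
    if (m_mirror_watcher) {
      m_mirror_watcher->set_failed();
    }
  }

  // Guarded check: each watcher is consulted only if it has been created.
  bool is_failed() {
    std::lock_guard<std::mutex> locker(m_lock);
    return m_init_failed ||
           (m_instance_watcher && m_instance_watcher->is_failed()) ||
           (m_mirror_watcher && m_mirror_watcher->is_failed());
  }

private:
  std::mutex m_lock;
  bool m_init_failed = false;
  std::unique_ptr<Watcher> m_instance_watcher;  // null until initialized
  std::unique_ptr<Watcher> m_mirror_watcher;    // null until initialized
};
```

With this guard, calling is_failed() on a partially initialized FSMirrorSketch returns false instead of dereferencing a null watcher. Without the null checks, the same call would reach Watcher::is_failed() through a null pointer, which would be consistent with the std::mutex::lock() frame in the backtrace above.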

Comment 1 Venky Shankar 2024-05-22 07:29:21 UTC
Oh, I see this BZ is for the 7.0 z-stream. Moving that back.

Comment 7 Red Hat Bugzilla 2025-06-06 04:25:03 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days