(In reply to Sunil Angadi from comment #4)
...
>
> Tried the test mentioned here
> https://tracker.ceph.com/issues/61607#note-2
>
> ran the script for more than 3 hours,
>
> also used this upstream script https://github.com/ceph/ceph/pull/53535
> to test it on downstream Ceph clusters
>
> + sleep 10
> ++ ceph mgr dump
> ++ jq '.active_clients[]'
> ++ jq 'select(.name == "rbd_support")'
> ++ jq -r '[.addrvec[0].addr, "/", .addrvec[0].nonce|tostring] | add'
> + CLIENT_ADDR=10.8.131.14:0/3677632432
> ++ date +%s
> + CURRENT_TIME=1700653795
> + (( CURRENT_TIME <= END_TIME ))
> + [[ -n 10.8.131.14:0/3677632432 ]]
> + [[ 10.8.131.14:0/3677632432 != \1\0\.\8\.\1\3\1\.\1\4\:\0\/\3\6\7\7\6\3\2\4\3\2 ]]
> + sleep 10
>
> Tried it on multiple snapshot scheduling intervals like 1m, 3m, 5m
> while initiating the client blocklist script
> able to run IO continuously
>

Hey Sunil, this is great. Can you let me know how many times you ran the script for 3 hours?

> Each time observed that the rbd_support client was recovered and
> mirror snapshots got created as per the schedule
> ...
>
> @Raman,
> seen some logs in mgr as below,
> can you please confirm these are expected due to the client blocklist test?
>
> 2023-11-22T10:14:02.085+0000 7fe573234640 -1 librbd::io::AioCompletion: 0x55d8d209fce0 fail: (108) Cannot send after transport endpoint shutdown
> 2023-11-22T10:14:02.085+0000 7fe573a35640 -1 librbd::watcher::RewatchRequest: 0x55d8ce46b810 handle_unwatch client blocklisted
> 2023-11-22T10:14:02.085+0000 7fe573a35640 -1 librbd::watcher::RewatchRequest: 0x55d8d068a5a0 handle_unwatch client blocklisted
> 2023-11-22T10:14:02.085+0000 7fe573a35640 -1 librbd::ImageWatcher: 0x55d8cd5e0c00 image watch failed: 94389613608960, (107) Transport endpoint is not connected
> 2023-11-22T10:14:02.085+0000 7fe573234640 -1 librbd::managed_lock::BreakRequest: 0x55d8cca16500 handle_break_lock: failed to break lock: (108) Cannot send after transport endpoint shutdown
> 2023-11-22T10:14:02.085+0000 7fe573a35640 -1 librbd::Watcher: 0x55d8cd5e0c00 handle_error: handle=94389613608960: (107) Transport endpoint is not connected
> 2023-11-22T10:14:02.085+0000 7fe573234640 -1 librbd::managed_lock::AcquireRequest: 0x55d8d0000690 handle_break_lock: failed to break lock : (108) Cannot send after transport endpoint shutdown
> 2023-11-22T10:14:02.085+0000 7fe573234640 -1 librbd::ManagedLock: 0x55d8d0531bb8 handle_acquire_lock: failed to acquire exclusive lock: (108) Cannot send after transport endpoint shutdown
> 2023-11-22T10:14:02.085+0000 7fe573a35640 -1 librbd::ImageWatcher: 0x55d8cdcb6900 image watch failed: 94389702976512, (107) Transport endpoint is not connected
> 2023-11-22T10:14:02.085+0000 7fe573a35640 -1 librbd::Watcher: 0x55d8cdcb6900 handle_error: handle=94389702976512: (107) Transport endpoint is not connected
> 2023-11-22T10:14:02.085+0000 7fe573a35640 -1 librbd::mirror::snapshot::CreatePrimaryRequest: 0x55d8c7c1a820 handle_create_snapshot: failed to create mirror snapshot: (108) Cannot send after transport endpoint shutdown
> 2023-11-22T10:14:02.085+0000 7fe573a35640 -1 librbd::io::AioCompletion: 0x55d8d1758580 fail: (108) Cannot send after transport endpoint shutdown
> 2023-11-22T10:14:02.085+0000 7fe573234640 -1 librbd::ImageWatcher: 0x55d8cdb3e300 image watch failed: 94389554526848, (107) Transport endpoint is not connected
> 2023-11-22T10:14:02.085+0000 7fe573234640 -1 librbd::Watcher: 0x55d8cdb3e300 handle_error: handle=94389554526848: (107) Transport endpoint is not connected
> 2023-11-22T10:14:02.085+0000 7fe573234640 -1 librbd::watcher::RewatchRequest: 0x55d8ce223810 handle_unwatch client blocklisted
> 2023-11-22T10:14:02.085+0000 7fe573234640 -1 librbd::watcher::RewatchRequest: 0x55d8d3f448c0 handle_unwatch client blocklisted
> 2023-11-22T10:14:02.085+0000 7fe573a35640 -1 librbd::image::OpenRequest: failed to retrieve name: (108) Cannot send after transport endpoint shutdown
> 2023-11-22T10:14:02.085+0000 7fe573234640 -1 librbd::ImageState: 0x55d8d193db80 failed to open image: (108) Cannot send after transport endpoint shutdown
> 2023-11-22T10:14:02.087+0000 7fe573a35640 -1 librbd::image::OpenRequest: failed to retrieve name: (108) Cannot send after transport endpoint shutdown
> 2023-11-22T10:14:02.087+0000 7fe573234640 -1 librbd::ImageState: 0x55d8cad28200 failed to open image: (108) Cannot send after transport endpoint shutdown
> 2023-11-22T10:14:02.089+0000 7fe573a35640 -1 librbd::image::OpenRequest: failed to retrieve name: (108) Cannot send after transport endpoint shutdown

Yes, this is expected. You're seeing the result of blocklisting the rbd_support module's client (by the script); the client is no longer able to connect to the cluster.
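For anyone reproducing this later, the `set -x` trace above corresponds to a loop along these lines. This is only a rough sketch reconstructed from the trace, not the actual upstream test (that lives in the PR linked above); the variable names TEST_DURATION / PREV_CLIENT_ADDR and the exact placement of the `ceph osd blocklist add` step are my assumptions:

    #!/usr/bin/env bash
    # Sketch of the rbd_support client blocklist/recovery test loop,
    # reconstructed from the set -x trace above. Not the upstream script.
    set -ex

    TEST_DURATION=$((3 * 60 * 60))              # ~3 hours, as in the run above
    END_TIME=$(( $(date +%s) + TEST_DURATION ))
    PREV_CLIENT_ADDR=""

    while true; do
        # Address (ip:port/nonce) of the rbd_support module's RADOS client,
        # taken from the mgr map.
        CLIENT_ADDR=$(ceph mgr dump |
            jq '.active_clients[]' |
            jq 'select(.name == "rbd_support")' |
            jq -r '[.addrvec[0].addr, "/", .addrvec[0].nonce|tostring] | add')

        CURRENT_TIME=$(date +%s)
        (( CURRENT_TIME <= END_TIME )) || break

        if [[ -n "${CLIENT_ADDR}" && "${CLIENT_ADDR}" != "${PREV_CLIENT_ADDR}" ]]; then
            # A new address means the module registered a fresh client, either
            # at startup or after recovering from the previous blocklist.
            # Blocklist it to force another recovery cycle (assumed step).
            ceph osd blocklist add "${CLIENT_ADDR}"
            PREV_CLIENT_ADDR="${CLIENT_ADDR}"
        fi

        sleep 10
    done

While this runs, snapshot schedules (1m/3m/5m) and client IO continue in parallel, and the mgr log errors quoted above are emitted each time the module's client gets blocklisted and reconnects.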
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: Red Hat Ceph Storage 6.1 security, enhancements, and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:7740