Description of problem: Configured two clusters with two way snapshot based mirroring with both clusters hosting primary images with delay in between the clusters simulated using qdisc. Images were observed for days for snapshot scheduling with multiple build to build upgrades. While testing blocklist recovery fix,tried blocklisting for a an hour in loop. Later observed that the snapshot schedule had stopped for all the primary images in one of the clusters. Upon observation, it appeared like client blocklist had got recovered but there seemed MirrorSnapshotScheduleHandler.shutdown might have got hung possibly in wait_for_pending() Version-Release number of selected component (if applicable): 17.2.6-69 How reproducible: Tried once on 22 images, snapshot scheduling has got affected on all images as handler might have got affected. Steps to Reproduce: Configure snapshot based mirroring at image level, add rbd_support client blocklisting in loop and observe snapshot schedule. Actual results: Snapshot schedule got stuck stalling mirroring Expected results: Snapshot scheduling must be intact Additional info:
Vasishta's test: 22 images were scheduled to have their mirror-snapshots created once every 3 minutes. rbd_support module's rados client was added to the blocklist every 5 seconds. Observation: Close to 50 minutes into this test, the rbd_support module failed to recover from blocklisting. The rbd_support module was able to recover from blocklisting until then. It was observed that the MirrorSnapshotScheduleHandler failed to shutdown properly, and this stopped the rbd_support module recovery to proceed further. The handler shutdown got stuck waiting for all the pending snapshot creation requests to fail or finish. Initial analysis of the failed recovery: Snapshot requests were created at 2023-05-29T18:51:00 ( identified by log message "CreateSnapshotRequests.add") for all the 22 images. The add requests started at 2023-05-29T18:50:59.999+0000 and ended at 2023-05-29T18:51:00.021+0000. First sign of blocklisting occurred at 2023-05-29T18:51:00.262+0000 , librbd::watcher::RewatchRequest: 0x55fa591f4910 handle_unwatch r=-108 . After blocklisting, it was observed that create snapshot requests for 20 images finished (1 completed and 19 errored out). Only create snapshot requests for 2 images ( "2//1231886181b19" and "2//17109c5f76935") failed to finish. The failure of these 2 snapshot requests to finish must have blocked the shutdown of the MirrorSnapshotScheduleHandler. I need to dig deeper to figure out why these 2 requests failed to error out or complete.
Of the 22 images whose snapshots were scheduled, 12 errored out when opening the image for snapshot, 8 errored out when creating a snapshot for the image, and 2 failed to error out (that caused the recovery to hang). Please note the correction in https://bugzilla.redhat.com/show_bug.cgi?id=2211290#c3 , where I mentioned that 1 mirror snapshot creation completed. For the 2 scheduled mirror snapshot operations that failed to error out, I see "librbd::api::Mirror: image_snapshot_create" and later "librbd::ImageWatcher: 0x55fa64aa8f00 remote snap_create request". However, for a mirror snapshot schedule that errored out when creating a snapshot, I see "librbd::api::Mirror: image_snapshot_create" followed by "librbd::mirror::snapshot::CreatePrimaryRequest: 0x55fa5955e9c0 create_snapshot ", "librbd::Operations: 0x55fa5e73c7e0 execute_snap_create", and then "[rbd_support ERROR root] error when creating snapshot for 2//14a6a474a58d0: -108". Please see below for detailed log snippets. Log snippet for mirror snapshot schedule that *did not error out* and caused the rbd_support module recovery to fail, ``` 2023-05-29T18:51:00.008+0000 7fb11add0640 0 [rbd_support DEBUG root] CreateSnapshotRequests.handle_get_mirror_info 2//1231886181b19: r=0 info={'global_id': '0d8be64e-956a-42be-817c-b1945343b921', 'state': 1, 'primary': True} 2023-05-29T18:51:00.008+0000 7fb11add0640 0 [rbd_support DEBUG root] CreateSnapshotRequests.create_snapshot for 2//1231886181b19 2023-05-29T18:51:00.008+0000 7fb11add0640 20 librbd::api::Mirror: image_snapshot_create: ictx=0x55fa61151000 2023-05-29T18:51:00.008+0000 7fb11add0640 20 librbd::mirror::GetInfoRequest: 0x55fa5beb12c0 get_mirror_image: 2023-05-29T18:51:00.008+0000 7fb11add0640 10 librbd::image::RefreshRequest: 0x55fa5d8e3880 handle_v2_get_group: r=0 2023-05-29T18:51:00.008+0000 7fb11add0640 10 librbd::image::RefreshRequest: 0x55fa5d8e3880 send_v2_get_snapshots 2023-05-29T18:51:00.008+0000 7fb11add0640 10 librbd::image::OpenRequest: handle_v2_get_initial_metadata: r=0 2023-05-29T18:51:00.008+0000 7fb11add0640 10 librbd::image::OpenRequest: 0x55fa5d9bb840 send_v2_get_create_timestamp 2023-05-29T18:51:00.008+0000 7fb11add0640 20 librbd::mirror::GetInfoRequest: 0x55fa5beb1980 handle_get_mirror_image: r=0 2023-05-29T18:51:00.008+0000 7fb11add0640 10 librbd::mirror::GetInfoRequest: 0x55fa5beb1980 calc_promotion_state: promotion_state=1, primary_mirror_uuid= 2023-05-29T18:51:00.008+0000 7fb11add0640 20 librbd::mirror::GetInfoRequest: 0x55fa5beb1980 finish: r=0 2023-05-29T18:51:00.008+0000 7fb11add0640 20 librbd: C_AioCompletion::finish: r=0 ... 2023-05-29T18:51:00.010+0000 7fb11add0640 10 librbd::ImageWatcher: 0x55fa64aa8f00 remote snap_create request: [342496,94533739825408,15101] [mirror state=primary, complete=0, mirror_peer_uuids=52d86adc-3b8e-4aba-b0d2-9818b7f83340, clean_since_snap_id=head] .mirror.primary.0d8be64e-956a-42be-817c-b1945343b921.cf193596-7096-4d58-bd60-a6727703a3c5 0 2023-05-29T18:51:00.010+0000 7fb11add0640 20 librbd::ExclusiveLock: 0x55fa59570ea0 accept_request=0 (request_type=0) 2023-05-29T18:51:00.010+0000 7fb11add0640 10 librbd::Watcher::C_NotifyAck 0x55fa594dd4a0 finish: r=0 2023-05-29T18:51:00.010+0000 7fb11b5d1640 20 librbd::mirror::GetInfoRequest: 0x55fa5beb0f00 handle_get_mirror_image: r=0 2023-05-29T18:51:00.010+0000 7fb11b5d1640 10 librbd::mirror::GetInfoRequest: 0x55fa5beb0f00 calc_promotion_state: promotion_state=1, primary_mirror_uuid= 2023-05-29T18:51:00.010+0000 7fb11b5d1640 20 librbd::mirror::GetInfoRequest: 0x55fa5beb0f00 finish: r=0 2023-05-29T18:51:00.010+0000 7fb11b5d1640 20 librbd: C_AioCompletion::finish: r=0 ``` Log snippet for a mirror snapshot schedule that errored out when creating a snapshot as expected, ``` 2023-05-29T18:51:00.008+0000 7fb11add0640 0 [rbd_support DEBUG root] CreateSnapshotRequests.handle_get_mirror_info 2//14a6a474a58d0: r=0 info={'global_id': '242275a6-bedc-4aac-aa5d-dde6a01eab32', 'state': 1, 'primary': True} 2023-05-29T18:51:00.008+0000 7fb11add0640 0 [rbd_support DEBUG root] CreateSnapshotRequests.create_snapshot for 2//14a6a474a58d0 2023-05-29T18:51:00.008+0000 7fb11add0640 20 librbd::api::Mirror: image_snapshot_create: ictx=0x55fa60969000 2023-05-29T18:51:00.008+0000 7fb11add0640 20 librbd::mirror::GetInfoRequest: 0x55fa5beb1980 get_mirror_image: 2023-05-29T18:51:00.008+0000 7fb11add0640 10 librbd::image::RefreshRequest: 0x55fa5d8e3880 handle_v2_get_snapshots: r=0 2023-05-29T18:51:00.008+0000 7fb11add0640 10 librbd::image::RefreshRequest: 0x55fa5d8e3880 send_v2_init_exclusive_lock 2023-05-29T18:51:00.008+0000 7fb11add0640 10 librbd::ExclusiveLock: 0x55fa51e82000 init: features=61 2023-05-29T18:51:00.008+0000 7fb11add0640 5 librbd::io::Dispatcher: 0x55fa5e73d320 register_dispatch: dispatch_layer=3 2023-05-29T18:51:00.008+0000 7fb11add0640 20 librbd::exclusive_lock::ImageDispatch: 0x55fa594b21c0 set_require_lock: direction=1, enabled=1 2023-05-29T18:51:00.008+0000 7fb11add0640 20 librbd::io::AsyncOperation: 0x55fa5bb6bdf0 start_op ... 2023-05-29T18:51:00.009+0000 7fb11add0640 15 librbd::mirror::snapshot::CreatePrimaryRequest: 0x55fa5955e9c0 handle_get_mirror_peers: r=0 2023-05-29T18:51:00.009+0000 7fb11add0640 15 librbd::mirror::snapshot::CreatePrimaryRequest: 0x55fa5955e9c0 create_snapshot: name=.mirror.primary.242275a6-bedc-4aac-aa5d-dde6a01eab32.8b6baca8-5d16-4fb9-8f83-eda964f303e4, ns=[mirror state=primary, complete=0, mirror_peer_uuids=52d86adc-3b8e-4aba-b0d2-9818b7f83340, clean_since_snap_id=head] 2023-05-29T18:51:00.009+0000 7fb11add0640 5 librbd::Operations: 0x55fa5e73c7e0 snap_create: snap_name=.mirror.primary.242275a6-bedc-4aac-aa5d-dde6a01eab32.8b6baca8-5d16-4fb9-8f83-eda964f303e4 ... 2023-05-29T18:51:00.010+0000 7fb11add0640 20 librbd::Operations: start_op: snap create 0x55fa59006710 2023-05-29T18:51:00.010+0000 7fb11add0640 20 librbd::Operations: start snap create 0x55fa59006710 2023-05-29T18:51:00.010+0000 7fb11add0640 20 librbd::Operations: execute_local_request 2023-05-29T18:51:00.010+0000 7fb11add0640 20 librbd::ManagedLock: 0x55fa51ea4018 is_lock_owner: =1 2023-05-29T18:51:00.010+0000 7fb11b5d1640 10 librbd::Watcher: 0x55fa5d493800 handle_register_watch: r=0 2023-05-29T18:51:00.010+0000 7fb11add0640 5 librbd::Operations: 0x55fa5e73c7e0 execute_snap_create: snap_name=.mirror.primary.242275a6-bedc-4aac-aa5d-dde6a01eab32.8b6baca8-5d16-4fb9-8f83-eda964f303e4 2023-05-29T18:51:00.010+0000 7fb11b5d1640 10 librbd::image::OpenRequest: 0x55fa5d9bb840 handle_register_watch: r=0 2023-05-29T18:51:00.010+0000 7fb11add0640 5 librbd::SnapshotCreateRequest: 0x55fa590c6000 send_notify_quiesce ... 2023-05-29T18:51:00.010+0000 7fb11b5d1640 20 librbd: C_OpenComplete::finish: r=0 2023-05-29T18:51:00.010+0000 7fb11b5d1640 20 librbd: C_AioCompletion::finish: r=0 ... 2023-05-29T18:51:02.612+0000 7fb11b5d1640 0 [rbd_support DEBUG root] CreateSnapshotRequests.handle_create_snapshot for 2//14a6a474a58d0: r=-108, snap_id=None 2023-05-29T18:51:02.612+0000 7fb11b5d1640 0 [rbd_support ERROR root] error when creating snapshot for 2//14a6a474a58d0: -108 2023-05-29T18:51:02.612+0000 7fb11b5d1640 0 [rbd_support DEBUG root] CreateSnapshotRequests.close_image 2//14a6a474a58d0 ```
Created attachment 1969296 [details] mgr logs around when rbd_support module recovery failed on client blocklisting The mgr logs around when the rbd_support module failed to recover are in "2023-05-29T18:51.txt" . The file is used to compare the logs of images scheduled for snapshots, images whose snapshots creation errored out with those whose snapshot creation didn't on the rados client being blocklisted. Below are log snippets of images scheduled for snapshots whose snapshot creation errored out as expected on client being blocklisted. ``` $ grep -E '171317162a276|a92d6ef8-68ce-4a33-b9d8-916198f12469' 2023-05-29T18\:51.txt 2023-05-29T18:51:00.006+0000 7fb0fa44f640 0 [rbd_support DEBUG root] CreateSnapshotRequests.add: 2//171317162a276 2023-05-29T18:51:00.006+0000 7fb0fa44f640 0 [rbd_support DEBUG root] CreateSnapshotRequests.open_image: 2//171317162a276 2023-05-29T18:51:00.008+0000 7fb0fa44f640 10 librbd::ImageCtx: 0x55fa6610e000 ImageCtx: image_name=, image_id=171317162a276 2023-05-29T18:51:00.008+0000 7fb0fa44f640 0 [rbd_support DEBUG root] MirrorSnapshotScheduleHandler: scheduling 2//171317162a276 at 2023-05-29 18:54:00 2023-05-29T18:51:00.009+0000 7fb11add0640 10 librbd::ImageCtx: init_layout stripe_unit 4194304 stripe_count 1 object_size 4194304 prefix rbd_data.171317162a276 format rbd_data.171317162a276.%016llx 2023-05-29T18:51:00.010+0000 7fb11b5d1640 20 librbd::image::RefreshRequest: new snapshot id=484290 name=.mirror.primary.a92d6ef8-68ce-4a33-b9d8-916198f12469.8e9f4188-eb6b-4af8-8627-f93403afd0c5 size=107374182400 2023-05-29T18:51:00.010+0000 7fb11b5d1640 20 librbd::image::RefreshRequest: new snapshot id=484108 name=.mirror.primary.a92d6ef8-68ce-4a33-b9d8-916198f12469.c9ef235d-c490-4604-a11a-6930f8f8daad size=107374182400 2023-05-29T18:51:00.010+0000 7fb11b5d1640 0 [rbd_support DEBUG root] CreateSnapshotRequests.handle_open_image 2//171317162a276: r=0 2023-05-29T18:51:00.010+0000 7fb11b5d1640 0 [rbd_support DEBUG root] CreateSnapshotRequests.get_mirror_mode: 2//171317162a276 2023-05-29T18:51:00.011+0000 7fb11b5d1640 0 [rbd_support DEBUG root] CreateSnapshotRequests.handle_get_mirror_mode 2//171317162a276: r=0 mode=1 2023-05-29T18:51:00.011+0000 7fb11b5d1640 0 [rbd_support DEBUG root] CreateSnapshotRequests.get_mirror_info: 2//171317162a276 2023-05-29T18:51:00.011+0000 7fb11b5d1640 0 [rbd_support DEBUG root] CreateSnapshotRequests.handle_get_mirror_info 2//171317162a276: r=0 info={'global_id': 'a92d6ef8-68ce-4a33-b9d8-916198f12469', 'state': 1, 'primary': True} 2023-05-29T18:51:00.011+0000 7fb11b5d1640 0 [rbd_support DEBUG root] CreateSnapshotRequests.create_snapshot for 2//171317162a276 2023-05-29T18:51:00.011+0000 7fb11add0640 15 librbd::mirror::snapshot::CreatePrimaryRequest: 0x55fa5955fd40 create_snapshot: name=.mirror.primary.a92d6ef8-68ce-4a33-b9d8-916198f12469.b733b096-4479-446d-af38-96e4bf4dd3d1, ns=[mirror state=primary, complete=0, mirror_peer_uuids=52d86adc-3b8e-4aba-b0d2-9818b7f83340, clean_since_snap_id=head] 2023-05-29T18:51:00.011+0000 7fb11add0640 5 librbd::Operations: 0x55fa5e73d3b0 snap_create: snap_name=.mirror.primary.a92d6ef8-68ce-4a33-b9d8-916198f12469.b733b096-4479-446d-af38-96e4bf4dd3d1 2023-05-29T18:51:00.012+0000 7fb11b5d1640 10 librbd::object_map::RefreshRequest: 0x55fa5b547860 send_lock: oid=rbd_object_map.171317162a276 2023-05-29T18:51:00.012+0000 7fb11b5d1640 10 librbd::object_map::LockRequest: 0x55fa5dc66fc0 send_lock: oid=rbd_object_map.171317162a276 2023-05-29T18:51:00.012+0000 7fb11b5d1640 10 librbd::object_map::RefreshRequest: 0x55fa5b547860 send_load: oid=rbd_object_map.171317162a276 2023-05-29T18:51:00.013+0000 7fb11b5d1640 5 librbd::Operations: 0x55fa5e73d3b0 execute_snap_create: snap_name=.mirror.primary.a92d6ef8-68ce-4a33-b9d8-916198f12469.b733b096-4479-446d-af38-96e4bf4dd3d1 2023-05-29T18:51:02.613+0000 7fb11b5d1640 0 [rbd_support DEBUG root] CreateSnapshotRequests.handle_create_snapshot for 2//171317162a276: r=-108, snap_id=None 2023-05-29T18:51:02.613+0000 7fb11b5d1640 0 [rbd_support ERROR root] error when creating snapshot for 2//171317162a276: -108 2023-05-29T18:51:02.613+0000 7fb11b5d1640 0 [rbd_support DEBUG root] CreateSnapshotRequests.close_image 2//171317162a276 2023-05-29T18:51:02.613+0000 7fb11b5d1640 10 librbd::object_map::UnlockRequest: 0x55fa63e074d0 send_unlock: oid=rbd_object_map.171317162a276 2023-05-29T18:51:02.616+0000 7fb1045a3640 0 [rbd_support DEBUG root] CreateSnapshotRequests.handle_close_image 2//171317162a276: r=0 2023-05-29T18:51:02.616+0000 7fb1045a3640 0 [rbd_support DEBUG root] CreateSnapshotRequests.finish: 2//171317162a276 ``` and ``` $ grep -E '1bded98edb7a0|35496098-d456-4cf9-940f-fb67aac15d69' 2023-05-29T18\:51.txt 2023-05-29T18:51:00.008+0000 7fb0fa44f640 0 [rbd_support DEBUG root] CreateSnapshotRequests.add: 2//1bded98edb7a0 2023-05-29T18:51:00.008+0000 7fb0fa44f640 0 [rbd_support DEBUG root] CreateSnapshotRequests.open_image: 2//1bded98edb7a0 2023-05-29T18:51:00.010+0000 7fb0fa44f640 10 librbd::ImageCtx: 0x55fa5d37a000 ImageCtx: image_name=, image_id=1bded98edb7a0 2023-05-29T18:51:00.010+0000 7fb0fa44f640 0 [rbd_support DEBUG root] MirrorSnapshotScheduleHandler: scheduling 2//1bded98edb7a0 at 2023-05-29 18:54:00 2023-05-29T18:51:00.011+0000 7fb11b5d1640 10 librbd::ImageCtx: init_layout stripe_unit 4194304 stripe_count 1 object_size 4194304 prefix rbd_data.1bded98edb7a0 format rbd_data.1bded98edb7a0.%016llx 2023-05-29T18:51:00.012+0000 7fb11add0640 20 librbd::image::RefreshRequest: new snapshot id=484296 name=.mirror.primary.35496098-d456-4cf9-940f-fb67aac15d69.a32de766-e1b4-40f2-8dd1-208d31e27d23 size=107374182400 2023-05-29T18:51:00.012+0000 7fb11add0640 20 librbd::image::RefreshRequest: new snapshot id=484114 name=.mirror.primary.35496098-d456-4cf9-940f-fb67aac15d69.21c93a23-c7af-4d50-9ee7-8692ef5e6ebb size=107374182400 2023-05-29T18:51:00.012+0000 7fb11add0640 0 [rbd_support DEBUG root] CreateSnapshotRequests.handle_open_image 2//1bded98edb7a0: r=0 2023-05-29T18:51:00.012+0000 7fb11add0640 0 [rbd_support DEBUG root] CreateSnapshotRequests.get_mirror_mode: 2//1bded98edb7a0 2023-05-29T18:51:00.013+0000 7fb11add0640 0 [rbd_support DEBUG root] CreateSnapshotRequests.handle_get_mirror_mode 2//1bded98edb7a0: r=0 mode=1 2023-05-29T18:51:00.013+0000 7fb11add0640 0 [rbd_support DEBUG root] CreateSnapshotRequests.get_mirror_info: 2//1bded98edb7a0 2023-05-29T18:51:00.013+0000 7fb11add0640 0 [rbd_support DEBUG root] CreateSnapshotRequests.handle_get_mirror_info 2//1bded98edb7a0: r=0 info={'global_id': '35496098-d456-4cf9-940f-fb67aac15d69', 'state': 1, 'primary': True} 2023-05-29T18:51:00.013+0000 7fb11add0640 0 [rbd_support DEBUG root] CreateSnapshotRequests.create_snapshot for 2//1bded98edb7a0 2023-05-29T18:51:00.013+0000 7fb11add0640 15 librbd::mirror::snapshot::CreatePrimaryRequest: 0x55fa5955f6c0 create_snapshot: name=.mirror.primary.35496098-d456-4cf9-940f-fb67aac15d69.6b6c5a31-828e-48c7-8ebe-c017679df345, ns=[mirror state=primary, complete=0, mirror_peer_uuids=52d86adc-3b8e-4aba-b0d2-9818b7f83340, clean_since_snap_id=head] 2023-05-29T18:51:00.013+0000 7fb11add0640 5 librbd::Operations: 0x55fa5e73d560 snap_create: snap_name=.mirror.primary.35496098-d456-4cf9-940f-fb67aac15d69.6b6c5a31-828e-48c7-8ebe-c017679df345 2023-05-29T18:51:00.014+0000 7fb11add0640 10 librbd::ImageWatcher: 0x55fa650cdb00 remote snap_create request: [342496,94533745813632,15107] [mirror state=primary, complete=0, mirror_peer_uuids=52d86adc-3b8e-4aba-b0d2-9818b7f83340, clean_since_snap_id=head] .mirror.primary.35496098-d456-4cf9-940f-fb67aac15d69.6b6c5a31-828e-48c7-8ebe-c017679df345 0 2023-05-29T18:51:00.314+0000 7fb11b5d1640 0 [rbd_support DEBUG root] CreateSnapshotRequests.handle_create_snapshot for 2//1bded98edb7a0: r=-108, snap_id=None 2023-05-29T18:51:00.314+0000 7fb11b5d1640 0 [rbd_support ERROR root] error when creating snapshot for 2//1bded98edb7a0: -108 2023-05-29T18:51:00.314+0000 7fb11b5d1640 0 [rbd_support DEBUG root] CreateSnapshotRequests.close_image 2//1bded98edb7a0 2023-05-29T18:51:00.314+0000 7fb1045a3640 0 [rbd_support DEBUG root] CreateSnapshotRequests.handle_close_image 2//1bded98edb7a0: r=0 2023-05-29T18:51:00.314+0000 7fb1045a3640 0 [rbd_support DEBUG root] CreateSnapshotRequests.finish: 2//1bded98edb7a0 ``` Following is the log snippet of the image that was scheduled for a mirror snapshot. On rbd_support module's client being blocklisted, the image's snapshot creation did not error out. This stalled the recovery of the rbd_support module. ``` $ grep -E '1231886181b19|0d8be64e-956a-42be-817c-b1945343b921' 2023-05-29T18\:51.txt 2023-05-29T18:50:59.999+0000 7fb0fa44f640 0 [rbd_support DEBUG root] CreateSnapshotRequests.add: 2//1231886181b19 2023-05-29T18:50:59.999+0000 7fb0fa44f640 0 [rbd_support DEBUG root] CreateSnapshotRequests.open_image: 2//1231886181b19 2023-05-29T18:51:00.001+0000 7fb0fa44f640 10 librbd::ImageCtx: 0x55fa61151000 ImageCtx: image_name=, image_id=1231886181b19 2023-05-29T18:51:00.001+0000 7fb0fa44f640 0 [rbd_support DEBUG root] MirrorSnapshotScheduleHandler: scheduling 2//1231886181b19 at 2023-05-29 18:54:00 2023-05-29T18:51:00.002+0000 7fb11add0640 10 librbd::ImageCtx: init_layout stripe_unit 4194304 stripe_count 1 object_size 4194304 prefix rbd_data.1231886181b19 format rbd_data.1231886181b19.%016llx 2023-05-29T18:51:00.005+0000 7fb11b5d1640 20 librbd::image::RefreshRequest: new snapshot id=484294 name=.mirror.primary.0d8be64e-956a-42be-817c-b1945343b921.a45b233b-5297-4504-8c8a-cf16fa1b5b86 size=107374182400 2023-05-29T18:51:00.005+0000 7fb11b5d1640 20 librbd::image::RefreshRequest: new snapshot id=484112 name=.mirror.primary.0d8be64e-956a-42be-817c-b1945343b921.f5fea28e-0750-4df4-90f1-66176b95b092 size=107374182400 2023-05-29T18:51:00.007+0000 7fb11add0640 0 [rbd_support DEBUG root] CreateSnapshotRequests.handle_open_image 2//1231886181b19: r=0 2023-05-29T18:51:00.007+0000 7fb11add0640 0 [rbd_support DEBUG root] CreateSnapshotRequests.get_mirror_mode: 2//1231886181b19 2023-05-29T18:51:00.008+0000 7fb11add0640 0 [rbd_support DEBUG root] CreateSnapshotRequests.handle_get_mirror_mode 2//1231886181b19: r=0 mode=1 2023-05-29T18:51:00.008+0000 7fb11add0640 0 [rbd_support DEBUG root] CreateSnapshotRequests.get_mirror_info: 2//1231886181b19 2023-05-29T18:51:00.008+0000 7fb11add0640 0 [rbd_support DEBUG root] CreateSnapshotRequests.handle_get_mirror_info 2//1231886181b19: r=0 info={'global_id': '0d8be64e-956a-42be-817c-b1945343b921', 'state': 1, 'primary': True} 2023-05-29T18:51:00.008+0000 7fb11add0640 0 [rbd_support DEBUG root] CreateSnapshotRequests.create_snapshot for 2//1231886181b19 2023-05-29T18:51:00.008+0000 7fb11add0640 15 librbd::mirror::snapshot::CreatePrimaryRequest: 0x55fa5955e4e0 create_snapshot: name=.mirror.primary.0d8be64e-956a-42be-817c-b1945343b921.cf193596-7096-4d58-bd60-a6727703a3c5, ns=[mirror state=primary, complete=0, mirror_peer_uuids=52d86adc-3b8e-4aba-b0d2-9818b7f83340, clean_since_snap_id=head] 2023-05-29T18:51:00.008+0000 7fb11add0640 5 librbd::Operations: 0x55fa5e73c630 snap_create: snap_name=.mirror.primary.0d8be64e-956a-42be-817c-b1945343b921.cf193596-7096-4d58-bd60-a6727703a3c5 2023-05-29T18:51:00.010+0000 7fb11add0640 10 librbd::ImageWatcher: 0x55fa64aa8f00 remote snap_create request: [342496,94533739825408,15101] [mirror state=primary, complete=0, mirror_peer_uuids=52d86adc-3b8e-4aba-b0d2-9818b7f83340, clean_since_snap_id=head] .mirror.primary.0d8be64e-956a-42be-817c-b1945343b921.cf193596-7096-4d58-bd60-a6727703a3c5 0 ``` Looking at these log snippets, it seems like librbd code sometimes fails to error out due to client blocklisting after remote snap_create request is issued.
Ilya, are we tracking this as a 6.1z1 blocker?
Hi Ram. Could you please confirm if this BZ needs to be added to the 6.1z2 release notes? If so, please provide the doc type and text. Thanks.
I've added the required doc text.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat Ceph Storage 6.1 security, enhancement, and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:5693