Description of problem (please be as detailed as possible and provide log snippets):

Version of all relevant components (if applicable):
ACM 2.10 GA'ed
ODF 4.15 GA'ed
ceph version 17.2.6-196.el9cp (cbbf2cfb549196ca18c0c9caff9124d83ed681a4) quincy (stable)
OCP 4.15.0-0.nightly-2024-03-24-023440
VolSync 0.9.0
Submariner 0.17 (GA'ed alongside ACM 2.10)

Does this issue impact your ability to continue to work with the product (please explain in detail what is the user impact)?

Is there any workaround available to the best of your knowledge?

Rate from 1 - 5 the complexity of the scenario you performed that caused this bug (1 - very simple, 5 - very complex)?

Is this issue reproducible?

Can this issue reproduce from the UI?
No, but a CLI workaround is available: run "oc get volumesnapshot -A | grep false" on the cluster where the workloads are primary and delete the stale VolumeSnapshots that are stuck (the older ones only). If they are not deleted, remove the finalizers from each VolumeSnapshot and its corresponding VolumeSnapshotContent (a command sketch follows the Steps to Reproduce below).

If this is a regression, please provide more details to justify this:

Steps to Reproduce:
1. Deployed 6 DR-protected RBD and 6 CephFS workloads on C1 over an RDR setup, of both Subscription and ApplicationSet types (1 each), then failed them over and relocated them (with all clusters up and running) such that they end up running on C1 again and between them cover the Deployed, FailedOver and Relocated states (check the drpc output below). For example, if busybox-1 is failed over to C2, it is failed over back to C1, and so on. We also have 4 workloads (2 RBD and 2 CephFS) on C2, which remain as-is in the Deployed state.
2. After the 2nd operation, when the workloads are finally running on C1, let IOs continue overnight and ensure data sync is progressing well.
3.
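A minimal sketch of the CLI workaround mentioned above. The namespace/snapshot names here are only illustrative (taken from the `oc get volumesnapshot` output further below); the actual stale snapshots must be identified on the cluster where the workloads are primary:

```
# Illustrative names only; pick the older, stuck snapshots from the grep output.
NS=busybox-workloads-14
SNAP=volsync-busybox-pvc-1-src

# List snapshots that never became ready
oc get volumesnapshot -A | grep false

# Record the bound VolumeSnapshotContent before deleting the snapshot
VSC=$(oc get volumesnapshot "$SNAP" -n "$NS" -o jsonpath='{.status.boundVolumeSnapshotContentName}')

# Try a normal delete of the stale snapshot first (may hang if finalizers block it)
oc delete volumesnapshot "$SNAP" -n "$NS" --wait=false

# If it stays stuck, clear the finalizers on the VolumeSnapshot and its
# bound VolumeSnapshotContent so the deletion can complete
oc patch volumesnapshot "$SNAP" -n "$NS" --type=merge -p '{"metadata":{"finalizers":null}}'
oc patch volumesnapshotcontent "$VSC" --type=merge -p '{"metadata":{"finalizers":null}}'
```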
Actual results:
The current drpc output looks like this:

```
amanagrawal@Amans-MacBook-Pro ~ % drpc
NAMESPACE              NAME                                    AGE    PREFERREDCLUSTER   FAILOVERCLUSTER   DESIREDSTATE   CURRENTSTATE   PROGRESSION   START TIME             DURATION           PEER READY
busybox-workloads-10   rbd-sub-busybox10-placement-1-drpc      2d7h   amagrawa-c1-29m    amagrawa-c2-29m   Relocate       Relocated      Completed     2024-03-30T11:46:04Z   3m34.194136561s    True
busybox-workloads-11   rbd-sub-busybox11-placement-1-drpc      2d7h   amagrawa-c1-29m                                     Deployed       Completed     2024-03-30T08:50:53Z   17.08813336s       True
busybox-workloads-12   rbd-sub-busybox12-placement-1-drpc      2d7h   amagrawa-c1-29m                                     Deployed       Completed     2024-03-30T08:53:01Z   15.048723358s      True
busybox-workloads-13   rbd-sub-busybox13-placement-1-drpc      2d7h   amagrawa-c2-29m                                     Deployed       Completed     2024-03-30T08:54:07Z   1.052097882s       True
busybox-workloads-14   cephfs-sub-busybox14-placement-1-drpc   2d7h   amagrawa-c2-29m    amagrawa-c1-29m   Failover       FailedOver     Completed     2024-03-30T11:45:33Z   2m13.87941597s     True
busybox-workloads-15   cephfs-sub-busybox15-placement-1-drpc   2d7h   amagrawa-c1-29m    amagrawa-c2-29m   Relocate       Relocated      Completed     2024-03-30T11:45:41Z   2m25.765604202s    True
busybox-workloads-16   cephfs-sub-busybox16-placement-1-drpc   2d7h   amagrawa-c1-29m                                     Deployed       Completed     2024-03-30T08:58:36Z   45.206438694s      True
busybox-workloads-17   cephfs-sub-busybox17-placement-1-drpc   2d7h   amagrawa-c2-29m                                     Deployed       Completed     2024-03-30T09:00:45Z   45.250150435s      True
busybox-workloads-9    rbd-sub-busybox9-placement-1-drpc       2d7h   amagrawa-c2-29m    amagrawa-c1-29m   Failover       FailedOver     Completed     2024-03-30T11:45:51Z   3m52.912910746s    True
openshift-gitops       cephfs-appset-busybox5-placement-drpc   2d7h   amagrawa-c2-29m    amagrawa-c1-29m   Failover       FailedOver     Completed     2024-03-30T11:45:22Z   1m54.720876193s    True
openshift-gitops       cephfs-appset-busybox6-placement-drpc   2d7h   amagrawa-c1-29m    amagrawa-c2-29m   Relocate       Relocated      Completed     2024-03-30T11:45:27Z   4m59.996397972s    True
openshift-gitops       cephfs-appset-busybox7-placement-drpc   2d7h   amagrawa-c1-29m                                     Deployed       Completed     2024-03-30T08:33:50Z   45.322308361s      True
openshift-gitops       cephfs-appset-busybox8-placement-drpc   2d7h   amagrawa-c2-29m                                     Deployed       Completed     2024-03-30T08:34:58Z   51.174493729s      True
openshift-gitops       rbd-appset-busybox1-placement-drpc      2d7h   amagrawa-c2-29m    amagrawa-c1-29m   Failover       FailedOver     Completed     2024-03-30T11:46:12Z   4m35.129560857s    True
openshift-gitops       rbd-appset-busybox2-placement-drpc      2d7h   amagrawa-c1-29m    amagrawa-c2-29m   Relocate       Relocated      Completed     2024-03-30T11:46:15Z   10m42.195053331s   True
openshift-gitops       rbd-appset-busybox3-placement-drpc      2d7h   amagrawa-c1-29m                                     Deployed       Completed     2024-03-30T08:29:02Z   19.120449331s      True
openshift-gitops       rbd-appset-busybox4-placement-drpc      2d7h   amagrawa-c2-29m                                     Deployed       Completed     2024-03-30T08:30:30Z   1.049141938s       True
```

Data sync wasn't progressing for busybox-workloads-6, 7 and 14. busybox-workloads-6 then seemed to recover, and busybox-workloads-8 seems to be impacted instead.
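For reference, one way to see which CephFS workloads are actually syncing is to check the VolSync ReplicationSource status on the cluster where they are primary; a minimal sketch using the standard VolSync status fields:

```
# On the cluster where the workloads are primary (C1 here), list the VolSync
# ReplicationSources with their last successful sync time and duration.
oc get replicationsource -A \
  -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,LASTSYNC:.status.lastSyncTime,DURATION:.status.lastSyncDuration'

# A namespace whose lastSyncTime stops advancing (e.g. busybox-workloads-14 in
# this report) points at a stuck source snapshot.
```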
```
amanagrawal@Amans-MacBook-Pro 01april24 % oc get volumesnapshot -A | grep false
busybox-workloads-14   volsync-busybox-pvc-1-src      false   busybox-pvc-1   ocs-storagecluster-cephfsplugin-snapclass   snapcontent-a42538b4-eeef-45c3-94cf-3ef9bdf4feae   28h
busybox-workloads-7    volsync-busybox-pvc-4-src      false   busybox-pvc-4   ocs-storagecluster-cephfsplugin-snapclass   snapcontent-ac363a4b-2b11-443a-b968-2a705381959d   28h
busybox-workloads-8    busybox-pvc-2-20240401030130   false   busybox-pvc-2   ocs-storagecluster-cephfsplugin-snapclass   snapcontent-e9d04116-9cd3-4021-b11d-a2066a46eb0a   10h
```

Expected results:
VolumeSnapshot -> ReadyToUse should be True for the CephFS workloads, and data sync should progress as expected while IOs are running.

Additional info:
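For reference, the readiness of the affected snapshots and any CSI-side error can be checked directly; a small sketch using the namespaces and names from the grep output above:

```
# Show READYTOUSE and creation time for the snapshots in the affected namespaces
for ns in busybox-workloads-14 busybox-workloads-7 busybox-workloads-8; do
  oc get volumesnapshot -n "$ns" \
    -o custom-columns='NAME:.metadata.name,READY:.status.readyToUse,CREATED:.status.creationTime'
done

# The bound VolumeSnapshotContent carries the CSI-side error message, if any
oc get volumesnapshotcontent snapcontent-a42538b4-eeef-45c3-94cf-3ef9bdf4feae \
  -o jsonpath='{.status.error.message}{"\n"}'
```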
This is the only read call in the cephfs client debug log that's reading data off the OSD (data pool):
```
2024-05-06T11:36:57.841121364Z debug 2024-05-06T11:36:57.840+0000 7f9c27504640 10 client.584545 _read_async 0x10000020458.head(faked_ino=0 nref=10 ll_ref=0 cap_refs={2048=1} open={1=1} mode=100640 size=158/0 nlink=1 btime=2024-04-30T20:53:25.974991+0000 mtime=2024-04-30T20:53:25.975615+0000 ctime=2024-04-30T20:53:26.386700+0000 change_attr=7 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[0x10000020458 ts 0/0 objects 0 dirty_or_tx 0] parents=0x100000066cb.head[".meta"] 0x5634ec816a00) 0~4096
2024-05-06T11:36:57.841121364Z debug 2024-05-06T11:36:57.840+0000 7f9c27504640 10 client.584545 min_bytes=131072 max_bytes=16777216 max_periods=4
2024-05-06T11:36:57.841121364Z debug 2024-05-06T11:36:57.840+0000 7f9c27504640 10 client.584545.objecter _op_submit op 0x5634ec41fe00
2024-05-06T11:36:57.841135304Z debug 2024-05-06T11:36:57.840+0000 7f9c27504640 20 client.584545.objecter _calc_target epoch 7696 base 10000020458.00000000 @12 precalc_pgid 0 pgid 0.0 is_read
2024-05-06T11:36:57.841135304Z debug 2024-05-06T11:36:57.840+0000 7f9c27504640 20 client.584545.objecter _calc_target target 10000020458.00000000 @12 -> pgid 12.53bb8e39
2024-05-06T11:36:57.841140932Z debug 2024-05-06T11:36:57.840+0000 7f9c27504640 10 client.584545.objecter _calc_target raw pgid 12.53bb8e39 -> actual 12.19 acting [0,1,2] primary 0
2024-05-06T11:36:57.841179626Z debug 2024-05-06T11:36:57.840+0000 7f9c27504640 20 client.584545.objecter _get_session s=0x5634ec88e480 osd=0 3
2024-05-06T11:36:57.841179626Z debug 2024-05-06T11:36:57.840+0000 7f9c27504640 10 client.584545.objecter _op_submit oid 10000020458.00000000 '@12' '@12' [read 0~158] tid 1 osd.0
2024-05-06T11:36:57.841183709Z debug 2024-05-06T11:36:57.840+0000 7f9c27504640 20 client.584545.objecter get_session s=0x5634ec88e480 osd=0 3
2024-05-06T11:36:57.841183709Z debug 2024-05-06T11:36:57.840+0000 7f9c27504640 15 client.584545.objecter _session_op_assign 0 1
2024-05-06T11:36:57.841199599Z debug 2024-05-06T11:36:57.840+0000 7f9c27504640 15 client.584545.objecter _send_op 1 to 12.19 on osd.0
```
And later...
```
2024-05-06T11:37:12.813418983Z debug 2024-05-06T11:37:12.812+0000 7f9c18b67640 10 client.584545.objecter tick
2024-05-06T11:37:12.813418983Z debug 2024-05-06T11:37:12.812+0000 7f9c18b67640 2 client.584545.objecter tid 1 on osd.0 is laggy
2024-05-06T11:37:12.813418983Z debug 2024-05-06T11:37:12.812+0000 7f9c18b67640 10 client.584545.objecter _maybe_request_map subscribing (onetime) to next osd map
2024-05-06T11:37:12.813551139Z debug 2024-05-06T11:37:12.812+0000 7f9cedb7c640 20 --2- 10.131.1.11:0/1469616214 >> [v2:242.1.255.250:3300/0,v1:242.1.255.250:6789/0] conn(0x5634ec834400 0x5634ec6bf180 secure :-1 s=READY pgs=110714 cs=0 l=1 rev1=1 crypto rx=0x5634ec799e00 tx=0x5634ec535f80 comp rx=0 tx=0).write_message sending m=0x5634ec8ba4e0 seq=7 src=client.584545 off=0
2024-05-06T11:37:12.814280714Z debug 2024-05-06T11:37:12.813+0000 7f9c16362640 10 client.584545.objecter ms_dispatch 0x5634e28b8000 osd_map(7697..7697 src has 7082..7697)
2024-05-06T11:37:12.814280714Z debug 2024-05-06T11:37:12.813+0000 7f9c16362640 3 client.584545.objecter handle_osd_map got epochs [7697,7697] > 7696
2024-05-06T11:37:12.814280714Z debug 2024-05-06T11:37:12.813+0000 7f9c16362640 3 client.584545.objecter handle_osd_map decoding incremental epoch 7697
2024-05-06T11:37:12.814468790Z debug 2024-05-06T11:37:12.813+0000 7f9c16362640 10 client.584545.objecter checking op 1
2024-05-06T11:37:12.814468790Z debug 2024-05-06T11:37:12.813+0000 7f9c16362640 20 client.584545.objecter _calc_target epoch 7697 base 10000020458.00000000 @12 precalc_pgid 0 pgid 0.0 is_read
2024-05-06T11:37:12.814468790Z debug 2024-05-06T11:37:12.813+0000 7f9c16362640 20 client.584545.objecter _calc_target target 10000020458.00000000 @12 -> pgid 12.53bb8e39
2024-05-06T11:37:12.814468790Z debug 2024-05-06T11:37:12.813+0000 7f9c16362640 20 client.584545.objecter dump_active .. 0 homeless
2024-05-06T11:37:12.814468790Z debug 2024-05-06T11:37:12.813+0000 7f9c16362640 20 client.584545.objecter 1 12.53bb8e39 osd.0 10000020458.00000000 [read 0~158]
```
The "tid 1 on osd.0 is laggy" message appears over and over again, and I believe the [read 0~158] osd_op is tid 1 to osd.0.
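To confirm it really is the same op staying laggy (rather than a stream of new ops), the client log can be filtered on the objecter tid; a small sketch (the log file name is illustrative):

```
# Count how often objecter tid 1 is reported laggy on osd.0
grep -c 'objecter tid 1 on osd.0 is laggy' cephfs-client-debug.log

# Show the op submission and every subsequent recheck of the same op, to verify
# it is the same [read 0~158] to pg 12.19 / osd.0 throughout
grep -E 'client\.584545\.objecter (_op_submit oid|checking op 1|tid 1 on osd\.0)' cephfs-client-debug.log
```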
OK. So, the above update was with grep `client.584545`, just to see what's happening to the call at the client/objecter level, not necessarily at the messenger layer. That's what comes next: [read 0~158] is being sent to osd.0, so the objecter initiates a connect to the endpoint [v2:242.1.255.254:6800/2685472366,v1:242.1.255.254:6801/2685472366]. Then, soon after that:
```
2024-05-06T11:36:57.841199599Z debug 2024-05-06T11:36:57.840+0000 7f9c27504640 1 -- 10.131.1.11:0/1469616214 --> [v2:242.1.255.254:6800/2685472366,v1:242.1.255.254:6801/2685472366] -- osd_op(unknown.0.0:1 12.19 12:9c71ddca:::10000020458.00000000:head [read 0~158] snapc 0=[] ondisk+read+known_if_redirected+supports_pool_eio e7696) -- 0x5634ec6ef000 con 0x5634ec835000
2024-05-06T11:36:57.841199599Z debug 2024-05-06T11:36:57.840+0000 7f9ced37b640 20 -- 10.131.1.11:0/1469616214 >> [v2:242.1.255.254:6800/2685472366,v1:242.1.255.254:6801/2685472366] conn(0x5634ec835000 msgr2=0x5634ec836b00 unknown :-1 s=STATE_CONNECTING l=0).process
2024-05-06T11:36:57.841204861Z debug 2024-05-06T11:36:57.840+0000 7f9c27504640 5 --2- 10.131.1.11:0/1469616214 >> [v2:242.1.255.254:6800/2685472366,v1:242.1.255.254:6801/2685472366] conn(0x5634ec835000 0x5634ec836b00 unknown :-1 s=START_CONNECT pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).send_message enqueueing message m=0x5634ec6ef000 type=42 osd_op(client.584545.0:1 12.19 12:9c71ddca:::10000020458.00000000:head [read 0~158] snapc 0=[] ondisk+read+known_if_redirected+supports_pool_eio e7696)
2024-05-06T11:36:57.841204861Z debug 2024-05-06T11:36:57.840+0000 7f9c27504640 15 --2- 10.131.1.11:0/1469616214 >> [v2:242.1.255.254:6800/2685472366,v1:242.1.255.254:6801/2685472366] conn(0x5634ec835000 0x5634ec836b00 unknown :-1 s=START_CONNECT pgs=0 cs=0 l=0 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0).send_message inline write is denied, reschedule m=0x5634ec6ef000
```
Is the inline write denied because the (OSD) connection is still in progress (which is hinted at by "nonblock connect inprogress" below)?
```
2024-05-06T11:36:57.841393513Z debug 2024-05-06T11:36:57.840+0000 7f9ced37b640 20 Event(0x5634e35b0988 nevent=5000 time_id=56).create_file_event create event started fd=83 mask=1 original mask is 0
2024-05-06T11:36:57.841393513Z debug 2024-05-06T11:36:57.840+0000 7f9ced37b640 20 EpollDriver.add_event add event fd=83 cur_mask=0 add_mask=1 to 11
2024-05-06T11:36:57.841393513Z debug 2024-05-06T11:36:57.840+0000 7f9ced37b640 20 Event(0x5634e35b0988 nevent=5000 time_id=56).create_file_event create event end fd=83 mask=1 current mask is 1
2024-05-06T11:36:57.841393513Z debug 2024-05-06T11:36:57.840+0000 7f9ced37b640 10 NetHandler reconnect reconnect: 114 Operation already in progress
2024-05-06T11:36:57.841401866Z debug 2024-05-06T11:36:57.840+0000 7f9ced37b640 10 -- 10.131.1.11:0/1469616214 >> [v2:242.1.255.254:6800/2685472366,v1:242.1.255.254:6801/2685472366] conn(0x5634ec835000 msgr2=0x5634ec836b00 unknown :-1 s=STATE_CONNECTING_RE l=0).process nonblock connect inprogress
```
After going through a lot of messenger-related logs that I could not fully make sense of, log messages such as the following are still showing up at this point:
```
2024-05-06T11:37:07.288541431Z debug 2024-05-06T11:37:07.287+0000 7f9cee37d640 10 -- 10.131.1.11:0/751099213 >> [v2:242.1.255.254:6800/2685472366,v1:242.1.255.254:6801/2685472366] conn(0x5634ec428000 msgr2=0x5634ec014100 unknown :-1 s=STATE_CONNECTING_RE l=1).process nonblock connect inprogress
```
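One way to isolate what the messenger is doing with that OSD endpoint is to filter on the connection address and follow the state transitions; a sketch, again with an illustrative log file name:

```
# Follow the AsyncConnection state machine for the osd.0 endpoint
# (STATE_CONNECTING -> STATE_CONNECTING_RE -> ... -> READY is the healthy path)
grep '242.1.255.254:6800' cephfs-client-debug.log \
  | grep -E 'STATE_CONNECTING|CONNECTING_RE|nonblock connect inprogress|READY|reconnect'
```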
At "2024-05-06T11:43:10.691169634" the "connect improgress" log messages are still seen: ``` 2024-05-06T11:43:10.691169634Z debug 2024-05-06T11:43:10.690+0000 7f9cedb7c640 10 -- 10.131.1.11:0/1651991364 >> [v2:242.1.255.254:6800/2685472366,v1:242.1.255.254:6801/2685472366] conn(0x5634ec8ea400 msgr2=0x5634 ec8f8580 unknown :-1 s=STATE_CONNECTING_RE l=1).process nonblock connect inprogress ``` What's happening at the messenger layer? cc Radek.
Radek/Brad - https://tracker.ceph.com/issues/65986
Now I see that the sync is progressing well on this setup, probably due to a pod restart/node reboot.