From the cluster logs we could make out that the PGs went into recovering state after the autoscaler kicked in and PG splitting had occurred. The cluster logs unfortunately did not contain the logs that captured recovery messages. However, I was able to reproduce this issue on a vstart cluster. To do this, I followed the following steps: 1. Set norecovery and nobackfill flags 2. Created a pool 3. Used radosbench to write objects to this pool 4. While objects are being written to this pool, I increased the number of PGs (pg_num) of this pool. Once I did this, I observed that a few PGs went into recovering state and using the PG query command I could see the num_objects_recovered increasing. I believe the reason I could not capture these logs in the cluster is that by the time we started recording debug_osd=30 OSD logs, all I/O on the cluster had been stopped. From the logs, we figured out that recovery starts when we try write to a degraded object or try to read a missing object. Both of these instances cause maybe_kick_recovery() to start recovery for that particular object (recover_object). This work-flow does not check if the norecover flag is set or not. I will be working on fixing this issue. Attaching the logs below for reference: Write to a degraded object: 2023-08-24T10:53:21.916+0530 7ffa5831e6c0 10 osd.1 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded mbc={255={(2+0)=2}}] do_op osd_op(client.4155.0:466 1.1 1:9dc90fba:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object465:head [set-alloc-hint object_size 4194304 write_size 4194304,write 0~4194304 in=4194304b] snapc 0=[] RETRY=1 ondisk+retry+write+known_if_redirected+supports_pool_eio e22) v8 may_write -> write-ordered flags ondisk+retry+write+known_if_redirected+supports_pool_eio 2023-08-24T10:53:21.916+0530 7ffa5831e6c0 10 osd.1 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded mbc={255={(2+0)=2}}] readable_with_acting: locs:1,2 2023-08-24T10:53:21.916+0530 7ffa5831e6c0 10 osd.1 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded mbc={255={(2+0)=2}}] do_op degraded_or_backfilling_object 2023-08-24T10:53:21.916+0530 7ffa5831e6c0 10 osd.1 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded mbc={255={(2+0)=2}}] do_op wait_for_degraded_object 2023-08-24T10:53:21.916+0530 7ffa5831e6c0 7 osd.1 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded mbc={255={(2+0)=2}}] object 1:9dc90fba:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object465:head v 21'466, recovering. 2023-08-24T10:53:21.916+0530 7ffa5831e6c0 10 osd.1 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded mbc={255={(2+0)=2}}] prep_object_replica_pushes: on 1:9dc90fba:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object465:head Starting recovery_op and building push op: 2023-08-24T10:53:21.916+0530 7ffa5831e6c0 20 osd.1 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded mbc={255={(2+0)=2}}] recovery got recovery read lock on 1:9dc90fba:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object465:head 2023-08-24T10:53:21.916+0530 7ffa5831e6c0 10 osd.1 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded mbc={255={(2+0)=2}}] start_recovery_op 1:9dc90fba:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object465:head 2023-08-24T10:53:21.916+0530 7ffa5831e6c0 10 osd.1 23 start_recovery_op pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded rops=1 mbc={255={(2+0)=2}}] 1:9dc90fba:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object465:head (0/3 rops) 2023-08-24T10:53:21.916+0530 7ffa5831e6c0 10 osd.1 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded rops=1 mbc={255={(2+0)=2}}] recover_object: 1:9dc90fba:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object465:head 2023-08-24T10:53:21.916+0530 7ffa5831e6c0 20 osd.1 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded rops=1 mbc={255={(2+0)=2}}] start_pushes soid 1:9dc90fba:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object465:head 2023-08-24T10:53:21.916+0530 7ffa5831e6c0 10 osd.1 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded rops=1 mbc={255={(2+0)=2}}] prep_push_to_replica: 1:9dc90fba:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object465:head v21'466 size 4194304 to osd.0 2023-08-24T10:53:21.916+0530 7ffa5831e6c0 15 osd.1 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded rops=1 mbc={255={(2+0)=2}}] push_to_replica snapset is 0=[]:{} 2023-08-24T10:53:21.916+0530 7ffa5831e6c0 10 osd.1 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded rops=1 mbc={255={(2+0)=2}}] calc_head_subsets 1:9dc90fba:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object465:head clone_overlap {} 2023-08-24T10:53:21.916+0530 7ffa5831e6c0 10 osd.1 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded rops=1 mbc={255={(2+0)=2}}] calc_head_subsets 1:9dc90fba:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object465:head data_subset [0~4194304] 2023-08-24T10:53:21.916+0530 7ffa5831e6c0 10 osd.1 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded rops=1 mbc={255={(2+0)=2}}] skipping clone, nothing needs to clone 2023-08-24T10:53:21.916+0530 7ffa5831e6c0 7 osd.1 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded rops=1 mbc={255={(2+0)=2}}] build_push_op 1:9dc90fba:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object465:head v 21'466 size 4194304 recovery_info: ObjectRecoveryInfo(1:9dc90fba:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object465:head@21'466, size: 4194304, copy_subset: [0~4194304], clone_subset: {}, snapset: 0=[]:{}, object_exist: 0) Send pushes to osd.0 from osd.1: 2023-08-24T10:53:22.049+0530 7ffa5831e6c0 20 osd.1 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=0 lpr=22 pi=[18,22)/1 crt=21'468 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded rops=2 mbc={255={(2+0)=2}}] send_pushes: sending push PushOp(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head, version: 21'467, data_included: [0~4194304], data_size: 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head@21'467, size: 4194304, copy_subset: [0~4194304], clone_subset: {}, snapset: 0=[]:{}, object_exist: 0), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false, error:false)) to osd.0 2023-08-24T10:53:22.049+0530 7ffa5831e6c0 1 -- [v2:192.168.68.109:6812/1081787642,v1:192.168.68.109:6813/1081787642] --> [v2:192.168.68.109:6804/2591898341,v1:192.168.68.109:6805/2591898341] -- MOSDPGPush(1.1 23/22 [PushOp(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head, version: 21'467, data_included: [0~4194304], data_size: 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head@21'467, size: 4194304, copy_subset: [0~4194304], clone_subset: {}, snapset: 0=[]:{}, object_exist: 0), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) v4 -- 0x10d031e0 con 0xa3fd180 osd.0 logs queueing PGRecoveryMsg: 2023-08-24T10:53:22.060+0530 7f212a5546c0 1 -- [v2:192.168.68.109:6804/2591898341,v1:192.168.68.109:6805/2591898341] <== osd.1 v2:192.168.68.109:6812/1081787642 514 ==== MOSDPGPush(1.1 23/22 [PushOp(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head, version: 21'467, data_included: [0~4194304], data_size: 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head@21'467, size: 4194304, copy_subset: [0~4194304], clone_subset: {}, snapset: 0=[]:{}, object_exist: 0), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) v4 ==== 4195427+0+0 (crc 0 0 0) 0xb78e680 con 0x8acdb00 2023-08-24T10:53:22.060+0530 7f212a5546c0 15 osd.0 23 enqueue_op MOSDPGPush(1.1 23/22 [PushOp(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head, version: 21'467, data_included: [0~4194304], data_size: 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head@21'467, size: 4194304, copy_subset: [0~4194304], clone_subset: {}, snapset: 0=[]:{}, object_exist: 0), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) v4 prio 196 type 105 cost 4195304 latency 0.008027 epoch 23 MOSDPGPush(1.1 23/22 [PushOp(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head, version: 21'467, data_included: [0~4194304], data_size: 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head@21'467, size: 4194304, copy_subset: [0~4194304], clone_subset: {}, snapset: 0=[]:{}, object_exist: 0), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) v4 2023-08-24T10:53:22.060+0530 7f212a5546c0 20 osd.0 op_wq(1) _enqueue OpSchedulerItem(1.1 PGRecoveryMsg(op=MOSDPGPush(1.1 23/22 [PushOp(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head, version: 21'467, data_included: [0~4194304], data_size: 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head@21'467, size: 4194304, copy_subset: [0~4194304], clone_subset: {}, snapset: 0=[]:{}, object_exist: 0), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) v4) class_id 2 prio 196 cost 4195304 e23) 2023-08-24T10:53:22.139+0530 7f210f51e6c0 10 osd.0 23 dequeue_op MOSDPGPush(1.1 23/22 [PushOp(1:9dc90fba:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object465:head, version: 21'466, data_included: [0~4194304], data_size: 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(1:9dc90fba:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object465:head@21'466, size: 4194304, copy_subset: [0~4194304], clone_subset: {}, snapset: 0=[]:{}, object_exist: 0), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) v4 finish 2023-08-24T10:53:22.139+0530 7f210f51e6c0 20 osd.0 op_wq(1) _process 1.1 to_process <> waiting <> waiting_peering {} 2023-08-24T10:53:22.139+0530 7f210f51e6c0 20 osd.0 op_wq(1) _process OpSchedulerItem(1.1 PGRecoveryMsg(op=MOSDPGPush(1.1 23/22 [PushOp(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head, version: 21'467, data_included: [0~4194304], data_size: 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head@21'467, size: 4194304, copy_subset: [0~4194304], clone_subset: {}, snapset: 0=[]:{}, object_exist: 0), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) v4) class_id 2 prio 196 cost 4195304 e23) queued 2023-08-24T10:53:22.139+0530 7f210f51e6c0 30 osd.0 pg_epoch: 23 pg[1.1( v 21'468 lc 21'466 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=1 lpr=22 pi=[18,22)/1 luod=0'0 crt=21'468 lcod 0'0 mlcod 0'0 active m=1 mbc={}] lock 2023-08-24T10:53:22.139+0530 7f210f51e6c0 20 osd.0 op_wq(1) _process 1.1 to_process <OpSchedulerItem(1.1 PGRecoveryMsg(op=MOSDPGPush(1.1 23/22 [PushOp(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head, version: 21'467, data_included: [0~4194304], data_size: 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head@21'467, size: 4194304, copy_subset: [0~4194304], clone_subset: {}, snapset: 0=[]:{}, object_exist: 0), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) v4) class_id 2 prio 196 cost 4195304 e23)> waiting <> waiting_peering {} 2023-08-24T10:53:22.140+0530 7f210f51e6c0 20 osd.0 op_wq(1) _process OpSchedulerItem(1.1 PGRecoveryMsg(op=MOSDPGPush(1.1 23/22 [PushOp(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head, version: 21'467, data_included: [0~4194304], data_size: 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head@21'467, size: 4194304, copy_subset: [0~4194304], clone_subset: {}, snapset: 0=[]:{}, object_exist: 0), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) v4) class_id 2 prio 196 cost 4195304 e23) pg 0xb772000 2023-08-24T10:53:22.140+0530 7f210f51e6c0 10 osd.0 23 dequeue_op MOSDPGPush(1.1 23/22 [PushOp(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head, version: 21'467, data_included: [0~4194304], data_size: 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head@21'467, size: 4194304, copy_subset: [0~4194304], clone_subset: {}, snapset: 0=[]:{}, object_exist: 0), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) v4 prio 196 cost 4195304 latency 0.087563 MOSDPGPush(1.1 23/22 [PushOp(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head, version: 21'467, data_included: [0~4194304], data_size: 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head@21'467, size: 4194304, copy_subset: [0~4194304], clone_subset: {}, snapset: 0=[]:{}, object_exist: 0), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) v4 pg pg[1.1( v 21'468 lc 21'466 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=1 lpr=22 pi=[18,22)/1 luod=0'0 crt=21'468 lcod 0'0 mlcod 0'0 active m=1 mbc={}] 2023-08-24T10:53:22.140+0530 7f210f51e6c0 10 osd.0 23 maybe_share_map: con v2:192.168.68.109:6812/1081787642 our osdmap epoch of 23 is not newer than session's projected_epoch of 23 2023-08-24T10:53:22.140+0530 7f210f51e6c0 10 osd.0 pg_epoch: 23 pg[1.1( v 21'468 lc 21'466 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=1 lpr=22 pi=[18,22)/1 luod=0'0 crt=21'468 lcod 0'0 mlcod 0'0 active m=1 mbc={}] _handle_message: MOSDPGPush(1.1 23/22 [PushOp(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head, version: 21'467, data_included: [0~4194304], data_size: 4194304, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head@21'467, size: 4194304, copy_subset: [0~4194304], clone_subset: {}, snapset: 0=[]:{}, object_exist: 0), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false, error:false))]) v4 2023-08-24T10:53:22.140+0530 7f210f51e6c0 10 osd.0 pg_epoch: 23 pg[1.1( v 21'468 lc 21'466 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=1 lpr=22 pi=[18,22)/1 luod=0'0 crt=21'468 lcod 0'0 mlcod 0'0 active m=1 mbc={}] handle_push ObjectRecoveryInfo(1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head@21'467, size: 4194304, copy_subset: [0~4194304], clone_subset: {}, snapset: 0=[]:{}, object_exist: 0)ObjectRecoveryProgress(!first, data_recovered_to:4194304, data_complete:true, omap_recovered_to:, omap_complete:true, error:false) 2023-08-24T10:53:22.140+0530 7f210f51e6c0 20 osd.0 pg_epoch: 23 pg[1.1( v 21'468 lc 21'466 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=1 lpr=22 pi=[18,22)/1 luod=0'0 crt=21'468 lcod 0'0 mlcod 0'0 active m=1 mbc={}] submit_push_data recovering object 1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head copy_subset: [0~4194304] intervals_included: [0~4194304] data_zeros: [] 2023-08-24T10:53:22.140+0530 7f210f51e6c0 10 osd.0 pg_epoch: 23 pg[1.1( v 21'468 lc 21'466 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=1 lpr=22 pi=[18,22)/1 luod=0'0 crt=21'468 lcod 0'0 mlcod 0'0 active m=1 mbc={}] on_local_recover: 1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head 2023-08-24T10:53:22.140+0530 7f210f51e6c0 10 osd.0 pg_epoch: 23 pg[1.1( v 21'468 lc 21'466 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=1 lpr=22 pi=[18,22)/1 luod=0'0 crt=21'468 lcod 0'0 mlcod 0'0 active m=1 mbc={}] PeeringState::recover_got got missing 1:bb40a492:::benchmark_data_li-9e0a7d4c-34fa-11b2-a85c-bf_356346_object466:head v 21'467 2023-08-24T10:53:22.140+0530 7f210f51e6c0 10 osd.0 pg_epoch: 23 pg[1.1( v 21'468 (0'0,21'468] local-lis/les=22/23 n=117 ec=22/18 lis/c=22/18 les/c/f=23/19/0 sis=22) [1,0,2] r=1 lpr=22 pi=[18,22)/1 luod=0'0 crt=21'468 lcod 0'0 mlcod 0'0 active mbc={}] PeeringState::recover_got last_complete now 21'468 log.complete_to at end
PR currently being reviewed - https://github.com/ceph/ceph/pull/54708
Hi Akash, I've added the doc text and type.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Critical: Red Hat Ceph Storage 7.1 security, enhancements, and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2024:3925