Description of problem:
Created a Kafka cluster with 3 ZooKeeper nodes and 3 brokers, then created a topic from the Kafka side with 1 partition and 3 replicas. RGW crashed at complete-multipart-upload of the 84th object.

Automation fail log:
http://magna002.ceph.redhat.com/cephci-jenkins/hsm/POC_kafka_cluster/test_bucket_notification_kafka_broker_multipart.console.log_with_kafka_3_nodes_and_partioned_topic_from_kafka_side_rgw_crash2

RGW logs at debug level 20:
http://magna002.ceph.redhat.com/cephci-jenkins/hsm/POC_kafka_cluster/ceph-client.rgw.rgw.all.ceph-hsm-kafka-bieny8-node5.gjydto.log

RGW crash snippet:
   -11> 2025-02-03T18:09:12.711+0000 7f6664e3e640 20 Kafka connect: connection found
   -10> 2025-02-03T18:09:12.711+0000 7f6664e3e640 20 req 8967395081007705679 0.017001484s INFO: push endpoint created: kafka://10.0.64.191:9092
    -9> 2025-02-03T18:09:12.715+0000 7f665ce2e640 20 handle_completion(): completion ok for obj=prefix1key_johnb.444-bucky-8-1_84
    -8> 2025-02-03T18:09:12.746+0000 7f663cdee640 20 Kafka publish: reused existing topic: cephci-kafka-broker-ack-type-2096f35cb63a43ff
    -7> 2025-02-03T18:09:12.746+0000 7f663cdee640 20 Kafka publish (with callback, tag=185): OK. Queue has: 1 callbacks
    -6> 2025-02-03T18:09:12.752+0000 7f663cdee640 20 Kafka run: ack received with result=Success
    -5> 2025-02-03T18:09:12.752+0000 7f663cdee640 20 Kafka run: n/ack received, invoking callback with tag=185
    -4> 2025-02-03T18:09:12.753+0000 7f66b26d9640 20 req 8967395081007705679 0.059005152s s3:complete_multipart get_obj_state: octx=0x563e84dc1420 obj=johnb.444-bucky-8-1:_multipart_prefix1key_johnb.444-bucky-8-1_84.2~O_oxiRBfnl-01_L0gnPkof0dZbwFKdF.meta state=0x563e861505e8 s->prefetch_data=0
    -3> 2025-02-03T18:09:12.753+0000 7f66b26d9640 20 req 8967395081007705679 0.059005152s s3:complete_multipart get_obj_state: octx=0x563e84dc1420 obj=johnb.444-bucky-8-1:_multipart_prefix1key_johnb.444-bucky-8-1_84.2~O_oxiRBfnl-01_L0gnPkof0dZbwFKdF.meta state=0x563e861505e8 s->prefetch_data=0
    -2> 2025-02-03T18:09:12.753+0000 7f66b26d9640 20 req 8967395081007705679 0.059005152s s3:complete_multipart prepare_atomic_modification: state is not atomic. state=0x563e861505e8
    -1> 2025-02-03T18:09:12.753+0000 7f66b26d9640 20 req 8967395081007705679 0.059005152s s3:complete_multipart bucket index object: :.dir.783c75e7-fe5f-43fd-ace0-823b18d29506.40816.22.10
     0> 2025-02-03T18:09:12.759+0000 7f663cdee640 -1 *** Caught signal (Aborted) **
 in thread 7f663cdee640 thread_name:kafka_manager

 ceph version 19.2.0-64.el9cp (cc053eea5c90d0938f70b48dc0a70b46aeeb4369) squid (stable)
 1: /lib64/libc.so.6(+0x3e730) [0x7f676bbce730]
 2: /lib64/libc.so.6(+0x8ba6c) [0x7f676bc1ba6c]
 3: raise()
 4: abort()
 5: /lib64/libc.so.6(+0x29170) [0x7f676bbb9170]
 6: /lib64/libc.so.6(+0x37217) [0x7f676bbc7217]
 7: /lib64/libc.so.6(+0x92248) [0x7f676bc22248]
 8: (std::_Function_handler<void (int), RGWPubSubKafkaEndpoint::send(rgw_pubsub_s3_event const&, optional_yield)::{lambda(int)#1}>::_M_invoke(std::_Any_data const&, int&&)+0x95) [0x563e7c0b46e5]
 9: (rgw::kafka::message_callback(rd_kafka_s*, rd_kafka_message_s const*, void*)+0x20f) [0x563e7c11ea8f]
 10: /lib64/librdkafka.so.1(+0x256ef) [0x7f676c3166ef]
 11: /lib64/librdkafka.so.1(+0x5b862) [0x7f676c34c862]
 12: rd_kafka_poll()
 13: (rgw::kafka::Manager::run()+0x5a9) [0x563e7c126689]
 14: /lib64/libstdc++.so.6(+0xdbad4) [0x7f676bf6bad4]
 15: /lib64/libc.so.6(+0x89d22) [0x7f676bc19d22]
 16: /lib64/libc.so.6(+0x10ed40) [0x7f676bc9ed40]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Version-Release number of selected component (if applicable):
ceph version 19.2.0-64.el9cp

How reproducible:
Always with the automation script, at the upload of the n'th object.

Steps to Reproduce:
1. Create a Kafka cluster with 3 ZooKeeper nodes and 3 brokers.
2. Create a Ceph cluster with an RGW daemon.
3. Create a Kafka topic with 1 partition and 3 replicas:

   /usr/local/kafka/bin/kafka-topics.sh --create --topic cephci-kafka-broker-ack-type-2096f35cb63a43ff --bootstrap-server kafka://10.0.66.18:9092 --partitions 1 --replication-factor 3

   /usr/local/kafka/bin/kafka-topics.sh --describe --topic cephci-kafka-broker-ack-type-2096f35cb63a43ff --bootstrap-server kafka://10.0.66.18:9092
   Topic: cephci-kafka-broker-ack-type-2096f35cb63a43ff TopicId: FgYQWStjSK6h63kWGyrBEQ PartitionCount: 1 ReplicationFactor: 3 Configs: segment.bytes=1073741824
   Topic: cephci-kafka-broker-ack-type-2096f35cb63a43ff Partition: 0 Leader: 2 Replicas: 2,0,1 Isr: 2,0,1

4. Create a topic from the RGW side with the same topic name, but with a Kafka broker push endpoint different from the broker where the topic partition is present (here the topic partition is present on broker0, but I gave the broker1 address).
5. Create a bucket and put bucket notifications with the above topic ARN.
6. Upload multipart objects into the bucket. RGW crashes at complete-multipart-upload of the n'th object.

Actual results:
RGW crash observed at complete-multipart-upload.

Expected results:
RGW should not crash.

Additional info:
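
Since step 4 depends on which broker holds the topic partition, it may help to read the leader and replica assignment out of the `kafka-topics.sh --describe` output programmatically before choosing the push endpoint. The following is a minimal sketch, assuming the output format shown in step 3; the function name `parse_partition_lines` and the `SAMPLE` constant are hypothetical helpers for illustration and are not part of the automation script.

```python
import re

# Matches the per-partition line of `kafka-topics.sh --describe`, e.g.
# "Partition: 0 Leader: 2 Replicas: 2,0,1 Isr: 2,0,1".
# The summary line ("PartitionCount: 1 ...") does not match this pattern.
PARTITION_RE = re.compile(
    r"Partition:\s*(\d+)\s+Leader:\s*(\d+)\s+Replicas:\s*([\d,]+)\s+Isr:\s*([\d,]+)"
)


def parse_partition_lines(describe_output: str) -> list[dict]:
    """Return one dict per partition with its leader, replicas and ISR broker ids."""
    partitions = []
    for match in PARTITION_RE.finditer(describe_output):
        partitions.append({
            "partition": int(match.group(1)),
            "leader": int(match.group(2)),
            "replicas": [int(b) for b in match.group(3).split(",")],
            "isr": [int(b) for b in match.group(4).split(",")],
        })
    return partitions


# Describe output copied from step 3 of this report.
SAMPLE = (
    "Topic: cephci-kafka-broker-ack-type-2096f35cb63a43ff TopicId: FgYQWStjSK6h63kWGyrBEQ "
    "PartitionCount: 1 ReplicationFactor: 3 Configs: segment.bytes=1073741824\n"
    "Topic: cephci-kafka-broker-ack-type-2096f35cb63a43ff Partition: 0 Leader: 2 "
    "Replicas: 2,0,1 Isr: 2,0,1\n"
)

if __name__ == "__main__":
    for info in parse_partition_lines(SAMPLE):
        print(info)
```

Mapping the reported broker ids to host addresses is left out here, because that mapping is specific to the test cluster and is not given in this report.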