Description of problem:
We created a stretch-cluster RHCS setup with 3 nodes in each of the 2 data centers and one arbiter node in the cloud. Each data center runs 2 MONs, and a 5th MON runs on the arbiter node. A kernel rbd mount was performed on a client machine and data was continuously written to it. When all 3 nodes of the 2nd data center were shut down, IO stopped: both reads and writes failed on the volume.

Version-Release number of selected component (if applicable):
Server info:
ceph version 16.2.0-152.el8cp (e456e8b705cb2f4a779689a0d80b122bcb0d67c9) pacific (stable)
image tag 5-103

How reproducible:
Always

Steps to Reproduce:
1. Create a stretch cluster (a setup sketch is included at the end of the Additional info below)
2. Mount an rbd volume (see the mount sketch at the end of the Additional info below)
3. Shut down all OSD nodes that belong to one of the data centers/failure domains

Actual results:
IO hangs.

Expected results:
IO should continue.

Additional info:

Server side info
=======================
[root@ceph-0 ~]# ceph status
  cluster:
    id:     b0d624f6-89ea-11ec-ad08-005056838602
    health: HEALTH_WARN
            3 hosts fail cephadm check
            We are missing stretch mode buckets, only requiring 1 of 2 buckets to peer
            insufficient standby MDS daemons available
            2/5 mons down, quorum ceph-0,ceph-2,ceph-6
            1 datacenter (6 osds) down
            6 osds down
            3 hosts (6 osds) down
            Degraded data redundancy: 1040/2080 objects degraded (50.000%), 90 pgs degraded, 201 pgs undersized

  services:
    mon: 5 daemons, quorum ceph-0,ceph-2,ceph-6 (age 5h), out of quorum: ceph-3, ceph-5
    mgr: ceph-0.fhbxvx(active, since 3d)
    mds: 1/1 daemons up
    osd: 12 osds: 6 up (since 5h), 12 in (since 6w)
    rgw: 1 daemon active (1 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   8 pools, 201 pgs
    objects: 520 objects, 724 MiB
    usage:   12 GiB used, 588 GiB / 600 GiB avail
    pgs:     1040/2080 objects degraded (50.000%)
             111 active+undersized
             90  active+undersized+degraded

[root@ceph-0 ~]# ceph osd pool ls
device_health_metrics
cephfs.cephfs.meta
cephfs.cephfs.data
.rgw.root
default.rgw.log
default.rgw.control
default.rgw.meta
rbdblockpool

[root@ceph-0 ~]# ceph osd crush rule dump stretch_rule
{
    "rule_id": 1,
    "rule_name": "stretch_rule",
    "ruleset": 1,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -16,
            "item_name": "site1"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 2,
            "type": "host"
        },
        {
            "op": "emit"
        },
        {
            "op": "take",
            "item": -17,
            "item_name": "site2"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 2,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}

RBD Volume Name: csi-vol-37119694-aca7-11ec-b69e-0a580a050c22
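For readability, the stretch_rule dump above corresponds to roughly the following rule in decompiled crushmap text form (reconstructed here from the JSON output, not taken from the cluster's actual crushmap):

rule stretch_rule {
        id 1
        type replicated
        min_size 1
        max_size 10
        step take site1
        step chooseleaf firstn 2 type host
        step emit
        step take site2
        step chooseleaf firstn 2 type host
        step emit
}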
Client info
===============
sh-4.4# cat /etc/redhat-release
Red Hat Enterprise Linux CoreOS release 4.10

sh-4.4# uname -a
Linux perf1-qpmpf-ocs-94mfj 4.18.0-305.34.2.el8_4.x86_64 #1 SMP Mon Jan 17 09:42:23 EST 2022 x86_64 x86_64 x86_64 GNU/Linux

sh-4.4# modinfo libceph
filename:       /lib/modules/4.18.0-305.34.2.el8_4.x86_64/kernel/net/ceph/libceph.ko.xz
license:        GPL
description:    Ceph core library
author:         Patience Warnick <patience>
author:         Yehuda Sadeh <yehuda.net>
author:         Sage Weil <sage>
rhelversion:    8.4
srcversion:     56D6E7804420E592C2C8124
depends:        libcrc32c,dns_resolver
intree:         Y
name:           libceph
vermagic:       4.18.0-305.34.2.el8_4.x86_64 SMP mod_unload modversions

sh-4.4# mount | grep pvc-48a06b6b-6e69-48b9-9d3f-6b4d4e4aeff7
/dev/rbd2 on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-48a06b6b-6e69-48b9-9d3f-6b4d4e4aeff7/globalmount/0001-0011-openshift-storage-0000000000000008-37119694-aca7-11ec-b69e-0a580a050c22 type ext4 (rw,relatime,seclabel,stripe=16,_netdev)
/dev/rbd2 on /var/lib/kubelet/pods/fa5c2e5d-ac40-46f9-8969-44afb25f3168/volumes/kubernetes.io~csi/pvc-48a06b6b-6e69-48b9-9d3f-6b4d4e4aeff7/mount type ext4 (rw,relatime,seclabel,stripe=16,_netdev)

We are also attaching the client debug logs as a file.
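For step 1 of the reproduction ("Create a stretch cluster"), the cluster was set up along the lines of the documented Pacific stretch-mode procedure. The sketch below is illustrative only: the mon-to-site placement is an assumption based on the quorum output above (ceph-3/ceph-5 assumed to be in the failed site2, ceph-6 assumed to be the arbiter/tiebreaker), and the site1/site2 datacenter buckets are assumed to already exist in the CRUSH map with the hosts moved under them.

# Hedged setup sketch (Pacific stretch mode); hostname-to-site placement is assumed
ceph mon set election_strategy connectivity
ceph mon set_location ceph-0 datacenter=site1    # assumption
ceph mon set_location ceph-2 datacenter=site1    # assumption
ceph mon set_location ceph-3 datacenter=site2    # assumption
ceph mon set_location ceph-5 datacenter=site2    # assumption
ceph mon set_location ceph-6 datacenter=arbiter  # assumption: tiebreaker mon
ceph mon enable_stretch_mode ceph-6 stretch_rule datacenter
# Verify mon locations and election strategy:
ceph mon dump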
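For step 2 ("Mount an rbd volume") and the continuous write workload from the description, the volume was mapped through the kernel rbd client (via the OpenShift CSI driver in this setup, as the mount output above shows). A manual approximation follows, with the image name taken from the RBD volume name above; the mount point and write loop are placeholders, not the actual workload used:

# Hedged sketch; in the reported setup the map/mount was performed by the CSI driver
rbd map rbdblockpool/csi-vol-37119694-aca7-11ec-b69e-0a580a050c22
mkfs.ext4 /dev/rbd0              # first use only; the rbd device name may differ
mount /dev/rbd0 /mnt/rbdtest     # /mnt/rbdtest is a placeholder mount point
# Continuous write workload (illustrative; the exact tool used is not stated)
while true; do
    dd if=/dev/zero of=/mnt/rbdtest/io_test bs=1M count=256 oflag=direct
done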
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Red Hat Ceph Storage 5.1 Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:4622
*** Bug 2081751 has been marked as a duplicate of this bug. ***