The Ceph Monitor properly updates the OSD map on stretch cluster mode transitions
Previously, the Ceph Monitor did not properly update the OSD map on stretch cluster mode transitions, so some clients were unable to resend their OSD requests. The corresponding client I/O requests appeared to hang.
With this release, the Ceph Monitor properly updates the OSD map on stretch cluster mode transitions, and clients resend their OSD requests as expected. A stretch cluster mode transition no longer causes client I/O requests to appear hung.
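As a hedged aside (not part of the original release note), the stretch mode state that the monitors publish in the OSD map can be checked from any cluster node with standard Ceph CLI commands; the grep patterns are assumptions about the exact field names:

[root@ceph-0 ~]# ceph osd dump | grep -i stretch                  # stretch_mode_enabled, degraded/recovering flags
[root@ceph-0 ~]# ceph mon dump | grep -iE 'stretch|tiebreaker'    # stretch mode and tiebreaker monitor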
Reported by Raghavendra Talur, 2022-03-28 20:30:51 UTC
Description of problem:
We created a stretch cluster RHCS setup with 3 nodes in each of the 2 data centers and one arbiter node in the cloud. Each data center runs 2 MONs, and a 5th MON runs on the arbiter node.
A kernel RBD mount was performed on a client machine and data was continuously written to it. When all 3 nodes of the 2nd data center were shut down, the I/O stopped. Both reads and writes failed on the volume.
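The exact write workload is not included in the report; a minimal sketch of a continuous writer against the mounted volume, assuming a hypothetical mount point /mnt/rbd, would be:

sh-4.4# while true; do
>   dd if=/dev/zero of=/mnt/rbd/testfile bs=4M count=100 oflag=direct
>   sync
> done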
Version-Release number of selected component (if applicable):
Server info:
ceph version 16.2.0-152.el8cp (e456e8b705cb2f4a779689a0d80b122bcb0d67c9) pacific (stable)
image tag 5-103
How reproducible:
Always
Steps to Reproduce:
1. Create a stretch cluster
2. Mount an RBD volume
3. Shut down all OSD nodes that belong to one of the data centers/failure domains (see the hedged sketch below).
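A sketch of these steps, assuming hypothetical names (the tiebreaker monitor ceph-6, the pool rbdblockpool, and the CRUSH rule stretch_rule come from this report; the image name testimg and the site2 host list are assumptions):

# Step 1: enable stretch mode with ceph-6 as the tiebreaker monitor
ceph mon set election_strategy connectivity
ceph mon enable_stretch_mode ceph-6 stretch_rule datacenter

# Step 2: create, map, and mount an RBD image
rbd create rbdblockpool/testimg --size 10G
rbd map rbdblockpool/testimg          # e.g. maps to /dev/rbd0
mkfs.ext4 /dev/rbd0
mount /dev/rbd0 /mnt/rbd

# Step 3: power off every OSD node in one data center (hosts assumed)
for host in ceph-3 ceph-4 ceph-5; do ssh "$host" poweroff; done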
Actual results:
I/O hangs.
Expected results:
I/O should continue.
Additional info:
Server side info
=======================
[root@ceph-0 ~]# ceph status
  cluster:
    id:     b0d624f6-89ea-11ec-ad08-005056838602
    health: HEALTH_WARN
            3 hosts fail cephadm check
            We are missing stretch mode buckets, only requiring 1 of 2 buckets to peer
            insufficient standby MDS daemons available
            2/5 mons down, quorum ceph-0,ceph-2,ceph-6
            1 datacenter (6 osds) down
            6 osds down
            3 hosts (6 osds) down
            Degraded data redundancy: 1040/2080 objects degraded (50.000%), 90 pgs degraded, 201 pgs undersized
  services:
    mon: 5 daemons, quorum ceph-0,ceph-2,ceph-6 (age 5h), out of quorum: ceph-3, ceph-5
    mgr: ceph-0.fhbxvx(active, since 3d)
    mds: 1/1 daemons up
    osd: 12 osds: 6 up (since 5h), 12 in (since 6w)
    rgw: 1 daemon active (1 hosts, 1 zones)
  data:
    volumes: 1/1 healthy
    pools:   8 pools, 201 pgs
    objects: 520 objects, 724 MiB
    usage:   12 GiB used, 588 GiB / 600 GiB avail
    pgs:     1040/2080 objects degraded (50.000%)
             111 active+undersized
             90 active+undersized+degraded
[root@ceph-0 ~]# ceph osd pool ls
device_health_metrics
cephfs.cephfs.meta
cephfs.cephfs.data
.rgw.root
default.rgw.log
default.rgw.control
default.rgw.meta
rbdblockpool
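Not part of the original report, but relevant when reading the status above: stretch mode normally runs replicated pools at size 4 (two copies per site) and relaxes min_size while one site is down, which can be verified with standard pool queries against the pool named in this report:

[root@ceph-0 ~]# ceph osd pool get rbdblockpool size
[root@ceph-0 ~]# ceph osd pool get rbdblockpool min_size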
[root@ceph-0 ~]# ceph osd crush rule dump stretch_rule
{
    "rule_id": 1,
    "rule_name": "stretch_rule",
    "ruleset": 1,
    "type": 1,
    "min_size": 1,
    "max_size": 10,
    "steps": [
        {
            "op": "take",
            "item": -16,
            "item_name": "site1"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 2,
            "type": "host"
        },
        {
            "op": "emit"
        },
        {
            "op": "take",
            "item": -17,
            "item_name": "site2"
        },
        {
            "op": "chooseleaf_firstn",
            "num": 2,
            "type": "host"
        },
        {
            "op": "emit"
        }
    ]
}
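The rule takes the site1 and site2 buckets in turn and picks 2 hosts from each, so every PG stores two replicas per data center. A sketch of how such a rule is typically installed, using the standard decompile/edit/recompile workflow (the rule text mirrors the dump above; file names are arbitrary):

ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt
# append to crush.txt:
#   rule stretch_rule {
#       id 1
#       type replicated
#       min_size 1
#       max_size 10
#       step take site1
#       step chooseleaf firstn 2 type host
#       step emit
#       step take site2
#       step chooseleaf firstn 2 type host
#       step emit
#   }
crushtool -c crush.txt -o crush.new
ceph osd setcrushmap -i crush.new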
RBD Volume Name: csi-vol-37119694-aca7-11ec-b69e-0a580a050c22
Client info
===============
sh-4.4# cat /etc/redhat-release
Red Hat Enterprise Linux CoreOS release 4.10
sh-4.4# uname -a
Linux perf1-qpmpf-ocs-94mfj 4.18.0-305.34.2.el8_4.x86_64 #1 SMP Mon Jan 17 09:42:23 EST 2022 x86_64 x86_64 x86_64 GNU/Linux
sh-4.4# modinfo libceph
filename: /lib/modules/4.18.0-305.34.2.el8_4.x86_64/kernel/net/ceph/libceph.ko.xz
license: GPL
description: Ceph core library
author: Patience Warnick <patience>
author: Yehuda Sadeh <yehuda.net>
author: Sage Weil <sage>
rhelversion: 8.4
srcversion: 56D6E7804420E592C2C8124
depends: libcrc32c,dns_resolver
intree: Y
name: libceph
vermagic: 4.18.0-305.34.2.el8_4.x86_64 SMP mod_unload modversions
sh-4.4# mount | grep pvc-48a06b6b-6e69-48b9-9d3f-6b4d4e4aeff7
/dev/rbd2 on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-48a06b6b-6e69-48b9-9d3f-6b4d4e4aeff7/globalmount/0001-0011-openshift-storage-0000000000000008-37119694-aca7-11ec-b69e-0a580a050c22 type ext4 (rw,relatime,seclabel,stripe=16,_netdev)
/dev/rbd2 on /var/lib/kubelet/pods/fa5c2e5d-ac40-46f9-8969-44afb25f3168/volumes/kubernetes.io~csi/pvc-48a06b6b-6e69-48b9-9d3f-6b4d4e4aeff7/mount type ext4 (rw,relatime,seclabel,stripe=16,_netdev)
We are also attaching the client debug logs as a file.
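As an additional hedged pointer (not in the original report), the kernel client's in-flight OSD requests and its view of the OSD map can be inspected through debugfs to confirm that I/O is blocked; the fsid comes from the ceph status output above, while the client id is hypothetical:

sh-4.4# cat /sys/kernel/debug/ceph/b0d624f6-89ea-11ec-ad08-005056838602.client4567/osdc
sh-4.4# cat /sys/kernel/debug/ceph/b0d624f6-89ea-11ec-ad08-005056838602.client4567/osdmap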
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat Ceph Storage 5.1 Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:4622