Description of problem: See: https://tracker.ceph.com/issues/53194
Hi @pdonnell, can you please help with verifying this bug? Are there any specific steps to follow, or a specific teuthology script that needs to be executed?
Just run the failover test in teuthology (--filter "failover").
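For example, a scheduling command along these lines should pick up only the failover tests; this is just a sketch, and the machine type, email, and branch below are placeholders to replace with your own values:

./teuthology-suite -c master -s fs -m <machine-type> -e <your-email> --filter failover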
Hi @pdonnell, the teuthology runs keep failing for one reason or another.

Command used:
./teuthology-suite -n 10 -c master -s fs --ceph-repo https://github.com/AmarnatReddy/ceph.git --suite-repo https://github.com/AmarnatReddy/ceph.git --suite-branch master /home/amk/rh8x_5.1.yaml -e amk -m clara --distro-version 8.5 --distro rhel -t rh --filter failover

Most recent failure:
Command failed on clara003 with status 1: 'sudo yum remove cephadm ceph-mon ceph-mgr ceph-osd ceph-mds ceph-radosgw ceph-test ceph-selinux ceph-fuse python-rados python-rbd python-cephfs rbd-mirror bison flex elfutils-libelf-devel openssl-devel NetworkManager iproute util-linux libacl-devel libaio-devel libattr-devel libtool libuuid-devel xfsdump xfsprogs xfsprogs-devel libaio-devel libtool libuuid-devel xfsprogs-devel python3-cephfs cephfs-top cephfs-mirror bison flex elfutils-libelf-devel openssl-devel NetworkManager iproute util-linux libacl-devel libaio-devel libattr-devel libtool libuuid-devel xfsdump xfsprogs xfsprogs-devel libaio-devel libtool libuuid-devel xfsprogs-devel python3-cephfs cephfs-top cephfs-mirror -y'

Could you please let me know whether there is any other way I can validate this?

Regards,
Amarnath
Hi Patrick,

I have verified this on the latest build (16.2.7-48.el8cp). After initiating `ceph mds fail 0`, the failed rank comes back to the active state and does not get stuck in up:resolve. I don't see any messages being dropped in the logs.

Commands executed:

[root@ceph-bz-mds-9ozvvy-node7 ~]# ceph fs status
cephfs - 1 clients
======
RANK  STATE   MDS                                     ACTIVITY    DNS  INOS  DIRS  CAPS
 0    replay  cephfs.ceph-bz-mds-9ozvvy-node6.mllnlu                0     0     0     0
 1    active  cephfs.ceph-bz-mds-9ozvvy-node5.btiybz  Reqs: 4 /s  285   169    58   141
POOL                TYPE      USED  AVAIL
cephfs.cephfs.meta  metadata
[root@ceph-bz-mds-9ozvvy-node7 ~]# ceph config set mds mds_sleep_rank_change 10000000.0
[root@ceph-bz-mds-9ozvvy-node7 ~]# ceph config set mds mds_connect_bootstrapping True
[root@ceph-bz-mds-9ozvvy-node7 ~]# ceph -s
  cluster:
    id:     4041e752-888c-11ec-9ac6-fa163e1e31c2
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph-bz-mds-9ozvvy-node1-installer,ceph-bz-mds-9ozvvy-node2,ceph-bz-mds-9ozvvy-node3 (age 40m)
    mgr: ceph-bz-mds-9ozvvy-node1-installer.fzndpb(active, since 43m), standbys: ceph-bz-mds-9ozvvy-node2.znbodr
    mds: 2/2 daemons up, 1 standby
    osd: 12 osds: 12 up (since 39m), 12 in (since 39m)

  data:
    volumes: 1/1 healthy
    pools:   3 pools, 65 pgs
    objects: 1.45k objects, 1.9 GiB
    usage:   6.0 GiB used, 174 GiB / 180 GiB avail
    pgs:     65 active+clean

  io:
    client:   21 MiB/s rd, 63 MiB/s wr, 22 op/s rd, 55 op/s wr

[root@ceph-bz-mds-9ozvvy-node7 ~]# ceph fs status
cephfs - 1 clients
======
RANK  STATE   MDS                                     ACTIVITY     DNS   INOS  DIRS  CAPS
 0    active  cephfs.ceph-bz-mds-9ozvvy-node4.varcwu  Reqs: 11 /s  1330  1331   283  1312
 1    active  cephfs.ceph-bz-mds-9ozvvy-node5.btiybz  Reqs: 11 /s   109   113    61    93
POOL                TYPE      USED   AVAIL
cephfs.cephfs.meta  metadata   101M  54.9G
cephfs.cephfs.data  data      3631M  54.9G
STANDBY MDS
cephfs.ceph-bz-mds-9ozvvy-node6.mllnlu
MDS version: ceph version 16.2.7-48.el8cp (49480538844c9255f03e5b0dccc609ea8fbf2656) pacific (stable)
[root@ceph-bz-mds-9ozvvy-node7 ~]# ceph mds fail 0
failed mds gid 14463
                               111M  54.9G
cephfs.cephfs.data  data      3751M  54.9G
STANDBY MDS
cephfs.ceph-bz-mds-9ozvvy-node4.varcwu
MDS version: ceph version 16.2.7-48.el8cp (49480538844c9255f03e5b0dccc609ea8fbf2656) pacific (stable)
[root@ceph-bz-mds-9ozvvy-node7 ~]# ceph fs status
cephfs - 1 clients
======
RANK  STATE   MDS                                     ACTIVITY    DNS  INOS  DIRS  CAPS
 0    replay  cephfs.ceph-bz-mds-9ozvvy-node6.mllnlu               0     0     0     0
 1    active  cephfs.ceph-bz-mds-9ozvvy-node5.btiybz  Reqs: 0 /s  267   151    58   123
POOL                TYPE      USED   AVAIL
cephfs.cephfs.meta  metadata   111M  54.9G
cephfs.cephfs.data  data      3273M  54.9G
STANDBY MDS
cephfs.ceph-bz-mds-9ozvvy-node4.varcwu
MDS version: ceph version 16.2.7-48.el8cp (49480538844c9255f03e5b0dccc609ea8fbf2656) pacific (stable)
[root@ceph-bz-mds-9ozvvy-node7 ~]# ceph fs status
cephfs - 1 clients
======
RANK  STATE    MDS                                     ACTIVITY    DNS   INOS  DIRS  CAPS
 0    resolve  cephfs.ceph-bz-mds-9ozvvy-node6.mllnlu              2943  1356   290     0
 1    active   cephfs.ceph-bz-mds-9ozvvy-node5.btiybz  Reqs: 0 /s   267   151    58   123
POOL                TYPE      USED   AVAIL
cephfs.cephfs.meta  metadata   111M  55.1G
cephfs.cephfs.data  data      3211M  55.1G
STANDBY MDS
cephfs.ceph-bz-mds-9ozvvy-node4.varcwu
MDS version: ceph version 16.2.7-48.el8cp (49480538844c9255f03e5b0dccc609ea8fbf2656) pacific (stable)
[root@ceph-bz-mds-9ozvvy-node7 ~]# ceph fs status
cephfs - 1 clients
======
RANK  STATE    MDS                                     ACTIVITY    DNS   INOS  DIRS  CAPS
 0    resolve  cephfs.ceph-bz-mds-9ozvvy-node6.mllnlu              2943  1356   290     0
 1    active   cephfs.ceph-bz-mds-9ozvvy-node5.btiybz  Reqs: 0 /s   267   151    58   123
POOL                TYPE      USED   AVAIL
cephfs.cephfs.meta  metadata   111M  55.1G
cephfs.cephfs.data  data      3211M  55.1G
STANDBY MDS
cephfs.ceph-bz-mds-9ozvvy-node4.varcwu
MDS version: ceph version 16.2.7-48.el8cp (49480538844c9255f03e5b0dccc609ea8fbf2656) pacific (stable)
[root@ceph-bz-mds-9ozvvy-node7 ~]# ceph fs status
cephfs - 1 clients
======
RANK  STATE      MDS                                     ACTIVITY    DNS   INOS  DIRS  CAPS
 0    reconnect  cephfs.ceph-bz-mds-9ozvvy-node6.mllnlu              2943  1356   290     0
 1    active     cephfs.ceph-bz-mds-9ozvvy-node5.btiybz  Reqs: 0 /s   267   151    58   119
POOL                TYPE      USED   AVAIL
cephfs.cephfs.meta  metadata   111M  55.1G
cephfs.cephfs.data  data      3211M  55.1G
STANDBY MDS
cephfs.ceph-bz-mds-9ozvvy-node4.varcwu
MDS version: ceph version 16.2.7-48.el8cp (49480538844c9255f03e5b0dccc609ea8fbf2656) pacific (stable)
[root@ceph-bz-mds-9ozvvy-node7 ~]# ceph fs status
cephfs - 1 clients
======
RANK  STATE      MDS                                     ACTIVITY    DNS   INOS  DIRS  CAPS
 0    reconnect  cephfs.ceph-bz-mds-9ozvvy-node6.mllnlu              2943  1356   290     0
 1    active     cephfs.ceph-bz-mds-9ozvvy-node5.btiybz  Reqs: 0 /s   267   151    58   119
POOL                TYPE      USED   AVAIL
cephfs.cephfs.meta  metadata   111M  55.1G
cephfs.cephfs.data  data      3211M  55.1G
STANDBY MDS
cephfs.ceph-bz-mds-9ozvvy-node4.varcwu
MDS version: ceph version 16.2.7-48.el8cp (49480538844c9255f03e5b0dccc609ea8fbf2656) pacific (stable)
[root@ceph-bz-mds-9ozvvy-node7 ~]# ceph fs status
cephfs - 1 clients
======
RANK  STATE   MDS                                     ACTIVITY     DNS   INOS  DIRS  CAPS
 0    active  cephfs.ceph-bz-mds-9ozvvy-node6.mllnlu  Reqs: 30 /s  2988  1356   290    21
 1    active  cephfs.ceph-bz-mds-9ozvvy-node5.btiybz  Reqs: 0 /s    270   142    58   120
POOL                TYPE      USED   AVAIL
cephfs.cephfs.meta  metadata   112M  55.3G
cephfs.cephfs.data  data      2941M  55.3G
STANDBY MDS
cephfs.ceph-bz-mds-9ozvvy-node4.varcwu
MDS version: ceph version 16.2.7-48.el8cp (49480538844c9255f03e5b0dccc609ea8fbf2656) pacific (stable)
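For anyone re-running this verification, the manual sequence above can be scripted roughly as follows. This is only a sketch based on the commands shown in this comment; the filesystem name (cephfs), the polling interval, and the grep pattern against the plain ceph fs status output are assumptions that may need adjusting for another cluster.

#!/usr/bin/env bash
# Rough sketch of the manual verification above: delay rank changes, fail rank 0,
# then watch rank 0 walk through replay/resolve/reconnect back to active.
set -e

# Same config values as used in the verification run above.
ceph config set mds mds_sleep_rank_change 10000000.0
ceph config set mds mds_connect_bootstrapping True

ceph fs status cephfs
ceph mds fail 0

# Poll until rank 0 reports active again (pattern and interval are assumptions).
until ceph fs status cephfs | grep -Eq '^ *0 +active'; do
    ceph fs status cephfs | grep -E '^ *0 ' || true
    sleep 10
done
echo "rank 0 is back to active"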
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat Ceph Storage 5.1 Security, Enhancement, and Bug Fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:1174