Bug 2304292
| Summary: | [cephfs] mds stuck in clientreplay state | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Amarnath <amk> |
| Component: | CephFS | Assignee: | Venky Shankar <vshankar> |
| Status: | CLOSED DUPLICATE | QA Contact: | Hemanth Kumar <hyelloji> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 8.0 | CC: | ceph-eng-bugs, cephqe-warriors |
| Target Milestone: | --- | ||
| Target Release: | 8.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2024-08-22 04:37:31 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Please specify the severity of this bug. Severity is defined here: https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.
Description of problem:

We have a 4-node cluster with the roles below:

```
[root@mero014 ~]# ceph orch host ls
HOST     ADDR          LABELS                                                 STATUS
mero017  10.8.129.237  _admin,osd,mon,mgr,rgw,installer
mero018  10.8.129.238  osd,_admin,mon,mgr,rgw
mero019  10.8.129.239  osd-bak,mgr,mon,mds
mero020  10.8.129.240  node-exporter,alertmanager,osd,mds,grafana,prometheus
4 hosts in cluster
[root@mero014 ~]#
```

Created a filesystem and set max_mds to 2:

```
[root@mero014 ~]# ceph fs status
cephfs - 27 clients
======
RANK  STATE         MDS                    ACTIVITY     DNS    INOS   DIRS   CAPS
 0    active        cephfs.mero020.rfboyy  Reqs: 0 /s   125     28     23     32
 1    clientreplay  cephfs.mero020.pudtvz                10     13     11      0
        POOL           TYPE     USED  AVAIL
cephfs.cephfs.meta  metadata   537M  96.0T
cephfs.cephfs.data    data    30.0G  96.0T
cephfs-ec - 87 clients
=========
RANK  STATE   MDS                       ACTIVITY     DNS    INOS   DIRS   CAPS
 0    active  cephfs-ec.mero018.icqakl  Reqs: 0 /s    14     17     12     76
 1    active  cephfs-ec.mero019.oisevx  Reqs: 0 /s    10     13     12     28
         POOL             TYPE     USED  AVAIL
cephfs.cephfs-ec.meta  metadata   816k  96.0T
cephfs.cephfs-ec.data    data    3000M  96.0T
cephfs_1 - 1 clients
========
RANK  STATE   MDS                      ACTIVITY     DNS    INOS   DIRS   CAPS
 0    active  cephfs_1.mero020.rwomqq  Reqs: 0 /s    10     13     12      1
        POOL            TYPE      USED  AVAIL
cephfs.cephfs_1.meta  metadata   96.0k  96.0T
cephfs.cephfs_1.data    data        0  96.0T
cephfs_io - 1 clients
=========
RANK  STATE   MDS                       ACTIVITY     DNS    INOS   DIRS   CAPS
 0    active  cephfs_io.mero017.cyawhn  Reqs: 0 /s    10     13     12      1
         POOL            TYPE      USED  AVAIL
cephfs.cephfs_io.meta  metadata    101k  96.0T
cephfs.cephfs_io.data    data        0  96.0T
STANDBY MDS
cephfs_1.mero017.vcsaum
cephfs.mero018.kkgluj
cephfs.mero020.paudfu
cephfs.mero019.ifxmuk
cephfs-ec.mero020.ymclas
cephfs-ec.mero019.urydro
cephfs_io.mero019.mbyyis
MDS version: ceph version 19.1.0-22.el9cp (e5b7dfedb7d8a66d166eb0f98361f71bdb7905ad) squid (rc)
[root@mero014 ~]#
```

Note that both active MDS daemons for the `cephfs` filesystem (rank 0 and the rank 1 stuck in clientreplay) were deployed on the same node, mero020. The cluster entered this state after running the baremetal suites.
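For triage, any rank not in the `active` state can be picked out of the `ceph fs status` output mechanically. The sketch below is a hedged illustration, not part of the report: it runs awk over the two rank lines for `cephfs` quoted above (in practice you would pipe in live `ceph fs status` output and filter out the header and pool lines first).

```shell
#!/bin/sh
# Hedged sketch: flag MDS ranks that are not "active".
# "sample" is the rank-line excerpt for the "cephfs" filesystem from
# the report; field 1 is the rank, field 2 is the state.
sample='0 active cephfs.mero020.rfboyy Reqs: 0 /s 125 28 23 32
1 clientreplay cephfs.mero020.pudtvz 10 13 11 0'

printf '%s\n' "$sample" | awk '$2 != "active" { print "rank " $1 " stuck in " $2 }'
```

Run against the excerpt, this prints `rank 1 stuck in clientreplay`, matching the stuck rank in the status output.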
MDS logs: http://magna002.ceph.redhat.com/ceph-qe-logs/amk_1/mds_clientreply/

Version-Release number of selected component (if applicable):

```
[root@mero014 ~]# ceph versions
{
    "mon": {
        "ceph version 19.1.0-22.el9cp (e5b7dfedb7d8a66d166eb0f98361f71bdb7905ad) squid (rc)": 3
    },
    "mgr": {
        "ceph version 19.1.0-22.el9cp (e5b7dfedb7d8a66d166eb0f98361f71bdb7905ad) squid (rc)": 3
    },
    "osd": {
        "ceph version 19.1.0-22.el9cp (e5b7dfedb7d8a66d166eb0f98361f71bdb7905ad) squid (rc)": 44
    },
    "mds": {
        "ceph version 19.1.0-22.el9cp (e5b7dfedb7d8a66d166eb0f98361f71bdb7905ad) squid (rc)": 13
    },
    "rgw": {
        "ceph version 19.1.0-22.el9cp (e5b7dfedb7d8a66d166eb0f98361f71bdb7905ad) squid (rc)": 2
    },
    "overall": {
        "ceph version 19.1.0-22.el9cp (e5b7dfedb7d8a66d166eb0f98361f71bdb7905ad) squid (rc)": 65
    }
}
```

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
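The report does not state a workaround, but a common remediation for a rank stuck in clientreplay (an assumption on my part, not something the reporter tried) is to fail that rank so a standby MDS takes over and replays the journal. The sketch uses the standard `ceph mds fail` and `ceph fs status` commands with `DRYRUN=echo` so it only prints the commands; drop the `$DRYRUN` prefix to run against a live cluster.

```shell
#!/bin/sh
# Hedged sketch: fail the stuck rank so a standby replaces it.
# DRYRUN=echo makes this a no-op that prints the intended commands.
DRYRUN=echo
$DRYRUN ceph mds fail cephfs:1    # fail rank 1 of fs "cephfs" (the clientreplay rank)
$DRYRUN ceph fs status cephfs     # verify rank 1 comes back as active
```

Whether this actually clears the state here is unknown; the bug was closed as a duplicate, so the root cause is tracked elsewhere.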