Bug 2223249
| Summary: | NFS server is getting crashed and moving to error state | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Ceph Storage | Reporter: | Amarnath <amk> |
| Component: | NFS-Ganesha | Assignee: | Frank Filz <ffilz> |
| Status: | ON_QA --- | QA Contact: | Manisha Saini <msaini> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 6.1 | CC: | cephqe-warriors, hyelloji, kdreyer, ngangadh, sostapov, vdas, vereddy |
| Target Milestone: | --- | ||
| Target Release: | 7.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | nfs-ganesha-5.5-1.el9cp | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | Type: | Bug | |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Missed the window for 6.1 z1. Retargeting to 6.1 z2.

What version of Ganesha is running? Do you have a stack backtrace from the crash? I'm going to assume it's the known crash fixed in V5.2. We should have the latest version (V5.4) available soon.

@amk - Can you retest in the latest version and update the BZ? I see a new build is available with nfs-ganesha v5.5 (https://bugzilla.redhat.com/show_bug.cgi?id=2232674#c4)

Hi Hemanth,

On the latest build we are not hitting the issue.
NFS version: 5.5
Logs : http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-OOSH4J/

    [root@ceph-nfs-reg-oosh4j-node8 ~]# ceph orch ps
    NAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID
    mds.cephfs.ceph-nfs-reg-oosh4j-node3.yswqxh ceph-nfs-reg-oosh4j-node3 running (21m) - 21m 17.9M - 18.2.0-17.el9cp 468120a40641 a059459d1478
    mds.cephfs.ceph-nfs-reg-oosh4j-node4.fccokk ceph-nfs-reg-oosh4j-node4 running (21m) - 21m 64.4M - 18.2.0-17.el9cp 468120a40641 e7ac0645cd36
    mds.cephfs.ceph-nfs-reg-oosh4j-node5.rkuxuc ceph-nfs-reg-oosh4j-node5 running (21m) - 21m 18.3M - 18.2.0-17.el9cp 468120a40641 5da8bfb9a4bd
    mds.cephfs.ceph-nfs-reg-oosh4j-node6.lopvxu ceph-nfs-reg-oosh4j-node6 running (21m) - 21m 16.5M - 18.2.0-17.el9cp 468120a40641 ac15caa841dd
    mds.cephfs.ceph-nfs-reg-oosh4j-node7.cxfpyi ceph-nfs-reg-oosh4j-node7 running (21m) - 21m 61.8M - 18.2.0-17.el9cp 468120a40641 eae04546b0ef
    mgr.ceph-nfs-reg-oosh4j-node1-installer.qsermm ceph-nfs-reg-oosh4j-node1-installer *:9283,8765 running (29m) - 29m 942M - 18.2.0-17.el9cp 468120a40641 4fa9fe22c3bd
    mgr.ceph-nfs-reg-oosh4j-node2.pzxfsr ceph-nfs-reg-oosh4j-node2 *:8443,8765 running (27m) - 27m 433M - 18.2.0-17.el9cp 468120a40641 cda90e9dfd6d
    mon.ceph-nfs-reg-oosh4j-node1-installer ceph-nfs-reg-oosh4j-node1-installer running (29m) - 29m 52.3M 2048M 18.2.0-17.el9cp 468120a40641 04a9c3f696ff
    mon.ceph-nfs-reg-oosh4j-node2 ceph-nfs-reg-oosh4j-node2 running (25m) - 25m 47.9M 2048M 18.2.0-17.el9cp 468120a40641 86f5940d2962
    mon.ceph-nfs-reg-oosh4j-node3 ceph-nfs-reg-oosh4j-node3 running (25m) - 25m 46.7M 2048M 18.2.0-17.el9cp 468120a40641 b5463561eb58
    nfs.cephfs-nfs.0.0.ceph-nfs-reg-oosh4j-node6.hfuigq ceph-nfs-reg-oosh4j-node6 *:2049 running (13m) - 13m 845M - 5.5 468120a40641 3956df735221
    osd.0 ceph-nfs-reg-oosh4j-node4 running (21m) - 21m 191M 4096M 18.2.0-17.el9cp 468120a40641 0af25e2cc84e
    osd.1 ceph-nfs-reg-oosh4j-node6 running (21m) - 21m 195M 4096M 18.2.0-17.el9cp 468120a40641 c42b26410fa4
    osd.2 ceph-nfs-reg-oosh4j-node5 running (21m) - 21m 249M 4096M 18.2.0-17.el9cp 468120a40641 bcc99a3137de
    osd.3 ceph-nfs-reg-oosh4j-node4 running (21m) - 21m 274M 4096M 18.2.0-17.el9cp 468120a40641 7b2bf51933ff
    osd.4 ceph-nfs-reg-oosh4j-node6 running (21m) - 21m 234M 4096M 18.2.0-17.el9cp 468120a40641 cb05d9df3d7d
    osd.5 ceph-nfs-reg-oosh4j-node5 running (21m) - 21m 255M 4096M 18.2.0-17.el9cp 468120a40641 fd04084d8b7d
    osd.6 ceph-nfs-reg-oosh4j-node6 running (21m) - 21m 220M 4096M 18.2.0-17.el9cp 468120a40641 1c3d0be85a4c
    osd.7 ceph-nfs-reg-oosh4j-node4 running (21m) - 21m 290M 4096M 18.2.0-17.el9cp 468120a40641 762f16cb0347
    osd.8 ceph-nfs-reg-oosh4j-node5 running (21m) - 21m 217M 4096M 18.2.0-17.el9cp 468120a40641 30008d3daa8e
    osd.9 ceph-nfs-reg-oosh4j-node6 running (21m) - 21m 174M 4096M 18.2.0-17.el9cp 468120a40641 1d287b0b11ec
    osd.10 ceph-nfs-reg-oosh4j-node4 running (21m) - 21m 198M 4096M 18.2.0-17.el9cp 468120a40641 ecac226d6a48
    osd.11 ceph-nfs-reg-oosh4j-node5 running (21m) - 21m 236M 4096M 18.2.0-17.el9cp 468120a40641 3217c23755fd

Regards,
Amarnath
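For anyone repeating the retest, the NFS-Ganesha version reported by the orchestrator can be pulled out of the `ceph orch ps` listing. This is only a minimal sketch: it assumes the last three whitespace-separated fields of each row are VERSION, IMAGE ID and CONTAINER ID, as in the output above, and the function name `nfs_daemon_versions` is hypothetical, not part of any Ceph tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `ceph orch ps` text on stdin and prints
# "<daemon name> <version>" for every daemon whose name starts with "nfs.".
# Assumption: VERSION is the third field from the end of each row.
nfs_daemon_versions() {
    awk '$1 ~ /^nfs\./ { print $1, $(NF-2) }'
}

# Possible usage on a cluster node (not run here):
#   ceph orch ps | nfs_daemon_versions
```

For a daemon in the error state the version field is reported as unknown in the listing, so the sketch would print that placeholder rather than a version number.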
Description of problem:
The NFS server crashes and moves to the error state. We are using CephFS as the backend storage.

Steps Followed:
We encountered this when running our automation script. It fails while running:

    for n in {1..20}; do dd if=/dev/urandom of=/mnt/nfs_VA6JX/volumes/_nogroup/subvolume2/6c7638e9-bb94-49a2-a832-703807b264e1/file$(printf %03d $n) bs=500k count=1000; done

NFS logs :
Automation script logs : http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-S0SH7I/cephfs_nfs_snapshot_clone_operations_0.log
NFS server logs : http://magna002.ceph.redhat.com/ceph-qe-logs/amar/BZ_NFS_logs.txt

    [root@ceph-amk-test-o56vqd-node6 edf01e48-21a9-11ee-becc-fa163e6d5609]# podman ps
    CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
    41a4a13349a8 registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:cf8710ef94bf3dcb65b998f90ce0c0ecf80bee8541fe6034f95d252500046cfd -n osd.2 -f --set... About an hour ago Up About an hour ceph-edf01e48-21a9-11ee-becc-fa163e6d5609-osd-2
    8c8a2ae00d8d registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:cf8710ef94bf3dcb65b998f90ce0c0ecf80bee8541fe6034f95d252500046cfd -n osd.8 -f --set... About an hour ago Up About an hour ceph-edf01e48-21a9-11ee-becc-fa163e6d5609-osd-8
    2e4f25e306fb registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:cf8710ef94bf3dcb65b998f90ce0c0ecf80bee8541fe6034f95d252500046cfd -n osd.11 -f --se... About an hour ago Up About an hour ceph-edf01e48-21a9-11ee-becc-fa163e6d5609-osd-11
    ea356097720c registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:cf8710ef94bf3dcb65b998f90ce0c0ecf80bee8541fe6034f95d252500046cfd -n osd.5 -f --set... About an hour ago Up About an hour ceph-edf01e48-21a9-11ee-becc-fa163e6d5609-osd-5
    e4bc52d0a37d registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:cf8710ef94bf3dcb65b998f90ce0c0ecf80bee8541fe6034f95d252500046cfd -n mds.cephfs.cep... 59 minutes ago Up 59 minutes ceph-edf01e48-21a9-11ee-becc-fa163e6d5609-mds-cephfs-ceph-amk-test-o56vqd-node6-ibqcai
    272e4942bc8d registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:cf8710ef94bf3dcb65b998f90ce0c0ecf80bee8541fe6034f95d252500046cfd -F -L STDERR -N N... 2 seconds ago Up 2 seconds ceph-edf01e48-21a9-11ee-becc-fa163e6d5609-nfs-cephfs-nfs-0-0-ceph-amk-test-o56vqd-node6-uppiec
    [root@ceph-amk-test-o56vqd-node6 edf01e48-21a9-11ee-becc-fa163e6d5609]# podman logs 272e4942bc8d
    [root@ceph-amk-test-o56vqd-node6 edf01e48-21a9-11ee-becc-fa163e6d5609]# podman ps
    CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
    41a4a13349a8 registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:cf8710ef94bf3dcb65b998f90ce0c0ecf80bee8541fe6034f95d252500046cfd -n osd.2 -f --set... About an hour ago Up About an hour ceph-edf01e48-21a9-11ee-becc-fa163e6d5609-osd-2
    8c8a2ae00d8d registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:cf8710ef94bf3dcb65b998f90ce0c0ecf80bee8541fe6034f95d252500046cfd -n osd.8 -f --set... About an hour ago Up About an hour ceph-edf01e48-21a9-11ee-becc-fa163e6d5609-osd-8
    2e4f25e306fb registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:cf8710ef94bf3dcb65b998f90ce0c0ecf80bee8541fe6034f95d252500046cfd -n osd.11 -f --se... About an hour ago Up About an hour ceph-edf01e48-21a9-11ee-becc-fa163e6d5609-osd-11
    ea356097720c registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:cf8710ef94bf3dcb65b998f90ce0c0ecf80bee8541fe6034f95d252500046cfd -n osd.5 -f --set... About an hour ago Up About an hour ceph-edf01e48-21a9-11ee-becc-fa163e6d5609-osd-5
    e4bc52d0a37d registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:cf8710ef94bf3dcb65b998f90ce0c0ecf80bee8541fe6034f95d252500046cfd -n mds.cephfs.cep... About an hour ago Up About an hour ceph-edf01e48-21a9-11ee-becc-fa163e6d5609-mds-cephfs-ceph-amk-test-o56vqd-node6-ibqcai

    [root@ceph-amk-test-o56vqd-node9 ~]# ceph orch ps
    NAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID
    mds.cephfs.ceph-amk-test-o56vqd-node3.bjjbwo ceph-amk-test-o56vqd-node3 running (63m) 3m ago 18h 17.4M - 17.2.6-96.el9cp 85cb9476225e 870a32dca594
    mds.cephfs.ceph-amk-test-o56vqd-node4.xcfydx ceph-amk-test-o56vqd-node4 running (63m) 3m ago 18h 65.8M - 17.2.6-96.el9cp 85cb9476225e 618abd182541
    mds.cephfs.ceph-amk-test-o56vqd-node5.asejol ceph-amk-test-o56vqd-node5 running (63m) 3m ago 18h 89.8M - 17.2.6-96.el9cp 85cb9476225e bc88ffc160c2
    mds.cephfs.ceph-amk-test-o56vqd-node6.ibqcai ceph-amk-test-o56vqd-node6 running (63m) 9m ago 18h 69.7M - 17.2.6-96.el9cp 85cb9476225e e4bc52d0a37d
    mds.cephfs.ceph-amk-test-o56vqd-node7.mumwju ceph-amk-test-o56vqd-node7 running (63m) 6m ago 18h 70.3M - 17.2.6-96.el9cp 85cb9476225e b6251c0a6620
    mgr.ceph-amk-test-o56vqd-node1-installer.hcqtsu ceph-amk-test-o56vqd-node1-installer *:9283 running (18h) 6m ago 18h 408M - 17.2.6-96.el9cp 85cb9476225e c9e2cb37570e
    mgr.ceph-amk-test-o56vqd-node2.bldjxz ceph-amk-test-o56vqd-node2 *:8443 running (18h) 6m ago 18h 500M - 17.2.6-96.el9cp 85cb9476225e 41358a1086f1
    mon.ceph-amk-test-o56vqd-node1-installer ceph-amk-test-o56vqd-node1-installer running (63m) 6m ago 18h 109M 2048M 17.2.6-96.el9cp 85cb9476225e f458e2e97bed
    mon.ceph-amk-test-o56vqd-node2 ceph-amk-test-o56vqd-node2 running (63m) 6m ago 18h 100M 2048M 17.2.6-96.el9cp 85cb9476225e c2c25a66ac31
    mon.ceph-amk-test-o56vqd-node3 ceph-amk-test-o56vqd-node3 running (63m) 3m ago 18h 104M 2048M 17.2.6-96.el9cp 85cb9476225e 8580c3977a8e
    nfs.cephfs-nfs.0.0.ceph-amk-test-o56vqd-node6.uppiec ceph-amk-test-o56vqd-node6 *:2049 running (9m) 9m ago 72m 15.0M - 5.1 85cb9476225e 53c4ca2ce000
    osd.0 ceph-amk-test-o56vqd-node5 running (70m) 3m ago 18h 492M 4096M 17.2.6-96.el9cp 85cb9476225e 875d3b758333
    osd.1 ceph-amk-test-o56vqd-node4 running (72m) 3m ago 18h 470M 4096M 17.2.6-96.el9cp 85cb9476225e 86984e2016f1
    osd.2 ceph-amk-test-o56vqd-node6 running (67m) 9m ago 18h 446M 4096M 17.2.6-96.el9cp 85cb9476225e 41a4a13349a8
    osd.3 ceph-amk-test-o56vqd-node5 running (69m) 3m ago 18h 459M 4096M 17.2.6-96.el9cp 85cb9476225e f330e2de7e64
    osd.4 ceph-amk-test-o56vqd-node4 running (71m) 3m ago 18h 524M 4096M 17.2.6-96.el9cp 85cb9476225e 946f3fdee22e
    osd.5 ceph-amk-test-o56vqd-node6 running (67m) 9m ago 18h 516M 4096M 17.2.6-96.el9cp 85cb9476225e ea356097720c
    osd.6 ceph-amk-test-o56vqd-node5 running (69m) 3m ago 18h 499M 4096M 17.2.6-96.el9cp 85cb9476225e e98cc06ac9c3
    osd.7 ceph-amk-test-o56vqd-node4 running (71m) 3m ago 18h 480M 4096M 17.2.6-96.el9cp 85cb9476225e 01031abcea23
    osd.8 ceph-amk-test-o56vqd-node6 running (67m) 9m ago 18h 421M 4096M 17.2.6-96.el9cp 85cb9476225e 8c8a2ae00d8d
    osd.9 ceph-amk-test-o56vqd-node5 running (69m) 3m ago 18h 415M 4096M 17.2.6-96.el9cp 85cb9476225e eb569f668458
    osd.10 ceph-amk-test-o56vqd-node4 running (72m) 3m ago 18h 446M 4096M 17.2.6-96.el9cp 85cb9476225e 19513209737c
    osd.11 ceph-amk-test-o56vqd-node6 running (67m) 9m ago 18h 385M 4096M 17.2.6-96.el9cp 85cb9476225e 2e4f25e306fb
    [root@ceph-amk-test-o56vqd-node9 ~]# ceph orch ps
    NAME HOST PORTS STATUS REFRESHED AGE MEM USE MEM LIM VERSION IMAGE ID CONTAINER ID
    mds.cephfs.ceph-amk-test-o56vqd-node3.bjjbwo ceph-amk-test-o56vqd-node3 running (65m) 4m ago 18h 17.4M - 17.2.6-96.el9cp 85cb9476225e 870a32dca594
    mds.cephfs.ceph-amk-test-o56vqd-node4.xcfydx ceph-amk-test-o56vqd-node4 running (64m) 4m ago 18h 65.8M - 17.2.6-96.el9cp 85cb9476225e 618abd182541
    mds.cephfs.ceph-amk-test-o56vqd-node5.asejol ceph-amk-test-o56vqd-node5 running (64m) 4m ago 18h 89.8M - 17.2.6-96.el9cp 85cb9476225e bc88ffc160c2
    mds.cephfs.ceph-amk-test-o56vqd-node6.ibqcai ceph-amk-test-o56vqd-node6 running (64m) 15s ago 18h 75.3M - 17.2.6-96.el9cp 85cb9476225e e4bc52d0a37d
    mds.cephfs.ceph-amk-test-o56vqd-node7.mumwju ceph-amk-test-o56vqd-node7 running (64m) 7m ago 18h 70.3M - 17.2.6-96.el9cp 85cb9476225e b6251c0a6620
    mgr.ceph-amk-test-o56vqd-node1-installer.hcqtsu ceph-amk-test-o56vqd-node1-installer *:9283 running (18h) 7m ago 18h 408M - 17.2.6-96.el9cp 85cb9476225e c9e2cb37570e
    mgr.ceph-amk-test-o56vqd-node2.bldjxz ceph-amk-test-o56vqd-node2 *:8443 running (18h) 7m ago 18h 500M - 17.2.6-96.el9cp 85cb9476225e 41358a1086f1
    mon.ceph-amk-test-o56vqd-node1-installer ceph-amk-test-o56vqd-node1-installer running (65m) 7m ago 18h 109M 2048M 17.2.6-96.el9cp 85cb9476225e f458e2e97bed
    mon.ceph-amk-test-o56vqd-node2 ceph-amk-test-o56vqd-node2 running (65m) 7m ago 18h 100M 2048M 17.2.6-96.el9cp 85cb9476225e c2c25a66ac31
    mon.ceph-amk-test-o56vqd-node3 ceph-amk-test-o56vqd-node3 running (65m) 4m ago 18h 104M 2048M 17.2.6-96.el9cp 85cb9476225e 8580c3977a8e
    nfs.cephfs-nfs.0.0.ceph-amk-test-o56vqd-node6.uppiec ceph-amk-test-o56vqd-node6 *:2049 error 15s ago 74m - - <unknown> <unknown> <unknown>
    osd.0 ceph-amk-test-o56vqd-node5 running (71m) 4m ago 18h 492M 4096M 17.2.6-96.el9cp 85cb9476225e 875d3b758333
    osd.1 ceph-amk-test-o56vqd-node4 running (73m) 4m ago 18h 470M 4096M 17.2.6-96.el9cp 85cb9476225e 86984e2016f1
    osd.2 ceph-amk-test-o56vqd-node6 running (69m) 15s ago 18h 486M 4096M 17.2.6-96.el9cp 85cb9476225e 41a4a13349a8
    osd.3 ceph-amk-test-o56vqd-node5 running (70m) 4m ago 18h 459M 4096M 17.2.6-96.el9cp 85cb9476225e f330e2de7e64
    osd.4 ceph-amk-test-o56vqd-node4 running (72m) 4m ago 18h 524M 4096M 17.2.6-96.el9cp 85cb9476225e 946f3fdee22e
    osd.5 ceph-amk-test-o56vqd-node6 running (68m) 15s ago 18h 540M 4096M 17.2.6-96.el9cp 85cb9476225e ea356097720c
    osd.6 ceph-amk-test-o56vqd-node5 running (70m) 4m ago 18h 499M 4096M 17.2.6-96.el9cp 85cb9476225e e98cc06ac9c3
    osd.7 ceph-amk-test-o56vqd-node4 running (73m) 4m ago 18h 480M 4096M 17.2.6-96.el9cp 85cb9476225e 01031abcea23
    osd.8 ceph-amk-test-o56vqd-node6 running (68m) 15s ago 18h 436M 4096M 17.2.6-96.el9cp 85cb9476225e 8c8a2ae00d8d
    osd.9 ceph-amk-test-o56vqd-node5 running (71m) 4m ago 18h 415M 4096M 17.2.6-96.el9cp 85cb9476225e eb569f668458
    osd.10 ceph-amk-test-o56vqd-node4 running (73m) 4m ago 18h 446M 4096M 17.2.6-96.el9cp 85cb9476225e 19513209737c
    osd.11 ceph-amk-test-o56vqd-node6 running (68m) 15s ago 18h 406M 4096M 17.2.6-96.el9cp 85cb9476225e 2e4f25e306fb
    [root@ceph-amk-test-o56vqd-node9 ~]#

Version-Release number of selected component (if applicable):

How reproducible:
1/1

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
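
The failure shows up in the second `ceph orch ps` listing as an `nfs.*` row whose STATUS is `error`. A minimal sketch of how an automation run might flag that condition follows. It assumes the plain-text layout shown above and simply scans each `nfs.*` row for a field equal to `error`, since the STATUS position shifts when PORTS is empty. The function name `nfs_daemons_in_error` is hypothetical and not part of Ceph or of the automation script referenced in this report.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `ceph orch ps` text on stdin and prints the
# name of every nfs.* daemon whose row contains the status token "error".
# Assumption: no other column of an nfs.* row is literally "error".
nfs_daemons_in_error() {
    awk '$1 ~ /^nfs\./ {
        for (i = 2; i <= NF; i++) {
            if ($i == "error") { print $1; break }
        }
    }'
}

# Possible usage after the dd write loop finishes (not run here):
#   failed=$(ceph orch ps | nfs_daemons_in_error)
#   [ -z "$failed" ] || echo "NFS daemon(s) in error state: $failed"
```

Scanning fields rather than fixing a column index keeps the sketch tolerant of rows with and without a PORTS value, at the cost of a false match if a hostname were ever literally `error`.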