Created attachment 2076847
Ganesha unit journal logs

Description of problem:

During the RHCS 8 validation, the ceph-nfs cluster is unable to start with the proxy-protocol option introduced to support Manila with a CephIngress daemon. In particular, the following stacktrace is found:

''''
Feb 17 04:07:21 ceph-e36ghuwn-0 ceph-3b6a1b36-5e55-5bfd-80a8-5d723433981e-nfs-cephfs-2-0-ceph-e36ghuwn-0-qppsbw[403604]: 17/02/2025 09:07:21 : epoch 67b2fc2f : ceph-e36ghuwn-0 : ganesha.nfsd-2[svc_6] rpc :TIRPC :EVENT :handle_haproxy_header: 0x7f41540095a0 fd 26 proxy header rest len failed header rlen = % (will set dead)
Feb 17 04:07:22 ceph-e36ghuwn-0 ceph-3b6a1b36-5e55-5bfd-80a8-5d723433981e-nfs-cephfs-2-0-ceph-e36ghuwn-0-qppsbw[403604]: 17/02/2025 09:07:22 : epoch 67b2fc2f : ceph-e36ghuwn-0 : ganesha.nfsd-2[svc_11] rpc :TIRPC :EVENT :handle_haproxy_header: 0x7f41540095a0 fd 26 proxy header rest len failed header rlen = % (will set dead)
Feb 17 04:07:23 ceph-e36ghuwn-0 ceph-3b6a1b36-5e55-5bfd-80a8-5d723433981e-nfs-cephfs-2-0-ceph-e36ghuwn-0-qppsbw[403604]: 17/02/2025 09:07:23 : epoch 67b2fc2f : ceph-e36ghuwn-0 : ganesha.nfsd-2[svc_2] rpc :TIRPC :EVENT :handle_haproxy_header: 0x7f4148002760 fd 26 proxy header rest len failed header rlen = % (will set dead)
Feb 17 04:07:23 ceph-e36ghuwn-0 ceph-3b6a1b36-5e55-5bfd-80a8-5d723433981e-nfs-cephfs-2-0-ceph-e36ghuwn-0-qppsbw[403604]: 17/02/2025 09:07:23 : epoch 67b2fc2f : ceph-e36ghuwn-0 : ganesha.nfsd-2[svc_6] rpc :TIRPC :EVENT :handle_haproxy_header: 0x7f414c0018f0 fd 26 proxy header rest len failed header rlen = % (will set dead)
Feb 17 04:07:24 ceph-e36ghuwn-0 ceph-3b6a1b36-5e55-5bfd-80a8-5d723433981e-nfs-cephfs-2-0-ceph-e36ghuwn-0-qppsbw[403604]: 17/02/2025 09:07:24 : epoch 67b2fc2f : ceph-e36ghuwn-0 : ganesha.nfsd-2[svc_4] rpc :TIRPC :EVENT :handle_haproxy_header: 0x7f4134003d10 fd 26 proxy ignored for local
Feb 17 04:07:25 ceph-e36ghuwn-0 systemd-coredump[403711]: [🡕] Process 403608 (ganesha.nfsd) of user 0 dumped core.

Stack trace of thread 38:
#0  0x00007f419e488536 n/a (/usr/lib64/libntirpc.so.6.0.1 + 0x22536)
#1  0x0000000000000000 n/a (n/a + 0x0)
#2  0x00007f419e492c90 n/a (/usr/lib64/libntirpc.so.6.0.1 + 0x2cc90)
ELF object binary architecture: AMD x86-64
Subject: Process 403608 (ganesha.nfsd) dumped core
Defined-By: systemd
Support: https://access.redhat.com/support
Documentation: man:core(5)
''''

Version-Release number of selected component (if applicable):

RHCS 8 container with the following ganesha packages:

nfs-ganesha-selinux-6.0-8.1.el9cp.noarch
nfs-ganesha-6.0-8.1.el9cp.x86_64
nfs-ganesha-rgw-6.0-8.1.el9cp.x86_64
nfs-ganesha-ceph-6.0-8.1.el9cp.x86_64
nfs-ganesha-rados-grace-6.0-8.1.el9cp.x86_64
nfs-ganesha-rados-urls-6.0-8.1.el9cp.x86_64
nfs-ganesha-utils-6.0-8.1.el9cp.x86_64

How reproducible:

```
ceph nfs cluster create cephfs '--placement=ceph-e36ghuwn-0;ceph-e36ghuwn-1;ceph-e36ghuwn-2' --ingress --virtual-ip=192.168.122.2 --ingress-mode=haproxy-protocol
```

```
$ ceph orch ls
NAME                     PORTS                    RUNNING  REFRESHED  AGE  PLACEMENT
crash                                             3/3      5m ago     2d   *
ingress.nfs.cephfs       192.168.122.2:2049,9049  6/6      5m ago     5h   ceph-e36ghuwn-0;ceph-e36ghuwn-1;ceph-e36ghuwn-2
ingress.rgw.default      192.168.122.2:8080,8999  2/2      5m ago     2d   count:1
mds.cephfs                                        3/3      5m ago     2d   ceph-e36ghuwn-0;ceph-e36ghuwn-1;ceph-e36ghuwn-2
mgr                                               3/3      5m ago     2d   ceph-e36ghuwn-0;ceph-e36ghuwn-1;ceph-e36ghuwn-2
mon                                               3/3      5m ago     2d   ceph-e36ghuwn-0;ceph-e36ghuwn-1;ceph-e36ghuwn-2
nfs.cephfs               ?:12049                  0/3      5m ago     5h   ceph-e36ghuwn-0;ceph-e36ghuwn-1;ceph-e36ghuwn-2
osd.default_drive_group                           9        5m ago     2d   ceph-e36ghuwn-0;ceph-e36ghuwn-1;ceph-e36ghuwn-2
rgw.rgw                  ?:8082                   3/3      5m ago     2d   ceph-e36ghuwn-0;ceph-e36ghuwn-1;ceph-e36ghuwn-2
```

```
[ceph: root@ceph-e36ghuwn-0 /]# ceph orch ps | grep -i nfs
haproxy.nfs.cephfs.ceph-e36ghuwn-0.tijlzs     ceph-e36ghuwn-0  *:2049,9049  running (5h)  6m ago  2d  10.6M  -  2.4.22-f8e3218  73c7c53e4888  8119448a5c19
haproxy.nfs.cephfs.ceph-e36ghuwn-1.ttayqn     ceph-e36ghuwn-1  *:2049,9049  running (5h)  6m ago  2d  13.6M  -  2.4.22-f8e3218  73c7c53e4888  7ca849babbd3
haproxy.nfs.cephfs.ceph-e36ghuwn-2.befmsu     ceph-e36ghuwn-2  *:2049,9049  running (5h)  6m ago  2d  9768k  -  2.4.22-f8e3218  73c7c53e4888  097638a4c20e
keepalived.nfs.cephfs.ceph-e36ghuwn-0.gzlbkw  ceph-e36ghuwn-0               running (2d)  6m ago  2d  1644k  -  2.2.8           c63687d7cfa0  39ccb21f414a
keepalived.nfs.cephfs.ceph-e36ghuwn-1.hwnynw  ceph-e36ghuwn-1               running (2d)  6m ago  2d  1640k  -  2.2.8           c63687d7cfa0  55c0353ffb10
keepalived.nfs.cephfs.ceph-e36ghuwn-2.dozadz  ceph-e36ghuwn-2               running (2d)  6m ago  2d  1640k  -  2.2.8           c63687d7cfa0  f643614c1380
nfs.cephfs.0.0.ceph-e36ghuwn-1.awvbjw         ceph-e36ghuwn-1  *:12049      error         6m ago  5h  -      -  <unknown>       <unknown>     <unknown>
nfs.cephfs.1.0.ceph-e36ghuwn-2.gohgiy         ceph-e36ghuwn-2  *:12049      error         6m ago  5h  -      -  <unknown>       <unknown>     <unknown>
nfs.cephfs.2.0.ceph-e36ghuwn-0.qppsbw         ceph-e36ghuwn-0  *:12049      error         6m ago  5h  -      -  <unknown>       <unknown>     <unknown>
```
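
For context on the `handle_haproxy_header` messages above, here is a minimal Python sketch of how a PROXY protocol v2 header is framed: a 16-byte fixed part (12-byte signature, version/command byte, family/protocol byte, 2-byte big-endian length) followed by that many bytes of address data. It assumes the "rest len" in the log refers to reading that variable-length remainder; this is a reading of the log message, not a description of the ntirpc code. The function names `parse_pp2_fixed_header` and `split_pp2` are hypothetical and exist only for illustration.

```python
import struct

# 12-byte signature that opens every PROXY protocol v2 header.
PP2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"
# signature (12) + version/command (1) + family/protocol (1) + length (2)
PP2_FIXED_LEN = 16


def parse_pp2_fixed_header(buf: bytes):
    """Hypothetical helper: return (command, family_proto, rest_len)."""
    if len(buf) < PP2_FIXED_LEN:
        raise ValueError("short read: need 16 bytes of fixed header")
    if buf[:12] != PP2_SIGNATURE:
        raise ValueError("not a PROXY v2 header")
    # Length is a 16-bit unsigned integer in network byte order.
    ver_cmd, fam_proto, rest_len = struct.unpack("!BBH", buf[12:16])
    if ver_cmd >> 4 != 2:
        raise ValueError("unsupported PROXY protocol version")
    return ver_cmd & 0x0F, fam_proto, rest_len


def split_pp2(buf: bytes):
    """Hypothetical helper: split buf into header fields, address block, payload."""
    command, fam_proto, rest_len = parse_pp2_fixed_header(buf)
    end = PP2_FIXED_LEN + rest_len
    if len(buf) < end:
        # The remainder announced by the length field has not fully arrived.
        have = len(buf) - PP2_FIXED_LEN
        raise ValueError(f"rest of header truncated: want {rest_len}, have {have}")
    return command, fam_proto, buf[PP2_FIXED_LEN:end], buf[end:]


if __name__ == "__main__":
    # PROXY command (0x21), TCP over IPv4 (0x11), 12 bytes of addresses and ports.
    hdr = PP2_SIGNATURE + bytes([0x21, 0x11]) + struct.pack("!H", 12) + bytes(12)
    print(split_pp2(hdr + b"RPC"))
```

A receiver that gets fewer bytes than the length field announces has to either wait for more data or drop the connection, which is the kind of condition the truncated-remainder branch models.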
Please specify the severity of this bug. Severity is defined here: https://bugzilla.redhat.com/page.cgi?id=fields.html#bug_severity.
cc @ffilz: Hi Frank, this looks like the issue addressed upstream in https://github.com/nfs-ganesha/ntirpc/pull/322. Could it be the same one?