In Cockpit CI the recently updated nfs-utils times out on restart nfs-server. Reproducible: Always Steps to Reproduce: 1. systemctl restart nfs-server Actual Results: [root@ip-172-31-18-71 ~]# systemctl restart nfs-server A dependency job for nfs-server.service failed. See 'journalctl -xe' for details. [root@ip-172-31-18-71 ~]# journalctl -u nfs-server Nov 12 15:58:34 ip-172-31-18-71.us-east-2.compute.internal systemd[1]: Dependency failed for nfs-server.service - NFS server and services. Nov 12 15:58:34 ip-172-31-18-71.us-east-2.compute.internal systemd[1]: nfs-server.service: Job nfs-server.service/start failed with result 'dependency'. nfs-utils-2.8.1-1.rc1.fc42.x86_64
This version was also uploaded to Fedora 40 and 41, so moving target. See https://github.com/cockpit-project/cockpit-machines/issues/1903
I am not seeing this at all # rpm -q nfs-utils nfs-utils-2.8.1-1.rc1.fc42.x86_64 # systemctl restart nfs-server [355704.860001] NFSD: Using nfsdcld client tracking operations. [355704.862922] NFSD: no clients to reclaim, skipping NFSv4 grace period (net f0000000) # journalctl -u nfs-server Nov 13 09:40:10 steved-fedora systemd[1]: Stopped nfs-server.service - NFS serv> Nov 13 09:40:20 steved-fedora systemd[1]: Starting nfs-server.service - NFS ser> Nov 13 09:40:21 steved-fedora systemd[1]: Finished nfs-server.service - NFS ser> Nov 13 09:40:38 steved-fedora systemd[1]: Stopping nfs-server.service - NFS ser> Nov 13 09:40:38 steved-fedora systemd[1]: nfs-server.service: Deactivated succe> Nov 13 09:40:38 steved-fedora systemd[1]: Stopped nfs-server.service - NFS serv> Nov 13 09:40:48 steved-fedora systemd[1]: Starting nfs-server.service - NFS ser> Nov 13 09:40:48 steved-fedora systemd[1]: Finished nfs-server.service - NFS ser> Nov 13 09:44:22 steved-fedora systemd[1]: Reloading nfs-server.service - NFS se> Nov 13 09:44:22 steved-fedora systemd[1]: Reloaded nfs-server.service - NFS ser> what am I doing wrong?
Our /etc/exports looks like: ``` /home/foo 127.0.0.0/24(rw) /home/bar 127.0.0.0/24(rw) ``` But that is probably unrelated I found out that rpcbind just segfaults which probably blocks systemctl restart nfs-server Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd-coredump[3235]: [🡕] Process 3076 (rpc.mountd) of user 0 dumped core. Module libpcre2-8.so.0 from rpm pcre2-10.44-1.fc41.1.x86_64 Module libselinux.so.1 from rpm libselinux-3.7-6.fc42.x86_64 Module libcrypto.so.3 from rpm openssl-3.2.2-8.fc42.x86_64 Module libkeyutils.so.1 from rpm keyutils-1.6.3-4.fc41.x86_64 Module libkrb5support.so.0 from rpm krb5-1.21.3-3.fc42.x86_64 Module libcom_err.so.2 from rpm e2fsprogs-1.47.1-6.fc42.x86_64 Module libk5crypto.so.3 from rpm krb5-1.21.3-3.fc42.x86_64 Module libkrb5.so.3 from rpm krb5-1.21.3-3.fc42.x86_64 Module libgssapi_krb5.so.2 from rpm krb5-1.21.3-3.fc42.x86_64 Module liblzma.so.5 from rpm xz-5.6.3-2.fc42.x86_64 Module libz.so.1 from rpm zlib-ng-2.2.2-1.fc42.x86_64 Module libtirpc.so.3 from rpm libtirpc-1.3.6-0.rc1.fc42.x86_64 Module libuuid.so.1 from rpm util-linux-2.40.2-8.fc42.x86_64 Module libblkid.so.1 from rpm util-linux-2.40.2-8.fc42.x86_64 Module libxml2.so.2 from rpm libxml2-2.12.8-2.fc41.x86_64 Stack trace of thread 3076: #0 0x00007fb587f20328 pthread_sigmask.5 (libc.so.6 + 0x77328) #1 0x00007fb5880d7894 clnt_vc_call (libtirpc.so.3 + 0xd894) #2 0x00007fb5880d5006 rpcb_unset (libtirpc.so.3 + 0xb006) #3 0x000056073acdc3fd n/a (n/a + 0x0) #4 0x000056073accd0a8 n/a (n/a + 0x0) #5 0x000056073accc5a2 n/a (n/a + 0x0) #6 0x00007fb587eac448 __libc_start_call_main (libc.so.6 + 0x3448) #7 0x00007fb587eac50b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x350b) #8 0x000056073acccf55 n/a (n/a + 0x0) ELF object binary architecture: AMD x86-64 Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=nfs-mountd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed' Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-coredump@1-3234-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success' Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Control process exited, code=dumped, status=6/ABRT Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Failed with result 'timeout'. Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: Failed to start nfs-mountd.service - NFS Mount Daemon. Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: Dependency failed for nfs-server.service - NFS server and services. Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-server.service: Job nfs-server.service/start failed with result 'dependency'. Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: systemd-coredump: Deactivated successfully.
When systemctl restart nfs-server hangs I get the following from systemd: journalctl -xe ░░ The process' exit code is 'dumped' and its exit status is 6. Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=nfs-mountd comm> Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Failed with result 'timeout'. ░░ Subject: Unit failed ░░ Defined-By: systemd ░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel ░░ ░░ The unit nfs-mountd.service has entered the 'failed' state with result 'timeout'. Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: Failed to start nfs-mountd.service - NFS Mount Daemon. ░░ Subject: A start job for unit nfs-mountd.service has failed ░░ Defined-By: systemd ░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel ░░ ░░ A start job for unit nfs-mountd.service has finished with a failure. ░░ ░░ The job identifier is 2277 and the job result is failed. Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: Dependency failed for nfs-server.service - NFS server and services. ░░ Subject: A start job for unit nfs-server.service has failed ░░ Defined-By: systemd ░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel ░░ ░░ A start job for unit nfs-server.service has finished with a failure. ░░ ░░ The job identifier is 2268 and the job result is dependency. Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-server.service: Job nfs-server.service/start failed with result 'dependency'. Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: systemd-coredump: Deactivated successfully. ░░ Subject: Unit succeeded ░░ Defined-By: systemd ░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel ░░ ░░ The unit systemd-coredump has successfully entered the 'dead' state. Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-coredump> Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 audit: BPF prog-id=128 op=UNLOAD Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 audit: BPF prog-id=127 op=UNLOAD Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 audit: BPF prog-id=126 op=UNLOAD [root@fedora-rawhide-127-0-0-2-2201 ~]# journalctl -u nfs-mountd.service Nov 13 17:20:05 fedora-rawhide-127-0-0-2-2201 systemd[1]: Starting nfs-mountd.service - NFS Mount Daemon... Nov 13 17:20:50 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: start operation timed out. Terminating. Nov 13 17:21:35 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: State 'stop-sigterm' timed out. Aborting. Nov 13 17:21:35 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Killing process 1524 (rpc.mountd) with signal SIGABRT. Nov 13 17:22:05 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Control process exited, code=dumped, status=6/ABRT Nov 13 17:22:05 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Failed with result 'timeout'. Nov 13 17:22:05 fedora-rawhide-127-0-0-2-2201 systemd[1]: Failed to start nfs-mountd.service - NFS Mount Daemon. Nov 13 17:25:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: Starting nfs-mountd.service - NFS Mount Daemon... Nov 13 17:26:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: start operation timed out. Terminating. Nov 13 17:27:16 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: State 'stop-sigterm' timed out. Aborting. Nov 13 17:27:16 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Killing process 1636 (rpc.mountd) with signal SIGABRT. Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Control process exited, code=dumped, status=6/ABRT Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Failed with result 'timeout'. Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: Failed to start nfs-mountd.service - NFS Mount Daemon.
It appears the problem is with libtirpc-1.3.6-0.rc1... If you downgrade to libtirpc-1.3.6-0 the problem should go away... That's what I'm seeing and anybody verify?
FEDORA-2024-1ec4cd5fd3 (libtirpc-1.3.6-1.fc41) has been submitted as an update to Fedora 41. https://bodhi.fedoraproject.org/updates/FEDORA-2024-1ec4cd5fd3
FEDORA-2024-e8733f31e3 (libtirpc-1.3.6-1.fc40) has been submitted as an update to Fedora 40. https://bodhi.fedoraproject.org/updates/FEDORA-2024-e8733f31e3
FEDORA-2024-1ec4cd5fd3 has been pushed to the Fedora 41 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-1ec4cd5fd3` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-1ec4cd5fd3 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2024-e8733f31e3 has been pushed to the Fedora 40 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-e8733f31e3` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-e8733f31e3 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
Thanks! I confirmed that with nfs-utils-2.8.1-1.rc1.fc40 and libtirpc-1.3.6-1.fc40 update, things are happy again.
FEDORA-2024-e8733f31e3 (libtirpc-1.3.6-1.fc40) has been pushed to the Fedora 40 stable repository. If problem still persists, please make note of it in this bug report.
FEDORA-2024-1ec4cd5fd3 (libtirpc-1.3.6-1.fc41) has been pushed to the Fedora 41 stable repository. If problem still persists, please make note of it in this bug report.