Bug 2325556

Summary: [nfs-utils] nfs-server restart times out
Product: [Fedora] Fedora Reporter: Jelle van der Waa <jvanderwaa>
Component: libtirpcAssignee: Steve Dickson <steved>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: urgent    
Version: 40CC: luk.claes, mpitt, steved
Target Milestone: ---Keywords: Regression
Target Release: ---   
Hardware: x86_64   
OS: Linux   
URL: https://artifacts.dev.testing-farm.io/4d3c5370-6022-464c-961c-d820cd9a3065/
Whiteboard: CockpitTest
Fixed In Version: libtirpc-1.3.6-1.fc40 libtirpc-1.3.6-1.fc41 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2024-11-16 02:02:02 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jelle van der Waa 2024-11-12 16:15:39 UTC
In Cockpit CI the recently updated nfs-utils times out on restart nfs-server.




Reproducible: Always

Steps to Reproduce:
1. systemctl restart nfs-server
Actual Results:  
[root@ip-172-31-18-71 ~]# systemctl restart nfs-server
A dependency job for nfs-server.service failed. See 'journalctl -xe' for details.


[root@ip-172-31-18-71 ~]# journalctl -u nfs-server
Nov 12 15:58:34 ip-172-31-18-71.us-east-2.compute.internal systemd[1]: Dependency failed for nfs-server.service - NFS server and services.
Nov 12 15:58:34 ip-172-31-18-71.us-east-2.compute.internal systemd[1]: nfs-server.service: Job nfs-server.service/start failed with result 'dependency'.


nfs-utils-2.8.1-1.rc1.fc42.x86_64

Comment 1 Martin Pitt 2024-11-13 05:59:12 UTC
This version was also uploaded to Fedora 40 and 41, so moving target.

See https://github.com/cockpit-project/cockpit-machines/issues/1903

Comment 2 Steve Dickson 2024-11-13 14:52:51 UTC
I am not seeing this at all

# rpm -q nfs-utils
nfs-utils-2.8.1-1.rc1.fc42.x86_64

# systemctl restart nfs-server
[355704.860001] NFSD: Using nfsdcld client tracking operations.
[355704.862922] NFSD: no clients to reclaim, skipping NFSv4 grace period (net f0000000)


# journalctl -u nfs-server
Nov 13 09:40:10 steved-fedora systemd[1]: Stopped nfs-server.service - NFS serv>
Nov 13 09:40:20 steved-fedora systemd[1]: Starting nfs-server.service - NFS ser>
Nov 13 09:40:21 steved-fedora systemd[1]: Finished nfs-server.service - NFS ser>
Nov 13 09:40:38 steved-fedora systemd[1]: Stopping nfs-server.service - NFS ser>
Nov 13 09:40:38 steved-fedora systemd[1]: nfs-server.service: Deactivated succe>
Nov 13 09:40:38 steved-fedora systemd[1]: Stopped nfs-server.service - NFS serv>
Nov 13 09:40:48 steved-fedora systemd[1]: Starting nfs-server.service - NFS ser>
Nov 13 09:40:48 steved-fedora systemd[1]: Finished nfs-server.service - NFS ser>
Nov 13 09:44:22 steved-fedora systemd[1]: Reloading nfs-server.service - NFS se>
Nov 13 09:44:22 steved-fedora systemd[1]: Reloaded nfs-server.service - NFS ser>

what am I doing wrong?

Comment 4 Jelle van der Waa 2024-11-13 17:15:32 UTC
Our /etc/exports looks like:

```
/home/foo 127.0.0.0/24(rw)
/home/bar 127.0.0.0/24(rw)
```

But that is probably unrelated I found out that rpcbind just segfaults which probably blocks systemctl restart nfs-server

Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd-coredump[3235]: [🡕] Process 3076 (rpc.mountd) of user 0 dumped core.

                                                                      Module libpcre2-8.so.0 from rpm pcre2-10.44-1.fc41.1.x86_64
                                                                      Module libselinux.so.1 from rpm libselinux-3.7-6.fc42.x86_64
                                                                      Module libcrypto.so.3 from rpm openssl-3.2.2-8.fc42.x86_64
                                                                      Module libkeyutils.so.1 from rpm keyutils-1.6.3-4.fc41.x86_64
                                                                      Module libkrb5support.so.0 from rpm krb5-1.21.3-3.fc42.x86_64
                                                                      Module libcom_err.so.2 from rpm e2fsprogs-1.47.1-6.fc42.x86_64
                                                                      Module libk5crypto.so.3 from rpm krb5-1.21.3-3.fc42.x86_64
                                                                      Module libkrb5.so.3 from rpm krb5-1.21.3-3.fc42.x86_64
                                                                      Module libgssapi_krb5.so.2 from rpm krb5-1.21.3-3.fc42.x86_64
                                                                      Module liblzma.so.5 from rpm xz-5.6.3-2.fc42.x86_64
                                                                      Module libz.so.1 from rpm zlib-ng-2.2.2-1.fc42.x86_64
                                                                      Module libtirpc.so.3 from rpm libtirpc-1.3.6-0.rc1.fc42.x86_64
                                                                      Module libuuid.so.1 from rpm util-linux-2.40.2-8.fc42.x86_64
                                                                      Module libblkid.so.1 from rpm util-linux-2.40.2-8.fc42.x86_64
                                                                      Module libxml2.so.2 from rpm libxml2-2.12.8-2.fc41.x86_64
                                                                      Stack trace of thread 3076:
                                                                      #0  0x00007fb587f20328 pthread_sigmask.5 (libc.so.6 + 0x77328)
                                                                      #1  0x00007fb5880d7894 clnt_vc_call (libtirpc.so.3 + 0xd894)
                                                                      #2  0x00007fb5880d5006 rpcb_unset (libtirpc.so.3 + 0xb006)
                                                                      #3  0x000056073acdc3fd n/a (n/a + 0x0)
                                                                      #4  0x000056073accd0a8 n/a (n/a + 0x0)
                                                                      #5  0x000056073accc5a2 n/a (n/a + 0x0)
                                                                      #6  0x00007fb587eac448 __libc_start_call_main (libc.so.6 + 0x3448)
                                                                      #7  0x00007fb587eac50b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x350b)
                                                                      #8  0x000056073acccf55 n/a (n/a + 0x0)
                                                                      ELF object binary architecture: AMD x86-64
Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=nfs-mountd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-coredump@1-3234-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Control process exited, code=dumped, status=6/ABRT
Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Failed with result 'timeout'.
Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: Failed to start nfs-mountd.service - NFS Mount Daemon.
Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: Dependency failed for nfs-server.service - NFS server and services.
Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-server.service: Job nfs-server.service/start failed with result 'dependency'.
Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: systemd-coredump: Deactivated successfully.

Comment 5 Jelle van der Waa 2024-11-13 17:29:40 UTC
When systemctl restart nfs-server hangs I get the following from systemd:

journalctl -xe
â–‘â–‘ The process' exit code is 'dumped' and its exit status is 6.
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=nfs-mountd comm>
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Failed with result 'timeout'.
â–‘â–‘ Subject: Unit failed
â–‘â–‘ Defined-By: systemd
â–‘â–‘ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
â–‘â–‘
â–‘â–‘ The unit nfs-mountd.service has entered the 'failed' state with result 'timeout'.
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: Failed to start nfs-mountd.service - NFS Mount Daemon.
â–‘â–‘ Subject: A start job for unit nfs-mountd.service has failed
â–‘â–‘ Defined-By: systemd
â–‘â–‘ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
â–‘â–‘
â–‘â–‘ A start job for unit nfs-mountd.service has finished with a failure.
â–‘â–‘
â–‘â–‘ The job identifier is 2277 and the job result is failed.
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: Dependency failed for nfs-server.service - NFS server and services.
â–‘â–‘ Subject: A start job for unit nfs-server.service has failed
â–‘â–‘ Defined-By: systemd
â–‘â–‘ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
â–‘â–‘
â–‘â–‘ A start job for unit nfs-server.service has finished with a failure.
â–‘â–‘
â–‘â–‘ The job identifier is 2268 and the job result is dependency.
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-server.service: Job nfs-server.service/start failed with result 'dependency'.
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: systemd-coredump: Deactivated successfully.
â–‘â–‘ Subject: Unit succeeded
â–‘â–‘ Defined-By: systemd
â–‘â–‘ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
â–‘â–‘
â–‘â–‘ The unit systemd-coredump has successfully entered the 'dead' state.
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-coredump>
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 audit: BPF prog-id=128 op=UNLOAD
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 audit: BPF prog-id=127 op=UNLOAD
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 audit: BPF prog-id=126 op=UNLOAD
[root@fedora-rawhide-127-0-0-2-2201 ~]# journalctl -u nfs-mountd.service
Nov 13 17:20:05 fedora-rawhide-127-0-0-2-2201 systemd[1]: Starting nfs-mountd.service - NFS Mount Daemon...
Nov 13 17:20:50 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: start operation timed out. Terminating.
Nov 13 17:21:35 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: State 'stop-sigterm' timed out. Aborting.
Nov 13 17:21:35 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Killing process 1524 (rpc.mountd) with signal SIGABRT.
Nov 13 17:22:05 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Control process exited, code=dumped, status=6/ABRT
Nov 13 17:22:05 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Failed with result 'timeout'.
Nov 13 17:22:05 fedora-rawhide-127-0-0-2-2201 systemd[1]: Failed to start nfs-mountd.service - NFS Mount Daemon.
Nov 13 17:25:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: Starting nfs-mountd.service - NFS Mount Daemon...
Nov 13 17:26:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: start operation timed out. Terminating.
Nov 13 17:27:16 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: State 'stop-sigterm' timed out. Aborting.
Nov 13 17:27:16 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Killing process 1636 (rpc.mountd) with signal SIGABRT.
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Control process exited, code=dumped, status=6/ABRT
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Failed with result 'timeout'.
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: Failed to start nfs-mountd.service - NFS Mount Daemon.

Comment 6 Steve Dickson 2024-11-13 17:31:12 UTC
It appears the problem is with libtirpc-1.3.6-0.rc1... 

If you downgrade to libtirpc-1.3.6-0 the problem
should go away... That's what I'm seeing and
anybody verify?

Comment 7 Fedora Update System 2024-11-13 19:10:55 UTC
FEDORA-2024-1ec4cd5fd3 (libtirpc-1.3.6-1.fc41) has been submitted as an update to Fedora 41.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-1ec4cd5fd3

Comment 8 Fedora Update System 2024-11-13 19:10:58 UTC
FEDORA-2024-e8733f31e3 (libtirpc-1.3.6-1.fc40) has been submitted as an update to Fedora 40.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-e8733f31e3

Comment 9 Fedora Update System 2024-11-14 02:14:05 UTC
FEDORA-2024-1ec4cd5fd3 has been pushed to the Fedora 41 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-1ec4cd5fd3`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-1ec4cd5fd3

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 10 Fedora Update System 2024-11-14 03:32:21 UTC
FEDORA-2024-e8733f31e3 has been pushed to the Fedora 40 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-e8733f31e3`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-e8733f31e3

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 11 Martin Pitt 2024-11-14 06:21:30 UTC
Thanks! I confirmed that with nfs-utils-2.8.1-1.rc1.fc40 and libtirpc-1.3.6-1.fc40  update, things are happy again.

Comment 12 Fedora Update System 2024-11-16 02:02:02 UTC
FEDORA-2024-e8733f31e3 (libtirpc-1.3.6-1.fc40) has been pushed to the Fedora 40 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 13 Fedora Update System 2024-11-16 02:14:56 UTC
FEDORA-2024-1ec4cd5fd3 (libtirpc-1.3.6-1.fc41) has been pushed to the Fedora 41 stable repository.
If problem still persists, please make note of it in this bug report.