Bug 2325556 - [nfs-utils] nfs-server restart times out
Summary: [nfs-utils] nfs-server restart times out
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: libtirpc
Version: 40
Hardware: x86_64
OS: Linux
urgent
medium
Target Milestone: ---
Assignee: Steve Dickson
QA Contact: Fedora Extras Quality Assurance
URL: https://artifacts.dev.testing-farm.io...
Whiteboard: CockpitTest
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2024-11-12 16:15 UTC by Jelle van der Waa
Modified: 2024-11-16 02:14 UTC (History)
3 users (show)

Fixed In Version: libtirpc-1.3.6-1.fc40 libtirpc-1.3.6-1.fc41
Clone Of:
Environment:
Last Closed: 2024-11-16 02:02:02 UTC
Type: ---
Embargoed:


Attachments (Terms of Use)

Description Jelle van der Waa 2024-11-12 16:15:39 UTC
In Cockpit CI the recently updated nfs-utils times out on restart nfs-server.




Reproducible: Always

Steps to Reproduce:
1. systemctl restart nfs-server
Actual Results:  
[root@ip-172-31-18-71 ~]# systemctl restart nfs-server
A dependency job for nfs-server.service failed. See 'journalctl -xe' for details.


[root@ip-172-31-18-71 ~]# journalctl -u nfs-server
Nov 12 15:58:34 ip-172-31-18-71.us-east-2.compute.internal systemd[1]: Dependency failed for nfs-server.service - NFS server and services.
Nov 12 15:58:34 ip-172-31-18-71.us-east-2.compute.internal systemd[1]: nfs-server.service: Job nfs-server.service/start failed with result 'dependency'.


nfs-utils-2.8.1-1.rc1.fc42.x86_64

Comment 1 Martin Pitt 2024-11-13 05:59:12 UTC
This version was also uploaded to Fedora 40 and 41, so moving target.

See https://github.com/cockpit-project/cockpit-machines/issues/1903

Comment 2 Steve Dickson 2024-11-13 14:52:51 UTC
I am not seeing this at all

# rpm -q nfs-utils
nfs-utils-2.8.1-1.rc1.fc42.x86_64

# systemctl restart nfs-server
[355704.860001] NFSD: Using nfsdcld client tracking operations.
[355704.862922] NFSD: no clients to reclaim, skipping NFSv4 grace period (net f0000000)


# journalctl -u nfs-server
Nov 13 09:40:10 steved-fedora systemd[1]: Stopped nfs-server.service - NFS serv>
Nov 13 09:40:20 steved-fedora systemd[1]: Starting nfs-server.service - NFS ser>
Nov 13 09:40:21 steved-fedora systemd[1]: Finished nfs-server.service - NFS ser>
Nov 13 09:40:38 steved-fedora systemd[1]: Stopping nfs-server.service - NFS ser>
Nov 13 09:40:38 steved-fedora systemd[1]: nfs-server.service: Deactivated succe>
Nov 13 09:40:38 steved-fedora systemd[1]: Stopped nfs-server.service - NFS serv>
Nov 13 09:40:48 steved-fedora systemd[1]: Starting nfs-server.service - NFS ser>
Nov 13 09:40:48 steved-fedora systemd[1]: Finished nfs-server.service - NFS ser>
Nov 13 09:44:22 steved-fedora systemd[1]: Reloading nfs-server.service - NFS se>
Nov 13 09:44:22 steved-fedora systemd[1]: Reloaded nfs-server.service - NFS ser>

what am I doing wrong?

Comment 4 Jelle van der Waa 2024-11-13 17:15:32 UTC
Our /etc/exports looks like:

```
/home/foo 127.0.0.0/24(rw)
/home/bar 127.0.0.0/24(rw)
```

But that is probably unrelated I found out that rpcbind just segfaults which probably blocks systemctl restart nfs-server

Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd-coredump[3235]: [🡕] Process 3076 (rpc.mountd) of user 0 dumped core.

                                                                      Module libpcre2-8.so.0 from rpm pcre2-10.44-1.fc41.1.x86_64
                                                                      Module libselinux.so.1 from rpm libselinux-3.7-6.fc42.x86_64
                                                                      Module libcrypto.so.3 from rpm openssl-3.2.2-8.fc42.x86_64
                                                                      Module libkeyutils.so.1 from rpm keyutils-1.6.3-4.fc41.x86_64
                                                                      Module libkrb5support.so.0 from rpm krb5-1.21.3-3.fc42.x86_64
                                                                      Module libcom_err.so.2 from rpm e2fsprogs-1.47.1-6.fc42.x86_64
                                                                      Module libk5crypto.so.3 from rpm krb5-1.21.3-3.fc42.x86_64
                                                                      Module libkrb5.so.3 from rpm krb5-1.21.3-3.fc42.x86_64
                                                                      Module libgssapi_krb5.so.2 from rpm krb5-1.21.3-3.fc42.x86_64
                                                                      Module liblzma.so.5 from rpm xz-5.6.3-2.fc42.x86_64
                                                                      Module libz.so.1 from rpm zlib-ng-2.2.2-1.fc42.x86_64
                                                                      Module libtirpc.so.3 from rpm libtirpc-1.3.6-0.rc1.fc42.x86_64
                                                                      Module libuuid.so.1 from rpm util-linux-2.40.2-8.fc42.x86_64
                                                                      Module libblkid.so.1 from rpm util-linux-2.40.2-8.fc42.x86_64
                                                                      Module libxml2.so.2 from rpm libxml2-2.12.8-2.fc41.x86_64
                                                                      Stack trace of thread 3076:
                                                                      #0  0x00007fb587f20328 pthread_sigmask.5 (libc.so.6 + 0x77328)
                                                                      #1  0x00007fb5880d7894 clnt_vc_call (libtirpc.so.3 + 0xd894)
                                                                      #2  0x00007fb5880d5006 rpcb_unset (libtirpc.so.3 + 0xb006)
                                                                      #3  0x000056073acdc3fd n/a (n/a + 0x0)
                                                                      #4  0x000056073accd0a8 n/a (n/a + 0x0)
                                                                      #5  0x000056073accc5a2 n/a (n/a + 0x0)
                                                                      #6  0x00007fb587eac448 __libc_start_call_main (libc.so.6 + 0x3448)
                                                                      #7  0x00007fb587eac50b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x350b)
                                                                      #8  0x000056073acccf55 n/a (n/a + 0x0)
                                                                      ELF object binary architecture: AMD x86-64
Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=nfs-mountd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-coredump@1-3234-0 comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Control process exited, code=dumped, status=6/ABRT
Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Failed with result 'timeout'.
Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: Failed to start nfs-mountd.service - NFS Mount Daemon.
Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: Dependency failed for nfs-server.service - NFS server and services.
Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-server.service: Job nfs-server.service/start failed with result 'dependency'.
Nov 13 17:14:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: systemd-coredump: Deactivated successfully.

Comment 5 Jelle van der Waa 2024-11-13 17:29:40 UTC
When systemctl restart nfs-server hangs I get the following from systemd:

journalctl -xe
░░ The process' exit code is 'dumped' and its exit status is 6.
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=nfs-mountd comm>
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Failed with result 'timeout'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ The unit nfs-mountd.service has entered the 'failed' state with result 'timeout'.
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: Failed to start nfs-mountd.service - NFS Mount Daemon.
░░ Subject: A start job for unit nfs-mountd.service has failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit nfs-mountd.service has finished with a failure.
░░
░░ The job identifier is 2277 and the job result is failed.
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: Dependency failed for nfs-server.service - NFS server and services.
░░ Subject: A start job for unit nfs-server.service has failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit nfs-server.service has finished with a failure.
░░
░░ The job identifier is 2268 and the job result is dependency.
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-server.service: Job nfs-server.service/start failed with result 'dependency'.
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: systemd-coredump: Deactivated successfully.
░░ Subject: Unit succeeded
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ The unit systemd-coredump has successfully entered the 'dead' state.
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-coredump>
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 audit: BPF prog-id=128 op=UNLOAD
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 audit: BPF prog-id=127 op=UNLOAD
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 audit: BPF prog-id=126 op=UNLOAD
[root@fedora-rawhide-127-0-0-2-2201 ~]# journalctl -u nfs-mountd.service
Nov 13 17:20:05 fedora-rawhide-127-0-0-2-2201 systemd[1]: Starting nfs-mountd.service - NFS Mount Daemon...
Nov 13 17:20:50 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: start operation timed out. Terminating.
Nov 13 17:21:35 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: State 'stop-sigterm' timed out. Aborting.
Nov 13 17:21:35 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Killing process 1524 (rpc.mountd) with signal SIGABRT.
Nov 13 17:22:05 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Control process exited, code=dumped, status=6/ABRT
Nov 13 17:22:05 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Failed with result 'timeout'.
Nov 13 17:22:05 fedora-rawhide-127-0-0-2-2201 systemd[1]: Failed to start nfs-mountd.service - NFS Mount Daemon.
Nov 13 17:25:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: Starting nfs-mountd.service - NFS Mount Daemon...
Nov 13 17:26:31 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: start operation timed out. Terminating.
Nov 13 17:27:16 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: State 'stop-sigterm' timed out. Aborting.
Nov 13 17:27:16 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Killing process 1636 (rpc.mountd) with signal SIGABRT.
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Control process exited, code=dumped, status=6/ABRT
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: nfs-mountd.service: Failed with result 'timeout'.
Nov 13 17:27:46 fedora-rawhide-127-0-0-2-2201 systemd[1]: Failed to start nfs-mountd.service - NFS Mount Daemon.

Comment 6 Steve Dickson 2024-11-13 17:31:12 UTC
It appears the problem is with libtirpc-1.3.6-0.rc1... 

If you downgrade to libtirpc-1.3.6-0 the problem
should go away... That's what I'm seeing and
anybody verify?

Comment 7 Fedora Update System 2024-11-13 19:10:55 UTC
FEDORA-2024-1ec4cd5fd3 (libtirpc-1.3.6-1.fc41) has been submitted as an update to Fedora 41.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-1ec4cd5fd3

Comment 8 Fedora Update System 2024-11-13 19:10:58 UTC
FEDORA-2024-e8733f31e3 (libtirpc-1.3.6-1.fc40) has been submitted as an update to Fedora 40.
https://bodhi.fedoraproject.org/updates/FEDORA-2024-e8733f31e3

Comment 9 Fedora Update System 2024-11-14 02:14:05 UTC
FEDORA-2024-1ec4cd5fd3 has been pushed to the Fedora 41 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-1ec4cd5fd3`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-1ec4cd5fd3

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 10 Fedora Update System 2024-11-14 03:32:21 UTC
FEDORA-2024-e8733f31e3 has been pushed to the Fedora 40 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-e8733f31e3`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-e8733f31e3

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 11 Martin Pitt 2024-11-14 06:21:30 UTC
Thanks! I confirmed that with nfs-utils-2.8.1-1.rc1.fc40 and libtirpc-1.3.6-1.fc40  update, things are happy again.

Comment 12 Fedora Update System 2024-11-16 02:02:02 UTC
FEDORA-2024-e8733f31e3 (libtirpc-1.3.6-1.fc40) has been pushed to the Fedora 40 stable repository.
If problem still persists, please make note of it in this bug report.

Comment 13 Fedora Update System 2024-11-16 02:14:56 UTC
FEDORA-2024-1ec4cd5fd3 (libtirpc-1.3.6-1.fc41) has been pushed to the Fedora 41 stable repository.
If problem still persists, please make note of it in this bug report.


Note You need to log in before you can comment on or make changes to this bug.