Bug 2328627
| Summary: | 2.8.1-1.rc2 regression: rpc.statd crashes with SIGABRT in nsm_atomic_write() | ||
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Martin Pitt <mpitt> |
| Component: | nfs-utils | Assignee: | Steve Dickson <steved> |
| Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 40 | CC: | bojan, luk.claes, steved, terjeros |
| Target Milestone: | --- | Keywords: | Regression |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Linux | ||
| URL: | https://cockpit-logs.us-east-1.linodeobjects.com/pull-0-c7d327f8-20241124-013228-fedora-40-updates-testing/log.html | ||
| Whiteboard: | CockpitTest | ||
| Fixed In Version: | nfs-utils-2.8.1-2.rc2.fc41 nfs-utils-2.8.1-2.rc2.fc40 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2024-12-03 02:51:53 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Martin Pitt
2024-11-25 08:08:46 UTC
This also affects nfs-utils-1:2.8.1-1.rc2.fc41.x86_64 in Fedora 41, but https://bodhi.fedoraproject.org/updates/FEDORA-2024-39db8155bf went into updates too fast.

This is not just a cosmetic issue (even though such a crash by itself is bad enough). It breaks libvirt storage: https://cockpit-logs.us-east-1.linodeobjects.com/pull-0-3b01b184-20241125-022308-fedora-40-updates-testing/log.html#26-2

Very interesting. I'm not seeing the NFS server crashes on F41, even after restart. But maybe I have a different config.

I cannot reproduce this regression in rawhide, f41, or f40. Can you please explain the tests you are running?

cat /etc/exports
#/home *.home.dicksonnet.net(rw,s2sc)
#/home *.home.dicksonnet.net(rw)
/home *(rw,sec=sys:krb5:krb5i:krb5p)
/tmp *(rw,fsid=666,all_squash)
/home/foo 127.0.0.0/24(rw)
/home/bar 127.0.0.0/24(rw)
f40# systemctl restart nfs-server
f40# systemctl status nfs-server
* nfs-server.service - NFS server and services
Loaded: loaded (/usr/lib/systemd/system/nfs-server.service; enabled; preset: disabled)
Drop-In: /usr/lib/systemd/system/service.d
`-10-timeout-abort.conf
/run/systemd/generator/nfs-server.service.d
`-order-with-mounts.conf
Active: active (exited) since Mon 2024-11-25 05:33:32 EST; 20s ago
Docs: man:rpc.nfsd(8)
man:exportfs(8)
Process: 48961 ExecStartPre=/usr/sbin/exportfs -r (code=exited, status=0/SUCCESS)
Process: 48962 ExecStart=/bin/sh -c /usr/sbin/nfsdctl autostart || /usr/sbin/rpc.nfsd (co>
Process: 48983 ExecStart=/bin/sh -c if systemctl -q is-active gssproxy; then systemctl re>
Main PID: 48983 (code=exited, status=0/SUCCESS)
CPU: 20ms
Nov 25 05:33:32 f40.home.dicksonnet.net systemd[1]: Starting nfs-server.service - NFS server >
Nov 25 05:33:32 f40.home.dicksonnet.net systemd[1]: Finished nfs-server.service - NFS server >
f40#
What do I need to do to reproduce this problem?
Reproducer out of thin air from a standard cloud image:

curl -L -O https://download.fedoraproject.org/pub/fedora/linux/development/rawhide/Cloud/x86_64/images/Fedora-Cloud-Base-Generic-41-20241025.n.0.x86_64.qcow2
# nothing fancy, just admin:foobar and root:foobar
curl -L -O https://github.com/cockpit-project/bots/raw/main/machine/cloud-init.iso
qemu-system-x86_64 -cpu host -enable-kvm -nographic -m 2048 -drive file=Fedora-Cloud-Base-Generic-41-20241025.n.0.x86_64.qcow2,if=virtio -snapshot -cdrom cloud-init.iso -net nic,model=virtio -net user,hostfwd=tcp::2201-:22

Log into VT as root:foobar or admin:foobar, or for a more comfortable shell "ssh -p 2201 admin@localhost" and `sudo -i`.

dnf install -y nfs-utils
mkdir /home/foo /home/bar /mnt/test
printf '/home/foo 127.0.0.0/24(rw)\n/home/bar 127.0.0.0/24(rw)\n' > /etc/exports
systemctl restart nfs-server

Then "systemctl status rpc.statd" shows the failed service, and "journalctl -b" shows the backtrace. nfs-server.service is indeed ok, but that's just an empty meta-unit.

(In reply to Martin Pitt from comment #6)
> Reproducer out of thin air from a standard cloud image:
>
> curl -L -O https://download.fedoraproject.org/pub/fedora/linux/development/rawhide/Cloud/x86_64/images/Fedora-Cloud-Base-Generic-41-20241025.n.0.x86_64.qcow2
> # nothing fancy, just admin:foobar and root:foobar
> curl -L -O https://github.com/cockpit-project/bots/raw/main/machine/cloud-init.iso
> qemu-system-x86_64 -cpu host -enable-kvm -nographic -m 2048 -drive file=Fedora-Cloud-Base-Generic-41-20241025.n.0.x86_64.qcow2,if=virtio -snapshot -cdrom cloud-init.iso -net nic,model=virtio -net user,hostfwd=tcp::2201-:22

This qemu command hangs... Can you give me access to the cloud you are seeing this problem with? Because this is the only environment that is seeing this problem.

> Log into VT as root:foobar or admin:foobar, or for a more comfortable shell
> "ssh -p 2201 admin@localhost" and `sudo -i`.

This ssh also hangs....
This is what I'm seeing: https://paste.centos.org/view/2cd7e24f

> This qemu command hangs...

It's doing PXE boot because the disk failed. Did the Fedora-Cloud-Base-Generic-41-20241025.n.0.x86_64.qcow2 download actually work, i.e. does the file have a reasonable size? Because right now it's gone, today's image is https://ftp-stud.hs-esslingen.de/pub/fedora/linux/development/rawhide/Cloud/x86_64/images/Fedora-Cloud-Base-Generic-Rawhide-20241126.n.0.x86_64.qcow2 . Just grab the current one from https://ftp-stud.hs-esslingen.de/pub/fedora/linux/development/rawhide/Cloud/x86_64/images/ . Or use whichever testing image you have in your CI?

> Can you give me access to the cloud you are seeing this problem with.

It fails in Testing Farm, my laptop (where I ran the reproducer), and cockpit's CI on PSI OpenStack. This *really* isn't hardware specific.

When you tried this in comment #5, can you (1) double-check that you have nfs-utils 2.8.1rc2 installed (*not* rc1), and (2) confirm that you checked the journal and "systemctl status rpc.statd"?

Sorry, I posted the geolocation-redirected URL. Current image from https://download.fedoraproject.org/pub/fedora/linux/development/rawhide/Cloud/x86_64/images/

Or https://download.fedoraproject.org/pub/fedora/linux/development/41/Cloud/x86_64/images/ if you prefer to investigate on 41 instead of rawhide.

commit 8fcddae4437510137baf108f477d116ce345ce80 (HEAD -> master)
Author: Benjamin Coddington <bcodding>
Date: Wed Nov 27 06:32:46 2024 -0500
libnsm: fix the safer atomic filenames fix
A local build with the patch included fixed the problem here. An update would be nice :-)

commit ce17ca7f4093d1c760651a7fed92e3e741cb11aa (HEAD -> master)
Author: Benjamin Coddington <bcodding>
Date: Wed Nov 27 07:01:06 2024 -0500
libnsm(v2): fix the safer atomic filenames fix
f40-candidate build: https://koji.fedoraproject.org/koji/taskinfo?taskID=126395639
f41-candidate build: https://koji.fedoraproject.org/koji/taskinfo?taskID=126395734

FEDORA-2024-93dd1e473f (nfs-utils-2.8.1-2.rc2.fc40) has been submitted as an update to Fedora 40. https://bodhi.fedoraproject.org/updates/FEDORA-2024-93dd1e473f

FEDORA-2024-e47c860a1a (nfs-utils-2.8.1-2.rc2.fc41) has been submitted as an update to Fedora 41. https://bodhi.fedoraproject.org/updates/FEDORA-2024-e47c860a1a

FEDORA-2024-93dd1e473f has been pushed to the Fedora 40 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-93dd1e473f` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-93dd1e473f See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

FEDORA-2024-e47c860a1a has been pushed to the Fedora 41 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2024-e47c860a1a` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2024-e47c860a1a See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

FEDORA-2024-e47c860a1a (nfs-utils-2.8.1-2.rc2.fc41) has been pushed to the Fedora 41 stable repository. If problem still persists, please make note of it in this bug report.

FEDORA-2024-93dd1e473f (nfs-utils-2.8.1-2.rc2.fc40) has been pushed to the Fedora 40 stable repository. If problem still persists, please make note of it in this bug report.

Steve, can you please upload this to rawhide as well? Thanks!

(In reply to Martin Pitt from comment #22)
> Steve, can you please upload this to rawhide as well? Thanks!
it is, see nfs-utils-2.8.1

(In reply to Steve Dickson from comment #23)
> (In reply to Martin Pitt from comment #22)
> > Steve, can you please upload this to rawhide as well? Thanks!
>
> it is, see nfs-utils-2.8.1

actually it is nfs-utils-2.8.2
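
For anyone re-checking a host after the update, the two checks asked for earlier in this thread (which nfs-utils build is installed, and whether statd is in a failed state) could be scripted roughly as below. This is only a sketch under stated assumptions: the helper names are made up for illustration, the affected version-release prefix is taken from the bug summary, and the unit name `rpc-statd.service` is an assumption that may need adjusting on a given host.

```python
import subprocess

# Version-release prefix of the regressed build, taken from the bug summary.
AFFECTED_VR = "2.8.1-1.rc2"


def is_affected(version_release: str) -> bool:
    """Return True if an nfs-utils VERSION-RELEASE string is the regressed build.

    Matches "2.8.1-1.rc2" optionally followed by a dist tag such as ".fc40",
    but not the fixed "2.8.1-2.rc2" builds.
    """
    return (
        version_release == AFFECTED_VR
        or version_release.startswith(AFFECTED_VR + ".")
    )


def installed_version_release(package: str = "nfs-utils") -> str:
    """Ask rpm for the installed VERSION-RELEASE of a package (epoch omitted)."""
    result = subprocess.run(
        ["rpm", "-q", "--queryformat", "%{VERSION}-%{RELEASE}", package],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if the package is not installed
    )
    return result.stdout.strip()


def unit_failed(unit: str = "rpc-statd.service") -> bool:
    """Return True if systemd reports the unit as failed.

    The default unit name is an assumption, not taken from this report.
    """
    # `systemctl is-failed` exits with status 0 when the unit is failed.
    result = subprocess.run(["systemctl", "is-failed", "--quiet", unit])
    return result.returncode == 0


def check_host() -> str:
    """Summarise whether this host runs the regressed build or has a failed statd."""
    vr = installed_version_release()
    if is_affected(vr):
        return f"nfs-utils {vr}: affected build, update needed"
    if unit_failed():
        return f"nfs-utils {vr}: not the affected build, but the statd unit is failed"
    return f"nfs-utils {vr}: not the affected build, statd unit not failed"
```

Running `print(check_host())` as root on the test machine gives a one-line summary. The string comparison is deliberately narrow: it only recognises the one regressed build named in this report and does not attempt general RPM version ordering.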