Description of problem:
In Cockpit CI we noticed that the newest refresh of the Fedora-36 image fails with NFS kdump.

Version-Release number of selected component (if applicable):
Likely this update: nfs-utils (1:2.6.1-2.rc5.fc36 -> 1:2.6.1-2.rc6.fc36)
For all updated packages, see the end of the report.

How reproducible:
1. Boot fedora-36 twice; I will refer to one instance as X1 and to the other as X2. X1 is the machine on which the kernel will crash, X2 is the NFS storage. X1 is on 10.111.113.1/24, X2 on 10.111.113.2/24.
2. on X2: `echo -ne "/srv/kdump 10.111.113.0/24(rw,no_root_squash)\n" > /etc/exports`
3. on X2: `mkdir -p /srv/kdump/var/crash; firewall-cmd --add-service nfs; systemctl restart nfs-server`
4. on X1: `systemctl disable kdump`
5. on X1: `grubby --args=crashkernel=256M --update-kernel=ALL`
6. <reboot>
7. on X1: `echo -ne "auto_reset_crashkernel yes\ncore_collector makedumpfile -l --message-level 7 -d 31\nnfs 10.111.113.2:/srv/kdump" > /etc/kdump.conf`
8. on X1: `systemctl enable --now kdump`
9. on X1: `echo 1 > /proc/sys/kernel/sysrq`
10. on X1: `echo c > /proc/sysrq-trigger`
11. <boot X1 again>
12. on X2: `file /srv/kdump/var/crash/10.111.113.1*/vmcore` should show some content

Actual results:
Nothing in `/srv/kdump/var/crash/`

Expected results:
Crash dump in `/srv/kdump/var/crash/`

Additional info:
All changed packages:
binutils (2.37-30.fc36 -> 2.37-31.fc36)
binutils-gold (2.37-30.fc36 -> 2.37-31.fc36)
cockpit (270-1.fc36 -> 271-1.fc36)
cockpit-bridge (270-1.fc36 -> 271-1.fc36)
cockpit-system (270-1.fc36 -> 271-1.fc36)
cockpit-ws (270-1.fc36 -> 271-1.fc36)
edk2-ovmf (20220526git16779ede2d36-1.fc36 -> 20220526git16779ede2d36-3.fc36)
kernel-core (5.17.13-300.fc36 -> 5.17.14-300.fc36)
libipa_hbac (2.7.1-1.fc36 -> 2.7.1-2.fc36)
libnfsidmap (1:2.6.1-2.rc5.fc36 -> 1:2.6.1-2.rc6.fc36)
libsss_autofs (2.7.1-1.fc36 -> 2.7.1-2.fc36)
libsss_certmap (2.7.1-1.fc36 -> 2.7.1-2.fc36)
libsss_idmap (2.7.1-1.fc36 -> 2.7.1-2.fc36)
libsss_nss_idmap (2.7.1-1.fc36 -> 2.7.1-2.fc36)
libsss_sudo (2.7.1-1.fc36 -> 2.7.1-2.fc36)
nfs-utils (1:2.6.1-2.rc5.fc36 -> 1:2.6.1-2.rc6.fc36)
ntfs-3g (2:2021.8.22-5.fc36 -> 2:2022.5.17-1.fc36)
ntfs-3g-libs (2:2021.8.22-5.fc36 -> 2:2022.5.17-1.fc36)
ntfs-3g-system-compression (1.0-8.fc36 -> 1.0-9.fc36)
ntfsprogs (2:2021.8.22-5.fc36 -> 2:2022.5.17-1.fc36)
python-srpm-macros (3.10-17.fc36 -> 3.10-18.fc36)
python-unversioned-command (3.10.4-1.fc36 -> 3.10.5-2.fc36)
python3 (3.10.4-1.fc36 -> 3.10.5-2.fc36)
python3-libipa_hbac (2.7.1-1.fc36 -> 2.7.1-2.fc36)
python3-libs (3.10.4-1.fc36 -> 3.10.5-2.fc36)
python3-sss (2.7.1-1.fc36 -> 2.7.1-2.fc36)
python3-sss-murmur (2.7.1-1.fc36 -> 2.7.1-2.fc36)
python3-sssdconfig (2.7.1-1.fc36 -> 2.7.1-2.fc36)
qemu-block-curl (2:6.2.0-10.fc36 -> 2:6.2.0-12.fc36)
qemu-char-spice (2:6.2.0-10.fc36 -> 2:6.2.0-12.fc36)
qemu-common (2:6.2.0-10.fc36 -> 2:6.2.0-12.fc36)
qemu-device-usb-host (2:6.2.0-10.fc36 -> 2:6.2.0-12.fc36)
qemu-device-usb-redirect (2:6.2.0-10.fc36 -> 2:6.2.0-12.fc36)
qemu-guest-agent (2:6.2.0-10.fc36 -> 2:6.2.0-12.fc36)
qemu-img (2:6.2.0-10.fc36 -> 2:6.2.0-12.fc36)
qemu-kvm-core (2:6.2.0-10.fc36 -> 2:6.2.0-12.fc36)
qemu-system-x86-core (2:6.2.0-10.fc36 -> 2:6.2.0-12.fc36)
qemu-ui-opengl (2:6.2.0-10.fc36 -> 2:6.2.0-12.fc36)
qemu-ui-spice-core (2:6.2.0-10.fc36 -> 2:6.2.0-12.fc36)
sssd (2.7.1-1.fc36 -> 2.7.1-2.fc36)
sssd-ad (2.7.1-1.fc36 -> 2.7.1-2.fc36)
sssd-client (2.7.1-1.fc36 -> 2.7.1-2.fc36)
sssd-common (2.7.1-1.fc36 -> 2.7.1-2.fc36)
sssd-common-pac (2.7.1-1.fc36 -> 2.7.1-2.fc36)
sssd-dbus (2.7.1-1.fc36 -> 2.7.1-2.fc36)
sssd-ipa (2.7.1-1.fc36 -> 2.7.1-2.fc36)
sssd-kcm (2.7.1-1.fc36 -> 2.7.1-2.fc36)
sssd-krb5 (2.7.1-1.fc36 -> 2.7.1-2.fc36)
sssd-krb5-common (2.7.1-1.fc36 -> 2.7.1-2.fc36)
sssd-ldap (2.7.1-1.fc36 -> 2.7.1-2.fc36)
sssd-nfs-idmap (2.7.1-1.fc36 -> 2.7.1-2.fc36)
sssd-proxy (2.7.1-1.fc36 -> 2.7.1-2.fc36)
sssd-tools (2.7.1-1.fc36 -> 2.7.1-2.fc36)
xen-libs (4.16.1-1.fc36 -> 4.16.1-2.fc36)
xen-licenses (4.16.1-1.fc36 -> 4.16.1-2.fc36)
Forgot to mention, while booting X1 we see:

[ 2.203647] systemd[1]: Reached target remote-fs-pre.target - Preparation for Remote File Systems.
[ 2.204734] systemd[1]: Mounting kdumproot.mount - /kdumproot...
[ 2.205713] systemd[1]: dracut-pre-mount.service - dracut pre-mount hook was skipped because all trigger condition checks failed.
[ 2.207193] systemd[1]: Reached target initrd-root-fs.target - Initrd Root File System.
[ OK ] Reached target initrd-root…get - Initrd Root File System.
[ 2.210848] systemd[1]: Starting initrd-parse-etc.service - Reload Configuration from the Real Root...
Starting initrd-parse-etc.…onfiguration from the Real Root...
[ 2.217995] systemd[1]: Reloading.
[ 2.323573] FS-Cache: Loaded
[ 2.281620] mount[417]: mount.nfs: No such device
[ 2.288337] systemd[1]: /usr/lib/systemd/system/kdump-capture.service:23: Standard output type syslog is obsolete, automatically updating to journal. Please update your unit file, and consider removing the setting altogether.
[ 2.290485] systemd[1]: /usr/lib/systemd/system/kdump-capture.service:24: Standard output type syslog+console is obsolete, automatically updating to journal+console. Please update your unit file, and consider removing the setting altogether.
[ 2.330516] systemd[1]: kdumproot.mount: Mount process exited, code=exited, status=32/n/a
[ 2.331437] systemd[1]: kdumproot.mount: Failed with result 'exit-code'.
[ 2.334214] systemd[1]: Failed to mount kdumproot.mount - /kdumproot.
[FAILED] Failed to mount kdumproot.mount - /kdumproot.
See 'systemctl status kdumproot.mount' for details.
Hi Matej,

Comment #1 looks like the kdump kernel boot log. Could you attach the whole kernel log? It could be a networking issue, either in the scripts or in a device driver; either way, the kernel log will be helpful.

Moved to the kexec-tools component. We can move it back to nfs later if it turns out to be an NFS bug.
nfs-utils added a new file, "/usr/lib/modprobe.d/50-nfs.conf", in the rc6 patch [1]. It contains this line:

install sunrpc /sbin/modprobe --ignore-install sunrpc $CMDLINE_OPTS && /sbin/sysctl -q --pattern sunrpc --system

However, /sbin/sysctl does not exist in the kdump initramfs image, so the install command fails during dracut-pre-udev:

[ 8.834385] dracut-pre-udev[366]: sh: line 1: /sbin/sysctl: No such file or directory
[ 8.844594] dracut-pre-udev[365]: modprobe: ERROR: libkmod/libkmod-module.c:990 command_do() Error running install command '/sbin/modprobe --ignore-install sunrpc && /sbin/sysctl -q --pattern sunrpc --system' for module sunrpc: retcode 127
[ 8.867523] dracut-pre-udev[365]: modprobe: ERROR: could not insert 'sunrpc': Invalid argument

Thus the nfs modules are not loaded before the kdump-capture service starts. Here is the output of lsmod before kdump; no nfs modules are present:

[ 27.506599] kdump.sh[599]: Module                  Size  Used by
[ 27.514522] kdump.sh[599]: lockd                 122880  0
[ 27.521524] kdump.sh[599]: grace                  16384  1 lockd
[ 27.529523] kdump.sh[599]: fscache               372736  0
[ 27.536530] kdump.sh[599]: netfs                  57344  1 fscache
[ 27.544533] kdump.sh[599]: crct10dif_pclmul       16384  1
[ 27.551530] kdump.sh[599]: crc32_pclmul           16384  0
[ 27.558531] kdump.sh[599]: crc32c_intel           24576  0
[ 27.565536] kdump.sh[599]: ghash_clmulni_intel    16384  0
[ 27.572534] kdump.sh[599]: ice                   851968  0
[ 27.579550] kdump.sh[599]: tg3                   192512  0
[ 27.586546] kdump.sh[599]: mgag200                40960  0
[ 27.593544] kdump.sh[599]: sunrpc                651264  2 lockd
[ 27.601550] kdump.sh[599]: ipmi_devintf           20480  0
[ 27.608545] kdump.sh[599]: ipmi_msghandler       122880  1 ipmi_devintf
[ 27.616546] kdump.sh[599]: overlay               151552  1
[ 27.623529] kdump.sh[599]: squashfs               69632  1
[ 27.630515] kdump.sh[599]: loop                   32768  2

As a result, when mount.nfs runs without the nfs modules loaded, it reports:

mount.nfs: No such device

A quick fix is to append the following line to kdump.conf, after which kdump works fine:

extra_bins /sbin/sysctl

I will work out a better way for a formal fix.

[1]: https://src.fedoraproject.org/rpms/nfs-utils/c/d6281e4f6ed7560f723a9fbba5ecae7f329078f9?branch=rawhide
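
For anyone applying the quick fix by hand, here is a minimal sketch of adding the `extra_bins` line idempotently. The helper name `add_extra_bin` is made up for illustration and is not part of kexec-tools. It also guards against one pitfall: the `echo -ne` in step 7 of the report leaves the last line of /etc/kdump.conf without a trailing newline, so a plain `>>` append would glue the new option onto the `nfs` line.

```shell
# Hypothetical helper (not shipped by kexec-tools): append
# "extra_bins <path>" to a kdump.conf-style file exactly once.
add_extra_bin() {
    local conf=$1 bin=$2

    # Already configured: leave the file untouched.
    if grep -qxF "extra_bins $bin" "$conf" 2>/dev/null; then
        return 0
    fi

    # If the file is non-empty and its last byte is not a newline,
    # terminate the last line first. Command substitution strips a
    # trailing newline, so a non-empty result means it is missing.
    if [ -s "$conf" ] && [ -n "$(tail -c 1 "$conf")" ]; then
        printf '\n' >> "$conf"
    fi

    printf 'extra_bins %s\n' "$bin" >> "$conf"
}

# Intended use on X1 (run as root; not executed here):
#   add_extra_bin /etc/kdump.conf /sbin/sysctl
#   systemctl restart kdump
```

The restart in the usage comment assumes the kdump service picks up the changed configuration on restart; check that on your system before relying on it.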
Hi Tao,

Thanks, good finding! It sounds like sysctl is needed in the dracut nfs module, so this is not only a kdump issue: anyone who uses NFS in the initramfs will hit this bug. So the right component should probably be "dracut"?
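
To see whether a given initramfs is affected, one rough approach is to scan the modprobe.d `install` lines in an unpacked image and report any absolute-path binary they call that is not present in that tree. The sketch below does this. `missing_install_bins` is a hypothetical helper, not a dracut or kmod tool, and it assumes the image has already been unpacked into a directory. How to unpack is left open; the kdump image in this report appears to be squashfs-based (note the squashfs and overlay modules in the lsmod output), so it may need an extra unpacking step.

```shell
# Hypothetical helper: print every absolute path named in a modprobe.d
# "install" line that does not exist under the given root tree.
missing_install_bins() {
    local root=$1 conf bin

    for conf in "$root"/usr/lib/modprobe.d/*.conf "$root"/etc/modprobe.d/*.conf; do
        # An unmatched glob stays literal; skip it.
        [ -f "$conf" ] || continue

        # Keep only "install <module> <command...>" lines, extract each
        # whitespace-delimited word starting with "/", and strip the
        # leading blank that the match may carry.
        grep -E '^[[:space:]]*install[[:space:]]' "$conf" |
            grep -oE '(^|[[:space:]])/[^[:space:]]+' |
            tr -d '[:blank:]'
    done | sort -u | while read -r bin; do
        # -e follows symlinks, so a relative sbin -> usr/sbin link inside
        # the tree resolves; absolute symlinks would point at the host.
        [ -e "$root$bin" ] || printf '%s\n' "$bin"
    done
}

# Intended use (path is an example, not executed here):
#   missing_install_bins /tmp/unpacked-initramfs
```

For the 50-nfs.conf line quoted in comment #3, a tree that contains modprobe but not sysctl would be reported as missing /sbin/sysctl. The check is deliberately shallow: it does not follow shell scripts or resolve commands looked up via PATH.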
(In reply to Dave Young from comment #2)
> Hi Matej, the comment #1 looks like the kdump kernel bootup log. Could you
> attach the whole kernel log? It could be some networking issue, either in
> the scripts or some device driver issue, anyway kernel log will be helpful.
>
> Moved to kexec-tools component, we can move back to nfs if it is a nfs bug
> later.

Matej, since Tao has identified the root cause, please ignore the above request.
(In reply to Dave Young from comment #4)
> Hi Tao,
>
> Thanks! good finding. Sounds like sysctl is needed in dracut nfs module,
> so not only a kdump issue if people use nfs in initramfs they will have this
> bug, so probably the right component should be "dracut"?

Yes, I agree it is better to fix this in dracut. Should the bz be assigned to the dracut team?

Thanks,
Tao Liu
Sorry, un-needinfo from bhe
(In reply to ltao from comment #6)
> (In reply to Dave Young from comment #4)
> > Hi Tao,
> >
> > Thanks! good finding. Sounds like sysctl is needed in dracut nfs module,
> > so not only a kdump issue if people use nfs in initramfs they will have this
> > bug, so probably the right component should be "dracut"?
>
> Yes, I agree it's better to be fixed in dracut, should the bz be assigned to
> dracut team?
>

Yes, just reassigned. Thanks!
Possibly related to: https://github.com/dracutdevs/dracut/issues/1857
FEDORA-2022-38325154c4 has been submitted as an update to Fedora 36. https://bodhi.fedoraproject.org/updates/FEDORA-2022-38325154c4
This should be fixed in https://bodhi.fedoraproject.org/updates/FEDORA-2022-38325154c4 . See https://bugzilla.redhat.com/show_bug.cgi?id=2100668
FEDORA-2022-38325154c4 has been pushed to the Fedora 36 testing repository. Soon you'll be able to install the update with the following command: `sudo dnf upgrade --enablerepo=updates-testing --refresh --advisory=FEDORA-2022-38325154c4` You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2022-38325154c4 See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.
FEDORA-2022-38325154c4 has been pushed to the Fedora 36 stable repository. If problem still persists, please make note of it in this bug report.