Description of problem:
When updating a kernel via a regular "dnf upgrade", DKMS modules are not compiled until after the system reboots. This causes many problems, since critical functionality sometimes depends on the module existing at boot.

Version-Release number of selected component (if applicable):
dkms-2.6.1-3.fc30

How reproducible:
Always

Steps to Reproduce:
1. Install a DKMS module on an outdated version of the kernel
2. Run dnf upgrade
3. Reboot

Actual results:
If your boot depends on a DKMS module, it will fail. If it doesn't, you will boot and notice that Fedora is /compiling kernel modules in a systemd service depended upon by multi-user.target/, and the output from ./configure is spamming the system journal. I also believe that, since compilation can be a very CPU-intensive process, it should not be run in the background. If you have any services that depend on a DKMS module, they will likely fail, since they are usually started before the module finishes compiling.

Expected results:
It is my personal belief that all compilation and setup should be initiated by the package manager or the user, not run as a systemd service. I would expect dkms to run after every kernel upgrade. I would never expect it to run as part of the default target on systemd. However, if we wish to keep the current systemd-based dkms setup, we should ensure that it runs before most other services. This would mean stalling the startup until a possibly very slow compilation process finishes. If out-of-tree root filesystems are in scope, we should ensure dkms is finished before we reboot to a new kernel.

Additional info:
For example, if you somehow install Fedora on ZFS (which is probably very unsupported), you would be surprised that booting to the latest kernel after an upgrade fails, since the module for your root filesystem is missing.
In my case, I have a systemd service that sets up a WireGuard tunnel on startup. Since wireguard wasn't compiled until startup and DKMS is part of multi-user.target, the service attempts to run and fails before wireguard has finished compiling.

I have looked at https://bugzilla.redhat.com/show_bug.cgi?id=702483. Since that bug is very old and marked fixed and closed, I believe this is not a duplicate.

I have provided some relevant journal messages below:

During shutdown, 1-2 minutes after dnf upgrade:
systemd[1]: dkms.service: Succeeded.

There is no other output mentioning dkms before shutdown.

During startup:
systemd[1]: Starting firewalld - dynamic firewall daemon...
sh[636]: Kernel preparation unnecessary for this kernel. Skipping...
sh[636]: Building module:
sh[636]: make -C /lib/modules/5.0.14-300.fc30.x86_64/build M=/var/lib/dkms/wireguard/0.0.20190406/build clean
systemd[1]: Started PostgreSQL database server.
[...]
sh[636]: make: Leaving directory '/usr/src/kernels/5.0.14-300.fc30.x86_64'
sh[636]: { make -j1 KERNELRELEASE=5.0.14-300.fc30.x86_64 -C /lib/modules/5.0.14-300.fc30.x86_64/build M=/var/lib/dkms/wireguard/0.0.20190406/build; } >> /var/lib/dkms/wireguard/0.0.20190406/build/make.log 2>&1
[...]
systemd[1]: Starting WireGuard via wg-quick(8) for wg...
wg-quick[1597]: [#] ip link add wg type wireguard
wg-quick[1597]: Error: Unknown device type.
wg-quick[1597]: Unable to access interface: Protocol not supported
wg-quick[1597]: [#] ip link delete dev wg
wg-quick[1597]: Cannot find device "wg"
systemd[1]: wg-quick: Main process exited, code=exited, status=1/FAILURE
systemd[1]: wg-quick: Failed with result 'exit-code'.
systemd[1]: Failed to start WireGuard via wg-quick(8) for wg.
[...]
sh[636]: CLEAN /var/lib/dkms/wireguard/0.0.20190406/build
sh[636]: CLEAN /var/lib/dkms/wireguard/0.0.20190406/build/.tmp_versions
sh[636]: CLEAN /var/lib/dkms/wireguard/0.0.20190406/build/Module.symvers
sh[636]: make: Leaving directory '/usr/src/kernels/5.0.14-300.fc30.x86_64'
sh[636]: DKMS: build completed.
sh[636]: wireguard.ko.xz:
sh[636]: Running module version sanity check.
sh[636]: - Original module
sh[636]: - No original module exists within this kernel
sh[636]: - Installation
sh[636]: - Installing to /lib/modules/5.0.14-300.fc30.x86_64/extra/
sh[636]: Adding any weak-modules
sh[636]: do_depmod 5.0.14-300.fc30.x86_64

I am very new to Bugzilla. Please tell me if you need any other logs or information, or if this is the wrong place to report this.
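One way to spot the problem before rebooting is to check each installed kernel for a built copy of the module. The following is only a minimal sketch, assuming modules land in the extra/ directory of each kernel's module tree as in the log above. The function name kernels_missing_module and its optional second argument are illustrative and not part of DKMS.

```shell
#!/bin/sh
# Print every kernel version under a modules root that has no built copy
# of the given module in its extra/ directory.
# Usage: kernels_missing_module MODULE [MODULES_ROOT]
# MODULES_ROOT defaults to /lib/modules; the second argument is an
# illustrative addition so the function can be tried on a scratch directory.
kernels_missing_module() {
    module=$1
    root=${2:-/lib/modules}
    for dir in "$root"/*/; do
        # Skip the unexpanded glob when the root is empty or missing.
        if [ ! -d "$dir" ]; then
            continue
        fi
        kver=$(basename "$dir")
        found=no
        # Match MODULE.ko as well as compressed forms such as MODULE.ko.xz.
        for f in "$dir"extra/"$module".ko*; do
            if [ -e "$f" ]; then
                found=yes
            fi
        done
        if [ "$found" = no ]; then
            printf '%s\n' "$kver"
        fi
    done
}
```

Running kernels_missing_module wireguard after a dnf upgrade lists the kernels that would boot without the module. Each listed version could then be built by hand with "dkms autoinstall --kernelver VERSION" before rebooting, provided the matching kernel-devel package is installed.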
This is annoying behaviour for anything that's needed early in the boot process:
- Network drivers (for NFS mounts)
- Storage controller drivers in general
- File systems

I agree that this should be done at kernel install time, to make sure the modules needed are present in the initrd.
For sure there is room for improvement. DKMS can probably be called with a %trigger in the SPEC file.

http://ftp.rpm.org/api/4.4.2.2/triggers.html

Both akmods and dkms module packages that I've seen around trigger the build and install at the moment the package is installed/upgraded:

https://github.com/negativo17/dkms-nvidia/blob/master/dkms-nvidia.spec#L41-L49

The same approach can also be used in a %trigger section in the kernel module SPEC file. Patches are welcome.
I'm a bit puzzled about what the motivation for this change was. Arguably the new behaviour gives a worse user experience (even if all goes as planned), as the next reboot is delayed while DKMS does its thing. The old behaviour led to longer DNF runtimes, but the system could be used during that time.
This new build behaviour may affect ZFS users: https://github.com/zfsonlinux/zfs/issues/8763
This appears to be related to https://bugzilla.redhat.com/show_bug.cgi?id=1696202. dkms on fc30 still installs hooks in /etc/kernel/postinst.d and /etc/kernel/prerm.d, yet /usr/bin/kernel-install from systemd left it to /sbin/new-kernel-pkg from grubby to run the postinst and prerm hooks, and grubby seems to have lost new-kernel-pkg in fc30. I suspect the install on boot is just dkms being resilient and that this was never intended behavior. If systemd's kernel-install is the correct place to call /etc/kernel/install.d/ hooks, why isn't it also required to call the /etc/kernel/postinst.d and /etc/kernel/prerm.d hooks?
Not sure exactly how to solve it, considering that the DKMS package has not changed. Shall we add a %trigger in the SPEC file to trigger the DKMS autobuild every time a new kernel-devel package is installed?
This is all a bit convoluted, as far as I can figure out.

The DKMS package drops a file into /etc/kernel/postinst.d/dkms. This file triggers the actual rebuild process when a new kernel gets installed.

In F29 and earlier the call chain for this is:

/bin/kernel-install (called from the %posttrans in kernel-core packages)
/usr/lib/kernel/install.d/20-grubby.install (or /usr/lib/kernel/install.d/20-grub.install)
/sbin/new-kernel-pkg
/etc/kernel/postinst.d/dkms

In F30 /sbin/new-kernel-pkg is no longer a thing, the file doesn't exist. So I guess what has to happen is that DKMS instead has to drop a file into /usr/lib/kernel/install.d/ which triggers the rebuild instead. I'm sure the calling conventions of that file will be different from the /etc/kernel/postinst.d/dkms one, and the order of the scripts in /usr/lib/kernel/install.d/ is important.
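To make the idea concrete, here is a rough sketch of what such a file in /usr/lib/kernel/install.d/ might look like. It is not the actual DKMS packaging. It relies on the kernel-install(8) convention that plugins receive the verb (add or remove) as the first argument and the kernel version as the second. The function name and the DKMS_BIN override are hypothetical, added only so the logic can be dry-run, and the file name and numeric prefix (which determine ordering) are deliberately left open.

```shell
#!/bin/sh
# Hypothetical kernel-install plugin for DKMS (sketch only).
# kernel-install calls plugins as:
#   PLUGIN add    KERNEL-VERSION ENTRY-DIR KERNEL-IMAGE [INITRD ...]
#   PLUGIN remove KERNEL-VERSION ENTRY-DIR
# DKMS_BIN is an illustrative override for dry runs, not a DKMS feature;
# by default the dkms found on PATH is used.
dkms_kernel_install_hook() {
    command=$1
    kernel_version=$2
    dkms_bin=${DKMS_BIN:-dkms}

    case "$command" in
        add)
            # Build and install all registered modules for the new kernel.
            "$dkms_bin" autoinstall --kernelver "$kernel_version"
            ;;
        *)
            # Other verbs, including remove, are ignored in this sketch.
            return 0
            ;;
    esac
}

# Entry point when run as a plugin; with no arguments this does nothing.
dkms_kernel_install_hook "$@"
```

Setting DKMS_BIN=echo and calling the function with "add 5.0.14-300.fc30.x86_64" prints the dkms command line instead of running it, which is a cheap way to check the argument handling. Whether the build can succeed at that point depends on the matching kernel-devel files already being installed, which is where the ordering concern comes in.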
As I noted, this is a dupe of https://bugzilla.redhat.com/show_bug.cgi?id=1696202, which was just closed with an update to grubby-8.40-31.fc30. I would suggest closing this bug as a dupe. My unit test of zfs modules shows them getting updated (and removed) when kernels are added and removed.
*** This bug has been marked as a duplicate of bug 1696202 ***
(In reply to Ralf Ertzinger from comment #7)
> This is all a bit convoluted, as far as I can figure out.
>
> The DKMS package drops a file into /etc/kernel/postinst.d/dkms. This file
> triggers the actual rebuild process when a new kernel gets installed.
>
> In F29 and earlier the call chain for this is:
>
> /bin/kernel-install (called from the %posttrans in kernel-core packages)
> /usr/lib/kernel/install.d/20-grubby.install (or
> /usr/lib/kernel/install.d/20-grub.install)
> /sbin/new-kernel-pkg
> /etc/kernel/postinst.d/dkms

You can still get the old behaviour by setting "GRUB_ENABLE_BLSCFG=false" in /etc/default/grub and also installing grubby-deprecated.

>
> In F30 /sbin/new-kernel-pkg is no longer a thing, the file doesn't exist. So
> I guess what has to happen is that DKMS instead has to drop a file into
> /usr/lib/kernel/install.d/ which triggers the rebuild instead. I'm sure the
> calling conventions of that file will be different from the
> /etc/kernel/postinst.d/dkms one, and the order of the scripts in
> /usr/lib/kernel/install.d/ is important.

akmods fixed a similar problem by installing the file /usr/lib/kernel/install.d/95-akmodsposttrans.install
(In reply to Villy Kruse from comment #10)
> > In F30 /sbin/new-kernel-pkg is no longer a thing, the file doesn't exist. So
> > I guess what has to happen is that DKMS instead has to drop a file into
> > /usr/lib/kernel/install.d/ which triggers the rebuild instead. I'm sure the
> > calling conventions of that file will be different from the
> > /etc/kernel/postinst.d/dkms one, and the order of the scripts in
> > /usr/lib/kernel/install.d/ is important.
>
>
> akmods fixed a similar problem by installing the file
> /usr/lib/kernel/install.d/95-akmodsposttrans.install

Thanks, I will look into it, also to see whether it can help with the bug that occurs when a new dkms package is installed together with a new kernel in the same yum/dnf transaction.
I can confirm that installing grubby-8.40-31.fc30 fixes the issue: DKMS modules are built after a new kernel install again.
I also confirm that the dkms modules are built when a new kernel is installed. When a kernel is removed the dkms modules for that kernel aren't cleaned up. Is a separate bug required to address that?
As far as I can tell that has never worked, so it's not new behaviour. That would be a separate bug report.