Bug 1474969 - VirtualBox keeps breaking after updates because akmods sometimes does not do anything
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: akmods
Version: 27
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Nicolas Chauvet (kwizart)
QA Contact: Fedora Extras Quality Assurance
Keywords: Reopened

Reported: 2017-07-25 13:39 EDT by Basic Six
Modified: 2018-04-08 12:59 EDT (History)
Last Closed: 2018-04-08 12:59:27 EDT
Type: Bug

Attachments
VirtualBox: Kernel driver not installed (rc=-1908) (25.86 KB, image/png)
2017-07-25 13:43 EDT, Basic Six
/var/cache/akmods/VirtualBox/5.1.24-1-for-4.11.11-300.fc26.x86_64.failed.log (26.41 KB, text/plain)
2017-07-27 19:01 EDT, Sergio Monteiro Basto
akmods fails to start build process (2.55 KB, text/plain)
2017-09-19 09:21 EDT, Basic Six
5.1.30-1-for-4.13.12-200.fc26.x86_64.failed.log (54.64 KB, text/plain)
2017-11-20 05:39 EST, Basic Six
akmods.log (3.83 KB, text/plain)
2017-11-20 05:39 EST, Basic Six

Description Basic Six 2017-07-25 13:39:08 EDT
Description of problem:

This bug was reported for RPM Fusion, but the report was closed without solution because akmods has been moved to Fedora:
https://bugzilla.rpmfusion.org/show_bug.cgi?id=4485

Symptoms:
VirtualBox keeps breaking after updating/upgrading Fedora, if a new kernel has been installed. After updating and restarting the system, VirtualBox can't start vms anymore.
A regular non-admin user, who has been told to update his system every now and then to receive security fixes, would stop updating his system in the long run because it keeps breaking his virtualization software, meaning the computer is unusable (after updating) until a friend or colleague who is familiar with the command line has time to debug this.

Error: Kernel driver not installed (rc=-1908)

Suspected cause:
VirtualBox itself is fine; it is the update routine that does not work reliably. The peculiarity of VirtualBox on Linux is that the kernel driver needs to be rebuilt after updating the system. Many Internet resources advise installing "akmods" and "akmod-VirtualBox" in order to automate this process (because the prebuilt kmod packages are often out of sync, so VirtualBox sometimes stops working until an updated package is available in the repository a few days later).
However, sometimes, akmods decides not to rebuild the vboxdrv module, apparently because something might have gone wrong last time ("Ignoring VirtualBox-kmod as it failed earlier"). This is bad for two reasons:
1) The user does not see this warning when akmods is run automatically on reboot.
2) It does not make any sense. In this particular case, the last reboot was last week (unexpected cold reboot), the system is usually hibernated otherwise. Now, after upgrading to Fedora 26, akmods did not rebuild the kernel module because it thinks that something went wrong last week. Under which rare circumstances this makes any sense is unclear.

Workaround:
An admin user (not the regular user who is not familiar with the command line and/or does not have root access) needs to run akmods manually with the "--force" option, forcing akmods to do what it is supposed to do anyway.

Suggestion:
Whenever akmods is run automatically, it should just rebuild the module and not produce a hidden warning. Better yet, it shouldn't be checking at all if the last rebuild worked. It should just rebuild the module.



Version-Release number of selected component (if applicable):

Fedora 26
akmod-VirtualBox-5.1.22-1.fc26.x86_64
akmods-0.5.6-7.fc26.noarch
VirtualBox-5.1.22-1.fc26.x86_64



How reproducible:

Not always, but often. Might be triggered by a cold reboot (i.e., loss of power due to low battery and system not hibernating on low battery).



Steps to Reproduce:
1. Use VirtualBox on Fedora.
2. Update the system.
3. VirtualBox won't work anymore.



Actual results:

When trying to start a vm, an error message with this title is shown:
Error: Kernel driver not installed (rc=-1908)

This appears to be caused by akmods, which sometimes decides not to rebuild the vboxdrv kernel module.



Expected results:

VirtualBox should keep working after updating Fedora. Not just 3 or 4 times in a row, but always. Even after a power loss.



Additional info:

From an end user's perspective, this behavior is very bad, as any system update could render the virtualization software unusable. This would be a serious problem if virtualization is needed for work.

For the most part, Fedora appears to be very end user friendly (not counting the initial installation and upgrades which, for example, sometimes fail without showing an error message, although improvements have been made). However, VirtualBox's update routine needs some fixing because this has been going on for years and it would be great if VirtualBox would *always* keep working after updating the system.

Since this happens to many people, there are quite a few questions asking for help that can be found online. In many cases as well as in the case of the error message shown by VirtualBox, the suggested solution is to install "akmod-VirtualBox" and "kernel-devel" using the command line (again, not suitable for regular users), which does not help if those packages are already installed (probably have been since the installation of VirtualBox).
Comment 1 Basic Six 2017-07-25 13:43 EDT
Created attachment 1304368 [details]
VirtualBox: Kernel driver not installed (rc=-1908)

For the record and for Google, this is the error message (version 5.1.14 r112924).

VirtualBox: <html><b>Kernel driver not installed (rc=-1908)</b><br/><br/>The VirtualBox Linux kernel driver (vboxdrv) is probably not loaded.You may not have kernel driver installed for kernel that is runnig, if so you may do as root:  <font color=blue>dnf install akmod-VirtualBox kernel-devel-$(uname -r)</font>If you installed VirtualBox packages and don't want reboot the system, you may need load the kernel driver, doing as root:  <font color=blue>akmods; systemctl restart systemd-modules-load.service</font><br/><br/><br><br><!--EOM-->where: suplibOsInit#012what:  3#012VERR_VM_DRIVER_NOT_INSTALLED (-1908) - The support driver is not installed. On linux, open returned ENOENT.#012</html>
Comment 2 Basic Six 2017-07-25 13:44:31 EDT
Additional information, copied from the old bug report:



https://bugzilla.rpmfusion.org/show_bug.cgi?id=4485#c0

After a system update has installed a new kernel version, VirtualBox won't run again because the VirtualBox kernel modules have not been built automatically.

Error: Kernel driver not installed (rc=-1908)

This is not a new setup, it usually worked in the past and both "akmod-VirtualBox" and "kernel-devel" are installed.

However, log messages reveal that the akmods tool did not even start the build process. Instead, it logged an error:

# cat /var/log/messages | grep akmod
... systemd: Starting Builds and install new kmods from akmod packages...
... systemd: Starting Builds and install new kmods from akmod packages...
... akmods: Checking kmods exist for 4.9.13-201.fc25.x86_64[  OK  ]
... systemd: Started Builds and install new kmods from akmod packages.
... akmods: Ignoring VirtualBox-kmod as it failed earlier[WARNING]
... akmods: Hint: Some kmods were ignored or failed to build or install.
... akmods: You can try to rebuild and install them by by calling
... akmods: '/usr/sbin/akmods --force' as root.
... systemd: Started Builds and install new kmods from akmod packages.

# cat /var/cache/akmods/akmods.log
...
... akmods: Checking kmods exist for 4.9.13-201.fc25.x86_64
... akmods: Ignoring VirtualBox-kmod as it failed earlier
... akmods: Hint: Some kmods were ignored or failed to build or install.
... akmods: You can try to rebuild and install them by by calling
... akmods: '/usr/sbin/akmods --force' as root.
...

Running akmods without the --force option (or without uninstalling all kmod-VirtualBox-* packages) does indeed fail to start the build process:

# akmods
Ignoring VirtualBox-kmod as it failed earlier              [WARNING]

Using the --force option, it works:

# akmods --force
Checking kmods exist for 4.9.13-201.fc25.x86_64            [  OK  ]
Building and installing VirtualBox-kmod                    [  OK  ]

It looks like something might have to be changed so that akmods is forced to start the build process after booting into a new kernel. Or maybe removing this warning might be an option.

This automatic build system is great to hide all the work that is required when a new kernel is installed, but this is another flaw that destroys the user experience for regular users who cannot or should not use terminals.
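
The log check shown above can be scripted so that the hidden warning becomes visible. This is only a minimal sketch, assuming the log line format quoted in this report; the function name ignored_kmods is made up for illustration and is not part of akmods:

```shell
#!/bin/sh
# List the kmods that akmods skipped because an earlier attempt failed.
# Argument: path to an akmods log (defaults to the path quoted in this report).
# ignored_kmods is a hypothetical helper name, not an akmods command.
ignored_kmods() {
    log="${1:-/var/cache/akmods/akmods.log}"
    # Match lines such as:
    #   ... akmods: Ignoring VirtualBox-kmod as it failed earlier
    # and print only the kmod name, once per kmod.
    sed -n 's/.*akmods: Ignoring \(.*\) as it failed earlier.*/\1/p' "$log" | sort -u
}
```

Calling ignored_kmods without arguments reads /var/cache/akmods/akmods.log; any name it prints is a module that akmods refused to rebuild on that system.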



https://bugzilla.rpmfusion.org/show_bug.cgi?id=4485#c3

As for your second question: I have checked the log file (akmods.log) and I've found the following entries, which are right above those quoted in my original message (03/21):
2017/03/13 11:27:25 akmods: Checking kmods exist for 4.9.13-201.fc25.x86_64
2017/03/13 11:27:25 akmods: Building and installing VirtualBox-kmod
2017/03/13 11:27:25 akmods: Building RPM using the command '/sbin/akmodsbuild --target x86_64 --kernels 4.9.13-201.fc25.x86_64 /usr/src/akmods/VirtualBox-kmod.latest'
2017/03/13 11:28:23 akmods: Installing newly built rpms
2017/03/13 11:28:23 akmods: DNF detected
2017/03/13 11:28:30 akmods: Could not install newly built RPMs. You can find them and the logfile
2017/03/13 11:28:30 akmods: 5.1.14-1-for-4.9.13-201.fc25.x86_64.failed.log in /var/cache/akmods/VirtualBox/

So if I understand this correctly, akmods did not rebuild the VirtualBox kernel modules (but instead produced a warning, which wasn't shown on the screen) on 03/21 because a dnf process happened to be running a week earlier?

And again, this decision of akmods led to the quoted error message shown by VirtualBox when trying to start a vm. Or, as a casual user might put it: VirtualBox on Linux is broken again after a reboot
(Note that such a casual user might not be familiar with the command line, they might not know how to become root and, for their own safety, they probably shouldn't. If such a user has a vm to run a piece of software that's not compatible with Linux, they would "randomly" not be able to work with that software anymore.)
Comment 3 Nicolas Chauvet (kwizart) 2017-07-26 10:58:31 EDT
Hans is working hard to clean the virtualbox guest drivers so it can get accepted into the kernel, so I'm re-assigning the bug to me.
FYI https://fedoraproject.org/wiki/Changes/VirtualBox_Guest_Integration


Quoting akmods.log
2017/03/13 11:28:30 akmods: Could not install newly built RPMs. You can find them and the logfile

(It was a long time ago, btw.)
So this means that the build succeeded but the installation timed out for some reason, probably because some other process was locking dnf.
If akmods times out in this case, there is a need to verify the reason for the timeout and retry on the next run.

The message "Ignoring {{foo}} as it failed earlier" should only be seen when the kmod has failed to build, so that akmods does not attempt to build it again unless there is a kernel-devel update or an akmod-foo update.
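
To tell the "built but not installed" case apart from other failures, one can look for the install message quoted above in akmods.log. A minimal sketch, assuming that message text stays as quoted; last_install_failure is a hypothetical helper name, and it says nothing about why the install step failed:

```shell
#!/bin/sh
# Print the most recent "Could not install newly built RPMs" line from an
# akmods log and return 0; return non-zero if the log has no such line.
# last_install_failure is a hypothetical helper name, not part of akmods.
last_install_failure() {
    line=$(grep 'akmods: Could not install newly built RPMs' "$1" | tail -n 1)
    [ -n "$line" ] && printf '%s\n' "$line"
}
```

For example, "last_install_failure /var/cache/akmods/akmods.log" would print the 2017/03/13 11:28:30 line from the excerpt above.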
Comment 4 Sergio Monteiro Basto 2017-07-26 11:02:39 EDT
IMHO, we should enable akmods-shutdown.service, so that the modules are built on shutdown.

The akmods-shutdown script (/usr/sbin/akmods-shutdown) is very simple [1].
Maybe just remove --force, I don't know.




[1] 
echo "Building modules for all installed kernels."
for kernel in /usr/src/kernels/*; do
        kernel=$(basename $kernel)
        /usr/sbin/akmods --force --kernels $kernel
done
Comment 5 Nicolas Chauvet (kwizart) 2017-07-26 11:27:31 EDT
(In reply to Sergio Monteiro Basto from comment #4)
> IMHO , we should enable akmods-shutdown.service , as if it was build on
> shutdown 
This would not fix this issue, because if a build failure occurs for a given akmod, every shutdown would take ages for no benefit.
Comment 6 Sergio Monteiro Basto 2017-07-26 12:15:48 EDT
We already discussed this some months ago. We may remove --force from the akmods-shutdown script, but I don't agree with disabling it.
Comment 7 Sergio Monteiro Basto 2017-07-27 19:01 EDT
Created attachment 1305690 [details]
/var/cache/akmods/VirtualBox/5.1.24-1-for-4.11.11-300.fc26.x86_64.failed.log

2017/07/27 22:57:02 akmodsbuild: Makefile:923: "Cannot use CONFIG_STACK_VALIDATION, please install libelf-dev, libelf-devel or elfutils-libelf-devel"

Thanks for the report; I could reproduce the error.
Comment 8 Sergio Monteiro Basto 2017-07-27 19:23:09 EDT
akmods --force
Checking kmods exist for 4.11.11-300.fc26.x86_64           [  OK  ]
Building and installing VirtualBox-kmod                    [  OK  ]

So akmods doesn't build on boot, and even if it does, you may miss load-modules; if you miss it, vboxservice fails on guest systems (it needs the kmods), so enabling akmods-shutdown is necessary IMHO.
Also, akmods does not run automatically after akmods is installed... (this one was tricky, but I think that we have it).
We also have /etc/kernel/postinst.d/akmodsposttrans, which should create kmods for new kernels...

But I have some ideas for writing akmods version 3, where we eliminate kernel-buildsys, build akmods in the main package and remove other loops in the akmods2 design. I would prefer to write a new akmods rather than try to fix this one: make a factory of kmod packages and, if needed, generate new akmods with kernel patches.
Comment 9 Nicolas Chauvet (kwizart) 2017-07-28 03:16:21 EDT
(In reply to Sergio Monteiro Basto from comment #7)
...
> 2017/07/27 22:57:02 akmodsbuild: Makefile:923: "Cannot use
> CONFIG_STACK_VALIDATION, please install libelf-dev, libelf-devel or
> elfutils-libelf-devel"

I don't think that's the error. But it would mean that elfutils-libelf-devel needs to be installed by akmods; a shutdown service would not fix it.

Also, is the compiler segfault reproducible, or does it only occur if the elfutils-libelf-devel package is not installed?
Comment 10 Sergio Monteiro Basto 2017-07-28 07:17:14 EDT
(In reply to Nicolas Chauvet (kwizart) from comment #9)
> (In reply to Sergio Monteiro Basto from comment #7)
> ...
> > 2017/07/27 22:57:02 akmodsbuild: Makefile:923: "Cannot use
> > CONFIG_STACK_VALIDATION, please install libelf-dev, libelf-devel or
> > elfutils-libelf-devel"
> 
> I don't think that's the error. But it would means elfutils-libelf-devel
> needs to be installed by akmods, a shutdown service would not fix it. 

But I built it in a terminal without any modification, just with akmods --force.

> Also does the compiler segfault is reproducible ? or only if the
> elfutils-libelf-devel package is not installed ?
Comment 11 Nicolas Chauvet (kwizart) 2017-07-28 09:19:18 EDT
(In reply to Sergio Monteiro Basto from comment #10)
...
> But I built it on terminal without any modification , just akmods --force . 

Sure, but that doesn't mean we can use "--force" everywhere.

"--force" was set for a basic reason you still fail to understand. We don't want a module that failed to build from source once to be constantly retried on each boot, shutdown or whenever. So a dedicated --force option is needed to retry when a given akmod package failed to build.

But in this very specific case it's needed, and there is a need to understand why.
My understanding is that some gcc issue made the compiler segfault.
If we retry, it's more likely that we will hit the same compiler issue.
So we need to check if the gcc version has changed.

The other question is whether the compiler segfault can be reproduced. If it cannot, then it means there is a need to retry twice before skipping later attempts (which --force will override).

Then there is a need to verify that, for a given kernel module, the build is retried when a new akmod-foo update arrives. In a similar way, for a given akmod-foo version, the build should be attempted again for a new kernel-devel.
Comment 12 Sergio Monteiro Basto 2017-07-28 10:27:47 EDT
(In reply to Nicolas Chauvet (kwizart) from comment #11)

akmods builds fine (as usual), without --force, in a terminal and on shutdown, but fails on startup. I think that is the main problem.

I left my computer shut down at home; I will test it more thoroughly this evening.
Comment 13 Richard Shaw 2017-07-28 11:04:04 EDT
(In reply to Sergio Monteiro Basto from comment #12)
> (In reply to Nicolas Chauvet (kwizart) from comment #11)
> 
> akmods builds well (as usually ) and without --force on terminal and on
> shutdown , but fails on startup , I think that is the main problem . 

If the builds are specifically failing on startup then there must be something unavailable that we need to add as a dependency to prevent akmods from running too early during bootup.

In either case, the module should always get built after the dnf transaction completes (or be appended to the transaction with a dnf plugin). Running akmods on startup or on shutdown are both workarounds.

If the problem is that the user reboots after an update before the module gets installed (not a build error) we should inhibit shutdown and present the error to the user. Systemd has some capability here but I haven't had time to investigate it further.

We previously talked about changing what akmods considers a failure on RPM Fusion, such that a failure to install would not be considered a failure to akmods so the install would be attempted again (pretty much the only reason to keep the startup service).

If the problem is the module fails to build, we should also have a way to notify the user, again perhaps systemd could be useful here.
Comment 14 Nicolas Chauvet (kwizart) 2017-07-28 11:05:55 EDT
(In reply to Sergio Monteiro Basto from comment #12)
> (In reply to Nicolas Chauvet (kwizart) from comment #11)
> 
> akmods builds well (as usually ) and without --force on terminal and on
akmods doesn't build anything, it's a framework. Akmods builds modules. And modules sometimes fail to build for some reason, even for a short period of time. VirtualBox is not an exception.

Please "try" to remember that the normal process is to build modules on rpm post installation. Building at boot is the least inconvenient option, given that many users asking for a shutdown really mean "Shut...Down"!! and will force a power-off if it is not executed in a timely manner. Building on shutdown is pointless given that fact.
Comment 15 Sergio Monteiro Basto 2017-07-31 07:39:49 EDT
Hello, I experienced hard disk failures this weekend, so I couldn't report sooner.
It seems that with the new kernel 4.12 this problem is gone, i.e. akmods built the vbox kmod fine on boot.
Comment 16 Nicolas Chauvet (kwizart) 2017-07-31 08:19:35 EDT
One can reproduce easily by killing gcc while building the kmod.
Comment 17 Nicolas Chauvet (kwizart) 2017-08-12 17:23:38 EDT
What I've just tested today:
- Remove the kmod built from akmod-VirtualBox.
dnf remove kmod-VirtualBox-4.11.11-300.fc26.x86_64-5.1.26-1.fc26.x86_64
- just akmods (not --force)
- killall gcc
- The prebuilt kmod has failed and a /var/cache/akmods/VirtualBox/.last.log has been produced. (with a failure notice).
- rebooted
- akmod-VirtualBox produced a kmod-VirtualBox for my current kernel
-  lsmod |grep vbox
vboxpci                24576  0
vboxnetadp             28672  0
vboxnetflt             28672  0
vboxdrv               466944  3 vboxnetadp,vboxnetflt,vboxpci


So everything was fine, even when recovering after a failure (using akmods-0.5.6-10).
Older akmods versions, especially those older than akmods-0.5.6-7, might have such an issue.

@reporter.
Are you able to reproduce the issue or to give additional information?
(Was this after a Fedora upgrade?)
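
The final lsmod check in the steps above can be turned into a yes/no test. A minimal sketch; module_listed is a hypothetical helper name, and it only inspects lsmod-style text on standard input:

```shell
#!/bin/sh
# Read lsmod-style output on stdin and succeed if the named module is listed
# in the first column. module_listed is a hypothetical helper name.
module_listed() {
    awk -v mod="$1" '$1 == mod { found = 1 } END { exit !found }'
}
```

Typical use would be "lsmod | module_listed vboxdrv && echo loaded". Matching on the first column only avoids false hits from the "used by" column.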
Comment 18 Basic Six 2017-09-05 06:14:33 EDT
First of all, it's really great that someone is working on VirtualBox integration.

I tried what you did, I uninstalled kmod-VirtualBox* and rebooted to let the shutdown service rebuild the modules. This took a while but it worked.
However, this wasn't what I did when VirtualBox broke in the past.

You are right, it did happen after upgrading Fedora (probably after every upgrade). Maybe the akmods service didn't run after the upgrade routine installed a new kernel.

Another time it happened was after a cold reboot that happened because the laptop battery was low and the system did not hibernate (or maybe failed to hibernate) even though it's configured to hibernate on low battery and hibernating manually works too. Before that happened, an update must have installed a new kernel, so after that cold reboot, VirtualBox did not have a kmod module for the new kernel and thus failed to start.
In this case, you could argue that a cold reboot is generally bad and can lead to data loss or other issues, in other words it's the user's fault (except in this case, when the system did not hibernate and just shut off instead). But apart from all the other bad things that could happen after a cold reboot, this particular issue could be prevented. For example, the kernel modules could be built on startup, which should always work.

The first scenario, however, is certainly not the user's fault.
Comment 19 Nicolas Chauvet (kwizart) 2017-09-05 08:05:36 EDT
(In reply to Basic Six from comment #18)
[...]
> cold reboot, this particular issue could be prevented. For example, the
> kernel modules could be built on startup, which should always work.
Right, thx for your confirmation. Building on startup is the ultimate fix if anything broke for any reason. This is what's currently implemented.

So I consider this issue as fully fixed.
Comment 20 Basic Six 2017-09-19 09:21 EDT
Created attachment 1327943 [details]
akmods fails to start build process

After updating my system, I now have akmods 0.5.6 (akmods-0.5.6-10.fc26.noarch, akmod-VirtualBox-5.1.26-1.fc26.x86_64) installed and it happened again. (Previously, the system froze, apparently due to high memory usage caused by Firefox and Chrome, a cold reboot was necessary.)

Same story: VirtualBox wouldn't start vms, akmods successfully built the kernel module when called manually with the "--force" option.

An excerpt from the log file (/var/cache/akmods/akmods.log) is attached.

I thought this was already fixed. Do I have to wait for Fedora 27 to get the fix?
Comment 21 Sergio Monteiro Basto 2017-09-19 11:04:22 EDT
2017/09/18 15:20:17 akmods: Installing newly built rpms
2017/09/18 15:20:17 akmods: DNF detected
2017/09/18 15:20:19 akmods: Could not install newly built RPMs. You can find them and the logfile

dnf -y install --disablerepo='*' $(find "${tmpdir}results" -type f -name '*.rpm' | grep -v debuginfo) >> "${kmodlogfile}" 2>&1

failed for some reason, i.e. we should have the built rpms in /var/cache/akmods/VirtualBox, but dnf install failed...

You may run:
ls -trl /var/cache/akmods/VirtualBox and manually install the latest rpms.

Thanks.
Comment 22 Basic Six 2017-11-20 05:38:29 EST
I'm sorry, maybe there's something I've missed, but it's still happening. I upgraded to Fedora 27 (using gnome-software).
After the upgrade had finished, VirtualBox didn't work anymore.
Almost half a year has passed since I opened the bug and VirtualBox still keeps breaking after every major update.

I haven't done any debugging myself this time. I'll attach two log files.
Comment 23 Basic Six 2017-11-20 05:39 EST
Created attachment 1355647 [details]
5.1.30-1-for-4.13.12-200.fc26.x86_64.failed.log
Comment 24 Basic Six 2017-11-20 05:39 EST
Created attachment 1355660 [details]
akmods.log
Comment 25 Sergio Monteiro Basto 2017-11-20 05:52:07 EST
dnf failed [1] , thanks for the report 


Running transaction
Failed to obtain the transaction lock (logged in as: root).
Error: Could not run transaction.
Comment 26 Basic Six 2017-11-20 06:17:39 EST
It seems to me like there is another aspect to this problem:
After "systemctl restart akmods", it still didn't work:

akmods: Checking kmods exist for 4.13.12-300.fc27.x86_64
akmods: Ignoring VirtualBox-kmod as it failed earlier
akmods: Hint: Some kmods were ignored or failed to build or install.
akmods: You can try to rebuild and install them by by calling
akmods: '/usr/sbin/akmods --force' as root.

Even if dnf failed to install the module earlier, it should now work, after restarting the service or rebooting the computer. But it doesn't work, because akmods still won't do what it's supposed to do if the last attempt failed.
I still think that this behavior does not make any sense, which is why I opened this bug report in July. Why should akmods not do anything if the previous attempt failed? I don't get what the benefit is, but I do see that this peculiarity is the reason why VirtualBox (which broke after upgrading Fedora) still won't work after a subsequent reboot.
Again, this is my original complaint from half a year ago.

This makes a major difference. Regular users, who cannot and should not use the command line, could simply reboot (to let akmods do its work) and that should fix it. But since it'll refuse to install the modules because it "failed earlier", VirtualBox would be broken permanently until a friend who is familiar with the command line has time to fix it. Note that VirtualBox is an essential piece of software for many people. Also note that this is not a purely theoretical thing. I know people who stopped updating their Fedora because VirtualBox keeps breaking. Other people stopped using VirtualBox because they didn't have the time to debug it after every update. That way, more users will move to other virtualization platforms.
Comment 27 Richard Shaw 2017-11-20 11:56:43 EST
A workaround would be to copy the service file and add the "--force" option to ExecStart...

# cp /usr/lib/systemd/system/akmods.service /etc/systemd/system/
(edit ExecStart)
# systemctl daemon-reload
# systemctl restart akmods
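
The "(edit ExecStart)" step could be done non-interactively. A minimal sketch, assuming the unit has a single-line ExecStart= entry; its real contents are not shown in this report, so the function only appends to whatever is there. add_force_to_execstart is a hypothetical helper name:

```shell
#!/bin/sh
# Read a unit file on stdin and write it to stdout with " --force" appended
# to the ExecStart= line, unless that line already contains --force.
# add_force_to_execstart is a hypothetical helper name.
add_force_to_execstart() {
    sed '/^ExecStart=/{/--force/!s/$/ --force/;}'
}
```

Run as root, "add_force_to_execstart < /usr/lib/systemd/system/akmods.service > /etc/systemd/system/akmods.service" would replace the cp and edit steps; the daemon-reload and restart commands stay the same.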
Comment 28 Nicolas Chauvet (kwizart) 2017-11-20 12:25:49 EST
(In reply to Richard Shaw from comment #27)
> A workaround would be to copy the service file and add the "--force" option
> to ExecStart...
There would be no reason to have a --force option.
Then, if a given (non-essential) module fails, it would constantly slow down the system boot.

What is appropriate is to correctly detect which situation is failing (disk space, OOM killer, kernel-devel version change, gcc change) and whether it is worth trying again or not.
Comment 29 Richard Shaw 2017-11-20 12:29:36 EST
I did specify that it was a workaround not a solution...
Comment 30 Basic Six 2017-11-21 07:40:21 EST
I was going to ask why akmods would rebuild all modules every single time (--force), including those that didn't fail. So I looked at akmods again, wondering how it determines what has failed.
Running akmods (now, after it failed) does check if the VirtualBox module already exists and it prints "OK" (the second line reads "Ignoring VirtualBox-kmod as it failed earlier").
In my case, this is the module for kernel "4.13.12-300.fc27.x86_64" (this one is "OK"). This suggests that the module is already there, but it's not.
This indicates that akmods does have a module for my kernel but it wouldn't install it, because it thinks it failed. And indeed, this file exists:
/var/cache/akmods/VirtualBox/kmod-VirtualBox-4.13.12-300.fc27.x86_64-5.1.30-1.fc27.x86_64.rpm

This is confirmed by the log file:
/var/cache/akmods/VirtualBox/5.1.30-1-for-4.13.12-300.fc27.x86_64.failed.log
Failed to obtain the transaction lock (logged in as: root).
Error: Could not run transaction.
akmods: Could not install newly built RPMs. You can find them and the logfile

So akmods does have the module, but it's not installed. The installation failed for some reason. At this point, I don't really care at all why it failed.

VirtualBox thinks that its module is missing because it has not been installed. akmods thinks that the module already exists because it sees this leftover (rpm file) from last time, not knowing that it hasn't been installed.

Why is this file not deleted when an error occurs, if it's not going to be installed anyway?

If akmods would clean up afterwards (i.e., delete the file that couldn't be installed), it could later reliably determine if a module exists (i.e., has been built successfully) or not. Each existing file indicates a successful build. If one is missing, it needs to be built. Not on every reboot, but possibly when booting a new kernel.
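
The leftover rpm check described above can be sketched as follows. This assumes the file naming seen in this comment (kmod-NAME-KERNEL-VERSION.rpm); built_kmod_rpms is a hypothetical helper name and this is not how akmods itself decides anything:

```shell
#!/bin/sh
# List kmod rpms that were built for a given kernel in an akmods cache
# directory, e.g. /var/cache/akmods/VirtualBox.
# Arguments: cache directory, kernel release (defaults to the running kernel).
# built_kmod_rpms is a hypothetical helper name.
built_kmod_rpms() {
    dir="$1"
    kernel="${2:-$(uname -r)}"
    find "$dir" -type f -name "kmod-*-${kernel}-*.rpm" | sort
}
```

For each file it prints, something like rpm -q "$(basename "$file" .rpm)" should show whether that package actually got installed. A file that exists but is not installed is the situation described in this comment.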
Comment 31 Stephen Gallagher 2017-11-28 20:42:32 EST
How are you doing the updates? Are these happening with simple rpm/dnf updates in a live system, followed by a reboot? Or are you using the PackageKit/GNOME Software offline updates, where the update process occurs within a special minimalistic boot environment?

I was seeing this same behavior with the nVidia proprietary driver and it occurred to me that the failure might be caused by the akmod builds attempting to be built during the minimal environment (thus missing some critical service that is needed to complete successfully).
Comment 32 Basic Six 2017-12-06 15:34:03 EST
(In reply to Stephen Gallagher from comment #31)
> How are you doing the updates?

Since dnfdragora sometimes crashes in the middle of the process (Bug 1491367) and gnome-software always wants to reboot, I often use "dnf update".

However, VirtualBox also always broke after an upgrade and I've used gnome-software for that.

Anyway, that's not important. The akmods mechanism is flawed, as outlined in my previous comment. Please fix it.
Comment 33 Martin Thain 2018-01-04 08:32:40 EST
 (In reply to Basic Six from comment #32)
> (In reply to Stephen Gallagher from comment #31)
> > How are you doing the updates?
> 
> Since dnfdragora sometimes crashes in the middle of the process (Bug
> 1491367) and gnome-software always wants to reboot, I often use "dnf update".
> 
> However, VirtualBox also always broke after an upgrade and I've used
> gnome-software for that.
> 
> Anyway, that's not important. The akmods mechanism is flawed, as outlined in
> my previous comment. Please fix it.

I have also had nVidia problems.

I just updated the kernel (via dnf update) and the system fell back to nouveau.

Workaround 

sudo /usr/sbin/akmods --force
[sudo] password for martin: 
Checking kmods exist for 4.14.11-300.fc27.x86_64           [  OK  ]
Building and installing nvidia-kmod                        [  OK  ]

I have only had this issue since I started using f27 in late December.
Comment 34 Knut J BJuland 2018-03-14 01:37:50 EDT
I found a workaround: I removed the old nvidia kernel driver before installing the new one when I upgraded to 390.42.

For some reason it did not install the kernel driver when I ran akmods --force.
Comment 35 Sergio Monteiro Basto 2018-03-15 13:02:27 EDT
(In reply to Knut J BJuland from comment #34)
> I found a workaround I remove the old nvidia kernel driver before it
> installed the when I upgraded to 390.42.
> 
> For some reason it did not install the kernel driver when i wrote akmods
> --force

Please add logs like those in comments #23 and #24, so we can understand what happened in your case. Thanks.
Comment 36 Knut J BJuland 2018-03-16 02:09:43 EDT
I think I got my error because I changed the version to 0, because I want my own rpm to be upgraded when a new rpmfusion rpm is released for nvidia 390.42.
Comment 37 Basic Six 2018-03-20 09:06:57 EDT
I suggest opening a separate bug ticket for other problems with akmods (like "--force didn't work") or nvidia etc.

Is there any progress? As described above, the akmods mechanism appears to be flawed. Has anyone thought about that, how it could be improved?
This is why I opened this bug ticket and the previous one (a *year* ago), which was closed, and I don't see any confirmation from a dev that the problem is understood and the design will be changed (I'm assuming my understanding of akmods is correct as nobody has questioned my description in Comment 30).

I'd like to repeat that this has been randomly breaking VirtualBox on Fedora for years. I don't understand why it's not being fixed. There are people out there who need VirtualBox and they can't work anymore when their VirtualBox setup breaks.
Comment 38 Knut J BJuland 2018-03-20 09:11:34 EDT
What about moving to dkms?
Comment 39 Nicolas Chauvet (kwizart) 2018-03-20 10:18:57 EDT
(In reply to Basic Six from comment #37)
> I suggest opening a separate bug ticket for other problems with akmods (like
> "--force didn't work") or nvidia etc.
> 
> Is there any progress? As described above, the akmods mechanism appears to
Progress on what in particular?
I think the original issue was fixed a long time ago...
Also, the last akmods update is from 2018/01 and fixed an interaction issue with "offline-updates", so everything should be perfect when using gnome-software.

Please consider opening a separate issue if you still experience problems, because the current wording of the original report isn't helpful, unfortunately.
Comment 40 Sergio Monteiro Basto 2018-03-20 11:38:30 EDT
Please add logs like those in comments #23 and #24, so we can understand what happened in your case. Thanks.
Comment 41 Sergio Monteiro Basto 2018-03-20 11:42:51 EDT
(In reply to Basic Six from comment #37)
> I'd like to repeat that this has been randomly breaking VirtualBox on Fedora
> for years. I don't understand why it's not being fixed. There are people out
> there who need VirtualBox and they can't work anymore when their VirtualBox
> setup breaks.

Sometimes the build breaks when we have a major kernel update. That is nothing we can fix in akmods, because it is a module compilation problem, but without logs I can't say what it was; I'm not a wizard.
Comment 42 Martin Thain 2018-03-20 15:57:25 EDT
Regarding akmods and nvidia - I have had no problems for some weeks and would have no problem with this bug report being closed. Perhaps comment 39 applies in my case.
Comment 43 Nicolas Chauvet (kwizart) 2018-03-26 08:38:49 EDT
(In reply to Basic Six from comment #30)
Reading your comment again, and looking into other akmods issues, I don't think we can reliably determine the reason for the failure. For example, the rpmbuild error code can be the same whether the module fails to build, gcc got OOM-killed, the disk filled up, there is a gcc mismatch, etc.

Of course, if there is already an existing kmod built, then it should probably be safe to bypass a new compilation and just install it.

I've set two fixes for these issues anyway:
- There is now an inhibitor that will prevent a reboot/shutdown if the akmods@.service is running (after a new kernel has been installed). This should prevent users from rebooting too fast just after a kernel update.
- The alwaystry value is set by default, so akmods will bypass any previous failure. It means that akmods will now retry on every boot if there is a build failure (that's until better error detection can be worked on).

Maybe I will revisit the alwaystry enforcement on f29+ if it can be better handled.

Please test with akmods-0.5.6-15, currently being pushed to all branches.
Comment 44 Nicolas Chauvet (kwizart) 2018-04-08 12:59:27 EDT
Please try to reproduce any issue with akmods-0.5.6-15.
(and report a new bug if relevant).
Basically, there should be no need to use --force anymore.
