Bug 2012209 - Nvidia Akmod Fails to build against custom kernel
Summary: Nvidia Akmod Fails to build against custom kernel
Keywords:
Status: CLOSED DUPLICATE of bug 1729460
Alias: None
Product: Fedora
Classification: Fedora
Component: akmods
Version: rawhide
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Nicolas Chauvet (kwizart)
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-10-08 15:00 UTC by Ayush Singh
Modified: 2022-05-12 21:15 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2022-05-12 21:15:56 UTC
Type: Bug
Embargoed:


Attachments
Akmod Nviidia Log (1.34 KB, text/plain)
2021-10-08 15:00 UTC, Ayush Singh

Description Ayush Singh 2021-10-08 15:00:23 UTC
Created attachment 1830884
Akmod Nviidia Log

I am compiling Linux kernel 5.15.0-rc3 with LLVM instead of GCC. The akmod fails to build for the custom kernel, which isn't unexpected on its own, to be honest. The problem is that the logs don't contain enough information to diagnose the failure.

One possibility is that akmods, just like DKMS, simply doesn't work with LLVM/Clang unless the compiler is changed manually. Another is that there is a problem with the kernel source under /usr/src/kernels. However, as I said, I simply don't have enough information to find out, since there is almost no documentation for akmods.

The logs do contain an error:
```
error: Package has no %description: kmod-nvidia-5.15.0-rust+
```
However, I am not really sure whether this is what causes the build to fail.


Version-Release number of selected component (if applicable):
Nvidia Driver Version: 470.74-1
akmods version: 0.5.6


Steps to Reproduce:
Try building akmod-nvidia for a kernel compiled with LLVM instead of the GNU toolchain.


Actual results:
akmod-nvidia fails to build. The log is attached to this report.


Additional info:
Custom Kernel Source: https://github.com/Rust-for-Linux/linux

Comment 1 Sergio Basto 2021-10-08 15:49:53 UTC
Please attach /var/cache/akmods/nvidia/470.74-1-for-5.15.0-rust+.failed.log

Comment 2 Ayush Singh 2021-10-08 16:02:45 UTC
(In reply to Sergio Basto from comment #1)
> Please attach /var/cache/akmods/nvidia/470.74-1-for-5.15.0-rust+.failed.log

Those logs are already attached as "Akmod Nviidia Log"
Here is the link https://bugzilla-attachments.redhat.com/attachment.cgi?id=1830884

Comment 3 Nicolas Chauvet (kwizart) 2021-10-08 16:36:45 UTC
I don't think it's the purpose of akmod/dkms to be compiler-aware. akmod will pick up whatever is set in its environment, regardless of whatever is set in the developer's environment.

The correct fix for this would be to somehow insert or "hardcode" the required compiler in kernel-devel. There is no way around that.
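
To illustrate what "knowing the required compiler" could look like, here is a minimal sketch, assuming the kernel's .config is readable and records the compiler family through the CONFIG_CC_IS_CLANG / CONFIG_CC_IS_GCC symbols. `detect_kernel_cc` is a hypothetical helper, not something akmods or kmodtool provides, and the /lib/modules/<version>/build path is an assumption about where the kernel build tree is linked.

```shell
# detect_kernel_cc is a hypothetical helper; it is not part of akmods or kmodtool.
# It reads a kernel .config and reports which compiler family built the kernel.
detect_kernel_cc() {
    if grep -q '^CONFIG_CC_IS_CLANG=y' "$1"; then
        echo clang
    elif grep -q '^CONFIG_CC_IS_GCC=y' "$1"; then
        echo gcc
    else
        echo unknown
    fi
}

# Example: suggest make arguments for an out-of-tree module build.
# Assumption: the build tree is reachable through /lib/modules/<version>/build.
config="/lib/modules/$(uname -r)/build/.config"
if [ -r "$config" ] && [ "$(detect_kernel_cc "$config")" = clang ]; then
    # LLVM=1 assumes the kernel was built with the full LLVM toolchain;
    # CC=clang alone may be the right choice otherwise.
    echo "suggested: make LLVM=1 -C /lib/modules/$(uname -r)/build M=\$PWD modules"
fi
```

This only reports what the kernel was built with; whether akmods would pass such a setting through to the module build is a separate question.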

Comment 4 Ayush Singh 2021-10-08 16:53:56 UTC
(In reply to Nicolas Chauvet (kwizart) from comment #3)
> I don't think it's the purpose of akmod/dkms to be compiler-aware. akmod
> will pick up whatever is set in its environment, regardless of whatever
> is set in the developer's environment.
> 
> The correct fix for this would be to somehow insert or "hardcode" the
> required compiler in kernel-devel. There is no way around that.

I know, it's just that the error log doesn't make clear whether the build fails because of the different compiler or because of something else entirely.
Also, I don't know about akmods, but dkms kinda resets the environment variables to defaults before starting the build. So even changing the environment variables has no effect on a dkms build (there are some hacks I think, but I personally haven't tried them).

Comment 6 Nicolas Chauvet (kwizart) 2021-10-08 17:59:25 UTC
> akmods ... kinda resets the environment variables to defaults before starting the build. 
How come? How did you set the compiler in the first place?

Comment 7 Sergio Basto 2021-10-08 19:06:22 UTC
error: Package has no %description: kmod-nvidia-5.15.0-rust+

Comment 8 Ayush Singh 2021-10-09 05:36:19 UTC
(In reply to Nicolas Chauvet (kwizart) from comment #6)
> > akmods ... kinda resets the environment variables to defaults before starting the build. 
> How come? How did you set the compiler in the first place?

As I said, that's the case for dkms. I'm not really sure if akmods does the same thing or not.
Here is the thread about dkms with clang: https://github.com/dell/dkms/issues/124

Comment 9 Nicolas Chauvet (kwizart) 2021-11-04 11:22:52 UTC
1/ Can you try to build kmod-nvidia directly (without using akmods)?
See also https://rpmfusion.org/Packaging/KernelModules/Kmods2

2/ The error is: error: Package has no %description: kmod-nvidia-5.15.0-rust+

Can you attach the output of the following command on your system?
kmodtool --target x86_64 --repo rpmfusion --kmodname nvidia-kmod --akmod --for-kernels 5.15.0-rust+

Comment 10 eric 2021-11-18 06:36:30 UTC
I was having similar issues, and this is what I get when I run the kmodtool command:

```
/usr/bin/kmodtool: line 200: 5.12.19-zen2+: syntax error: invalid arithmetic operator (error token is ".12.19-zen2+")
```

Comment 11 Nicolas Chauvet (kwizart) 2021-12-10 09:48:29 UTC
It seems we have an issue with parsing your kernel name.

Comment 12 nicolas.vieville 2021-12-10 14:57:23 UTC
Hello,

The kmodtool shell script from the rawhide branch seems to work correctly.

As an attempt to fix this issue locally, you could replace 
line 200 of /usr/bin/kmodtool (as root, after making a copy of 
the file) from:

local kernel_uname_r_rel_plus_one=$(( kernel_uname_r_rel+1 ))

to these lines:

if $(echo "${kernel_uname_r_rel}" | grep -qE '^[0-9]+$') ; then
	kernel_uname_r_rel_plus_one=$(( kernel_uname_r_rel+1 ))
else
	kernel_uname_r_rel_plus_one=""
fi

It is also possible to comment out line 200 of the /usr/bin/kmodtool file
and add the replacement suggested above right after it.
Note: the suggested replacement is taken from the rawhide branch of kmodtool.
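
To see why the guard avoids the error from comment #10, here is a minimal standalone sketch of the same idea. `rel_plus_one` is a hypothetical helper name used only for illustration; kmodtool itself assigns to kernel_uname_r_rel_plus_one inline, and the two sample values are just examples.

```shell
# rel_plus_one is a hypothetical helper used only for illustration;
# it is not a function defined by kmodtool.
rel_plus_one() {
    # Only do arithmetic when the argument is a plain run of digits.
    # Note: a value with a leading zero (e.g. 08) would still be read as
    # octal by the shell; that case is not handled in this sketch.
    if printf '%s\n' "$1" | grep -qE '^[0-9]+$'; then
        echo $(( $1 + 1 ))
    else
        echo ""
    fi
}

rel_plus_one 300               # numeric release: prints 301
rel_plus_one '5.12.19-zen2+'   # non-numeric: prints an empty line, no syntax error
```

Without the check, the shell would try to evaluate the non-numeric string as an arithmetic expression, which is what produces the "invalid arithmetic operator" message.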

Any comments are welcome,

Cordially,


-- 
NVieville

Comment 13 Nicolas Chauvet (kwizart) 2022-03-15 17:44:12 UTC
@Nicolas
Basically this is a non-kABI kernel where kernel_uname_r_rel_plus_one doesn't need to be computed. Can we just whitelist the RHEL kernels that we know support kABI for sure, and stop computing the variables that aren't relevant for non-kABI cases?


I think that's the way forward.

Comment 14 nicolas.vieville 2022-03-16 09:55:11 UTC
(In reply to Nicolas Chauvet (kwizart) from comment #13)
> @Nicolas
> Basically this is a non-kABI kernel where kernel_uname_r_rel_plus_one
> doesn't need to be computed. Can we just whitelist the RHEL kernels that
> we know support kABI for sure, and stop computing the variables that
> aren't relevant for non-kABI cases?
> 
> 
> I think that's the way forward.

Hello Nicolas,

Thank you for your comment.
For the record, the proposed fix in comment #12 is already in rawhide (main 
branch) of kmodtool. 
See https://src.fedoraproject.org/rpms/kmodtool/c/4852d351abeb85341ec90af552ff7e9cb0fce4e2?branch=rawhide

I think that this commit (#4852d35) already does what you suggested, even
if the distinction between kABI vs non-kABI kernels is not made directly 
by the kmodtool shell script. Here we hit the difficulty of the separate 
computation of values between the shell script and the rpm instructions or
macros.

For the moment, I don't see how to implement your suggestion correctly: 
dealing with kABI vs non-kABI kernels at the shell script level. I will have 
to think about it for a while before any attempt. If you have any suggestion 
on this point, feel free to comment.

Cordially,


-- 
NVieville

Comment 15 Ben Cotton 2022-05-12 16:01:58 UTC
This message is a reminder that Fedora Linux 34 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 34 on 2022-06-07.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '34'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 34 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 16 Sergio Basto 2022-05-12 21:12:11 UTC
This still needs to be reviewed.

Comment 17 Sergio Basto 2022-05-12 21:15:56 UTC

*** This bug has been marked as a duplicate of bug 1729460 ***

