Bug 1667014
Summary: Support ostree/silverblue builds

| Field | Value | Field | Value |
|---|---|---|---|
| Product | [Fedora] Fedora | Reporter | Alexander Larsson <alexl> |
| Component | akmods | Assignee | Nicolas Chauvet (kwizart) <kwizart> |
| Status | CLOSED ERRATA | QA Contact | Fedora Extras Quality Assurance <extras-qa> |
| Severity | unspecified | Docs Contact | |
| Priority | unspecified | | |
| Version | rawhide | CC | debarshir, hdegoede, hobbes1069, jlebon, kwizart, leigh123linux, negativo17, nicolas.vieville, sergio |
| Target Milestone | --- | | |
| Target Release | --- | | |
| Hardware | Unspecified | | |
| OS | Unspecified | | |
| Whiteboard | | | |
| Fixed In Version | akmods-0.5.6-19.fc29 | Doc Type | If docs needed, set a value |
| Doc Text | | Story Points | --- |
| Clone Of | | Environment | |
| Last Closed | 2019-03-01 12:09:14 UTC | Type | Bug |
| Regression | --- | Mount Type | --- |
| Documentation | --- | CRM | |
| Verified Versions | | Category | --- |
| oVirt Team | --- | RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- | Target Upstream Version | |
| Embargoed | | | |
Description
Alexander Larsson
2019-01-17 08:55:01 UTC
Created attachment 1521238 [details]
Add akmods-post, supporting akmods in rpm-ostree
Created attachment 1521239 [details]
Patch to kmodtool to call akmods-post
This patch changes kmodtool to call out to akmods-post (if it exists).
We could alternatively drop the optional call-out and instead bump the requirement to a version of akmods that has akmods-post.
Created attachment 1521243 [details]
Handle custom %dist in kmodtool
This patch to kmodtool is not directly related, but it is useful when testing this. If you build a kmod module targeting a different Fedora version, we currently don't handle the dist correctly.
With all these, and a custom hacky patch to build the nvidia driver on the 5.0 kernel, I was able to layer and build the xorg-x11-drv-nvidia driver and akmod from rpmfusion. Unfortunately it's not actually *loading* the nvidia driver on boot, so I have some more debugging to do.
Created attachment 1521253 [details]
Call depmod in akmods-post
This adds a call to depmod to the %post. This is enough for me to make nvidia work if the nouveau driver is disabled (which I unfortunately have to do manually atm).

I'm not sure I understand the change needed for dist in kmodtool. It's usually a given that akmods builds the kmod on the same Fedora version as the target kernel. In particular, it may lead to issues if any kernel module is built with a different version of gcc than the kernel.

Is there any issue with using posttrans in ostree (instead of post)?
It would be better if you could hook into "nohup /usr/sbin/akmods --from-akmod-posttrans --akmod ${kmodname}" instead (rather than creating yet another akmods script).
Or even use a systemd service (the same way it's done with /etc/kernel/postinst.d/akmodsposttrans).
(For info, we've kept nohup for compatibility with el6 systems.)

For the kmodtools change, it's easiest if I use explicit versions. I'm on regular f29, and I'm building rpms that will be installed on f30 silverblue.

I build nvidia-kmod on f29 with fedpkg --release f30 local (--release because I'm in a branch that fedpkg can't figure out the dist for). This will produce an rpm called akmod-nvidia-xxx.fc30.x86_64.rpm (even without the patch).

However, as part of building the akmod package the expanded macro in the specfile launches "rpmbuild -bs" to build a src.rpm that will be included in the akmods package. Before this patch, the generated srpm would be named /usr/src/akmods/nvidia-kmod-xxx.fc29.src.rpm, because the dist I passed to fedpkg is not forwarded, so it picks up the host default. Later the %post operation would not find the file, because when %dist is expanded during the akmods build it is (correctly) fc30.

Now, in a koji/rpmfusion style build this would not be a problem, because the host default dist value would be right. Still, if it is set, why not forward it to be more correct?

(In reply to Nicolas Chauvet (kwizart) from comment #6)
> Is there any issue to use posttrans in ostree ? (instead of post?)

I don't think there is anything particularly problematic with it in general.
However, for the akmods we already had a posttrans (see below) and I didn't want to mix them up.

> It will be better if you can hook into "nohup /usr/sbin/akmods
> --from-akmod-posttrans --akmod ${kmodname}" instead. (than creating yet
> another akmods script).

We can't use this, because it spawns in the background. This will not work for rpm-ostree: it will detect that the script exited immediately and think it is done.

If we want to keep it in posttrans we could do both calls in there. Something like:

    %posttrans -n akmod-${kmodname}
    /usr/sbin/akmods-post ${kmodname} %{_usrsrc}/akmods/${kmodname}-kmod-%{version}-%{release}.src.rpm
    nohup /usr/sbin/akmods --from-akmod-posttrans --akmod ${kmodname} &> /dev/null &

Either would work for me.

> Or even to use a systemd service (the same way it's done with
> /etc/kernel/postinst.d/akmodsposttrans)
> (we've keepted nohup for compatibility with el6 system for info).

I'm not sure how you imagine that this would work as a systemd service? In rpm-ostree the post scripts are run in a very minimal sandbox. There is no systemd access, and when we later boot there is no way to write to /usr. Also, even if we *had* systemd access it would be the wrong one, for the currently running host rather than the chroot of the new-image-to-be.

(In reply to Alexander Larsson from comment #8)
> If we want to keep it in posttrans we could do both calls in there.
> Something like:
>
> %posttrans -n akmod-${kmodname}
> /usr/sbin/akmods-post ${kmodname}
> %{_usrsrc}/akmods/${kmodname}-kmod-%{version}-%{release}.src.rpm
> nohup /usr/sbin/akmods --from-akmod-posttrans --akmod ${kmodname} &>
> /dev/null &

We could even move the are-we-on-ostree check here and spawn either of the two, because the nohup one is useless under rpm-ostree.

Also, I should note that I initially tried to make the script part of akmods rather than a separate script, but the shared code is basically nothing, and the akmods script had all sorts of dependencies that were not in silverblue (like including the non-existent /etc/rc.d/init.d/functions and various system checks that fail in the rpm-ostree post sandbox).

Ping? Can we get some review of this?

```
+ if ! grep -q OSTREE_VERSION= /etc/os-release; then
```

Note this isn't entirely reliable. It assumes that the tree was composed with the mutate-os-release option turned on (see https://rpm-ostree.readthedocs.io/en/latest/manual/treefile/). We could probably add e.g. a RPMOSTREE_ENV to key off of.

Though actually... is this functionality (running at %post time rather than bootup time) relevant in other use cases? Might make sense to have a file in /etc or something instead of being rpm-ostree specific.

Having something else more specific would be nice to key off. I'm not sure we actually want to run this code in a non-rpm-ostree %post ever, because it works a bit differently than how akmods otherwise works.

Normally what happens is that the akmod build constructs a kmod rpm which is then installed as a regular rpm. However, in the rpm-ostree %post case we can't install an rpm like that (or even reference the rpm db!), so we just uncompress the files in the right place and rely on rpm-ostree to manage those files as part of the "result of %post" layer in the image. Doing that in a non-ostree world seems wrong.

OK, I opened https://github.com/projectatomic/rpm-ostree/pull/1750. But as far as Silverblue is concerned, the `OSTREE_VERSION=` hack will work for now since we do turn on mutate-os-release there. IOW, I'm not suggesting this patch be held up by this rpm-ostree PR.
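
For readers following the detection discussion, here is a minimal sketch of an "are we in an rpm-ostree environment" check. It is not the code from the akmods-post attachment: the function name `is_ostree_env` and its two optional path arguments are hypothetical, added only so the check can be exercised against test files. The marker path `/run/ostree-booted` is the one named in the follow-up patch in this bug; the os-release fallback only helps on trees composed with mutate-os-release, as noted above.

```shell
#!/bin/sh
# Sketch only: is_ostree_env and its parameters are illustrative names,
# not part of akmods or rpm-ostree.
#   $1: os-release file to inspect   (default: /etc/os-release)
#   $2: marker file to look for      (default: /run/ostree-booted)
is_ostree_env() {
    os_release="${1:-/etc/os-release}"
    marker="${2:-/run/ostree-booted}"

    # Preferred signal: the marker file exists.
    if [ -e "$marker" ]; then
        return 0
    fi

    # Fallback: OSTREE_VERSION= is only present in os-release when the
    # tree was composed with the mutate-os-release option.
    if [ -r "$os_release" ] && grep -q 'OSTREE_VERSION=' "$os_release"; then
        return 0
    fi

    return 1
}

# Possible use at the top of a scriptlet helper (left commented out so that
# sourcing this file has no side effects):
#   is_ostree_env || exit 0
```

Note the use of `exit 0` rather than `return 0` in the commented usage line: outside a function, only `exit` reliably ends the script on a regular (non-ostree) system.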

Created attachment 1527435 [details]
akmods-post: Fix check for ostree
This patch adds, in addition to the old check for OSTREE_VERSION in /etc/os-release, a check for the new /run/ostree-booted that will be added in later versions of rpm-ostree as per: https://github.com/projectatomic/rpm-ostree/pull/1750
It also changes the "return 0" to "exit 0" so that we actually exit when we're not running under rpm-ostree.

Hi,

So overall this looks good to me. There is just one issue which I noticed; with that fixed I'm happy to merge this and do new akmods + kmodtools builds with this added for F30+.

The one issue which I noticed is that the new akmods-post has this:

    +# This is an ostree build, so do build for all
    +# deployed kernels in the %post
    +kernels="$(ls /lib/modules)"

And then in a later patch adds this:

    +depmod -v 2>&1

But that will only do depmod for the currently running kernel, so I think that needs to become:

    for i in $kernels; do
        depmod -v $i 2>&1
    done

I don't have a silverblue setup myself, so if you can provide a tested (incremental) patch doing that, then I will merge all the patches and do new akmods + kmodtools builds for F30+.

Regards,

Hans

p.s.

I realize that the %post vs %posttrans discussion is not entirely resolved yet. For now I believe it is fine to just go with the patches as is. If someone later wants to fold the akmods-post call into %posttrans then we can fix that with a later release.

@hansg Thx for the review comments

An additional note on the %post/%posttrans question: since there is an early check (now that the information that we are running on ostree can be exposed by /run/ostree), it would be even better to wrap the %post call with if /run/ostree.

I understand that the rpmcpio installation is a little hacky and aims to make installation faster on an ostree OS, but then it raises a few issues:
- Who is responsible for cleaning up the old un-owned kernel modules for older kmods?
- How to check if the given kmod is already installed (or needs an update, like moving from nvidia 410.33 -> 418.43)?
- Given it can take up to 5 minutes to build a given kmod, can we avoid building them for kernels if they are already available?
- What if a new kernel is in the transaction for the next ostree update (is there a need to reboot to get the new kernel/kernel-devel for akmods to build the kmod)?

My understanding is that using rpmcpio can speed up installation, but it would be better to still install the resulting kmods.

FYI, there is another patch series for kmodtool/akmods that will add "some" secure-boot support. In the case of an organisation using a host to build/sign kmods (with a corporate secureboot key), can you verify that using only pre-built kmods works as appropriate? I really would have expected looking at pre-built kmods to be the first step, before looking at akmods for ostree.

Also I'm not sure I understand how (which component and when) the new kmod-foo gets rebuilt on a later ostree update. In the current situation the "normal" distro will prefer to trigger akmods on kernel post-installation, but failing that it can also build on the next boot. I guess the akmods service doesn't need to be enabled on boot for ostree (should nvidia-fallback?).

(I confirm that I will be short of time until the middle of March. Testers and review comments are welcome, but don't expect a timely answer on my side, unfortunately.)

Ok, I've gone ahead and prepared an update with:

1) All patches from this bug
2) The suggested change to call depmod on all kernels
3) akmods-post renamed to akmods-ostree-post to make clear it is only for ostree, and so that if we can move the /run/ostree-booted check to the %post (or %posttrans) call it does not look weird
4) A fix for bug 1680121 so that akmods will work without the initscripts pkg

Here is an F30 scratch-build of the prepared update for akmods:
https://koji.fedoraproject.org/koji/taskinfo?taskID=33107834

And for kmodtool:
https://koji.fedoraproject.org/koji/taskinfo?taskID=33107942

Alex, can you please test these and let me know if they are ok, then I will kick off official builds for F30+.

Alex, I plan to add the fix for bug 1680121 to F29 too, since this might be biting users there too. Ideally I would just rebase (fast-forward) the F29 version to the F30 version; that would mean including the silverblue support, is that ok?

Hi Nicolas,

Thank you for your comments,

(In reply to Nicolas Chauvet (kwizart) from comment #17)
> Additional notes on the %post/%posttrans is that since there is an early
> check (now that the information that running ostree can be exposed by
> /run/ostree), it would be even better to wrap the %post call with if
> /run/ostree.

I believe that for now we need both the grep and the /run/ostree check. Once we can be sure all supported versions have /run/ostree, I agree we can move the check to %post, or maybe even turn %posttrans into:

    %posttrans -n akmod-${kmodname}
    if test -f /run/ostree-booted; then
        /usr/sbin/akmods-ostree-post ${kmodname} %{_usrsrc}/akmods/${kmodname}-kmod-%{version}-%{release}.src.rpm
    else
        nohup /usr/sbin/akmods --from-akmod-posttrans --akmod ${kmodname} &> /dev/null &
    fi

> I understand that the rpmcpio installation is a little hacky and aim to make
> install faster on ostree OS

This is not about speed. The problem is that in ostree we cannot run in the background to wait for the rpm transaction to exit, since the ostree sandbox which builds the overlay (*) will then exit immediately, and we cannot install an rpm while in the rpm transaction. In a sense dkms might be a better fit, but since rpmfusion does not do dkms, Alex has adapted akmods to work with this.

*) Installing things as non-flatpaks on ostree creates an overlay AFAIK

> but then it gives few issues:

I guess the cleanup is done by the overlay being thrown away and a new one built on pkg upgrades, but that is just my guess, so I will leave answering these up to Alex, who knows this a lot better than me.

> FYI, there is another patch serie for kmodtool/akmods that will add "some"
> secure-boot support. In the case of an organisation using a host to
> build/sign kmods (with a corporate secureboot key), can you verify that
> using only pre-built kmod works as appropriate. As I really would have
> expected to look at the pre-built kmod to be the first step before to look
> at akmods for ostree.

Pre-built kmods for the kmods we are most interested in are not distributable...

> I understand that the rpmcpio installation is a little hacky and aim to make install faster on ostree OS, but then it gives few issues:

I think there is a misunderstanding of how this works. Here is how a regular rpm-ostree update works.

1. Pull the new image into the local ostree repo
2. Check it out in /ostree/deploy/ next to the current checkout
3. Switch the boot symlink from the old to the new version
4. Reboot

However, if your configuration contains some extra rpms layered on top of the basic image, some more things happen between 1 and 2:

1. Download latest yum metadata
2. Resolve the list of layered rpms into a set of rpms (the layered ones and dependencies of them that are also not in the original base image).
3. Download all the rpms
4. Extract the rpms on top of the new ostree image (this is done using ostree ops, so it's smarter than it sounds)
5. Check out the resulting image into a temporary location
6. Run the %post scripts from the extracted rpms in a sandbox with the temporary location as root
7. Collect the set of changes in the temporary root and commit them to ostree
8. This is the new image that we check out as above.

The akmods post runs as step 6 above, and the result of that build is extracted into the temporary location and will thus be part of the read-only image that we boot into. The next time we update this will all be redone, and eventually (after yet another update) the image is not kept anymore and it (and the included kmod modules) is thrown away.

> - Who is responsible to clean-up the old un-owned kernel modules for older kmod ?

The extracted result is not unowned; it is part of the new image and will be thrown away with it.

> - How to check if the given kmod is already installed (or needs update, like moving from nvidia 410.33 -> 418.43) ?

rpms are not updated separately on silverblue. This only happens when you do an rpm-ostree update, at which point the layered rpm name "nvidia-akmod" will be resolved to the latest version.

> - Given it can take up to 5 minutes to build a given kmod, can we avoid building them for kernels if they already are already available.

It's tricky.
We can certainly layer a prebuilt kmod, but at any random time the user runs update there might not be one, and then the package layering will fail and the update will abort. To make this pick either one depending on the status of the repo needs changes to the rpm-ostree logic.

> - What if a new kernel is in the transaction for the next ostree update (Is there a need to reboot to get the new kernel/kernel-devel for akmods to build the kmod) ?

The %post is built in the sandbox with the new image as rootfs, so headers and whatnot will be for the next kernel. So, as long as building the module doesn't depend on the new kernel *running*, this should be fine.

(In reply to Hans de Goede from comment #16)
> And then in a later patch adds this:
>
> +depmod -v 2>&1
>
> But that will only do depmod for the currently running kernel, so I think
> that needs to become:
>
> for i in $kernels; do
> depmod -v $i 2>&1
> done

Depmod manpage says this is the default:

    -a, --all
        Probe all modules. This option is enabled by default if no file names are given in the command-line.

And, given that it did work for me, I assume this is true. However, the loop works also.

I'll try the new packages

(In reply to Alexander Larsson from comment #21)
> Depmod manpage says this is the default:
>
> -a, --all
> Probe all modules. This option is enabled by default if no file
> names are given in the command-line.

Ah, I understand that that is how you interpreted the manpage, but that is not how depmod actually works. From the intro of the manpage:

"depmod creates a list of module dependencies by reading each module under /lib/modules/version"

The -a option tells it to generate deps for all modules in that versioned directory; alternatively you can specify exactly which .ko files you want to generate deps for.
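
To illustrate the distinction, a per-kernel depmod loop could be sketched roughly as below. This is not the code that was merged: the function name `depmod_all_kernels`, its optional directory argument, and the `DEPMOD` override variable are hypothetical names, introduced here so the loop can be dry-run without touching the real module tree. It relies only on the documented behaviour that depmod, when given a version argument, processes that kernel's module directory instead of the one reported by `uname -r`.

```shell
#!/bin/sh
# Sketch only: depmod_all_kernels and DEPMOD are illustrative names.
#   $1:     directory holding one subdirectory per kernel version
#           (default: /lib/modules)
#   DEPMOD: command to run instead of depmod (useful for dry runs)
depmod_all_kernels() {
    modules_root="${1:-/lib/modules}"
    depmod_cmd="${DEPMOD:-depmod}"

    for dir in "$modules_root"/*/; do
        # If there are no subdirectories the glob stays unexpanded; skip it.
        [ -d "$dir" ] || continue
        version="$(basename "$dir")"
        # Passing the version selects that kernel's module directory
        # rather than the directory of the currently running kernel.
        $depmod_cmd -a "$version" || return 1
    done
}

# Dry-run example (prints the commands instead of running them):
#   DEPMOD="echo depmod" depmod_all_kernels /lib/modules
```

Globbing the directory avoids parsing the output of `ls`, which is a small deviation from the `kernels="$(ls /lib/modules)"` line quoted earlier in this bug.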

Things likely worked for you because either the old deps were still good (deps often don't change on nvidia driver updates), or you were only testing the kernel you were running at the time the %post executed.

> I'll try the new packages

Thanks.

(In reply to Hans de Goede from comment #23)
> (In reply to Alexander Larsson from comment #21)
> Things likely worked for you because either the old deps were still good
> (deps often don't change on nvidia driver updates); or you were only testing
> the kernel you were running at the time the %post executed.

Well, things failed without the depmod call, so I guess it was the second.

Whew, that was a pain. There were some issues with recent silverblue images being newer than the rawhide yum repo due to failed composes, so I had to back-rev to an older version. Then it failed to build the nvidia driver due to GPL incompatibility with the Fedora kernel, which is built with mutex debugging. So I had to work around that (don't ask..).

But, after that it worked fine!

We just need to land this in f30, and then have rpmfusion rebuild nvidia-kmod with the new kmodtools.

Thanks Hans!

akmods and kmodtool matching the scratch builds have been built for F30+ now, so this bug can be closed now.

Alex, usually for akmods and kmodtool we keep the package the same across all supported Fedora versions, so I would like to also build the new version for F29. Is that ok, or do you expect that it would cause issues for ostree/silverblue?

(In reply to Hans de Goede from comment #26)
>
> Alex, usually for akmods and kmodtool we keep the package the same across
> all supported Fedora versions, so I would like to also build the new version
> for F29, is that ok or do you expect that that would cause issues for
> ostree/silverblue?

No, I think that would work fine.

kmodtool-1-33.fc29 akmods-0.5.6-19.fc29 has been submitted as an update to Fedora 29.
https://bodhi.fedoraproject.org/updates/FEDORA-2019-22ad9a6b27

akmods-0.5.6-19.fc29, kmodtool-1-33.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-22ad9a6b27

Alex, the following comment got added to:
https://bodhi.fedoraproject.org/updates/FEDORA-2019-22ad9a6b27

"""
akmods is calling dnf or yum to install built rpms on silverblue?

2019/03/05 12:20:45 akmodsbuild: Checking for unpackaged file(s): /usr/lib/rpm/check-files /tmp/akmodsbuild.5KFhAYt5/BUILDROOT/nvidia-kmod-418.43-1.fc29.x86_64
2019/03/05 12:20:45 akmodsbuild: Wrote: /tmp/akmodsbuild.5KFhAYt5/RPMS/x86_64/kmod-nvidia-4.20.13-200.fc29.x86_64-418.43-1.fc29.x86_64.rpm
2019/03/05 12:20:45 akmodsbuild: Executing(%clean): /bin/sh -e /var/tmp/rpm-tmp.z3TTeo
2019/03/05 12:20:45 akmodsbuild: + umask 022
2019/03/05 12:20:45 akmodsbuild: + cd /tmp/akmodsbuild.5KFhAYt5//BUILD
2019/03/05 12:20:45 akmodsbuild: + cd nvidia-kmod-418.43
2019/03/05 12:20:45 akmodsbuild: + /usr/bin/rm -rf /tmp/akmodsbuild.5KFhAYt5/BUILDROOT/nvidia-kmod-418.43-1.fc29.x86_64
2019/03/05 12:20:45 akmodsbuild: + exit 0
2019/03/05 12:20:45 akmodsbuild: Executing(--clean): /bin/sh -e /var/tmp/rpm-tmp.xuZ5BD
2019/03/05 12:20:45 akmodsbuild: + umask 022
2019/03/05 12:20:45 akmodsbuild: + cd /tmp/akmodsbuild.5KFhAYt5//BUILD
2019/03/05 12:20:45 akmodsbuild: + rm -rf nvidia-kmod-418.43
2019/03/05 12:20:45 akmodsbuild: + exit 0
2019/03/05 12:20:45 akmods: Installing newly built rpms
2019/03/05 12:20:45 akmods: DNF not found, using YUM instead.
/usr/sbin/akmods: line 300: yum: command not found
2019/03/05 12:20:45 akmods: Could not install newly built RPMs. You can find them and the logfile in:
2019/03/05 12:20:45 akmods: /var/cache/akmods/nvidia/418.43-1-for-4.20.13-200.fc29.x86_64.failed.log
"""

@hans, I think it's using rpm2cpio.
So the message probably comes from a situation where a non-silverblue script is running and assumes either DNF or YUM is present.

I disagreed with the design of this change. I don't think we should have changed anything in the interfaces; rather, the script should have been properly conditionalized on the new assumptions.

> Pre-built kmods for the kmods we are most interested in are not distributable...
BTW, this is not relevant: the framework allows preparing (building, signing) kmods that can be installed on another system within the same organisation.

akmods-0.5.6-19.fc29, kmodtool-1-33.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.