Bug 1576573

Summary: Directory /tmp is removed when installing kernel-core package
Product: [Fedora] Fedora
Reporter: larchunix
Component: grub2
Assignee: Peter Jones <pjones>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 28
CC: fmartine, frederick, lkundrak, olivier.lahaye1, pjones, pkliczew, rjones
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2019-05-28 22:25:03 UTC
Type: Bug
Bug Blocks: 910269    

Description larchunix 2018-05-09 19:47:44 UTC
Dear maintainer,

Description of problem:

Script /usr/lib/kernel/install.d/20-grub.install removes /tmp directory.

Version-Release number of selected component (if applicable):

grub2-common-2.02-34.fc28.noarch

How reproducible:

Always

Steps to Reproduce:
1. Set up a Docker container using the fedora:latest image
2. Install kernel-core package in the container

For instance you could use the following simple Dockerfile:

FROM fedora:latest
RUN dnf install -qy kernel-core
USER user
WORKDIR /home/user
ADD --chown=user:user . .
CMD ["/bin/bash"]

and create a new container using docker build.

Actual results:

The kernel-core post-installation scripts result in /tmp being removed from the container.

Expected results:

/tmp should be left intact

Additional info:

Here is the faulty sequence of post-installation scripts:

/bin/kernel-install
├─ BOOT_DIR_ABS=/tmp/kernel-install.nmFVb
├─ [snip]
├─ /usr/lib/kernel/install.d/20-grub.install add 4.16.6-302.fc28.x86_64 /tmp/kernel-install.nmFVb /lib/modules/4.16.6-302.fc28.x86_64/vmlinuz
   ├─ BOOT_DIR_ABS=/tmp/kernel-install.nmFVb
   ├─ rm -rf "${BOOT_DIR_ABS%/*}"  <== expanded to rm -rf /tmp
   └─ [snip]

git blame points to the following commit as the one that introduced this change: https://src.fedoraproject.org/rpms/grub2/c/4a0d9d8

In most installations /tmp is a tmpfs mountpoint so it does not get unlinked, but in my container /tmp is not a mountpoint so it is completely removed.
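The expansion can be reproduced in isolation. The sketch below reuses the path from the trace above and only prints the result instead of running rm, so it is safe to execute:

```shell
#!/bin/sh
# BOOT_DIR_ABS as seen in the trace above (a temporary directory under /tmp).
BOOT_DIR_ABS=/tmp/kernel-install.nmFVb

# "%/*" strips the shortest trailing match of "/*",
# i.e. it drops the last path component.
parent="${BOOT_DIR_ABS%/*}"

# Print the directory that rm -rf would have been pointed at.
echo "rm -rf target: $parent"
```

The printed target is the parent of the temporary directory, which is why the whole of /tmp is affected rather than just the kernel-install.* subdirectory.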

Kind regards

Comment 1 larchunix 2018-05-09 19:58:36 UTC
By the way dnf produces the following output:

dracut: No '/dev/log' or 'logger' included for syslog logging
findmnt: can't read (null): No such file or directory
rmdir: failed to remove '/tmp/kernel-install.BMCvb': No such file or directory
Warning: In kernel-install plugins, requiring BOOT_DIR_ABS to be preset is deprecated.
         All plugins should not put anything in BOOT_DIR_ABS if the environment
         variable KERNEL_INSTALL_MACHINE_ID is empty.
Exception ignored in: <bound method _TemporaryFileCloser.__del__ of <tempfile._TemporaryFileCloser object at 0x7f50ea250400>>
Traceback (most recent call last):
  File "/usr/lib64/python3.6/tempfile.py", line 450, in __del__
    self.close()
  File "/usr/lib64/python3.6/tempfile.py", line 446, in close
    unlink(self.name)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp9jmr1i31'

Comment 2 Frederick Ding 2018-06-15 01:24:55 UTC
I've gotten the same with kernel-core-4.16.15-300.fc28.x86_64. A long list of errors is printed because the container is unprivileged.

This has been reproducible for me on a CentOS docker host running unmodified kernel-3.10.0-862.3.2.el7.x86_64 , XFS with overlay2, and Docker 1.13.1.

Comment 3 Javier Martinez Canillas 2018-06-15 11:17:17 UTC
(In reply to larchunix from comment #0)
> 
> Expected results:
> 
> /tmp should be left intact
> 
> Additional info:
> 
> Here is the wrong sequence of post-installation scripts
> 
> /bin/kernel-install
> ├─ BOOT_DIR_ABS=/tmp/kernel-install.nmFVb
> ├─ [snip]
> ├─ /usr/lib/kernel/install.d/20-grub.install add 4.16.6-302.fc28.x86_64
> /tmp/kernel-install.nmFVb /lib/modules/4.16.6-302.fc28.x86_64/vmlinuz
>    ├─ BOOT_DIR_ABS=/tmp/kernel-install.nmFVb
>    ├─ rm -rf "${BOOT_DIR_ABS%/*}"  <== expanded to rm -rf /tmp
>    └─ [snip]
> 

I see. $BOOT_DIR_ABS is usually set by kernel-install to /boot/$MACHINE_ID/$KERNEL_VERSION (or /boot/efi/$MACHINE_ID/$KERNEL_VERSION for EFI) and is only used by systemd-boot, so that command attempts to remove the /boot/$MACHINE_ID directory, which GRUB 2 does not use. But $BOOT_DIR_ABS is set to /tmp/kernel-install.XXXXX if /etc/machine-id is empty, which is the case in the Fedora Docker image:

$ docker run -it fedora:latest wc -c /etc/machine-id
0 /etc/machine-id
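As a rough illustration of the selection described above, here is a simplified model. It is not the actual kernel-install source: pick_boot_dir is a hypothetical helper, the EFI variant is left out, and the literal XXXXX suffix stands in for the random name a real temporary directory would get.

```shell
#!/bin/sh
# Simplified model of how BOOT_DIR_ABS ends up under /boot or under /tmp.
# NOT the real kernel-install code; pick_boot_dir is a hypothetical helper.
# Usage: pick_boot_dir MACHINE_ID_FILE KERNEL_VERSION
pick_boot_dir() {
    machine_id=""
    # -s is true only for an existing, non-empty file.
    if [ -s "$1" ]; then
        read -r machine_id < "$1"
    fi
    if [ -n "$machine_id" ]; then
        echo "/boot/$machine_id/$2"
    else
        # Placeholder for a randomly named temporary directory.
        echo "/tmp/kernel-install.XXXXX"
    fi
}
```

With an empty machine-id file, as in the Fedora image, the model falls through to the /tmp branch, and the parent of that path is /tmp itself.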

Looking at previous versions, machine-id was set until F25:

$ for fver in {25..28}; do echo "fedora version $fver" && docker run -ti fedora:$fver cat /etc/machine-id; done
fedora version 25
b794b17bf0f14468a60a7fcc4dfe221b
fedora version 26
fedora version 27
fedora version 28

Not sure if that's expected or not.

Now going back to this bug, the kernel-install scripts assume that there is a $MACHINE_ID set and most scripts exit if that's not the case.

So we should probably do the same for 20-grub.install, which will prevent this bug from happening. That means the kernel images won't be copied to /boot. Would that be a problem?

I guess it won't, since there are other errors when attempting to install a kernel on F27 anyway (but of course /tmp is not removed there, which makes this a regression).
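A minimal sketch of such a guard, for illustration only. It is not the patch that was eventually committed, should_skip_plugin is a hypothetical name, and it assumes the plugin already has the machine ID available as a string:

```shell
#!/bin/sh
# Hypothetical guard for a kernel-install plugin.
# Returns 0 (skip the plugin) when the machine ID passed in is empty,
# and 1 (carry on) otherwise.
should_skip_plugin() {
    [ -z "$1" ]
}

# Intended use near the top of a plugin, left commented out so that
# this sketch has no side effects when run on its own:
#
# if should_skip_plugin "$MACHINE_ID"; then
#     exit 0
# fi
```

Exiting early this way means nothing under $BOOT_DIR_ABS is touched when no machine ID exists, at the cost of not copying the kernel to /boot.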

Out of interest, what's the goal of installing a kernel in a container image?

Comment 4 larchunix 2018-06-15 15:35:54 UTC
(In reply to Javier Martinez Canillas from comment #3)
> 
> Now going back to this bug, the kernel-install scripts assume that there is
> a $MACHINE_ID set and most scripts exit if that's not the case.
> 
> So probably we should do the same for 20-grub.install and that will prevent
> this bug to happen. That means the kernel images won't be copied to /boot.
> Would that be a problem?
> 
> I guess that it won't since there are other errors when attempting to
> install a kernel on F27 anyways (but of course /tmp is not removed which is
> a regression).
> 
> Out of interest, what's the goal of installing a kernel in a container image?

Thanks for having a look!

The use case was to install and run flatpak/flatpak-builder in a Fedora container for CI purposes.

kernel-core just happens to be pulled in by flatpak dependencies.

Since the kernel is not used in that case, not having it copied to /boot seems reasonable to me.

Comment 5 Richard W.M. Jones 2018-06-28 10:27:45 UTC
This bug affects installing virt-v2v in a container, and therefore
adding virt-v2v to KubeVirt.

Comment 6 Javier Martinez Canillas 2018-06-28 10:43:26 UTC
A fix is already in Pagure and will be part of the next grub2 update:

https://src.fedoraproject.org/rpms/grub2/c/46dcf6afe968e261a0a2e53f991c058630afaf1b?branch=f28

Comment 7 Olivier LAHAYE 2018-09-11 09:45:19 UTC
> Out of interest, what's the goal of installing a kernel in a container image?

It's mandatory for me, as I require a valid kernel in /boot to build the systemimager rpm.
(I'm using a docker container to build it, and while it works fine on centos-6 and centos-7, it fails on fedora because /boot has no kernel installed and dracut also fails.)

Will create a bug for that.
(https://github.com/finley/SystemImager/tree/initrd-from-imageserver-and-dont-package-initrd)

Comment 8 Richard W.M. Jones 2018-09-13 10:07:44 UTC
> Out of interest, what's the goal of installing a kernel in a container image?

libguestfs needs an actual kernel.

However it doesn't require the kernel to be copied to /boot (it'll
use the kernel from /lib/modules/*/vmlinuz) so I believe the fix
proposed here is fine.

Comment 9 Javier Martinez Canillas 2018-09-13 11:04:05 UTC
(In reply to Olivier LAHAYE from comment #7)
> > Out of interest, what's the goal of installing a kernel in a container image?
> 
> It's mandatory for me as I require a valid kernel in /boot to build
> systemimager rpm.
> (I'm using a docker container to build it and while it works fine on
> centos-6 and centos-7, it fails on fedora due to the fact that /boot has no
> kernel installed and dracut also fails.
> 
> Will create a bug for that.
> (https://github.com/finley/SystemImager/tree/initrd-from-imageserver-and-
> dont-package-initrd)

For the kernel part, it will work if you install grub2's kernel-install script and generate a machine ID. Dracut will still fail, but I'm not sure what's needed to make it work inside a container.

So something like the following will lead to a kernel being installed in /boot:

$ podman run -it --rm fedora:rawhide bash

[root@eae22e299aa9 /]# systemd-machine-id-setup
Initializing machine ID from random generator.
[root@eae22e299aa9 /]# cat /etc/machine-id 
f84abc5f66114715b42fd5be94ebe2ae
[root@eae22e299aa9 /]# dnf install grub2-common grub2-tools grub2-tools-minimal
[root@eae22e299aa9 /]# ls /lib/kernel/install.d/20-grub.install 
/lib/kernel/install.d/20-grub.install
[root@eae22e299aa9 /]# ls /boot/vmlinuz*
/boot/vmlinuz-4.19.0-0.rc3.git0.1.fc30.x86_64

Comment 10 Javier Martinez Canillas 2018-09-13 11:07:42 UTC
(In reply to Richard W.M. Jones from comment #8)
> > Out of interest, what's the goal of installing a kernel in a container image?
> 
> libguestfs needs an actual kernel.
> 
> However it doesn't require the kernel to be copied to /boot (it'll
> use the kernel from /lib/modules/*/vmlinuz) so I believe the fix
> proposed here is fine.

Yes, the kernel package will install the kernel in /lib/modules/*/vmlinuz. It is just the kernel-install scripts that copy it to /boot that won't work without an /etc/machine-id.

$ podman run -it --rm fedora:rawhide bash

[root@18ea66d6590f /]# dnf install kernel
[root@18ea66d6590f /]# ls /lib/modules/4.19.0-0.rc3.git0.1.fc30.x86_64/vmlinuz 
/lib/modules/4.19.0-0.rc3.git0.1.fc30.x86_64/vmlinuz
[root@18ea66d6590f /]# ls /boot/
[root@18ea66d6590f /]#
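For tools that need to find a kernel image in a container like this, one option is to check both locations mentioned in this report. The following is only a sketch: find_kernel_image is a hypothetical helper, and the ROOT argument exists so it can be tried against a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper: print the first kernel image that exists, checking
# ROOT/boot/vmlinuz-KVER first and then ROOT/lib/modules/KVER/vmlinuz.
# Usage: find_kernel_image ROOT KVER   (ROOT may be empty for the real root)
find_kernel_image() {
    for candidate in "$1/boot/vmlinuz-$2" "$1/lib/modules/$2/vmlinuz"; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    # Neither location holds an image for this version.
    return 1
}
```

In the session above, a call such as find_kernel_image "" 4.19.0-0.rc3.git0.1.fc30.x86_64 would fall back to the /lib/modules path, since /boot is empty.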

Comment 11 Ben Cotton 2019-05-02 20:56:24 UTC
This message is a reminder that Fedora 28 is nearing its end of life.
On 2019-May-28 Fedora will stop maintaining and issuing updates for
Fedora 28. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 28 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 12 Ben Cotton 2019-05-28 22:25:03 UTC
Fedora 28 changed to end-of-life (EOL) status on 2019-05-28. Fedora 28 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.