Bug 2154939

Summary: ppc64le kernel file is larger than on other architectures
Product: [Fedora] Fedora
Reporter: Dusty Mabe <dustymabe>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED EOL
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 37
CC: acaringi, adscvr, airlied, alciregi, bskeggs, bugproxy, dan, efuller, ellerman, hdegoede, hpa, jarodwilson, jforbes, jglisse, jlebon, josef, kernel-maint, kkiwi, lgoncalv, linville, masami256, mchehab, michael, ptalbert, rravanel, sarwrigh, steved, walters
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-12-06 11:45:35 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1071880

Description Dusty Mabe 2022-12-19 17:22:40 UTC
1. Please describe the problem:

Kernels on ppc64le appear to be much larger than on say x86_64:

On Fedora CoreOS nodes on `37.20221215.20.0` I see:

- `x86_64`:

```
$ ls -lh /boot/ostree/fedora-coreos-cbe65104658d968ba9257535af887e57369292356928a2f0aa19de9183ac9e9e/
total 86M
-rw-r--r--. 1 root root 74M Dec 15 22:20 initramfs-6.0.12-300.fc37.x86_64.img
-rwxr-xr-x. 1 root root 13M Dec 15 22:20 vmlinuz-6.0.12-300.fc37.x86_64
```

- `ppc64le`:

```
$ ls -lh /boot/ostree/fedora-coreos-6ccef70b6f4af574fc3b2486258f527111f66dee29e9004b5a18fa97332c04c5/
total 113M
-rw-r--r--. 1 root root 70M Dec 15 23:25 initramfs-6.0.12-300.fc37.ppc64le.img
-rwxr-xr-x. 1 root root 43M Dec 15 23:25 vmlinuz-6.0.12-300.fc37.ppc64le
```

So the kernel is a whole `30M` larger on `ppc64le`.


2. What is the Version-Release number of the kernel:

kernel-6.0.12-300.fc37


3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

Not sure. This is the first I've looked at this problem.


4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Yes. Just look at the files in the RPMs.


5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

Yes.


6. Are you running any modules that are not shipped directly with Fedora's kernel?:

No.


7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

Not needed since this can be observed easily without booting.

Comment 1 Renata Ravanelli 2023-01-12 13:29:22 UTC
Mike Ellerman, are you able to help us understand why?

Comment 2 Dan Horák 2023-01-13 10:42:57 UTC
IIRC we had this discussion earlier, also with dzickus@rh: vmlinux vs. vmlinuz, and how it relates to the usefulness of the separate debuginfo for the kernel. But I might recall it wrong :-)

Comment 3 IBM Bug Proxy 2023-01-13 14:50:19 UTC
------- Comment From ellerman.com 2023-01-13 04:46 EDT-------
I don't know about CoreOS, but I do have regular Fedora which I assume is similar.

On x86 the "vmlinuz" is actually a bzImage, ie. compressed.

On powerpc the "vmlinuz" is actually a vmlinux, ie. *not* compressed.

You can see with "file", eg:

# file /boot/vmlinuz-6.0.18-200.fc36.x86_64
/boot/vmlinuz-6.0.18-200.fc36.x86_64: Linux kernel x86 boot executable bzImage, version
6.0.18-200.fc36.x86_64 (mockbuild.fedoraproject.org) #1 SMP PREEMPT_DYNAMIC
Sat Jan 7 17:08:48 UTC 2023, RO-rootFS, swap_dev 0XC, Normal VGA

# file /boot/vmlinuz-6.0.12-200.fc36.ppc64le
/boot/vmlinuz-6.0.12-200.fc36.ppc64le: ELF 64-bit LSB executable, 64-bit PowerPC or cisco
7500, OpenPOWER ELF V2 ABI, version 1 (SYSV), statically linked,
BuildID[sha1]=448a34e76e7ef15d7cb653792501ca6628e6bb0b, stripped

Grub supports loading gzipped files, so you can just gzip the vmlinux in /boot and grub
will still boot from it just fine, eg:

# ls -lh vmlinuz-6.0.12-200.fc36.ppc64le
-rwxr-xr-x. 1 root root 43M Dec  8 12:15 vmlinuz-6.0.12-200.fc36.ppc64le

# gzip vmlinuz-6.0.12-200.fc36.ppc64le
# mv vmlinuz-6.0.12-200.fc36.ppc64le.gz vmlinuz-6.0.12-200.fc36.ppc64le

# ls -lh vmlinuz-6.0.12-200.fc36.ppc64le
-rwxr-xr-x. 1 root root 14M Dec  8 12:15 vmlinuz-6.0.12-200.fc36.ppc64le

# reboot
...
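
As a complement to the `file` output above, here is a minimal sketch of a check that tells an uncompressed ELF vmlinux from a gzipped one by looking at the first bytes of the file. The function name `kernel_image_format` is hypothetical and not part of any Fedora tooling. It relies only on the well-known magic numbers (`1f 8b` for gzip, `7f 45 4c 46` for ELF); anything else, including an x86 bzImage, is reported as `unknown`.

```shell
# Classify a kernel image by its leading magic bytes.
# gzip streams start with 1f 8b; ELF files start with 7f 45 4c 46 ("\177ELF").
# kernel_image_format is a hypothetical helper name used for illustration.
kernel_image_format() {
    # Dump the first 4 bytes as hex and strip spaces and newlines.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    case "$magic" in
        1f8b*)    echo gzip ;;
        7f454c46) echo elf ;;
        *)        echo unknown ;;
    esac
}
```

Run against the ppc64le file from the example above, this should print `elf` before the gzip step and `gzip` after it.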

Comment 4 Dusty Mabe 2023-01-13 15:41:21 UTC
(In reply to IBM Bug Proxy from comment #3)
> ------- Comment From ellerman.com 2023-01-13 04:46 EDT-------
> I don't know about CoreOS, but I do have regular Fedora which I assume is
> similar.

Indeed. It uses the exact same kernel RPMs that are built for Fedora.

> 
> On x86 the "vmlinuz" is actually a bzImage, ie. compressed.
> 
> On powerpc the "vmlinuz" is actually a vmlinux, ie. *not* compressed.

Right. This is why I opened this issue. I want to understand why it's
not compressed by default on this architecture versus the others and
if we can change it to be compressed by default. It obviously can work
(as you show below).

For full context on why I'm interested in the answer to this question
see: https://github.com/coreos/fedora-coreos-tracker/issues/1247#issuecomment-1355314761

> 
> You can see with "file", eg:
> 
> # file /boot/vmlinuz-6.0.18-200.fc36.x86_64
> /boot/vmlinuz-6.0.18-200.fc36.x86_64: Linux kernel x86 boot executable
> bzImage, version
> 6.0.18-200.fc36.x86_64 (mockbuild.fedoraproject.org) #1 SMP
> PREEMPT_DYNAMIC
> Sat Jan 7 17:08:48 UTC 2023, RO-rootFS, swap_dev 0XC, Normal VGA
> 
> # file /boot/vmlinuz-6.0.12-200.fc36.ppc64le
> /boot/vmlinuz-6.0.12-200.fc36.ppc64le: ELF 64-bit LSB executable, 64-bit
> PowerPC or cisco
> 7500, OpenPOWER ELF V2 ABI, version 1 (SYSV), statically linked,
> BuildID[sha1]=448a34e76e7ef15d7cb653792501ca6628e6bb0b, stripped
> 
> Grub supports loading gzipped files, so you can just gzip the vmlinux in
> /boot and grub
> will still boot from it just fine, eg:
> 
> # ls -lh vmlinuz-6.0.12-200.fc36.ppc64le
> -rwxr-xr-x. 1 root root 43M Dec  8 12:15 vmlinuz-6.0.12-200.fc36.ppc64le
> 
> # gzip vmlinuz-6.0.12-200.fc36.ppc64le
> # mv vmlinuz-6.0.12-200.fc36.ppc64le.gz vmlinuz-6.0.12-200.fc36.ppc64le
> 
> # ls -lh vmlinuz-6.0.12-200.fc36.ppc64le
> -rwxr-xr-x. 1 root root 14M Dec  8 12:15 vmlinuz-6.0.12-200.fc36.ppc64le
> 
> # reboot
> ...

Comment 5 Dan Horák 2023-01-30 10:26:49 UTC
I was able to find the old thread which seems to be captured in the linuxppc-dev archives, see
https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-June/thread.html#244234

Comment 6 Colin Walters 2023-01-30 19:52:58 UTC
> Grub supports loading gzipped files, so you can just gzip the vmlinux in /boot and grub
> will still boot from it just fine, eg:

We could probably forcibly compress the kernel (via an opt-in) in rpm-ostree builds.  

> I was able to find the old thread which seems to be captured in the linuxppc-dev archives, see
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-June/thread.html#244234

That thread seems to be about having the kernel build process generate a compressed image and this apparently bypasses the build-id or other processing.

But if we only care about booting from grub, then it seems at least on FCOS and derivatives we could just switch to compressing the kernel today?

Comment 7 Dusty Mabe 2023-02-01 04:25:35 UTC
(In reply to Colin Walters from comment #6)
> > Grub supports loading gzipped files, so you can just gzip the vmlinux in /boot and grub
> > will still boot from it just fine, eg:
> 
> We could probably forcibly compress the kernel (via an opt-in) in rpm-ostree
> builds.

Yeah. It's not ideal but would get us past this.

> 
> > I was able to find the old thread which seems to be captured in the linuxppc-dev archives, see
> > https://lists.ozlabs.org/pipermail/linuxppc-dev/2022-June/thread.html#244234
> 
> That thread seems to be about having the kernel build process generate a
> compressed image and this apparently bypasses the build-id or other
> processing.
> 
> But if we only care about booting from grub, then it seems at least on FCOS
> and derivatives we could just switch to compressing the kernel today?

So is it the kernel build scripts/tooling that are deficient here? Could we just add
an extra step to the RPM build that compresses the built file after the `make`
from the kernel sources finishes? i.e. we're not asking the kernel build scripts
to compress the kernel (so nothing lost as mentioned in the mailing list thread),
but doing it after the kernel build is finished but before the RPM picks up the file.
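
To make the idea concrete, here is a minimal sketch of such a post-build step. It is not how the Fedora kernel spec works today; the function name `compress_kernel_in_place` is hypothetical, and where it would be called from in the RPM build is left open. It reuses the `gzip` then `mv` sequence from comment 3 and skips files that are already gzip-compressed, so it is safe to run twice.

```shell
# Hypothetical post-build step: gzip a built vmlinux in place, keeping its
# original file name, and do nothing if it is already gzip-compressed.
compress_kernel_in_place() {
    image=$1
    # gzip -t reading from stdin succeeds only on a valid gzip stream.
    if gzip -t < "$image" 2>/dev/null; then
        return 0
    fi
    # gzip writes "$image.gz" and removes "$image"; rename it back so the
    # bootloader configuration can keep pointing at the same path.
    gzip -9 "$image" && mv "$image.gz" "$image"
}
```

Because this happens after the kernel's own build has finished, the kernel build scripts are not involved, which is the point made above.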

Comment 8 Colin Walters 2023-02-01 16:20:39 UTC
> So is it the kernel build scripts/tooling that are deficient here? 

I had to read the linked thread a few times but if I'm understanding correctly, the reason ppc64le kernel isn't compressed by default is that it would break booting via OpenFirmware directly, but we always boot from grub today (again, AFAIK).

Comment 9 Justin M. Forbes 2023-02-02 17:02:09 UTC
(In reply to Colin Walters from comment #8)
> > So is it the kernel build scripts/tooling that are deficient here? 
> 
> I had to read the linked thread a few times but if I'm understanding
> correctly, the reason ppc64le kernel isn't compressed by default is that it
> would break booting via OpenFirmware directly, but we always boot from grub
> today (again, AFAIK).

We do not always boot from grub today. While that is the majority of users, we also have to at least boot with petitboot.

There are not a whole lot of users for Fedora ppc, even fewer outside of IBM.  How many FCOS users are there? And how much of an issue is an extra 30MB really? Is this an actual problem, or just something noticed?

Comment 10 Colin Walters 2023-02-02 21:34:21 UTC
> There are not a whole lot of users for Fedora ppc, even fewer outside of IBM.  How many FCOS users are there?

Note that FCOS is the upstream of RHEL CoreOS, which is the default node for OpenShift 4 which is a Red Hat product that very definitely has users on ppc64le.  And we do share code across FCOS and RHCOS...this leads to the next:

> And how much of an issue is an extra 30MB really? Is this an actual problem, or just something noticed?

Yes, we're here because for $historical reasons we chose to have a relatively small 384MB /boot by default - across both FCOS and RHCOS.
And since the kernel configuration is also the same here in RHEL, this is causing us problems across both operating systems.

(Now, due to general growth we will potentially in theory start to run out of space on other architectures at some point, but the 30MB here actually really pushes just ppc64le over the edge *now*)

If it helps, we can re-file a corresponding RHEL bug, but I think the OP (Dusty) wanted to start upstream.

Comment 11 Dusty Mabe 2023-02-03 01:56:27 UTC
(In reply to Colin Walters from comment #10)
> > There are not a whole lot of users for Fedora ppc, even fewer outside of IBM.  How many FCOS users are there?
> 
> Note that FCOS is the upstream of RHEL CoreOS, which is the default node for
> OpenShift 4 which is a Red Hat product that very definitely has users on
> ppc64le.  And we do share code across FCOS and RHCOS...this leads to the
> next:
> 
> > And how much of an issue is an extra 30MB really? Is this an actual problem, or just something noticed?
> 
> Yes, we're here because for $historical reasons we chose to have a
> relatively small 384MB /boot by default - across both FCOS and RHCOS.
> And since the kernel configuration is also the same here in RHEL, this is
> causing us problems across both operating systems.

Right. We keep around 2 sets of kernel/initrd on the system and while a new one is getting installed
we need space (temporarily) for 3 sets of kernel/initrd. This is where the 30M difference adds up
because it's actually 3*30M.
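
A rough worked version of that arithmetic, as a sketch: it takes the per-set totals from the `ls -lh` output in the description (86M on x86_64, 113M on ppc64le) and the 384MB /boot size mentioned in comment 10, and treats them as the same unit. It ignores everything else stored in /boot and filesystem overhead, so the real headroom is smaller. The function name `boot_peak_mib` is made up for illustration.

```shell
# Peak /boot usage: number of kernel/initramfs sets times the per-set size.
# boot_peak_mib is a hypothetical helper; sizes are rounded values from ls -lh.
boot_peak_mib() {
    sets=$1
    per_set_mib=$2
    echo $(( sets * per_set_mib ))
}

BOOT_MIB=384   # default /boot size on FCOS and RHCOS, per comment 10
for arch_size in x86_64:86 ppc64le:113; do
    arch=${arch_size%%:*}
    size=${arch_size##*:}
    peak=$(boot_peak_mib 3 "$size")
    echo "$arch: peak ${peak}M, headroom $(( BOOT_MIB - peak ))M of ${BOOT_MIB}M"
done
```

With three sets present during an update this gives a peak of 258M (126M headroom) on x86_64 versus 339M (45M headroom) on ppc64le.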

You can definitely make an argument for why we should change the size of our boot partition and we've
considered that too and will possibly do that in the future. The goal of this ticket was mostly to
find out why things are the way they are today because no one seemed to know when I initially asked.

Of course, now that we know why (which I believe is so that firmwares other than GRUB can boot the
kernel because they don't have support for a compressed kernel) the next question is (or could be):
is there a path to changing it? The answer doesn't have to be "yes", but it's worth having the discussion.

Comment 12 Dusty Mabe 2023-02-03 02:04:00 UTC
(In reply to Justin M. Forbes from comment #9)
<snip>
> 
> There are not a whole lot of users for Fedora ppc, even fewer outside of IBM.
> How many FCOS users are there? And how much of an issue is an extra 30MB
> really? Is this an actual problem, or just something noticed?

Specifically on the question of FCOS users on ppc64le: the answer is none. I've been
blocking us releasing it to end users because of this problem [1]. We run a ppc64le
build server for FCOS and at times it can't auto update without intervention because
of this issue, so we have held back shipping it (i.e. via the website download page),
though we are building it and running CI on it.

[1] https://github.com/coreos/fedora-coreos-tracker/issues/987#issuecomment-1281123396

Comment 13 Justin M. Forbes 2023-02-03 19:17:41 UTC
(In reply to Dusty Mabe from comment #11)
> Of course, now that we know why (which I believe is so that firmwares other
> than GRUB can boot the
> kernel because they don't have support for a compressed kernel) the next
> question is (or could be):
> is there a path to changing it? The answer doesn't have to be "yes", but
> it's worth having the discussion.

I am sure there is a path.  I am not horribly familiar with petitboot, but I suppose the first step toward getting things changed is to verify whether petitboot can actually boot a compressed kernel.  If it can, we could probably compress it ourselves in the spec. If it can't, I believe that functionality would have to be added. I suppose another possibility is to make something in the FCOS ecosystem compress the kernel image specifically for FCOS users, and then say that FCOS does not support petitboot.

Comment 14 Dan Horák 2023-02-03 19:34:43 UTC
and because petitboot uses kexec to boot the new kernel, we can transform the problem to "does kexec support a compressed kernel", which should be easier to test

Comment 15 Jonathan Lebon 2023-02-03 19:51:45 UTC
Just a clarification: note that we do support petitboot for booting RHCOS. I'm not sure what the proportion of the customer base it represents, but we've definitely accommodated it in the past (e.g. https://github.com/coreos/coreos-assembler/pull/2005).

Comment 16 Renata Ravanelli 2023-02-07 14:14:42 UTC
My understanding is that we will always use petitboot for Power: petitboot will start and then call Grub.

I'm not familiar with OpenFirmware, but I guess that's the default firmware for Power? Maybe other models such as OPAL may have a different firmware?

In any case, if we can't boot in OpenFirmware, that sounds like a huge impact.

Klaus, 

Any chance you can help us to understand the impact for compressing the Kernel and how it would affect the firmware?

Do you suggest some server models where we can test/validate it?

Comment 17 Justin M. Forbes 2023-02-07 17:33:05 UTC
From a Fedora standpoint, we should be able to boot on any of the little-endian IBM Power systems, and I believe we have a number of Raptor PPC users.  In fact, in terms of community users who aren't working for IBM, I would guess we have more Raptor than IBM PPC for Fedora.

Comment 18 Michael Ellerman 2023-02-08 12:27:56 UTC
I'm afraid I don't really see a good solution.

There is a powerpc zImage, like the x86 bzImage, but it has several down sides.

When booting on IBM PowerVM LPARs (all that RHEL supports), there is a limit on how much memory is available during early boot. Using the zImage requires more memory during boot, because you need to have space for the zImage as well as space to decompress the zImage. If there's not enough space booting fails. So switching to the zImage is likely to break booting on some systems.

On powernv we boot with petitboot (not grub), petitboot can't boot a zImage at all.

petitboot also can't boot a gzipped vmlinux.

Getting petitboot updated so it can boot a gzipped vmlinux could be done, but AFAIK petitboot is mostly unmaintained these days.
Then there is the problem that users would need to flash a new petitboot on their system in order to boot newer kernels.
Finally I believe the Raptor systems ship with a fork of petitboot, I'm not sure how easy it would be to get changes into that version.


I don't know if it's possible, but one option would be to gzip the vmlinux by default - which works when booting with grub on pseries, and then have the install script ungzip it when installing on powernv.
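
A minimal sketch of what that install-time step could look like, under stated assumptions: the function names are hypothetical, and the platform string is assumed to be something like the `platform` line of /proc/cpuinfo, with `PowerNV` identifying powernv machines. Nothing here is existing Fedora install tooling.

```shell
# Decide what to do with a gzipped vmlinux at install time.
# $1 is a platform string (assumption: "PowerNV" on powernv systems).
kernel_install_action() {
    case "$1" in
        PowerNV*) echo gunzip ;;   # petitboot can't boot a gzipped vmlinux
        *)        echo keep ;;     # grub can load gzipped files
    esac
}

# Apply the decision to an installed image, keeping its file name.
install_kernel_image() {
    platform=$1
    image=$2
    if [ "$(kernel_install_action "$platform")" = gunzip ] \
        && gzip -t < "$image" 2>/dev/null; then
        # gzip -d needs the .gz suffix; it recreates "$image" uncompressed.
        mv "$image" "$image.gz" && gzip -d "$image.gz"
    fi
}
```

On pseries the image is left compressed for grub; on powernv it is expanded back to the plain vmlinux that petitboot expects, at the cost of the larger file on those systems only.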

Comment 19 Aoife Moloney 2023-11-23 00:48:06 UTC
This message is a reminder that Fedora Linux 37 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 37 on 2023-12-05.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '37'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 37 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 20 Aoife Moloney 2023-12-06 11:45:35 UTC
Fedora Linux 37 entered end-of-life (EOL) status on 2023-12-05.

Fedora Linux 37 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.