Bug 2109194 - dracut now builds an initramfs that will not boot the kernel
Summary: dracut now builds an initramfs that will not boot the kernel
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: dracut
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: dracut-maint-list
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-07-20 15:41 UTC by stan
Modified: 2022-08-07 17:00 UTC (History)

Fixed In Version:
Doc Type: ---
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-07 17:00:24 UTC
Type: Bug
Embargoed:


Attachments
rc4 initramfs that works, created with dracut 56-2 on 20220703, though when I downgraded to that version, it now builds failing initramfs images in current rawhide. (311.03 KB, text/plain)
2022-07-20 15:41 UTC, stan
rc7 initramfs that doesn't work, created with dracut 57-1 on 20220720 (282.70 KB, text/plain)
2022-07-20 15:42 UTC, stan
this is the initramfs that works, (311.03 KB, text/plain)
2022-07-28 05:22 UTC, stan
this is the new initramfs that doesn't work. (285.42 KB, text/plain)
2022-07-28 05:23 UTC, stan
these are the entries in the working initramfs that aren't in the failing initramfs (11.90 KB, text/plain)
2022-07-28 05:24 UTC, stan
These are the entries in the failing initramfs that aren't in the working initramfs (89 bytes, text/plain)
2022-07-28 05:25 UTC, stan

Description stan 2022-07-20 15:41:00 UTC
Created attachment 1898306 [details]
rc4 initramfs that works, created with dracut 56-2 on 20220703, though when I downgraded to that version, it now builds failing initramfs images in current rawhide.

Description of problem:
When I build a custom kernel or install the stock Fedora kernel, the initramfs that dracut generates does not boot the kernel.  With the custom kernel, boot hangs with no indication of why.  With the stock kernel, the kernel panics when it tries to use /usr/lib/systemd/libsystemd-core-251.3-1.so  This worked on July 3, 2022, when I built the last rc4 kernel of the 5.19 series.  It failed on July 17, 2022 with my rc5 build and on July 19, 2022 with my rc7 build.  It also failed with the stock Fedora rc6 kernel on July 18, 2022 and with the stock Fedora rc7 kernel on July 19, 2022.  If I regenerate the initramfs for a previously working kernel, that kernel then fails to boot with the same symptoms.


Version-Release number of selected component (if applicable):
dracut-057-1.fc37.x86_64
but also now with the 
dracut-056-2.fc37.x86_64
that used to work.

How reproducible:
every time


Steps to Reproduce:
1.  Have an up to date rawhide install
2.  Install a new kernel
3.  Reboot with the new kernel

Actual results:
Hangs with my custom kernel
or kernel panics with an error message, all of the following was on one line, wrapped here
/init error while loading shared libraries
libsystemd-core-251-3-1.fc37.so cannot open shared object file:
no such file or directory.


Expected results:
kernel boots


Additional info:
This is the first part of a diff between the unpacked rc4 initramfs that works and the unpacked rc7 initramfs that doesn't.

1c1
< Image: /boot/initramfs-5.19.0-0.rc4.20220701gita175eca0f3d7.36.20220703.fc37.x86_64.img: 48M
---
> Image: /boot/initramfs-5.19.0-0.rc7.53.20220719.fc37.x86_64.img: 32M
3c3
< Version: dracut-056-2.fc37
---
> Version: dracut-057-1.fc37
37,57c37,57
< drwxr-xr-x  12 root     root            0 Apr 19 16:17 .
< crw-r--r--   1 root     root       5,   1 Apr 19 16:17 dev/console
< crw-r--r--   1 root     root       1,  11 Apr 19 16:17 dev/kmsg
< crw-r--r--   1 root     root       1,   3 Apr 19 16:17 dev/null
< crw-r--r--   1 root     root       1,   8 Apr 19 16:17 dev/random
< crw-r--r--   1 root     root       1,   9 Apr 19 16:17 dev/urandom
< lrwxrwxrwx   1 root     root            7 Apr 19 16:17 bin -> usr/bin
< drwxr-xr-x   2 root     root            0 Apr 19 16:17 dev
< drwxr-xr-x  13 root     root            0 Apr 19 16:17 etc
< drwxr-xr-x   2 root     root            0 Apr 19 16:17 etc/authselect
< -rw-r--r--   1 root     root          703 Apr 19 16:17 etc/authselect/nsswitch.conf
< drwxr-xr-x   2 root     root            0 Apr 19 16:17 etc/cmdline.d
< drwxr-xr-x   2 root     root            0 Apr 19 16:17 etc/conf.d
< -rw-r--r--   1 root     root          124 Apr 19 16:17 etc/conf.d/systemd.conf
< drwxr-xr-x   7 root     root            0 Apr 19 16:17 etc/dbus-1
< drwxr-xr-x   2 root     root            0 Apr 19 16:17 etc/dbus-1/interfaces
< drwxr-xr-x   2 root     root            0 Apr 19 16:17 etc/dbus-1/services
< -rw-r--r--   1 root     root          838 Mar 10 02:47 etc/dbus-1/session.conf
< drwxr-xr-x   2 root     root            0 Apr 19 16:17 etc/dbus-1/session.d
< -rw-r--r--   1 root     root          833 Mar 10 02:47 etc/dbus-1/system.conf
< drwxr-xr-x   2 root     root            0 Apr 19 16:17 etc/dbus-1/system.d
---
> drwxr-xr-x  12 root     root            0 Jul 18 04:21 .
> crw-------   1 root     root       5,   1 Jul 18 04:21 dev/console
> crw-------   1 root     root       1,  11 Jul 18 04:21 dev/kmsg
> crw-------   1 root     root       1,   3 Jul 18 04:21 dev/null
> crw-------   1 root     root       1,   8 Jul 18 04:21 dev/random
> crw-------   1 root     root       1,   9 Jul 18 04:21 dev/urandom
> lrwxrwxrwx   1 root     root            7 Jul 18 04:21 bin -> usr/bin
> drwxr-xr-x   2 root     root            0 Jul 18 04:21 dev
> drwxr-xr-x  13 root     root            0 Jul 18 04:21 etc
> drwxr-xr-x   2 root     root            0 Jul 18 04:21 etc/authselect
> -rw-r--r--   1 root     root          703 Jul 18 04:21 etc/authselect/nsswitch.conf
> drwx------   2 root     root            0 Jul 18 04:21 etc/cmdline.d
> drwx------   2 root     root            0 Jul 18 04:21 etc/conf.d
> -rw-------   1 root     root          124 Jul 18 04:21 etc/conf.d/systemd.conf
> drwxr-xr-x   7 root     root            0 Jul 18 04:21 etc/dbus-1
> drwxr-xr-x   2 root     root            0 Jul 18 04:21 etc/dbus-1/interfaces
> drwxr-xr-x   2 root     root            0 Jul 18 04:21 etc/dbus-1/services
> -rw-r--r--   1 root     root          838 Jul 12 09:59 etc/dbus-1/session.conf
> drwxr-xr-x   2 root     root            0 Jul 18 04:21 etc/dbus-1/session.d
> -rw-r--r--   1 root     root          833 Jul 12 09:59 etc/dbus-1/system.conf
> drwxr-xr-x   2 root     root            0 Jul 18 04:21 etc/dbus-1/system.d

I am attaching both of those unpacked initramfs files.
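
Because every entry's timestamp differs between the two images, a plain diff flags nearly every line.  One way to narrow the comparison to permissions, sizes and paths is to strip the date columns before diffing.  The following is only a sketch: the helper name strip_dates is made up for illustration, the image names are placeholders, and it assumes listings in the `ls -l` style shown above (the header lines suggest lsinitrd output).

```shell
# strip_dates: read an `ls -l` style listing on stdin and drop the
# "Mon DD HH:MM" (or "Mon DD YYYY") timestamp, so that entries differing only
# in build time compare equal.  Hypothetical helper, not part of dracut.
strip_dates() {
    sed -E 's/ [A-Z][a-z]{2} +[0-9]{1,2} +([0-9]{2}:[0-9]{2}|[0-9]{4}) / /'
}

# Example use (image names are placeholders; lsinitrd ships with dracut):
#   lsinitrd /boot/initramfs-OLD.img | strip_dates > old.txt
#   lsinitrd /boot/initramfs-NEW.img | strip_dates > new.txt
#   diff old.txt new.txt
```

With the dates removed, the remaining differences in the excerpt above would be the mode changes (for example crw-r--r-- versus crw-------) and any entries present in only one image.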

Comment 1 stan 2022-07-20 15:42:20 UTC
Created attachment 1898307 [details]
rc7 initramfs that doesn't work, created with dracut 57-1 on 20220720

Comment 2 stan 2022-07-20 15:45:54 UTC
Note the difference in permissions between the old initramfs and the new initramfs.  Also note the large difference in size of the generated initramfs.

Comment 3 stan 2022-07-20 17:09:37 UTC
A quick visual comparison of the two unpacked initramfs listings reveals that almost the entire size discrepancy is due to libraries missing from the new initramfs created by dracut.  I suspect that one of those missing libraries is critical to a successful boot.

Comment 4 stan 2022-07-20 18:27:19 UTC
When I run the command to generate the initramfs with version 57-1, I get the following error as output.
/usr/bin/ldd: line 160: /lib/ld-linux.so.2: cannot execute binary file: Exec format error

I'm trying to add some of the libraries missing from the latest version back into the generated initramfs using the install_items+="" option in a file under /etc/dracut.conf.d/, without success.  The man page doesn't make clear what format those file names should take.  Do they include the path?  Is the path relative to /boot, to /, or to something else?  Are the names generic, or do they include the version suffix?  It would be great to have an example there.
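
A fragment of roughly the following shape may answer the format questions above.  It is a sketch, assuming entries are absolute paths as they exist on the running system, given as the exact on-disk file name (version suffix included) and separated by spaces; an absolute path of this kind is what eventually worked in Comment 13.  The file name and the libblkid path are illustrative guesses, not taken from this report.

```shell
# /etc/dracut.conf.d/99-extra-libs.conf   (hypothetical file name)
# Entries are space separated.  The leading and trailing spaces inside the
# quotes keep the value from running into entries appended by other files.
# The first path is the file listed by `ls` in Comment 12; the second is an
# assumed location and should be checked on the target system.
install_items+=" /usr/lib/systemd/libsystemd-core-251.3-2.fc37.so /usr/lib64/libblkid.so.1 "
```

After changing the file, the initramfs has to be regenerated for the change to take effect.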

Comment 5 stan 2022-07-26 21:52:22 UTC
The mass rebuild for rawhide has finished, and I updated successfully.  I had hoped that would fix this issue, but it did not: the generated initramfs is still too small, and the system still hangs during boot when it tries to find the root partition.  I also downgraded both dracut and systemd to the versions that were present when the last working initramfs was built, and the initramfs they produced did not boot either.  I think that means this issue isn't caused by dracut or systemd themselves, but by a configuration change somewhere else that they are responding to.  I don't know where to start looking for such a change.  Could it be a change in the kernel?  Or a boot option being removed?

Comment 6 stan 2022-07-28 05:21:27 UTC
I rebuilt the last working 5.19 kernel, rc4, with exactly the same configuration as before, but with rawhide updated to the latest packages.  It failed to boot.  So I compared the working initramfs with the failing one to find what differs between them.  In short, libraries that are in the working initramfs are missing from the failing initramfs.

I will be attaching the files of the complete unpacked initramfs, and of the unique items in each initramfs.
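
The "unique items" lists can be produced along these lines.  This is a sketch, not necessarily how the attachments were generated: the function names are invented, and the input is assumed to be two saved `ls -l` style listings of the initramfs contents.

```shell
# listing_paths FILE: print the sorted, unique path column of an `ls -l`
# style listing, ignoring modes, sizes, dates and symlink targets.
listing_paths() {
    sed -E -n 's/^[-dlcbps][-rwxsStT]{9}.* [A-Z][a-z]{2} +[0-9]{1,2} +([0-9]{2}:[0-9]{2}|[0-9]{4}) //p' "$1" |
        sed 's/ -> .*$//' |
        LC_ALL=C sort -u
}

# only_in_first A B: print paths that appear in listing A but not in B.
only_in_first() {
    tmp_a=$(mktemp)
    tmp_b=$(mktemp)
    listing_paths "$1" > "$tmp_a"
    listing_paths "$2" > "$tmp_b"
    LC_ALL=C comm -23 "$tmp_a" "$tmp_b"
    rm -f "$tmp_a" "$tmp_b"
}

# Example use (file names are placeholders):
#   only_in_first working.txt failing.txt   # entries missing from the new image
#   only_in_first failing.txt working.txt   # entries new in the failing image
```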

Comment 7 stan 2022-07-28 05:22:56 UTC
Created attachment 1899849 [details]
this is the initramfs that works,

Comment 8 stan 2022-07-28 05:23:42 UTC
Created attachment 1899850 [details]
this is the new initramfs that doesn't work.

Comment 9 stan 2022-07-28 05:24:36 UTC
Created attachment 1899851 [details]
these are the entries in the working initramfs that aren't in the failing initramfs

Comment 10 stan 2022-07-28 05:25:20 UTC
Created attachment 1899852 [details]
These are the entries in the failing initramfs that aren't in the working initramfs

Comment 11 stan 2022-07-28 05:29:34 UTC
Why does this matter?  Well, one of the executables in the initramfs is findmnt.  Its dependencies are shown below.  How can that command work without libblkid and libuuid, both missing from the failing initramfs?
# dnf deplist /usr/bin/findmnt
Last metadata expiration check: 2:37:37 ago on Wed 27 Jul 2022 07:49:13 PM MST.
package: util-linux-core-2.38-5.fc37.x86_64
  dependency: glibc >= 2.35.9000-29
   provider: glibc-2.35.9000-31.fc37.i686
   provider: glibc-2.35.9000-31.fc37.x86_64
  dependency: ld-linux-x86-64.so.2()(64bit)
   provider: glibc-2.35.9000-31.fc37.x86_64
  dependency: ld-linux-x86-64.so.2(GLIBC_2.3)(64bit)
   provider: glibc-2.35.9000-31.fc37.x86_64
  dependency: libblkid = 2.38-5.fc37
   provider: libblkid-2.38-5.fc37.i686
   provider: libblkid-2.38-5.fc37.x86_64
  dependency: libblkid.so.1()(64bit)
   provider: libblkid-2.38-5.fc37.x86_64
  dependency: libblkid.so.1(BLKID_1.0)(64bit)
   provider: libblkid-2.38-5.fc37.x86_64
  dependency: libblkid.so.1(BLKID_2.15)(64bit)
   provider: libblkid-2.38-5.fc37.x86_64
  dependency: libblkid.so.1(BLKID_2.17)(64bit)
   provider: libblkid-2.38-5.fc37.x86_64
  dependency: libblkid.so.1(BLKID_2.18)(64bit)
   provider: libblkid-2.38-5.fc37.x86_64
  dependency: libblkid.so.1(BLKID_2.20)(64bit)
   provider: libblkid-2.38-5.fc37.x86_64
  dependency: libblkid.so.1(BLKID_2.21)(64bit)
   provider: libblkid-2.38-5.fc37.x86_64
  dependency: libblkid.so.1(BLKID_2.25)(64bit)
   provider: libblkid-2.38-5.fc37.x86_64
  dependency: libblkid.so.1(BLKID_2.30)(64bit)
   provider: libblkid-2.38-5.fc37.x86_64
  dependency: libblkid.so.1(BLKID_2_37)(64bit)
   provider: libblkid-2.38-5.fc37.x86_64
  dependency: libc.so.6(GLIBC_2.36)(64bit)
   provider: glibc-2.35.9000-31.fc37.x86_64
  dependency: libmount = 2.38-5.fc37
   provider: libmount-2.38-5.fc37.i686
   provider: libmount-2.38-5.fc37.x86_64
  dependency: libmount.so.1()(64bit)
   provider: libmount-2.38-5.fc37.x86_64
  dependency: libmount.so.1(MOUNT_2.19)(64bit)
   provider: libmount-2.38-5.fc37.x86_64
  dependency: libmount.so.1(MOUNT_2.20)(64bit)
   provider: libmount-2.38-5.fc37.x86_64
  dependency: libmount.so.1(MOUNT_2.21)(64bit)
   provider: libmount-2.38-5.fc37.x86_64
  dependency: libmount.so.1(MOUNT_2.22)(64bit)
   provider: libmount-2.38-5.fc37.x86_64
  dependency: libmount.so.1(MOUNT_2.23)(64bit)
   provider: libmount-2.38-5.fc37.x86_64
  dependency: libmount.so.1(MOUNT_2.24)(64bit)
   provider: libmount-2.38-5.fc37.x86_64
  dependency: libmount.so.1(MOUNT_2.25)(64bit)
   provider: libmount-2.38-5.fc37.x86_64
  dependency: libmount.so.1(MOUNT_2.30)(64bit)
   provider: libmount-2.38-5.fc37.x86_64
  dependency: libmount.so.1(MOUNT_2.33)(64bit)
   provider: libmount-2.38-5.fc37.x86_64
  dependency: libmount.so.1(MOUNT_2.34)(64bit)
   provider: libmount-2.38-5.fc37.x86_64
  dependency: libmount.so.1(MOUNT_2_35)(64bit)
   provider: libmount-2.38-5.fc37.x86_64
  dependency: libmount.so.1(MOUNT_2_37)(64bit)
   provider: libmount-2.38-5.fc37.x86_64
  dependency: libmount.so.1(MOUNT_2_38)(64bit)
   provider: libmount-2.38-5.fc37.x86_64
  dependency: libselinux.so.1()(64bit)
   provider: libselinux-3.4-5.fc37.x86_64
  dependency: libselinux.so.1(LIBSELINUX_1.0)(64bit)
   provider: libselinux-3.4-5.fc37.x86_64
  dependency: libsmartcols = 2.38-5.fc37
   provider: libsmartcols-2.38-5.fc37.i686
   provider: libsmartcols-2.38-5.fc37.x86_64
  dependency: libsmartcols.so.1()(64bit)
   provider: libsmartcols-2.38-5.fc37.x86_64
  dependency: libsmartcols.so.1(SMARTCOLS_2.25)(64bit)
   provider: libsmartcols-2.38-5.fc37.x86_64
  dependency: libsmartcols.so.1(SMARTCOLS_2.27)(64bit)
   provider: libsmartcols-2.38-5.fc37.x86_64
  dependency: libsmartcols.so.1(SMARTCOLS_2.28)(64bit)
   provider: libsmartcols-2.38-5.fc37.x86_64
  dependency: libsmartcols.so.1(SMARTCOLS_2.29)(64bit)
   provider: libsmartcols-2.38-5.fc37.x86_64
  dependency: libsmartcols.so.1(SMARTCOLS_2.33)(64bit)
   provider: libsmartcols-2.38-5.fc37.x86_64
  dependency: libsmartcols.so.1(SMARTCOLS_2.38)(64bit)
   provider: libsmartcols-2.38-5.fc37.x86_64
  dependency: libsystemd.so.0()(64bit)
   provider: systemd-libs-251.3-2.fc37.x86_64
  dependency: libsystemd.so.0(LIBSYSTEMD_209)(64bit)
   provider: systemd-libs-251.3-2.fc37.x86_64
  dependency: libtinfo.so.6()(64bit)
   provider: ncurses-libs-6.3-3.20220501.fc37.x86_64
  dependency: libudev.so.1()(64bit)
   provider: systemd-libs-251.3-2.fc37.x86_64
  dependency: libudev.so.1(LIBUDEV_183)(64bit)
   provider: systemd-libs-251.3-2.fc37.x86_64
  dependency: libuuid = 2.38-5.fc37
   provider: libuuid-2.38-5.fc37.i686
   provider: libuuid-2.38-5.fc37.x86_64
  dependency: libuuid.so.1()(64bit)
   provider: libuuid-2.38-5.fc37.x86_64
  dependency: libuuid.so.1(UUID_1.0)(64bit)
   provider: libuuid-2.38-5.fc37.x86_64
  dependency: rtld(GNU_HASH)
   provider: glibc-2.35.9000-31.fc37.i686
   provider: glibc-2.35.9000-31.fc37.x86_64
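
The dnf output above lists package-level dependencies.  A more direct check is to take the shared libraries that ldd reports for a binary and look for each one in the initramfs listing.  The sketch below assumes ldd output in the usual "name => path (address)" form and a saved listing of the initramfs contents; the function name is invented for illustration.

```shell
# missing_from_listing LISTING: read ldd output on stdin and print every
# shared library name that does not appear in the listing file.
# The match is a fixed-string substring match on "/<soname>", so
# libfoo.so.1 also matches libfoo.so.1.2.3; that is loose, but adequate
# for a quick check.
missing_from_listing() {
    awk '$2 == "=>" { print $1 }' |
        while read -r soname; do
            if ! grep -q -F -- "/$soname" "$1"; then
                echo "$soname"
            fi
        done
}

# Example use (listing file name is a placeholder):
#   ldd /usr/bin/findmnt | missing_from_listing failing.txt
```

If the observation in this comment holds, running it against the failing listing would be expected to report libblkid and libuuid among the missing names.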

Comment 12 stan 2022-08-02 20:08:19 UTC
Update.

The latest stock Fedora kernel for rawhide, 5.19.0-65.fc37.x86_64, exhibits the same problem as I reported above, except that the library that can't be found is now libsystemd-core-251.3-2.fc37.so.
$ ls -nZ /usr/lib/systemd/libsystemd-core-251.3-2.fc37.so
-rwxr-xr-x. 1 0 0 system_u:object_r:lib_t:s0 2126024 Jul 23 03:09 /usr/lib/systemd/libsystemd-core-251.3-2.fc37.so
The system then panics and hangs.

I've been trying to get the manually built initramfs images to work using different configurations, or at least to produce some meaningful information that points to the problem.  I even went so far as to build an earlier dracut, version 55 from f36, to see if it would make a difference to the initramfs being built.  It did not.

My latest effort is to put the following line,
rd.cmdline="root=UUID=e8ff5cac-7c2f-422c-96bd-cbf1824ee177 rootfstype=ext4 ro LANG=en_US.UTF-8 rd.dm=0 rd.md=0 rd.lvm=0 rd.luks=0 rd.peerdns=0 rd.shell rd.info rd.debug"
in a file in /etc/dracut.conf.d/ called 99_add_missing_libraries.conf.  I am also building the initramfs images with the --no-compress option, in case the problem is an inability to decompress the file.  The symptom is still the same: it boots, does a single read of the disk (the light flashes), and then hangs.  I thought this was supposed to drop to a shell and print debug information?  Does the lack of a shell and debug info mean that dracut isn't even getting invoked?
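
As far as I can tell from dracut.conf(5), the configuration-file option for default kernel command line parameters is kernel_cmdline rather than rd.cmdline.  The fragment below is a sketch under that assumption, reusing the parameters quoted above; the file name is hypothetical.

```shell
# /etc/dracut.conf.d/99-debug-cmdline.conf   (hypothetical file name)
# Assumption: kernel_cmdline is the dracut.conf(5) option that supplies
# default kernel command line parameters to the generated initramfs.
kernel_cmdline="root=UUID=e8ff5cac-7c2f-422c-96bd-cbf1824ee177 rootfstype=ext4 ro LANG=en_US.UTF-8 rd.dm=0 rd.md=0 rd.lvm=0 rd.luks=0 rd.peerdns=0 rd.shell rd.info rd.debug"
```

The same parameters can also be appended to the kernel command line in the boot loader for a single boot, which avoids regenerating the image while experimenting.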

Anyway, I've been trying to create a kernel that doesn't require an initramfs to boot, but the Fedora kernels have the initramfs hard coded.  I moved the initramfs out of the way, thinking the kernel would boot directly if there was no initramfs, but it turns out that the kernel builds in a rudimentary backup initramfs, which is invoked instead.

Meanwhile the kernel I built and installed on 20220703 continues to boot and run flawlessly, thank goodness.

Comment 13 stan 2022-08-07 17:00:24 UTC
This has been resolved.  When I added all the missing libraries back into the initramfs using install_optional_items+="  ", e.g. install_optional_items+=" /usr/lib/systemd/libsystemd-core-251.3-2 ", the stock Fedora kernel booted successfully.  My locally built custom kernel, tuned to my hardware, still does not boot, but I think that will just take some tweaking to get working.

I did not find out why dracut is not installing those libraries in the initramfs on my system.  When I posted to fedora-test about this, Adam Williamson said that his rawhide system was updating successfully with no boot problems, so this is probably a corner case of some sort.
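
For anyone hitting the same corner case, the list of missing entries can be turned into such a configuration line mechanically.  This is a sketch only: the function name is invented, and the input is assumed to be one initramfs-relative file path per line (directories already filtered out), as in the list of entries unique to the working image.

```shell
# make_dracut_conf: read initramfs-relative paths on stdin (for example
# usr/lib64/libblkid.so.1) and print a single install_optional_items line
# with each path made absolute.  Hypothetical helper, not part of dracut.
make_dracut_conf() {
    printf 'install_optional_items+=" '
    while IFS= read -r path; do
        [ -n "$path" ] || continue      # skip blank lines
        printf '/%s ' "${path#/}"       # ensure exactly one leading slash
    done
    printf '"\n'
}

# Example use (file names are placeholders):
#   make_dracut_conf < missing-paths.txt > /etc/dracut.conf.d/99-missing-libs.conf
```

This only works around the symptom; it does not explain why dracut stopped picking the libraries up on its own.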

