Bug 1405025

Summary: dracut generates initramfs with error messages when adding kernel modules
Product: Red Hat Enterprise Linux 7 Reporter: youpaooo <youpaooo>
Component: dracutAssignee: Lukáš Nykrýn <lnykryn>
Status: CLOSED ERRATA QA Contact: Release Test Team <release-test-team-automation>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 7.1CC: 386114578, chorn, dracut-maint-list, dvacek, harald, jstodola, khorenko
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: dracut-033-495.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-01 21:06:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Implementation of the approach in #16, fails quite fast in my KVM guest here none

Description youpaooo 2016-12-15 12:18:33 UTC
Description of problem:

I'm trying to create a initramfs in a chroot environment which using mount --bind proc, dev, sys on the host. Using command "mkinitrd initramfs-3.10.0-327.28.56.6.x86_64.img 3.10.0-327.28.56.6.x86_64 --force" to generate with error messages when adding kernel modules, shown that:

/sbin/dracut: line 1108: host_modules["$m"]: bad array subscript
dracut module 'fcoe' depends on 'network', which can't be installed
dracut module 'fcoe' depends on 'network', which can't be installed
cat: write error: Broken pipe

Copy the new initramfs to boot partition for booting, then the system can not start, because the dm-mod module is not integrated into the new initramfs. Extract the initramfs, found that some drivers are missing.

Version-Release number of selected component (if applicable):
dracut-033-360

Steps to Reproduce:
Reproducing conditions not found


Additional info:

According to the error message, modify the code for reliability.

--- base/dracut	2016-12-15 20:06:09.000000000 +0800
+++ new/dracut	2016-12-15 20:06:56.000000000 +0800
@@ -1105,6 +1105,7 @@
     # check /proc/modules
     declare -A host_modules
     while read m rest; do
+        [ -z "$m" ] && continue
         host_modules["$m"]=1
     done </proc/modules
 fi

Comment 6 Konstantin Khorenko 2017-07-20 16:07:46 UTC
This is easily reproduced with scripts run in parallel:

1) # while : ; do modprobe loop; rmmod loop; done
./read-modules.orig.sh: line 7: host_modules["$m"]: bad array subscript
./read-modules.orig.sh: line 7: host_modules["$m"]: bad array subscript
./read-modules.orig.sh: line 7: host_modules["$m"]: bad array subscript


2) # cat read-modules.orig.sh
#!/bin/bash

declare -A host_modules

while read m rest; do
        host_modules["$m"]=1
done </proc/modules

Comment 7 errata-xmlrpc 2017-08-01 21:06:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2076

Comment 8 386114578 2017-10-09 10:23:55 UTC
Does the solution make any sense?

At my point of view, when we insert/remove a kernel module during reading /proc/modules line by line, the read results are messed up.SKIPPING EMPTY LINES caused by this doesnot seem like a good idea.

Any one agree with me??

thks in advance

Comment 9 386114578 2017-10-14 09:21:34 UTC
From 36515156e69cb29beb0ca85c750e791c284d0568 Mon Sep 17 00:00:00 2001
From: Fedora dracut team <dracut-maint>
Date: Sat, 14 Oct 2017 17:11:00 +0800
Subject: [PATCH] dracut.sh:fix kernel modules missing in initramfs updating kernel

---
 dracut.sh | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/dracut.sh b/dracut.sh
index f4e8509..88b2489 100755
--- a/dracut.sh
+++ b/dracut.sh
@@ -1135,11 +1135,15 @@ if [[ $hostonly ]]; then
     done </proc/crypto

     # check /proc/modules
+    # make a copy to read from
+    _tmp_mod_copy="$(mktemp -p "$TMPDIR/" -t host_modules.XXXXXX)"
+    cat /proc/modules >$_tmp_mod_copy
     declare -A host_modules
     while read m rest; do
         [ -z "$m" ] && continue
         host_modules["$m"]=1
-    done </proc/modules
+    done <$_tmp_mod_copy
+    [[ $keep ]] || rm -f $_tmp_mod_copy
 fi

 unset m
--
1.8.3.1

Comment 10 Harald Hoyer 2017-10-18 12:52:51 UTC
Whats the difference? the "cat" is only making the time window smaller.

Comment 11 Konstantin Khorenko 2017-10-19 08:27:20 UTC
i highly support 386114578, the window becomes MUCH smaller, for example my reproducer does not trigger anymore after similar patch while checking for zero lines brought us corrupted initrd images in _real life_.

/proc/modules prints data with module_mutex taken, so race happens only in case "cat" reads several times. Unfortunately /proc/modules is big enough and yes, "cat" has to read 2-3 times. Comparing to 100+ times when you read line by line.

If you want even more safe method to get a consistent /proc/modules, you can read twice and compare results: if both files are unique - it is consistent, otherwise, read /proc/modules again until you get 2 reads which produce same result.

Comment 12 Harald Hoyer 2017-10-19 09:22:38 UTC
any objections against this one?

diff --git a/dracut.sh b/dracut.sh
index f4e85095..3e31e15b 100755
--- a/dracut.sh
+++ b/dracut.sh
@@ -1136,10 +1136,11 @@ if [[ $hostonly ]]; then
 
     # check /proc/modules
     declare -A host_modules
-    while read m rest; do
+    ls -1 /sys/module > "$DRACUT_TMPDIR/host_modules"
+    while read m _; do
         [ -z "$m" ] && continue
         host_modules["$m"]=1
-    done </proc/modules
+    done < "$DRACUT_TMPDIR/host_modules"
 fi
 
 unset m

Comment 13 Konstantin Khorenko 2017-10-19 11:47:15 UTC
This is a good way!

1. i believe "# check /proc/modules" comment should be changed as well

2. in my case /sys/module contains more lines than in /proc/modules
[root@localhost ~]# cat /proc/modules | wc
    127     762    6728
[root@localhost ~]# ls -1 /sys/module | wc
    193     193    1844

/sys/module contains some entries which are not modules like "battery",
they are just compiled into the kernel AFAIK.

If dracut handles correctly module names which wont be found later - i'm fine with the patch.

Comment 14 Harald Hoyer 2017-10-20 15:18:55 UTC
/sys/module also contains compiled-in modules, so it's even better, because the new kernel could have changed from builtin to real modules.

Comment 15 Christian Horn 2018-05-15 04:57:23 UTC
bz1578222 has been opened, overlapping partly with this bz, also looking at the behaviour of #12 .

Comment 16 Daniel Vacek 2018-05-29 18:26:19 UTC
(In reply to Harald Hoyer from comment #14)
> /sys/module also contains compiled-in modules, so it's even better, because
> the new kernel could have changed from builtin to real modules.

How is that relevant here??? Anyways, it's not needed really. 386114578 is right. Moreover, the window with using cat is not just much smaller. There is no window at all as the content of /proc/modules file fits just fine in one go (cat uses 64k buffer which should be enough for everyone). If the size of the output was even bigger, we can still use 'dd bs=128k if=/proc/modules' (or bs=2M or whatever we consider safe). I came up with following solution before reading this BZ:

~~~~
--- /bin/dracut	2017-06-28 16:55:08.000000000 +0200
+++ /bin/dracut.fixed	2018-05-29 17:54:26.557506759 +0200
@@ -1137,9 +1137,8 @@
     # check /proc/modules
     declare -A host_modules
     while read m rest; do
-        [ -z "$m" ] && continue
         host_modules["$m"]=1
-    done </proc/modules
+    done < <(cat /proc/modules)
 fi
 
 unset m
~~~~

The trick here is to read the file from kernel in one go instead of reading it line by line while it changes.

The pipe kinda emulates a temporal copy, which *is* then read line by line by shell, but it does not change anymore as the copy is static, right?

This should be safe and make things stable. No need to use 'ls -1 /sys/module/' which actually lists also additional modules that are compiled in to kernel.

Comment 17 Konstantin Khorenko 2018-05-30 07:31:03 UTC
(In reply to Daniel Vacek from comment #16)
> There is no window at all as the content of /proc/modules file
> fits just fine in one go (cat uses 64k buffer which should be enough for
> everyone).

Kernel provides data per-page for /proc files, so cat reads several times:

strace -f -o cat_strace.log cat /proc/modules

2981  open("/proc/modules", O_RDONLY)   = 3
2981  fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
2981  fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
2981  read(3, "binfmt_misc 17468 1 - Live 0xfff"..., 65536) = 4050
2981  write(1, "binfmt_misc 17468 1 - Live 0xfff"..., 4050) = 4050
2981  read(3, "syscopyarea 12529 1 drm_kms_help"..., 65536) = 1250
2981  write(1, "syscopyarea 12529 1 drm_kms_help"..., 1250) = 1250
2981  read(3, "", 65536)                = 0

Comment 18 Konstantin Khorenko 2018-05-30 07:36:18 UTC
(In reply to Daniel Vacek from comment #16)
> This should be safe and make things stable. No need to use 'ls -1
> /sys/module/' which actually lists also additional modules that are compiled
> in to kernel.

Although usually it does not matter, handling modules which are compiled to the kernel during initramfs creation for _another_ kernel _may be_ useful.

Imagine you have a kernel booted with all modules compiled in - including disk driver $DRIVER you need to boot the system.
And now you install a kernel with $DRIVER compiled as a module.

You assume $DRIVER is put to initramfs image for new kernel, but... if you use /proc/modules, you don't have $DRIVER listed there and thus it won't be put into initramfs for new kernel.
And if you read /sys/module, $DRIVER will be found and put into initramfs.

Though it's rare case i agree.

Comment 19 Christian Horn 2018-05-30 07:50:06 UTC
Created attachment 1445708 [details]
Implementation of the approach in #16, fails quite fast in my KVM guest here

Looked at an implementation of "done < <(cat /proc/modules)", it fails much faster than an implementation of "ls -1 /sys/module".

Just for completeness: also an implementation with "ls -1 /sys/module" has a slight chance to fail, as appropriate fixes this seems to work: 
upstream https://lkml.org/lkml/2018/5/21/148 / RHEL bz1581565

Comment 20 Daniel Vacek 2018-05-30 09:21:48 UTC
(In reply to Konstantin Khorenko from comment #17)
> (In reply to Daniel Vacek from comment #16)
> > There is no window at all as the content of /proc/modules file
> > fits just fine in one go (cat uses 64k buffer which should be enough for
> > everyone).
> 
> Kernel provides data per-page for /proc files, so cat reads several times:
> 
> strace -f -o cat_strace.log cat /proc/modules
> 
> 2981  open("/proc/modules", O_RDONLY)   = 3
> 2981  fstat(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
> 2981  fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
> 2981  read(3, "binfmt_misc 17468 1 - Live 0xfff"..., 65536) = 4050
> 2981  write(1, "binfmt_misc 17468 1 - Live 0xfff"..., 4050) = 4050
> 2981  read(3, "syscopyarea 12529 1 drm_kms_help"..., 65536) = 1250
> 2981  write(1, "syscopyarea 12529 1 drm_kms_help"..., 1250) = 1250
> 2981  read(3, "", 65536)                = 0

Oh. That's unfortunate. When I was testing I only saw one 64k read. But the content fitted into the first page on my test VM, so I didn't notice kernel will split it to fit a page. So I'm taking it back, using /proc/modules is fundamentally flawed for this purpose (unless you read it multiple times until you get two matching outputs) and /sys/module sounds a safer approach.

(In reply to Konstantin Khorenko from comment #18)
> (In reply to Daniel Vacek from comment #16)
> > This should be safe and make things stable. No need to use 'ls -1
> > /sys/module/' which actually lists also additional modules that are compiled
> > in to kernel.
> 
> Although usually it does not matter, handling modules which are compiled to
> the kernel during initramfs creation for _another_ kernel _may be_ useful.
> 
> Imagine you have a kernel booted with all modules compiled in - including
> disk driver $DRIVER you need to boot the system.
> And now you install a kernel with $DRIVER compiled as a module.
> 
> You assume $DRIVER is put to initramfs image for new kernel, but... if you
> use /proc/modules, you don't have $DRIVER listed there and thus it won't be
> put into initramfs for new kernel.
> And if you read /sys/module, $DRIVER will be found and put into initramfs.
> 
> Though it's rare case i agree.

You're again right, indeed. I was not considering new kernel installation as we were discussing the kdump initramfs case. That one is usually built for running kernel, so it always matches. My bad, sorry.