Bug 1473419 - dracut generates initramfs with error messages when adding kernel modules: race with modules unload
Status: NEW
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: dracut
Version: 7.3
Hardware: All
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Lukáš Nykrýn
QA Contact: Release Test Team
Depends On:
Blocks:
Reported: 2017-07-20 14:43 EDT by Konstantin Khorenko
Modified: 2017-07-24 07:28 EDT (History)
Type: Bug
Description Konstantin Khorenko 2017-07-20 14:43:13 EDT
Description of problem:

RHEL7.3 latest

Creating an initramfs after kernel creation sometimes triggers a dracut error:
===
Jul 19 16:25:05 bgr-st003 bash: /sbin/dracut: line 1135: host_modules["$m"]: bad array subscript
===

This is easily reproducible with the script:
# cat read-modules.sh
#!/bin/bash

declare -A host_modules

while read m rest; do
        host_modules["$m"]=1
done </proc/modules

============
[shell 1] # while : ; do modprobe loop; rmmod loop; done

[shell 2] # while : ; do ./read-modules.sh; done
./read-modules.orig.sh: line 7: host_modules["$m"]: bad array subscript
./read-modules.orig.sh: line 7: host_modules["$m"]: bad array subscript
./read-modules.orig.sh: line 7: host_modules["$m"]: bad array subscript
...


I've found an existing bug for this, along with a proposed patch:
https://bugzilla.redhat.com/show_bug.cgi?id=1405025

--- base/dracut	2016-12-15 20:06:09.000000000 +0800
+++ new/dracut	2016-12-15 20:06:56.000000000 +0800
@@ -1105,6 +1105,7 @@
     # check /proc/modules
     declare -A host_modules
     while read m rest; do
+        [ -z "$m" ] && continue
         host_modules["$m"]=1
     done </proc/modules
 fi


The location is definitely correct, but the patch does not truly fix the problem.
Even with this patch, a race with module load/unload can easily lead to some modules
being mistakenly skipped by dracut, which can later leave the node unable to boot.

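Why can modules be skipped? Because of how bash reads data from the file: as the
trace shows, it reads 128 bytes at a time and then seeks back to the end of the line
it just consumed, so every line is fetched by a separate read() at a fresh offset.
If the module list changes between two reads, that offset can land in the middle of
a line or on a bare newline: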

# strace -f -s 1024 -o strace.log /bin/bash -c  "while : ; do ./read-modules.sh ; done "


165340 read(0, "mbcache 15006 1 ext4, Live 0xffffffffa01d6000\njbd2 102945 1 ext4, Live 0xffffffffa01db000\nsd_mod 46322 4 - Live 0xffffffffa01c90", 128) = 128
165340 lseek(0, -82, SEEK_CUR)          = 4396
165340 ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7ffc0e4a17b0) = -1 ENOTTY (Inappropriate ioctl for device)
165340 lseek(0, 0, SEEK_CUR)            = 4396
165340 read(0, "mbcache 15006 1 ext4, Live 0xffffffffa01d6000\njbd2 102945 1 ext4, Live 0xffffffffa01db000\nsd_mod 46322 4 - Live 0xffffffffa01c90", 128) = 128
165340 lseek(0, -82, SEEK_CUR)          = 4442
165340 ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7ffc0e4a17b0) = -1 ENOTTY (Inappropriate ioctl for device)
165340 lseek(0, 0, SEEK_CUR)            = 4442
165340 read(0, "_mod 46322 4 - Live 0xffffffffa01c9000\ncrc_t10dif 12714 1 sd_mod, Live 0xffffffffa0083000\ncrct10dif_generic 12647 1 - Live 0xfff", 128) = 128
165340 lseek(0, -89, SEEK_CUR)          = 4481
165340 ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7ffc0e4a17b0) = -1 ENOTTY (Inappropriate ioctl for device)
165340 lseek(0, 0, SEEK_CUR)            = 4481
165340 read(0, "\nsd_mod 46322 4 - Live 0xffffffffa01c9000\ncrc_t10dif 12714 1 sd_mod, Live 0xffffffffa0083000\ncrct10dif_generic 12647 1 - Live 0x", 128) = 128
165340 lseek(0, -127, SEEK_CUR)         = 4482
...
165340 write(2, "./read-modules.sh: line 7: host_modules[\"$m\"]: bad array subscript\n", 72) = 72
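
The last read() in the trace begins with a bare newline, so the next "read m rest" sees an empty line and leaves m empty. The sketch below reproduces only that parsing step on fixed input, without touching /proc/modules. count_empty_names is a hypothetical helper written for this illustration and is not part of dracut.

```bash
#!/bin/bash
# Hypothetical helper: counts input lines whose first field is empty,
# i.e. the lines for which host_modules["$m"]=1 would be given an empty key.
count_empty_names() {
    local m rest n=0
    while read -r m rest; do
        if [[ -z $m ]]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Input modelled on the last read() in the trace, which starts with "\n".
printf '\nsd_mod 46322 4 - Live 0xffffffffa01c9000\n' | count_empty_names
```

A read that lands in the middle of a line is not caught by an empty-name check at all: the fragment is simply stored under a wrong module name.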

Please read the whole /proc/modules file at once to avoid these races.

Thank you.

--
Virtuozzo Kernel Team
Comment 2 Konstantin Khorenko 2017-07-20 14:49:02 EDT
A proposed patch from Denis Silakov:

diff -Naur dracut-033.orig/dracut.sh dracut-033/dracut.sh
--- dracut-033.orig/dracut.sh	2017-07-20 19:38:28.107109060 +0300
+++ dracut-033/dracut.sh	2017-07-20 19:38:56.232107914 +0300
@@ -1131,9 +1131,13 @@
 
     # check /proc/modules
     declare -A host_modules
+    cp /proc/modules /tmp/mod$$
+
     while read m rest; do
         host_modules["$m"]=1
-    done </proc/modules
+    done </tmp/mod$$
+
+    rm -f /tmp/mod$$
 fi
 
 unset m
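
A possible variant of the same idea that avoids the temporary file in /tmp: take the snapshot into a shell variable and parse it from there. This is only a sketch under stated assumptions, not the dracut implementation. load_host_modules and its optional path argument are hypothetical names introduced here. Whether a single sequential read of /proc/modules is itself fully consistent depends on the kernel and is not addressed by this sketch; it only removes the per-line seeking done by the shell.

```bash
#!/bin/bash
# Hypothetical helper (not part of dracut): fills the global associative
# array host_modules from one snapshot of a /proc/modules-style file.
declare -A host_modules

load_host_modules() {
    # Assumption: the optional argument exists only to make the helper
    # testable; in dracut the source would always be /proc/modules.
    local src=${1:-/proc/modules}
    local snapshot m rest

    # Read the file sequentially into memory first, so the per-line
    # parsing below never seeks within the changing file.
    snapshot=$(cat "$src") || return 1

    while read -r m rest; do
        # Skip blank lines so an empty name is never used as a key.
        [[ -n $m ]] || continue
        host_modules["$m"]=1
    done <<< "$snapshot"
}
```

Calling load_host_modules with no argument would read /proc/modules; afterwards "${!host_modules[@]}" expands to the module names that were seen.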
