Bug 1911038

Summary: kexec-tools causes 'Broken pipe' error msgs when rpm executes kernel-core posttrans to make dracut generate initramfs
Product: [Fedora] Fedora Reporter: Thorsten Leemhuis <fedora>
Component: kexec-toolsAssignee: Coiby <coxu>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 33CC: agurenko, antoine, bhe, c.fierst, cristian.ciupitu, dominik, dracut-maint-list, emelenas, gabriello.ramirez, glyffa, gustavo, horsley1953, j8tjuh, jjb, joe, jonathan, kasong, lijiang, marco.guazzone, mihai, mteixeira, napping_lunar, pwhalen, robatino, ruyang, rwright, samuel-rhbugs, setha45, sjoerd, stealthcipher, stefan.hoelldampf, stefano.biagiotti, sudo, troy, voj-tech, zbyszek
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: kexec-tools-2.0.21-4.fc33 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-01-24 01:26:09 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Commenter 4 attachment none

Description Thorsten Leemhuis 2020-12-27 09:21:53 UTC
Description of problem:
I'm seeing 'Broken pipe' messages when installing `kernel-core` packages. 

Version-Release number of selected component (if applicable):
dracut-050-64.git20200529.fc33.x86_64

How reproducible:
Always – and I heard from other people they noticed the same problem, but I have no idea what triggers it. Google found a report from rawhide, too:
http://fedora.12.x6.nabble.com/some-Rawhide-feedback-td5101465.html

Steps to Reproduce:
```
$ LC_ALL=C sudo rpm -ivh kernel-core-5.9.16-200.fc33.x86_64.rpm 
Verifying...                          ########################################
Preparing...                          ########################################
Updating / installing...
kernel-core-5.9.16-200.fc33           ########################################
sort: fflush failed: 'standard output': Broken pipe
sort: write error

gzip: stdout: Broken pipe

gzip: stdout: Broken pipe
sort: write failed: 'standard output': Broken pipe
sort: write error
```

That error does *not* happen if I call the kernel-core posttrans script from a shell manually.

Additional info:
Tried to track this down and found that it's coming from lsinitramfs. While debugging I added a simple "ls /boot/" to `list_files()`, which resulted in a `ls: write error: Broken pipe` when installing the rpm. That's when I was out of ideas and came here to file the bug.

Disclaimer: I'm having kernels from the kernel-vanilla repo for Fedora (https://fedoraproject.org/wiki/Kernel_Vanilla_Repositories) on my system, but I booted a official Fedora 5.9.16 kernel before filing this bug to see if it happens there, too (which is does).

Comment 1 Troy Curtis 2021-01-05 02:47:51 UTC
Looks like the same bug got reported against the kernel https://bugzilla.redhat.com/show_bug.cgi?id=1911364 but this one seems like it might be closer to the underlying cause. I also encountered this issue when installing kernel-core-5.10.4-200.fc33.x86_64.

Comment 2 Brian Naasz 2021-01-09 02:00:18 UTC
I have been hitting this as well.  Ran 'dnf reinstall kernel-core --rpmverbosity=debug' to get more info on whats actually breaking but I cant take it any farther:


Header V4 RSA/SHA256 Signature, key ID 9570ff31: OK
Header SHA256 digest: OK
Header SHA1 digest: OK
D: Plugin: calling hook psm_pre in selinux plugin
D: %posttrans(kernel-core-5.9.16-200.fc33.x86_64): scriptlet start
D: %posttrans(kernel-core-5.9.16-200.fc33.x86_64): execv(/bin/sh) pid 405489
D: Plugin: calling hook scriptlet_fork_post in selinux plugin
D: setexecfilecon: (/bin/sh, rpm_script_t) 
+ /bin/kernel-install add 5.9.16-200.fc33.x86_64 /lib/modules/5.9.16-200.fc33.x86_64/vmlinuz
sort: fflush failed: 'standard output': Broken pipe
sort: write error

gzip: stdout: Broken pipe

gzip: stdout: Broken pipe
sort: write failed: 'standard output': Broken pipe
sort: write error
D: %posttrans(kernel-core-5.9.16-200.fc33.x86_64): waitpid(405489) rc 405489 status 0

Comment 3 Thorsten Leemhuis 2021-01-13 06:56:57 UTC
Fun fact, TWIMC: seems this bug also made it into RHEL/centos stream; see Bug 1904646, which was marked as dup of Bug 1904066 , which sadly is private.

Comment 4 CoenFierst 2021-01-14 14:24:52 UTC
Dear all,
Still applies to today's Fedora 33 workstation update (kernel-core-5.10.6-200.fc33.x86_64).

I can reboot and use the kernel just fine. Fwiw, attaching update ouput next.

Kind regards,

Coen

Comment 5 CoenFierst 2021-01-14 14:25:30 UTC
Created attachment 1747442 [details]
Commenter 4 attachment

Comment 6 Magnus Glantz 2021-01-15 07:17:35 UTC
(In reply to Thorsten Leemhuis from comment #3)
> Fun fact, TWIMC: seems this bug also made it into RHEL/centos stream; see
> Bug 1904646, which was marked as dup of Bug 1904066 , which sadly is private.

There's no resolution in that bug as of now. I do see see someone found a way to trigger the bug outside of dnf though.

```
Other way to trigger outside of dnf install:
# sh -c 'trap "" PIPE; dracut -f /boot/initramfs-4.18.0-258.el8.x86_64.img 4.18.0-258.el8.x86_64'

gzip: stdout: Broken pipe

gzip: stdout: Broken pipe
sort: write failed: 'standard output': Broken pipe
sort: write error
```

Comment 7 Magnus Glantz 2021-01-15 07:20:48 UTC
(In reply to Thorsten Leemhuis from comment #3)
> Fun fact, TWIMC: seems this bug also made it into RHEL/centos stream; see
> Bug 1904646, which was marked as dup of Bug 1904066 , which sadly is private.

Actually, I see it be written there that the source of the error message likely is Dracut.

More specifically here:
```
  lsinitrd "${DRACUT_TMPDIR}/initramfs.img"| ddebug

...
+ lsinitrd /var/tmp/dracut.mWrMMW/initramfs.img
+ ddebug
+ set +x

gzip: stdout: Broken pipe

gzip: stdout: Broken pipe
sort: write failed: 'standard output': Broken pipe
sort: write error
```

Comment 8 j8tjuh 2021-01-15 09:34:52 UTC
I can confirm that I get this error as well with kernel-core 5.10.6

```
sudo dnf reinstall kernel-core-5.10.6
Last metadata expiration check: 0:04:20 ago on Fri 15 Jan 2021 10:23:01 AM CET.
Dependencies resolved.
===============================================================================================================================================================================================================================================================
 Package                                                        Architecture                                              Version                                                             Repository                                                  Size
===============================================================================================================================================================================================================================================================
Reinstalling:
 kernel-core                                                    x86_64                                                    5.10.6-200.fc33                                                     updates                                                     33 M

Transaction Summary
===============================================================================================================================================================================================================================================================

Total download size: 33 M
Installed size: 74 M
Is this ok [y/N]: y
Downloading Packages:
kernel-core-5.10.6-200.fc33.x86_64.rpm                                                                                                                                                                                          21 MB/s |  33 MB     00:01
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Total                                                                                                                                                                                                                           16 MB/s |  33 MB     00:02
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                        1/1
  Reinstalling     : kernel-core-5.10.6-200.fc33.x86_64     1/2
  Running scriptlet: kernel-core-5.10.6-200.fc33.x86_64     1/2
  Running scriptlet: kernel-core-5.10.6-200.fc33.x86_64     2/2
  Cleanup          : kernel-core-5.10.6-200.fc33.x86_64     2/2
  Running scriptlet: kernel-core-5.10.6-200.fc33.x86_64     2/2
sort: fflush failed: 'standard output': Broken pipe
sort: write error

gzip: stdout: Broken pipe

gzip: stdout: Broken pipe
sort: write failed: 'standard output': Broken pipe
sort: write error

  Verifying        : kernel-core-5.10.6-200.fc33.x86_64                                                                                                                                                                                                    1/2
  Verifying        : kernel-core-5.10.6-200.fc33.x86_64                                                                                                                                                                                                    2/2

Reinstalled:
  kernel-core-5.10.6-200.fc33.x86_64

Complete!
```

Comment 9 Thorsten Leemhuis 2021-01-15 09:44:31 UTC
> I can confirm that I get this error as well with kernel-core 5.10.6

Sure you do, as all kernel-core packages call dracut after installing.

To others: please do not post more confirmations like that; I'm sure they are meant well, but they don't help and just noise that makes things harder to follow here.

Comment 10 Sjoerd Mullender 2021-01-15 10:55:42 UTC
(In reply to Magnus Glantz from comment #7)
> More specifically here:
> ```
>   lsinitrd "${DRACUT_TMPDIR}/initramfs.img"| ddebug
> 
> ...
> + lsinitrd /var/tmp/dracut.mWrMMW/initramfs.img
> + ddebug
> + set +x
> 
> gzip: stdout: Broken pipe
> 
> gzip: stdout: Broken pipe
> sort: write failed: 'standard output': Broken pipe
> sort: write error
> ```

It looks like this ddebug doesn't read its complete input, so the output pipe of lsinitrd gets closed before its done, which causes it to receive a SIGPIPE from which it dies (with an error indication that this happened).  Normally that's ignored by the shell, but with the call to trap '' PIPE that was done, the shell reports the "crash" from lsinitrd.

Comment 11 Sjoerd Mullender 2021-01-15 10:58:59 UTC
(In reply to Sjoerd Mullender from comment #10)
> It looks like this ddebug doesn't read its complete input, so the output
> pipe of lsinitrd gets closed before its done, which causes it to receive a
> SIGPIPE from which it dies (with an error indication that this happened). 
> Normally that's ignored by the shell, but with the call to trap '' PIPE that
> was done, the shell reports the "crash" from lsinitrd.

Try these commands:

cat /boot/vmlinuz* | date
trap "" PIPE
cat /boot/vmlinuz* | date

(no special privileges needed)
date doesn't read its input, and the cat produces loads of output so it gets the SIGPIPE.  The second time, the shell reports the SIGPIPE signal.

Comment 12 Tom Horsley 2021-01-15 14:00:13 UTC
*** Bug 1911364 has been marked as a duplicate of this bug. ***

Comment 13 Gabriel Ramirez 2021-01-16 20:19:06 UTC
Hi,

In my case the error happened to have kexec-tools installed.

after removing kexec-tools and its dependencies and reinstalling the kernel, it completes without error

if you don't have kexec-tools and the error occurs, maybe is another rpm which plugs in the dracut process in the same way.

Gabriel

Comment 14 Alessio 2021-01-17 07:09:29 UTC
*** Bug 1916990 has been marked as a duplicate of this bug. ***

Comment 15 Thorsten Leemhuis 2021-01-17 08:31:05 UTC
(In reply to Gabriel Ramirez from comment #13)

> after removing kexec-tools and its dependencies and reinstalling the kernel,
> it completes without error

Interesting. I tried a different path and on my f33 machine reverted to this build
https://koji.fedoraproject.org/koji/buildinfo?buildID=1581189
Then the error messages vanished.

Which afaics makes this a kexec-bug, introduced somewhere between kexec-tools-2.0.20-17 kexec-tools-2.0.20-21. Reassiging (even if I wonder if dracut should prevent cases like this...)

Comment 16 Dominik 'Rathann' Mierzejewski 2021-01-17 10:41:58 UTC
Both dracut and kexec-tools have their own definition of ddebug():

/usr/lib/dracut/dracut-logger.sh:
...
ddebug() {
    set +x
    dlog 5 "$@"
    [ -n "$debug" ] && set -x || :
}

/usr/lib/kdump/kdump-logger.sh:
...
ddebug() {
    set +x
    dlog 4 "$@"
    [ -n "$debug" ] && set -x || :
}

I haven't dug deeper into the nested function calls, but it seems there's a lot of duplicated bash code between kdump and dracut.

Comment 17 Coiby 2021-01-18 09:17:47 UTC
Hi Thorsten, 

> Which afaics makes this a kexec-bug, introduced somewhere between kexec-tools-2.0.20-17 kexec-tools-2.0.20-21. Reassiging (even if I wonder if dracut should prevent cases like this...)

Thanks for providing this clue! I did a git bisect and the result shows commit 3b743ae6aee36921800629b0fe9e80acc2e6f59f ("enable the logger for kdump") is the first bad commit. I'll post further updates when the root cause is found.

Comment 18 Joseph D. Wagner 2021-01-18 15:02:57 UTC
Confirmed. Downgrading to kexec-tools-2.0.20-17.fc33.x86_64 resolves the issue.

I've added "excludepkgs=kexec-tools" to /etc/dnf/dnf.conf until this is resolved.

Comment 19 Cristian Ciupitu 2021-01-18 17:48:05 UTC
(In reply to Joseph D. Wagner from comment #18)
> Confirmed. Downgrading to kexec-tools-2.0.20-17.fc33.x86_64 resolves ...

Confirmed here, too.

Comment 20 Antoine Gourlay 2021-01-19 15:23:42 UTC
Ah, this is caused by `dlog` (called by all the `dfoo` functions) not consuming its input if the logging level is too low and there is nothing to do. Specifically, at line 254 (https://src.fedoraproject.org/rpms/kexec-tools/blob/f33/f/kdump-logger.sh#_294), the input is read only if the early returns above (line 288/289) are not taken.

Just moving the checks to be done per-line seems to fix the problem:

```
diff --git 1/usr/lib/kdump/kdump-logger.sh 2/usr/lib/kdump/kdump-logger.sh
index 619bc62..8424655 100755
--- 1/usr/lib/kdump/kdump-logger.sh
+++ 2/usr/lib/kdump/kdump-logger.sh
@@ -252,6 +252,9 @@ _dlvl2syslvl() {
 #   - @c DEBUG to @c debug
 _do_dlog() {
     local lvl="$1"; shift
+    [ -z "$kdump_maxloglvl" ] && return 0
+    [[ $lvl -le $kdump_maxloglvl ]] || return 0
+
     local msg="$*"
 
     [[ $lvl -le $kdump_stdloglvl ]] && printf -- 'kdump: %s\n' "$msg" >&2
@@ -285,9 +288,6 @@ _do_dlog() {
 # dwarn "This is a warning"
 # echo "This is a warning" | dwarn
 dlog() {
-    [ -z "$kdump_maxloglvl" ] && return 0
-    [[ $1 -le $kdump_maxloglvl ]] || return 0
-
     if [[ $# -gt 1 ]]; then
         _do_dlog "$@"
     else

```

Or alternatively it could leave the checks in place but read all input into nothing before the `return 0`.

Comment 21 Kairui Song 2021-01-20 02:43:32 UTC
Hi, this is caused by kexec-tools' logger lib overriding dracut's logger functions, I've posted a fix here:
https://lists.fedoraproject.org/archives/list/kexec@lists.fedoraproject.org/thread/O7W7XOMTK5L4YZFHTMYGZSSKDS63FZTN/

It will be fixed in next kexec-tools release.

Comment 22 Fedora Update System 2021-01-22 09:05:19 UTC
FEDORA-2021-574110157b has been submitted as an update to Fedora 33. https://bodhi.fedoraproject.org/updates/FEDORA-2021-574110157b

Comment 23 Fedora Update System 2021-01-23 02:09:49 UTC
FEDORA-2021-574110157b has been pushed to the Fedora 33 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-574110157b`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-574110157b

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 24 Joseph D. Wagner 2021-01-23 15:04:30 UTC
This fix works for me. I commented and upvoted on bodhi.

Comment 25 jpbn 2021-01-23 15:42:44 UTC
close the report?

Comment 26 Fedora Update System 2021-01-24 01:26:09 UTC
FEDORA-2021-574110157b has been pushed to the Fedora 33 stable repository.
If problem still persists, please make note of it in this bug report.