Bug 1473411

Summary: gdb leaks ignored SIGPIPE to child process
Product: Red Hat Enterprise Linux 7
Reporter: Eric Blake <eblake>
Component: gdb
Assignee: Jan Kratochvil <jan.kratochvil>
Status: CLOSED ERRATA
QA Contact: Michal Kolar <mkolar>
Severity: unspecified
Docs Contact: Vladimír Slávik <vslavik>
Priority: unspecified
Version: 7.4
CC: gdb-bugs, jan.kratochvil, mcermak, mkolar, ohudlick, palves, sergiodj, vslavik
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: gdb-7.6.1-107.el7
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
: 1884577 1884578
Environment:
Last Closed: 2018-04-10 10:25:07 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1471969, 1531838

Description Eric Blake 2017-07-20 18:11:52 UTC
Description of problem:
While validating another bug (1468107) about whether a process properly calls signal(SIGPIPE, SIG_IGN), I ran into a weird issue.  I was running the program under test inside gdb (a breakpoint at an opportune moment makes it easy to trigger SIGPIPE), and gdb reported that SIGPIPE was indeed raised on both my machine and the QE machine. But on my machine, the process died from SIGPIPE, while on the QE machine, the process ignored SIGPIPE, got EPIPE as its write() result, and exited normally.

I finally traced the difference: the QE machine was using an older version of gdb, and that older version apparently leaks its own ignored SIGPIPE disposition to child processes, even when gdb itself was started with SIGPIPE at its default disposition.

Version-Release number of selected component (if applicable):
qemu-img-1.5.3-140.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. First, confirm that child processes started from the shell have SIGPIPE at its default disposition:
$ sleep 30&
$ grep SigIgn /proc/$!/status
$ kill %1

2. Next, note that gdb itself sets SIGPIPE to be ignored; the real test is whether SIGPIPE is reset to its default disposition for the child process:
$ gdb --args sleep 30
(gdb) ! grep SigIgn /proc/self/status
(gdb) b main
(gdb) r
(gdb) info thread
(gdb) ! grep SigIgn /proc/XXX/status  # Fill in XXX based on the info thread results
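
The grep steps above can also be scripted. The following is a minimal sketch, assuming a Linux /proc filesystem; the helper names parse_sigign and read_sigign are hypothetical and are not part of gdb or of the original report.

```python
import re


def parse_sigign(status_text):
    """Return the SigIgn bitmask found in the text of /proc/<pid>/status."""
    match = re.search(r"^SigIgn:\s*([0-9a-fA-F]+)$", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("no SigIgn line found")
    # The kernel prints the mask as a hexadecimal string.
    return int(match.group(1), 16)


def read_sigign(pid="self"):
    """Read and parse SigIgn for a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as status_file:
        return parse_sigign(status_file.read())
```

For example, read_sigign(4885) would return the same mask that "grep SigIgn /proc/4885/status" prints, as an integer.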

Actual results:
1. For both machines I tested, I get:
$ sleep 30&
[1] 26872
$ grep SigIgn /proc/$!/status
SigIgn:	0000000000000000
$ kill %1
[1]+  Terminated              sleep 30

So I'm in an environment where children start life with SIGPIPE set to SIG_DFL.

2. On the QE machine, I saw:
$ gdb --args sleep 30
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-100.el7
...
(gdb) ! grep SigIgn /proc/self/status
SigIgn:	0000000001001000
(gdb) b main
Breakpoint 1 at 0x401510
(gdb) r
Starting program: /usr/bin/sleep 30

Breakpoint 1, 0x0000000000401510 in main ()
(gdb) info thread
  Id   Target Id         Frame 
* 1    process 4885 "sleep" 0x0000000000401510 in main ()
(gdb) ! grep SigIgn /proc/4885/status
SigIgn:	0000000001001000
(gdb) 

Ouch - the child process is starting with SigIgn equal to gdb's environment.
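
To read these masks, note that bit N-1 of SigIgn corresponds to signal number N. The sketch below decodes a mask into signal names; it assumes Linux x86-64 signal numbering and only names the signals that appear in this report (the table and the decode_sigign helper are illustrative, not taken from gdb).

```python
# Assumption: Linux x86-64 signal numbers, limited to signals seen in this report.
LINUX_X86_64_SIGNALS = {
    2: "SIGINT",
    3: "SIGQUIT",
    7: "SIGBUS",
    13: "SIGPIPE",
    25: "SIGXFSZ",
}


def decode_sigign(mask):
    """Return the names of the signals whose bits are set in a SigIgn mask."""
    names = []
    signum = 1  # bit 0 of the mask corresponds to signal 1
    while mask:
        if mask & 1:
            names.append(LINUX_X86_64_SIGNALS.get(signum, f"signal {signum}"))
        mask >>= 1
        signum += 1
    return names


print(decode_sigign(0x1001000))  # ['SIGPIPE', 'SIGXFSZ']
```

So the 0000000001001000 mask seen in both gdb and the child includes the SIGPIPE bit (0x1000), which is the leak being reported.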

Expected results:
On my working machine, I see:

GNU gdb (GDB) Fedora 8.0-13.fc26
...
(gdb) info thread
  Id   Target Id         Frame 
* 1    process 26798 "sleep" 0x0000555555555590 in main ()
(gdb) ! grep SigIgn /proc/26798/status
SigIgn:	0000000000000000


Additional info:
I didn't see anything in gdb 8's /usr/share/doc/gdb/NEWS that mentions that this was an intentional bug fix, but obviously newer gdb is being smarter about restoring the signals that it hands to the child process.  As it cost me some debug time to figure out why QE couldn't reproduce my test, it would be good to backport the fix to RHEL gdb.

Comment 2 Eric Blake 2017-07-20 18:22:31 UTC
The all-zeros SigIgn check is a bit weak. You really want to make sure that if gdb starts with something ignored, then the child process ALSO gets started with that same signal ignored.  With working gdb, here's a slightly stronger test:

$ ( trap - BUS; sleep 30& grep SigIgn /proc/$!/status )
SigIgn:	0000000000000006
$ ( trap '' BUS; sleep 30& grep SigIgn /proc/$!/status )
SigIgn:	0000000000000046

This proves that the shell alone can control whether 0x40 (SIGBUS) is ignored during sleep. The shell also ignores SIGINT and SIGQUIT for background tasks, which accounts for the 0x06 bits; that's life, and it's why I used SIGBUS to show what I'm actually toggling.

$ ( trap '' BUS; gdb --args sleep 30 )
GNU gdb (GDB) Fedora 8.0-13.fc26
...
(gdb) ! grep SigIgn /proc/self/status
SigIgn:	0000000001001040

# So gdb started with SIGBUS ignored, and additionally ignores SIGPIPE itself...

(gdb) b main
Breakpoint 1 at 0x1590
(gdb) r
Starting program: /usr/bin/sleep 30

Breakpoint 1, 0x0000555555555590 in main ()
(gdb) info thread
  Id   Target Id         Frame 
* 1    process 27153 "sleep" 0x0000555555555590 in main ()
(gdb) ! grep SigIgn /proc/27153/status
SigIgn:	0000000000000040

# ...but the child process sees JUST SIGBUS ignored (i.e. the exact environment that we started gdb with, prior to gdb fudging things around)
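
The behavior shown above, where the child gets the dispositions the debugger was started with rather than the debugger's own, can be illustrated with a save-and-restore sketch. This is not gdb's code: it is a Python model of the general technique, assuming Linux with /proc and grep available. Because CPython has already ignored SIGPIPE before user code runs, the "original" dispositions are hard-coded in the hypothetical ORIGINAL table instead of being captured at startup.

```python
import signal
import subprocess

# Assumption: the dispositions the parent was started with. A real debugger
# would record these at startup, before changing anything for its own use.
ORIGINAL = {signal.SIGPIPE: signal.SIG_DFL}

SIGPIPE_BIT = 1 << (signal.SIGPIPE - 1)


def restore_original():
    """Runs in the forked child before exec: undo the parent's own changes."""
    for signum, handler in ORIGINAL.items():
        signal.signal(signum, handler)


def child_sigign(restore):
    """Spawn a child and return its SigIgn mask, with or without restoring."""
    # The parent ignores SIGPIPE for its own purposes, as gdb does.
    signal.signal(signal.SIGPIPE, signal.SIG_IGN)
    result = subprocess.run(
        ["grep", "SigIgn", "/proc/self/status"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
        restore_signals=False,  # disable subprocess's built-in reset
        preexec_fn=restore_original if restore else None,
    )
    # Output looks like "SigIgn:\t0000000000001000".
    return int(result.stdout.split()[1], 16)
```

With this sketch, child_sigign(False) models the leak (the SIGPIPE bit is set in the child), while child_sigign(True) models the fixed behavior (the SIGPIPE bit is clear).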

Comment 4 Pedro Alves 2017-07-21 10:06:57 UTC
> I didn't see anything in gdb 8's /usr/share/doc/gdb/NEWS that mentions that 
> this was an intentional bug fix, but obviously newer gdb is being smarter 
> about restoring the signals that it hands to the child process.

Intentional:
 https://sourceware.org/ml/gdb-patches/2016-08/msg00086.html

Comment 5 Pedro Alves 2017-07-21 10:22:13 UTC
> (gdb) ! grep SigIgn /proc/self/status
> SigIgn:	0000000001001040
> 
> # So gdb started with SIGBUS ignored, and additionally ignores SIGPIPE itself...

GDB should probably be transparent here (!/shell command) too, I guess.
But that'd be a separate bug.

Comment 6 Jan Kratochvil 2017-10-18 19:44:55 UTC
The fix is only a backport of:
  http://sourceware.org/bugzilla/show_bug.cgi?id=18653

It does not deal with the issue from Comment 5.

Comment 8 Michal Kolar 2017-11-03 06:54:57 UTC
This does not seem to be resolved. The problem described by the reporter is still present in gdb-7.6.1-104.el7.
Spawned processes still inherit the mask of ignored signals from gdb.

Please review. Thanks.

Comment 9 Jan Kratochvil 2017-11-03 19:09:33 UTC
Sorry, a backport mistake.  Unfortunately it compiled fine and I did not verify it myself.

Comment 11 Michal Kolar 2017-11-23 13:49:13 UTC
Reproduced against gdb-7.6.1-100.el7 and verified against gdb-7.6.1-107.el7.

Comment 14 errata-xmlrpc 2018-04-10 10:25:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0701