Bug 1560657

Summary: sudo blocks in poll() for /dev/ptmx with iolog enabled
Product: Red Hat Enterprise Linux 7 Reporter: Stepan Broz <sbroz>
Component: sudoAssignee: Daniel Kopeček <dkopecek>
Status: CLOSED ERRATA QA Contact: Dalibor Pospíšil <dapospis>
Severity: high Docs Contact: Mirek Jahoda <mjahoda>
Priority: urgent    
Version: 7.4CC: dapospis, dkopecek, fkrska, mjahoda, mmatsuya, rmetrich
Target Milestone: rcKeywords: Patch, Regression, Reproducer, Triaged, ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
*sudo* no longer blocks `poll()` for `/dev/ptmx` Previously, when running a command through *sudo* that had the I/O logging enabled, a parent process of the command was occasionally blocked in the `poll()` function execution, waiting for an event on the `/dev/ptmx` file descriptor. Consequently, a deadlock occurred and *sudo* might leave the process of the command in an unresponsive state. This update adds a pseudoterminal cleanup logic, and sudo no longer causes a deadlock in the described scenario.
Story Points: ---
Clone Of:
: 1582154 1582155 (view as bug list) Environment:
Last Closed: 2018-10-30 11:09:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1547974    
Bug Blocks: 1549635, 1582154, 1582155    

Description Stepan Broz 2018-03-26 16:37:52 UTC
Description of problem:

When running a command in sudo that has iolog enabled, there is a chance that after command termination, sudo (parent process of the command) will block in poll() waiting for a file descriptor of /dev/ptmx (master pty) being ready to be read from.

Sudo may also leave the command in Z (Zombie) state.

How reproducible:
Easily, after few iterations of the Reproducer.
On some systems this is easier than others, I've been lucky with a KVM virtual machine.


Steps to Reproduce:
1. Enable iolog in sudoers, using visudo add:

Defaults    iolog_dir=/var/log/sudo-io/%{user}, log_input, log_output, compress_io

2. Make sure you have at least one user that can run sudo (root is enabled by default, you can use root).

3. Run a command in sudo that prints long output to the terminal, you may not be lucky for the first time - I've added a loop:

# while true; do sudo cat /var/log/messages; sleep .1; done

4. Wait for the command to block.

Actual results:
sudo blocks with the following backtrace:

(gdb) 
#0  0x00007f425fa3aa20 in __poll_nocancel () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007f425ff1f675 in poll (__timeout=<optimized out>, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:46
#2  sudo_ev_scan_impl (base=base@entry=0x5564dfd55000, flags=flags@entry=0) at ./event_poll.c:155
#3  0x00007f425ff18c9e in sudo_ev_loop_v1 (base=base@entry=0x5564dfd55000, flags=flags@entry=0) at ./event.c:302
#4  0x00005564deec95ba in del_io_events (nonblocking=nonblocking@entry=false) at ./exec_pty.c:1045
#5  0x00005564deecd142 in pty_close (cstat=cstat@entry=0x7ffecfa6e8e0) at ./exec_pty.c:895
#6  0x00005564deec8917 in sudo_execute (details=details@entry=0x5564df0e1fa0 <command_details>, cstat=cstat@entry=0x7ffecfa6e8e0) at ./exec.c:537
#7  0x00005564deed3222 in run_command (details=0x5564df0e1fa0 <command_details>) at ./sudo.c:1161
#8  0x00005564deec589c in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ./sudo.c:311

---
lib/util/event_poll.c:

151     } else {
152         timeout = (flags & SUDO_EVLOOP_NONBLOCK) ? 0 : -1;
153     }
154 
155     nready = poll(base->pfds, base->pfd_high + 1, timeout);
156     sudo_debug_printf(SUDO_DEBUG_INFO, "%s: %d fds ready", __func__, nready);
157     switch (nready) {

In our case:
(gdb) print *base->pfds
$2 = {fd = 14, events = 1, revents = 0}

(gdb) info os files
5063       sudo       14         /dev/ptmx 

Expected results:
sudo does not block, terminates completely, leaves no zombies behind.

Additional info:

The problem appears to happen only when write buffer is non-empty, is being flushed and it's file descriptor becomes ready after pty_close() has been called, del_io_event is already called in blocking mode and so is sudo_ev_loop (exec_pty.c:1045).

I was able to reproduce the issue in Fedora 27 and even Rawhide.

Comment 2 Stepan Broz 2018-03-26 16:52:51 UTC
Created attachment 1413242 [details]
This patch prevents the blocking poll()

I was able to avoid the blocking poll using this patch.

Comment 3 Stepan Broz 2018-03-28 08:55:09 UTC
The problem appears to be that the write_callback() schedules a reader for the master PTY file descriptor after flushing the writing buffers. The subsequent poll() that happens with a negative timeout value (-1) makes it blocking forever.

Comment 4 Stepan Broz 2018-03-31 09:16:14 UTC
Just yesterday, there was another beta version of sudo released (sudo-1.8.23b2) that contains a better fix. I couldn't reproduce the problem any more.

The patch is: https://bugzilla.sudo.ws/attachment.cgi?id=505&action=diff

Comment 17 Radovan Sroka 2018-06-29 12:40:50 UTC
*** Bug 1549199 has been marked as a duplicate of this bug. ***

Comment 20 errata-xmlrpc 2018-10-30 11:09:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2018:3199