*sudo* no longer blocks `poll()` for `/dev/ptmx`
Previously, when running a command through *sudo* that had the I/O logging enabled, a parent process of the command was occasionally blocked in the `poll()` function execution, waiting for an event on the `/dev/ptmx` file descriptor. Consequently, a deadlock occurred and *sudo* might leave the process of the command in an unresponsive state. This update adds a pseudoterminal cleanup logic, and sudo no longer causes a deadlock in the described scenario.
Description of problem:
When running a command in sudo that has iolog enabled, there is a chance that after command termination, sudo (parent process of the command) will block in poll() waiting for a file descriptor of /dev/ptmx (master pty) being ready to be read from.
Sudo may also leave the command in Z (Zombie) state.
How reproducible:
Easily, after few iterations of the Reproducer.
On some systems this is easier than others, I've been lucky with a KVM virtual machine.
Steps to Reproduce:
1. Enable iolog in sudoers, using visudo add:
Defaults iolog_dir=/var/log/sudo-io/%{user}, log_input, log_output, compress_io
2. Make sure you have at least one user that can run sudo (root is enabled by default, you can use root).
3. Run a command in sudo that prints long output to the terminal, you may not be lucky for the first time - I've added a loop:
# while true; do sudo cat /var/log/messages; sleep .1; done
4. Wait for the command to block.
Actual results:
sudo blocks with the following backtrace:
(gdb)
#0 0x00007f425fa3aa20 in __poll_nocancel () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007f425ff1f675 in poll (__timeout=<optimized out>, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:46
#2 sudo_ev_scan_impl (base=base@entry=0x5564dfd55000, flags=flags@entry=0) at ./event_poll.c:155
#3 0x00007f425ff18c9e in sudo_ev_loop_v1 (base=base@entry=0x5564dfd55000, flags=flags@entry=0) at ./event.c:302
#4 0x00005564deec95ba in del_io_events (nonblocking=nonblocking@entry=false) at ./exec_pty.c:1045
#5 0x00005564deecd142 in pty_close (cstat=cstat@entry=0x7ffecfa6e8e0) at ./exec_pty.c:895
#6 0x00005564deec8917 in sudo_execute (details=details@entry=0x5564df0e1fa0 <command_details>, cstat=cstat@entry=0x7ffecfa6e8e0) at ./exec.c:537
#7 0x00005564deed3222 in run_command (details=0x5564df0e1fa0 <command_details>) at ./sudo.c:1161
#8 0x00005564deec589c in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ./sudo.c:311
---
lib/util/event_poll.c:
151 } else {
152 timeout = (flags & SUDO_EVLOOP_NONBLOCK) ? 0 : -1;
153 }
154
155 nready = poll(base->pfds, base->pfd_high + 1, timeout);
156 sudo_debug_printf(SUDO_DEBUG_INFO, "%s: %d fds ready", __func__, nready);
157 switch (nready) {
In our case:
(gdb) print *base->pfds
$2 = {fd = 14, events = 1, revents = 0}
(gdb) info os files
5063 sudo 14 /dev/ptmx
Expected results:
sudo does not block, terminates completely, leaves no zombies behind.
Additional info:
The problem appears to happen only when write buffer is non-empty, is being flushed and it's file descriptor becomes ready after pty_close() has been called, del_io_event is already called in blocking mode and so is sudo_ev_loop (exec_pty.c:1045).
I was able to reproduce the issue in Fedora 27 and even Rawhide.
The problem appears to be that the write_callback() schedules a reader for the master PTY file descriptor after flushing the writing buffers. The subsequent poll() that happens with a negative timeout value (-1) makes it blocking forever.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2018:3199
Description of problem: When running a command in sudo that has iolog enabled, there is a chance that after command termination, sudo (parent process of the command) will block in poll() waiting for a file descriptor of /dev/ptmx (master pty) being ready to be read from. Sudo may also leave the command in Z (Zombie) state. How reproducible: Easily, after few iterations of the Reproducer. On some systems this is easier than others, I've been lucky with a KVM virtual machine. Steps to Reproduce: 1. Enable iolog in sudoers, using visudo add: Defaults iolog_dir=/var/log/sudo-io/%{user}, log_input, log_output, compress_io 2. Make sure you have at least one user that can run sudo (root is enabled by default, you can use root). 3. Run a command in sudo that prints long output to the terminal, you may not be lucky for the first time - I've added a loop: # while true; do sudo cat /var/log/messages; sleep .1; done 4. Wait for the command to block. Actual results: sudo blocks with the following backtrace: (gdb) #0 0x00007f425fa3aa20 in __poll_nocancel () at ../sysdeps/unix/syscall-template.S:81 #1 0x00007f425ff1f675 in poll (__timeout=<optimized out>, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:46 #2 sudo_ev_scan_impl (base=base@entry=0x5564dfd55000, flags=flags@entry=0) at ./event_poll.c:155 #3 0x00007f425ff18c9e in sudo_ev_loop_v1 (base=base@entry=0x5564dfd55000, flags=flags@entry=0) at ./event.c:302 #4 0x00005564deec95ba in del_io_events (nonblocking=nonblocking@entry=false) at ./exec_pty.c:1045 #5 0x00005564deecd142 in pty_close (cstat=cstat@entry=0x7ffecfa6e8e0) at ./exec_pty.c:895 #6 0x00005564deec8917 in sudo_execute (details=details@entry=0x5564df0e1fa0 <command_details>, cstat=cstat@entry=0x7ffecfa6e8e0) at ./exec.c:537 #7 0x00005564deed3222 in run_command (details=0x5564df0e1fa0 <command_details>) at ./sudo.c:1161 #8 0x00005564deec589c in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at ./sudo.c:311 --- lib/util/event_poll.c: 151 } else { 152 timeout = (flags & SUDO_EVLOOP_NONBLOCK) ? 0 : -1; 153 } 154 155 nready = poll(base->pfds, base->pfd_high + 1, timeout); 156 sudo_debug_printf(SUDO_DEBUG_INFO, "%s: %d fds ready", __func__, nready); 157 switch (nready) { In our case: (gdb) print *base->pfds $2 = {fd = 14, events = 1, revents = 0} (gdb) info os files 5063 sudo 14 /dev/ptmx Expected results: sudo does not block, terminates completely, leaves no zombies behind. Additional info: The problem appears to happen only when write buffer is non-empty, is being flushed and it's file descriptor becomes ready after pty_close() has been called, del_io_event is already called in blocking mode and so is sudo_ev_loop (exec_pty.c:1045). I was able to reproduce the issue in Fedora 27 and even Rawhide.