Description of problem: Whilst debugging some network filesystem stuff, I found that if the open directory system call fails with ERESTARTSYS due to a signal, that signal is deferred by userspace and the system call restarted. Unfortunately, this means that an ls that is waiting for a network timeout cannot be cancelled with CTRL-C and cannot be suspended with CTRL-Z. open("/afs/user/dhowells", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} --- rt_sigreturn({mask=[]}) = 2 open("/afs/user/dhowells", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) --- SIGQUIT {si_signo=SIGQUIT, si_code=SI_KERNEL} --- rt_sigreturn({mask=[]}) = 2 open("/afs/user/dhowells", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3 fstat(3, {st_mode=S_IFDIR|0777, st_size=2048, ...}) = 0 getdents(3, /* 3 entries */, 32768) = 80 rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT PIPE ALRM TERM TSTP XCPU XFSZ VTALRM PROF IO], [], 8) = 0 rt_sigaction(SIGINT, {SIG_DFL, [INT], SA_RESTORER|SA_RESTART, 0x7f9c95c93b20}, {0x5640f276ce60, [HUP INT QUIT PIPE ALRM TERM TSTP XCPU XFSZ VTALRM PROF IO], SA_RESTORER|SA_RESTART, 0x7f9c95c93b20}, 8) = 0 tgkill(3954, 3954, SIGINT) = 0 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 --- SIGINT {si_signo=SIGINT, si_code=SI_TKILL, si_pid=3954, si_uid=0} --- +++ killed by SIGINT +++ Version-Release number of selected component (if applicable): coreutils-8.24-6.fc23.x86_64 Unfortunately, it's quite hard to get hold of a directory that will hang for a long enough period of time, but if you can it's quite reproducible.
Could you please attach gdb while ls is hanging and paste the backtrace?
#0 0x00007faca7731455 in __opendir (name=0x560b8766cc90 "/afs/user/dhowells") at ../sysdeps/posix/opendir.c:200 #1 0x0000560b8744caed in print_dir (command_line_arg=<optimized out>, realname=0x0, name=<optimized out>) at src/ls.c:2551 #2 main (argc=<optimized out>, argv=<optimized out>) at src/ls.c:1457
Does it work as expected with --color=none or if stdout is redirected off terminal?
ls --color=none can be interrupted, as can ls >/tmp/foo. A subsequent ls without anything special will then still be uninterruptible until the network timeout occurs.
If ls prints to a terminal, it postpones processing of signals to always leave terminal in a consistent state when being terminated or suspended. A possible solution would be to disable SA_RESTART before calling opendir() and process signals synchronously in an EINTR loop. Nevertheless, I am afraid we will hit the same issue with subsequent stat() calls if the -l option is used.
reported upstream: http://lists.gnu.org/archive/html/bug-coreutils/2016-08/msg00010.html
Created attachment 1198255 [details] [PATCH] ls: postpone installation of signal handlers ... until they are actually needed. That is right before the first escape sequence is printed to a terminal.
Could you please test the attached patch? Does it solve the problem for you? I can prepare a scratch build for Fedora release of your choice if needed...
The patch works for me. If I make afs_open() in the kernel sleep till a signal happens than then return -ERESTARTSYS, the old ls just restarts the open() system call when I press CTRL-C, but the patched ls is correctly killed by SIGINT.
Thank you for testing the patch! I have proposed it to cureutils upstream: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=24232#17
Upstream patch: http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=v8.25-63-g5445f78
Thanks for the upstream commit! Included in coreutils-8.25-15.fc26...
coreutils-8.25-15.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-be13b257bb
coreutils-8.25-7.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-75adc7da4f
coreutils-8.25-15.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-be13b257bb
coreutils-8.25-7.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-75adc7da4f
coreutils-8.25-15.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.
coreutils-8.25-7.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.