Bug 1365933

Summary: The ls program can't be interrupted by a signal
Product: [Fedora] Fedora Reporter: David Howells <dhowells>
Component: coreutilsAssignee: Kamil Dudka <kdudka>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 24CC: admiller, dhowells, jamartis, kdudka, kzak, ooprala, ovasik, p, twaugh
Target Milestone: ---Keywords: Patch
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: coreutils-8.25-15.fc26 coreutils-8.25-15.fc25 coreutils-8.25-7.fc24 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1421802 (view as bug list) Environment:
Last Closed: 2016-11-05 09:40:12 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1421802    
Attachments:
Description Flags
[PATCH] ls: postpone installation of signal handlers none

Description David Howells 2016-08-10 14:30:01 UTC
Description of problem:

Whilst debugging some network filesystem stuff, I found that if the open directory system call fails with ERESTARTSYS due to a signal, that signal is deferred by userspace and the system call restarted.  Unfortunately, this means that an ls that is waiting for a network timeout cannot be cancelled with CTRL-C and cannot be suspended with CTRL-Z.

open("/afs/user/dhowells", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} ---
rt_sigreturn({mask=[]})                 = 2
open("/afs/user/dhowells", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGQUIT {si_signo=SIGQUIT, si_code=SI_KERNEL} ---
rt_sigreturn({mask=[]})                 = 2
open("/afs/user/dhowells", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFDIR|0777, st_size=2048, ...}) = 0
getdents(3, /* 3 entries */, 32768)     = 80
rt_sigprocmask(SIG_BLOCK, [HUP INT QUIT PIPE ALRM TERM TSTP XCPU XFSZ VTALRM PROF IO], [], 8) = 0
rt_sigaction(SIGINT, {SIG_DFL, [INT], SA_RESTORER|SA_RESTART, 0x7f9c95c93b20}, {0x5640f276ce60, [HUP INT QUIT PIPE ALRM TERM TSTP XCPU XFSZ VTALRM PROF IO], SA_RESTORER|SA_RESTART, 0x7f9c95c93b20}, 8) = 0
tgkill(3954, 3954, SIGINT)              = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
--- SIGINT {si_signo=SIGINT, si_code=SI_TKILL, si_pid=3954, si_uid=0} ---
+++ killed by SIGINT +++

Version-Release number of selected component (if applicable):

coreutils-8.24-6.fc23.x86_64

Unfortunately, it's quite hard to get hold of a directory that will hang for a long enough period of time, but if you can it's quite reproducible.

Comment 1 Kamil Dudka 2016-08-11 08:49:04 UTC
Could you please attach gdb while ls is hanging and paste the backtrace?

Comment 2 David Howells 2016-08-11 09:52:25 UTC
#0  0x00007faca7731455 in __opendir (name=0x560b8766cc90 "/afs/user/dhowells")
    at ../sysdeps/posix/opendir.c:200
#1  0x0000560b8744caed in print_dir (command_line_arg=<optimized out>, 
    realname=0x0, name=<optimized out>) at src/ls.c:2551
#2  main (argc=<optimized out>, argv=<optimized out>) at src/ls.c:1457

Comment 3 Kamil Dudka 2016-08-11 11:23:12 UTC
Does it work as expected with --color=none or if stdout is redirected off terminal?

Comment 4 David Howells 2016-08-11 11:51:47 UTC
ls --color=none can be interrupted, as can ls >/tmp/foo.  A subsequent ls without anything special will then still be uninterruptible until the network timeout occurs.

Comment 5 Kamil Dudka 2016-08-11 12:22:48 UTC
If ls prints to a terminal, it postpones processing of signals to always leave terminal in a consistent state when being terminated or suspended.  A possible solution would be to disable SA_RESTART before calling opendir() and process signals synchronously in an EINTR loop.  Nevertheless, I am afraid we will hit the same issue with subsequent stat() calls if the -l option is used.

Comment 6 Kamil Dudka 2016-08-15 14:05:35 UTC
reported upstream:
http://lists.gnu.org/archive/html/bug-coreutils/2016-08/msg00010.html

Comment 7 Kamil Dudka 2016-09-06 12:53:25 UTC
Created attachment 1198255 [details]
[PATCH] ls: postpone installation of signal handlers

... until they are actually needed.  That is right before the first
escape sequence is printed to a terminal.

Comment 8 Kamil Dudka 2016-09-06 12:54:39 UTC
Could you please test the attached patch?  Does it solve the problem for you?

I can prepare a scratch build for Fedora release of your choice if needed...

Comment 9 David Howells 2016-09-06 15:09:42 UTC
The patch works for me.  If I make afs_open() in the kernel sleep till a signal happens than then return -ERESTARTSYS, the old ls just restarts the open() system call when I press CTRL-C, but the patched ls is correctly killed by SIGINT.

Comment 10 Kamil Dudka 2016-09-06 15:42:04 UTC
Thank you for testing the patch!  I have proposed it to cureutils upstream:

http://debbugs.gnu.org/cgi/bugreport.cgi?bug=24232#17

Comment 11 Pádraig Brady 2016-09-07 09:19:45 UTC
Upstream patch: http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commitdiff;h=v8.25-63-g5445f78

Comment 12 Kamil Dudka 2016-09-07 18:30:01 UTC
Thanks for the upstream commit!  Included in coreutils-8.25-15.fc26...

Comment 13 Fedora Update System 2016-10-31 17:44:26 UTC
coreutils-8.25-15.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-be13b257bb

Comment 14 Fedora Update System 2016-10-31 17:46:17 UTC
coreutils-8.25-7.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-75adc7da4f

Comment 15 Fedora Update System 2016-11-01 02:22:13 UTC
coreutils-8.25-15.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-be13b257bb

Comment 16 Fedora Update System 2016-11-01 18:22:54 UTC
coreutils-8.25-7.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-75adc7da4f

Comment 17 Fedora Update System 2016-11-03 18:29:36 UTC
coreutils-8.25-15.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.

Comment 18 Fedora Update System 2016-11-03 23:55:08 UTC
coreutils-8.25-7.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

Comment 19 Fedora Update System 2016-11-19 21:06:42 UTC
coreutils-8.25-15.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.