Bug 504798

Summary: stty/termios lockup (by race?) due to TCSADRAIN
Product: [Fedora] Fedora
Reporter: Jan Kratochvil <jan.kratochvil>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED WONTFIX
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low
Priority: low
Version: 12
CC: itamar, kernel-maint, ovasik, redhat-bugzilla
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2010-12-05 06:52:20 UTC

Description Jan Kratochvil 2009-06-09 14:23:38 UTC
Description of problem:
A machine running 4 GDB testsuites in parallel in mock always ends up with at least one of the testsuites hanging on `stty sane' (run by `expect') for no apparent reason.
stty hangs on:
      if (tcsetattr (STDIN_FILENO, TCSADRAIN, &mode))
I worked around it by using TCSANOW instead, but it looks like a kernel wait race to me.
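
For readers unfamiliar with the two modes, here is a minimal sketch of the call in question. It is not the stty.c code; the helper name reapply_termios is made up for illustration. It re-applies a terminal's current settings with a caller-chosen optional_actions value.

```c
#include <termios.h>

/* Hypothetical helper (not from stty.c): re-apply the current settings of
 * fd using the given optional_actions (TCSANOW, TCSADRAIN or TCSAFLUSH).
 * Returns 0 on success, -1 on error. */
int reapply_termios(int fd, int optional_actions)
{
    struct termios mode;

    if (tcgetattr(fd, &mode) != 0)
        return -1;              /* bad descriptor or not a terminal */

    /* TCSADRAIN waits until all output written to fd has been transmitted
     * before applying the change; TCSANOW applies it immediately. */
    return tcsetattr(fd, optional_actions, &mode) == 0 ? 0 : -1;
}
```

Calling reapply_termios(STDIN_FILENO, TCSADRAIN) corresponds to the variant that hangs here, and reapply_termios(STDIN_FILENO, TCSANOW) to the workaround.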

Version-Release number of selected component (if applicable):
kernel-2.6.30-0.97.rc8.fc12.x86_64
( + coreutils-7.2-1.fc11.{x86_64.i586} )

How reproducible:
Always in about 4 runs.

Steps to Reproduce:
I do not have a standalone reproducer; it requires some custom testing scripts etc.
I can test some kernel patches.

Actual results:
Hanging on:
27991 /var/lib/mock/fedora-11-i386/root/dev/pts/5 SNs+   0:00  \_ expect -- /usr/share/dejagnu/runtest.exp --target_board=unix/-m32/-fPIE/-pie 
27992 /var/lib/mock/fedora-11-i386/root/dev/pts/5 SN+   0:00  \_ sh -c /bin/stty sane < /dev/pts/5    
28036 /var/lib/mock/fedora-11-i386/root/dev/pts/5 SN+   0:00  \_ /bin/stty sane                           

Expected results:
No hang.

Additional info:
Replacing TCSADRAIN with TCSANOW in /bin/stty in the mock chroots makes the problem go away.  Still, I see no reason why TCSADRAIN should hang; no one is writing to that chrootdir/dev/pts/5.
The standard global devpts is mounted in all chrootdirs/dev/pts.

hanging `stty sane':
#0  0x00007f5117104c78 in tcsetattr () from /lib64/libc.so.6
0x00007f5117104c76 <tcsetattr+150>:     syscall
      if (tcsetattr (STDIN_FILENO, TCSADRAIN, &mode))
#1  0x00000000004031af in main (argc=<value optimized out>, argv=<value optimized out>) at stty.c:1004

# ls -l /proc/*/fd/*|grep pts/5
lrwx------ 1 jkratoch jkratoch 64 2009-06-09 12:43 /proc/27991/fd/0 -> /var/lib/mock/fedora-11-i386/root/dev/pts/5
lrwx------ 1 jkratoch jkratoch 64 2009-06-09 12:43 /proc/27991/fd/1 -> /var/lib/mock/fedora-11-i386/root/dev/pts/5
lrwx------ 1 jkratoch jkratoch 64 2009-06-09 12:35 /proc/27991/fd/2 -> /var/lib/mock/fedora-11-i386/root/dev/pts/5
lrwx------ 1 jkratoch jkratoch 64 2009-06-09 12:43 /proc/27992/fd/0 -> /var/lib/mock/fedora-11-i386/root/dev/pts/5
lrwx------ 1 jkratoch jkratoch 64 2009-06-09 12:43 /proc/27992/fd/1 -> /var/lib/mock/fedora-11-i386/root/dev/pts/5
lrwx------ 1 jkratoch jkratoch 64 2009-06-09 12:35 /proc/27992/fd/2 -> /var/lib/mock/fedora-11-i386/root/dev/pts/5
lr-x------ 1 jkratoch jkratoch 64 2009-06-09 12:43 /proc/28036/fd/0 -> /var/lib/mock/fedora-11-i386/root/dev/pts/5
lrwx------ 1 jkratoch jkratoch 64 2009-06-09 12:43 /proc/28036/fd/1 -> /var/lib/mock/fedora-11-i386/root/dev/pts/5
lrwx------ 1 jkratoch jkratoch 64 2009-06-09 12:35 /proc/28036/fd/2 -> /var/lib/mock/fedora-11-i386/root/dev/pts/5
$ cat /proc/{27991,27992,28036}/fdinfo/{0,1,2}
pos:    0
flags:  02
pos:    0
flags:  02
pos:    0
flags:  02
pos:    0
flags:  02
pos:    0
flags:  02
pos:    0
flags:  02
pos:    0
flags:  0100000
pos:    0
flags:  02
pos:    0
flags:  02
$
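
The "flags:" values above are octal open(2) flags. As a rough aid for reading them, the sketch below extracts the access mode; both function names are made up for illustration. The value 02 decodes to O_RDWR, and 0100000 has access mode 0 (O_RDONLY), which matches the lr-x entry for /proc/28036/fd/0. The remaining 0100000 bit is, as far as I know, the kernel's O_LARGEFILE value on x86, but treat that as an assumption.

```c
#include <fcntl.h>
#include <stdlib.h>

/* Hypothetical helper: parse an fdinfo "flags:" value (octal text such as
 * "02" or "0100000") and return only its access-mode bits. */
int fdinfo_access_mode(const char *octal_flags)
{
    unsigned long flags = strtoul(octal_flags, NULL, 8);

    return (int)(flags & O_ACCMODE);
}

/* Hypothetical helper: map an access mode to its symbolic name. */
const char *access_mode_name(int mode)
{
    switch (mode) {
    case O_RDONLY: return "O_RDONLY";
    case O_WRONLY: return "O_WRONLY";
    case O_RDWR:   return "O_RDWR";
    default:       return "unknown";
    }
}
```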

$ stty -F /var/lib/mock/fedora-11-i386/root/dev/pts/5
speed 9600 baud; line = 0;
intr = <undef>; quit = <undef>; erase = <undef>; kill = <undef>; eof = <undef>; start = <undef>; stop = <undef>; susp = <undef>;
rprnt = <undef>; werase = <undef>; lnext = <undef>; flush = <undef>; min = 1; time = 0;
-brkint -icrnl -imaxbel
-opost -onlcr
-isig -icanon -iexten -echo -echoe -echok noflsh -echoctl -echoke
$ stty -F /var/lib/mock/fedora-11-i386/root/dev/pts/5 -a
speed 9600 baud; rows 0; columns 0; line = 0;
intr = <undef>; quit = <undef>; erase = <undef>; kill = <undef>; eof = <undef>; eol = <undef>; eol2 = <undef>; swtch = <undef>;
start = <undef>; stop = <undef>; susp = <undef>; rprnt = <undef>; werase = <undef>; lnext = <undef>; flush = <undef>; min = 1; time = 0;
-parenb -parodd cs8 hupcl -cstopb cread clocal -crtscts
-ignbrk -brkint ignpar -parmrk -inpck -istrip -inlcr -igncr -icrnl -ixon -ixoff -iuclc -ixany -imaxbel -iutf8
-opost -olcuc -ocrnl -onlcr -onocr -onlret -ofill -ofdel nl0 cr0 tab0 bs0 vt0 ff0
-isig -icanon -iexten -echo -echoe -echok -echonl noflsh -xcase -tostop -echoprt -echoctl -echoke

stty          S ffff880032955440  5504 28036  27992
ffff88019c9999f8 0000000000000082 0000000000000000 0000000000000000
0000000000000000 000000000d402ead ffff88019c999968 ffffffff8101a805
ffff8800965127c8 000000000000e440 ffff8800965127c8 00000000001d33c0
Call Trace:
[<ffffffff8101a805>] ? native_sched_clock+0x2d/0x54
[<ffffffff810888cb>] ? trace_hardirqs_on_caller+0x139/0x173
[<ffffffff814b6fe3>] schedule+0x21/0x49
[<ffffffff814b74a9>] schedule_timeout+0x36/0xf6
[<ffffffff812dd24c>] ? n_tty_chars_in_buffer+0x8c/0xae
[<ffffffff812e2abe>] ? pty_chars_in_buffer+0x3b/0x6d
[<ffffffff812dfdb1>] tty_wait_until_sent+0xc8/0x11c
[<ffffffff81075843>] ? autoremove_wake_function+0x0/0x5f
[<ffffffff812dff4e>] set_termios+0x149/0x3c0
[<ffffffff812e0426>] tty_mode_ioctl+0x15e/0x436
[<ffffffff810888cb>] ? trace_hardirqs_on_caller+0x139/0x173
[<ffffffff81088925>] ? trace_hardirqs_on+0x20/0x36
[<ffffffff812e088a>] n_tty_ioctl_helper+0x18c/0x1ae
[<ffffffff812e09eb>] ? tty_ldisc_try+0x4f/0x6d
[<ffffffff812dd802>] n_tty_ioctl+0xda/0xf5
[<ffffffff812db987>] tty_ioctl+0x823/0x86e
[<ffffffff8101a805>] ? native_sched_clock+0x2d/0x54
[<ffffffff811347b7>] vfs_ioctl+0x31/0xaa
[<ffffffff81134cad>] do_vfs_ioctl+0x47d/0x4d4
[<ffffffff81088925>] ? trace_hardirqs_on+0x20/0x36
[<ffffffff81134d69>] sys_ioctl+0x65/0x9c
[<ffffffff811652b3>] do_ioctl32_pointer+0x23/0x39
[<ffffffff81167d7d>] compat_sys_ioctl+0x324/0x38b
[<ffffffff810b4c07>] ? audit_syscall_entry+0x12d/0x16d
[<ffffffff814b94a3>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[<ffffffff81042c7f>] sysenter_dispatch+0x7/0x33
[<ffffffff814b9464>] ? trace_hardirqs_on_thunk+0x3a/0x3f
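
The trace above shows stty sleeping in tty_wait_until_sent, with pty_chars_in_buffer and n_tty_chars_in_buffer on the stack, which suggests the kernel still considers some output queued on the pty. One way to check that from user space might be the TIOCOUTQ ioctl, which reports the number of bytes in the output queue. The sketch below assumes Linux; the function name is made up for illustration, and I have not verified that it reports the same count the kernel waits on in this case.

```c
#include <sys/ioctl.h>

/* Hypothetical helper: return the number of bytes still queued for output
 * on the terminal fd, or -1 if the query fails (for example because fd is
 * not a terminal). */
int pending_output_bytes(int fd)
{
    int queued = 0;

    if (ioctl(fd, TIOCOUTQ, &queued) != 0)
        return -1;
    return queued;
}
```

A non-zero result on the hanging pts would be consistent with TCSADRAIN waiting for that output to be consumed.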

Name:   stty
State:  S (sleeping)
Tgid:   28036
Pid:    28036
PPid:   27992
TracerPid:      0
Uid:    502     502     502     502
Gid:    502     502     502     502
Utrace: 0
FDSize: 64
Groups: 502
VmPeak:     1900 kB
VmSize:     1900 kB
VmLck:         0 kB
VmHWM:       412 kB
VmRSS:       412 kB
VmData:      156 kB
VmStk:        84 kB
VmExe:        52 kB
VmLib:      1592 kB
VmPTE:        20 kB
Threads:        1
SigQ:   14/53248
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000001000
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: ffffffffffffffff
Cpus_allowed:   ff
Cpus_allowed_list:      0-7
Mems_allowed:   00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list:      0
voluntary_ctxt_switches:        1
nonvoluntary_ctxt_switches:     3

sh            S ffff88003277a440  5456 27992  27991
ffff88013319fd28 0000000000000082 0000000000000000 00000000ba37fac7
ffff88013319fc88 ffffffff8101a805 ffff88013319fc98 00000000ba37fac7
ffff8800da1f03e8 000000000000e440 ffff8800da1f03e8 00000000001d33c0
Call Trace:
[<ffffffff8101a805>] ? native_sched_clock+0x2d/0x54
[<ffffffff814b6fe3>] schedule+0x21/0x49
[<ffffffff8105ee09>] do_wait+0x2a2/0x3e6
[<ffffffff81051e7b>] ? default_wake_function+0x0/0x3b
[<ffffffff814b9acf>] ? _spin_unlock_irqrestore+0x5a/0x7f
[<ffffffff8105efe3>] sys_wait4+0x96/0xc7
[<ffffffff8109fd0c>] compat_sys_wait4+0x3a/0xe8
[<ffffffff8107a127>] ? up_read+0x3a/0x55
[<ffffffff81013b35>] ? retint_swapgs+0x13/0x1b
[<ffffffff810b4c07>] ? audit_syscall_entry+0x12d/0x16d
[<ffffffff8104416c>] sys32_waitpid+0x23/0x39
[<ffffffff81042c7f>] sysenter_dispatch+0x7/0x33

expect        S ffff880032955440  5984 27991  31547
ffff88010a52fd28 0000000000000082 0000000000000000 00000000daac65fa
ffff88010a52fc88 ffffffff8101a805 ffff88010a52fc98 00000000daac65fa
ffff8800965103e8 000000000000e440 ffff8800965103e8 00000000001d33c0
Call Trace:
[<ffffffff8101a805>] ? native_sched_clock+0x2d/0x54
[<ffffffff814b6fe3>] schedule+0x21/0x49
[<ffffffff8105ee09>] do_wait+0x2a2/0x3e6
[<ffffffff81051e7b>] ? default_wake_function+0x0/0x3b
[<ffffffff81071cf2>] ? find_get_pid+0x70/0x8f
[<ffffffff8105efe3>] sys_wait4+0x96/0xc7
[<ffffffff8109fd0c>] compat_sys_wait4+0x3a/0xe8
[<ffffffff8107a127>] ? up_read+0x3a/0x55
[<ffffffff81013b35>] ? retint_swapgs+0x13/0x1b
[<ffffffff810b4c07>] ? audit_syscall_entry+0x12d/0x16d
[<ffffffff8104416c>] sys32_waitpid+0x23/0x39
[<ffffffff81042c7f>] sysenter_dispatch+0x7/0x33
[<ffffffff814b9464>] ? trace_hardirqs_on_thunk+0x3a/0x3f

Comment 1 Jan Kratochvil 2009-06-10 19:21:31 UTC
To the coreutils maintainer:
As the workaround looks mostly harmless to me, Fedora could apply it (stty.c TCSADRAIN->TCSANOW) until the kernel bug gets fixed.

Comment 2 Ondrej Vasik 2009-06-11 08:37:18 UTC
Sounds reasonable and harmless for rawhide - built with TCSANOW in stty as coreutils-7.4-2.fc12 until this is fixed in the kernel.

Comment 3 Bug Zapper 2009-11-16 10:02:12 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 12 development cycle.
Changing version to '12'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 4 Bug Zapper 2010-11-04 11:09:46 UTC
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 5 Bug Zapper 2010-12-05 06:52:20 UTC
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.