Bug 141992 - luit deadlock with vi and less
Status: CLOSED UPSTREAM
Product: Fedora
Classification: Fedora
Component: xorg-x11
Version: 3
Hardware: All Linux
Priority: medium Severity: medium
Assigned To: X/OpenGL Maintenance List
Keywords: Triaged
Duplicates: 128495
Depends On:
Blocks: FC4Target
Reported: 2004-12-06 11:46 EST by Jan "Yenya" Kasprzak
Modified: 2007-11-30 17:10 EST (History)

Doc Type: Bug Fix
Last Closed: 2005-02-01 06:10:59 EST

Attachments
script which makes the bug happen (514 bytes, text/plain)
2005-01-18 18:46 EST, John Smith


External Trackers
Tracker ID Priority Status Summary Last Updated
FreeDesktop.org 2443 None None None Never

Description Jan "Yenya" Kasprzak 2004-12-06 11:46:41 EST
Description of problem:
The luit(1) program sometimes deadlocks when running commands like
vi(1) or less(1).

Version-Release number of selected component (if applicable):
xorg-x11-6.8.1-12.FC3.1

How reproducible:
non-deterministic, probably a race condition.

Steps to Reproduce:
1. run "luit vi"
2. if vi starts up, type ":q<enter>" and repeat from step 1.
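
The manual loop above can be approximated by a small script. This is a minimal sketch, not the script attached to this bug: the function name `count_hangs` and the timeout value are assumptions, and a run is counted as a hang simply because it did not exit within the timeout.

```python
import subprocess


def count_hangs(argv, runs=20, timeout=5.0):
    """Run argv `runs` times and count the runs that exceed `timeout` seconds.

    Hypothetical helper: "hang" here only means "did not exit in time".
    """
    hangs = 0
    for _ in range(runs):
        try:
            # stdin/stdout/stderr are inherited, so the command sees the
            # same terminal as the script itself.
            subprocess.run(argv, timeout=timeout, check=False)
        except subprocess.TimeoutExpired:
            # subprocess.run() kills the direct child before re-raising.
            hangs += 1
    return hangs
```

For example, `count_hangs(["luit", "vi", "-c", "q"])` would start vi under luit and ask it to quit immediately. Whether the race still triggers with `-c q` is an assumption that has not been checked here, and only luit itself is killed on a timeout, so a leftover vi process may need to be cleaned up by hand.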

Actual results:
sometimes no output is written to the terminal, and CPU is idle.


Expected results:
vi should start up every time

Additional info:
- I have not seen this on i386, just on x86_64 (but I use x86_64 as my
workstation, so I don't run luit on i386 very often).

- this is totally nondeterministic. Sometimes vi starts up 5 times in
a row, sometimes luit locks up 5+ times in a row.

- it happens only when running full-screen applications (I have seen
this with "luit vi file", "luit less file", and "luit slrn" so far,
and I haven't seen this with "luit ssh <host>" even though I often run
ssh under luit).

- this is probably some race condition in luit, because I have not
been able to reproduce this when running "strace -f -o /tmp/strace
luit vi".

- however, I can reproduce this using strace without the -f switch. I
ran "strace -o /dev/pts/22 luit vi", and on pts/22 the text ended with
the following lines:

[...]
ioctl(3, TIOCSPTLCK, [0])               = 0
ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo
...}) = 0
ioctl(3, TIOCGPTN, [30])                = 0
stat("/dev/pts/30", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 30),
...}) = 0
getuid()                                = 11561
getgid()                                = 10000
stat("/dev/pts/30", {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 30),
...}) = 0
getgid()                                = 10000
getuid()                                = 11561
chown("/dev/pts/30", 11561, 10000)      = 0
getuid()                                = 11561
geteuid()                               = 11561
getgid()                                = 10000
getegid()                               = 10000
setuid(11561)                           = 0
setgid(10000)                           = 0
clone(child_stack=0,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x2a9556c3b0) = 26487
rt_sigaction(SIGWINCH, {0x401aa0, [], 0x4000000}, NULL, 8) = 0
rt_sigaction(SIGCHLD, {0x401ab0, [], 0x4000000}, NULL, 8) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo
...}) = 0
ioctl(3, SNDCTL_TMR_CONTINUE or TCSETSF

(And yes, the output ended with "TCSETSF" in the middle of the line; no
"<unfinished>" or anything else was printed by strace.) When I hit
Ctrl+C, strace continues with this (I am repeating the last line up
to the "TCSETSF" string):

ioctl(3, SNDCTL_TMR_CONTINUE or TCSETSF, {B38400 opost isig icanon
echo ...}) = -1 EINTR (Interrupted system call)
--- SIGINT (Interrupt) @ 0 (0) ---
+++ killed by SIGINT +++

- when I try to strace the already locked-up luit, I get this:

$ strace -p 26584
Process 26584 attached - interrupt to quit
write(2, "Couldn\'t copy terminal settings\n", 32) = 32
exit_group(0x1, 0x1, 0x39d3c64d60, 0x1, 0x3cProcess 26584 detached

(and of course luit prints "Couldn't copy terminal settings" to stderr
and finishes with exit status "1").

- when I try to strace the process running inside luit, it is waiting
for the terminal input:

$ strace -p 26664    # This is a PID of the "vi" process
Process 26664 attached - interrupt to quit
select(1, [0], NULL, [0], NULL

so it is waiting inside select for reading from stdin.

This may also be kernel related - I am using vanilla 2.6.10-rc2 from
kernel.org just now. I will try it under FC3 kernel if you want.
Comment 1 Jan "Yenya" Kasprzak 2004-12-06 12:24:44 EST
One more thing: I have not been able to reproduce this problem when
running "luit sh -c 'sleep 1; vi'" instead of plain "luit vi".
Definitely a race condition in luit or in the kernel.
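
The "sleep 1" workaround only makes it likely that the terminal setup finishes before the child starts. A minimal sketch of making that ordering explicit with a pipe instead of a delay is shown below. It is not luit's code and not a confirmed fix: the function name `run_after_setup`, the optional `template_fd` parameter, and the assumption that the hang depends on the child starting before `tcsetattr` returns are all hypothetical.

```python
import os
import pty
import termios


def run_after_setup(argv, template_fd=None):
    """Run argv on a new pty, releasing the child only after tcsetattr().

    Hypothetical helper. Returns (output_bytes, exit_code).
    """
    master, slave = pty.openpty()
    ready_r, ready_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: block until the parent signals that setup is finished.
        os.close(master)
        os.close(ready_w)
        os.read(ready_r, 1)
        os.close(ready_r)
        os.setsid()
        for fd in (0, 1, 2):
            os.dup2(slave, fd)
        if slave > 2:
            os.close(slave)
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed

    # Parent: configure the pty while the child is still waiting.
    os.close(ready_r)
    source = slave if template_fd is None else template_fd
    attrs = termios.tcgetattr(source)
    termios.tcsetattr(slave, termios.TCSAFLUSH, attrs)
    os.close(slave)
    os.write(ready_w, b"x")  # release the child
    os.close(ready_w)

    chunks = []
    while True:
        try:
            data = os.read(master, 4096)
        except OSError:
            break  # Linux reports EIO once the slave side is closed
        if not data:
            break
        chunks.append(data)
    os.close(master)
    _, status = os.waitpid(pid, 0)
    code = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1
    return b"".join(chunks), code
```

With no `template_fd`, the pty's own settings are simply re-applied; passing `0` would copy the settings of the calling terminal, which is closer to what the strace output above suggests luit does.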
Comment 2 Mike A. Harris 2004-12-07 04:05:28 EST
Please reproduce under the official Red Hat kernel, with a fully
updated FC3 system and report back.  Additionally, make sure you
are not running any 3rd party kernel modules (proprietary or
otherwise).

Thanks in advance.
Comment 3 Jan "Yenya" Kasprzak 2004-12-07 08:00:18 EST
Yes, I can reproduce it under Red Hat kernel 2.6.9-1.681_FC3 #1 Thu
Nov 18 15:13:22 EST 2004 x86_64 x86_64 x86_64 GNU/Linux

The following modules were loaded (no third-party modules):

Module                  Size  Used by
radeon                145137  2
md5                     4801  1
ipv6                  292769  10
parport_pc             29569  1
lp                     15153  0
parport                53837  2 parport_pc,lp
autofs4                30921  4
i2c_dev                14273  0
i2c_core               27841  1 i2c_dev
sunrpc                202553  1
ds                     20681  0
yenta_socket           22209  0
pcmcia_core            69713  2 ds,yenta_socket
dm_mod                 66345  0
button                  8161  0
battery                10313  0
ac                      5833  0
ohci1394               41305  0
usb_storage            73737  0
ieee1394              383569  1 ohci1394
uhci_hcd               37481  0
ehci_hcd               37957  0
snd_via82xx            33445  2
snd_ac97_codec         84417  1 snd_via82xx
snd_pcm_oss            59513  0
snd_mixer_oss          20801  2 snd_pcm_oss
snd_pcm               123981  2 snd_via82xx,snd_pcm_oss
snd_timer              36169  1 snd_pcm
snd_page_alloc         11473  2 snd_via82xx,snd_pcm
gameport                5057  1 snd_via82xx
snd_mpu401_uart        11713  1 snd_via82xx
snd_rawmidi            32997  1 snd_mpu401_uart
snd_seq_device          9805  1 snd_rawmidi
snd                    64425  11 snd_via82xx,snd_ac97_codec,snd_pcm_oss,snd_mixer_oss,snd_pcm,snd_timer,snd_mpu401_uart,snd_rawmidi,snd_seq_device
soundcore              12641  2 snd
sk98lin               165677  1
floppy                 72305  0
ext3                  139985  3
jbd                    91761  1 ext3
sata_via                8517  0
libata                 49608  1 sata_via
sd_mod                 19137  0
scsi_mod              148577  3 usb_storage,libata,sd_mod

I have tried to look into the source, and the lockup occurs in
xorg-x11-6.8.1/xc/programs/luit/sys.c line 227:

     rc = tcsetattr(dfd, TCSAFLUSH, &tio);

When I use 0 (which is TCSANOW on Linux) instead of TCSAFLUSH, it does
not lock up, but luit also does not work correctly: it does not set
"noecho" in vi, so the commands are echoed on the screen.
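
For reference, the get-then-set pattern around that call can be sketched with Python's standard `termios` module, which wraps the same `tcgetattr`/`tcsetattr` functions. This is an illustration only, not luit's source: `copy_term_settings` and `echo_enabled` are hypothetical names. The `when` argument selects between TCSANOW (apply immediately), TCSADRAIN (wait for pending output first) and TCSAFLUSH (wait for pending output and also discard unread input).

```python
import pty
import termios


def copy_term_settings(src_fd, dst_fd, when=termios.TCSAFLUSH):
    """Copy terminal attributes from src_fd to dst_fd (hypothetical helper).

    With TCSAFLUSH or TCSADRAIN the call may block until output already
    written to dst_fd has been transmitted; TCSANOW never waits.
    """
    attrs = termios.tcgetattr(src_fd)
    termios.tcsetattr(dst_fd, when, attrs)


def echo_enabled(fd):
    """Return True if the ECHO local flag is set on the terminal fd."""
    lflag = termios.tcgetattr(fd)[3]  # index 3 is c_lflag
    return bool(lflag & termios.ECHO)
```

Calling `copy_term_settings(0, fd, termios.TCSANOW)` corresponds to the experiment with 0 described above, assuming a platform where TCSANOW is 0.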

Comment 4 John Smith 2005-01-18 18:44:38 EST
I was going to file a bug about luit in bugzilla when I saw this
one.  I think it's more or less the same.  Please see the attached
script (the script calls luit 10000 times).

Actually there are two problems:
1) sometimes luit hangs, and the script doesn't terminate; attaching
a strace to luit gives the same results as explained by Jan "Yenya"
Kasprzak;
2) when you're lucky and the script terminates, it should normally
print "Bug happened 0 times out of 10000", but here it instead prints
things like "Bug happened 959 times out of 10000" (much lower figures
if I use /bin/echo -n instead of ssh -S foo bar, though).
Comment 5 John Smith 2005-01-18 18:46:51 EST
Created attachment 109952
script which makes the bug happen
Comment 6 Frank Schmitt 2005-01-23 06:02:15 EST
I have the same problem over here: Fully updated Fedora Core 3 running
2.6.10. I have LANG=en_US.UTF-8 but use luit to start centericq like this:

export LANG=en_US.ISO8859-15
luit centericq

and "luit centericq" hangs most of the time, too. If I strace it,
it hangs at "ioctl(3, SNDCTL_TMR_CONTINUE or TCSETSF". I'm not running
x86_64 but a Pentium M.
Comment 7 Mike A. Harris 2005-02-01 05:29:16 EST
Please report this issue also to X.Org, in the X.Org bugzilla
located at http://bugs.freedesktop.org in the "xorg" component.

Once you've filed your report to X.org, please paste the URL
here and Red Hat will track the issue, and review any fixes
that become available for consideration in future Fedora Core
updates.

Setting status to "NEEDINFO", and awaiting upstream bug URL.

Thanks in advance.
Comment 8 Jan "Yenya" Kasprzak 2005-02-01 05:46:40 EST
The upstream bug URL is https://bugs.freedesktop.org/show_bug.cgi?id=2443

I am changing the Platform: field to "All" (see the comment #6).
Comment 9 Mike A. Harris 2005-02-01 06:10:59 EST
Thanks, setting status to "UPSTREAM" for tracking in X.org bugzilla.
Comment 10 Mike A. Harris 2005-05-17 17:01:36 EDT
*** Bug 128495 has been marked as a duplicate of this bug. ***
