Bug 202480 - screen(1) reliably crashes on third attach
Status: CLOSED RAWHIDE
Product: Fedora
Classification: Fedora
Component: ncurses
Version: rawhide
Hardware: All Linux
Priority: medium
Severity: high
Assigned To: Miroslav Lichvar
Depends On:
Blocks: FC6Blocker
 
Reported: 2006-08-14 15:00 EDT by Adam Jackson
Modified: 2013-01-09 23:02 EST

Doc Type: Bug Fix
Last Closed: 2006-08-16 11:07:20 EDT

Attachments
Patch fixing the tgetent bug. (688 bytes, patch)
2006-08-16 10:41 EDT, Miroslav Lichvar

Description Adam Jackson 2006-08-14 15:00:46 EDT
Description of problem:

Start two terminals, and:
Term 1: screen -S test
Term 2: screen -rd test
Term 1: screen -rd test
Term 2: screen -rd test

(obviously waiting until the detach/attach completes for each screen -rd).  On
the third screen -rd, the master screen process goes away, the dungeon
collapses, and you die.

Version-Release number of selected component (if applicable):

screen-4.0.2-15.1

How reproducible:
Always.
Comment 1 Jesse Keating 2006-08-15 22:21:31 EDT
I'm able to replicate this after updating multiple systems to rawhide.  This is
a severe problem and makes screen rather less than useful.
Comment 2 Jesse Keating 2006-08-15 22:31:11 EDT
Hrm, installing older releases of screen (4.0.1) isn't helping.  There must be
some other change in the distribution that is causing screen to freak.
Comment 3 Jesse Keating 2006-08-15 23:15:37 EDT
I coaxed a backtrace out of gdb:

#0  0x00002b139fd9388b in free () from /lib64/libc.so.6
#1  0x00002b139f4bbf32 in _nc_free_termtype () from /usr/lib64/libncurses.so.5
#2  0x00002b139f4bc643 in del_curterm () from /usr/lib64/libncurses.so.5
#3  0x00002b139f4beab0 in tgetent () from /usr/lib64/libncurses.so.5
#4  0x0000000000421e15 in getlogin ()
#5  0x00000000004233f8 in getlogin ()
#6  0x0000000000416796 in _nc_timed_wait ()
#7  0x00000000004175db in _nc_timed_wait ()
#8  0x000000000044029e in getlogin ()
#9  0x0000000000407e34 in ?? ()
#10 0x00002b139fd42aa4 in __libc_start_main () from /lib64/libc.so.6
#11 0x0000000000403309 in ?? ()
#12 0x00007ffffff0a0f8 in ?? ()
#13 0x0000000000000000 in ?? ()
Comment 4 Bill Nottingham 2006-08-15 23:23:18 EDT
I believe this is related to the fix for bug 198032 - cc'ing ncurses maintainer.
Comment 5 Bill Nottingham 2006-08-15 23:31:21 EDT
This works around it in screen, FWIW:

diff -ru screen-4.0.2/termcap.c screen-4.0.2.foo/termcap.c
--- screen-4.0.2/termcap.c      2003-09-08 10:45:36.000000000 -0400
+++ screen-4.0.2.foo/termcap.c  2006-08-15 23:38:06.000000000 -0400
@@ -1333,7 +1333,7 @@
   xseteuid(real_uid);
   xsetegid(real_gid);
 #endif
-  r = tgetent(bp, name);
+  r = tgetent(NULL, name);
 #ifdef USE_SETEUID
   xseteuid(eff_uid);
   xsetegid(eff_gid);
Comment 6 Bill Nottingham 2006-08-15 23:37:58 EDT
CC'ing upstream ncurses maintainer - this appears to be an issue with the new
ncurses tgetent caching logic.
Comment 7 Jesse Keating 2006-08-15 23:59:13 EDT
I've confirmed that the workaround does work around the issue.

I'm going to do some cleanup in the spec file, but we want to wait for a proper
fix before pushing a new package.
Comment 8 Miroslav Lichvar 2006-08-16 04:39:22 EDT
This is an ncurses bug. Minimal code to reproduce it:

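#include <curses.h>
#include <term.h>   /* declares tgetent(); link with -lncurses */
int main(void)
{
    char b1[2048], b2[2048], b3[2048];   /* buffer sizes assumed; not in the original */
    tgetent(b2, "rxvt");
    tgetent(b1, "rxvt");
    tgetent(b3, "screen");
    tgetent(b1, "rxvt");
    tgetent(b3, "screen.rxvt");
    tgetent(b3, "screen");
    tgetent(b1, "rxvt");
    return 0;
}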
Comment 9 Thomas E. Dickey 2006-08-16 06:04:37 EDT
Someone reported a problem with the fix a couple of weeks later.

That fix (the most recent one) is in 20060715.  What patchlevel
of ncurses is the RPM based on?
Comment 10 Miroslav Lichvar 2006-08-16 06:13:06 EDT
The package is based on the 20060715 patchlevel, so this bug is something a bit
different.
Comment 11 Thomas E. Dickey 2006-08-16 06:23:32 EDT
Thanks - then that could be an error in the cache logic.
I'll investigate it this evening.
Comment 12 Miroslav Lichvar 2006-08-16 10:41:05 EDT
Created attachment 134313 [details]
Patch fixing the tgetent bug.

It crashes when two cache records have the same last_bufp and both are deleted.
Another problem is that last_bufp isn't set to 0 when the terminal description
isn't found. The patch should fix both.
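
To make the first failure mode concrete, here is a toy model in C. It is not
the ncurses source or the attached patch: the record layout, SLOTS, MAX_TERMS,
model_free, purge_naive, purge_guarded and frees_after_purge are all
hypothetical names. It also assumes that the two records sharing a last_bufp
refer to the same terminal object, which the report does not spell out. Frees
are counted rather than performed, so the double free shows up as a number
instead of a crash.

```c
#include <assert.h>
#include <stddef.h>

#define SLOTS 4            /* hypothetical cache size */
#define MAX_TERMS 8        /* hypothetical limit on modelled terminals */

/* One cache record: the caller buffer last seen and the terminal it loaded. */
struct record {
    const void *last_bufp; /* NULL means "no buffer recorded" */
    int term;              /* index into free_count[], or -1 for none */
};

static int free_count[MAX_TERMS]; /* how many times each terminal was freed */

/* Stand-in for freeing a terminal: count the call instead of calling free(). */
static void model_free(int term)
{
    assert(term >= 0 && term < MAX_TERMS);
    free_count[term]++;
}

/* Naive purge: free the terminal of every record that matches bufp. */
static void purge_naive(struct record *cache, const void *bufp)
{
    for (int i = 0; i < SLOTS; i++) {
        if (cache[i].last_bufp == bufp && cache[i].term >= 0) {
            model_free(cache[i].term);
            cache[i].term = -1;
        }
    }
}

/* Guarded purge: after freeing, clear every record that aliases the terminal. */
static void purge_guarded(struct record *cache, const void *bufp)
{
    for (int i = 0; i < SLOTS; i++) {
        if (cache[i].last_bufp == bufp && cache[i].term >= 0) {
            int term = cache[i].term;
            model_free(term);
            for (int j = 0; j < SLOTS; j++)
                if (cache[j].term == term)
                    cache[j].term = -1;
        }
    }
}

/* Build two records sharing one buffer and one terminal, purge them, and
 * report how many times terminal 0 was "freed". */
static int frees_after_purge(int guarded)
{
    static char buf[16];
    struct record cache[SLOTS] = {
        { buf, 0 }, { buf, 0 }, { NULL, -1 }, { NULL, -1 }
    };
    free_count[0] = 0;
    if (guarded)
        purge_guarded(cache, buf);
    else
        purge_naive(cache, buf);
    return free_count[0];
}
```

Under these assumptions, frees_after_purge(0) reports two frees of the same
terminal, while frees_after_purge(1) reports one.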
Comment 13 Miroslav Lichvar 2006-08-16 11:07:20 EDT
Fixed in ncurses-5.5-23.20060715.
Comment 14 Thomas E. Dickey 2006-08-16 15:47:47 EDT
The assignment to LAST_BUF appears to be unnecessary.
Otherwise the patch seems ok (tested for leaks, etc).
Comment 15 Miroslav Lichvar 2006-08-17 03:15:59 EDT
Thomas, without the assignment screen still doesn't work correctly.
The explanation is that after an unsuccessful tgetent there will be a cache
record with the old LAST_BUF, and when tgetent is called with this value it
will delete the wrong LAST_TRM.
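
One way to picture this is the toy model below. It is a sketch under stated
assumptions, not the ncurses code: struct record, failed_lookup,
delete_on_match and scenario are hypothetical, the field names only echo
LAST_BUF and LAST_TRM, and the model assumes the terminal held by the stale
record is still in use elsewhere.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cache record; the field names echo the report, not ncurses. */
struct record {
    const void *last_bufp; /* buffer passed by the caller that filled the slot */
    int last_trm;          /* id of the loaded terminal, or -1 for none */
};

/* Model of a lookup that fails to find a terminal description.
 * With reset_on_failure set, the record forgets its buffer; without it,
 * the old buffer pointer is left in place. */
static void failed_lookup(struct record *rec, int reset_on_failure)
{
    assert(rec != NULL);
    if (reset_on_failure)
        rec->last_bufp = NULL;
}

/* Model of the start of a later call: if the record matches the caller's
 * buffer, its terminal is deleted. Returns the deleted id, or -1 if none. */
static int delete_on_match(struct record *rec, const void *bufp)
{
    int deleted = -1;
    assert(rec != NULL);
    if (rec->last_bufp != NULL && rec->last_bufp == bufp && rec->last_trm >= 0) {
        deleted = rec->last_trm;
        rec->last_trm = -1;
    }
    return deleted;
}

/* Scenario: a record is left over from an earlier call with buf, a lookup
 * then fails, and the caller reuses buf. Returns the terminal id deleted by
 * the final call, or -1 if nothing was deleted. */
static int scenario(int reset_on_failure)
{
    static char buf[16];
    struct record rec = { buf, 7 }; /* terminal 7 is assumed to be still in use */
    failed_lookup(&rec, reset_on_failure);
    return delete_on_match(&rec, buf);
}
```

In this model scenario(0) deletes terminal 7 through the stale buffer pointer,
while scenario(1), which clears the pointer on failure, deletes nothing.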
Comment 16 Steven Haigh 2006-08-28 08:39:23 EDT
I'm not 100% sure whether this is related, but if I ssh into a box running
screen and reattach an existing session (screen -DR), then log in again via
another ssh session and bring up screen in 'multi-screen' mode (screen -x), the
irssi that I always have up goes completely mental.

The displayed text scatters all over the screen and is unusable. The only way
to recover is to detach the screwed screen (^a ^d), then use screen -DR to kill
all sessions and attach again.

This basically means that using screen in -x mode is impossible.
Comment 17 Steven Haigh 2006-08-28 08:41:08 EDT
Oh, and I forgot to mention, this is with screen-4.0.2-15.1 and
ncurses-5.5-23.20060715. Both seem to be the current version in rawhide.
Comment 18 Miroslav Lichvar 2006-08-29 10:57:24 EDT
Ah, another one. I can reproduce it, and when I disable caching in tgetent it
works fine, so it seems related to the problem.
Comment 19 Miroslav Lichvar 2006-08-31 08:07:58 EDT
ncurses-5.5-24.20060715 has a patch that modifies the tgetstr function a bit, so
that it returns a pointer into the area provided by the caller instead of a
pointer to an internal ncurses structure. This should make screen happy again.
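
The difference can be sketched with a small stand-in function. This is only an
illustration of the termcap convention that tgetstr copies the capability into
the caller's area and advances the area pointer. copy_to_area and
returns_pointer_into_area are hypothetical helpers, not the ncurses
implementation or the patch, and the report does not say why screen depends on
this behaviour.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy an internally stored capability string into the caller's area and
 * return a pointer to that copy. *area is advanced past the terminating NUL.
 * The caller is responsible for the area being large enough.
 * If no area is supplied, the internal string itself is returned. */
static const char *copy_to_area(const char *internal, char **area)
{
    char *result;
    assert(internal != NULL);
    if (area == NULL || *area == NULL)
        return internal;
    result = *area;
    strcpy(result, internal);
    *area += strlen(internal) + 1;
    return result;
}

/* Check that the returned pointer lies in the caller's storage and that the
 * area pointer moved past the copied string. Returns 1 if both hold. */
static int returns_pointer_into_area(void)
{
    static const char internal[] = "\033[H"; /* example capability value */
    char storage[32];
    char *area = storage;
    const char *cap = copy_to_area(internal, &area);
    return cap == storage
        && area == storage + sizeof(internal)
        && strcmp(cap, internal) == 0;
}
```

With this convention the returned string lives in memory owned by the caller
rather than in a structure the library may later free or reuse.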
