Red Hat Bugzilla – Full Text Bug Listing
|Summary:||screen(1) reliably crashes on third attach|
|Product:||[Fedora] Fedora||Reporter:||Adam Jackson <ajax>|
|Component:||ncurses||Assignee:||Miroslav Lichvar <mlichvar>|
|Status:||CLOSED RAWHIDE||QA Contact:|
|Version:||rawhide||CC:||dcantrell, dickey, mlichvar, notting|
|Fixed In Version:||Doc Type:||Bug Fix|
|Doc Text:||Story Points:||---|
|Last Closed:||2006-08-16 11:07:20 EDT||Type:||---|
Description Adam Jackson 2006-08-14 15:00:46 EDT
Description of problem:
Start two terminals, and:
  Term 1: screen -S test
  Term 2: screen -rd test
  Term 1: screen -rd test
  Term 2: screen -rd test
(obviously waiting until the detach/attach completes for each screen -rd). On the third screen -rd, the master screen process goes away, the dungeon collapses, and you die.

Version-Release number of selected component (if applicable): screen-4.0.2-15.1

How reproducible: Always.
Comment 1 Jesse Keating 2006-08-15 22:21:31 EDT
I'm able to replicate this after updating multiple systems to rawhide. This is a severe problem and makes screen rather less than useful.
Comment 2 Jesse Keating 2006-08-15 22:31:11 EDT
Hrm, installing older releases of screen (4.0.1) isn't helping. There must be some other change in the distribution that is causing screen to freak.
Comment 3 Jesse Keating 2006-08-15 23:15:37 EDT
I coaxed a backtrace out of gdb:
#0  0x00002b139fd9388b in free () from /lib64/libc.so.6
#1  0x00002b139f4bbf32 in _nc_free_termtype () from /usr/lib64/libncurses.so.5
#2  0x00002b139f4bc643 in del_curterm () from /usr/lib64/libncurses.so.5
#3  0x00002b139f4beab0 in tgetent () from /usr/lib64/libncurses.so.5
#4  0x0000000000421e15 in getlogin ()
#5  0x00000000004233f8 in getlogin ()
#6  0x0000000000416796 in _nc_timed_wait ()
#7  0x00000000004175db in _nc_timed_wait ()
#8  0x000000000044029e in getlogin ()
#9  0x0000000000407e34 in ?? ()
#10 0x00002b139fd42aa4 in __libc_start_main () from /lib64/libc.so.6
#11 0x0000000000403309 in ?? ()
#12 0x00007ffffff0a0f8 in ?? ()
#13 0x0000000000000000 in ?? ()
Comment 4 Bill Nottingham 2006-08-15 23:23:18 EDT
I believe this is related to the fix for bug 198032 - cc'ing ncurses maintainer.
Comment 5 Bill Nottingham 2006-08-15 23:31:21 EDT
This works around it in screen, FWIW:
diff -ru screen-4.0.2/termcap.c screen-4.0.2.foo/termcap.c
--- screen-4.0.2/termcap.c 2003-09-08 10:45:36.000000000 -0400
+++ screen-4.0.2.foo/termcap.c 2006-08-15 23:38:06.000000000 -0400
@@ -1333,7 +1333,7 @@
   xseteuid(real_uid);
   xsetegid(real_gid);
 #endif
-  r = tgetent(bp, name);
+  r = tgetent(NULL, name);
 #ifdef USE_SETEUID
   xseteuid(eff_uid);
   xsetegid(eff_gid);
Comment 6 Bill Nottingham 2006-08-15 23:37:58 EDT
CC'ing upstream ncurses maintainer - this appears to be an issue with the new ncurses tgetent caching logic.
Comment 7 Jesse Keating 2006-08-15 23:59:13 EDT
I've confirmed that the workaround does work around the issue. I'm going to do some cleanup in the spec file, but we want to wait for a proper fix before pushing a new package.
Comment 8 Miroslav Lichvar 2006-08-16 04:39:22 EDT
This is an ncurses bug. Minimal code to reproduce it:
tgetent(b2, "rxvt");
tgetent(b1, "rxvt");
tgetent(b3, "screen");
tgetent(b1, "rxvt");
tgetent(b3, "screen.rxvt");
tgetent(b3, "screen");
tgetent(b1, "rxvt");
Comment 9 Thomas E. Dickey 2006-08-16 06:04:37 EDT
Someone reported a problem with the fix a couple of weeks later. That fix (the most recent one) is in 20060715. What patchlevel of ncurses is the RPM based on?
Comment 10 Miroslav Lichvar 2006-08-16 06:13:06 EDT
The package is based on the 20060715 patchlevel, so this bug is something a bit different.
Comment 11 Thomas E. Dickey 2006-08-16 06:23:32 EDT
thanks - then that could be an error in the cache logic. I'll investigate it this evening.
Comment 12 Miroslav Lichvar 2006-08-16 10:41:05 EDT
Created attachment 134313: Patch fixing the tgetent bug.
It crashes when two cache records have the same last_bufp and both are deleted. Another problem is that last_bufp isn't set to 0 when the terminal description isn't found. The patch should fix both.
Comment 13 Miroslav Lichvar 2006-08-16 11:07:20 EDT
Fixed in ncurses-5.5-23.20060715.
Comment 14 Thomas E. Dickey 2006-08-16 15:47:47 EDT
The assignment to LAST_BUF appears to be unnecessary. Otherwise the patch seems ok (tested for leaks, etc).
Comment 15 Miroslav Lichvar 2006-08-17 03:15:59 EDT
Thomas, without the assignment screen still doesn't work correctly. The explanation is that after an unsuccessful tgetent there will be a cache record with the old LAST_BUF, and when tgetent is later called with this value it will delete the wrong LAST_TRM.
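The failure mode described in comments 12 and 15 can be sketched with a small stand-alone model. This is not the ncurses source and not the attached patch: the record layout, CACHE_SLOTS, live_terms, release_records_for() and model_tgetent() are all hypothetical names invented for illustration, and the `found` flag stands in for the real terminfo lookup. The sketch only shows the idea that every record matching the caller's buffer is released exactly once, and that no stale buffer pointer is left behind when a lookup fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_SLOTS 4            /* hypothetical cache size */

/* Hypothetical stand-in for one tgetent cache record. */
struct cache_rec {
    const char *last_bufp;       /* caller buffer the record was made for */
    char *last_term;             /* heap-allocated description, or NULL */
};

static struct cache_rec cache[CACHE_SLOTS];
static int live_terms;           /* descriptions currently allocated */

/* Release every record created for bufp. Both fields are cleared so a
 * later call with the same bufp cannot free the same pointer twice. */
static void release_records_for(const char *bufp)
{
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].last_term != NULL && cache[i].last_bufp == bufp) {
            free(cache[i].last_term);
            cache[i].last_term = NULL;
            cache[i].last_bufp = NULL;
            live_terms--;
        }
    }
}

/* Model of one lookup. Returns 1 if a description was stored, 0 if the
 * terminal was "not found", -1 if the model ran out of slots or memory.
 * `found` replaces the real database lookup (an assumption of this sketch). */
static int model_tgetent(const char *bufp, const char *name, int found)
{
    assert(name != NULL);
    release_records_for(bufp);
    if (!found)
        return 0;                /* nothing stored, so no stale last_bufp */

    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].last_term == NULL) {
            char *copy = malloc(strlen(name) + 1);
            if (copy == NULL)
                return -1;
            strcpy(copy, name);
            cache[i].last_term = copy;
            cache[i].last_bufp = bufp;
            live_terms++;
            return 1;
        }
    }
    return -1;
}
```

In this model, a failed lookup for a buffer followed by a successful one for the same buffer frees the old description once and allocates a new one, which is the sequence the reproducer in comment 8 exercises with "screen.rxvt" and "screen".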
Comment 16 Steven Haigh 2006-08-28 08:39:23 EDT
I'm not 100% sure if this is related or not, however if I ssh into a box running screen, and reattach an existing session (screen -DR), then log in again via another ssh session and bring up screen in 'multi-screen' mode (screen -x), then the irssi that I always have running goes completely mental. The displayed text scatters all over the screen and is unusable. The only way to recover is to detach the screwed screen (^a ^d), then use screen -DR to kill all sessions and attach again. This basically means that using screen in -x mode is impossible.
Comment 17 Steven Haigh 2006-08-28 08:41:08 EDT
Oh, and I forgot to mention, this is with screen-4.0.2-15.1 and ncurses-5.5-23.20060715. Both seem to be the current versions in rawhide.
Comment 18 Miroslav Lichvar 2006-08-29 10:57:24 EDT
Ah, another one. I can reproduce it, and when I disable caching in tgetent it works fine, so it seems related to the problem.
Comment 19 Miroslav Lichvar 2006-08-31 08:07:58 EDT
ncurses-5.5-24.20060715 has a patch that modifies the tgetstr function a bit, so that it returns a pointer into the provided area instead of into an internal ncurses structure. This should make screen happy again.
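As a rough illustration of the behaviour described in comment 19, the sketch below copies a capability string into the caller's area and advances the area pointer past the terminating NUL, which is the documented termcap convention for tgetstr. The helper name copy_cap_to_area() is hypothetical and this is not the ncurses implementation. The point is only that the returned pointer belongs to the caller's buffer, so it presumably stays valid even if the library later frees its own copy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `value` (a capability string owned by the
 * library) into the caller's area, advance *area past the copied NUL,
 * and return a pointer to the start of the copy.
 * Like traditional termcap, this sketch does no bounds checking, so
 * the caller must supply a large enough buffer. */
static char *copy_cap_to_area(const char *value, char **area)
{
    if (value == NULL || area == NULL || *area == NULL)
        return NULL;

    char *start = *area;
    size_t len = strlen(value) + 1;   /* include the terminating NUL */

    memcpy(start, value, len);
    *area = start + len;              /* next capability goes after this one */
    return start;
}
```

A caller would typically declare one buffer, point `area` at its start, and call the helper once per capability; each returned pointer then refers to a distinct region of that buffer.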