Description of problem:
Start two terminals, and:

  Term 1: screen -S test
  Term 2: screen -rd test
  Term 1: screen -rd test
  Term 2: screen -rd test

(obviously waiting until the detach/attach completes for each screen -rd). On the third screen -rd, the master screen process goes away, the dungeon collapses, and you die.

Version-Release number of selected component (if applicable):
screen-4.0.2-15.1

How reproducible:
Always.
I'm able to replicate this after updating multiple systems to rawhide. This is a severe problem and makes screen rather less than useful.
Hrm, installing older releases of screen (4.0.1) isn't helping. There must be some other change in the distribution that is causing screen to freak.
I coaxed a backtrace out of gdb:

#0  0x00002b139fd9388b in free () from /lib64/libc.so.6
#1  0x00002b139f4bbf32 in _nc_free_termtype () from /usr/lib64/libncurses.so.5
#2  0x00002b139f4bc643 in del_curterm () from /usr/lib64/libncurses.so.5
#3  0x00002b139f4beab0 in tgetent () from /usr/lib64/libncurses.so.5
#4  0x0000000000421e15 in getlogin ()
#5  0x00000000004233f8 in getlogin ()
#6  0x0000000000416796 in _nc_timed_wait ()
#7  0x00000000004175db in _nc_timed_wait ()
#8  0x000000000044029e in getlogin ()
#9  0x0000000000407e34 in ?? ()
#10 0x00002b139fd42aa4 in __libc_start_main () from /lib64/libc.so.6
#11 0x0000000000403309 in ?? ()
#12 0x00007ffffff0a0f8 in ?? ()
#13 0x0000000000000000 in ?? ()
I believe this is related to the fix for bug 198032 - cc'ing ncurses maintainer.
This works around it in screen, FWIW:

diff -ru screen-4.0.2/termcap.c screen-4.0.2.foo/termcap.c
--- screen-4.0.2/termcap.c 2003-09-08 10:45:36.000000000 -0400
+++ screen-4.0.2.foo/termcap.c 2006-08-15 23:38:06.000000000 -0400
@@ -1333,7 +1333,7 @@
   xseteuid(real_uid);
   xsetegid(real_gid);
 #endif
-  r = tgetent(bp, name);
+  r = tgetent(NULL, name);
 #ifdef USE_SETEUID
   xseteuid(eff_uid);
   xsetegid(eff_gid);
CC'ing upstream ncurses maintainer - this appears to be an issue with the new ncurses tgetent caching logic.
I've confirmed that the workaround does work around the issue. I'm going to do some cleanup in the spec file, but we want to wait for a proper fix before pushing a new package.
This is an ncurses bug. Minimal code to reproduce it:

    tgetent(b2, "rxvt");
    tgetent(b1, "rxvt");
    tgetent(b3, "screen");
    tgetent(b1, "rxvt");
    tgetent(b3, "screen.rxvt");
    tgetent(b3, "screen");
    tgetent(b1, "rxvt");
Someone reported a problem with the fix a couple of weeks later. That fix (the most recent one) is in 20060715. What patchlevel of ncurses is the RPM based on?
The package is based on the 20060715 patchlevel, so this bug is something a bit different.
thanks - then that could be an error in the cache logic. I'll investigate it this evening.
Created attachment 134313
Patch fixing the tgetent bug.

It crashes when two cache records have the same last_bufp and both are deleted. Another problem is that last_bufp isn't set to 0 when the terminal description isn't found. The patch should fix both.
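To illustrate the first failure mode, here is a toy model in plain C, not the ncurses code. It assumes a small cache whose records are keyed by the caller's buffer pointer and each "own" a terminal, and it counts releases instead of actually freeing memory. All names (model_slot, model_cache, model_drop, model_double_release) and the sizes are made up for this sketch, and the alias-clearing option is just one possible way to avoid the double release, not necessarily what the attached patch does.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_SLOTS 4 /* hypothetical cache size */
#define MODEL_TERMS 8 /* hypothetical number of terminal ids */

struct model_slot {
    const void *last_bufp; /* key: the caller's buffer pointer */
    int term_id;           /* terminal owned by this record, 0 = none */
};

struct model_cache {
    struct model_slot slot[MODEL_SLOTS];
    int releases[MODEL_TERMS]; /* releases[id] counts releases of terminal id */
};

/* Drop every record keyed by bufp, releasing the terminal each one owns.
 * If clear_aliases is nonzero, other records that refer to the same
 * terminal are emptied first, so the terminal is released only once. */
void model_drop(struct model_cache *c, const void *bufp, int clear_aliases)
{
    for (int i = 0; i < MODEL_SLOTS; ++i) {
        struct model_slot *s = &c->slot[i];
        if (s->last_bufp != bufp || s->term_id == 0)
            continue;
        int id = s->term_id;
        assert(id > 0 && id < MODEL_TERMS);
        c->releases[id]++; /* stands in for freeing the terminal */
        s->term_id = 0;
        s->last_bufp = NULL;
        if (clear_aliases) {
            for (int j = 0; j < MODEL_SLOTS; ++j) {
                if (c->slot[j].term_id == id) {
                    c->slot[j].term_id = 0;
                    c->slot[j].last_bufp = NULL;
                }
            }
        }
    }
}

/* Build the problematic state (two records share a key and a terminal),
 * drop them, and report how often terminal 1 was released. */
int model_double_release(int clear_aliases)
{
    static char buf[1];
    struct model_cache c = {0};
    c.slot[0].last_bufp = buf;
    c.slot[0].term_id = 1;
    c.slot[2].last_bufp = buf;
    c.slot[2].term_id = 1;
    model_drop(&c, buf, clear_aliases);
    return c.releases[1];
}
```

In this model, model_double_release(0) reports two releases of the same terminal, which in real code would be a double free like the one in the backtrace; model_double_release(1) reports one.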
Fixed in ncurses-5.5-23.20060715.
The assignment to LAST_BUF appears to be unnecessary. Otherwise the patch seems ok (tested for leaks, etc).
Thomas, without the assignment screen is still not working correctly. The explanation is that after an unsuccessful tgetent there will be a cache record with the old LAST_BUF, and when tgetent is called with this value it will delete the wrong LAST_TRM.
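One possible reading of that explanation, as a toy model in plain C rather than the actual ncurses source. The assumption here (mine, not confirmed anywhere in this report) is that a reused cache record always picks up the current terminal, while its key is only updated when the lookup succeeds. The names model_slot, model_reuse, model_would_release and model_stale_key are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct model_slot {
    const void *last_bufp; /* key: buffer pointer of the last call using this record */
    int term_id;           /* terminal this record would release on reuse, 0 = none */
};

/* Reuse `slot` for a lookup made with `bufp`.
 * Assumption of this model: the record's terminal is set to cur_term on
 * every call, but the key is only updated when the lookup succeeds. */
void model_reuse(struct model_slot *slot, const void *bufp, int cur_term,
                 int found, int reset_key_on_failure)
{
    slot->term_id = cur_term;
    if (found)
        slot->last_bufp = bufp;
    else if (reset_key_on_failure)
        slot->last_bufp = NULL; /* analogous to clearing LAST_BUF on failure */
}

/* Terminal that a later call with `bufp` would release through this
 * record, or 0 if the record does not match. */
int model_would_release(const struct model_slot *slot, const void *bufp)
{
    if (bufp != NULL && slot->last_bufp == bufp)
        return slot->term_id;
    return 0;
}

/* A record left by an earlier call with old_buf (terminal 1) is reused by a
 * failed lookup with new_buf while terminal 2 is current. Report which
 * terminal a later call with old_buf would release. */
int model_stale_key(int reset_key_on_failure)
{
    static char old_buf[1], new_buf[1];
    struct model_slot slot = { old_buf, 1 };
    model_reuse(&slot, new_buf, 2, 0, reset_key_on_failure);
    return model_would_release(&slot, old_buf);
}
```

Without the reset, the record still matches old_buf but now refers to terminal 2, so a later call with old_buf would release a terminal that record never loaded. With the reset, the record no longer matches anything.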
I'm not 100% sure whether this is related, but if I ssh into a box running screen and reattach an existing session (screen -DR), then log in again via another ssh session and bring up screen in 'multi-screen' mode (screen -x), the irssi that I always have up goes completely mental. The displayed text scatters all over the screen and is unusable. The only way to recover is to detach the screwed screen (^a ^d), then use screen -DR to kill all sessions and attach again. This basically means that using screen in -x mode is impossible.
Oh, and I forgot to mention, this is with screen-4.0.2-15.1 and ncurses-5.5-23.20060715. Both seem to be the current version in rawhide.
Ah, another one. I can reproduce it, and when I disable caching in tgetent it works fine, so it seems related to the problem.
ncurses-5.5-24.20060715 has a patch that modifies the tgetstr function a bit, so that it returns a pointer into the provided area instead of into an internal ncurses structure. This should make screen happy again.
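For reference, the behaviour described here (the returned pointer lies inside the caller's area) can be sketched without ncurses as below. This is only a model of the documented termcap convention that tgetstr copies the value into *area and advances *area past the terminating NUL; model_copy_to_area is a hypothetical name and is not the patched ncurses function.

```c
#include <assert.h>
#include <string.h>

/* Copy a capability string into the caller-supplied area, advance *area
 * past the terminating NUL, and return a pointer to the copy.
 * The caller must make sure the area is large enough. */
char *model_copy_to_area(const char *value, char **area)
{
    assert(value != NULL && area != NULL && *area != NULL);
    char *copy = *area;
    size_t len = strlen(value) + 1; /* include the terminating NUL */
    memcpy(copy, value, len);
    *area += len;
    return copy;
}
```

A caller that keeps the returned pointers then depends only on its own storage, so they stay valid even if the library later frees or replaces its internal terminal data.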