Bug 784504 - GNU screen crashes randomly at least once after every reboot
Summary: GNU screen crashes randomly at least once after every reboot
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: screen
Version: 16
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Lukáš Nykrýn
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-01-25 07:47 UTC by Stefan Krüger
Modified: 2012-04-18 19:29 UTC
CC List: 2 users

Fixed In Version: screen-4.1.0-0.7.20110328git8cf5ef.fc16
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-04-18 19:29:04 UTC
Type: ---
Embargoed:


Description Stefan Krüger 2012-01-25 07:47:12 UTC
Description of problem:
Every time I reboot my system, screen crashes at least once. This happens randomly and I have not been able to track it down. All I see is:

[screen caught signal 11. (core dumped)]
Connection to f16srv closed.

And then this:

$ screen -ls
There is a screen on:
	1382.tty2.f16srv	(Dead ???)
Remove dead screens with 'screen -wipe'.
1 Socket in /var/run/screen/S-stadtkind.

Version-Release number of selected component (if applicable):
screen-4.1.0-0.5.20110328git8cf5ef.fc16.x86_64

How reproducible:
Happens at least once every time I reboot my system.

Steps to Reproduce:
1. Reboot Fedora.
2. Start screen like this: exec /usr/bin/screen -U
3. Work in mutt or slrn for a while; this usually, but not always, makes screen crash.
  
Actual results:
Screen crashes randomly. I have not yet been able to pin it down to a specific action.


Expected results:
Screen should not crash. If it does, a message should appear in /var/log/messages and a core file should be written (I have not found one, even though screen says it dumped core).


Additional info:
If I start screen in a gdb session, how should I do that to get something meaningful out of it?

Comment 1 Stefan Krüger 2012-03-01 21:17:13 UTC
Does this help? As I said, this is reproducible!
(29380 == PID of the /usr/bin/SCREEN process)

$ gdb -p 29380 
GNU gdb (GDB) Fedora (7.3.50.20110722-10.fc16)
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Attaching to process 29380
Reading symbols from /usr/bin/screen...Reading symbols from /usr/lib/debug/usr/bin/screen.debug...done.
done.
Reading symbols from /lib64/libtinfo.so.5...Reading symbols from /usr/lib/debug/lib64/libtinfo.so.5.9.debug...done.
done.
Loaded symbols for /lib64/libtinfo.so.5
Reading symbols from /usr/lib64/libutempter.so.0...Reading symbols from /usr/lib/debug/usr/lib64/libutempter.so.1.1.5.debug...done.
done.
Loaded symbols for /usr/lib64/libutempter.so.0
Reading symbols from /lib64/libcrypt.so.1...Reading symbols from /usr/lib/debug/lib64/libcrypt-2.14.90.so.debug...done.
done.
Loaded symbols for /lib64/libcrypt.so.1
Reading symbols from /lib64/libpam.so.0...Reading symbols from /usr/lib/debug/lib64/libpam.so.0.83.1.debug...done.
done.
Loaded symbols for /lib64/libpam.so.0
Reading symbols from /lib64/libc.so.6...Reading symbols from /usr/lib/debug/lib64/libc-2.14.90.so.debug...done.
done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/libfreebl3.so...Reading symbols from /usr/lib/debug/lib64/libfreebl3.so.debug...done.
done.
Loaded symbols for /lib64/libfreebl3.so
Reading symbols from /lib64/libaudit.so.1...Reading symbols from /usr/lib/debug/lib64/libaudit.so.1.0.0.debug...done.
done.
Loaded symbols for /lib64/libaudit.so.1
Reading symbols from /lib64/libdl.so.2...Reading symbols from /usr/lib/debug/lib64/libdl-2.14.90.so.debug...done.
done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from /usr/lib/debug/lib64/ld-2.14.90.so.debug...done.
done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libnss_files.so.2...Reading symbols from /usr/lib/debug/lib64/libnss_files-2.14.90.so.debug...done.
done.
Loaded symbols for /lib64/libnss_files.so.2
0x00007f2ba3a3a3e3 in __select_nocancel ()
    at ../sysdeps/unix/syscall-template.S:82
82      T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) cont
Continuing.
Detaching after fork from child process 32349.
Detaching after fork from child process 32350.
Detaching after fork from child process 32351.
Detaching after fork from child process 32352.

Program received signal SIGSEGV, Segmentation fault.
0x0000000000417f34 in ChangeWindowSize (p=0x11044d0, wi=80, he=50, hi=2000)
    at resize.c:750
750               if (ml->image[p->w_width] == ' ')
(gdb) thread apply all bt full

Thread 1 (process 29380):
#0  0x0000000000417f34 in ChangeWindowSize (p=0x11044d0, wi=80, he=50, hi=2000)
    at resize.c:750
        mlf = 0x0
        mlt = 0x0
        ml = <optimized out>
        nmlines = 0x12b8120
        nhlines = 0x1227d70
        fy = 2023
        ty = 2049
        l = <optimized out>
        lx = <optimized out>
        lf = <optimized out>
        lt = <optimized out>
        yy = 2024
        oty = <optimized out>
        addone = 0
        ncx = 0
        ncy = 75
        naka = 0
        t = <optimized out>
        y = <optimized out>
        shift = -51
#1  0x00000000004186b4 in LeaveAltScreen (p=0x11044d0) at resize.c:1105
No locals.
#2  0x000000000040e798 in DoCSI (c=<optimized out>,
    intermediate=<optimized out>) at ansi.c:1441
        i = 0
        a1 = 1049
        a2 = <optimized out>
#3  0x000000000041115f in WriteString (wp=<optimized out>,
    buf=0x7fff06e1c933 "\r\033[?1l\033>", ' ' <repeats 24 times>, "\033[36;1H> > Hallo,", ' ' <repeats 70 times>, "\033[37;1H> >", ' ' <repeats 71 times>...,
    len=9) at ansi.c:590
        c = 108
        font = <optimized out>
        cv = <optimized out>
#4  0x000000000042043f in win_readev_fn (ev=<optimized out>, data=0x11044d0 "")
    at window.c:1934
        p = 0x11044d0
        buf = "\000\r\033[37m\033[40m", ' ' <repeats 27 times>, "\r\033[39;49m\033[m\017\033[50;1H\033[?1049l\r\033[?1l\033>", ' ' <repeats 24 times>, "\033[36;1H> > Hallo,", ' ' <repeats 70 times>, "\033[37;1H> >", ' ' <repeats 77 times>, "\033[38;1H> > ja ich fahre am 06.04.12 zu Oma. Wenn ich euch mitnehmen soll, mach ich     "...
        bp = 0x7fff06e1c8f1 "\r\033[37m\033[40m", ' ' <repeats 27 times>, "\r\033[39;49m\033[m\017\033[50;1H\033[?1049l\r\033[?1l\033>", ' ' <repeats 24 times>, "\033[36;1H> > Hallo,", ' ' <repeats 70 times>, "\033[37;1H> >     "...
        size = <optimized out>
        len = 74
        wtop = <optimized out>
#5  0x000000000044adf3 in sched () at sched.c:237
        ev = 0x1104598
        r = {__fds_bits = {32, 0 <repeats 15 times>}}
        w = {__fds_bits = {0 <repeats 16 times>}}
        set = <optimized out>
        timeoutev = <optimized out>
        timeout = {tv_sec = 43, tv_usec = 659575}
        nsel = 0
#6  0x0000000000405ccb in main (ac=<optimized out>, av=<optimized out>)
    at screen.c:1481
        n = <optimized out>
        ap = <optimized out>
        av0 = <optimized out>
        socknamebuf = "29380.tty1.sonne\000\t\226\243+\177\000\000 dW\244+\177\000\000\000\000\000\000\001\000\000\000\202\b\000\000\001", '\000' <repeats 11 times>"\340, \225y\244+\177\000\000P\356\341\006\377\177\000\000\207\360\226|\000\000\000\000\070\352x\244+\177\000\000 \357\341\006\377\177\000\000\230\211y\244+\177\000\000\023\004X\244+\177\000\000\000\000\000\000\000\000\000\000\070\352x\244+\177\000\000\001", '\000' <repeats 15 times>, "\001\000\000\000\000\000\000\000\230\211y\244+\177\000\000\060\352\341\006\377\177\000\000\000\000\000\000\000\000\000\000P>f\000\000\000\000\000\246\065X\244+\177\000\000\001\000\000\000\000\000\000\000\310\004y\244+\177\000\000\000\000\000\000\000\000\000\000\340\225y\244+\177\000\000\234r\226\243\001\000\000\000\246\065X\244\001\000\000\000\001\000\000\000\000\000\000\000"...
        mflag = <optimized out>
        myname = <optimized out>
        SockDir = <optimized out>
        st = {st_dev = 17, st_ino = 56272, st_nlink = 2, st_mode = 16832,
          st_uid = 1001, st_gid = 1001, __pad0 = 0, st_rdev = 0, st_size = 40,
          st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1330503934,
            tv_nsec = 569478892}, st_mtim = {tv_sec = 1330504237,
            tv_nsec = 847756375}, st_ctim = {tv_sec = 1330504237,
            tv_nsec = 847756375}, __unused = {0, 0, 0}}
        oumask = 2
        nwin = {StartAt = -1, aka = 0x0, args = 0x0, dir = 0x0, term = 0x0,
          aflag = -1, flowflag = -1, lflag = -1, histheight = -1,
          monitor = -1, wlock = -1, silence = -1, wrap = -1, Lflag = -1,
          slow = -1, gr = -1, c1 = -1, bce = -1, encoding = -1, hstatus = 0x0,
          charset = 0x0}
        detached = <optimized out>
        sockp = <optimized out>
        sty = 0x0
(gdb)
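
In frames #2 to #4 above, DoCSI is running with a1 = 1049, and the buffer in frame #4 contains ESC [ ? 1049 l. In xterm-style terminals this is the private-mode reset that leaves the alternate screen buffer, which matches the call into LeaveAltScreen in frame #1. The Python sketch below is only an illustration and is not code from screen; the function name private_mode_changes and the sample excerpt are assumptions made for this example. It lists the private-mode changes found in a captured byte stream, which may help when checking which sequence arrived just before a crash.

```python
import re

# DEC private mode set/reset: ESC [ ? <number>[;<number>...] followed by h (set) or l (reset).
_PRIVATE_MODE = re.compile(rb"\x1b\[\?([0-9;]+)([hl])")


def private_mode_changes(data: bytes):
    """Return (mode, enabled) pairs in the order they appear in data.

    Hypothetical helper for this report; screen itself parses these in ansi.c.
    """
    changes = []
    for match in _PRIVATE_MODE.finditer(data):
        enabled = match.group(2) == b"h"
        for field in match.group(1).split(b";"):
            if field:
                changes.append((int(field), enabled))
    return changes


# Excerpt of the buffer from frame #4, with gdb's octal escapes
# (\033, \017) rewritten as Python byte escapes (\x1b, \x0f).
sample = b"\r\x1b[39;49m\x1b[m\x0f\x1b[50;1H\x1b[?1049l\r\x1b[?1l\x1b>"

print(private_mode_changes(sample))
```

Here mode 1049 with enabled set to False corresponds to leaving the alternate screen, and mode 1 is the cursor-key mode being reset. Ordinary color and cursor-position sequences such as ESC [ 39;49 m are ignored because they carry no "?" prefix.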

Comment 2 Lukáš Nykrýn 2012-03-02 08:44:13 UTC
Thank you for the debug information. I will look at it as soon as possible.

Comment 3 Stefan Krüger 2012-03-03 13:55:26 UTC
I have:

altscreen on

in my ~/.screenrc

Comment 4 Lukáš Nykrýn 2012-03-08 14:02:53 UTC
This looks much the same as https://savannah.gnu.org/bugs/index.php?26742, but both patches mentioned there are already in this version of screen.

Comment 5 Erik Falor 2012-03-08 20:49:48 UTC
(In reply to comment #3)
> I have:
> 
> altscreen on
> 
> in my ~/.screenrc

I've been having the same problem as you for many months now.  Your stack trace from the segfault looks just like mine.

I'd been pulling my hair out over this until just last night, when I finally broke through the randomness.  I filed this bug report with upstream:

http://savannah.gnu.org/bugs/?35757

Try the steps to reproduce which I outlined there to see if this is in fact the same bug.

Comment 6 Stefan Krüger 2012-03-09 07:37:39 UTC
Nice catch Erik, that's indeed the same screen bug I'm facing, easily reproducible with your steps.

Thank you.

Comment 7 Lukáš Nykrýn 2012-03-09 09:35:11 UTC
Thank you for the additional information. You are right that this issue was caused by commit f9535294. I was not able to find the exact mistake, but I built the package without this commit and it works fine.
 
I will try to find what is wrong. In the meantime you can use:
http://lnykryn.fedorapeople.org/screen/

Comment 9 Fedora Update System 2012-03-14 11:44:26 UTC
screen-4.1.0-0.6.20110328git8cf5ef.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/screen-4.1.0-0.6.20110328git8cf5ef.fc16

Comment 10 Fedora Update System 2012-03-17 23:35:39 UTC
Package screen-4.1.0-0.6.20110328git8cf5ef.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing screen-4.1.0-0.6.20110328git8cf5ef.fc16'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-4004/screen-4.1.0-0.6.20110328git8cf5ef.fc16
then log in and leave karma (feedback).

Comment 11 Fedora Update System 2012-03-28 15:10:28 UTC
screen-4.1.0-0.7.20110328git8cf5ef.fc16 has been submitted as an update for Fedora 16.
https://admin.fedoraproject.org/updates/screen-4.1.0-0.7.20110328git8cf5ef.fc16

Comment 12 Fedora Update System 2012-04-18 19:29:04 UTC
screen-4.1.0-0.7.20110328git8cf5ef.fc16 has been pushed to the Fedora 16 stable repository.  If problems still persist, please make note of it in this bug report.

