Bug 813607

Summary: cannot use screen in xterm as user, root appears to work
Product: [Fedora] Fedora
Component: screen
Version: 17
Hardware: All
OS: Linux
Status: CLOSED WONTFIX
Severity: medium
Priority: unspecified
Reporter: Brian Johnson <voyager.106>
Assignee: Petr Hracek <phracek>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: hgb, jcp, lnykryn, ovasik, rmilner, trevor
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-08-01 08:33:31 UTC

Description Brian Johnson 2012-04-18 03:08:49 UTC
Description of problem: When trying to use screen in an xterm as a non-root user, an error message flashes (too quickly to read, though it appears to be memory-related) and screen then immediately terminates with "[screen is terminating]". Occasionally GNOME reports a crash in my shell (tcsh).

I am, however, able to sudo to root, run screen, and it runs as expected.


Version-Release number of selected component (if applicable):
screen-4.1.0-0.7.20110819git450e8f.fc17.i686


How reproducible: always


Steps to Reproduce:
1. Start xterm
2. Run 'screen' at the prompt

  
Actual results:
screen immediately terminates with an error followed by "[screen is terminating]"


Expected results:
screen runs normally and allows me to create new virtual terminals within xterm


Additional info:

Comment 1 Lukáš Nykrýn 2012-04-18 12:11:12 UTC
Screen seems to work fine for me with tcsh and xterm in F17.
Could you please update screen to the newest version,
screen-4.1.0-0.9.20120314git3c2946.fc17?
Also, if you have set anything in ~/.screenrc, could you please post it here?

Comment 2 Brian Johnson 2012-04-18 13:51:51 UTC
Thank you for the quick response.

I've experienced this issue on 2 different machines, a laptop running F17beta x86_64 that was updated from a new F17 alpha install, and a desktop running i386 that I updated to F17 beta yesterday from F16 using preupgrade.

Updated my home desktop (i386) to screen-4.1.0-0.9.20120314git3c2946 but can't try it until I get home tonight (and gnome-shell starts working again, which I've filed a bug about :) )

My laptop (x86_64) also has screen-4.1.0-0.9.20120314git3c2946 installed on it and I just tried it again. Unfortunately, I got the same result as previously.

Also of note: it doesn't seem to matter which shell I'm using. I've tried both tcsh and bash and still get the same thing.

I do not have a customized ~/.screenrc file and I've not done anything to the default /etc/screenrc that comes on the system.

Thanks!

Comment 3 Lukáš Nykrýn 2012-04-19 07:25:14 UTC
And if you run this in another terminal, e.g. gnome-terminal, does it also crash?

Comment 4 Brian Johnson 2012-04-19 14:23:17 UTC
Unfortunately, I'm at work and can't try it in gnome-term there, but last night I *did* get a chance to try the updated screen with xterm. Same result.

Did try gnome-term on my x86_64 laptop this morning, though, and got the same result: a quick flash of something, then "screen is terminating". Su'd to root in gnome-term and it worked fine.

(In reply to comment #3)
> And if you run this in another terminal, e.g. gnome-terminal, does it also
> crash?

Comment 5 Brian Johnson 2012-04-19 17:24:12 UTC
I finally did get to see the entirety of the message that flashes, at least on my laptop:


manpath: warning: $MANPATH set, ignoring /etc/man_db.conf
free(207da4d) bad block. (memtop = 0x21bd000 membot = 0x2074000)

Comment 6 Brian Johnson 2012-05-17 14:30:38 UTC
Hello,

I was just curious whether there has been any more movement on this bug. I did a preupgrade from F16 to F17 beta 2 days ago on my work desktop and I'm seeing the same problem on it now.


free(945f24d) bad block. (memtop = 0x9595000 membot = 0x9456000)

thanks!

Comment 7 Lukáš Nykrýn 2012-05-21 08:49:26 UTC
I am still unable to reproduce this problem. I have tried several F17 installations and screen works fine on every one of them. If you are familiar with valgrind, can you try to find the exact line of code where screen is crashing?

Comment 8 Peter Rayner 2012-05-25 04:33:40 UTC
I'm seeing the same thing when connecting via ssh; perhaps it is the shell, not screen.

Comment 9 Brian Johnson 2012-05-25 14:17:10 UTC
Lukas, I'll give valgrind a shot and let you know. I tried just running 'valgrind screen' but that gives me a permission denied problem.

(In reply to comment #7)
> I am still unable to reproduce this problem. I have tried several F17
> installations and screen works fine on every one of them. If you are familiar
> with valgrind, can you try to find the exact line of code where screen is
> crashing?

Comment 10 Brian Johnson 2012-05-25 14:17:46 UTC
Peter -- yeah, after a recent upgrade of my system, I noticed that even just trying to open xterm caused the error; I didn't even have to use screen. It may be a shell issue, but I've seen it in both tcsh (my default) and bash....

(In reply to comment #8)
> I'm seeing the same thing when connecting via ssh; perhaps it is the shell,
> not screen.

Comment 11 Lukáš Nykrýn 2012-06-04 12:51:15 UTC
*** Bug 828174 has been marked as a duplicate of this bug. ***

Comment 12 John C Peterson 2012-06-21 18:52:51 UTC
I'm seeing a similar problem here (installed Fedora 17 beta, and currently up to date as of 20 June, 2012).

I use "sterile" or dedicated user accounts on my system for running Skype and other 3rd party binary applications that might be giving me the computer equivalent of a prostate examination (the snooping activity of Skype was documented several years ago).

I just ssh into the account, and possibly start a vnc server or sometimes just use the X11 forwarding feature to run GUI applications. I've never had any problems with that before, but after upgrading to Fedora 17 beta, I get kicked out with an error message similar to Brian's in comment #6, usually before even getting a command prompt (I did get a prompt on one occasion, but the session exited before I could type anything in). For example:

% env -i ssh skype@localhost
Enter passphrase for key '/home/jcp/.ssh/id_dsa': 
Last login: Wed Jun 20 22:51:26 2012 from localhost
tput: No value for $TERM and no -T specified
free(197344d) bad block. (memtop = 0x1a9b000 membot = 0x195e000)
Connection to localhost closed.

In my case, it does appear to be specific to tcsh. My default shell is tcsh, but when I explicitly specify bash/sh, everything works just fine, as in this example:

% env -i ssh skype@localhost /bin/sh

This might be (just guessing from the error message) a bad call to free(). The address values shown in parentheses after free, and those for memtop and membot, are different every time. But it seems that memtop - membot is always 0x13d000.
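
The memtop - membot arithmetic can be checked against the messages quoted so far in this report. What follows is only a minimal sketch in POSIX sh; the helper name heap_span is made up for illustration, and it does nothing more than the subtraction, so it says nothing about the cause of the bad free().

```sh
# heap_span: hypothetical helper, not part of tcsh or screen.
# Extracts memtop and membot from a "bad block" line and prints
# memtop - membot in hex. Returns 1 if either value is missing.
heap_span() {
    top=$(printf '%s\n' "$1" | sed -n 's/.*memtop = \(0x[0-9a-fA-F]*\).*/\1/p')
    bot=$(printf '%s\n' "$1" | sed -n 's/.*membot = \(0x[0-9a-fA-F]*\).*/\1/p')
    [ -n "$top" ] && [ -n "$bot" ] || return 1
    # Shell arithmetic accepts the 0x prefix as a hexadecimal constant.
    printf '0x%x\n' "$(( $top - $bot ))"
}

# Messages copied from comments 5, 6 and 12 of this report.
heap_span 'free(207da4d) bad block. (memtop = 0x21bd000 membot = 0x2074000)'
heap_span 'free(945f24d) bad block. (memtop = 0x9595000 membot = 0x9456000)'
heap_span 'free(197344d) bad block. (memtop = 0x1a9b000 membot = 0x195e000)'
```

The three calls print 0x149000, 0x13f000 and 0x13d000. Only the last message is from this comment, so the constant difference appears to hold per machine rather than across reports.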

Comment 13 Fedora Admin XMLRPC Client 2013-02-05 12:31:21 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 14 Trevor Cordes 2013-03-11 06:07:40 UTC
This is a tcsh bug, or something directly related to its interaction with other programs.  Someone with permission should change the bug subject.

We aren't the only ones seeing this; I've found Ubuntu and HP-UX users (and more) having the same problem.

The bug for me shows up as:
  ssh root@box2withtcsh
free(82d9a4d) bad block. (memtop = 0x83a8800 membot = 0x82d1000)

I'm sshing from box1 in a gnome-terminal (the terminal seems irrelevant for this bug).

About half the time I run that ssh above, it crashes as shown.  About half the time it completes ok and logs in ok like nothing is the matter!  So this is non-deterministic or some sort of race condition.
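
Since the failure is intermittent, a single good or bad run proves little, and repeating the command and counting failures gives a rough rate. This is a minimal POSIX sh sketch; count_failures is a made-up helper name, and it only looks at the exit status of whatever command it is given.

```sh
# count_failures: hypothetical helper. Runs a command N times and
# prints how many of those runs exited with a non-zero status.
# Usage: count_failures N command [args...]
count_failures() {
    runs=$1
    shift
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Output is discarded; only the exit status is counted.
        "$@" >/dev/null 2>&1 || failed=$((failed + 1))
        i=$((i + 1))
    done
    printf '%d of %d runs failed\n' "$failed" "$runs"
}
```

Which command to pass depends on the setup. Note that /etc/csh.login is read only by login shells, so a plain "tcsh -c exit" may never reach a failing line that lives in that file.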

If you set your login shell to /bin/bash you can always ssh into box2, no problems, no crash.  Then you can always run tcsh manually.

As another posting about this problem says, try adding "set verbose" near the top of your /etc/csh.cshrc file.  Then trigger the bug and see where tcsh is crashing.
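
With "set verbose" active, tcsh echoes each command as it runs it, so in a captured session the last command echoed before the "bad block" message is the suspect. Assuming the session output has been saved to a file somehow (how to capture it is left open here), a small POSIX sh and awk sketch can pick that line out. The helper name last_command_before_crash is hypothetical.

```sh
# last_command_before_crash: hypothetical helper. Reads a captured
# "set verbose" session on stdin and prints the last non-empty line
# seen before the first "bad block" message.
last_command_before_crash() {
    awk '/bad block/ { print prev; exit } NF { prev = $0 }'
}
```

For example, "last_command_before_crash < session.log", where session.log is a placeholder name for the saved output.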

Mine crashes every time in my /etc/csh.login right after the command:
setenv TERM     linux
and before the command
setenv ZIPOPT   "-vy"           # zip   -> verbose output & don't follow links

It would appear TERM=linux is the problem.

When I comment out TERM=linux then the bug disappears.

Setting it to TERM=xterm also causes the bug to disappear.

All other posters on this bug, please check if you have TERM=linux in your configs:
grep TERM /etc/csh* /etc/tcsh* ~/.cshrc ~/.tcshrc ~/.login
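
A slightly stricter variant of that check, as a sketch to run from sh or bash rather than tcsh: it reports only lines that actually set TERM with setenv, with file name and line number, and silently skips files that do not exist. The helper name term_settings is made up, and it would miss TERM being set any other way (for example via "set term").

```sh
# term_settings: hypothetical helper. Prints file:line:text for every
# line that sets TERM via setenv in the given startup files.
# Missing or unreadable files are skipped silently (grep -s).
# /dev/null is added so grep always prints the file name.
term_settings() {
    grep -s -n -E '^[[:space:]]*setenv[[:space:]]+TERM([[:space:]]|$)' /dev/null "$@"
}
```

A typical call would be "term_settings /etc/csh.cshrc /etc/csh.login ~/.cshrc ~/.tcshrc ~/.login", which covers the usual tcsh startup files.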

Is TERM=linux no longer a valid setting?  What do I gain or lose with TERM=linux compared to xterm and the like?  What is the "correct" value to set TERM to?

But certainly tcsh should not bomb just because of a bogus term setting, so this is a tcsh bug that needs to be fixed.

This did NOT occur in Fedora 16 with tcsh-6.17-15.fc16.i686.
It does occur in Fedora 17 with tcsh-6.17-18.fc17.i686.
The bug was introduced between those two.

Comment 15 rmilner 2013-03-19 21:39:10 UTC
Some additional information on the symptoms I've seen, which differ a bit from previous posters'.

This has been a problem for me on my ISP's Linux systems for the past several months. I had to switch to bash for my login shell so I could get on, and whenever I tried to run (t)csh from the command line I still got the "free(xxxxxxx) bad block" error and a core dump.

Like Trevor, using "set verbose" in my .cshrc pointed to this line:

   setenv TERM xterms

I've had this line in my Linux login files for at least 15 years, with no prior problems.

Unlike Trevor, however, in my testing the choice of terminal type didn't matter: tcsh still died when it reached that command in my .cshrc, whatever valid option I specified (including "xterm"). If it's commented out, though, tcsh starts up fine every time.

I did encounter the weird behavior where, after adding the "set verbose" line with *no other changes* to my .cshrc, it actually worked. Twice. Then it went back to dumping core every time. Perhaps the code or something else the program needed was cached by that time, so it could start up faster? At any rate, this would seem to support the idea of a race condition.

Note that I can run that same command manually once I'm in tcsh, with no error. It also doesn't affect running csh scripts, and I can see from the verbose output that it is doing the setenv TERM. But for logins or command-line startups, tcsh always dies at that command if it's there.

Very strange bug. I agree with previous posters that tcsh should give a sensible error if there really is a problem, rather than just dumping core. How rude.

Comment 16 Trevor Cordes 2013-05-21 17:48:36 UTC
Strange, I wanted to play with this bug some more and now I can't reproduce it.  Perhaps some yum updates since then have fixed it?  I checked and tcsh doesn't seem to have been updated since.  Perhaps a library that tcsh uses was fixed, and that resolved this?

Can anyone else still reproduce this bug on a fully updated F17 or F18?

Anyone want to try to guess which package update fixed it?

Comment 17 Fedora End Of Life 2013-07-04 02:31:31 UTC
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 17 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to change the
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 18 Fedora End Of Life 2013-08-01 08:33:37 UTC
Fedora 17 changed to end-of-life (EOL) status on 2013-07-30. Fedora 17 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.