Red Hat Bugzilla – Bug 495074
nohup jobs killed on logout if konsole is open!
Last modified: 2009-07-14 11:27:50 EDT
This is something we have been trying to track down for a while.
1. We start an OpenMP run of a Fortran program built with the Intel compiler on a
quad 64-bit system. The run is started with the "nohup run&" command, where
run is a small script that does:
ulimit -s 1200000
2. If we leave open the konsole window that the nohup process was started from and
log out, the job is killed.
3. If we press Ctrl-D, close the konsole window, and then log out, the job keeps running.
4. Happens every time.
Details: Fedora 9+all updates including test. Using KDE-4.2.2. X86_64 system.
I have been having a similar problem which I think is related:
When I ssh from home and start the same job as above and do not log out, my
provider's connection times out and kills the ssh connection. The job is then
killed. If I Ctrl-D out of the ssh connection, the job keeps running no matter
how many times I log in and out, or even if it times out.
maybe related to Bug 467622
It is related since I reported that too.
Did some more testing... if I repeat the above experiment with
different binaries, it does not crash. For example, I tried building
kdelibs from my account by putting "make -j4" into the run script,
and it was running on 4 CPUs. I kept the nohup window open and
logged in and out, and it did not crash!
Which reminds me of something I heard but only vaguely remember:
the Intel compiler keeps all its dynamic libraries in a different location
than the system ones. I specify those in the /etc/ld.so.conf file and
everything works fine.
I am using nvidia binary drivers.
Apparently, libGL.so.1 "takes over" dynamic library loading, as described here:
That may well be the case, since if we disable some of the KDE logout "effects"
that link to GL, the crash does not happen.
Is there any way I can get around this?
I think you can get around this by running "nohup run&" not from konsole, but with
"ssh <remote machine> nohup run&" from your computer, or by running it directly in the terminal where you are logged in via ssh; then it should survive. If I understand correctly, you are connected to the remote machine via ssh, you run konsole on the remote machine, and "nohup run&" is a child process of konsole. When ssh times out, konsole is killed by SIGHUP and bash kills its child processes as well (not with SIGHUP). IMHO this is not a bug in nohup but something that could be reassigned to bash, ssh, or x11, though I doubt they have any chance of fixing it...
Could you please confirm this solution? If so, I'll close it as NOTABUG, otherwise CANTFIX...
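A variant of this workaround can be sketched in Python. It assumes the job dies because it still shares a session and controlling terminal with konsole or the ssh login shell, which nothing in this report confirms. The function name launch_detached, the "./run" path, and the log file name are hypothetical; the sketch only combines what nohup does (ignore SIGHUP, redirect output) with a new session.

```python
import signal
import subprocess


def launch_detached(cmd, log_path):
    """Start cmd in its own session, ignoring SIGHUP, with output appended to log_path."""

    def ignore_hup():
        # Runs in the child before exec; an ignored signal stays ignored across exec.
        signal.signal(signal.SIGHUP, signal.SIG_IGN)

    with open(log_path, "ab") as log:
        proc = subprocess.Popen(
            cmd,
            stdin=subprocess.DEVNULL,   # no reads from the terminal
            stdout=log,
            stderr=subprocess.STDOUT,
            start_new_session=True,     # calls setsid() in the child
            preexec_fn=ignore_hup,
        )
    # The job is left running on purpose; only its PID is returned.
    return proc.pid
```

Calling launch_detached(["./run"], "run.log") would start the script with no controlling terminal. If the job is still killed on logout when started this way, the terminal hangup is probably not the cause.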
Not exactly. I connect to the remote machine via ssh; I do not run konsole
on the remote machine, I just run "nohup run&" in the login shell. If
I then Ctrl-D and close the connection, the job keeps running fine, but
if I let the ssh connection time out (a timeout I did not set, since if I ssh
locally between computers in my office the connection never times out), the job
is killed as well.
I think these two things are related somehow.
However, what I describe in the original bug report is more important
and only happens to applications that have their libraries in an
unconventional location (like the intel compiler). This is happening
to all the members of my research group. They start a nohup job and
would like to log out without closing their windows, and the job is gone!
Something weird is going on!
I'll try to reassign this to openssh, as I guess it's not the fault of nohup. We'll see if that's the correct component, one that could act on or prevent such a thing. To me it looks like nohup is killed by SIGKILL, but an strace might say more about this...
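Before reaching for strace, one cheap check is whether the running job really ignores SIGHUP. On Linux, /proc/<pid>/status exposes the ignored-signal mask as a hex number on the SigIgn line, where bit n-1 stands for signal n. The sketch below decodes such a mask; the function names are hypothetical. If SIGHUP shows up as ignored and the job still dies on logout, that would be consistent with the SIGKILL guess above, though it would not prove it.

```python
import signal


def decode_sigmask(hex_mask):
    """Return the names of the signals set in a /proc/<pid>/status mask such as SigIgn."""
    bits = int(hex_mask, 16)
    names = []
    for signum in range(1, 65):
        if bits & (1 << (signum - 1)):
            try:
                names.append(signal.Signals(signum).name)
            except ValueError:
                # Some signal numbers have no symbolic name.
                names.append(f"signal {signum}")
    return names


def ignored_signals(pid):
    """Read and decode the SigIgn line for pid (Linux /proc only)."""
    with open(f"/proc/{pid}/status") as status:
        for line in status:
            if line.startswith("SigIgn:"):
                return decode_sigmask(line.split()[1])
    return []
```

For a job started under nohup, ignored_signals(pid) would be expected to include SIGHUP.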
I don't think it is openssh. I have done more testing:
1. Open konsole and run a true OpenMP job (not make -j4). If we now log out,
the job is aborted or hangs on a single CPU.
More detailed examination shows:
a) If one compiles the same program to run on a single processor the job
keeps running after logout/login.
b) Compiling the program to link statically with all the libraries does not
c) If I run the program via strace (still using nohup) I get thousands of
lines of "sched_yield() = 0" in the strace output (in the case of crash
2. Everything works perfectly if one closes the konsole window that started the
nohup job before logout.
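To judge how much of a saved trace is sched_yield() spinning, the strace output can be tallied per system call. This is only a sketch: it assumes plain strace output lines, optionally prefixed by a PID as produced with -f, and does not handle timestamp prefixes. The function name and the trace.txt file name are hypothetical.

```python
import re
from collections import Counter

# Matches an optional "[pid 1234] " or "1234 " prefix, then the syscall name and "(".
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\(")


def count_syscalls(lines):
    """Count how often each system call appears in strace output lines."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

With a trace saved by "strace -f -o trace.txt", count_syscalls(open("trace.txt")).most_common(5) would list the most frequent calls. Signal and exit lines (those starting with "---" or "+++") are skipped by the pattern and are worth reading separately.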
The question is what logout does to open shells (konsole). KDE is obviously
saving some information about that window so it can reopen it exactly as it was at the
next login. Does it try to save the processes it is running as well? I think
the key is understanding this.
This message is a reminder that Fedora 9 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 9. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora
'version' of '9'.
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version prior to Fedora 9's end of life.
Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 9 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora please change the 'version' of this
bug to the applicable version. If you are unable to change the version,
please add a comment here and someone will do it for you.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
The process we are following is described here:
Fedora 9 changed to end-of-life (EOL) status on 2009-07-10. Fedora 9 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.
Thank you for reporting this bug and we are sorry it could not be fixed.