Bug 39128 (openssh_hang)

Summary: openssh hanging bug on exit with background processes; hang-on-exit
Product: [Retired] Red Hat Linux
Reporter: John Bowman <bowman>
Component: openssh
Assignee: Tomas Mraz <tmraz>
Status: CLOSED UPSTREAM
Severity: medium
Priority: medium
Version: 7.1
CC: eric, hjl, ivo, j.k.vanamerongen, shishz
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
URL: http://www.math.ualberta.ca/imaging/snfs
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2005-02-03 11:08:53 UTC Type: ---
Attachments:
Description | Flags
openssh hanging bug on exit with background processes | none
Latest hang-on-exit patch with X-hang + exit-delay-option patches; see www.math.ualberta.ca/imaging/snfs | none
Minor change to hang-on-exit patch to work with -T option | none

Description John Bowman 2001-05-04 16:40:08 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.73 [en] (X11; U; Linux 2.2.19 i586)

Description of problem:


How reproducible:
Always

Steps to Reproduce:
1. ssh somehost
2. sleep 20&
3. exit
	

Additional info:

I have developed a patch to fix the notorious hanging bug when exiting from
recent versions of OpenSSH in the presence of active background processes.

There has been a lot of frustration and misinformation about this bug posted
on the net (e.g., see RedHat's openssh-closing.txt from
openssh-2.5.2p2-5.src.rpm).  The supposed workaround "shopt -s huponexit"
mentioned in the openssh FAQ apparently only works with bash2, not with
bash1.

In any case, huponexit is not a workaround at all: it is frequently
desirable to launch a long simulation in the background and then log out,
leaving the program running in the background. Normally, backgrounded
processes should not be killed by ssh.

Below I have attached a new patch (to openssh-2.9p1) that doesn't appear to
break ssh or scp, unlike previous attempts in openssh-2.3.0p1. It passes
repeated tests of the following form (bash syntax) on both linux-i386 and
linux-alpha architectures running RedHat 6.2:

while [ 1 ] ; do \
	ssh somehost "dd if=/dev/zero bs=8192 count=10" | wc -c ; \
done
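
Each pass of the loop above should print 81920, since dd writes 10 blocks of 8192 bytes. A minimal sketch of the same test with the comparison automated is shown here. The helper names run_once and check_loop are hypothetical, and a local "sh -c" stands in for "ssh somehost" so that the script runs without a remote host; this is an illustration, not part of the attached patch.

```sh
#!/bin/sh
# Expected size of one transfer: 10 blocks of 8192 bytes.
expected=$((8192 * 10))

# run_once CMD...: run the command and print how many bytes reached stdout.
# (tr strips the padding that some wc implementations add.)
run_once() {
    "$@" | wc -c | tr -d ' '
}

# check_loop N CMD...: repeat the transfer N times and stop at the first
# iteration whose byte count differs from the expected value.
check_loop() {
    n=$1
    shift
    i=0
    while [ "$i" -lt "$n" ]; do
        got=$(run_once "$@")
        if [ "$got" -ne "$expected" ]; then
            echo "iteration $i: got $got bytes, expected $expected" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "ok: $n transfers of $expected bytes"
}

# Local stand-in for the remote command (assumption: no ssh host available).
check_loop 3 sh -c 'dd if=/dev/zero bs=8192 count=10 2>/dev/null'
```

To exercise a real connection, the last line could instead be written as: check_loop 100 ssh somehost "dd if=/dev/zero bs=8192 count=10". A truncated transfer then shows up as a failed iteration rather than a number to be checked by eye.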

-- John Bowman

diff -ur openssh-2.9p1/clientloop.c openssh-2.9p1J/clientloop.c
--- openssh-2.9p1/clientloop.c	Fri Apr 20 06:50:51 2001
+++ openssh-2.9p1J/clientloop.c	Wed May  2 16:21:16 2001
@@ -440,9 +440,13 @@
 		len = read(connection_in, buf, sizeof(buf));
 		if (len == 0) {
 			/* Received EOF.  The remote host has closed the connection. */
-			snprintf(buf, sizeof buf, "Connection to %.300s closed by remote host.\r\n",
-				 host);
-			buffer_append(&stderr_buffer, buf, strlen(buf));
+/* 
+ * This message duplicates the one already in client_loop().
+ *
+ *			snprintf(buf, sizeof buf, "Connection to %.300s closed by remote host.\r\n",
+ *				 host);
+ *			buffer_append(&stderr_buffer, buf, strlen(buf));
+ */			
 			quit_pending = 1;
 			return;
 		}
diff -ur openssh-2.9p1/nchan.c openssh-2.9p1J/nchan.c
--- openssh-2.9p1/nchan.c	Tue Apr  3 07:02:48 2001
+++ openssh-2.9p1J/nchan.c	Wed May  2 16:19:11 2001
@@ -56,7 +56,7 @@
 
 /* helper */
 static void	chan_shutdown_write(Channel *c);
-static void	chan_shutdown_read(Channel *c);
+void		chan_shutdown_read(Channel *c);
 
 /*
  * SSH1 specific implementation of event functions
@@ -479,7 +479,7 @@
 		c->wfd = -1;
 	}
 }
-static void
+void
 chan_shutdown_read(Channel *c)
 {
 	if (compat20 && c->type == SSH_CHANNEL_LARVAL)
diff -ur openssh-2.9p1/nchan.h openssh-2.9p1J/nchan.h
--- openssh-2.9p1/nchan.h	Sun Mar  4 23:16:12 2001
+++ openssh-2.9p1J/nchan.h	Wed May  2 16:19:11 2001
@@ -88,4 +88,5 @@
 
 void    chan_init_iostates(Channel * c);
 void	chan_init(void);
+void	chan_shutdown_read(Channel *c);
 #endif
diff -ur openssh-2.9p1/session.c openssh-2.9p1J/session.c
--- openssh-2.9p1/session.c	Wed Apr 18 09:29:34 2001
+++ openssh-2.9p1J/session.c	Wed May  2 16:20:04 2001
@@ -1960,6 +1960,8 @@
 	 */
 	if (c->ostate != CHAN_OUTPUT_CLOSED)
 		chan_write_failed(c);
+	if (c->istate != CHAN_INPUT_CLOSED)
+		chan_shutdown_read(c);
 	s->chanid = -1;
 }

Comment 1 John Bowman 2001-05-04 16:42:19 UTC
Created attachment 17366
openssh hanging bug on exit with background processes

Comment 2 John Bowman 2001-05-10 20:44:00 UTC
This patch only fixes the hang-on-exit bug under Protocol 2.

Comment 3 John Bowman 2001-05-19 22:58:23 UTC
Created attachment 19030
Latest hang-on-exit patch with X-hang + exit-delay-option patches; see www.math.ualberta.ca/imaging/snfs

Comment 4 Seth Vidal 2001-07-12 16:34:27 UTC
Has this made it into an unpushed set of rawhide rpms yet? If not, I'm going to
see if the patch applies cleanly and try to rebuild 2.9p2 with it.

I noticed some discussion on the openssh-unix-dev mailing list regarding this
patch. Have any further issues with it come up?


Comment 5 John Bowman 2001-09-12 18:59:23 UTC
Created attachment 31678
Minor change to hang-on-exit patch to work with -T option

Comment 6 John Dalbec 2002-02-19 19:48:33 UTC
As I understand the problem, it's just a question of making sure the process isn't holding the terminal open (stdout, stderr, _or stdin_).
E.g.:
sleep 60 > /dev/null 2>&1 & exit
hangs, but
sleep 60 < /dev/null > /dev/null 2>&1 & exit
does not.  I guess some programs/commands are smart enough to close stdin right away if they're not using it and wouldn't need the second redirection.
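
The same effect can be seen locally without ssh, since a reader only sees end-of-file once every process holding the other end has closed it. The sketch below is a local analogy only, not sshd's code: a pipe into cat stands in for the session's terminal, the helper name elapsed is hypothetical, and it assumes a date command that supports +%s.

```sh
#!/bin/sh
# elapsed CMD: print the whole seconds taken by running CMD in a child shell
# whose stdout is a pipe into cat. cat returns only at end-of-file, that is,
# once no process holds the write end of the pipe any longer.
elapsed() {
    start=$(date +%s)
    sh -c "$1" | cat
    end=$(date +%s)
    echo $((end - start))
}

# The background sleep inherits the pipe as its stdout, so cat keeps waiting
# until sleep exits, even though the child shell itself is long gone.
held=$(elapsed 'sleep 2 &')

# With stdin, stdout and stderr all redirected, nothing holds the pipe open
# and cat returns as soon as the child shell exits.
released=$(elapsed 'sleep 2 </dev/null >/dev/null 2>&1 &')

echo "pipe held open: ${held}s, descriptors redirected: ${released}s"
```

The first case waits for roughly the length of the sleep; the second returns almost immediately, which mirrors the difference between the two commands above.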

Comment 7 Tomas Mraz 2005-02-03 11:08:53 UTC
Please try to discuss the patch upstream
(http://bugzilla.mindrot.org/show_bug.cgi?id=52).


Comment 8 Tomas Mraz 2005-02-04 10:57:16 UTC
*** Bug 77242 has been marked as a duplicate of this bug. ***

Comment 9 Tomas Mraz 2005-02-07 13:10:01 UTC
*** Bug 76926 has been marked as a duplicate of this bug. ***