Bug 1218424 - infinite loop, at 100% cpu in ssh if ^Z is pressed at password prompt
Summary: infinite loop, at 100% cpu in ssh if ^Z is pressed at password prompt
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: openssh
Version: 6.6
Hardware: All
OS: All
Priority: unspecified
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: Jakub Jelen
QA Contact: Stefan Dordevic
URL:
Whiteboard:
Depends On:
Blocks: 1172231 1269194 1402424
Reported: 2015-05-04 20:53 UTC by Paulo Andrade
Modified: 2019-12-16 04:44 UTC (History)
CC List: 7 users

Fixed In Version: openssh-5.3p1-120.el6
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1402424
Environment:
Last Closed: 2017-03-21 10:01:26 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
OpenSSH Project 2619 0 None None None 2016-09-27 11:46:11 UTC
Red Hat Product Errata RHSA-2017:0641 0 normal SHIPPED_LIVE Moderate: openssh security and bug fix update 2017-03-21 12:31:22 UTC

Description Paulo Andrade 2015-05-04 20:53:23 UTC
Steps to reproduce:

1. either change or create a test login with /bin/sh as
   login shell
2. "ssh user@localhost" and login
3. "sftp user@localhost" and press ^Z in the password
   prompt

A few times it will work. It depends a bit on what code
is being executed in the readpassphrase function, at
openbsd-compat/readpassphrase.c in the openssh code.

It has been verified that exec'ing again /bin/sh with
--posix before running sftp, or exporting the environment
variable POSIXLY_CORRECT before the "ssh user@localhost"
step prevents the problem.

So, while the problem appears to be kind of expected, it
is being reported in case it was not meant to happen.

Comment 2 Paulo Andrade 2015-05-05 20:37:09 UTC
Some of the comments after steps to reproduce are
actually incorrect.

The condition for reproducing the bug is actually
that the global variable posixly_correct is set
to a nonzero value.

posixly_correct is only set, by default, if
argv[0] is either "sh" or "-sh".

The comment about setting POSIXLY_CORRECT in
the ssh environment is incorrect. It was likely
one of the cases where it did work: even in an
environment where the problem happens, it
sometimes works, likely due to the order of
signals, which may keep it from starting the
infinite loop. The loop is triggered by attempting
to write to the tty and, after receiving the
signal, raising it again.

Comment 3 Paulo Andrade 2015-05-15 21:28:13 UTC
This problem happens also in rhel7, but is hard to
reproduce (happens like 1 in 5+ tries). On rhel 6.6
it happens all the time.

Testing the posixly_correct code path, I reduced it
so that only this is needed to prevent the problem
on rhel 6.6:

$ . /etc/profile.d/lang.sh

Something that should be useful to note, is that
when it does not enter the infinite loop in ssh,
it just cancels/kills sftp with ^C. When the problem
happens, ^C does not cancel/kill the sftp command;
regardless of having pressed ^Z and then run "fg",
or pressing ^C in the first prompt.

So, the problem may actually be in readline, and,
maybe, still not verified, the "solution" would be,
if in posix mode, do not use/initialize readline.

Comment 4 Paulo Andrade 2015-05-18 19:38:04 UTC
This corrects the problem, but should be more of
a hint of where to look to correct the problem:
---8<---
diff -up bash-4.1/bashline.c.orig bash-4.1/bashline.c
--- bash-4.1/bashline.c.orig	2015-05-18 15:22:24.420999898 -0300
+++ bash-4.1/bashline.c	2015-05-18 15:32:29.271000816 -0300
@@ -370,6 +370,8 @@ initialize_readline ()
     return;
 
   rl_terminal_name = get_string_value ("TERM");
+  if (!rl_terminal_name)
+    rl_terminal_name = "vt100";
   rl_instream = stdin;
   rl_outstream = stderr;
 
---8<---
The readline fallback is "dumb", but that apparently
does not create enough defaults for terminal handling.
With the vt100 default, ^C kills the sftp password
prompt, and it does not go 100% cpu if ^Z is pressed
in the password prompt.

Comment 9 Siteshwar Vashisht 2016-07-21 10:19:49 UTC
I am able to reproduce this issue. This is what the backtrace for the ssh process looks like when the issue happens:

(gdb) bt
#0  0x00007fd3fa228048 in tcsetattr (fd=4, optional_actions=<value optimized out>, termios_p=0x7ffc61354560) at ../sysdeps/unix/sysv/linux/tcsetattr.c:84
#1  0x00007fd3fc66ffed in readpassphrase (prompt=0x7ffc61354ea0 "temp.36.221's password: ", buf=0x7ffc61354a50 "", bufsiz=<value optimized out>, flags=2) at readpassphrase.c:143
#2  0x00007fd3fc657f4c in read_passphrase (prompt=0x7ffc61354ea0 "temp.36.221's password: ", flags=0) at readpass.c:153
#3  0x00007fd3fc63b360 in userauth_passwd (authctxt=0x7ffc61355010) at sshconnect2.c:967
#4  0x00007fd3fc63c62d in userauth (authctxt=0x7ffc61355010, authlist=0x7fd3fd79ac00 "publickey,gssapi-keyex,gssapi-with-mic,password") at sshconnect2.c:468
#5  0x00007fd3fc65fc73 in dispatch_run (mode=0, done=0x7ffc61355038, ctxt=0x7ffc61355010) at dispatch.c:98
#6  0x00007fd3fc63d6fd in ssh_userauth2 (local_user=0x7fd3fd791c60 "root", server_user=0x7fd3fd77c752 "temp", host=0x7fd3fd796510 "172.16.36.221", sensitive=0x7fd3fc8923a0) at sshconnect2.c:432
#7  0x00007fd3fc6386fc in ssh_login (sensitive=0x7fd3fc8923a0, orighost=<value optimized out>, hostaddr=0x7fd3fc8923c0, pw=<value optimized out>, timeout_ms=-1000) at sshconnect.c:1138
#8  0x00007fd3fc62e7de in main (ac=<value optimized out>, av=<value optimized out>) at ssh.c:904


The ssh process is trying to set terminal attributes for fd=4, which refers to "/dev/tty":
(gdb) frame 1
#1  0x00007fd3fc66ffed in readpassphrase (prompt=0x7ffc61354ea0 "temp.36.221's password: ", buf=0x7ffc61354a50 "", bufsiz=<value optimized out>, flags=2) at readpassphrase.c:143
143                     while (tcsetattr(input, _T_FLUSH, &oterm) == -1 &&
(gdb) p input
$1 = 4

However, since the ssh process is in the background, it keeps receiving SIGTTOU (background processes cannot set terminal attributes) and tcsetattr() returns with errno = EINTR. It is stuck in the loop below:

143                     while (tcsetattr(input, _T_FLUSH, &oterm) == -1 &&
144                         errno == EINTR)
145                             continue;


The ssh process returns to normal CPU usage when it is brought to the foreground. This issue is not specific to bash and happens with ksh too. I would like somebody from the openssh team to look at whether it could be considered a bug in openssh.
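
The loop shape described above can be reproduced without a terminal. This is a minimal sketch and not the OpenSSH code: `retry_unless_stopped`, `fake_op`, and `struct fake_state` are hypothetical names, and `fake_op` merely stands in for tcsetattr() by failing with EINTR on every call. The extra `*stop` check is one assumed way to bound the loop.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-in for tcsetattr(): counts calls, always fails with
 * EINTR, and sets the stop flag once `stop_after` calls have been made
 * (mimicking a signal handler that has noticed SIGTTOU). */
struct fake_state {
	int calls;
	int stop_after;
	volatile sig_atomic_t *stop;
};

int
fake_op(void *arg)
{
	struct fake_state *st = arg;

	st->calls++;
	if (st->calls >= st->stop_after)
		*st->stop = 1;
	errno = EINTR;
	return -1;
}

/* Retry op while it fails with EINTR, but give up as soon as *stop is set.
 * Without the *stop check, this is the shape of a loop that spins for as
 * long as every attempt is interrupted. */
int
retry_unless_stopped(int (*op)(void *), void *arg,
    volatile sig_atomic_t *stop)
{
	int r;

	assert(op != NULL && stop != NULL);
	while ((r = op(arg)) == -1 && errno == EINTR && *stop == 0)
		continue;
	return r;
}
```

With a fake operation that sets the flag on its third call, the loop ends after exactly three attempts instead of spinning.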

Comment 10 Jakub Jelen 2016-07-21 15:13:27 UTC
Thank you for the detailed analysis. I see the same code in openssh upstream, so it should also apply to RHEL7 and Fedora. I am not sure how to resolve such a problem correctly, though, especially since it is such a corner case. The impact on the customer also does not look very critical.

The behavior on the bash side is most likely intended. The OpenSSH part is in the OpenBSD (compat) code, and I am not sure how likely it is to change. I will check tomorrow what we can do.

Comment 12 Jakub Jelen 2016-09-27 11:46:11 UTC
Sorry for the late reply. I posted the bug upstream. The idea is probably to check whether the SIGTTOU signal was caught and, if so, stop looping. Something like this should make it work:

--- openssh-7.3p1/openbsd-compat/readpassphrase.c.patch	2016-09-27 11:36:46.801980295 +0200
+++ openssh-7.3p1/openbsd-compat/readpassphrase.c	2016-09-27 11:38:11.161970239 +0200
@@ -157,7 +157,7 @@ restart:
 	/* Restore old terminal settings and signals. */
 	if (memcmp(&term, &oterm, sizeof(term)) != 0) {
 		while (tcsetattr(input, _T_FLUSH, &oterm) == -1 &&
-		    errno == EINTR)
+		    errno == EINTR && signo[SIGTTOU] != 1)
 			continue;
 	}
 	(void)sigaction(SIGALRM, &savealrm, NULL);

As soon as we have an upstream opinion on this bug, we can consider backporting to RHEL6.
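
The `signo[SIGTTOU]` test in the patch relies on a handler recording which signals were caught. A minimal sketch of that bookkeeping follows. It is an illustration, not the readpassphrase.c source: `caught`, `note_signal`, `watch_signal`, and the `MAX_SIG` bound are hypothetical names chosen for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

#define MAX_SIG 128	/* assumption: larger than any signal number used here */

/* One flag per signal number, written only from the handler. */
volatile sig_atomic_t caught[MAX_SIG];

/* Handler: record that signal s arrived. Only async-signal-safe work. */
void
note_signal(int s)
{
	if (s > 0 && s < MAX_SIG)
		caught[s] = 1;
}

/* Install note_signal for signal s. SA_RESTART is deliberately not set,
 * so interrupted system calls fail with EINTR and the caller can then
 * consult caught[s]. Returns 0 on success, -1 on error. */
int
watch_signal(int s)
{
	struct sigaction sa;

	assert(s > 0 && s < MAX_SIG);
	memset(&sa, 0, sizeof(sa));
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;
	sa.sa_handler = note_signal;
	return sigaction(s, &sa, NULL);
}
```

A retry loop can then include `caught[SIGTTOU] != 1` in its condition, which is the role `signo[SIGTTOU]` plays in the patch.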

Comment 14 Jakub Jelen 2016-10-13 06:17:58 UTC
Thank you Darren for looking into that.
I did search for differences against the OpenBSD sources, but I probably did not find a recent OpenBSD repository. Do you have a link to a CVS or HTTP version of it?

Comment 18 Darren Tucker 2016-10-13 13:25:24 UTC
All of the files that we (try to) keep in sync have a marker like this denoting the upstream file:

/* OPENBSD ORIGINAL: lib/libc/gen/readpassphrase.c */

In this case the upstream file is here:
http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/lib/libc/gen/readpassphrase.c

Any one of the anonymous CVS servers from https://www.openbsd.org/anoncvs.html will also have it.

Comment 20 Jakub Jelen 2016-10-13 14:45:11 UTC
Thanks Darren once more :)

Good catch Paulo.

It seems we are hitting another race condition now. In some cases, after returning to the foreground, we do not get the prompt back from sftp, because the process is blocked in the kill() call:

#0  0x00007f7b1ce1d8c7 in kill () from /lib64/libc.so.6
#1  0x00007f7b1f3130c5 in readpassphrase (prompt=0x7ffce5c7c1a0 "test@localhost's password: ", 
    buf=0x7ffce5c7bd50 "", bufsiz=<value optimized out>, flags=2) at readpassphrase.c:182
#2  0x00007f7b1f2faf0c in read_passphrase (prompt=0x7ffce5c7c1a0 "test@localhost's password: ", 

This is related to this part of the (Linux) man 3p kill page:

> If the value of pid causes sig to be generated for the sending process,
> and if sig is not blocked for the calling thread and if no other thread has
> sig unblocked or is waiting  in  a sigwait() function for sig,
> either sig or at least one pending unblocked signal shall be delivered to
> the sending thread **before kill() returns**.

This is happening in RHEL6, and it will probably happen on all Linux and POSIX systems, but not on OpenBSD [1], if I read it correctly. This might need some tweaking for the portable version. I will try to investigate further tomorrow to see what we can do about it.

[1] http://man.openbsd.org/kill.2
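
The quoted kill() behavior can be observed with a small experiment. This is a sketch under stated assumptions: a single-threaded process, the signal not blocked, and a harmless signal such as SIGUSR1 rather than SIGTTOU. `delivered`, `on_signal`, and `self_kill_delivered` are hypothetical names for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Set by the handler so the caller can tell whether it has run. */
volatile sig_atomic_t delivered;

void
on_signal(int s)
{
	(void)s;
	delivered = 1;
}

/* Send sig to the calling process and report whether the handler had
 * already run when kill() returned: 1 if so, 0 if not, -1 on error.
 * The previous disposition of sig is restored before returning. */
int
self_kill_delivered(int sig)
{
	struct sigaction sa, old;
	int seen;

	assert(sig > 0);
	memset(&sa, 0, sizeof(sa));
	sigemptyset(&sa.sa_mask);
	sa.sa_handler = on_signal;
	if (sigaction(sig, &sa, &old) == -1)
		return -1;

	delivered = 0;
	if (kill(getpid(), sig) == -1) {
		(void)sigaction(sig, &old, NULL);
		return -1;
	}
	seen = delivered;	/* sampled immediately after kill() returns */
	(void)sigaction(sig, &old, NULL);
	return seen;
}
```

Under those assumptions the handler has run by the time kill() returns. When the same call sends a signal whose current action stops the process, the process does not get past kill() until it is continued.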

Comment 21 Darren Tucker 2016-10-13 14:55:36 UTC
I'll see if I can reproduce that, but it won't be for a day or two.  Can you get the signal kill is trying to send out for frame 0?

Comment 22 Jakub Jelen 2016-10-13 15:32:52 UTC
(In reply to Darren Tucker from comment #21)
> I'll see if I can reproduce that, but it won't be for a day or two.  Can you
> get the signal kill is trying to send out for frame 0?

In the previous build, it was optimized out. In a non-optimized build it points to signal 22:

(gdb) f 1
#1  0x00007fdd560accac in readpassphrase (prompt=0x7fff5a867ef0 "test@localhost's password: ", 
    buf=0x7fff5a867aa0 "", bufsiz=1024, flags=2) at readpassphrase.c:182
182				kill(getpid(), i);

(gdb) p i
$1 = 22

So it is the SIGTTOU we discussed:

/usr/include/bits/signum.h
#define	SIGTTOU		22	/* Background write to tty (POSIX).  */

Comment 23 Darren Tucker 2016-10-14 16:27:12 UTC
FYI: I just added a patch to the upstream bug that I think will resolve this.

Comment 30 errata-xmlrpc 2017-03-21 10:01:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0641.html

