Bug 1480510

Summary: SSH connections get closed when time-based rekeying is used and ClientAliveCountMax=0
Product: Red Hat Enterprise Linux 7 Reporter: Renaud Métrich <rmetrich>
Component: openssh Assignee: Jakub Jelen <jjelen>
Status: CLOSED ERRATA QA Contact: Stefan Dordevic <sdordevi>
Severity: high Docs Contact:
Priority: high    
Version: 7.4 CC: cww, kbost, mmatsuya, nmavrogi, rhel, rmetrich, sdordevi, szidek
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openssh-7.4p1-15.el7 Doc Type: Bug Fix
Doc Text:
Cause: The timeouts throughout the server code were not handled correctly. Consequence: Setting both time-based rekeying (RekeyLimit=default 45s) and client keep-alive (ClientAliveCountMax=0, ClientAliveInterval=900) in sshd resulted in a connection drop after the rekeying timeout. Fix: The code was updated to handle all combinations of timeouts correctly. Result: Rekeying no longer closes the connection.
Last Closed: 2018-04-10 18:19:11 UTC Type: Bug
Bug Blocks: 1420851, 1476743    

Description Renaud Métrich 2017-08-11 09:47:13 UTC
Description of problem:

When time-based rekeying (e.g. RekeyLimit=default 45s) and "ClientAliveCountMax=0" are both configured on the SSHD server, the SSH connection is unexpectedly closed by the SSHD server just before the rekeying happens.

Version-Release number of selected component (if applicable):

openssh-7.4p1-11.el7.x86_64

How reproducible:

ALWAYS

Steps to Reproduce:
1. Stop the firewall (for convenience)

systemctl stop firewalld


2. Start an SSHD instance with time-based rekeying and ClientAliveCountMax=0 (requires ClientAliveInterval != 0)

/usr/sbin/sshd -D -ddd -p 8022 -o "ClientAliveCountMax=0" -o "ClientAliveInterval=900" -o "RekeyLimit=default 45s" -e


3. Connect to that SSHD instance and generate some traffic

ssh -p 8022 root@vm-rhel74 "date; while :; do sleep 30; date; done"


Actual results:

Immediately before rekeying is performed, the connection gets closed with the following messages on the SSHD server:

"
Timeout, client not responding.
debug1: do_cleanup
debug1: PAM: cleanup
debug1: PAM: closing session
debug1: PAM: deleting credentials
debug3: PAM: sshpam_thread_cleanup entering
"


Expected results:

No connection closure.

Additional info:

This doesn't happen when only "traffic-limit" rekeying is configured (e.g. "RekeyLimit 4M").
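
To illustrate why "RekeyLimit 4M" behaves differently from "RekeyLimit default 45s": only the optional second argument of RekeyLimit sets a time limit, so a traffic-only setting leaves the server without a rekey timer. The sketch below is a simplified, hypothetical parser (parse_rekey_limit and the RekeyLimit tuple are names made up for this illustration, not sshd code). It assumes single-unit values only and does not handle combined time formats such as "1h30m".

```python
from typing import NamedTuple, Optional

# Suffix multipliers, simplified: one unit per value, no combined forms.
_SIZE_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
_TIME_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


class RekeyLimit(NamedTuple):
    max_bytes: Optional[int]    # None: keep the cipher-dependent default
    max_seconds: Optional[int]  # None: no time-based rekeying


def _scaled(text: str, units: dict) -> int:
    """Convert '45s' or '4M' style values to a plain integer."""
    unit = text[-1].lower()
    if unit in units:
        return int(text[:-1]) * units[unit]
    return int(text)  # a bare number means bytes or seconds


def parse_rekey_limit(value: str) -> RekeyLimit:
    """Parse the value part of a RekeyLimit setting (hypothetical helper)."""
    parts = value.split()
    if not 1 <= len(parts) <= 2:
        raise ValueError("expected '<data> [<time>]'")
    max_bytes = None if parts[0] == "default" else _scaled(parts[0], _SIZE_UNITS)
    max_seconds = None
    if len(parts) == 2 and parts[1] != "none":
        max_seconds = _scaled(parts[1], _TIME_UNITS)
    return RekeyLimit(max_bytes, max_seconds)
```

With these assumptions, parse_rekey_limit("default 45s") yields a 45 second time limit, while parse_rekey_limit("4M") yields no time limit at all, which matches the observation that the traffic-only setting does not trigger the disconnect.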

Comment 2 Jakub Jelen 2017-08-11 15:02:36 UTC
Yes, that is indeed a bug. The select() returns on timeout, but it is interpreted as a ClientAlive timeout instead of a rekey timeout (sigh ... too many timeouts for a single select()).
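
The mix-up can be sketched as follows. This is an illustrative model only, not the OpenSSH code or the upstream patch: all function names are hypothetical, and it assumes the server waits on a single timeout equal to the earliest pending deadline. The point is that after the wait times out, each deadline has to be checked on its own rather than assuming the keep-alive timer fired.

```python
from typing import Optional, Set


def select_timeout(now: float, rekey_deadline: Optional[float],
                   alive_deadline: Optional[float]) -> Optional[float]:
    """Seconds to wait for the earliest deadline; None means wait forever."""
    deadlines = [d for d in (rekey_deadline, alive_deadline) if d is not None]
    if not deadlines:
        return None
    return max(0.0, min(deadlines) - now)


def expired_timers(now: float, rekey_deadline: Optional[float],
                   alive_deadline: Optional[float]) -> Set[str]:
    """Correct handling: report only the timers that really expired."""
    expired = set()
    if rekey_deadline is not None and now >= rekey_deadline:
        expired.add("rekey")
    if alive_deadline is not None and now >= alive_deadline:
        expired.add("client_alive")
    return expired


def buggy_expired_timers(alive_deadline: Optional[float]) -> Set[str]:
    """Models the bug: any timeout counts as a client-alive timeout
    whenever keep-alives are configured."""
    return {"client_alive"} if alive_deadline is not None else set()


def should_disconnect(expired: Set[str], missed: int, count_max: int) -> bool:
    """Drop the client once missed keep-alives exceed count_max.
    With count_max = 0 the first client-alive timeout is fatal."""
    return "client_alive" in expired and missed + 1 > count_max


if __name__ == "__main__":
    # Values from the reproducer: rekey after 45 s, keep-alive after 900 s.
    rekey, alive = 45.0, 900.0
    print(select_timeout(0.0, rekey, alive))                          # 45.0
    print(should_disconnect(expired_timers(45.0, rekey, alive), 0, 0))  # False
    print(should_disconnect(buggy_expired_timers(alive), 0, 0))         # True
```

With the reproducer's values the wait ends after 45 seconds. The correct check sees only the rekey timer as expired and keeps the session, while the buggy interpretation counts a missed keep-alive, and ClientAliveCountMax=0 turns that first miss into "Timeout, client not responding."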

The same issue is still reproducible with the latest OpenSSH 7.5 and also with current master.

The proposed workaround looks reasonable.

I filed a bug upstream [1] with a patch and briefly tested that it solves our problem. I can build a test package next week.

[1] https://bugzilla.mindrot.org/show_bug.cgi?id=2757

Comment 13 errata-xmlrpc 2018-04-10 18:19:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:0980