Bug 1701050 - SSH connection hangs on Azure
Summary: SSH connection hangs on Azure
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: RHCOS
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 4.2.0
Assignee: Steve Milner
QA Contact: Micah Abbott
Depends On:
Reported: 2019-04-17 21:30 UTC by Alex Crawford
Modified: 2019-10-16 06:28 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Azure requires ClientAliveInterval to be set to 180 in the sshd configuration.
Consequence: When it is not set, SSH connections hang on Azure.
Fix: The sshd configuration now defaults to ClientAliveInterval 180.
Result: SSH connections no longer hang on Azure.
Clone Of:
Last Closed: 2019-10-16 06:28:06 UTC
Target Upstream Version:

System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2019:2922  0        None      None    None     2019-10-16 06:28:21 UTC

Description Alex Crawford 2019-04-17 21:30:58 UTC
Description of problem:

When SSH'ing to an RHCOS host on Azure, the connection hangs once it has been left idle for about five minutes.

Version-Release number of selected component (if applicable):

How reproducible:


Steps to Reproduce:
1. Create host on Azure
2. SSH to host
3. Wait five minutes

Actual results:

Connection hangs and must be terminated.

Expected results:

Connection is kept alive.

Additional info:

The sshd configuration leaves ClientAliveInterval disabled. It should be set to 180. Steven Zarkos (an engineer at Microsoft) gave me that value years ago, and we use it in Container Linux. Making the same change in RHCOS also fixes the issue.
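
For illustration only, the change described above could be applied to an sshd_config-style file roughly as follows. This is a minimal sketch, not the way the fix was delivered in RHCOS: the helper name `set_client_alive` is hypothetical, and it assumes the directive is spelled exactly `ClientAliveInterval` (sshd itself accepts any capitalization).

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): ensure an active
# "ClientAliveInterval <seconds>" line exists in an sshd_config-style file.
set_client_alive() {
    conf="$1"
    interval="${2:-180}"   # 180 is the value recommended in this report
    tmp="$(mktemp)" || return 1
    if grep -Eq '^[[:space:]]*#?[[:space:]]*ClientAliveInterval([[:space:]]|$)' "$conf"; then
        # Rewrite any existing directive, commented out or not.
        sed -E "s/^[[:space:]]*#?[[:space:]]*ClientAliveInterval([[:space:]].*)?\$/ClientAliveInterval ${interval}/" "$conf" > "$tmp"
    else
        # Nothing to rewrite: put the directive first so that it cannot
        # end up inside a trailing Match block.
        { printf 'ClientAliveInterval %s\n' "$interval"; cat "$conf"; } > "$tmp"
    fi
    # Copy back through the original file to keep its owner and mode.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}
```

It would be called as `set_client_alive /etc/ssh/sshd_config 180`. The result can be syntax-checked with `sshd -t`, and sshd has to be reloaded before the new value takes effect.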

Comment 5 Micah Abbott 2019-07-01 18:39:03 UTC
Verified changes are present in RHCOS using 4.2.0-0.nightly-2019-06-30-221852

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.2.0-0.nightly-2019-06-30-221852   True        False         3m14s   Cluster version is 4.2.0-0.nightly-2019-06-30-221852
[miabbott@mastershake (container) ~/openshift-cluster-installs/4.2.0-0.nightly-2019-06-30-221852 ]$ oc get nodes
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-139-116.us-west-2.compute.internal   Ready    worker   12m   v1.14.0+04ae0f405
ip-10-0-141-177.us-west-2.compute.internal   Ready    master   21m   v1.14.0+04ae0f405
ip-10-0-150-236.us-west-2.compute.internal   Ready    worker   12m   v1.14.0+04ae0f405
ip-10-0-151-141.us-west-2.compute.internal   Ready    master   21m   v1.14.0+04ae0f405
ip-10-0-163-151.us-west-2.compute.internal   Ready    worker   12m   v1.14.0+04ae0f405
ip-10-0-167-205.us-west-2.compute.internal   Ready    master   21m   v1.14.0+04ae0f405
[miabbott@mastershake (container) ~/openshift-cluster-installs/4.2.0-0.nightly-2019-06-30-221852 ]$ oc debug node/ip-10-0-139-116.us-west-2.compute.internal
Starting pod/ip-10-0-139-116us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# rpm-ostree status
State: idle
AutomaticUpdates: disabled
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:66e8bca50ff16c7082425b8f42578b69f1d28b08fe62d359451132a3e837735d
              CustomOrigin: Managed by pivot tool
                   Version: 420.8.20190630.0 (2019-06-30T20:53:07Z)

              CustomOrigin: Provisioned from oscontainer
                   Version: 420.8.20190624.0 (2019-06-24T00:25:32Z)
sh-4.4# grep ClientAlive /etc/ssh/sshd_config 
ClientAliveInterval 180
#ClientAliveCountMax 3
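
In the output above ClientAliveCountMax is commented out, so the OpenSSH default of 3 applies: sshd sends a keepalive probe every 180 seconds and drops a client that misses three in a row, which is roughly 540 seconds. The sketch below does that arithmetic from `sshd -T` style output; the function name `client_alive_timeout` is hypothetical and was not part of the verification.

```shell
#!/bin/sh
# Hypothetical helper: read "keyword value" lines (such as the output of
# `sshd -T`) on stdin and print how many seconds sshd would wait for an
# unresponsive client before disconnecting it.
client_alive_timeout() {
    awk '
        tolower($1) == "clientaliveinterval" { interval = $2 }
        tolower($1) == "clientalivecountmax" { count = $2 }
        END {
            if (count == "") count = 3        # OpenSSH default
            if (interval == "") interval = 0  # 0 means no probes are sent
            print interval * count
        }'
}
```

On a host this could be run as root with `sshd -T | client_alive_timeout`. Commented-out lines are ignored, so piping in the grep output shown above also works.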

Comment 6 errata-xmlrpc 2019-10-16 06:28:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

