Bug 1026970

Summary: Cannot SSH to app; cannot restart app
Product: OpenShift Online
Reporter: steven.merrill
Component: Containers
Assignee: Rob Millner <rmillner>
Status: CLOSED WORKSFORME
QA Contact: libra bugs <libra-bugs>
Severity: low
Docs Contact:
Priority: unspecified
Version: 1.x
CC: agrimm, bmeng, jkeck, mfisher, rmillner, xtian
Target Milestone: ---
Keywords: Reopened
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1016917
Environment:
Last Closed: 2013-12-17 16:16:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description steven.merrill 2013-11-05 18:23:25 UTC
Description of problem:

Our client has a Silver subscription to OpenShift Online. On application 5238c2aa5973cabd1d0001d3 / utilities-allyou.rhcloud.com, the gear cannot be SSHed to, nor can it be restarted from the Console.

Version-Release number of selected component (if applicable):

OpenShift Online Silver

How reproducible:

Try to restart the app from the Console, get an error.

Try to SSH to the app, get a broken pipe error, a la:

$ rhc ssh utilities
Connecting to 5238c2aa5973cabd1d0001d3.com ...
Write failed: Broken pipe

Steps to Reproduce:

Try to restart from CLI tools:

$ rhc-ay app restart utilities
Unable to complete the requested operation due to: Failed to correctly execute all parallel operations.
Reference ID: df2580cf5529553ff87f308bafc0ccf4

Try to restart from the Console and get a similar message.

Try to SSH to the app and get a broken pipe:

$ rhc ssh utilities
Connecting to 5238c2aa5973cabd1d0001d3.com ...
Write failed: Broken pipe

Actual results:

Unable to restart or SSH to my app to do Jenkins builds.

Expected results:

I can SSH to my app to do Jenkins builds.

Additional info:

The application is ID 5238c2aa5973cabd1d0001d3, or http://utilities-allyou.rhcloud.com/.

Comment 1 steven.merrill 2013-11-05 18:35:42 UTC
Just to be clear, restarting the app isn't as important as making SSH work. I merely restarted in the hope that it might make SSH work again.

Comment 2 steven.merrill 2013-11-05 18:50:06 UTC
A little more debugging information: the connection appears to get as far as oo-trap-user, but then dies soon thereafter.

From an `ssh -vvv`:

[snipped]
debug1: Server accepts key: pkalg ssh-rsa blen 277
debug2: input_userauth_pk_ok: [snip]
debug3: sign_and_send_pubkey: [snip]
debug1: Remote: Forced command: /usr/bin/oo-trap-user
debug1: Remote: X11 forwarding disabled.
debug1: Authentication succeeded (publickey).
Authenticated to utilities-allyou.rhcloud.com ([23.22.254.239]:22).
debug1: channel 0: new [client-session]
debug3: ssh_session2_open: channel_new: 0
debug2: channel 0: send open
debug1: Requesting no-more-sessions
debug1: Entering interactive session.
Write failed: Broken pipe

Comment 3 steven.merrill 2013-11-05 19:41:04 UTC
This appears to have cleared itself up without any intervention from my end. Could this have been related to the deploy today? It seemed to start before the deploy and fix itself during it.

Comment 4 Rob Millner 2013-11-05 19:44:30 UTC
It's possible that this was related to a deploy. Working through ops to determine what happened.

Comment 5 Rob Millner 2013-11-12 19:51:12 UTC
This turned out to be deploy-related, and the failures should only be transient.

Please feel free to re-open if you are still experiencing problems.

Comment 6 steven.merrill 2013-12-17 16:05:48 UTC
This appears to be happening again with the same gear.

This morning, I got this:

rhc-ay ssh utilities
Connecting to 5238c2aa5973cabd1d0001d3.com ...
Write failed: Broken pipe

And then the error message progressed a bit to:

rhc-ay ssh utilities
Connecting to 5238c2aa5973cabd1d0001d3.com ...
shell request failed on channel 0

Which is actually this under the covers:

ssh -v 5238c2aa5973cabd1d0001d3.com
OpenSSH_6.2p2, OSSLShim 0.9.8r 8 Dec 2011
debug1: Reading configuration data /Users/smerrill/.ssh/config
debug1: Reading configuration data /etc/ssh_config
debug1: /etc/ssh_config line 20: Applying options for *
debug1: /etc/ssh_config line 53: Applying options for *
debug1: Connecting to utilities-allyou.rhcloud.com [23.22.254.239] port 22.
debug1: Connection established.
debug1: identity file /Users/smerrill/.ssh/id_rsa type 1
debug1: identity file /Users/smerrill/.ssh/id_rsa-cert type -1
debug1: identity file /Users/smerrill/.ssh/id_dsa type 2
debug1: identity file /Users/smerrill/.ssh/id_dsa-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_6.2
debug1: Remote protocol version 2.0, remote software version OpenSSH_5.3
debug1: match: OpenSSH_5.3 pat OpenSSH_5*
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: server->client aes128-ctr hmac-md5 none
debug1: kex: client->server aes128-ctr hmac-md5 none
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
debug1: Server host key: RSA cf:ee:77:cb:0e:fc:02:d7:72:7e:ae:80:c0:90:88:a7
debug1: Host 'utilities-allyou.rhcloud.com' is known and matches the RSA host key.
debug1: Found key in /Users/smerrill/.ssh/known_hosts:667
debug1: ssh_rsa_verify: signature correct
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: Roaming not allowed by server
debug1: SSH2_MSG_SERVICE_REQUEST sent
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic
debug1: Next authentication method: publickey
debug1: Offering RSA public key: /Users/smerrill/.ssh/id_rsa
debug1: Remote: Forced command: /usr/bin/oo-trap-user
debug1: Remote: X11 forwarding disabled.
debug1: Server accepts key: pkalg ssh-rsa blen 277
debug1: Remote: Forced command: /usr/bin/oo-trap-user
debug1: Remote: X11 forwarding disabled.
debug1: Authentication succeeded (publickey).
Authenticated to utilities-allyou.rhcloud.com ([23.22.254.239]:22).
debug1: channel 0: new [client-session]
debug1: Requesting no-more-sessions
debug1: Entering interactive session.
debug1: Sending environment.
debug1: Sending env LANG = en_US.UTF-8
shell request failed on channel 0

It might be related to some kind of ops issue because I am also unable to restart the gear:

rhc-ay app-restart utilities
Unable to complete the requested operation due to: Failed to correctly execute all parallel operations - ["RestartCompOp"].
Reference ID: c1e2b7a1dbb53e9f72a7e8455dda7f88

Any ideas? I don't see any known outages.

Comment 7 steven.merrill 2013-12-17 16:16:21 UTC
Right after I reopened this, it appeared to come back online, but with its data directory empty (was it perhaps migrated to a new node?). It was offline to SSH connections from 10:30 AM EST to 11:08 AM EST.
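
For anyone triaging a similar report: both failure signatures quoted in this bug ("Write failed: Broken pipe" and "shell request failed on channel 0") appear after "Authentication succeeded", which suggests the problem is on the gear side rather than with the key. The following is only a rough sketch of how one might check that from saved `ssh -v` output. The function name `classify_ssh_log` and the stage labels are made up for illustration; the marker strings are the ones that appear in the logs above, and this is not part of rhc or any OpenShift tooling.

```python
# Hypothetical helper: find how far an SSH attempt got before it failed,
# based only on marker strings seen in the client debug output in this report.

# Checkpoints in the order they appear in the `ssh -v` output.
STAGES = [
    ("connected", "Connection established"),
    ("authenticated", "Authentication succeeded"),
    ("session_requested", "Entering interactive session"),
]

# The two failure messages quoted in this bug.
FAILURES = (
    "Write failed: Broken pipe",
    "shell request failed on channel 0",
)


def classify_ssh_log(log_text):
    """Return (last_stage_reached, failure_line), either of which may be None."""
    last_stage = None
    failure = None
    for line in log_text.splitlines():
        for name, marker in STAGES:
            if marker in line:
                last_stage = name
        for marker in FAILURES:
            if marker in line:
                failure = line.strip()
    return last_stage, failure


if __name__ == "__main__":
    sample = (
        "debug1: Authentication succeeded (publickey).\n"
        "debug1: Entering interactive session.\n"
        "Write failed: Broken pipe\n"
    )
    print(classify_ssh_log(sample))
```

If the last stage reached is `session_requested` and a failure line is present, authentication worked and the session died once the forced command was due to run, which matches what both logs in this report show.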