Bug 986865 - [host-deploy] InterruptedIOException during ssh session
[host-deploy] InterruptedIOException during ssh session
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine (Show other bugs)
3.2.0
All Linux
urgent Severity urgent
: ---
: 3.3.0
Assigned To: Alon Bar-Lev
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2013-07-22 05:38 EDT by Julio Entrena Perez
Modified: 2015-09-22 09 EDT (History)
12 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-07-22 12:13:51 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Julio Entrena Perez 2013-07-22 05:38:11 EDT
Description of problem:
Adding a RHEL host to RHEV-M fails with IOError:

2013-07-22 10:41:29 DEBUG otopi.plugins.otopi.dialog.machine dialog.__logString:215 DIALOG:SEND       ***Q:VALUE TIME
2013-07-22 10:41:29 DEBUG otopi.plugins.otopi.dialog.machine dialog.__logString:215 DIALOG:SEND       ###
2013-07-22 10:41:29 DEBUG otopi.plugins.otopi.dialog.machine dialog.__logString:215 DIALOG:SEND       ### Please specify current time (YYYYmmddHHMMSS+0000), empty to skip:
2013-07-22 10:41:29 DEBUG otopi.plugins.otopi.dialog.machine dialog.__logString:215 DIALOG:SEND       ### Response is VALUE TIME=type:value or ABORT TIME
2013-07-22 10:41:29 DEBUG otopi.context context._executeMethod:130 method exception
Traceback (most recent call last):
  File "/tmp/ovirt-1OjmXPyNU3/pythonlib/otopi/context.py", line 120, in _executeMethod
    method['method']()
  File "/tmp/ovirt-1OjmXPyNU3/otopi-plugins/otopi/system/clock.py", line 111, in _set_clock
    format="YYYYmmddHHMMSS+0000"
  File "/tmp/ovirt-1OjmXPyNU3/otopi-plugins/otopi/dialog/machine.py", line 235, in queryValue
    opcode, variable = self._readline().split(' ', 1)
  File "/tmp/ovirt-1OjmXPyNU3/pythonlib/otopi/dialog.py", line 259, in _readline
    raise IOError(_('End of file'))
IOError: End of file

Version-Release number of selected component (if applicable):
otopi-1.0.0-2.el6ev
rhevm-3.2.1-0.39.el6ev

How reproducible:
Always.

Steps to Reproduce:
1. Try to add a host.
2.
3.

Actual results:
Host <hostname> installation failed. Please refer to enging.log on engine and log files under /tmp on host for further details..

Expected results:
Host is added successfully.

Additional info:
server.log shows "Connection reset by peer" for the SSH session:

2013-07-19 14:34:10,150 INFO  [org.apache.sshd.client.session.ClientSessionImpl] (NioProcessor-2) Session created...
2013-07-19 14:34:10,167 INFO  [org.apache.sshd.client.session.ClientSessionImpl] (NioProcessor-2) Server version string: SSH-2.0-OpenSSH_6.2
2013-07-19 14:34:10,169 INFO  [org.apache.sshd.client.session.ClientSessionImpl] (NioProcessor-2) Received SSH_MSG_KEXINIT
2013-07-19 14:34:10,179 INFO  [org.apache.sshd.client.kex.DHG1] (NioProcessor-2) Send SSH_MSG_KEXDH_INIT
2013-07-19 14:34:10,185 INFO  [org.apache.sshd.client.kex.DHG1] (NioProcessor-2) Received SSH_MSG_KEXDH_REPLY
2013-07-19 14:34:10,193 INFO  [org.apache.sshd.client.session.ClientSessionImpl] (NioProcessor-2) Received SSH_MSG_NEWKEYS
2013-07-19 14:34:10,194 INFO  [org.apache.sshd.client.session.ClientSessionImpl] (NioProcessor-2) Send SSH_MSG_SERVICE_REQUEST for ssh-userauth
2013-07-19 14:34:10,276 INFO  [org.apache.sshd.client.auth.UserAuthPassword] (pool-4-thread-4) Send SSH_MSG_USERAUTH_REQUEST for password
2013-07-19 14:34:10,285 INFO  [org.apache.sshd.client.auth.UserAuthPassword] (NioProcessor-2) Received SSH_MSG_USERAUTH_SUCCESS
2013-07-19 14:34:10,328 INFO  [org.apache.sshd.client.channel.ChannelExec] (pool-4-thread-4) Send SSH_MSG_CHANNEL_OPEN on channel 0
2013-07-19 14:34:10,330 INFO  [org.apache.sshd.client.channel.ChannelExec] (NioProcessor-2) Send SSH_MSG_CHANNEL_REQUEST exec
2013-07-19 14:34:47,918 WARN  [org.apache.sshd.client.session.ClientSessionImpl] (NioProcessor-2) Exception caught: java.io.IOException: Connection reset by peer

Problem always occurs at the same point.

Log files will follow shortly.
Comment 2 Alon Bar-Lev 2013-07-22 06:44:39 EDT
This is interesting!

The following should not be in log:

2013-07-19 13:51:43,245 WARN  [org.ovirt.engine.core.compat.backendcompat.PropertyInfo] (VdsDeploy) Unable to get value of property: vds for class org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogableBase

However, I do not think it is related.

I found no reason why at random times we get interrupted exceptions at engine side.

Can you please try to stop engine then start and re-try? I would like to see a fresh start of this, if this repeated I will need engine log in debug level, something is happening that I cannot see.
Comment 7 Alon Bar-Lev 2013-07-22 10:23:55 EDT
Hello Yair,

Any reason why we see the following? can it cause any issue? does it imply data model issue?

I cannot see anything that goes wrong... until this interrupt.

2013-07-22 15:51:25,826 WARN  [org.ovirt.engine.core.compat.backendcompat.PropertyInfo] (VdsDeploy) Unable to get value of property: glusterVolume for class org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogableBase
2013-07-22 15:51:25,827 WARN  [org.ovirt.engine.core.compat.backendcompat.PropertyInfo] (VdsDeploy) Unable to get value of property: vds for class org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogableBase
Comment 10 Julio Entrena Perez 2013-07-22 12:13:51 EDT
Firewall rule was killing SSH sessions with 10+ seconds of inactivity.

Note You need to log in before you can comment on or make changes to this bug.