Description of problem: Adding a RHEL host to RHEV-M fails with IOError: 2013-07-22 10:41:29 DEBUG otopi.plugins.otopi.dialog.machine dialog.__logString:215 DIALOG:SEND ***Q:VALUE TIME 2013-07-22 10:41:29 DEBUG otopi.plugins.otopi.dialog.machine dialog.__logString:215 DIALOG:SEND ### 2013-07-22 10:41:29 DEBUG otopi.plugins.otopi.dialog.machine dialog.__logString:215 DIALOG:SEND ### Please specify current time (YYYYmmddHHMMSS+0000), empty to skip: 2013-07-22 10:41:29 DEBUG otopi.plugins.otopi.dialog.machine dialog.__logString:215 DIALOG:SEND ### Response is VALUE TIME=type:value or ABORT TIME 2013-07-22 10:41:29 DEBUG otopi.context context._executeMethod:130 method exception Traceback (most recent call last): File "/tmp/ovirt-1OjmXPyNU3/pythonlib/otopi/context.py", line 120, in _executeMethod method['method']() File "/tmp/ovirt-1OjmXPyNU3/otopi-plugins/otopi/system/clock.py", line 111, in _set_clock format="YYYYmmddHHMMSS+0000" File "/tmp/ovirt-1OjmXPyNU3/otopi-plugins/otopi/dialog/machine.py", line 235, in queryValue opcode, variable = self._readline().split(' ', 1) File "/tmp/ovirt-1OjmXPyNU3/pythonlib/otopi/dialog.py", line 259, in _readline raise IOError(_('End of file')) IOError: End of file Version-Release number of selected component (if applicable): otopi-1.0.0-2.el6ev rhevm-3.2.1-0.39.el6ev How reproducible: Always. Steps to Reproduce: 1. Try to add a host. 2. 3. Actual results: Host <hostname> installation failed. Please refer to enging.log on engine and log files under /tmp on host for further details.. Expected results: Host is added successfully. Additional info: server.log shows "Connection reset by peer" for the SSH session: 2013-07-19 14:34:10,150 INFO [org.apache.sshd.client.session.ClientSessionImpl] (NioProcessor-2) Session created... 2013-07-19 14:34:10,167 INFO [org.apache.sshd.client.session.ClientSessionImpl] (NioProcessor-2) Server version string: SSH-2.0-OpenSSH_6.2 2013-07-19 14:34:10,169 INFO [org.apache.sshd.client.session.ClientSessionImpl] (NioProcessor-2) Received SSH_MSG_KEXINIT 2013-07-19 14:34:10,179 INFO [org.apache.sshd.client.kex.DHG1] (NioProcessor-2) Send SSH_MSG_KEXDH_INIT 2013-07-19 14:34:10,185 INFO [org.apache.sshd.client.kex.DHG1] (NioProcessor-2) Received SSH_MSG_KEXDH_REPLY 2013-07-19 14:34:10,193 INFO [org.apache.sshd.client.session.ClientSessionImpl] (NioProcessor-2) Received SSH_MSG_NEWKEYS 2013-07-19 14:34:10,194 INFO [org.apache.sshd.client.session.ClientSessionImpl] (NioProcessor-2) Send SSH_MSG_SERVICE_REQUEST for ssh-userauth 2013-07-19 14:34:10,276 INFO [org.apache.sshd.client.auth.UserAuthPassword] (pool-4-thread-4) Send SSH_MSG_USERAUTH_REQUEST for password 2013-07-19 14:34:10,285 INFO [org.apache.sshd.client.auth.UserAuthPassword] (NioProcessor-2) Received SSH_MSG_USERAUTH_SUCCESS 2013-07-19 14:34:10,328 INFO [org.apache.sshd.client.channel.ChannelExec] (pool-4-thread-4) Send SSH_MSG_CHANNEL_OPEN on channel 0 2013-07-19 14:34:10,330 INFO [org.apache.sshd.client.channel.ChannelExec] (NioProcessor-2) Send SSH_MSG_CHANNEL_REQUEST exec 2013-07-19 14:34:47,918 WARN [org.apache.sshd.client.session.ClientSessionImpl] (NioProcessor-2) Exception caught: java.io.IOException: Connection reset by peer Problem always occurs at the same point. Log files will follow shortly.
This is interesting! The following should not be in log: 2013-07-19 13:51:43,245 WARN [org.ovirt.engine.core.compat.backendcompat.PropertyInfo] (VdsDeploy) Unable to get value of property: vds for class org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogableBase However, I do not think it is related. I found no reason why at random times we get interrupted exceptions at engine side. Can you please try to stop engine then start and re-try? I would like to see a fresh start of this, if this repeated I will need engine log in debug level, something is happening that I cannot see.
Hello Yair, Any reason why we see the following? can it cause any issue? does it imply data model issue? I cannot see anything that goes wrong... until this interrupt. 2013-07-22 15:51:25,826 WARN [org.ovirt.engine.core.compat.backendcompat.PropertyInfo] (VdsDeploy) Unable to get value of property: glusterVolume for class org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogableBase 2013-07-22 15:51:25,827 WARN [org.ovirt.engine.core.compat.backendcompat.PropertyInfo] (VdsDeploy) Unable to get value of property: vds for class org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogableBase
Firewall rule was killing SSH sessions with 10+ seconds of inactivity.