Description of problem:
hosted-engine deploy on additional host fails with this message:

[ INFO ] Still waiting for VDSM host to become operational...
[ ERROR ] The VDSM host was found in a failed state. Please check engine and bootstrap installation logs.
[ ERROR ] Unable to add hosted_engine_2 to the manager

Version-Release number of selected component (if applicable):
3.5.1

How reproducible:
Always

Steps to Reproduce:
1. Deploy hosted-engine on first host, accept to automatically configure iptables
2. Install OS on second host, enable iptables and allow only ssh access
3. Deploy hosted-engine on second host

Actual results:
Fails with error message above

Expected results:
Succeeds

Additional info:
If you check iptables configuration on the second host, you see that it's the same as before deploy.

engine.log has:

2015-05-13 14:22:50,603 INFO [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) Connecting to rhel6-he2.tlv.redhat.com/10.35.0.194
2015-05-13 14:22:50,619 WARN [org.ovirt.vdsm.jsonrpc.client.utils.retry.Retryable] (SSL Stomp Reactor) Retry failed
2015-05-13 14:22:50,620 ERROR [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (org.ovirt.thread.pool-7-thread-3) Exception during connection
...
2015-05-13 14:23:15,783 ERROR [org.ovirt.engine.core.bll.InstallVdsInternalCommand] (org.ovirt.thread.pool-7-thread-35) [72523e50] Host installation failed for host 13886a16-43ab-4275-a2b0-c519337e59bf, hosted_engine_2.: org.ovirt.engine.core.bll.VdsCommand$VdsInstallException: Host not reachable

host-deploy log of second host has:

2015-05-13 14:22:05 DEBUG otopi.context context.dumpEnvironment:500 ENV NETWORK/iptablesEnable=bool:'False'

several times, and remains so till the end.

This was caused by the fixes [1] to bug 1080823 and its 3.5 clone bug 1192462.
The fix tells the engine to add the host to itself with "override_iptables" being set to the value of self.environment[otopicons.NetEnv.IPTABLES_ENABLE], which is never set in additional host deploy, where we skip the questions about iptables.

[1] https://gerrit.ovirt.org/#/q/I8244440989f486b1006fbe05151cb1fc9aa1fa1d
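The failure mode described above can be illustrated with a small Python sketch. This is not the actual ovirt-hosted-engine-setup code: the build_environment and override_iptables_for helpers are hypothetical, the key string is borrowed from the host-deploy log line, and the default of False is an assumption based on that same log line. It only shows why a value that is set exclusively in the first-host flow ends up as False on an additional host.

```python
# Hypothetical stand-in for otopicons.NetEnv.IPTABLES_ENABLE. The string is
# taken from the host-deploy log ("ENV NETWORK/iptablesEnable=bool:'False'").
IPTABLES_ENABLE = "NETWORK/iptablesEnable"


def build_environment(additional_host, user_accepts_iptables):
    """Simulate the setup environment (hypothetical, heavily simplified)."""
    # Assumed default, matching what the host-deploy log shows.
    env = {IPTABLES_ENABLE: False}
    if not additional_host:
        # Only the first-host flow asks the iptables question; the
        # additional-host flow skips it, so the default is never replaced.
        env[IPTABLES_ENABLE] = user_accepts_iptables
    return env


def override_iptables_for(env):
    """Value that would be passed to the engine as override_iptables."""
    return env[IPTABLES_ENABLE]


first_host = build_environment(additional_host=False, user_accepts_iptables=True)
second_host = build_environment(additional_host=True, user_accepts_iptables=True)

print("first host override_iptables:", override_iptables_for(first_host))
print("additional host override_iptables:", override_iptables_for(second_host))
```

In this sketch the additional host always reports False, so the engine would not touch its firewall, which matches the unchanged iptables configuration observed after the failed deploy.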
A workaround is to manually configure iptables on the additional host, prior to deploy, and open the same ports as on the first host (at least 54321 for vdsm, others for libvirt/spice/vnc).
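After applying the workaround, it may help to confirm from the engine machine that the vdsm port is actually reachable before retrying the deploy. Below is a minimal sketch of such a check in Python; the port_is_open helper is hypothetical and only tests whether a TCP connection can be established, it says nothing about whether vdsm itself is healthy.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

For example, port_is_open("rhel6-he2.tlv.redhat.com", 54321) run on the engine should return True once the iptables rule for 54321 is in place on the additional host. Note that a REJECT rule produces an immediate failure, while a DROP rule would make the call wait for the full timeout.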
I found a slightly different behavior with firewalld active, but the root cause is still the same. It's documented here: https://bugzilla.redhat.com/show_bug.cgi?id=1221221
Can't proceed with verification of this bug until https://bugzilla.redhat.com/show_bug.cgi?id=1294784 is fixed.
The flow for the current bug is the one described in z-stream bug 1222421 comment 11. Please ignore the Description above.
Tested on:
ovirt-vmconsole-1.0.0-1.el7ev.noarch
vdsm-4.17.15-0.el7ev.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.5.x86_64
ovirt-vmconsole-host-1.0.0-1.el7ev.noarch
sanlock-3.2.4-2.el7_2.x86_64
ovirt-setup-lib-1.0.1-1.el7ev.noarch
libvirt-client-1.2.17-13.el7_2.2.x86_64
mom-0.5.1-1.el7ev.noarch
ovirt-hosted-engine-ha-1.3.3.6-1.el7ev.noarch
ovirt-hosted-engine-setup-1.3.2.1-1.el7ev.noarch
ovirt-host-deploy-1.4.1-1.el7ev.noarch

iptables on second host before deployment:

# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  anywhere             anywhere             state RELATED,ESTABLISHED
ACCEPT     icmp --  anywhere             anywhere
ACCEPT     all  --  anywhere             anywhere
ACCEPT     tcp  --  anywhere             anywhere             state NEW tcp dpt:ssh
REJECT     all  --  anywhere             anywhere             reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
REJECT     all  --  anywhere             anywhere             reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

iptables after deployment:

Chain INPUT (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  anywhere             anywhere             state RELATED,ESTABLISHED
ACCEPT     icmp --  anywhere             anywhere
ACCEPT     all  --  anywhere             anywhere
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:54321
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:sunrpc
ACCEPT     udp  --  anywhere             anywhere             udp dpt:sunrpc
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:ssh
ACCEPT     udp  --  anywhere             anywhere             udp dpt:snmp
ACCEPT     tcp  --  anywhere             anywhere             tcp dpt:16514
ACCEPT     tcp  --  anywhere             anywhere             multiport dports rockwell-csp2
ACCEPT     tcp  --  anywhere             anywhere             multiport dports rfb:6923
ACCEPT     tcp  --  anywhere             anywhere             multiport dports 49152:49216
REJECT     all  --  anywhere             anywhere             reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
REJECT     all  --  anywhere             anywhere             PHYSDEV match ! --physdev-is-bridged reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHEA-2016-0375.html