Description of problem: After installing the HEVM and logging in to Manager, the HE host cannot be installed/activated with the following error: Failed to install Host $(HOSTNAME). Invalid data received during bootstrap. Version-Release number of selected component (if applicable): RHV 4.1.9 rhevm-4.1.9.1-0.1.el7.noarch vdsm-4.19.45-1.el7ev.x86_64 How reproducible: 100% Steps to Reproduce: 1. Install HE host following https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html/installation_guide/red_hat_virtualization_hosts 2. Install HEVM following https://access.redhat.com/documentation/en-us/red_hat_virtualization/4.1/html/self-hosted_engine_guide/initiating_self-hosted_engine_deployment_on_rhvh 3. Login to manager GUI and try to install HE host Actual results: Failed to install Host $(HOSTNAME). Invalid data received during bootstrap. Expected results: Host installation finishes successfully Additional info: The following log sequence is visible in error.log: 2018-03-13 12:17:08,037+01 INFO [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (org.ovirt.thread.pool-7-thread-1) [ab095588-5de1-4907-a3b9-f83abca9a8b1] Installation of $(HOSTNAME). Executing command via SSH umask 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}" -x && "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine DIALOG/customization=bool:True < /var/cache/ovirt-engine/ovirt-host-deploy.tar 2018-03-13 12:17:08,037+01 INFO [org.ovirt.engine.core.utils.archivers.tar.CachedTar] (org.ovirt.thread.pool-7-thread-1) [ab095588-5de1-4907-a3b9-f83abca9a8b1] Tarball '/var/cache/ovirt-engine/ovirt-host-deploy.tar' refresh 2018-03-13 12:17:08,686+01 INFO [org.ovirt.engine.core.uutils.ssh.SSHDialog] (org.ovirt.thread.pool-7-thread-1) [ab095588-5de1-4907-a3b9-f83abca9a8b1] SSH execute 'root@$(HOSTNAME)' 'umask 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}" -x && "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine DIALOG/customization=bool:True' 2018-03-13 12:17:09,168+01 ERROR [org.ovirt.otopi.dialog.MachineDialogParser] (VdsDeploy) [ab095588-5de1-4907-a3b9-f83abca9a8b1] Cannot parse input: java.lang.RuntimeException: Invalid data received during bootstrap at org.ovirt.otopi.dialog.MachineDialogParser.nextEvent(MachineDialogParser.java:358) [otopi.jar:] at org.ovirt.otopi.dialog.MachineDialogParser.nextEvent(MachineDialogParser.java:393) [otopi.jar:] at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase.threadMain(VdsDeployBase.java:307) [bll.jar:] at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase.lambda$new$0(VdsDeployBase.java:395) [bll.jar:] at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_161]
Hi, according to the logs, adding host to engine failed on: 2018-03-13 11:52:18 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:92 VDSM host in installing state 2018-03-13 11:52:19 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:92 VDSM host in installing state 2018-03-13 11:52:21 DEBUG otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:92 VDSM host in install_failed state 2018-03-13 11:52:21 ERROR otopi.plugins.gr_he_setup.engine.add_host add_host._wait_host_ready:97 The VDSM host was found in a failed state. Please check engine and bootstrap installation logs. 2018-03-13 11:52:21 ERROR otopi.plugins.gr_he_setup.engine.add_host add_host._closeup:666 Unable to add rhev-h1.testumgebung.local to the manager However, the error you are reporting has a different timestamp: 2018-03-13 12:17:09,168+01 ERROR [org.ovirt.otopi.dialog.MachineDialogParser] (VdsDeploy) [ab095588-5de1-4907-a3b9-f83abca9a8b1] Cannot parse input: java.lang.RuntimeException: Invalid data received during boots trap at org.ovirt.otopi.dialog.MachineDialogParser.nextEvent(MachineDialogParser.java:358) [otopi.jar:] at org.ovirt.otopi.dialog.MachineDialogParser.nextEvent(MachineDialogParser.java:393) [otopi.jar:] at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase.threadMain(VdsDeployBase.java:307) [bll.jar:] at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase.lambda$new$0(VdsDeployBase.java:395) [bll.jar:] at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_161] The relevant one for hosted engine deployment is: 2018-03-13 11:52:18,577+01 INFO [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (org.ovirt.thread.pool-7-thread-1) [d8adaca] Installation of rhev-h1.testumgebung.local. Executing command via SSH umask 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}" -x && "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine DIALOG/customization=bool:True < /var/cache/ovirt-engine/ovirt-host-deploy.tar 2018-03-13 11:52:18,577+01 INFO [org.ovirt.engine.core.utils.archivers.tar.CachedTar] (org.ovirt.thread.pool-7-thread-1) [d8adaca] Tarball '/var/cache/ovirt-engine/ovirt-host-deploy.tar' refresh 2018-03-13 11:52:19,056+01 INFO [org.ovirt.engine.core.uutils.ssh.SSHDialog] (org.ovirt.thread.pool-7-thread-1) [d8adaca] SSH execute 'root.local' 'umask 0077; MYTMP="$(TMPDIR="${OVIRT_TMPDIR}" mktemp -d -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; tar --warning=no-timestamp -C "${MYTMP}" -x && "${MYTMP}"/ovirt-host-deploy DIALOG/dialect=str:machine DIALOG/customization=bool:True' 2018-03-13 11:52:19,401+01 ERROR [org.ovirt.otopi.dialog.MachineDialogParser] (VdsDeploy) [d8adaca] Cannot parse input: java.lang.RuntimeException: Invalid data received during bootstrap at org.ovirt.otopi.dialog.MachineDialogParser.nextEvent(MachineDialogParser.java:358) [otopi.jar:] at org.ovirt.otopi.dialog.MachineDialogParser.nextEvent(MachineDialogParser.java:393) [otopi.jar:] at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase.threadMain(VdsDeployBase.java:307) [bll.jar:] at org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase.lambda$new$0(VdsDeployBase.java:395) [bll.jar:] at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_161] 2018-03-13 11:52:19,407+01 ERROR [org.ovirt.engine.core.bll.hostdeploy.VdsDeployBase] (VdsDeploy) [d8adaca] Error during deploy dialog but for both the errors, due to the execption occurred we are missing the ovirt-host-deploy log. Since it seems to be reproducible on your environment, can you please reproduce it and attach the ovirt-host-deploy log you'll find on /tmp/ on the host?
Hello Sandro, One more update from the customer: ~~~ I have tried today to take another Hardware, unfortunately I got the same error. This way I believe we can exclude a cause on the hardware side. ~~~ -- regards, Igor Netkachev Technical Support Engineer, RHCE, RHCVA Red Hat Global Support Services
nodectl motd prints: ''' node status: OK See `nodectl check` for more information Admin Console: https://10.12.12.100:9090/ or https://10.11.11.160:9090/ ''' So the first line got by the engine is just an empty line which is invalid in machine dialog syntax. And so otopi java code throws a "RuntimeException("Invalid data received during bootstrap")". and unfortunately the breaking like is logged just at DEBUG level and so it's normally discarded, see: https://github.com/oVirt/otopi/blob/otopi-1.6/src/java/MachineDialogParser.java#L343 As a workaround, comment the content of: /etc/profile.d/nodectl-run-banner.sh /etc/profile.d/nodectl-motd.sh Before deploying the rhv-h host. After that the deployment could still fail due to: https://bugzilla.redhat.com/show_bug.cgi?id=1571994 and this also currently needs a workaround: chown vdsm:kvm /var/log/ovirt-imageio-daemon/daemon.log With that workarounds the deployment should work.
I'm not able to reproduce this with any of the versions from the report, and `nodectl motd` has been present since the introduction of 4.1 Were any customizations done in the customer environment? SSH jump hosts from RHVM? Changes to /etc/motd or /etc/issue on RHVH?
Filed bug 1574433 for better logging.
We added better logging of issues similar to this one in bug #1574433 in order to gather more info on this kind of issues. We didn't succeed in reproducing the issue in our labs and fixed the issue on the customer's system. I'm closing this bug with insufficient data resolution since we miss a reproduction of the issue. Please reopen if you can provide more information on how to reproduce the issue.
Could you please provide information which RHV versions exactly contain the fixes so that the customer can try to reproduce and figure out if the issue still exists?
(In reply to Robert Scheck from comment #20) > Could you please provide information which RHV versions exactly contain the > fixes so that the customer can try to reproduce and figure out if the issue > still exists? See bug 1574433 - will be in 4.2.4. It will not fix the problem, as we do not know what that is. It might help debug it further.
BZ<2>Jira Resync
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days