Description of problem:
RHEVH-HE deployment failed to register the RHEVH (20151104.0.el7ev) host in the Engine (3.6.0.3-0.1.el6).

During HE deployment from RHEVH over iSCSI, after RHEVM was installed on RHEL 6.7 (installed from PXE), almost at the end of the deployment, RHEVH was not added to the engine:

[ INFO ] Waiting for the host to become operational in the engine. This may take several minutes...
[ ERROR ] The VDSM host was found in a failed state. Please check engine and bootstrap installation logs.
[ ERROR ] Unable to add hosted_engine_1 to the manager
[ INFO ] Saving hosted-engine configuration on the shared storage domain
Please shutdown the VM allowing the system to launch it as a monitored service.
The system will wait until the VM is down.

Later on, I logged in to the engine and tried to activate the host manually from the engine's web UI (via the "Activate" button) and received this error:

Operation Canceled
Error while executing action: hosted_engine_1: Cannot activate Host. Host has no unique id.

Version-Release number of selected component (if applicable):
Engine:
rhevm-3.6.0.3-0.1.el6.noarch
Linux version 2.6.32-573.7.1.el6.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-16) (GCC) ) #1 SMP Thu Sep 10 13:42:16 EDT 2015

Host:
ovirt-vmconsole-host-1.0.0-1.el7ev.noarch
ovirt-node-branding-rhev-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
ovirt-node-lib-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
ovirt-node-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
ovirt-node-plugin-snmp-logic-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
ovirt-hosted-engine-setup-1.3.0-1.el7ev.noarch
ovirt-node-plugin-vdsm-0.6.1-2.el7ev.noarch
ovirt-setup-lib-1.0.0-1.el7ev.noarch
ovirt-vmconsole-1.0.0-1.el7ev.noarch
ovirt-node-lib-config-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
ovirt-node-selinux-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
ovirt-node-plugin-cim-logic-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
ovirt-hosted-engine-ha-1.3.2.1-1.el7ev.noarch
ovirt-node-plugin-hosted-engine-0.3.0-2.el7ev.noarch
ovirt-node-plugin-snmp-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
ovirt-node-plugin-rhn-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
sanlock-3.2.4-1.el7.x86_64
ovirt-node-lib-legacy-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
libvirt-1.2.17-13.el7.x86_64
qemu-kvm-rhev-2.3.0-31.el7.x86_64
vdsm-4.17.10.1-0.el7ev.noarch
ovirt-host-deploy-offline-1.4.0-1.el7ev.x86_64
ovirt-node-plugin-cim-3.6.0-0.20.20151103git3d3779a.el7ev.noarch
ovirt-host-deploy-1.4.0-1.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy RHEVH-HE.

Actual results:
RHEVH was not added to the engine, hence the deployment failed.

Expected results:
RHEVH should be successfully added to the engine.

Additional info:
Logs from the RHEVH host are attached.
Created attachment 1090311 [details] sosreport from host
Created attachment 1090313 [details] some logs from the host that I collected a few minutes after the sosreport
Retried the deployment over iSCSI with Red Hat Enterprise Virtualization Hypervisor (Beta) release 7.2 (20151113.123.el7ev), with RHEVH installed from USB. This time ovirt-ha-agent was not running on the host after the HE deployment ended.

In the engine's web UI I saw that the host was successfully added and became active. After the deployment finished, however, I could not get a running engine because ovirt-ha-agent was not running on the host:

# service ovirt-ha-agent status
Redirecting to /bin/systemctl status ovirt-ha-agent.service
● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Sun 2015-11-15 13:13:51 UTC; 47min ago
  Process: 12786 ExecStop=/usr/lib/systemd/systemd-ovirt-ha-agent stop (code=exited, status=0/SUCCESS)
 Main PID: 12782 (code=exited, status=0/SUCCESS)

Nov 15 13:13:44 black-vdsb.qa.lab.tlv.redhat.com systemd[1]: Starting oVirt Hosted Engine High Availability Monitoring Agent...
Nov 15 13:13:48 black-vdsb.qa.lab.tlv.redhat.com systemd-ovirt-ha-agent[12756]: Starting ovirt-ha-agent: [ OK ]
Nov 15 13:13:48 black-vdsb.qa.lab.tlv.redhat.com systemd[1]: Started oVirt Hosted Engine High Availability Monitoring Agent.
Nov 15 13:13:51 black-vdsb.qa.lab.tlv.redhat.com systemd-ovirt-ha-agent[12786]: Stopping ovirt-ha-agent: [ OK ]

# service ovirt-ha-broker status
Redirecting to /bin/systemctl status ovirt-ha-broker.service
● ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2015-11-15 13:13:52 UTC; 47min ago
  Process: 12828 ExecStart=/usr/lib/systemd/systemd-ovirt-ha-broker start (code=exited, status=0/SUCCESS)
 Main PID: 12872 (ovirt-ha-broker)
   CGroup: /system.slice/ovirt-ha-broker.service
           └─12872 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker

Nov 15 13:13:48 black-vdsb.qa.lab.tlv.redhat.com systemd[1]: Starting oVirt Hosted Engine High Availability Communications Broker...
Nov 15 13:13:52 black-vdsb.qa.lab.tlv.redhat.com systemd-ovirt-ha-broker[12828]: Starting ovirt-ha-broker: [ OK ]
Nov 15 13:13:52 black-vdsb.qa.lab.tlv.redhat.com systemd[1]: Started oVirt Hosted Engine High Availability Communications Broker.

Attaching the sosreport from the host.
Created attachment 1094430 [details] sosreport from host 15_11_15_16_10_PM
Retested the same scenario on NFS and ended up with the same dead ovirt-ha-agent service; the sosreport is attached.

# service ovirt-ha-agent status
Redirecting to /bin/systemctl status ovirt-ha-agent.service
● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Sun 2015-11-15 17:26:50 UTC; 1min 44s ago
  Process: 11444 ExecStop=/usr/lib/systemd/systemd-ovirt-ha-agent stop (code=exited, status=0/SUCCESS)
 Main PID: 11440 (code=exited, status=0/SUCCESS)

Nov 15 17:26:43 black-vdsb.qa.lab.tlv.redhat.com systemd[1]: Starting oVirt Hosted Engine High Availability Monitor......
Nov 15 17:26:47 black-vdsb.qa.lab.tlv.redhat.com systemd[1]: Started oVirt Hosted Engine High Availability Monitori...nt.
Nov 15 17:26:47 black-vdsb.qa.lab.tlv.redhat.com systemd-ovirt-ha-agent[11413]: Starting ovirt-ha-agent: [ OK ]
Nov 15 17:26:50 black-vdsb.qa.lab.tlv.redhat.com systemd-ovirt-ha-agent[11444]: Stopping ovirt-ha-agent: [ OK ]
Hint: Some lines were ellipsized, use -l to show in full.

# service ovirt-ha-broker status
Redirecting to /bin/systemctl status ovirt-ha-broker.service
● ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2015-11-15 17:26:51 UTC; 1min 54s ago
  Process: 11485 ExecStart=/usr/lib/systemd/systemd-ovirt-ha-broker start (code=exited, status=0/SUCCESS)
 Main PID: 11520 (ovirt-ha-broker)
   CGroup: /system.slice/ovirt-ha-broker.service
           └─11520 /usr/bin/python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker

Nov 15 17:26:48 black-vdsb.qa.lab.tlv.redhat.com systemd[1]: Starting oVirt Hosted Engine High Availability Communi......
Nov 15 17:26:51 black-vdsb.qa.lab.tlv.redhat.com systemd[1]: Started oVirt Hosted Engine High Availability Communic...er.
Nov 15 17:26:51 black-vdsb.qa.lab.tlv.redhat.com systemd-ovirt-ha-broker[11485]: Starting ovirt-ha-broker: [ OK ]
Hint: Some lines were ellipsized, use -l to show in full.
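
In case it helps anyone scripting this check across retests, here is a minimal sketch that pulls the state out of the "Active:" line of `systemctl status` output like the above. It is not part of oVirt or systemd; the helper name `parse_active_state` and the sample text are only illustrative, and it assumes the "Active: <state> (<substate>)" layout seen in this report.

```python
import re

# Hypothetical helper (not an oVirt or systemd API): matches the
# "Active: <state> (<substate>)" fragment of `systemctl status` output.
ACTIVE_RE = re.compile(r"Active:\s+(\w+)\s+\(([^)]+)\)")


def parse_active_state(status_text):
    """Return (state, substate), e.g. ("inactive", "dead"), or None if absent."""
    match = ACTIVE_RE.search(status_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Sample copied from the ovirt-ha-agent status output in this report.
sample = """\
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Sun 2015-11-15 17:26:50 UTC; 1min 44s ago
 Main PID: 11440 (code=exited, status=0/SUCCESS)
"""

print(parse_active_state(sample))  # prints ('inactive', 'dead')
```

On a host where the agent is healthy, the same helper should report ("active", "running"), as the broker output does here.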
Created attachment 1094521 [details] sosreport from host 15_11_15_19_38_PM_NFS_deployment_failed_on_ovirt_ha_agent_dead
According to comment 3 and comment 5, this seems to be the same issue as bug 1280268.
This indeed looks like a dupe of bug 1280268.

*** This bug has been marked as a duplicate of bug 1280268 ***