Bug 1640097 - Restoring an HE env from backup fails if the power management was configured for HE hosts
Summary: Restoring an HE env from backup fails if the power management was configured ...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-hosted-engine-setup
Classification: oVirt
Component: General
Version: 2.2.24
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.2.8
Target Release: ---
Assignee: Simone Tiraboschi
QA Contact: Polina
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-10-17 11:19 UTC by Polina
Modified: 2019-01-22 10:23 UTC (History)

Fixed In Version: ovirt-hosted-engine-setup-2.2.29-1.el7ev.noarch.rpm
Clone Of:
Environment:
Last Closed: 2019-01-22 10:23:21 UTC
oVirt Team: Integration
Embargoed:
rule-engine: ovirt-4.2+




Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 94985 0 master MERGED restore: temporary "disable" host fencing 2020-10-28 23:04:20 UTC
oVirt gerrit 95007 0 ovirt-hosted-engine-setup-2.2 MERGED restore: temporary "disable" host fencing 2020-10-28 23:04:21 UTC

Description Polina 2018-10-17 11:19:50 UTC
Description of problem: Restoring a hosted-engine environment from backup fails if power management was configured for the HE hosts.

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup 2.2.24 (per the Version field of this report)

How reproducible: 100%

Scenario from doc https://docs.google.com/document/d/1Hyg7epVNfwSmPx9N8qaITH5vo2mGQm6Ie1JKYbFYBus/edit?ts=5bbcbe3e:
Node 0 -> Node 0
NFS -> NFS
Redeploy on an environment where power management is configured and all the hosts can be reached.

The 4.2 upstream HE environment has two hosts: host1 (not an HE host) and host2 (an HE host). VM1 runs on the non-HE host1 and VM2 runs on the HE host2. Power management is configured on both hosts.

Steps to reproduce:
1. Create the backup file on the engine by running <engine-backup --mode=backup --file=backup_compute-he-4 --log=log_compute-he-4_backup4.2>. Copy the backup file aside (to a laptop).
2. Put the environment into global maintenance.
3. Clean up the HE storage NFS domain.
4. Reprovision the HE host: copy the repos to /etc/yum.repos.d/, run yum update, and run <yum install ovirt-hosted-engine-setup>.
5. Run the restore command on the HE host:
<hosted-engine --deploy --restore-from-file=backup_compute-he-4>.
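The command-line parts of the steps above can be collected into a short shell sketch. This is only an illustration: the commands run on different machines (engine VM and HE host), the storage cleanup and host reprovisioning are not scripted, and the global maintenance command is an assumption, since the report does not say how maintenance was entered. The `run` helper and the `DRY_RUN` variable are hypothetical names; with `DRY_RUN=1` (the default here) the script only prints each command.

```shell
#!/bin/sh
# Sketch of the reproduction steps. DRY_RUN=1 (default) only prints commands,
# so nothing is executed unless DRY_RUN=0 is set explicitly.
DRY_RUN="${DRY_RUN:-1}"
BACKUP_FILE="backup_compute-he-4"        # file name taken from the report
BACKUP_LOG="log_compute-he-4_backup4.2"  # log name taken from the report

# run CMD ARGS...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Step 1 (on the engine VM): create the backup.
run engine-backup --mode=backup --file="$BACKUP_FILE" --log="$BACKUP_LOG"

# Step 2 (on an HE host): enter global maintenance.
# Assumption: the report does not give the command; this is the usual CLI form.
run hosted-engine --set-maintenance --mode=global

# Steps 3 and 4 (storage cleanup, reprovisioning, repo setup) depend on the
# environment and are left out; only the package install is shown.
run yum install ovirt-hosted-engine-setup

# Step 5 (on the reprovisioned HE host): restore from the backup file.
run hosted-engine --deploy --restore-from-file="$BACKUP_FILE"
```

Running it as is prints the four commands prefixed with `+`, which is a convenient way to review them before executing anything on a real host.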

Actual results:
The deploy starts fine and asks all its questions, then hangs for a long time (I waited two hours), and then the host disconnects.
The last output lines are:
[ INFO  ] changed: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [Set FQDN]
[ INFO  ] changed: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [Force the local VM FQDN to temporary resolve on the natted network address]
[ INFO  ] changed: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [Restore sshd reverse DNS lookups]
[ INFO  ] changed: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [Generate an answer file for engine-setup]
[ INFO  ] changed: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [Include before engine-setup custom tasks files for the engine VM]
[ INFO  ] TASK [include_tasks]
[ INFO  ] ok: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [Copy the backup file to the engine VM for restore]
[ INFO  ] changed: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [Run engine-backup]
[ INFO  ] changed: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [Remove backup file]
[ INFO  ] changed: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [include_tasks]
[ INFO  ] ok: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [Find configuration file for SCL PostgreSQL]
[ INFO  ] changed: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [Check SCL PostgreSQL value]
[ INFO  ] changed: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [Set SCL prefix for PostgreSQL]
[ INFO  ] ok: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [Remove previous hosted-engine VM]
[ INFO  ] changed: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [Remove dynamic data for VMs on the host used to redeploy]
[ INFO  ] changed: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [Remove host used to redeploy]
[ INFO  ] changed: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [Remove previous HE storage domain to avoid name conflicts]
[ INFO  ] changed: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [Execute engine-setup]
[ INFO  ] changed: [compute-ge-he-4.scl.lab.tlv.redhat.com]
[ INFO  ] TASK [Include after engine-setup custom tasks files for the engine VM]
[ INFO  ] TASK [Wait for the engine to reach a stable condition]
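When a deploy hangs like this, the last TASK line shows where it stopped. A minimal sketch, assuming the deploy output was saved to a file or is piped in; `last_task` is a hypothetical helper name, not part of hosted-engine:

```shell
# last_task: read deploy output on stdin and print the name of the last
# "TASK [...]" entry, i.e. the task that was running when output stopped.
last_task() {
    grep -o 'TASK \[.*\]' | tail -n 1 | sed 's/^TASK \[\(.*\)\]$/\1/'
}

# Example using three lines copied from the output above.
printf '%s\n' \
    '[ INFO  ] TASK [Execute engine-setup]' \
    '[ INFO  ] changed: [compute-ge-he-4.scl.lab.tlv.redhat.com]' \
    '[ INFO  ] TASK [Wait for the engine to reach a stable condition]' \
    | last_task
# prints: Wait for the engine to reach a stable condition
```

With a saved log the call would be `last_task < deploy.log`, where `deploy.log` is a placeholder path.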

There is no SSH access to the host and it does not answer ping. Checking Power Management shows that the "Host is currently off". The host can be started through power control, but the restore operation did not succeed, so there is no engine.

Expected results: The restore deploy succeeds; the engine and hosts are up.


Additional info: The hosts in the environment are AMD machines (cougar03.scl.lab.tlv.redhat.com, cougar04.scl.lab.tlv.redhat.com, cougar05.scl.lab.tlv.redhat.com).

Comment 1 Polina 2018-10-29 11:48:27 UTC
Verified on ovirt-hosted-engine-setup-2.2.30.

Comment 2 Sandro Bonazzola 2018-11-02 14:36:04 UTC
This bugzilla is included in oVirt 4.2.7 release, published on November 2nd 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.7 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

Comment 3 Sandro Bonazzola 2018-11-02 14:40:11 UTC
Closed by mistake; moving back to QA -> verified.

Comment 4 Sandro Bonazzola 2019-01-22 10:23:21 UTC
This bugzilla is included in oVirt 4.2.8 release, published on January 22nd 2019.

Since the problem described in this bug report should be
resolved in oVirt 4.2.8 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

