Bug 1034726 - When re-running --deploy, ha services should be stopped to allow re-using existing storage
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-hosted-engine-setup
Version: 3.3.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.4.0
Assignee: Sandro Bonazzola
QA Contact: movciari
URL:
Whiteboard: integration
Duplicates: 1034826
Depends On:
Blocks: 1066373 rhev3.4beta 1142926
Reported: 2013-11-26 12:03 UTC by Aharon Canan
Modified: 2018-12-05 16:39 UTC (History)
CC List: 17 users

Fixed In Version: ovirt-3.4.0-beta3
Doc Type: Bug Fix
Doc Text:
* Previously, the high-availability daemon was enabled by the rpm install and not stopped upon termination of a hosted-engine deployment. This meant that if the hosted engine was deployed, but was aborted or failed after having created the engine virtual machine, the hosted engine could not be redeployed as it conflicted with the virtual machine already started by the high availability daemon. Now, the high availability daemon is enabled by hosted-engine deployment, and the hosted engine checks for an existing virtual machine running on the host. Redeployment of the hosted engine no longer fails due to the presence of a virtual machine created during a previous deployment.
Clone Of:
Clones: 1066373
Environment:
Last Closed: 2014-06-09 14:47:34 UTC
oVirt Team: ---
Target Upstream Version:
Embargoed:


Attachments
logs (3.24 MB, text/x-log)
2013-11-26 12:07 UTC, Aharon Canan


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2014:0505 0 normal SHIPPED_LIVE ovirt-hosted-engine-setup bug fix and enhancement update 2014-06-09 18:45:23 UTC
oVirt gerrit 24479 0 None None None Never
oVirt gerrit 24481 0 None None None Never
oVirt gerrit 24548 0 None None None Never
oVirt gerrit 24608 0 None None None Never

Description Aharon Canan 2013-11-26 12:03:22 UTC
Description of problem:
Redeploying fails because the HA service was not stopped.

Version-Release number of selected component (if applicable):
is24.2

How reproducible:
100%

Steps to Reproduce:
1. Run "hosted-engine --deploy" and make it fail.
2. Rerun "hosted-engine --deploy" using the same NFS share.

Actual results:
The deployment fails.

Expected results:
The redeployment should succeed.

Additional info: (from vdsm logs)
Thread-53::ERROR::2013-11-26 13:27:48,742::BindingXMLRPC::1003::vds::(wrapper) unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/BindingXMLRPC.py", line 989, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/BindingXMLRPC.py", line 240, in vmSetTicket
    return vm.setTicket(password, ttl, existingConnAction, params)
  File "/usr/share/vdsm/API.py", line 592, in setTicket
    return v.setTicket(password, ttl, existingConnAction, params)
  File "/usr/share/vdsm/vm.py", line 4303, in setTicket
    graphics = _domParseStr(self._dom.XMLDesc(0)).childNodes[0]. \
AttributeError: 'NoneType' object has no attribute 'XMLDesc'

Comment 1 Aharon Canan 2013-11-26 12:07:21 UTC
Created attachment 829235
logs

Comment 2 Sandro Bonazzola 2013-11-26 16:05:58 UTC
*** Bug 1034826 has been marked as a duplicate of this bug. ***

Comment 3 Alex Lourie 2013-11-27 12:13:45 UTC
@Doron

What should the setup do if there's an already defined VM on this machine with the same name? Stop it? Delete?

What is the valid way to continue?

Thanks.

Comment 4 Doron Fediuck 2013-11-28 08:38:15 UTC
Hi Alex,
In this specific case, there was an earlier error from libvirt, which did not find a VM because it was not running. So it shouldn't be an issue.

Generally speaking, we should check whether there is a running VM. If we find one, we should ask the user for permission to stop it before proceeding.

Comment 12 Sandro Bonazzola 2014-02-14 11:07:49 UTC
The relevant error in the attached vdsm.log is:

Thread-42::DEBUG::2013-11-26 13:27:37,707::libvirtconnection::108::libvirtconnection::(wrapper) Unknown libvirterror: ecode: 9 edom: 20 level: 2 message: operation failed: domain 'HostedEngine' already exists with uuid 7c13d921-6adf-4737-94fa-e387b3de1c97
Thread-42::DEBUG::2013-11-26 13:27:37,707::vm::2118::vm.Vm::(_startUnderlyingVm) vmId=`af3da3f8-b598-4810-9845-f58f679a6d8e`::_ongoingCreations released
Thread-42::ERROR::2013-11-26 13:27:37,708::vm::2144::vm.Vm::(_startUnderlyingVm) vmId=`af3da3f8-b598-4810-9845-f58f679a6d8e`::The vm start process failed

Hosted engine is trying to create a VM 'HostedEngine' with a new UUID: af3da3f8-b598-4810-9845-f58f679a6d8e

The existing VM was started by the HA daemon at reboot after a partial or aborted setup.
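
To spot this conflict in a vdsm.log, the libvirt message quoted above can be matched with a regular expression. This is only a minimal sketch: the pattern is derived from the single log line in this comment, and the names DOMAIN_EXISTS_RE and find_domain_conflict are hypothetical, not part of vdsm or hosted-engine-setup.

```python
import re

# Pattern built from the one libvirt message quoted in this comment;
# other libvirt versions may word the error differently (assumption).
DOMAIN_EXISTS_RE = re.compile(
    r"domain '(?P<name>[^']+)' already exists "
    r"with uuid (?P<uuid>[0-9a-fA-F-]{36})"
)


def find_domain_conflict(log_line):
    """Return (name, uuid) of the already defined domain, or None."""
    match = DOMAIN_EXISTS_RE.search(log_line)
    if match is None:
        return None
    return match.group("name"), match.group("uuid")


# The log line from the attached vdsm.log, as quoted in this comment.
line = (
    "Thread-42::DEBUG::2013-11-26 13:27:37,707::libvirtconnection::108::"
    "libvirtconnection::(wrapper) Unknown libvirterror: ecode: 9 edom: 20 "
    "level: 2 message: operation failed: domain 'HostedEngine' already "
    "exists with uuid 7c13d921-6adf-4737-94fa-e387b3de1c97"
)
conflict = find_domain_conflict(line)
print(conflict)
```

The UUID returned here belongs to the VM left over from the earlier setup, not to the one the new deployment tries to create.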

Comment 13 Sandro Bonazzola 2014-02-14 12:07:22 UTC
Pushed a first patch that prevents the HA daemons from being started just by installing the rpm and rebooting.

Comment 14 Sandro Bonazzola 2014-02-14 12:25:48 UTC
Pushed a second patch that checks whether any VM is already running on the host, the same way we do for storage pools.
If any VM is running, hosted engine cannot be deployed on the system.
The system lists the UUIDs of the running VMs.
Since this condition should not be reached on a clean system, the user should investigate why the VM is running. We do not shut it down; we just abort the deploy command.
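
The behaviour described in this comment could be sketched roughly as follows. This is not the merged patch (see the gerrit changes under Links for that); DeployAborted, ensure_no_running_vms and the list_running_vm_uuids callable are hypothetical names, and how the running VMs are actually queried from the host is left as an assumption.

```python
class DeployAborted(Exception):
    """Raised when the host is not in a state that allows deployment."""


def ensure_no_running_vms(list_running_vm_uuids):
    """Abort the deployment if any VM is running on the host.

    list_running_vm_uuids is a hypothetical callable that returns the UUIDs
    of the running VMs as strings. Running VMs are reported, never shut down.
    """
    uuids = sorted(list_running_vm_uuids())
    if uuids:
        raise DeployAborted(
            "Cannot deploy hosted engine, VMs are running on this host: "
            + ", ".join(uuids)
        )


# Example: one leftover VM from an aborted setup blocks the redeployment.
try:
    ensure_no_running_vms(lambda: ["7c13d921-6adf-4737-94fa-e387b3de1c97"])
except DeployAborted as err:
    message = str(err)
    print(message)
```

Raising an exception instead of stopping the VM leaves the decision with the user, which matches the reasoning given above.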

Comment 15 Sandro Bonazzola 2014-02-17 11:02:17 UTC
The hosted-engine-setup patches have been merged into the upstream master and 1.1 branches. The hosted-engine-ha patches are pending review.

Comment 22 errata-xmlrpc 2014-06-09 14:47:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0505.html

