Bug 1641694 - [RFE] hosted-engine --vm-start-paused should clean up any existing HE-VM.
Summary: [RFE] hosted-engine --vm-start-paused should clean up any existing HE-VM.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-hosted-engine-setup
Version: 4.2.7
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.4.0
Target Release: 4.4.0
Assignee: Asaf Rachmani
QA Contact: Nikolai Sednev
URL:
Whiteboard:
Duplicates: 1721144
Depends On: 1795672
Blocks:
Reported: 2018-10-22 14:19 UTC by Nikolai Sednev
Modified: 2020-08-04 13:26 UTC
CC List: 9 users

Fixed In Version: ovirt-hosted-engine-setup-2.4.4
Doc Type: Enhancement
Doc Text:
With this update, you can start the self-hosted engine virtual machine in a paused state. To do so, enter the following command:
----
# hosted-engine --vm-start-paused
----
To un-pause the virtual machine, enter the following command:
----
# hosted-engine --vm-start
----
Clone Of:
Environment:
Last Closed: 2020-08-04 13:26:25 UTC
oVirt Team: Integration
Target Upstream Version:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2020:3246 0 None None None 2020-08-04 13:26:51 UTC
oVirt gerrit 107682 0 master MERGED src: Encode xml to UTF-8 and add the ability to start HE vm when it is paused 2020-12-09 20:58:04 UTC

Description Nikolai Sednev 2018-10-22 14:19:38 UTC
Description of problem:
If the HE-VM was properly shut down from the CLI on an HA host using "hosted-engine --vm-shutdown", a subsequent "hosted-engine --vm-start-paused" cannot start the engine VM, because the VM still exists on the host in the shut off state. The command should work like "hosted-engine --vm-start" and clean up any existing SHE-VM, as here:
[root@alma03 ~]# hosted-engine --vm-start
VM exists and is down, cleaning up and restarting

Instead, this is what currently happens:
!! Cluster is in GLOBAL MAINTENANCE mode !!

[root@alma03 ~]# hosted-engine --vm-shutdown
[root@alma03 ~]#  virsh -r list --all
 Id    Name                           State
----------------------------------------------------
 -     HostedEngine                   shut off

[root@alma03 ~]# hosted-engine --vm-start-paused
Command VM.create with args {'vmParams': {'xml': '<?xml version=\'1.0\' encoding=\'UTF-8\'?>\n<domain xmlns:ovirt-tune="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0" type="kvm"><name>HostedEngine</name><uuid>4ac6d84a-43bd-493e-a443-dda4bdf3d070</uuid><memory>16777216</memory><currentMemory>16777216</currentMemory><iothreads>1</iothreads><maxMemory slots="16">67108864</maxMemory><vcpu current="4">64</vcpu><sysinfo type="smbios"><system><entry name="manufacturer">Red Hat</entry><entry name="product">OS-NAME:</entry><entry name="version">OS-VERSION:</entry><entry name="serial">HOST-SERIAL:</entry><entry name="uuid">4ac6d84a-43bd-493e-a443-dda4bdf3d070</entry></system></sysinfo><clock offset="variable" adjustment="0"><timer name="rtc" tickpolicy="catchup"/><timer name="pit" tickpolicy="delay"/><timer name="hpet" present="no"/></clock><features><acpi/><vmcoreinfo/></features><cpu match="exact"><model>SandyBridge</model><feature policy="require" name="pcid"/><feature policy="require" name="spec-ctrl"/><feature policy="require" name="ssbd"/><topology cores="4" threads="1" sockets="16"/><numa><cell id="0" cpus="0,1,2,3" memory="16777216"/></numa></cpu><cputune/><devices><input type="mouse" bus="ps2"/><channel type="unix"><target type="virtio" name="ovirt-guest-agent.0"/><source mode="bind" path="/var/lib/libvirt/qemu/channels/4ac6d84a-43bd-493e-a443-dda4bdf3d070.ovirt-guest-agent.0"/></channel><channel type="unix"><target type="virtio" name="org.qemu.guest_agent.0"/><source mode="bind" path="/var/lib/libvirt/qemu/channels/4ac6d84a-43bd-493e-a443-dda4bdf3d070.org.qemu.guest_agent.0"/></channel><memballoon model="virtio"><stats period="5"/><alias name="ua-04e5458f-7d6f-488e-8f6e-273a6c580f7c"/></memballoon><graphics type="vnc" port="-1" autoport="yes" passwd="*****" passwdValidTo="1970-01-01T00:00:01" keymap="en-us"><listen type="network" network="vdsm-ovirtmgmt"/></graphics><controller type="virtio-serial" index="0" ports="16"><alias 
name="ua-209ebc9f-0f38-429b-ab25-7b199d975a53"/></controller><rng model="virtio"><backend model="random">/dev/urandom</backend><alias name="ua-2d2344a8-9a92-45c4-973a-997202c20ce1"/></rng><controller type="scsi" model="virtio-scsi" index="0"><driver iothread="1"/><alias name="ua-4b2c19a6-ec53-42cf-af29-fb657b761ce0"/></controller><graphics type="spice" port="-1" autoport="yes" passwd="*****" passwdValidTo="1970-01-01T00:00:01" tlsPort="-1"><channel name="main" mode="secure"/><channel name="inputs" mode="secure"/><channel name="cursor" mode="secure"/><channel name="playback" mode="secure"/><channel name="record" mode="secure"/><channel name="display" mode="secure"/><channel name="smartcard" mode="secure"/><channel name="usbredir" mode="secure"/><listen type="network" network="vdsm-ovirtmgmt"/></graphics><sound model="ich6"><alias name="ua-8229d696-e3bb-4d7f-9aba-308327650cf2"/></sound><controller type="usb" model="piix3-uhci" index="0"/><video><model type="qxl" vram="32768" heads="1" ram="65536" vgamem="16384"/><alias name="ua-d94588ac-515c-4145-8443-414750cfc3c5"/></video><channel type="spicevmc"><target type="virtio" name="com.redhat.spice.0"/></channel><interface type="bridge"><model type="virtio"/><link state="up"/><source bridge="ovirtmgmt"/><driver queues="4" name="vhost"/><alias name="ua-28350ab9-0101-46d2-8069-f7b5e13b0c29"/><mac address="00:16:3e:7b:b8:53"/><mtu size="1500"/><filterref filter="vdsm-no-mac-spoofing"/><bandwidth/></interface><disk type="file" device="cdrom" snapshot="no"><driver name="qemu" type="raw" error_policy="report"/><source file="" startupPolicy="optional"/><target dev="hdc" bus="ide"/><readonly/><alias name="ua-19d2c5d6-f377-45f0-917a-362c75ff87b7"/></disk><disk snapshot="no" type="file" device="disk"><target dev="vda" bus="virtio"/><source file="/rhev/data-center/00000000-0000-0000-0000-000000000000/7d8dec75-ab36-40ee-a612-344dad55c0d8/images/8c1aae00-71c6-48ee-9125-3946d577cc31/c357ec41-b3df-4ea2-b365-1040aa24d32a"/><driver 
name="qemu" iothread="1" io="threads" type="raw" error_policy="stop" cache="none"/><alias name="ua-8c1aae00-71c6-48ee-9125-3946d577cc31"/><serial>8c1aae00-71c6-48ee-9125-3946d577cc31</serial></disk><lease><key>c357ec41-b3df-4ea2-b365-1040aa24d32a</key><lockspace>7d8dec75-ab36-40ee-a612-344dad55c0d8</lockspace><target offset="LEASE-OFFSET:c357ec41-b3df-4ea2-b365-1040aa24d32a:7d8dec75-ab36-40ee-a612-344dad55c0d8" path="LEASE-PATH:c357ec41-b3df-4ea2-b365-1040aa24d32a:7d8dec75-ab36-40ee-a612-344dad55c0d8"/></lease></devices><pm><suspend-to-disk enabled="no"/><suspend-to-mem enabled="no"/></pm><os><type arch="x86_64" machine="pc-i440fx-rhel7.5.0">hvm</type><smbios mode="sysinfo"/></os><metadata><ovirt-tune:qos/><ovirt-vm:vm><minGuaranteedMemoryMb type="int">1024</minGuaranteedMemoryMb><clusterVersion>4.2</clusterVersion><ovirt-vm:custom/><ovirt-vm:device mac_address="00:16:3e:7b:b8:53"><ovirt-vm:custom/></ovirt-vm:device><ovirt-vm:device devtype="disk" name="vda"><ovirt-vm:poolID>00000000-0000-0000-0000-000000000000</ovirt-vm:poolID><ovirt-vm:volumeID>c357ec41-b3df-4ea2-b365-1040aa24d32a</ovirt-vm:volumeID><ovirt-vm:shared>exclusive</ovirt-vm:shared><ovirt-vm:imageID>8c1aae00-71c6-48ee-9125-3946d577cc31</ovirt-vm:imageID><ovirt-vm:domainID>7d8dec75-ab36-40ee-a612-344dad55c0d8</ovirt-vm:domainID></ovirt-vm:device><resumeBehavior>auto_resume</resumeBehavior><ovirt-vm:launchPaused>true</ovirt-vm:launchPaused></ovirt-vm:vm></metadata></domain>'}, 'vmID': '4ac6d84a-43bd-493e-a443-dda4bdf3d070'} failed:
(code=4, message=Virtual machine already exists)

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-2.2.28-1.el7ev.noarch
ovirt-hosted-engine-ha-2.2.18-1.el7ev.noarch
rhvm-appliance-4.2-20181018.0.el7.noarch
Red Hat Enterprise Linux Server release 7.6 (Maipo)
Linux 3.10.0-957.el7.x86_64 #1 SMP Thu Oct 4 20:48:51 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:
100%

Steps to Reproduce:
1. Deploy hosted-engine with node 0.
2. Run hosted-engine --set-maintenance --mode=global.
3. Run hosted-engine --vm-shutdown.
4. Run hosted-engine --vm-start-paused.

Actual results:
hosted-engine --vm-start-paused cannot start a previously shut down HE-VM in the paused state.

Expected results:
The command should clean up any leftovers, just as hosted-engine --vm-start does.
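
The expected behaviour can be sketched as a small check on the "virsh -r list --all" listing shown in the description. This is only an illustration, not the code in ovirt-hosted-engine-setup: the function names domain_state and needs_cleanup are hypothetical, and treating a "shut off" domain as the leftover that must be cleaned up before VM.create is an assumption drawn from this report.

```python
def domain_state(virsh_listing, name="HostedEngine"):
    """Return the state column for `name`, or None if the domain is not listed."""
    for line in virsh_listing.splitlines():
        fields = line.split()
        # Data rows look like: <Id> <Name> <State...>; the state may be
        # two words, for example "shut off".
        if len(fields) >= 3 and fields[1] == name:
            return " ".join(fields[2:])
    return None


def needs_cleanup(state):
    """Assumption: a leftover shut off domain must be cleaned up before VM.create."""
    return state == "shut off"


# Listing copied from the description of this bug.
listing = """\
 Id    Name                           State
----------------------------------------------------
 -     HostedEngine                   shut off
"""

state = domain_state(listing)
print(state, needs_cleanup(state))
```

With the listing from the description, the check reports that a cleanup is needed, which is the case where --vm-start-paused failed with "Virtual machine already exists".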

Additional info:

Comment 1 Nikolai Sednev 2018-10-22 14:21:03 UTC
Please also see https://bugzilla.redhat.com/show_bug.cgi?id=1630090.

Comment 2 Sandro Bonazzola 2019-01-21 08:28:32 UTC
Re-targeting to 4.3.1 since this BZ has not been proposed as a blocker for 4.3.0.
If you think this bug should block 4.3.0, please re-target and set the blocker flag.

Comment 4 Sandro Bonazzola 2019-02-18 07:54:50 UTC
Moving to 4.3.2, as this was not identified as a blocker for 4.3.1.

Comment 5 Nikolai Sednev 2020-02-05 14:04:26 UTC
Please provide your input regarding this RFE if required.

Comment 6 Asaf Rachmani 2020-02-11 12:08:30 UTC
Is this still reproducible?
There is a patch that should fix it - https://gerrit.ovirt.org/#/c/100538/

Comment 7 Martin Tessun 2020-02-13 10:25:33 UTC
Could you please verify if this is still reproducible? See comment#6

Comment 8 Nikolai Sednev 2020-03-03 13:25:49 UTC
What is happening now is as follows:
hosted-engine --set-maintenance --mode=global
[root@alma04 ~]# hosted-engine --vm-shutdown
[root@alma04 ~]# hosted-engine --vm-status


!! Cluster is in GLOBAL MAINTENANCE mode !!



--== Host alma03.qa.lab.tlv.redhat.com (id: 1) status ==--

Host ID                            : 1
Host timestamp                     : 534586
Score                              : 3400
Engine status                      : {"vm": "down", "health": "bad", "detail": "unknown", "reason": "vm not running on this host"}
Hostname                           : alma03.qa.lab.tlv.redhat.com
Local maintenance                  : False
stopped                            : False
crc32                              : c74d12cd
conf_on_shared_storage             : True
local_conf_timestamp               : 534586
Status up-to-date                  : True
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=534586 (Tue Mar  3 15:19:38 2020)
        host-id=1
        score=3400
        vm_conf_refresh_time=534586 (Tue Mar  3 15:19:38 2020)
        conf_on_shared_storage=True
        maintenance=False
        state=GlobalMaintenance
        stopped=False


--== Host alma04.qa.lab.tlv.redhat.com (id: 2) status ==--

Host ID                            : 2
Host timestamp                     : 527447
Score                              : 3400
Engine status                      : {"vm": "down_unexpected", "health": "bad", "detail": "Down", "reason": "bad vm status"}
Hostname                           : alma04.qa.lab.tlv.redhat.com
Local maintenance                  : False
stopped                            : False
crc32                              : 626df609
conf_on_shared_storage             : True
local_conf_timestamp               : 527448
Status up-to-date                  : True
Extra metadata (valid at timestamp):
        metadata_parse_version=1
        metadata_feature_version=1
        timestamp=527447 (Tue Mar  3 15:19:39 2020)
        host-id=2
        score=3400
        vm_conf_refresh_time=527448 (Tue Mar  3 15:19:39 2020)
        conf_on_shared_storage=True
        maintenance=False
        state=GlobalMaintenance
        stopped=False


!! Cluster is in GLOBAL MAINTENANCE mode !!

[root@alma04 ~]# hosted-engine --vm-start-paused
VM exists and is down, cleaning up and restarting
Traceback (most recent call last):
  File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", line 214, in <module>
    args.command(args)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", line 42, in func
    f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", line 52, in create
    vm_params = vmconf.parseVmConfFile(args.filename)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vmconf.py", line 169, in parseVmConfFile
    engine_xml_tree = ovfenvelope.etree_.fromstring(xml)
  File "src/lxml/etree.pyx", line 3213, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1872, in lxml.etree._parseMemoryDocument
ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.
VM failed to launch
[root@alma04 ~]# virsh -r list --all
 Id   Name   State
----------------------
 8    VM1    running
 10   VM3    running

The engine still fails to start in paused mode.
VM1 and VM3 are regular guest VMs running in the environment.
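
The ValueError in the traceback above is raised by lxml when etree.fromstring receives a str that begins with an XML encoding declaration; passing bytes avoids it, which matches the title of the gerrit change linked to this bug ("Encode xml to UTF-8"). The sketch below shows the idea only. The helper name to_xml_bytes is hypothetical, and the standard library parser is used as a stand-in so that the example runs without lxml; it is not the code from vmconf.py.

```python
import xml.etree.ElementTree as ET  # stdlib stand-in for lxml.etree


def to_xml_bytes(xml):
    """Return the document as UTF-8 bytes; bytes input is passed through unchanged."""
    if isinstance(xml, str):
        return xml.encode("utf-8")
    return xml


# Shortened domain XML carrying the same encoding declaration as in this report.
doc = (
    "<?xml version='1.0' encoding='UTF-8'?>\n"
    '<domain type="kvm"><name>HostedEngine</name></domain>'
)

# Parsing bytes lets the parser honour the declared encoding itself.
root = ET.fromstring(to_xml_bytes(doc))
name = root.findtext("name")
print(name)
```

With lxml, the equivalent call would be etree.fromstring(to_xml_bytes(doc)), so that the parser is given bytes rather than a str with a declaration.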

Tested on an NFS deployment with these components:
rhvm-appliance.x86_64 2:4.4-20200123.0.el8ev rhv-4.4.0                                               
sanlock-3.8.0-2.el8.x86_64
qemu-kvm-4.2.0-12.module+el8.2.0+5858+afd073bc.x86_64
vdsm-4.40.5-1.el8ev.x86_64
libvirt-client-6.0.0-7.module+el8.2.0+5869+c23fe68b.x86_64
ovirt-hosted-engine-setup-2.4.2-2.el8ev.noarch
ovirt-hosted-engine-ha-2.4.2-1.el8ev.noarch
Linux 4.18.0-183.el8.x86_64 #1 SMP Sun Feb 23 20:50:47 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux release 8.2 Beta (Ootpa)

Engine software version: 4.4.0-0.17.master.el7
Red Hat Enterprise Linux Server release 7.8 Beta (Maipo)
Linux 3.10.0-1123.el7.x86_64 #1 SMP Tue Jan 14 03:44:38 EST 2020 x86_64 x86_64 x86_64 GNU/Linux

Comment 9 Nikolai Sednev 2020-03-03 13:27:29 UTC
When I run the "hosted-engine --vm-start" command, I get the following:
Command VM.getStats with args {'vmID': '876bfd86-c46a-47cb-895c-51e61c953782'} failed:
(code=1, message=Virtual machine does not exist: {'vmId': '876bfd86-c46a-47cb-895c-51e61c953782'})
VM in WaitForLaunch

And then the engine starts:

[root@alma04 ~]# virsh -r list --all
 Id   Name           State
------------------------------
 8    VM1            running
 10   VM3            running
 12   HostedEngine   running

Comment 12 Sandro Bonazzola 2020-03-27 07:14:43 UTC
*** Bug 1721144 has been marked as a duplicate of this bug. ***

Comment 13 Nikolai Sednev 2020-04-01 14:43:17 UTC
hosted-engine --set-maintenance --mode=global
hosted-engine --vm-status
!! Cluster is in GLOBAL MAINTENANCE mode !!
hosted-engine --vm-shutdown
Command VM.shutdown with args {'vmID': '9e2e81a5-64be-444e-b1c5-d9463f3eee35', 'delay': '120', 'message': 'VM is shutting down!'} failed:
(code=1, message=Virtual machine does not exist: {'vmId': '9e2e81a5-64be-444e-b1c5-d9463f3eee35'})
[root@alma03 ~]# virsh -r list --all
 Id   Name   State
--------------------

[root@alma03 ~]# hosted-engine --vm-start-paused
Command VM.getStats with args {'vmID': '9e2e81a5-64be-444e-b1c5-d9463f3eee35'} failed:
(code=1, message=Virtual machine does not exist: {'vmId': '9e2e81a5-64be-444e-b1c5-d9463f3eee35'})
VM in WaitForLaunch
[root@alma03 ~]# virsh -r list --all
 Id   Name           State
-----------------------------
 1    HostedEngine   paused

Worked for me on a fresh, clean environment; HE 4.4 was deployed on NFS.

Tested on host with these components:
rhvm-appliance.x86_64 2:4.4-20200326.0.el8ev
ovirt-hosted-engine-setup-2.4.4-1.el8ev.noarch
ovirt-hosted-engine-ha-2.4.2-1.el8ev.noarch
Red Hat Enterprise Linux release 8.2 Beta (Ootpa)
Linux 4.18.0-193.el8.x86_64 #1 SMP Fri Mar 27 14:35:58 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Engine:
ovirt-engine-setup-base-4.4.0-0.26.master.el8ev.noarch
ovirt-engine-4.4.0-0.26.master.el8ev.noarch
openvswitch2.11-2.11.0-48.el8fdp.x86_64
Linux 4.18.0-192.el8.x86_64 #1 SMP Tue Mar 24 14:06:40 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux release 8.2 Beta (Ootpa)

Comment 17 errata-xmlrpc 2020-08-04 13:26:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHV RHEL Host (ovirt-host) 4.4), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:3246

