Description of problem:
If the HE VM has been properly shut down from the CLI on the HA host with "hosted-engine --vm-shutdown", a subsequent "hosted-engine --vm-start-paused" cannot start the engine VM, because the shut-off VM still exists on the host. The command should behave like "hosted-engine --vm-start" and clean up any leftover SHE VM, like here:

alma03 ~]# hosted-engine --vm-start
VM exists and is down, cleaning up and restarting

Instead, this is what currently happens:

!! Cluster is in GLOBAL MAINTENANCE mode !!
[root@alma03 ~]# hosted-engine --vm-shutdown
[root@alma03 ~]# virsh -r list --all
 Id    Name                           State
----------------------------------------------------
 -     HostedEngine                   shut off

[root@alma03 ~]# hosted-engine --vm-start-paused
Command VM.create with args {'vmParams': {'xml': '<?xml version=\'1.0\' encoding=\'UTF-8\'?>\n<domain xmlns:ovirt-tune="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0" type="kvm"><name>HostedEngine</name><uuid>4ac6d84a-43bd-493e-a443-dda4bdf3d070</uuid><memory>16777216</memory><currentMemory>16777216</currentMemory><iothreads>1</iothreads><maxMemory slots="16">67108864</maxMemory><vcpu current="4">64</vcpu><sysinfo type="smbios"><system><entry name="manufacturer">Red Hat</entry><entry name="product">OS-NAME:</entry><entry name="version">OS-VERSION:</entry><entry name="serial">HOST-SERIAL:</entry><entry name="uuid">4ac6d84a-43bd-493e-a443-dda4bdf3d070</entry></system></sysinfo><clock offset="variable" adjustment="0"><timer name="rtc" tickpolicy="catchup"/><timer name="pit" tickpolicy="delay"/><timer name="hpet" present="no"/></clock><features><acpi/><vmcoreinfo/></features><cpu match="exact"><model>SandyBridge</model><feature policy="require" name="pcid"/><feature policy="require" name="spec-ctrl"/><feature policy="require" name="ssbd"/><topology cores="4" threads="1" sockets="16"/><numa><cell id="0" cpus="0,1,2,3" memory="16777216"/></numa></cpu><cputune/><devices><input type="mouse" bus="ps2"/><channel type="unix"><target
type="virtio" name="ovirt-guest-agent.0"/><source mode="bind" path="/var/lib/libvirt/qemu/channels/4ac6d84a-43bd-493e-a443-dda4bdf3d070.ovirt-guest-agent.0"/></channel><channel type="unix"><target type="virtio" name="org.qemu.guest_agent.0"/><source mode="bind" path="/var/lib/libvirt/qemu/channels/4ac6d84a-43bd-493e-a443-dda4bdf3d070.org.qemu.guest_agent.0"/></channel><memballoon model="virtio"><stats period="5"/><alias name="ua-04e5458f-7d6f-488e-8f6e-273a6c580f7c"/></memballoon><graphics type="vnc" port="-1" autoport="yes" passwd="*****" passwdValidTo="1970-01-01T00:00:01" keymap="en-us"><listen type="network" network="vdsm-ovirtmgmt"/></graphics><controller type="virtio-serial" index="0" ports="16"><alias name="ua-209ebc9f-0f38-429b-ab25-7b199d975a53"/></controller><rng model="virtio"><backend model="random">/dev/urandom</backend><alias name="ua-2d2344a8-9a92-45c4-973a-997202c20ce1"/></rng><controller type="scsi" model="virtio-scsi" index="0"><driver iothread="1"/><alias name="ua-4b2c19a6-ec53-42cf-af29-fb657b761ce0"/></controller><graphics type="spice" port="-1" autoport="yes" passwd="*****" passwdValidTo="1970-01-01T00:00:01" tlsPort="-1"><channel name="main" mode="secure"/><channel name="inputs" mode="secure"/><channel name="cursor" mode="secure"/><channel name="playback" mode="secure"/><channel name="record" mode="secure"/><channel name="display" mode="secure"/><channel name="smartcard" mode="secure"/><channel name="usbredir" mode="secure"/><listen type="network" network="vdsm-ovirtmgmt"/></graphics><sound model="ich6"><alias name="ua-8229d696-e3bb-4d7f-9aba-308327650cf2"/></sound><controller type="usb" model="piix3-uhci" index="0"/><video><model type="qxl" vram="32768" heads="1" ram="65536" vgamem="16384"/><alias name="ua-d94588ac-515c-4145-8443-414750cfc3c5"/></video><channel type="spicevmc"><target type="virtio" name="com.redhat.spice.0"/></channel><interface type="bridge"><model type="virtio"/><link state="up"/><source bridge="ovirtmgmt"/><driver 
queues="4" name="vhost"/><alias name="ua-28350ab9-0101-46d2-8069-f7b5e13b0c29"/><mac address="00:16:3e:7b:b8:53"/><mtu size="1500"/><filterref filter="vdsm-no-mac-spoofing"/><bandwidth/></interface><disk type="file" device="cdrom" snapshot="no"><driver name="qemu" type="raw" error_policy="report"/><source file="" startupPolicy="optional"/><target dev="hdc" bus="ide"/><readonly/><alias name="ua-19d2c5d6-f377-45f0-917a-362c75ff87b7"/></disk><disk snapshot="no" type="file" device="disk"><target dev="vda" bus="virtio"/><source file="/rhev/data-center/00000000-0000-0000-0000-000000000000/7d8dec75-ab36-40ee-a612-344dad55c0d8/images/8c1aae00-71c6-48ee-9125-3946d577cc31/c357ec41-b3df-4ea2-b365-1040aa24d32a"/><driver name="qemu" iothread="1" io="threads" type="raw" error_policy="stop" cache="none"/><alias name="ua-8c1aae00-71c6-48ee-9125-3946d577cc31"/><serial>8c1aae00-71c6-48ee-9125-3946d577cc31</serial></disk><lease><key>c357ec41-b3df-4ea2-b365-1040aa24d32a</key><lockspace>7d8dec75-ab36-40ee-a612-344dad55c0d8</lockspace><target offset="LEASE-OFFSET:c357ec41-b3df-4ea2-b365-1040aa24d32a:7d8dec75-ab36-40ee-a612-344dad55c0d8" path="LEASE-PATH:c357ec41-b3df-4ea2-b365-1040aa24d32a:7d8dec75-ab36-40ee-a612-344dad55c0d8"/></lease></devices><pm><suspend-to-disk enabled="no"/><suspend-to-mem enabled="no"/></pm><os><type arch="x86_64" machine="pc-i440fx-rhel7.5.0">hvm</type><smbios mode="sysinfo"/></os><metadata><ovirt-tune:qos/><ovirt-vm:vm><minGuaranteedMemoryMb type="int">1024</minGuaranteedMemoryMb><clusterVersion>4.2</clusterVersion><ovirt-vm:custom/><ovirt-vm:device mac_address="00:16:3e:7b:b8:53"><ovirt-vm:custom/></ovirt-vm:device><ovirt-vm:device devtype="disk" 
name="vda"><ovirt-vm:poolID>00000000-0000-0000-0000-000000000000</ovirt-vm:poolID><ovirt-vm:volumeID>c357ec41-b3df-4ea2-b365-1040aa24d32a</ovirt-vm:volumeID><ovirt-vm:shared>exclusive</ovirt-vm:shared><ovirt-vm:imageID>8c1aae00-71c6-48ee-9125-3946d577cc31</ovirt-vm:imageID><ovirt-vm:domainID>7d8dec75-ab36-40ee-a612-344dad55c0d8</ovirt-vm:domainID></ovirt-vm:device><resumeBehavior>auto_resume</resumeBehavior><ovirt-vm:launchPaused>true</ovirt-vm:launchPaused></ovirt-vm:vm></metadata></domain>'}, 'vmID': '4ac6d84a-43bd-493e-a443-dda4bdf3d070'} failed: (code=4, message=Virtual machine already exists)

Version-Release number of selected component (if applicable):
ovirt-hosted-engine-setup-2.2.28-1.el7ev.noarch
ovirt-hosted-engine-ha-2.2.18-1.el7ev.noarch
rhvm-appliance-4.2-20181018.0.el7.noarch
Red Hat Enterprise Linux Server release 7.6 (Maipo)
Linux 3.10.0-957.el7.x86_64 #1 SMP Thu Oct 4 20:48:51 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:
100%

Steps to Reproduce:
1. Deploy hosted-engine with node 0.
2. hosted-engine --set-maintenance --mode=global
3. hosted-engine --vm-shutdown
4. hosted-engine --vm-start-paused

Actual results:
hosted-engine --vm-start-paused cannot start (paused) the previously shut-down HE VM.

Expected results:
It should clean up any leftovers, just like hosted-engine --vm-start does.

Additional info:
Please also see https://bugzilla.redhat.com/show_bug.cgi?id=1630090.
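Just to illustrate the expected behaviour: "hosted-engine --vm-start" detects a leftover, shut-off HostedEngine domain and cleans it up before creating a new one, while "--vm-start-paused" submits VM.create straight away and gets (code=4, Virtual machine already exists). A minimal sketch of that pre-create check, written against `virsh -r list --all` output purely for illustration (the helper names are hypothetical; the real tooling talks to VDSM, not virsh):

```python
def leftover_vm_state(virsh_output, vm_name="HostedEngine"):
    """Return the State column for vm_name from `virsh -r list --all`
    output, or None if no such domain is listed."""
    for line in virsh_output.splitlines():
        parts = line.split(None, 2)  # Id, Name, State
        if len(parts) == 3 and parts[1] == vm_name:
            return parts[2].strip()
    return None

def needs_cleanup(virsh_output, vm_name="HostedEngine"):
    """A defined-but-down domain must be destroyed before VM.create,
    otherwise VDSM answers (code=4, Virtual machine already exists)."""
    return leftover_vm_state(virsh_output, vm_name) == "shut off"
```

This is the check that "--vm-start-paused" is missing: it should take the same "clean up, then create" path as "--vm-start".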
Re-targeting to 4.3.1, since this BZ has not been proposed as a blocker for 4.3.0. If you think this bug should block 4.3.0, please re-target it and set the blocker flag.
Moving to 4.3.2, as this was not identified as a blocker for 4.3.1.
Please provide your input regarding this RFE if required.
Is this still reproducible? There is a patch that should fix it - https://gerrit.ovirt.org/#/c/100538/
Could you please verify whether this is still reproducible? See comment #6.
What is happening now is as follows:

hosted-engine --set-maintenance --mode=global

[root@alma04 ~]# hosted-engine --vm-shutdown
[root@alma04 ~]# hosted-engine --vm-status

!! Cluster is in GLOBAL MAINTENANCE mode !!

--== Host alma03.qa.lab.tlv.redhat.com (id: 1) status ==--

Host ID                            : 1
Host timestamp                     : 534586
Score                              : 3400
Engine status                      : {"vm": "down", "health": "bad", "detail": "unknown", "reason": "vm not running on this host"}
Hostname                           : alma03.qa.lab.tlv.redhat.com
Local maintenance                  : False
stopped                            : False
crc32                              : c74d12cd
conf_on_shared_storage             : True
local_conf_timestamp               : 534586
Status up-to-date                  : True
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=534586 (Tue Mar 3 15:19:38 2020)
    host-id=1
    score=3400
    vm_conf_refresh_time=534586 (Tue Mar 3 15:19:38 2020)
    conf_on_shared_storage=True
    maintenance=False
    state=GlobalMaintenance
    stopped=False

--== Host alma04.qa.lab.tlv.redhat.com (id: 2) status ==--

Host ID                            : 2
Host timestamp                     : 527447
Score                              : 3400
Engine status                      : {"vm": "down_unexpected", "health": "bad", "detail": "Down", "reason": "bad vm status"}
Hostname                           : alma04.qa.lab.tlv.redhat.com
Local maintenance                  : False
stopped                            : False
crc32                              : 626df609
conf_on_shared_storage             : True
local_conf_timestamp               : 527448
Status up-to-date                  : True
Extra metadata (valid at timestamp):
    metadata_parse_version=1
    metadata_feature_version=1
    timestamp=527447 (Tue Mar 3 15:19:39 2020)
    host-id=2
    score=3400
    vm_conf_refresh_time=527448 (Tue Mar 3 15:19:39 2020)
    conf_on_shared_storage=True
    maintenance=False
    state=GlobalMaintenance
    stopped=False

!! Cluster is in GLOBAL MAINTENANCE mode !!
alma04 ~]# hosted-engine --vm-start-paused
VM exists and is down, cleaning up and restarting
Traceback (most recent call last):
  File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", line 214, in <module>
    args.command(args)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", line 42, in func
    f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vdsm_helper.py", line 52, in create
    vm_params = vmconf.parseVmConfFile(args.filename)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_setup/vmconf.py", line 169, in parseVmConfFile
    engine_xml_tree = ovfenvelope.etree_.fromstring(xml)
  File "src/lxml/etree.pyx", line 3213, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1872, in lxml.etree._parseMemoryDocument
ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.
VM failed to launch

alma04 ~]# virsh -r list --all
 Id   Name   State
----------------------
 8    VM1    running
 10   VM3    running

The engine still fails to start in paused mode. VM1 and VM3 are regular guest VMs running in the environment.
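The ValueError in the traceback above is lxml's documented behaviour: `lxml.etree.fromstring()` refuses a Python str that begins with an `<?xml ... encoding="..."?>` declaration, and asks for bytes input instead. A minimal sketch of the usual workaround, encoding str input to bytes before parsing (the helper name is hypothetical; the actual fix belongs in `ovirt_hosted_engine_setup/vmconf.py`):

```python
def xml_for_parser(xml):
    """Return bytes suitable for lxml.etree.fromstring().

    lxml raises "Unicode strings with encoding declaration are not
    supported" for str input carrying an encoding declaration, so a
    str is encoded to bytes first; bytes pass through unchanged.
    """
    if isinstance(xml, str):
        return xml.encode("utf-8")
    return xml
```

With this in place, the domain XML read from the vm.conf path parses regardless of whether it was loaded as text or as raw bytes.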
Tested on an NFS deployment with these components:

rhvm-appliance.x86_64 2:4.4-20200123.0.el8ev (rhv-4.4.0)
sanlock-3.8.0-2.el8.x86_64
qemu-kvm-4.2.0-12.module+el8.2.0+5858+afd073bc.x86_64
vdsm-4.40.5-1.el8ev.x86_64
libvirt-client-6.0.0-7.module+el8.2.0+5869+c23fe68b.x86_64
ovirt-hosted-engine-setup-2.4.2-2.el8ev.noarch
ovirt-hosted-engine-ha-2.4.2-1.el8ev.noarch
Linux 4.18.0-183.el8.x86_64 #1 SMP Sun Feb 23 20:50:47 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux release 8.2 Beta (Ootpa)

Engine software version: 4.4.0-0.17.master.el7
Red Hat Enterprise Linux Server release 7.8 Beta (Maipo)
Linux 3.10.0-1123.el7.x86_64 #1 SMP Tue Jan 14 03:44:38 EST 2020 x86_64 x86_64 x86_64 GNU/Linux
If I run the "hosted-engine --vm-start" command instead, I get the following:

Command VM.getStats with args {'vmID': '876bfd86-c46a-47cb-895c-51e61c953782'} failed: (code=1, message=Virtual machine does not exist: {'vmId': '876bfd86-c46a-47cb-895c-51e61c953782'})
VM in WaitForLaunch

And then the engine gets started:

alma04 ~]# virsh -r list --all
 Id   Name           State
------------------------------
 8    VM1            running
 10   VM3            running
 12   HostedEngine   running
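Note that in the submitted domain XML (see the VM.create arguments in the description) the only start-paused-specific piece is the `<ovirt-vm:launchPaused>true</ovirt-vm:launchPaused>` metadata element, so both commands should be able to share the same "clean up leftovers, then create" path. A purely illustrative sketch of toggling that flag (not the actual ovirt code):

```python
import xml.etree.ElementTree as ET

OVIRT_VM_NS = "http://ovirt.org/vm/1.0"

def set_launch_paused(domain_xml, paused):
    """Return domain XML with the ovirt-vm:launchPaused metadata flag
    set to 'true' or 'false', creating the element if it is missing."""
    ET.register_namespace("ovirt-vm", OVIRT_VM_NS)
    root = ET.fromstring(domain_xml)
    flag = root.find(f".//{{{OVIRT_VM_NS}}}launchPaused")
    if flag is None:
        vm = root.find(f".//{{{OVIRT_VM_NS}}}vm")
        flag = ET.SubElement(vm, f"{{{OVIRT_VM_NS}}}launchPaused")
    flag.text = "true" if paused else "false"
    return ET.tostring(root, encoding="unicode")
```

Since the two flows differ only in this flag, the cleanup logic that "--vm-start" already performs has no reason to be skipped by "--vm-start-paused".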
*** Bug 1721144 has been marked as a duplicate of this bug. ***
hosted-engine --set-maintenance --mode=global
hosted-engine --vm-status

!! Cluster is in GLOBAL MAINTENANCE mode !!

hosted-engine --vm-shutdown
Command VM.shutdown with args {'vmID': '9e2e81a5-64be-444e-b1c5-d9463f3eee35', 'delay': '120', 'message': 'VM is shutting down!'} failed: (code=1, message=Virtual machine does not exist: {'vmId': '9e2e81a5-64be-444e-b1c5-d9463f3eee35'})

alma03 ~]# virsh -r list --all
 Id   Name   State
--------------------

alma03 ~]# hosted-engine --vm-start-paused
Command VM.getStats with args {'vmID': '9e2e81a5-64be-444e-b1c5-d9463f3eee35'} failed: (code=1, message=Virtual machine does not exist: {'vmId': '9e2e81a5-64be-444e-b1c5-d9463f3eee35'})
VM in WaitForLaunch

[root@alma03 ~]# virsh -r list --all
 Id   Name           State
-----------------------------
 1    HostedEngine   paused

Worked for me on a fresh and clean environment; deployment of HE 4.4 was performed on NFS.

Tested on a host with these components:
rhvm-appliance.x86_64 2:4.4-20200326.0.el8ev
ovirt-hosted-engine-setup-2.4.4-1.el8ev.noarch
ovirt-hosted-engine-ha-2.4.2-1.el8ev.noarch
Red Hat Enterprise Linux release 8.2 Beta (Ootpa)
Linux 4.18.0-193.el8.x86_64 #1 SMP Fri Mar 27 14:35:58 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Engine:
ovirt-engine-setup-base-4.4.0-0.26.master.el8ev.noarch
ovirt-engine-4.4.0-0.26.master.el8ev.noarch
openvswitch2.11-2.11.0-48.el8fdp.x86_64
Linux 4.18.0-192.el8.x86_64 #1 SMP Tue Mar 24 14:06:40 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux release 8.2 Beta (Ootpa)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (RHV RHEL Host (ovirt-host) 4.4), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2020:3246