Bug 1690294
| Field | Value |
|---|---|
| Summary | Hosted engine deployment failed as VDSM fails to query the guest agent with `KeyError: 'appsList'` |
| Product | [Red Hat Storage] Red Hat Gluster Storage |
| Component | rhhi |
| Version | rhhiv-1.6 |
| Target Release | RHHI-V 1.6 |
| Hardware | x86_64 |
| OS | Linux |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | unspecified |
| Reporter | Mugdha Soni <musoni> |
| Assignee | Sahina Bose <sabose> |
| QA Contact | Mugdha Soni <musoni> |
| CC | c.handel, cjg9411, nicolas, rcyriac, rhs-bugs, sabose, sankarshan, sasundar, seamurph, stirabos |
| Type | Bug |
| Clones | 1690301 (view as bug list) |
| Bug Depends On | 1676893, 1690301 |
| Bug Blocks | 1672932 |
| Last Closed | 2019-05-09 06:09:09 UTC |
Description (Mugdha Soni, 2019-03-19 09:00:30 UTC)
The ovirt-hosted-engine-setup logs are available at the link below:

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/mugdha/ovirt-hosted-log/ovirt-hosted-engine-setup-ansible-create_target_vm-201921816045-e1lgu5.log

ovirt-engine logs:

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/mugdha/ovirt-engine-log/

Let's focus on engine.log. The engine sees the bootstrap local VM as an external VM:

```
2019-03-18 15:59:06,530+05 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] VM 'fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43' was discovered as 'Up' on VDS '7a3ed5ac-786e-476c-8120-e815d9934f98'(rhsqa-grafton1.lab.eng.blr.redhat.com)
2019-03-18 15:59:06,550+05 INFO [org.ovirt.engine.core.bll.AddUnmanagedVmsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [136031b] Running command: AddUnmanagedVmsCommand internal: true.
2019-03-18 15:59:06,553+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [136031b] START, DumpXmlsVDSCommand(HostName = rhsqa-grafton1.lab.eng.blr.redhat.com, Params:{hostId='7a3ed5ac-786e-476c-8120-e815d9934f98', vmIds='[fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43]'}), log id: 95d0bca
2019-03-18 15:59:06,558+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [136031b] FINISH, DumpXmlsVDSCommand, return: {fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43=<domain type='kvm' id='1'>
  <name>HostedEngineLocal</name>
  <uuid>fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43</uuid>
  <memory unit='KiB'>16740352</memory>
  <currentMemory unit='KiB'>16740352</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <resource> <partition>/machine</partition> </resource>
  <os> <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type> <boot dev='hd'/> <bootmenu enable='no'/> </os>
  <features> <acpi/> <apic/> </features>
  <cpu mode='custom' match='exact' check='full'> <model fallback='forbid'>Haswell-noTSX-IBRS</model> <feature policy='require' name='vme'/> <feature policy='require' name='f16c'/> <feature policy='require' name='rdrand'/> <feature policy='require' name='hypervisor'/> <feature policy='require' name='arat'/> <feature policy='require' name='xsaveopt'/> <feature policy='require' name='abm'/> </cpu>
  <clock offset='utc'> <timer name='kvmclock' present='yes'/> </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <pm> <suspend-to-mem enabled='no'/> <suspend-to-disk enabled='no'/> </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'> <driver name='qemu' type='qcow2'/> <source file='/var/tmp/localvmDxh30G/images/4276937a-3a05-4a4a-a4c9-771eb4f9a0a9/010d2981-0230-4672-92f7-9c6ed0ef6bb9'/> <backingStore/> <target dev='vda' bus='virtio'/> <alias name='virtio-disk0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> </disk>
    <disk type='file' device='cdrom'> <driver name='qemu' type='raw'/> <source file='/var/tmp/localvmDxh30G/seed.iso'/> <backingStore/> <target dev='hda' bus='ide'/> <readonly/> <alias name='ide0-0-0'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk>
    <controller type='usb' index='0' model='none'/>
    <controller type='pci' index='0' model='pci-root'> <alias name='pci.0'/> </controller>
    <controller type='ide' index='0'> <alias name='ide'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> </controller>
    <controller type='virtio-serial' index='0'> <alias name='virtio-serial0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> </controller>
    <interface type='network'> <mac address='00:45:54:19:48:01'/> <source network='default' bridge='virbr0'/> <target dev='vnet0'/> <model type='virtio'/> <alias name='net0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> </interface>
    <serial type='pty'> <source path='/dev/pts/4'/> <target type='isa-serial' port='0'> <model name='isa-serial'/> </target> <alias name='serial0'/> </serial>
    <console type='pty' tty='/dev/pts/4'> <source path='/dev/pts/4'/> <target type='serial' port='0'/> <alias name='serial0'/> </console>
    <channel type='unix'> <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-1-HostedEngineLocal/org.qemu.guest_agent.0'/> <target type='virtio' name='org.qemu.guest_agent.0' state='connected'/> <alias name='channel0'/> <address type='virtio-serial' controller='0' bus='0' port='1'/> </channel>
    <input type='mouse' bus='ps2'> <alias name='input0'/> </input>
    <input type='keyboard' bus='ps2'> <alias name='input1'/> </input>
    <graphics type='vnc' port='5900' autoport='yes' listen='127.0.0.1'> <listen type='address' address='127.0.0.1'/> </graphics>
    <video> <model type='vga' vram='16384' heads='1' primary='yes'/> <alias name='video0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> </video>
    <memballoon model='none'/>
    <rng model='virtio'> <backend model='random'>/dev/random</backend> <alias name='rng0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> </rng>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'> <label>system_u:system_r:svirt_t:s0:c119,c535</label> <imagelabel>system_u:object_r:svirt_image_t:s0:c119,c535</imagelabel> </seclabel>
  <seclabel type='dynamic' model='dac' relabel='yes'> <label>+107:+107</label> <imagelabel>+107:+107</imagelabel> </seclabel>
</domain> }, log id: 95d0bca
2019-03-18 15:59:06,727+05 INFO [org.ovirt.engine.core.bll.AddVmFromScratchCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [616c5678] Lock Acquired to object 'EngineLock:{exclusiveLocks='[external-HostedEngineLocal=VM_NAME]', sharedLocks=''}'
2019-03-18 15:59:06,815+05 INFO [org.ovirt.engine.core.bll.AddVmFromScratchCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] Running command: AddVmFromScratchCommand internal: true. Entities affected : ID: c7c98b60-4967-11e9-9569-004554194801 Type: ClusterAction group CREATE_VM with role type USER
2019-03-18 15:59:07,000+05 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] EVENT_ID: USER_ADD_VM(34), VM external-HostedEngineLocal was created by SYSTEM.
2019-03-18 15:59:07,001+05 INFO [org.ovirt.engine.core.bll.AddVmFromScratchCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] Lock freed to object 'EngineLock:{exclusiveLocks='[external-HostedEngineLocal=VM_NAME]', sharedLocks=''}'
2019-03-18 15:59:07,012+05 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmDevicesMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] Received a console Device without an address when processing VM fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43 devices, skipping device: {alias=serial0, type=console, device=console, deviceId=ec5ad979-5cd6-443c-97ca-6c005d0709f4}
2019-03-18 15:59:07,013+05 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmDevicesMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] Received a vga Device without an address when processing VM fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43 devices, skipping device: {alias=video0, type=video, device=vga, deviceId=1359c3ee-0ea3-45ce-a212-7e34192c95e3}
2019-03-18 15:59:07,013+05 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmDevicesMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] Received a vnc Device without an address when processing VM fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43 devices, skipping device: {alias=, type=graphics, device=vnc, deviceId=2bedfc6a-e134-4238-a5d1-e29aa8019d03}
2019-03-18 15:59:07,026+05 INFO [org.ovirt.engine.core.bll.AddUnmanagedVmsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] Importing VM 'HostedEngineLocal' as 'external-HostedEngineLocal', as it is running on the on Host, but does not exist in the engine.
2019-03-18 15:59:07,041+05 INFO [org.ovirt.engine.core.vdsbroker.VdsManager] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] VMs initialization finished for Host: 'rhsqa-grafton1.lab.eng.blr.redhat.com:7a3ed5ac-786e-476c-8120-e815d9934f98'
```

That bootstrap VM in our case has uuid fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43.

Then we restart the engine on the target VM on the shared storage, so hosted-engine-setup tries to remove the bootstrap engine VM to avoid any leftover, but it fails with:

```
2019-03-18 16:07:59,115+05 INFO [org.ovirt.engine.core.bll.ShutdownVmCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Running command: ShutdownVmCommand internal: false. Entities affected : ID: fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43 Type: VMAction group SHUT_DOWN_VM with role type USER
2019-03-18 16:07:59,138+05 INFO [org.ovirt.engine.core.bll.ShutdownVmCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Entered (VM 'external-HostedEngineLocal').
2019-03-18 16:07:59,138+05 INFO [org.ovirt.engine.core.bll.ShutdownVmCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Sending shutdown command for VM 'external-HostedEngineLocal'.
```
```
2019-03-18 16:07:59,142+05 INFO [org.ovirt.engine.core.vdsbroker.DestroyVmVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] START, DestroyVmVDSCommand( DestroyVmVDSCommandParameters:{hostId='7a3ed5ac-786e-476c-8120-e815d9934f98', vmId='fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43', secondsToWait='30', gracefully='true', reason='', ignoreNoVm='false'}), log id: 16d89021
2019-03-18 16:07:59,156+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] START, DestroyVDSCommand(HostName = rhsqa-grafton1.lab.eng.blr.redhat.com, DestroyVmVDSCommandParameters:{hostId='7a3ed5ac-786e-476c-8120-e815d9934f98', vmId='fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43', secondsToWait='30', gracefully='true', reason='', ignoreNoVm='false'}), log id: 66deda5d
2019-03-18 16:07:59,161+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Destroy VM couldn't find VM 'fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43'. If after Cancel Migration and VM is UP on source, ignore next error, it's just a clean-up call
2019-03-18 16:07:59,161+05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Failed in 'DestroyVDS' method
2019-03-18 16:07:59,169+05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM rhsqa-grafton1.lab.eng.blr.redhat.com command DestroyVDS failed: Virtual machine does not exist
2019-03-18 16:07:59,169+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand' return value 'StatusOnlyReturn [status=Status [code=1, message=Virtual machine does not exist]]'
2019-03-18 16:07:59,170+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] HostName = rhsqa-grafton1.lab.eng.blr.redhat.com
2019-03-18 16:07:59,170+05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Command 'DestroyVDSCommand(HostName = rhsqa-grafton1.lab.eng.blr.redhat.com, DestroyVmVDSCommandParameters:{hostId='7a3ed5ac-786e-476c-8120-e815d9934f98', vmId='fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43', secondsToWait='30', gracefully='true', reason='', ignoreNoVm='false'})' execution failed: VDSGenericException: VDSErrorException: Failed to DestroyVDS, error = Virtual machine does not exist, code = 1
2019-03-18 16:07:59,170+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] FINISH, DestroyVDSCommand, return: , log id: 66deda5d
2019-03-18 16:07:59,170+05 ERROR [org.ovirt.engine.core.vdsbroker.DestroyVmVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Failed to destroy VM 'fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43' in VDS = '7a3ed5ac-786e-476c-8120-e815d9934f98' , error = 'org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to DestroyVDS, error = Virtual machine does not exist, code = 1'
2019-03-18 16:07:59,170+05 INFO [org.ovirt.engine.core.vdsbroker.DestroyVmVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] FINISH, DestroyVmVDSCommand, return: , log id: 16d89021
2019-03-18 16:07:59,170+05 ERROR [org.ovirt.engine.core.bll.ShutdownVmCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Command 'org.ovirt.engine.core.bll.ShutdownVmCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to DestroyVDS, error = Virtual machine does not exist, code = 1 (Failed with error noVM and code 1)
2019-03-18 16:07:59,199+05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] EVENT_ID: USER_FAILED_SHUTDOWN_VM(74), Failed to initiate shutdown on VM external-HostedEngineLocal (Host: rhsqa-grafton1.lab.eng.blr.redhat.com, User: admin@internal-authz).
2019-03-18 16:07:59,217+05 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-1) [] Operation Failed: [Desktop does not exist]
```

But the root cause is here:

```
2019-03-18 16:08:07,344+05 WARN [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-10) [] Unexpected return value: Status [code=-32603, message=Internal JSON-RPC error: {'reason': "'appsList'"}]
2019-03-18 16:08:07,344+05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-10) [] Failed in 'GetAllVmStatsVDS' method
2019-03-18 16:08:07,344+05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-10) [] Command 'GetAllVmStatsVDSCommand(HostName = rhsqa-grafton1.lab.eng.blr.redhat.com, VdsIdVDSCommandParametersBase:{hostId='7a3ed5ac-786e-476c-8120-e815d9934f98'})' execution failed: VDSGenericException: VDSErrorException: Failed to GetAllVmStatsVDS, error = Internal JSON-RPC error: {'reason': "'appsList'"}, code = -32603
2019-03-18 16:08:07,344+05 INFO [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher] (EE-ManagedThreadFactory-engineScheduled-Thread-10) [] Failed to fetch vms info for host 'rhsqa-grafton1.lab.eng.blr.redhat.com' - skipping VMs monitoring. GetAllVmStatsVDS on rhsqa-grafton1.lab.eng.blr.redhat.com failed with error = Internal JSON-RPC error: {'reason': "'appsList'"}, code = -32603
```

So the VM list for rhsqa-grafton1.lab.eng.blr.redhat.com was not up to date, and ovirt-hosted-engine-setup failed to remove the bootstrap local engine VM because it was missing from that list.

Mugdha, can you please attach /var/log/vdsm/vdsm.log from rhsqa-grafton1.lab.eng.blr.redhat.com (for the relevant time) to better understand why GetAllVmStatsVDS failed there?
VDSM logs:

http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/mugdha/vdsm-log/

The error in the vdsm logs is as follows:

```
2019-03-18 16:07:59,545+0530 ERROR (jsonrpc/4) [jsonrpc.JsonRpcServer] Internal server error (__init__:350)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 345, in _handle_request
    res = method(**params)
  File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 194, in _dynamicMethod
    result = fn(*methodArgs)
  File "<string>", line 2, in getAllVmStats
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 1388, in getAllVmStats
    statsList = self._cif.getAllVmStats()
  File "/usr/lib/python2.7/site-packages/vdsm/clientIF.py", line 567, in getAllVmStats
    return [v.getStats() for v in self.vmContainer.values()]
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 1766, in getStats
    oga_stats = self._getGuestStats()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 1967, in _getGuestStats
    stats = self.guestAgent.getGuestInfo()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/guestagent.py", line 505, in getGuestInfo
    del qga['appsList']
KeyError: 'appsList'
```

Just chiming in to say we are experiencing this same error on an upgraded (4.2.8 to 4.3.2) oVirt instance. The following patch and a subsequent restart of vdsmd fixes my issue.

https://gerrit.ovirt.org/#/c/98499/1/lib/vdsm/virt/guestagent.py

(In reply to Charles J. Gruener from comment #9)
> The following patch and a subsequent restart of vdsmd fixes my issue.
>
> https://gerrit.ovirt.org/#/c/98499/1/lib/vdsm/virt/guestagent.py

Hello,

Just to say that I experienced this error in a non-gluster context (pure oVirt on iSCSI), and the patch above solved it. I don't know how and when this patch will be embedded in the next oVirt release?

(In reply to Nicolas Ecarnot from comment #10)
> (In reply to Charles J. Gruener from comment #9)
> > The following patch and a subsequent restart of vdsmd fixes my issue.
> >
> > https://gerrit.ovirt.org/#/c/98499/1/lib/vdsm/virt/guestagent.py
>
> Hello,
>
> Just to say that I experienced this error in a non-gluster context (pure
> oVirt on iSCSI), and the patch above solved it.
> I don't know how and when this patch will be embedded in the next oVirt
> release?

Yes, this is addressed as part of Bug 1676893 in oVirt 4.3.3.

Tested with RHVH 4.3 (rhvh-4.3.0.5-0.20190328.0+1):
1. HE deployment was successful and there are no issues seen with the gluster storage domain.

Thanks Mugdha for verifying this scenario.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:1121
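For illustration, the failure mode in the traceback above (an unconditional `del qga['appsList']` in guestagent.py) and the defensive pattern that fixes it can be sketched as follows. This is a simplified stand-in, not the actual VDSM source or the exact gerrit 98499 diff:

```python
# Simplified stand-in for the getGuestInfo() failure in
# vdsm/virt/guestagent.py -- illustrates the bug class and the fix pattern,
# not the real VDSM code.

def get_guest_info_buggy(qga_stats):
    info = dict(qga_stats)
    # Unconditional delete: raises KeyError when the qemu guest agent
    # never reported an application list (e.g. shortly after VM start).
    del info['appsList']
    return info

def get_guest_info_fixed(qga_stats):
    info = dict(qga_stats)
    # pop() with a default tolerates a missing key.
    info.pop('appsList', None)
    return info

stats = {'memUsage': 512}            # no 'appsList' key reported
print(get_guest_info_fixed(stats))   # works: {'memUsage': 512}
try:
    get_guest_info_buggy(stats)
except KeyError as e:
    print('KeyError:', e)            # the crash seen in vdsm.log
```

Because the exception escaped getAllVmStats, a single VM without guest-agent application data broke stats reporting for the whole host, which is why the engine's VM list went stale.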