Bug 1690301
Summary: | VDSM fails to query the guest agent with KeyError: 'appsList' | ||||||
---|---|---|---|---|---|---|---|
Product: | [oVirt] vdsm | Reporter: | Mugdha Soni <musoni> | ||||
Component: | General | Assignee: | Dan Kenigsberg <danken> | ||||
Status: | CLOSED DUPLICATE | QA Contact: | Lukas Svaty <lsvaty> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | unspecified | ||||||
Version: | 4.30.10 | CC: | bugs, pmatyas, rbarry, rhs-bugs, sabose, sankarshan, sasundar, stirabos, tgolembi | ||||
Target Milestone: | --- | ||||||
Target Release: | --- | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | If docs needed, set a value | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | 1690294 | Environment: | |||||
Last Closed: | 2019-03-20 00:51:29 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Bug Depends On: | |||||||
Bug Blocks: | 1690294 | ||||||
Attachments: |
|
Description
Mugdha Soni
2019-03-19 09:06:06 UTC
ovirt-engine logs " http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/mugdha/ovirt-engine-log/ " Let's focus on engine.log. The engine see the bootstrap local VM as an external VM: 2019-03-18 15:59:06,530+05 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] VM 'fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43' was discovered as 'Up' on VDS '7a3ed5ac-786e-476c-8120-e815d9934f98'(rhsqa-grafton1.lab.eng.blr.redhat.com) 2019-03-18 15:59:06,550+05 INFO [org.ovirt.engine.core.bll.AddUnmanagedVmsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [136031b] Running command: AddUnmanagedVmsCommand internal: true. 2019-03-18 15:59:06,553+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [136031b] START, DumpXmlsVDSCommand(HostName = rhsqa-grafton1.lab.eng.blr.redhat.com, Params:{hostId='7a3ed5ac-786e-476c-8120-e815d9934f98', vmIds='[fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43]'}), log id: 95d0bca 2019-03-18 15:59:06,558+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [136031b] FINISH, DumpXmlsVDSCommand, return: {fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43=<domain type='kvm' id='1'> <name>HostedEngineLocal</name> <uuid>fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43</uuid> <memory unit='KiB'>16740352</memory> <currentMemory unit='KiB'>16740352</currentMemory> <vcpu placement='static'>4</vcpu> <resource> <partition>/machine</partition> </resource> <os> <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type> <boot dev='hd'/> <bootmenu enable='no'/> </os> <features> <acpi/> <apic/> </features> <cpu mode='custom' match='exact' check='full'> <model fallback='forbid'>Haswell-noTSX-IBRS</model> <feature policy='require' name='vme'/> <feature policy='require' name='f16c'/> <feature policy='require' name='rdrand'/> <feature policy='require' name='hypervisor'/> <feature policy='require' name='arat'/> <feature policy='require' name='xsaveopt'/> <feature policy='require' name='abm'/> </cpu> <clock offset='utc'> <timer name='kvmclock' present='yes'/> </clock> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> <on_crash>destroy</on_crash> <pm> <suspend-to-mem enabled='no'/> <suspend-to-disk enabled='no'/> </pm> <devices> <emulator>/usr/libexec/qemu-kvm</emulator> <disk type='file' device='disk'> <driver name='qemu' type='qcow2'/> <source file='/var/tmp/localvmDxh30G/images/4276937a-3a05-4a4a-a4c9-771eb4f9a0a9/010d2981-0230-4672-92f7-9c6ed0ef6bb9'/> <backingStore/> <target dev='vda' bus='virtio'/> <alias name='virtio-disk0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> </disk> <disk type='file' device='cdrom'> <driver name='qemu' type='raw'/> <source file='/var/tmp/localvmDxh30G/seed.iso'/> <backingStore/> <target dev='hda' bus='ide'/> <readonly/> <alias name='ide0-0-0'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk> <controller type='usb' index='0' model='none'/> <controller type='pci' index='0' model='pci-root'> <alias name='pci.0'/> </controller> <controller type='ide' index='0'> <alias name='ide'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> </controller> <controller type='virtio-serial' index='0'> <alias name='virtio-serial0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> </controller> <interface type='network'> <mac address='00:45:54:19:48:01'/> <source network='default' bridge='virbr0'/> <target dev='vnet0'/> <model type='virtio'/> <alias name='net0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> </interface> <serial type='pty'> <source path='/dev/pts/4'/> <target type='isa-serial' port='0'> <model name='isa-serial'/> </target> <alias name='serial0'/> </serial> <console type='pty' tty='/dev/pts/4'> <source path='/dev/pts/4'/> <target type='serial' port='0'/> <alias name='serial0'/> </console> <channel type='unix'> <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-1-HostedEngineLocal/org.qemu.guest_agent.0'/> <target type='virtio' name='org.qemu.guest_agent.0' state='connected'/> <alias name='channel0'/> <address type='virtio-serial' controller='0' bus='0' port='1'/> </channel> <input type='mouse' bus='ps2'> <alias name='input0'/> </input> <input type='keyboard' bus='ps2'> <alias name='input1'/> </input> <graphics type='vnc' port='5900' autoport='yes' listen='127.0.0.1'> <listen type='address' address='127.0.0.1'/> </graphics> <video> <model type='vga' vram='16384' heads='1' primary='yes'/> <alias name='video0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> </video> <memballoon model='none'/> <rng model='virtio'> <backend model='random'>/dev/random</backend> <alias name='rng0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> </rng> </devices> <seclabel type='dynamic' model='selinux' relabel='yes'> <label>system_u:system_r:svirt_t:s0:c119,c535</label> <imagelabel>system_u:object_r:svirt_image_t:s0:c119,c535</imagelabel> </seclabel> <seclabel type='dynamic' model='dac' relabel='yes'> <label>+107:+107</label> <imagelabel>+107:+107</imagelabel> </seclabel> </domain> }, log id: 95d0bca 2019-03-18 15:59:06,727+05 INFO [org.ovirt.engine.core.bll.AddVmFromScratchCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [616c5678] Lock Acquired to object 'EngineLock:{exclusiveLocks='[external-HostedEngineLocal=VM_NAME]', sharedLocks=''}' 2019-03-18 15:59:06,815+05 INFO [org.ovirt.engine.core.bll.AddVmFromScratchCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] Running command: AddVmFromScratchCommand internal: true. Entities affected : ID: c7c98b60-4967-11e9-9569-004554194801 Type: ClusterAction group CREATE_VM with role type USER 2019-03-18 15:59:07,000+05 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] EVENT_ID: USER_ADD_VM(34), VM external-HostedEngineLocal was created by SYSTEM. 2019-03-18 15:59:07,001+05 INFO [org.ovirt.engine.core.bll.AddVmFromScratchCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] Lock freed to object 'EngineLock:{exclusiveLocks='[external-HostedEngineLocal=VM_NAME]', sharedLocks=''}' 2019-03-18 15:59:07,012+05 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmDevicesMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] Received a console Device without an address when processing VM fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43 devices, skipping device: {alias=serial0, type=console, device=console, deviceId=ec5ad979-5cd6-443c-97ca-6c005d0709f4} 2019-03-18 15:59:07,013+05 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmDevicesMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] Received a vga Device without an address when processing VM fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43 devices, skipping device: {alias=video0, type=video, device=vga, deviceId=1359c3ee-0ea3-45ce-a212-7e34192c95e3} 2019-03-18 15:59:07,013+05 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmDevicesMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] Received a vnc Device without an address when processing VM fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43 devices, skipping device: {alias=, type=graphics, device=vnc, deviceId=2bedfc6a-e134-4238-a5d1-e29aa8019d03} 2019-03-18 15:59:07,026+05 INFO [org.ovirt.engine.core.bll.AddUnmanagedVmsCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] Importing VM 'HostedEngineLocal' as 'external-HostedEngineLocal', as it is running on the on Host, but does not exist in the engine. 2019-03-18 15:59:07,041+05 INFO [org.ovirt.engine.core.vdsbroker.VdsManager] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] VMs initialization finished for Host: 'rhsqa-grafton1.lab.eng.blr.redhat.com:7a3ed5ac-786e-476c-8120-e815d9934f98' That bootstrap VM in our case has uuid: fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43 Then we restart the engine on the target VM on the shared storage. So hosted-engine-setup tries to remove the bootstrap engine VM to avoid any leftover but it fails with: 2019-03-18 16:07:59,115+05 INFO [org.ovirt.engine.core.bll.ShutdownVmCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Running command: ShutdownVmCommand internal: false. Entities affected : ID: fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43 Type: VMAction group SHUT_DOWN_VM with role type USER 2019-03-18 16:07:59,138+05 INFO [org.ovirt.engine.core.bll.ShutdownVmCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Entered (VM 'external-HostedEngineLocal'). 2019-03-18 16:07:59,138+05 INFO [org.ovirt.engine.core.bll.ShutdownVmCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Sending shutdown command for VM 'external-HostedEngineLocal'. 2019-03-18 16:07:59,142+05 INFO [org.ovirt.engine.core.vdsbroker.DestroyVmVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] START, DestroyVmVDSCommand( DestroyVmVDSCommandParameters:{hostId='7a3ed5ac-786e-476c-8120-e815d9934f98', vmId='fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43', secondsToWait='30', gracefully='true', reason='', ignoreNoVm='false'}), log id: 16d89021 2019-03-18 16:07:59,156+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] START, DestroyVDSCommand(HostName = rhsqa-grafton1.lab.eng.blr.redhat.com, DestroyVmVDSCommandParameters:{hostId='7a3ed5ac-786e-476c-8120-e815d9934f98', vmId='fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43', secondsToWait='30', gracefully='true', reason='', ignoreNoVm='false'}), log id: 66deda5d 2019-03-18 16:07:59,161+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Destroy VM couldn't find VM 'fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43'. If after Cancel Migration and VM is UP on source, ignore next error, it's just a clean-up call 2019-03-18 16:07:59,161+05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Failed in 'DestroyVDS' method 2019-03-18 16:07:59,169+05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM rhsqa-grafton1.lab.eng.blr.redhat.com command DestroyVDS failed: Virtual machine does not exist 2019-03-18 16:07:59,169+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Command 'org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand' return value 'StatusOnlyReturn [status=Status [code=1, message=Virtual machine does not exist]]' 2019-03-18 16:07:59,170+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] HostName = rhsqa-grafton1.lab.eng.blr.redhat.com 2019-03-18 16:07:59,170+05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Command 'DestroyVDSCommand(HostName = rhsqa-grafton1.lab.eng.blr.redhat.com, DestroyVmVDSCommandParameters:{hostId='7a3ed5ac-786e-476c-8120-e815d9934f98', vmId='fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43', secondsToWait='30', gracefully='true', reason='', ignoreNoVm='false'})' execution failed: VDSGenericException: VDSErrorException: Failed to DestroyVDS, error = Virtual machine does not exist, code = 1 2019-03-18 16:07:59,170+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] FINISH, DestroyVDSCommand, return: , log id: 66deda5d 2019-03-18 16:07:59,170+05 ERROR [org.ovirt.engine.core.vdsbroker.DestroyVmVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Failed to destroy VM 'fdafa94f-07bc-4bfd-ae02-f44ddb2f7a43' in VDS = '7a3ed5ac-786e-476c-8120-e815d9934f98' , error = 'org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to DestroyVDS, error = Virtual machine does not exist, code = 1' 2019-03-18 16:07:59,170+05 INFO [org.ovirt.engine.core.vdsbroker.DestroyVmVDSCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] FINISH, DestroyVmVDSCommand, return: , log id: 16d89021 2019-03-18 16:07:59,170+05 ERROR [org.ovirt.engine.core.bll.ShutdownVmCommand] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] Command 'org.ovirt.engine.core.bll.ShutdownVmCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to DestroyVDS, error = Virtual machine does not exist, code = 1 (Failed with error noVM and code 1) 2019-03-18 16:07:59,199+05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-1) [eeb95d8c-8bed-428a-9439-e4241f489a57] EVENT_ID: USER_FAILED_SHUTDOWN_VM(74), Failed to initiate shutdown on VM external-HostedEngineLocal (Host: rhsqa-grafton1.lab.eng.blr.redhat.com, User: admin@internal-authz). 2019-03-18 16:07:59,217+05 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-1) [] Operation Failed: [Desktop does not exist] but the root cause is here: 2019-03-18 16:08:07,344+05 WARN [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-10) [] Unexpected return value: Status [code=-32603, message=Internal JSON-RPC error: {'reason': "'appsList'"}] 2019-03-18 16:08:07,344+05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-10) [] Failed in 'GetAllVmStatsVDS' method 2019-03-18 16:08:07,344+05 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetAllVmStatsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-10) [] Command 'GetAllVmStatsVDSCommand(HostName = rhsqa-grafton1.lab.eng.blr.redhat.com, VdsIdVDSCommandParametersBase:{hostId='7a3ed5ac-786e-476c-8120-e815d9934f98'})' execution failed: VDSGenericException: VDSErrorException: Failed to GetAllVmStatsVDS, error = Internal JSON-RPC error: {'reason': "'appsList'"}, code = -32603 2019-03-18 16:08:07,344+05 INFO [org.ovirt.engine.core.vdsbroker.monitoring.PollVmStatsRefresher] (EE-ManagedThreadFactory-engineScheduled-Thread-10) [] Failed to fetch vms info for host 'rhsqa-grafton1.lab.eng.blr.redhat.com' - skipping VMs monitoring. GetAllVmStatsVDS on rhsqa-grafton1.lab.eng.blr.redhat.com failed with error = Internal JSON-RPC error: {'reason': "'appsList'"}, code = -32603 and so the VM list for rhsqa-grafton1.lab.eng.blr.redhat.com was not up to date and so ovirt-hosted-engine-setup failed to remove the bootstrap local engine VM since the VM list was missing. Mugdha, can you please attach /var/log/vdsm/vdsm.log from rhsqa-grafton1.lab.eng.blr.redhat.com (for the relevant time) to better understand why GetAllVmStatsVDS there failed? It seems an issue of vdsm querying ovirt-guest-agent: 2019-03-18 16:07:48,165+0530 ERROR (jsonrpc/3) [api] FINISH getStats error='appsList' (api:134) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 124, in method ret = func(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 361, in getStats stats = vm.getStats().copy() File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 1766, in getStats oga_stats = self._getGuestStats() File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 1967, in _getGuestStats stats = self.guestAgent.getGuestInfo() File "/usr/lib/python2.7/site-packages/vdsm/virt/guestagent.py", line 505, in getGuestInfo del qga['appsList'] KeyError: 'appsList' 2019-03-18 16:07:48,165+0530 INFO (jsonrpc/3) [api.virt] FINISH getStats return={'status': {'message': 'General Exception: ("\'appsList\'",)', 'code': 100}} from=::1,48218, vmId=44c126ae-e9d8-409a-85ae-5288060659dd (api:54) Please notice that 44c126ae-e9d8-409a-85ae-5288060659dd is the target VM where the engine is really running and not the bootstrap one which is down. 2019-03-18 16:11:07,595+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] START, DumpXmlsVDSCommand(HostName = rhsqa-grafton1.lab.eng.blr.redhat.com, Params:{hostId='7a3ed5ac-786e-476c-8120-e815d9934f98', vmIds='[44c126ae-e9d8-409a-85ae-5288060659dd]'}), log id: 54a08ec9 2019-03-18 16:11:07,600+05 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-77) [] FINISH, DumpXmlsVDSCommand, return: {44c126ae-e9d8-409a-85ae-5288060659dd=<domain type='kvm' id='2'> <name>HostedEngine</name> <uuid>44c126ae-e9d8-409a-85ae-5288060659dd</uuid> <metadata xmlns:ns0="http://ovirt.org/vm/tune/1.0" xmlns:ovirt-vm="http://ovirt.org/vm/1.0"> Moving to VDSM for further investigation. Created attachment 1545557 [details]
vdsm.log
Maybe https://gerrit.ovirt.org/#/c/98499/ fixes this as well. Tomáš, can you please take a look? Yes, that patch fixes exactly this issue. *** This bug has been marked as a duplicate of bug 1676893 *** *** Bug 1690379 has been marked as a duplicate of this bug. *** |