Bug 1870108
| Summary: | VM devices may get temporarily unplugged on VM boot | | |
|---|---|---|---|
| Product: | [oVirt] vdsm | Reporter: | Arik <ahadas> |
| Component: | Core | Assignee: | Milan Zamazal <mzamazal> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Qin Yuan <qiyuan> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | --- | CC: | bugs, dholler, lrotenbe |
| Target Milestone: | ovirt-4.4.3 | Flags: | ahadas: ovirt-4.4?, ahadas: planning_ack?, ahadas: devel_ack+, ahadas: testing_ack? |
| Target Release: | 4.40.28 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | vdsm-4.40.28 | Doc Type: | Bug Fix |
| Doc Text: | When booting a newly created VM, Engine could log errors about some of the VM devices being unplugged and temporarily show them as unplugged in the Web UI, even though they were actually plugged. This has been fixed and should no longer happen. | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-11-11 06:41:38 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Dominik, could you please attach VDSM and libvirt debug logs?

More initial thoughts by Milan:

I can think about two possibilities:

- We report the XML obtained from Engine before we update it from libvirt. I don't think this can happen unless Engine can call dumpxmls on the VM before it is reported as fully started.
- Timing issue in libvirt or so.

Created attachment 1711874 [details]
relevant vdsm, libvirt and engine logs

(In reply to Arik from comment #1)
> Dominik, could you please attach VDSM and libvirt debug logs?

Please let me know soon if a logfile is missing.

Dominik, I can't see vdsm.log in the attachment.

Created attachment 1711883 [details]
vdsm.log

(In reply to Milan Zamazal from comment #4)
> Dominik, I can't see vdsm.log in the attachment.

thanks for checking quickly!

Thank you, Dominik, for the logs.

Engine calls getAllVmStats, followed by dumpxmls, perhaps because it sees an "updated" hash as Arik mentions in Comment 0. If it fits into the window between starting and finishing the VM creation in Vdsm then it gets the XML not yet updated from libvirt. If I insert sleep into Vdsm VM initialization, I obtain an XML without the address too.

Now the question is what's the right way to fix it? I don't think attempting to compute the initial hash on the Engine side would be a good idea. Ignoring the missing addresses would help with that particular problem, but not processing XML without stuff added by libvirt at all would be better. Is there a way to achieve that without modifying both Engine and Vdsm?

Yeah, it might be (unless I'm missing something) that if in that period of time (until we get an updated xml from libvirt) VDSM won't report the 'hash' in the stats, the engine wouldn't trigger dumpxmls calls

Good, then it could be an easy fix, I'll check if it works.

Verified with:

- vdsm-4.40.28-1.el8ev.x86_64
- ovirt-engine-4.4.3.1-0.7.el8ev.noarch

Steps:

1. Create a new VM and start it
2. Check engine log to see if there is no dumpxml without addresses and no device unplugged message.

Results:

1. During VM starting process, there is no dumpxml without addresses, no device unplugged message.

This bugzilla is included in oVirt 4.4.3 release, published on November 10th 2020.

Since the problem described in this bug report should be resolved in oVirt 4.4.3 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

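
The fix idea discussed in the comments (leave the 'hash' out of the stats until the XML has been refreshed from libvirt, so that Engine has nothing that would trigger a dumpxmls call) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Vdsm change: the class `VmSketch`, its methods, and the choice of SHA-256 are hypothetical; only the 'hash' stats key comes from the discussion above.

```python
import hashlib


class VmSketch:
    """Hypothetical stand-in for a Vdsm VM object (not the real class)."""

    def __init__(self, engine_xml):
        # Initially we only hold the XML that Engine sent on create.
        self._domain_xml = engine_xml
        self._xml_from_libvirt = False

    def update_from_libvirt(self, libvirt_xml):
        # Called once libvirt has returned the updated XML (with addresses).
        self._domain_xml = libvirt_xml
        self._xml_from_libvirt = True

    def get_stats(self):
        stats = {}
        # Withhold the hash until the XML really comes from libvirt, so a
        # monitoring cycle that lands in this window sees no hash change.
        if self._xml_from_libvirt:
            digest = hashlib.sha256(self._domain_xml.encode("utf-8"))
            stats["hash"] = digest.hexdigest()
        return stats
```

With this shape, a stats call made between VM creation and the first libvirt update returns no 'hash' at all, which matches the behaviour proposed in the thread.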
We've seen that VDSM reports first dumpxml without addresses, e.g., for a virtio-scsi controller:

(1) On create we send:

    <controller type="scsi" model="virtio-scsi" index="0">
      <driver iothread="1"/>
      <alias name="ua-b500575d-f8d1-4b0e-a011-f9215f4802f1"/>
    </controller>

(2) Then we get xml report (dumpxml) with:

    <controller index="0" model="virtio-scsi" type="scsi">
      <driver iothread="1" />
      <alias name="ua-b500575d-f8d1-4b0e-a011-f9215f4802f1" />
    </controller>

(3) Then we get another xml report with:

    <controller type='scsi' index='0' model='virtio-scsi'>
      <driver iothread='1'/>
      <alias name='ua-b500575d-f8d1-4b0e-a011-f9215f4802f1'/>
      <address type='pci' domain='0x0000' bus='0x17' slot='0x00' function='0x0'/>
    </controller>

And thus, after (2) we see in engine.log:

    ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmDevicesMonitoring] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-9) [] VM 'c5df2234-ae71-4b71-83bd-48210ad0f0ca' managed non pluggable device was removed unexpectedly from libvirt: 'VmDevice:{id='VmDeviceId:{deviceId='b500575d-f8d1-4b0e-a011-f9215f4802f1', vmId='c5df2234-ae71-4b71-83bd-48210ad0f0ca'}', device='virtio-scsi', type='CONTROLLER', specParams='[ioThreadId=1]', address='', managed='true', plugged='false', readOnly='false', deviceAlias='', customProperties='[]', snapshotId='null', logicalName='null', hostDevice='null'}'

Some initial thoughts:

1. The engine doesn't hold the hash of the initial xml it sends, so the first time VDSM reports 'stats' with a certain hash, the engine would query the dumpxml.
2. It may be a timing issue as it doesn't happen in all environments (e.g., VDSM sets the xml it gets from the engine and since it takes some time to get the updated xml from libvirt, that's what VDSM reports back and the engine doesn't realize that it's the same xml it sent to VDSM)
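
To make the difference between reports (2) and (3) concrete, here is a small sketch that lists aliased devices lacking an `<address>` element. It is only a diagnostic illustration, not Engine's or Vdsm's actual logic: the function name `devices_without_address` and the `<devices>` wrapper around the controller snippets are assumptions, and it deliberately ignores the question of whether every device type is expected to receive an address.

```python
import xml.etree.ElementTree as ET

ALIAS = "ua-b500575d-f8d1-4b0e-a011-f9215f4802f1"

# Report (2) from the description: no <address> element yet.
REPORT_2 = f"""
<devices>
  <controller index="0" model="virtio-scsi" type="scsi">
    <driver iothread="1" />
    <alias name="{ALIAS}" />
  </controller>
</devices>
"""

# Report (3): libvirt has assigned a PCI address.
REPORT_3 = f"""
<devices>
  <controller type='scsi' index='0' model='virtio-scsi'>
    <driver iothread='1'/>
    <alias name='{ALIAS}'/>
    <address type='pci' domain='0x0000' bus='0x17' slot='0x00' function='0x0'/>
  </controller>
</devices>
"""


def devices_without_address(devices_xml):
    """Return alias names of devices that have an alias but no address."""
    root = ET.fromstring(devices_xml)
    missing = []
    for device in root:
        alias = device.find("alias")
        if alias is not None and device.find("address") is None:
            missing.append(alias.get("name"))
    return missing
```

Calling `devices_without_address(REPORT_2)` returns the controller's alias, while `devices_without_address(REPORT_3)` returns an empty list, which mirrors the window in which the device is treated as unplugged.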