Bug 1496395
Summary: | [Memory hot unplug] After commit snapshot with memory hot unplug failed since device not found | |
---|---|---|---
Product: | [oVirt] ovirt-engine | Reporter: | Israel Pinto <ipinto>
Component: | BLL.Virt | Assignee: | Milan Zamazal <mzamazal>
Status: | CLOSED CURRENTRELEASE | QA Contact: | Pedut <pchocron>
Severity: | high | Docs Contact: | Rolfe Dlugy-Hegwer <rdlugyhe>
Priority: | medium | |
Version: | 4.2.0 | CC: | ahadas, bugs, ipinto, mavital, mtessun, pchocron, rdlugyhe, tjelinek
Target Milestone: | ovirt-4.3.0 | Flags: | rule-engine: ovirt-4.3+, mtessun: planning_ack+, rule-engine: devel_ack+, rule-engine: testing_ack+
Target Release: | 4.3.0 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | ovirt-engine-4.3.0_alpha | Doc Type: | Bug Fix
Doc Text: | Previously, memory hot unplug did not work in virtual machines started from snapshots. This has been fixed in the current release: Memory hot unplug works in virtual machines started from snapshots. | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2019-02-13 07:44:54 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1645022 | |
Bug Blocks: | | |
Attachments: |
|
Created attachment 1331353 [details]
screenshot
Created attachment 1331356 [details]
vdsm
Created attachment 1331359 [details]
dumpxml
The XML dump was taken between which steps in your steps to reproduce?

No, it was taken at the end.

So, the reason it happens is this:
- When the snapshot with memory is taken, libvirt stores its own OVF with the VM configuration.
- Engine stores its own in the DB.
- Then, when the VM is started again, Engine builds a libvirt XML from the OVF stored in the DB and sends it to VDSM.
- This libvirt XML does not contain the memory devices, because of this code in LibvirtVmXmlBuilder.writeDevices():

    ....
    case WATCHDOG:
        writeWatchdog(device);
        break;
    case MEMORY:
        // memory devices are only used for hot-plug
        break;
    case VIDEO:
        writeVideo(device);
        break;
    ....

  So the memory devices are skipped.
- VDSM then builds the VM representation from this XML (without the memory devices).
- libvirt builds the VM from its own OVF (with the memory devices).

So, it looks like the only missing part here is that writeDevices() also needs to write the memory devices. @Arik: any thoughts? Do you think it should be done, or are there any risks in doing it?

Memory devices should be restored correctly on the libvirt side. It seems it should be enough if VDSM reports them correctly; they will then reappear in Engine, and you can unplug them. Milan, please check that the VDSM side uses the libvirt XML and not anything in vmconf.

It seems the problem is that _srcDomXML doesn't contain device aliases. Memory hot unplug identifies the DIMM by its alias, which is missing when devices are initialized, so the device can't be found. After Vdsm is restarted and initializes devices from libvirt rather than from _srcDomXML, hot unplug works. Perhaps a follow-up device update from libvirt is missing when restoring from a snapshot. I'm not sure, though, whether the assigned aliases are the same between different runs (before and after the snapshot).

The problem still exists: the memory hotplug XML doesn't contain any alias, the alias is added to the domain XML by libvirt, and _srcDomXML doesn't contain the alias.
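The lookup failure described above can be illustrated with a minimal Python sketch. This is not Vdsm's actual code: the XML fragments and the function name `find_memory_device` are hypothetical, and only the alias `dimm1` and the error text are taken from the engine log in this report. The sketch assumes the device is located purely by its `<alias name=...>` element, as the comments suggest.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed domain XML as libvirt would report it for a running VM:
# the DIMM carries an alias assigned by libvirt.
LIVE_XML = """
<domain type='kvm'>
  <devices>
    <memory model='dimm'>
      <target><size unit='KiB'>917504</size><node>0</node></target>
      <alias name='dimm1'/>
    </memory>
  </devices>
</domain>
"""

# Hypothetical XML standing in for _srcDomXML: same device, no alias element.
SNAPSHOT_XML = """
<domain type='kvm'>
  <devices>
    <memory model='dimm'>
      <target><size unit='KiB'>917504</size><node>0</node></target>
    </memory>
  </devices>
</domain>
"""


def find_memory_device(dom_xml, alias):
    """Return the <memory> element whose alias matches, else raise LookupError."""
    root = ET.fromstring(dom_xml)
    for dev in root.findall("./devices/memory"):
        alias_elem = dev.find("alias")
        if alias_elem is not None and alias_elem.get("name") == alias:
            return dev
    raise LookupError(
        "Device instance for device identified by alias %s "
        "and type memory not found" % alias
    )


# Lookup against the live XML succeeds.
print(find_memory_device(LIVE_XML, "dimm1").tag)

# Lookup against the alias-less XML fails, mirroring the reported error.
try:
    find_memory_device(SNAPSHOT_XML, "dimm1")
except LookupError as exc:
    print(exc)
```

With the alias missing, the device is present in the XML but cannot be matched, which is consistent with hot unplug working again once devices are re-read from libvirt.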
The cause of the problem is that memory devices don't have user aliases. Aliases assigned by libvirt are removed from the migratable domain XML returned by libvirt. That means memory devices can no longer be identified in the snapshot's _srcDomXML. Live migration doesn't suffer from this problem.

Storage and network devices have user aliases, and lease devices don't have any aliases at all, so all the hot-unpluggable devices other than memory should be fine. There are other devices that don't have user aliases and lose their aliases in snapshots. I'm not sure whether that is a problem or not.

As for a remedy, there are several options (in order of preference):
1. Find out why live migrations are OK and check whether there is some avoidable difference in device handling in Vdsm between file and host migrations.
2. Provide user aliases for memory devices from Engine.
3. Provide user aliases for memory devices in Vdsm.
4. Identify memory devices by something other than aliases.

Solution 2 has been implemented and merged.

After hot unplugging memory, the VM memory remains the same (the memory devices listed under the VM Devices tab remain the same).

Created attachment 1499339 [details]
updated logs
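The remedy adopted in this thread (user aliases for memory devices, provided by Engine) can be sketched in Python as follows. This is an illustration under stated assumptions, not the merged Engine change: libvirt requires user-defined aliases to start with `ua-`, but deriving the alias from the device ID, the helper names, and the `strip_generated_aliases` function (which only simulates the alias removal described in the comments above) are all hypothetical. The device ID is the one shown in the engine log of this report.

```python
import xml.etree.ElementTree as ET

USER_ALIAS_PREFIX = "ua-"  # prefix libvirt expects for user-defined aliases


def memory_device_xml(device_id, size_kib, node=0):
    """Build a <memory> element with a user alias derived from device_id.

    Deriving the alias from the device ID is an assumption for illustration.
    """
    mem = ET.Element("memory", model="dimm")
    target = ET.SubElement(mem, "target")
    ET.SubElement(target, "size", unit="KiB").text = str(size_kib)
    ET.SubElement(target, "node").text = str(node)
    ET.SubElement(mem, "alias", name=USER_ALIAS_PREFIX + device_id)
    return mem


def strip_generated_aliases(devices):
    """Simulate the behaviour described in this thread: aliases that are not
    user aliases are dropped, user aliases are kept."""
    for dev in devices:
        alias = dev.find("alias")
        if alias is not None and not alias.get("name", "").startswith(USER_ALIAS_PREFIX):
            dev.remove(alias)
    return devices


# A DIMM with a libvirt-generated alias versus one with a user alias.
generated = ET.fromstring("<memory model='dimm'><alias name='dimm1'/></memory>")
user = memory_device_xml("b1a5b471-eaa9-44ce-9e94-f534eda40815", 917504)

strip_generated_aliases([generated, user])

print(generated.find("alias"))         # None: the generated alias is gone
print(user.find("alias").get("name"))  # ua-b1a5b471-eaa9-44ce-9e94-f534eda40815
```

Because the user alias is supplied up front rather than assigned by libvirt at run time, the device stays identifiable in the XML kept with the snapshot.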
Pedut, I can't see any hot unplug action in the provided logs. So either the logs are wrong, or it's a completely different problem and hot unplug wasn't initiated at all. Did you try to remove all the hot-plugged memory devices by pressing the hot unplug buttons next to them?

Milan, you're right, I uploaded the wrong logs.

Pedut, could you please upload the right logs?

Created attachment 1499940 [details]
updated logs
The new failure is different from the original one. Now the following error is reported: unplug of device was rejected by the guest. It may be that the guest OS is not properly set up for memory hot unplug. Pedut, does memory hot unplug work for the same VM without making a snapshot? Even after making the guest OS busy with I/O, such as with the following command?

    find / -xdev -type f -exec cat {} \; >/dev/null

And what version of guest OS do you run?

It doesn't work even without making a snapshot. I opened a bug related to this. The version of the guest OS is Red Hat Enterprise Linux Server 7.6 (Maipo).

According to the information provided by Pedut, the "device not found" problem is no longer present, and the problem that memory hot unplug doesn't work in her testing is unrelated to snapshots, so I'm moving the bug back to modified.

Verified on 4.2.7.3-0.0.master.20181015151121.gitd6e9af9.el7 according to the described steps.

This bugzilla is included in the oVirt 4.3.0 release, published on February 4th 2019. Since the problem described in this bug report should be resolved in the oVirt 4.3.0 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.
Created attachment 1331352 [details]
engine_log

Description of problem:
Failed to hot unplug memory device on VM with commit snapshot.

Version-Release number of selected component (if applicable):
Software version: 4.2.0-0.0.master.20170917124606.gita804ef7.el7.centos

Steps to Reproduce:
1. Create VM with OS and run it
2. Hotplug memory to VM
3. Create snapshot with memory
4. Stop VM
5. Preview snapshot
6. Commit snapshot
7. Run VM
8. Check that the memory device exists under the VM Devices tab
9. Hot unplug memory

Actual results:
General exception, failed to hot unplug memory

Expected results:
Hot unplug of memory succeeds

Additional info:
engine log:
2017-09-27 12:05:54,612+03 ERROR [org.ovirt.engine.core.vdsbroker.HotUnplugMemoryVDSCommand] (default task-5) [5e2b4378-d242-4b02-b926-f616b843be54] Failed in 'HotUnplugMemoryVDS' method
2017-09-27 12:05:54,625+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-5) [5e2b4378-d242-4b02-b926-f616b843be54] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM host_mixed_1 command HotUnplugMemoryVDS failed: General Exception: ('Device instance for device identified by alias dimm1 and type memory not found',)
2017-09-27 12:05:54,626+03 ERROR [org.ovirt.engine.core.vdsbroker.HotUnplugMemoryVDSCommand] (default task-5) [5e2b4378-d242-4b02-b926-f616b843be54] Command 'HotUnplugMemoryVDSCommand(HostName = host_mixed_1, Params:{hostId='263620b9-1567-4e16-984d-2acc45487c50'})' execution failed: VDSGenericException: VDSErrorException: Failed to HotUnplugMemoryVDS, error = General Exception: ('Device instance for device identified by alias dimm1 and type memory not found',), code = 100
2017-09-27 12:05:54,626+03 INFO [org.ovirt.engine.core.vdsbroker.HotUnplugMemoryVDSCommand] (default task-5) [5e2b4378-d242-4b02-b926-f616b843be54] FINISH, HotUnplugMemoryVDSCommand, log id: 2c1b8769
2017-09-27 12:05:54,644+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-5) [5e2b4378-d242-4b02-b926-f616b843be54] EVENT_ID: MEMORY_HOT_UNPLUG_FAILED(2,047), Failed to hot unplug memory device (b1a5b471-eaa9-44ce-9e94-f534eda40815) of size 896 out of VM 'Test_memory_hot_plug_unplug': General Exception: ('Device instance for device identified by alias dimm1 and type memory not found',)