Bug 2158701
Summary: Hotplugged dimm device has wrong alias name in some specific scenario

Product: Red Hat Enterprise Linux 9
Component: libvirt (sub component: General)
Version: 9.2
Status: CLOSED ERRATA
Severity: medium
Priority: unspecified
Keywords: Triaged
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Reporter: Fangge Jin <fjin>
Assignee: Peter Krempa <pkrempa>
QA Contact: liang cong <lcong>
CC: dzheng, jdenemar, jsuchane, lcong, lmen, nanli, virt-maint
Fixed In Version: libvirt-9.0.0-2.el9
Last Closed: 2023-05-09 07:27:43 UTC
Type: Bug
Fixed upstream by:

commit 5764930463eb8f450e45fa982651ef6b7a7afd7c
Author: Peter Krempa <pkrempa>
Date:   Thu Jan 19 15:18:45 2023 +0100

    qemu: Remove 'memAliasOrderMismatch' field from VM private data

    The field is no longer used, so we can remove it and the code filling it.

    Signed-off-by: Peter Krempa <pkrempa>
    Reviewed-by: Martin Kletzander <mkletzan>

commit 6d3f0b11b2b056313b123510c96f2924689341f9
Author: Peter Krempa <pkrempa>
Date:   Thu Jan 19 15:16:58 2023 +0100

    qemu: alias: Remove 'oldAlias' argument of qemuAssignDeviceMemoryAlias

    All callers pass 'false', so we no longer need it.

    Signed-off-by: Peter Krempa <pkrempa>
    Reviewed-by: Martin Kletzander <mkletzan>

commit 50ce3463d514950350143f03e8421c8c31889c5d
Author: Peter Krempa <pkrempa>
Date:   Thu Jan 19 15:06:11 2023 +0100

    qemu: hotplug: Remove legacy quirk for 'dimm' address generation

    Commit b7798a07f93 (in fall of 2016) changed the way we generate aliases
    for 'dimm' memory devices, as the alias itself is part of the migration
    stream section naming and thus must be treated as ABI. The code added a
    compatibility layer for VMs with memory hotplug that were started with
    the old scheme, to prevent generating wrong aliases.

    The compatibility layer broke later, though, when 'nvdimm' and 'pmem'
    devices were introduced, as it wrongly detected them as old
    configurations.

    Rather than attempting to fix the legacy compat layer to treat other
    devices properly, we are better off simply removing it, as it is
    extremely unlikely that somebody has a VM started in 2016 still running
    with today's libvirt and attempting to hotplug more memory.

    This fixes a corner case where a user hot-adds a 'dimm' into a VM that
    has a 'dimm' and an 'nvdimm', after a restart of libvirtd, and then
    attempts to migrate the VM.

    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2158701
    Signed-off-by: Peter Krempa <pkrempa>
    Reviewed-by: Martin Kletzander <mkletzan>

Preverified on upstream build libvirt v9.0.0-118-g9f8fba7501

Test steps:

1.
Start a guest with a dimm device (alias=dimm0) and an nvdimm device (alias=nvdimm1):

   <memory model="dimm" access="private" discard="no">
     <source>
       <nodemask>0-1</nodemask>
       <pagesize unit="KiB">2048</pagesize>
     </source>
     <target>
       <size unit="KiB">262144</size>
       <node>0</node>
     </target>
     <alias name="dimm0"/>
     <address type="dimm" slot="0" base="0x100000000"/>
   </memory>
   <memory model="nvdimm">
     <source>
       <path>/tmp/nvdimm</path>
     </source>
     <target>
       <size unit="KiB">524288</size>
       <node>1</node>
       <label>
         <size unit="KiB">128</size>
       </label>
     </target>
     <alias name="nvdimm1"/>
     <address type="dimm" slot="1" base="0x110000000"/>
   </memory>

2. Restart virtqemud:

   # systemctl restart virtqemud

3. Hotplug another dimm device to the guest:

   # cat dimm.xml
   <memory model="dimm" access="private" discard="no">
     <source>
       <nodemask>0-1</nodemask>
       <pagesize unit="KiB">2048</pagesize>
     </source>
     <target>
       <size unit="KiB">262144</size>
       <node>1</node>
     </target>
   </memory>

   # virsh attach-device vm1 dimm.xml
   Device attached successfully

4. Check the memory config:

   # virsh dumpxml vm1 --xpath 'devices//memory'
   <memory model="dimm" access="private" discard="no">
     <source>
       <nodemask>0-1</nodemask>
       <pagesize unit="KiB">2048</pagesize>
     </source>
     <target>
       <size unit="KiB">262144</size>
       <node>0</node>
     </target>
     <alias name="dimm0"/>
     <address type="dimm" slot="0" base="0x100000000"/>
   </memory>
   <memory model="nvdimm">
     <source>
       <path>/tmp/nvdimm</path>
     </source>
     <target>
       <size unit="KiB">524288</size>
       <node>1</node>
       <label>
         <size unit="KiB">128</size>
       </label>
     </target>
     <alias name="nvdimm1"/>
     <address type="dimm" slot="1" base="0x110000000"/>
   </memory>
   <memory model="dimm" access="private" discard="no">
     <source>
       <nodemask>0-1</nodemask>
       <pagesize unit="KiB">2048</pagesize>
     </source>
     <target>
       <size unit="KiB">262144</size>
       <node>1</node>
     </target>
     <alias name="dimm2"/>
     <address type="dimm" slot="2" base="0x130000000"/>
   </memory>

5.
Query memory devices at the QEMU level:

   # virsh qemu-monitor-command vm1 '{"execute":"query-memory-devices"}'
   {"return":[{"type":"dimm","data":{"memdev":"/objects/memdimm0","hotplugged":false,"addr":4294967296,"hotpluggable":true,"size":268435456,"slot":0,"node":0,"id":"dimm0"}},{"type":"nvdimm","data":{"memdev":"/objects/memnvdimm1","hotplugged":false,"addr":4563402752,"hotpluggable":true,"size":536739840,"slot":1,"node":1,"id":"nvdimm1"}},{"type":"dimm","data":{"memdev":"/objects/memdimm2","hotplugged":true,"addr":5100273664,"hotpluggable":true,"size":268435456,"slot":2,"node":1,"id":"dimm2"}}],"id":"libvirt-18"}

6. Do virsh save & restore:

   # virsh save vm1 /tmp/save
   Domain 'vm1' saved to /tmp/save
   # virsh restore /tmp/save
   Domain restored from /tmp/save

7. Do live migration:

   # virsh migrate vm1 qemu+tcp://dell-per740xd-14.lab.eng.pek2.redhat.com/system --live --p2p --undefinesource --persistent

Verified on upstream build libvirt v9.0.0-118-g9f8fba7501 with the same test steps as above.

Verified on libvirt-9.0.0-3.el9.x86_64 with the same steps as in comment 6.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (libvirt bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2171
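The alias mismatch comes down to how the numeric suffix of the next 'dimm<N>' alias is chosen. The sketch below is an illustrative Python model of the two numbering schemes, not libvirt's actual code: the legacy pre-2016 behavior (which the broken compat layer wrongly re-enabled when an nvdimm was present) effectively numbered a new dimm by counting only existing dimm-model devices, while the fixed scheme keeps the index unique across all memory devices, so an existing nvdimm1 also reserves index 1.

```python
import re

def alias_index(alias):
    # Trailing decimal index of a memory alias such as "dimm0" or "nvdimm1".
    m = re.search(r"(\d+)$", alias)
    return int(m.group(1)) if m else -1

def next_alias_fixed(existing, model="dimm"):
    # Fixed scheme: the index is unique across ALL memory devices,
    # so "nvdimm1" also reserves index 1 and the next dimm gets index 2.
    idx = max((alias_index(a) for a in existing), default=-1) + 1
    return f"{model}{idx}"

def next_alias_legacy(existing, model="dimm"):
    # Legacy scheme (illustrative model): the index is just the count of
    # existing devices of the same model, ignoring nvdimm devices.
    count = sum(1 for a in existing if re.fullmatch(model + r"\d+", a))
    return f"{model}{count}"

existing = ["dimm0", "nvdimm1"]
print(next_alias_legacy(existing))  # dimm1 (the wrong alias from this bug)
print(next_alias_fixed(existing))   # dimm2 (the expected alias)
```

Because the alias names the QEMU ramblock ("memdimm1") and is part of the migration stream ABI, the off-by-one alias is what the destination QEMU later rejects.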
Created attachment 1936113 [details]
virtqemud and qemu log

Description of problem:
Start a guest with a dimm device (alias=dimm0) and an nvdimm device (alias=nvdimm1), restart virtqemud, then hotplug another dimm device. The hotplugged dimm device gets the wrong alias: dimm1 (the correct alias is dimm2). A subsequent live migration (or save & restore) then fails with an error from QEMU:

   Unknown ramblock "memdimm1", cannot accept migration

If virtqemud is not restarted, the hotplugged dimm device gets the correct alias (dimm2), and live migration succeeds.

Version-Release number of selected component:
libvirt-8.10.0-2.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a guest with a dimm device (alias=dimm0) and an nvdimm device (alias=nvdimm1):

   <memory model="dimm" access="private" discard="no">
     <source>
       <nodemask>0</nodemask>
       <pagesize unit="KiB">2048</pagesize>
     </source>
     <target>
       <size unit="KiB">262144</size>
       <node>0</node>
     </target>
     <alias name="dimm0"/>
     <address type="dimm" slot="0" base="0x100000000"/>
   </memory>
   <memory model="nvdimm">
     <source>
       <path>/tmp/nvdimm</path>
     </source>
     <target>
       <size unit="KiB">524288</size>
       <node>1</node>
       <label>
         <size unit="KiB">128</size>
       </label>
     </target>
     <alias name="nvdimm1"/>
     <address type="dimm" slot="1" base="0x110000000"/>
   </memory>

2. Restart virtqemud

3. Hotplug another dimm device to the guest:

   # cat dimm.xml
   <memory model="dimm" access="private" discard="no">
     <source>
       <nodemask>0</nodemask>
       <pagesize unit="KiB">2048</pagesize>
     </source>
     <target>
       <size unit="KiB">262144</size>
       <node>1</node>
     </target>
   </memory>

   # virsh attach-device rhel9.0.0-full dimm.xml
   Device attached successfully

   # virsh dumpxml rhel9.0.0-full --xpath //memory
   ...skip....
   <memory model="dimm" access="private" discard="no">
     <source>
       <nodemask>0</nodemask>
       <pagesize unit="KiB">2048</pagesize>
     </source>
     <target>
       <size unit="KiB">262144</size>
       <node>1</node>
     </target>
     <alias name="dimm1"/>
     <address type="dimm" slot="2" base="0x130000000"/>
   </memory>

4. Query memory devices at the QEMU level:

   # virsh qemu-monitor-command rhel9.0.0-full '{"execute":"query-memory-devices"}'
   {"return":[{"type":"dimm","data":{"memdev":"/objects/memdimm0","hotplugged":false,"addr":4294967296,"hotpluggable":true,"size":268435456,"slot":0,"node":0,"id":"dimm0"}},{"type":"nvdimm","data":{"memdev":"/objects/memnvdimm1","hotplugged":false,"addr":4563402752,"hotpluggable":true,"size":536739840,"slot":1,"node":1,"id":"nvdimm1"}},{"type":"dimm","data":{"memdev":"/objects/memdimm1","hotplugged":true,"addr":5100273664,"hotpluggable":true,"size":268435456,"slot":2,"node":1,"id":"dimm1"}}],"id":"libvirt-19"}

5. Try to do live migration (or save & restore):

   # virsh migrate rhel9.0.0-full qemu+tcp://dell-per750-04.lab.eng.pek2.redhat.com/system --live --p2p --undefinesource --persistent
   error: operation failed: job 'migration out' failed: Unable to write to socket: Bad file descriptor

6. Check the qemu log on the destination host:

   2023-01-04T08:36:48.520108Z qemu-kvm: Unknown ramblock "memdimm1", cannot accept migration
   2023-01-04T08:36:48.520182Z qemu-kvm: error while loading state for instance 0x0 of device 'ram'
   2023-01-04T08:36:48.520499Z qemu-kvm: load of migration failed: Invalid argument
   2023-01-04 08:36:48.922+0000: shutting down, reason=crashed

Actual results:
As above

Expected results:
The hotplugged dimm device has the correct alias name (dimm2), and live migration succeeds.

Additional info:
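One way to spot this class of bug from the query-memory-devices output alone is to check that the numeric suffixes of the device ids are not reused across memory devices; in the buggy run above, "nvdimm1" and "dimm1" collide on index 1. The sketch below checks that invariant on sample data trimmed from the QMP reply above; treating index uniqueness as the expected invariant is an assumption based on the fixed behavior described in the commits, not a documented QEMU rule.

```python
import json
import re

# Trimmed from the "virsh qemu-monitor-command ... query-memory-devices"
# output of the buggy run above (only the fields this check needs).
reply = json.loads('''{"return":[
  {"type":"dimm","data":{"slot":0,"id":"dimm0"}},
  {"type":"nvdimm","data":{"slot":1,"id":"nvdimm1"}},
  {"type":"dimm","data":{"slot":2,"id":"dimm1"}}],"id":"libvirt-19"}''')

def duplicate_indices(reply):
    # Collect the trailing numeric index of each memory device id and
    # report any index that appears on more than one device.
    seen, dups = {}, []
    for dev in reply["return"]:
        alias = dev["data"]["id"]
        idx = int(re.search(r"(\d+)$", alias).group(1))
        if idx in seen:
            dups.append((seen[idx], alias))
        seen[idx] = alias
    return dups

print(duplicate_indices(reply))  # [('nvdimm1', 'dimm1')] -> index 1 reused
```

On a healthy guest (dimm0, nvdimm1, dimm2) the function returns an empty list.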