Bug 2158701
| Summary: | Hotplugged dimm device has wrong alias name in some specific scenario | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Fangge Jin <fjin> | ||||
| Component: | libvirt | Assignee: | Peter Krempa <pkrempa> | ||||
| libvirt sub component: | General | QA Contact: | liang cong <lcong> | ||||
| Status: | CLOSED ERRATA | Docs Contact: | |||||
| Severity: | medium | ||||||
| Priority: | unspecified | CC: | dzheng, jdenemar, jsuchane, lcong, lmen, nanli, virt-maint | ||||
| Version: | 9.2 | Keywords: | Triaged | ||||
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ | ||||
| Target Release: | --- | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | libvirt-9.0.0-2.el9 | Doc Type: | If docs needed, set a value | ||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2023-05-09 07:27:43 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: | |||||||
Fixed upstream by:
commit 5764930463eb8f450e45fa982651ef6b7a7afd7c
Author: Peter Krempa <pkrempa>
Date: Thu Jan 19 15:18:45 2023 +0100
qemu: Remove 'memAliasOrderMismatch' field from VM private data
The field is no longer used so we can remove it and the code filling it.
Signed-off-by: Peter Krempa <pkrempa>
Reviewed-by: Martin Kletzander <mkletzan>
commit 6d3f0b11b2b056313b123510c96f2924689341f9
Author: Peter Krempa <pkrempa>
Date: Thu Jan 19 15:16:58 2023 +0100
qemu: alias: Remove 'oldAlias' argument of qemuAssignDeviceMemoryAlias
All callers pass 'false' so we no longer need it.
Signed-off-by: Peter Krempa <pkrempa>
Reviewed-by: Martin Kletzander <mkletzan>
commit 50ce3463d514950350143f03e8421c8c31889c5d
Author: Peter Krempa <pkrempa>
Date: Thu Jan 19 15:06:11 2023 +0100
qemu: hotplug: Remove legacy quirk for 'dimm' address generation
Commit b7798a07f93 (in fall of 2016) changed the way we generate aliases
for 'dimm' memory devices as the alias itself is part of the migration
stream section naming and thus must be treated as ABI.
The code added compatibility layer for VMs with memory hotplug started
with the old scheme to prevent from generating wrong aliases. The
compatibility layer broke though later when 'nvdimm' and 'pmem' devices
were introduced as it wrongly detected them as old configuration.
Now rather than attempting to fix the legacy compat layer to treat other
devices properly we'll be better off simply removing it as it's
extremely unlikely that somebody has a VM started in 2016 running with
today's libvirt and attempts to hotplug more memory.
This fixes a corner case when a user hot-adds a 'dimm' into a VM with a
'dimm' and a 'nvdimm' after restart of libvirtd and then attempts to
migrate the VM.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2158701
Signed-off-by: Peter Krempa <pkrempa>
Reviewed-by: Martin Kletzander <mkletzan>
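To illustrate the failure mode described in the "Remove legacy quirk" commit message, here is a rough Python model (not libvirt's C code). It assumes that the legacy scheme numbered a new alias one past the highest existing index with the same prefix, that the current scheme uses the slot number, and that the compatibility check compared every memory device alias against the 'dimm' prefix only. All names (MemDev, alias_index, legacy_detected, assign_alias) are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MemDev:
    model: str                   # "dimm" or "nvdimm"
    slot: int
    alias: Optional[str] = None


def alias_index(alias: Optional[str], prefix: str) -> int:
    """Return N if alias is exactly prefix + N, otherwise -1."""
    match = re.fullmatch(re.escape(prefix) + r"(\d+)", alias or "")
    return int(match.group(1)) if match else -1


def legacy_detected(devs: List[MemDev]) -> bool:
    # Assumed faulty check: only the "dimm" prefix is considered, so
    # "nvdimm1" yields -1, which never equals its slot.
    return any(alias_index(d.alias, "dimm") != d.slot for d in devs)


def assign_alias(devs: List[MemDev], new: MemDev, legacy: bool) -> str:
    if legacy:
        # Assumed old scheme: one past the highest index with this prefix.
        index = max((alias_index(d.alias, new.model) for d in devs), default=-1) + 1
    else:
        # Current scheme: the alias index is the slot number.
        index = new.slot
    new.alias = f"{new.model}{index}"
    return new.alias


devs = [MemDev("dimm", 0, "dimm0"), MemDev("nvdimm", 1, "nvdimm1")]
print(legacy_detected(devs))                                  # True
print(assign_alias(devs, MemDev("dimm", 2), legacy=True))     # dimm1
print(assign_alias(devs, MemDev("dimm", 2), legacy=False))    # dimm2
```

Under these assumptions the nvdimm alias alone is enough to trigger the legacy path, which yields dimm1 for a device in slot 2, matching the alias reported in this bug.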
Preverified on upstream build libvirt v9.0.0-118-g9f8fba7501
Test steps:
1. Start a guest with a dimm device (alias=dimm0) and an nvdimm device (alias=nvdimm1)
<memory model="dimm" access="private" discard="no">
<source>
<nodemask>0-1</nodemask>
<pagesize unit="KiB">2048</pagesize>
</source>
<target>
<size unit="KiB">262144</size>
<node>0</node>
</target>
<alias name="dimm0"/>
<address type="dimm" slot="0" base="0x100000000"/>
</memory>
<memory model="nvdimm">
<source>
<path>/tmp/nvdimm</path>
</source>
<target>
<size unit="KiB">524288</size>
<node>1</node>
<label>
<size unit="KiB">128</size>
</label>
</target>
<alias name="nvdimm1"/>
<address type="dimm" slot="1" base="0x110000000"/>
</memory>
2. Restart virtqemud
# systemctl restart virtqemud
3. Hotplug another dimm device to the guest.
# cat dimm.xml
<memory model="dimm" access="private" discard="no">
<source>
<nodemask>0-1</nodemask>
<pagesize unit="KiB">2048</pagesize>
</source>
<target>
<size unit="KiB">262144</size>
<node>1</node>
</target>
</memory>
# virsh attach-device vm1 dimm.xml
Device attached successfully
4. Check the memory config
# virsh dumpxml vm1 --xpath 'devices//memory'
<memory model="dimm" access="private" discard="no">
<source>
<nodemask>0-1</nodemask>
<pagesize unit="KiB">2048</pagesize>
</source>
<target>
<size unit="KiB">262144</size>
<node>0</node>
</target>
<alias name="dimm0"/>
<address type="dimm" slot="0" base="0x100000000"/>
</memory>
<memory model="nvdimm">
<source>
<path>/tmp/nvdimm</path>
</source>
<target>
<size unit="KiB">524288</size>
<node>1</node>
<label>
<size unit="KiB">128</size>
</label>
</target>
<alias name="nvdimm1"/>
<address type="dimm" slot="1" base="0x110000000"/>
</memory>
<memory model="dimm" access="private" discard="no">
<source>
<nodemask>0-1</nodemask>
<pagesize unit="KiB">2048</pagesize>
</source>
<target>
<size unit="KiB">262144</size>
<node>1</node>
</target>
<alias name="dimm2"/>
<address type="dimm" slot="2" base="0x130000000"/>
</memory>
5. Query memory devices at qemu level:
# virsh qemu-monitor-command vm1 '{"execute":"query-memory-devices"}'
{"return":[{"type":"dimm","data":{"memdev":"/objects/memdimm0","hotplugged":false,"addr":4294967296,"hotpluggable":true,"size":268435456,"slot":0,"node":0,"id":"dimm0"}},{"type":"nvdimm","data":{"memdev":"/objects/memnvdimm1","hotplugged":false,"addr":4563402752,"hotpluggable":true,"size":536739840,"slot":1,"node":1,"id":"nvdimm1"}},{"type":"dimm","data":{"memdev":"/objects/memdimm2","hotplugged":true,"addr":5100273664,"hotpluggable":true,"size":268435456,"slot":2,"node":1,"id":"dimm2"}}],"id":"libvirt-18"}
6. Do virsh save & restore
# virsh save vm1 /tmp/save
Domain 'vm1' saved to /tmp/save
# virsh restore /tmp/save
Domain restored from /tmp/save
7. Do live migration
# virsh migrate vm1 qemu+tcp://dell-per740xd-14.lab.eng.pek2.redhat.com/system --live --p2p --undefinesource --persistent
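The naming rule checked in steps 4 and 5 can be automated. A minimal Python sketch, assuming the reply format shown in step 5 and that a correct alias is the device type followed by its slot number. The function name misnamed_memory_devices is made up for this example, and the sample replies are trimmed to the fields the check reads.

```python
import json


def misnamed_memory_devices(reply_text):
    """Return (id, expected_id) pairs for dimm/nvdimm devices whose id is not <type><slot>."""
    bad = []
    for dev in json.loads(reply_text)["return"]:
        if dev["type"] not in ("dimm", "nvdimm"):
            continue
        data = dev["data"]
        expected = f"{dev['type']}{data['slot']}"
        if data["id"] != expected:
            bad.append((data["id"], expected))
    return bad


# Trimmed copy of the reply from step 5 (only "type", "slot" and "id" kept).
FIXED = (
    '{"return":[{"type":"dimm","data":{"slot":0,"id":"dimm0"}},'
    '{"type":"nvdimm","data":{"slot":1,"id":"nvdimm1"}},'
    '{"type":"dimm","data":{"slot":2,"id":"dimm2"}}]}'
)
# Same reply with the alias this bug produced before the fix.
BROKEN = FIXED.replace('"id":"dimm2"', '"id":"dimm1"')

print(misnamed_memory_devices(FIXED))   # []
print(misnamed_memory_devices(BROKEN))  # [('dimm1', 'dimm2')]
```

An empty list means every dimm and nvdimm id agrees with its slot, which is the state expected after the fix.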
Verified on upstream build libvirt v9.0.0-118-g9f8fba7501
Verified on libvirt-9.0.0-3.el9.x86_64 using the same steps as in comment 6.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (libvirt bug fix and enhancement update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2023:2171
Created attachment 1936113 [details]
virtqemud and qemu log
Description of problem:
Start a guest with a dimm device (alias=dimm0) and an nvdimm device (alias=nvdimm1), restart virtqemud, then hotplug another dimm device. The hotplugged dimm device gets the wrong alias, dimm1 (the correct alias is dimm2). A subsequent live migration (or save & restore) then fails with this error from qemu:
Unknown ramblock "memdimm1", cannot accept migration
If virtqemud is not restarted, the hotplugged dimm device gets the correct alias, dimm2, and live migration succeeds.
Version-Release number of selected component:
libvirt-8.10.0-2.el9.x86_64
How reproducible:
100%
Steps to Reproduce:
1. Start a guest with a dimm device (alias=dimm0) and an nvdimm device (alias=nvdimm1)
<memory model="dimm" access="private" discard="no">
<source>
<nodemask>0</nodemask>
<pagesize unit="KiB">2048</pagesize>
</source>
<target>
<size unit="KiB">262144</size>
<node>0</node>
</target>
<alias name="dimm0"/>
<address type="dimm" slot="0" base="0x100000000"/>
</memory>
<memory model="nvdimm">
<source>
<path>/tmp/nvdimm</path>
</source>
<target>
<size unit="KiB">524288</size>
<node>1</node>
<label>
<size unit="KiB">128</size>
</label>
</target>
<alias name="nvdimm1"/>
<address type="dimm" slot="1" base="0x110000000"/>
</memory>
2. Restart virtqemud
3. Hotplug another dimm device to the guest.
# cat dimm.xml
<memory model="dimm" access="private" discard="no">
<source>
<nodemask>0</nodemask>
<pagesize unit="KiB">2048</pagesize>
</source>
<target>
<size unit="KiB">262144</size>
<node>1</node>
</target>
</memory>
# virsh attach-device rhel9.0.0-full dimm.xml
Device attached successfully
# virsh dumpxml rhel9.0.0-full --xpath //memory
...skip....
<memory model="dimm" access="private" discard="no">
<source>
<nodemask>0</nodemask>
<pagesize unit="KiB">2048</pagesize>
</source>
<target>
<size unit="KiB">262144</size>
<node>1</node>
</target>
<alias name="dimm1"/>
<address type="dimm" slot="2" base="0x130000000"/>
</memory>
4. Query memory devices at qemu level:
# virsh qemu-monitor-command rhel9.0.0-full '{"execute":"query-memory-devices"}'
{"return":[{"type":"dimm","data":{"memdev":"/objects/memdimm0","hotplugged":false,"addr":4294967296,"hotpluggable":true,"size":268435456,"slot":0,"node":0,"id":"dimm0"}},{"type":"nvdimm","data":{"memdev":"/objects/memnvdimm1","hotplugged":false,"addr":4563402752,"hotpluggable":true,"size":536739840,"slot":1,"node":1,"id":"nvdimm1"}},{"type":"dimm","data":{"memdev":"/objects/memdimm1","hotplugged":true,"addr":5100273664,"hotpluggable":true,"size":268435456,"slot":2,"node":1,"id":"dimm1"}}],"id":"libvirt-19"}
5. Try to do live migration (or do save & restore):
# virsh migrate rhel9.0.0-full qemu+tcp://dell-per750-04.lab.eng.pek2.redhat.com/system --live --p2p --undefinesource --persistent
error: operation failed: job 'migration out' failed: Unable to write to socket: Bad file descriptor
6. Check qemu log on destination host:
2023-01-04T08:36:48.520108Z qemu-kvm: Unknown ramblock "memdimm1", cannot accept migration
2023-01-04T08:36:48.520182Z qemu-kvm: error while loading state for instance 0x0 of device 'ram'
2023-01-04T08:36:48.520499Z qemu-kvm: load of migration failed: Invalid argument
2023-01-04 08:36:48.922+0000: shutting down, reason=crashed
Actual results:
As above
Expected results:
The hotplugged dimm device has the correct alias name (dimm2), and live migration succeeds.
Additional info:
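Why the wrong alias breaks migration can be sketched in a few lines of Python. This assumes the destination qemu is started with slot-based aliases and that each memory backend is named "mem" plus the alias, as the memdev paths in the query-memory-devices output suggest. backend_name and unknown_ramblocks are hypothetical helpers, not libvirt or qemu functions.

```python
def backend_name(model, slot):
    # Assumed naming: "mem" + alias, where the alias is <model><slot>.
    return f"mem{model}{slot}"


def unknown_ramblocks(source_blocks, dest_devices):
    """Return source RAM block names that the destination would not recognise."""
    dest_blocks = {backend_name(model, slot) for model, slot in dest_devices}
    return [block for block in source_blocks if block not in dest_blocks]


# Backend names on the source, taken from the memdev paths in step 4.
source = ["memdimm0", "memnvdimm1", "memdimm1"]
# (model, slot) pairs the destination would be started with.
dest = [("dimm", 0), ("nvdimm", 1), ("dimm", 2)]

print(unknown_ramblocks(source, dest))  # ['memdimm1']
```

The only name missing on the destination in this model is memdimm1, the same RAM block named in the destination qemu log in step 6.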