Bug 2180679
| Summary: | Restore guest fails after unplugging dimm memory device with virtio-mem memory device attached | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | liang cong <lcong> |
| Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
| libvirt sub component: | General | QA Contact: | liang cong <lcong> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | unspecified | | |
| Priority: | unspecified | CC: | chayang, jdenemar, jinzhao, juzhang, lmen, menli, mprivozn, qizhu, virt-maint, zhguo |
| Version: | 9.2 | Keywords: | AutomationTriaged, Triaged, Upstream |
| Target Milestone: | rc | Flags: | pm-rhel: mirror+ |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | libvirt-9.4.0-1.el9 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-11-07 08:31:00 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | 9.4.0 |
| Embargoed: | | | |
I believe this is a QEMU bug. I'm able to reproduce it, and the command line generated on the first start of the VM contains the following:
-object '{"qom-type":"memory-backend-file","id":"memdimm0","mem-path":"/var/lib/libvirt/qemu/ram/3-fedora/dimm0","discard-data":false,"share":false,"prealloc":true,"prealloc-threads":16,"size":268435456}' \
-device '{"driver":"pc-dimm","node":0,"memdev":"memdimm0","id":"dimm0","slot":0}' \
-object '{"qom-type":"memory-backend-file","id":"memvirtiomem0","mem-path":"/hugepages2M/libvirt/qemu/3-fedora","discard-data":true,"share":false,"reserve":false,"size":2147483648}' \
-device '{"driver":"virtio-mem-pci","node":0,"block-size":2097152,"requested-size":1073741824,"memdev":"memvirtiomem0","prealloc":true,"id":"virtiomem0","bus":"pci.0","addr":"0xa"}' \
After dimm0 is unplugged and the domain is saved and restored, this is the command line that libvirt tries to run:
-object '{"qom-type":"memory-backend-file","id":"memvirtiomem0","mem-path":"/hugepages2M/libvirt/qemu/4-fedora","discard-data":true,"share":false,"reserve":false,"size":2147483648}' \
-device '{"driver":"virtio-mem-pci","node":0,"block-size":2097152,"requested-size":1073741824,"memdev":"memvirtiomem0","prealloc":true,"id":"virtiomem0","bus":"pci.0","addr":"0xa"}' \
In other words, the pc-dimm device is gone (which is expected, since it was unplugged). Let me switch over to QEMU for further investigation.
QEMU auto-assigns memory addresses for memory devices, but these addresses cannot really be migrated, because they have to be known when such a device is plugged. So upper layers have to provide these properties when creating the devices (e.g., read the property on the source and configure it on the destination), similar to DIMM slots or PCI addresses. There is nothing special about virtio-mem here (it is merely the only device that actually checks instead of crashing the guest later :) ); it is the same for all memory devices, only the name of the "address property" differs. For DIMMs/NVDIMMs it is the "addr" property; for virtio-mem/virtio-pmem it is the "memaddr" property.

I would assume that such handling is already in place for DIMMs/NVDIMMs? Otherwise unplugging some DIMMs and migrating the VM would crash the VM on the destination when the location of some DIMMs changes in the guest address space. @Michal, is such handling for DIMMs already in place? Then we need similar handling for virtio-mem/virtio-pmem in libvirt. If it's not around for DIMMs, then we need it there as well.

Yeah, it's implemented for DIMMs, and indeed virtio-mem was the missing piece. After I made memaddr stable I can restore from a savefile. So this is indeed a libvirt bug. Let me switch back to libvirt and post patches.

Patches posted on the list: https://listman.redhat.com/archives/libvir-list/2023-March/239051.html

Merged upstream as:
a1bdffdd96 qemu_command: Generate .memaddr for virtio-mem and virtio-pmem
2c15506254 qemu: Fill virtio-mem/virtio-pmem .memaddr at runtime
677156f662 conf: Introduce <address/> for virtio-mem and virtio-pmem
v9.4.0-rc1-5-ga1bdffdd96

Pre-verified on upstream build libvirt v9.4.0-12-gf26923fb2e
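For illustration only (not taken from the bug report itself), this is roughly what the restore command line is expected to look like once the fix is in place: libvirt passes the previously auto-assigned address explicitly through the "memaddr" property, so QEMU no longer re-assigns it during restore. The address value below (0x120000000, i.e. 4831838208) simply mirrors the <address base=.../> element shown in the test steps further down and is only an example:
-device '{"driver":"virtio-mem-pci","node":0,"block-size":2097152,"requested-size":1073741824,"memdev":"memvirtiomem0","prealloc":true,"memaddr":4831838208,"id":"virtiomem0","bus":"pci.0","addr":"0xa"}' \
If needed, the runtime value can also be read back over QMP with qom-get, for example (assuming the device alias virtiomem0):
# virsh qemu-monitor-command vm1 '{"execute":"qom-get","arguments":{"path":"/machine/peripheral/virtiomem0","property":"memaddr"}}'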
Test steps:
1. In the guest, set the kernel parameter memhp_default_state to online_movable:
# grubby --update-kernel=ALL --remove-args=memhp_default_state --args=memhp_default_state=online_movable
# reboot
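(An optional check, not part of the original steps: after the reboot, the effective policy can be confirmed from inside the guest, since /sys/devices/system/memory/auto_online_blocks reflects memhp_default_state and should read online_movable.)
# cat /sys/devices/system/memory/auto_online_blocks
online_movable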
2. Define and start the guest with dimm and virtio-mem memory devices configured as below:
<memory model="dimm" discard="no">
<source>
<pagesize unit="KiB">4</pagesize>
</source>
<target>
<size unit="KiB">131072</size>
<node>0</node>
</target>
</memory>
<memory model="dimm" discard="no">
<source>
<pagesize unit="KiB">4</pagesize>
</source>
<target>
<size unit="KiB">131072</size>
<node>0</node>
</target>
</memory>
<memory model="dimm" discard="no">
<source>
<pagesize unit="KiB">4</pagesize>
</source>
<target>
<size unit="KiB">131072</size>
<node>0</node>
</target>
</memory>
<memory model="dimm" discard="no">
<source>
<pagesize unit="KiB">4</pagesize>
</source>
<target>
<size unit="KiB">131072</size>
<node>0</node>
</target>
</memory>
<memory model='virtio-mem'>
<target>
<size unit='KiB'>2097152</size>
<node>0</node>
<block unit='KiB'>2048</block>
<requested unit='KiB'>1048576</requested>
</target>
</memory>
...
3. Wait for the guest to boot up and for the virtio-mem current size to become non-zero.
# virsh dumpxml vm1 --xpath "//memory[@model='virtio-mem']"
<memory model="virtio-mem">
<target>
<size unit="KiB">2097152</size>
<node>0</node>
<block unit="KiB">2048</block>
<requested unit="KiB">1048576</requested>
<current unit="KiB">0</current>
<address base="0x120000000"/>
</target>
<alias name="virtiomem0"/>
<address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/>
</memory>
4. Prepare a dimm memory device config xml:
# cat memory1.xml
<memory model="dimm" discard="no">
<source>
<pagesize unit="KiB">4</pagesize>
</source>
<target>
<size unit="KiB">131072</size>
<node>0</node>
</target>
<alias name="dimm2"/>
<address type="dimm" slot="2" base="0x110000000"/>
</memory>
5. Hot-unplug the dimm memory device using the config xml from step 4:
# virsh detach-device vm1 memory1.xml
Device detached successfully
6. Save and restore the domain.
# virsh save vm1 vm1.save
Domain 'vm1' saved to vm1.save
# virsh restore vm1.save
Verified on build:
# rpm -q libvirt qemu-kvm
libvirt-9.5.0-3.el9.x86_64
qemu-kvm-8.0.0-9.el9.x86_64
Test steps:
1. In the guest, set the kernel parameter memhp_default_state to online_movable:
# grubby --update-kernel=ALL --remove-args=memhp_default_state --args=memhp_default_state=online_movable
# reboot
2. Define and start the guest with dimm and virtio-mem memory devices configured as below:
<memory model="dimm" discard="no">
<source>
<pagesize unit="KiB">4</pagesize>
</source>
<target>
<size unit="KiB">131072</size>
<node>0</node>
</target>
</memory>
<memory model="dimm" discard="no">
<source>
<pagesize unit="KiB">4</pagesize>
</source>
<target>
<size unit="KiB">131072</size>
<node>0</node>
</target>
</memory>
<memory model="dimm" discard="no">
<source>
<pagesize unit="KiB">4</pagesize>
</source>
<target>
<size unit="KiB">131072</size>
<node>0</node>
</target>
</memory>
<memory model="dimm" discard="no">
<source>
<pagesize unit="KiB">4</pagesize>
</source>
<target>
<size unit="KiB">131072</size>
<node>0</node>
</target>
</memory>
<memory model='virtio-mem'>
<target>
<size unit='KiB'>2097152</size>
<node>0</node>
<block unit='KiB'>2048</block>
<requested unit='KiB'>1048576</requested>
</target>
</memory>
...
3. Wait for the guest to boot up and for the virtio-mem current size to become non-zero.
# virsh dumpxml vm1 --xpath "//memory[@model='virtio-mem']"
<memory model="virtio-mem">
<target>
<size unit="KiB">2097152</size>
<node>0</node>
<block unit="KiB">2048</block>
<requested unit="KiB">1048576</requested>
<current unit="KiB">1048576</current>
<address base="0x120000000"/>
</target>
<alias name="virtiomem0"/>
<address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/>
</memory>
4. Prepare a dimm memory device config xml:
# cat mem.xml
<memory model="dimm" discard="no">
<source>
<pagesize unit="KiB">4</pagesize>
</source>
<target>
<size unit="KiB">131072</size>
<node>0</node>
</target>
<alias name="dimm3"/>
<address type="dimm" slot="3" base="0x118000000"/>
</memory>
5. Hot-unplug the dimm memory device using the config xml from step 4:
# virsh detach-device vm1 mem.xml
Device detached successfully
6. Save and restore the domain.
# virsh save vm1 vm1.save
Domain 'vm1' saved to vm1.save
# virsh restore vm1.save
7. Check the memory device config with virsh dumpxml:
# virsh dumpxml vm1 --xpath "//memory"
<memory unit="KiB">4587520</memory>
<memory model="dimm" discard="no">
<source>
<pagesize unit="KiB">4</pagesize>
</source>
<target>
<size unit="KiB">131072</size>
<node>0</node>
</target>
<alias name="dimm0"/>
<address type="dimm" slot="0" base="0x100000000"/>
</memory>
<memory model="dimm" discard="no">
<source>
<pagesize unit="KiB">4</pagesize>
</source>
<target>
<size unit="KiB">131072</size>
<node>0</node>
</target>
<alias name="dimm1"/>
<address type="dimm" slot="1" base="0x108000000"/>
</memory>
<memory model="dimm" discard="no">
<source>
<pagesize unit="KiB">4</pagesize>
</source>
<target>
<size unit="KiB">131072</size>
<node>0</node>
</target>
<alias name="dimm2"/>
<address type="dimm" slot="2" base="0x110000000"/>
</memory>
<memory model="virtio-mem">
<target>
<size unit="KiB">2097152</size>
<node>0</node>
<block unit="KiB">2048</block>
<requested unit="KiB">1048576</requested>
<current unit="KiB">1048576</current>
<address base="0x120000000"/>
</target>
<alias name="virtiomem0"/>
<address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/>
</memory>
Marking it verified per comment 7.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: libvirt security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6409
Description of problem:
Restore guest fails after unplugging dimm memory device with virtio-mem memory device attached

Version-Release number of selected component (if applicable):
libvirt-9.0.0-9.el9_2.x86_64
qemu-kvm-7.2.0-12.el9_2.x86_64

How reproducible:
100%

Steps to Reproduce:
1. In the guest, set the kernel parameter memhp_default_state to online_movable:
# grubby --update-kernel=ALL --remove-args=memhp_default_state --args=memhp_default_state=online_movable
# reboot
2. Define and start the guest with dimm and virtio-mem memory devices configured as below:
<memory model="dimm" discard="no">
<source>
<pagesize unit="KiB">4</pagesize>
</source>
<target>
<size unit="KiB">262144</size>
<node>0</node>
</target>
</memory>
<memory model='virtio-mem'>
<target>
<size unit='KiB'>2097152</size>
<node>0</node>
<block unit='KiB'>2048</block>
<requested unit='KiB'>1048576</requested>
</target>
</memory>
...
3. Wait for the guest to boot up and for the virtio-mem current size to become non-zero.
# virsh dumpxml vm1 --xpath "//memory[@model='virtio-mem']"
<memory model="virtio-mem">
<target>
<size unit="KiB">2097152</size>
<node>0</node>
<block unit="KiB">2048</block>
<requested unit="KiB">1048576</requested>
<current unit="KiB">1048576</current>
</target>
<alias name="virtiomem0"/>
<address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/>
</memory>
4. Prepare a dimm memory device config xml:
# cat memory1.xml
<memory model='dimm' discard='no'>
<source>
<pagesize unit='KiB'>4</pagesize>
</source>
<target>
<size unit='KiB'>262144</size>
<node>0</node>
</target>
<alias name='dimm0'/>
<address type='dimm' slot='0' base='0x100000000'/>
</memory>
5. Hot-unplug the dimm memory device using the config xml from step 4:
# virsh detach-device vm1 memory1.xml
Device detached successfully
6. Save and restore the domain.
# virsh save vm1 vm1.save
Domain 'vm1' saved to vm1.save
# virsh restore vm1.save
error: Failed to restore domain from vm1.save
error: internal error: qemu unexpectedly closed the monitor: 2023-03-20T04:36:49.299229Z qemu-kvm: Property 'memaddr' changed from 0x110000000 to 0x100000000
2023-03-20T04:36:49.299264Z qemu-kvm: Failed to load virtio-mem-device:tmp
2023-03-20T04:36:49.299269Z qemu-kvm: Failed to load virtio-mem:virtio
2023-03-20T04:36:49.299274Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:02.6:00.0/virtio-mem'
2023-03-20T04:36:49.299755Z qemu-kvm: load of migration failed: Invalid argument

Actual results:
Restore fails with the error shown in step 6

Expected results:
Restore succeeds

Additional info:
If the dimm is defined after virtio-mem, or the virtio-mem current size is 0, there is no error.