Bug 2180679 - Restore guest fails after unplugging dimm memory device with virtio-mem memory device attached
Summary: Restore guest fails after unplugging dimm memory device with virtio-mem memory device attached
Status: VERIFIED
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: libvirt
Version: 9.2
Hardware: Unspecified
OS: Unspecified
Target Milestone: rc
Assignee: Michal Privoznik
QA Contact: liang cong
 
Reported: 2023-03-22 04:05 UTC by liang cong
Modified: 2023-08-07 03:23 UTC
CC List: 10 users

Fixed In Version: libvirt-9.4.0-1.el9
Type: Bug
Target Upstream Version: 9.4.0




Links:
Red Hat Issue Tracker RHELPLAN-152649 (last updated 2023-03-22 04:07:22 UTC)

Description liang cong 2023-03-22 04:05:18 UTC
Description of problem:
Restore guest fails after unplugging dimm memory device with virtio-mem memory device attached

Version-Release number of selected component (if applicable):
libvirt-9.0.0-9.el9_2.x86_64
qemu-kvm-7.2.0-12.el9_2.x86_64

How reproducible:
100%


Steps to Reproduce:
1. In the guest, set the kernel parameter memhp_default_state to online_movable:
# grubby --update-kernel=ALL --remove-args=memhp_default_state --args=memhp_default_state=online_movable
# reboot
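
A quick post-reboot sanity check (an illustrative step, assuming the grubby update applied cleanly):

# grep -o 'memhp_default_state=[^ ]*' /proc/cmdline
memhp_default_state=online_movable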



2. Define and start the guest with dimm and virtio-mem memory devices configured as below:
<memory model="dimm" discard="no">
  <source>
    <pagesize unit="KiB">4</pagesize>
  </source>
  <target>
    <size unit="KiB">262144</size>
    <node>0</node>
   </target>
 </memory>

<memory model='virtio-mem'>
    <target>
      <size unit='KiB'>2097152</size>
      <node>0</node>
      <block unit='KiB'>2048</block>
      <requested unit='KiB'>1048576</requested>
    </target>
</memory>
...


3. Wait for the guest to boot up and for the virtio-mem current size to become non-zero.
# virsh dumpxml vm1 --xpath "//memory[@model='virtio-mem']"
<memory model="virtio-mem">
  <target>
    <size unit="KiB">2097152</size>
    <node>0</node>
    <block unit="KiB">2048</block>
    <requested unit="KiB">1048576</requested>
    <current unit="KiB">1048576</current>
  </target>
  <alias name="virtiomem0"/>
  <address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/>
</memory>



4. Prepare a dimm memory device config xml:
# cat memory1.xml
<memory model='dimm' discard='no'>
      <source>
        <pagesize unit='KiB'>4</pagesize>
      </source>
      <target>
        <size unit='KiB'>262144</size>
        <node>0</node>
      </target>
      <alias name='dimm0'/>
      <address type='dimm' slot='0' base='0x100000000'/>
    </memory>


5. Hot unplug the dimm memory device using the config xml from step 4:
# virsh detach-device vm1 memory1.xml
Device detached successfully
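
To confirm the unplug completed before saving, the remaining dimm devices can be listed (an illustrative check):

# virsh dumpxml vm1 --xpath "//memory[@model='dimm']"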


6. Save and restore the domain.
# virsh save vm1 vm1.save
Domain 'vm1' saved to vm1.save

# virsh restore vm1.save
error: Failed to restore domain from vm1.save
error: internal error: qemu unexpectedly closed the monitor: 2023-03-20T04:36:49.299229Z qemu-kvm: Property 'memaddr' changed from 0x110000000 to 0x100000000
2023-03-20T04:36:49.299264Z qemu-kvm: Failed to load virtio-mem-device:tmp
2023-03-20T04:36:49.299269Z qemu-kvm: Failed to load virtio-mem:virtio
2023-03-20T04:36:49.299274Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:02.6:00.0/virtio-mem'
2023-03-20T04:36:49.299755Z qemu-kvm: load of migration failed: Invalid argument



Actual results:
Restore fails with the error shown in step 6.


Expected results:
Restore succeeds.


Additional info:
If the dimm is defined after the virtio-mem device, or if the virtio-mem current size is 0, the error does not occur.

Comment 1 Michal Privoznik 2023-03-23 12:35:05 UTC
I believe this is a QEMU bug. I'm able to reproduce it, and the command line generated on the first start of the VM contains the following:

-object '{"qom-type":"memory-backend-file","id":"memdimm0","mem-path":"/var/lib/libvirt/qemu/ram/3-fedora/dimm0","discard-data":false,"share":false,"prealloc":true,"prealloc-threads":16,"size":268435456}' \
-device '{"driver":"pc-dimm","node":0,"memdev":"memdimm0","id":"dimm0","slot":0}' \
-object '{"qom-type":"memory-backend-file","id":"memvirtiomem0","mem-path":"/hugepages2M/libvirt/qemu/3-fedora","discard-data":true,"share":false,"reserve":false,"size":2147483648}' \
-device '{"driver":"virtio-mem-pci","node":0,"block-size":2097152,"requested-size":1073741824,"memdev":"memvirtiomem0","prealloc":true,"id":"virtiomem0","bus":"pci.0","addr":"0xa"}' \

and after dimm0 is unplugged and the domain is saved and restored, this is the command line that libvirt tries to run:

-object '{"qom-type":"memory-backend-file","id":"memvirtiomem0","mem-path":"/hugepages2M/libvirt/qemu/4-fedora","discard-data":true,"share":false,"reserve":false,"size":2147483648}' \
-device '{"driver":"virtio-mem-pci","node":0,"block-size":2097152,"requested-size":1073741824,"memdev":"memvirtiomem0","prealloc":true,"id":"virtiomem0","bus":"pci.0","addr":"0xa"}' \

IOW - the pc-dimm device is gone (which is expected - it was unplugged). Let me switch over to QEMU for further investigation.

Comment 2 David Hildenbrand 2023-03-24 22:20:29 UTC
QEMU will auto-assign memory addresses for memory devices; however, these addresses cannot really be migrated, because they have to be known when plugging such a device. So upper layers have to provide these properties when creating the devices (e.g., read the property on the source and configure it on the destination), similar to DIMM slots or PCI addresses.

That's nothing special about virtio-mem (it's just the only one that actually checks instead of crashing the guest later :) ); it's the same for all memory devices, only the "address property" name differs.

For DIMMs/NVDIMMs it's the "addr" property; for virtio-mem/virtio-pmem it's the "memaddr" property.
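
As an illustrative sketch (assuming the device alias virtiomem0 from the steps above), the assigned address can be read on the source via QMP:

# virsh qemu-monitor-command vm1 '{"execute":"qom-get","arguments":{"path":"/machine/peripheral/virtiomem0","property":"memaddr"}}'

The returned value is the base address in bytes.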

I would assume that such handling is already in place for DIMMs/NVDIMMs? Otherwise unplugging some DIMMs and migrating the VM would crash the VM on the destination when the location of some DIMMs changes in guest address space.

@Michal, is such handling for DIMMs already in place? Then we need similar handling for virtio-mem/virtio-pmem in libvirt. If it's not around for DIMMs, then we need it there as well ...

Comment 3 Michal Privoznik 2023-03-28 08:49:04 UTC
Yeah, it's implemented for DIMMs, and virtio-mem was indeed the missing piece. After making memaddr stable, I can restore from a savefile. So this is indeed a libvirt bug. Let me switch back to libvirt and post patches.
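
Concretely, making memaddr stable means the restore-time command line carries the address explicitly. A sketch of the resulting -device line (the memaddr value here, 0x120000000 in decimal, is illustrative):

-device '{"driver":"virtio-mem-pci","node":0,"block-size":2097152,"requested-size":1073741824,"memdev":"memvirtiomem0","prealloc":true,"memaddr":4831838208,"id":"virtiomem0","bus":"pci.0","addr":"0xa"}' \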

Comment 4 Michal Privoznik 2023-03-28 11:58:37 UTC
Patches posted on the list:

https://listman.redhat.com/archives/libvir-list/2023-March/239051.html

Comment 5 Michal Privoznik 2023-05-26 14:48:36 UTC
Merged upstream as:

a1bdffdd96 qemu_command: Generate .memaddr for virtio-mem and virtio-pmem
2c15506254 qemu: Fill virtio-mem/virtio-pmem .memaddr at runtime
677156f662 conf: Introduce <address/> for virtio-mem and virtio-pmem

v9.4.0-rc1-5-ga1bdffdd96
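
With these patches the address is exposed in the domain XML as a new <address/> element under <target>, as seen in the verification output in comments 6 and 7 below:

<memory model='virtio-mem'>
  <target>
    ...
    <address base='0x120000000'/>
  </target>
</memory>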

Comment 6 liang cong 2023-06-05 06:58:41 UTC
Pre-verified on upstream build libvirt v9.4.0-12-gf26923fb2e 

Test steps:
1. In the guest, set the kernel parameter memhp_default_state to online_movable:
# grubby --update-kernel=ALL --remove-args=memhp_default_state --args=memhp_default_state=online_movable
# reboot



2. Define and start the guest with dimm and virtio-mem memory devices configured as below:
   <memory model="dimm" discard="no">
  <source>
    <pagesize unit="KiB">4</pagesize>
  </source>
  <target>
    <size unit="KiB">131072</size>
    <node>0</node>
   </target>
 </memory>
<memory model="dimm" discard="no">
  <source>
    <pagesize unit="KiB">4</pagesize>
  </source>
  <target>
    <size unit="KiB">131072</size>
    <node>0</node>
   </target>
 </memory>
<memory model="dimm" discard="no">
  <source>
    <pagesize unit="KiB">4</pagesize>
  </source>
  <target>
    <size unit="KiB">131072</size>
    <node>0</node>
   </target>
 </memory>
<memory model="dimm" discard="no">
  <source>
    <pagesize unit="KiB">4</pagesize>
  </source>
  <target>
    <size unit="KiB">131072</size>
    <node>0</node>
   </target>
 </memory>
<memory model='virtio-mem'>
    <target>
      <size unit='KiB'>2097152</size>
      <node>0</node>
      <block unit='KiB'>2048</block>
      <requested unit='KiB'>1048576</requested>
    </target>
</memory>
...


3. Wait for the guest to boot up and for the virtio-mem current size to become non-zero.
# virsh dumpxml vm1 --xpath "//memory[@model='virtio-mem']"
<memory model="virtio-mem">
  <target>
    <size unit="KiB">2097152</size>
    <node>0</node>
    <block unit="KiB">2048</block>
    <requested unit="KiB">1048576</requested>
    <current unit="KiB">0</current>
    <address base="0x120000000"/>
  </target>
  <alias name="virtiomem0"/>
  <address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/>
</memory>


4. Prepare a dimm memory device config xml:
# cat memory1.xml
<memory model="dimm" discard="no">
  <source>
    <pagesize unit="KiB">4</pagesize>
  </source>
  <target>
    <size unit="KiB">131072</size>
    <node>0</node>
  </target>
  <alias name="dimm2"/>
  <address type="dimm" slot="2" base="0x110000000"/>
</memory>


5. Hot unplug the dimm memory device using the config xml from step 4:
# virsh detach-device vm1 memory1.xml
Device detached successfully


6. Save and restore the domain.
# virsh save vm1 vm1.save
Domain 'vm1' saved to vm1.save

# virsh restore vm1.save

Comment 7 liang cong 2023-07-25 07:40:48 UTC
Verified on build:
# rpm -q libvirt qemu-kvm
libvirt-9.5.0-3.el9.x86_64
qemu-kvm-8.0.0-9.el9.x86_64

Test steps:
1. In the guest, set the kernel parameter memhp_default_state to online_movable:
# grubby --update-kernel=ALL --remove-args=memhp_default_state --args=memhp_default_state=online_movable
# reboot



2. Define and start the guest with dimm and virtio-mem memory devices configured as below:
   <memory model="dimm" discard="no">
  <source>
    <pagesize unit="KiB">4</pagesize>
  </source>
  <target>
    <size unit="KiB">131072</size>
    <node>0</node>
   </target>
 </memory>
<memory model="dimm" discard="no">
  <source>
    <pagesize unit="KiB">4</pagesize>
  </source>
  <target>
    <size unit="KiB">131072</size>
    <node>0</node>
   </target>
 </memory>
<memory model="dimm" discard="no">
  <source>
    <pagesize unit="KiB">4</pagesize>
  </source>
  <target>
    <size unit="KiB">131072</size>
    <node>0</node>
   </target>
 </memory>
<memory model="dimm" discard="no">
  <source>
    <pagesize unit="KiB">4</pagesize>
  </source>
  <target>
    <size unit="KiB">131072</size>
    <node>0</node>
   </target>
 </memory>
<memory model='virtio-mem'>
    <target>
      <size unit='KiB'>2097152</size>
      <node>0</node>
      <block unit='KiB'>2048</block>
      <requested unit='KiB'>1048576</requested>
    </target>
</memory>
...


3. Wait for the guest to boot up and for the virtio-mem current size to become non-zero.
# virsh dumpxml vm1 --xpath "//memory[@model='virtio-mem']"
<memory model="virtio-mem">
  <target>
    <size unit="KiB">2097152</size>
    <node>0</node>
    <block unit="KiB">2048</block>
    <requested unit="KiB">1048576</requested>
    <current unit="KiB">1048576</current>
    <address base="0x120000000"/>
  </target>
  <alias name="virtiomem0"/>
  <address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/>
</memory>



4. Prepare a dimm memory device config xml:
# cat mem.xml
<memory model="dimm" discard="no">
  <source>
    <pagesize unit="KiB">4</pagesize>
  </source>
  <target>
    <size unit="KiB">131072</size>
    <node>0</node>
  </target>
  <alias name="dimm3"/>
  <address type="dimm" slot="3" base="0x118000000"/>
</memory>



5. Hot unplug the dimm memory device using the config xml from step 4:
# virsh detach-device vm1 mem.xml
Device detached successfully


6. Save and restore the domain.
# virsh save vm1 vm1.save
Domain 'vm1' saved to vm1.save

# virsh restore vm1.save

7. Check the memory device config via virsh dumpxml:
# virsh dumpxml vm1 --xpath "//memory"
<memory unit="KiB">4587520</memory>
<memory model="dimm" discard="no">
  <source>
    <pagesize unit="KiB">4</pagesize>
  </source>
  <target>
    <size unit="KiB">131072</size>
    <node>0</node>
  </target>
  <alias name="dimm0"/>
  <address type="dimm" slot="0" base="0x100000000"/>
</memory>
<memory model="dimm" discard="no">
  <source>
    <pagesize unit="KiB">4</pagesize>
  </source>
  <target>
    <size unit="KiB">131072</size>
    <node>0</node>
  </target>
  <alias name="dimm1"/>
  <address type="dimm" slot="1" base="0x108000000"/>
</memory>
<memory model="dimm" discard="no">
  <source>
    <pagesize unit="KiB">4</pagesize>
  </source>
  <target>
    <size unit="KiB">131072</size>
    <node>0</node>
  </target>
  <alias name="dimm2"/>
  <address type="dimm" slot="2" base="0x110000000"/>
</memory>
<memory model="virtio-mem">
  <target>
    <size unit="KiB">2097152</size>
    <node>0</node>
    <block unit="KiB">2048</block>
    <requested unit="KiB">1048576</requested>
    <current unit="KiB">1048576</current>
    <address base="0x120000000"/>
  </target>
  <alias name="virtiomem0"/>
  <address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/>
</memory>

Comment 10 liang cong 2023-08-07 03:23:01 UTC
Marking it VERIFIED per comment 7.

