Bug 1175397 - memdev= option is not supported on rhel6 machine-types
Summary: memdev= option is not supported on rhel6 machine-types
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Assignee: Michal Privoznik
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On: 1170093
Blocks:
 
Reported: 2014-12-17 16:06 UTC by Eduardo Habkost
Modified: 2015-03-05 07:48 UTC
CC List: 24 users

Fixed In Version: libvirt-1.2.8-12.el7
Doc Type: Bug Fix
Doc Text:
Clone Of: 1170093
Environment:
Last Closed: 2015-03-05 07:48:25 UTC
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2015:0323 normal SHIPPED_LIVE Low: libvirt security, bug fix, and enhancement update 2015-03-05 12:10:54 UTC

Comment 1 Daniel Berrangé 2014-12-17 16:10:29 UTC
Per discussion on IRC, I think it is reasonable for libvirt to only attempt to use memdev if we actually need to set the NUMA binding policy or the sharing policy.

Comment 4 Michal Privoznik 2014-12-18 09:17:35 UTC
So, I've added the XML to a domain, and this is the command line that's been generated by libvirt:

LC_ALL=C LD_LIBRARY_PATH=/home/zippy/work/libvirt/libvirt.git/src/.libs PATH=/sbin:/bin:/usr/sbin:/usr/bin HOME=/root USER=root LOGNAME=root QEMU_AUDIO_DRV=none /home/zippy/work/qemu/qemu.git/x86_64-softmmu/qemu-system-x86_64 -name dummy -S -machine pc-i440fx-2.1,accel=kvm,usb=off -drive file=/usr/share/OVMF/OVMF_CODE.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/var/lib/libvirt/qemu/nvram/dummy_VARS.fd,if=pflash,format=raw,unit=1 -m 3072 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -object memory-backend-ram,size=1024M,id=ram-node0 -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 -object memory-backend-ram,size=2048M,id=ram-node1 -numa node,nodeid=1,cpus=2-3,memdev=ram-node1 -uuid 64cb6277-713f-4e13-8b14-75667b8c9630 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/dummy.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:89:3a:c2,bus=pci.0,addr=0x3 -netdev tap,id=hostnet1 -device rtl8139,netdev=hostnet1,id=net1,mac=00:11:22:33:44:55,bus=pci.0,addr=0x5 -vnc 127.0.0.1:0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -msg timestamp=on

Let me explicitly point out the problematic part:

-object memory-backend-ram,size=1024M,id=ram-node0 -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 -object memory-backend-ram,size=2048M,id=ram-node1 -numa node,nodeid=1,cpus=2-3,memdev=ram-node1

Eduardo, how else would you like to see it? I mean, the user has configured 2 NUMA nodes for the guest, so how should libvirt communicate that to qemu?
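
For reference, the guest NUMA topology behind that command line would be declared in the domain XML roughly as follows (a sketch reconstructed from the generated command line above; cell sizes are in KiB, libvirt's default unit):

```xml
<cpu>
  <numa>
    <!-- 1024 MiB node for vCPUs 0-1, 2048 MiB node for vCPUs 2-3,
         matching size=1024M/2048M in the -object arguments above -->
    <cell id='0' cpus='0-1' memory='1048576'/>
    <cell id='1' cpus='2-3' memory='2097152'/>
  </numa>
</cpu>
```

Note that this XML says nothing about host NUMA placement, which is exactly why emitting memory-backend-ram objects for it is questionable.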

Comment 5 Daniel Berrangé 2014-12-18 10:02:40 UTC
The memory-backend-ram object is not related to the <numa> topology section; that is done using -numa alone.

The memory-backend is only required if we're doing fine-grained mapping of individual guest NUMA nodes to individual host NUMA nodes, isn't it?
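
For illustration, the fine-grained case described here — where the backend objects genuinely are required — is a domain that pins guest NUMA cells to host NUMA nodes via <numatune>. A hypothetical sketch (the host nodeset values are made up for the example):

```xml
<numatune>
  <memory mode='strict' nodeset='0-1'/>
  <!-- guest cell 0 memory must be allocated from host node 0,
       guest cell 1 memory from host node 1 -->
  <memnode cellid='0' mode='strict' nodeset='0'/>
  <memnode cellid='1' mode='strict' nodeset='1'/>
</numatune>
```

With per-cell <memnode> bindings like these, libvirt has to express the per-node placement to QEMU, and -object memory-backend-ram with memdev= is the only way to do that.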

Comment 6 Michal Privoznik 2014-12-18 11:32:41 UTC
Okay, so I've got a patch that generates this command line:

LC_ALL=C LD_LIBRARY_PATH=/home/zippy/work/libvirt/libvirt.git/src/.libs PATH=/sbin:/bin:/usr/sbin:/usr/bin HOME=/root USER=root LOGNAME=root QEMU_AUDIO_DRV=none /home/zippy/work/qemu/qemu.git/x86_64-softmmu/qemu-system-x86_64 -name dummy -S -machine pc-i440fx-2.1,accel=kvm,usb=off -drive file=/usr/share/OVMF/OVMF_CODE.fd,if=pflash,format=raw,unit=0,readonly=on -drive file=/var/lib/libvirt/qemu/nvram/dummy_VARS.fd,if=pflash,format=raw,unit=1 -m 3072 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -numa node,nodeid=0,cpus=0-1,mem=1024 -numa node,nodeid=1,cpus=2-3,mem=2048 -uuid 64cb6277-713f-4e13-8b14-75667b8c9630 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/dummy.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:89:3a:c2,bus=pci.0,addr=0x3 -netdev tap,id=hostnet1 -device rtl8139,netdev=hostnet1,id=net1,mac=00:11:22:33:44:55,bus=pci.0,addr=0x5 -vnc 127.0.0.1:0 -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -msg timestamp=on

And again, the interesting part is:

-numa node,nodeid=0,cpus=0-1,mem=1024 -numa node,nodeid=1,cpus=2-3,mem=2048

So I guess this is what's needed, right? I'll post the patch to the upstream list for review.

Comment 7 Michal Privoznik 2014-12-18 11:47:22 UTC
I've proposed a patch on the libvirt upstream list:

https://www.redhat.com/archives/libvir-list/2014-December/msg00931.html

Comment 8 Michal Privoznik 2014-12-19 07:05:36 UTC
And just pushed the patch:

commit f309db1f4d51009bad0d32e12efc75530b66836b
Author:     Michal Privoznik <mprivozn@redhat.com>
AuthorDate: Thu Dec 18 12:36:48 2014 +0100
Commit:     Michal Privoznik <mprivozn@redhat.com>
CommitDate: Fri Dec 19 07:44:44 2014 +0100

    qemu: Create memory-backend-{ram,file} iff needed
    
    Libvirt BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1175397
    QEMU BZ:    https://bugzilla.redhat.com/show_bug.cgi?id=1170093
    
    In qemu there are two interesting arguments:
    
    1) -numa to create a guest NUMA node
    2) -object memory-backend-{ram,file} to tell qemu which memory
    region on which host's NUMA node it should allocate the guest
    memory from.
    
    Combining these two together we can instruct qemu to create a
    guest NUMA node that is tied to a host NUMA node. And it works
    just fine. However, depending on the machine type used, there
    might be some issues during migration when OVMF is enabled (see
    QEMU BZ). While this truly is a QEMU bug, we can help avoid it.
    The problem lies somewhere within the memory backend objects.
    Having said that, the fix on our side consists of putting those
    objects on the command line if and only if they are needed. For
    instance, while previously we would construct this (in all ways
    correct) command line:
    
        -object memory-backend-ram,size=256M,id=ram-node0 \
        -numa node,nodeid=0,cpus=0,memdev=ram-node0
    
    now we create just:
    
        -numa node,nodeid=0,cpus=0,mem=256
    
    because the backend object is obviously not tied to any specific
    host NUMA node.
    
    Signed-off-by: Michal Privoznik <mprivozn@redhat.com>

v1.2.11-60-gf309db1

Comment 10 Eduardo Habkost 2014-12-19 15:06:04 UTC
Quickly tested the libvirt fix using savevm:

Before:

# virsh restore /tmp/rhel7.save
error: Failed to restore domain from /tmp/rhel7.save
error: internal error: early end of file from monitor: possible problem:
RHEL-6 compat: ich9-usb-uhci1: irq_pin = 3
RHEL-6 compat: ich9-usb-uhci2: irq_pin = 3
RHEL-6 compat: ich9-usb-uhci3: irq_pin = 3
2014-12-19T14:40:46.841498Z qemu-kvm: Unknown ramblock "pc.ram", cannot accept migration
2014-12-19T14:40:46.841530Z qemu-kvm: Ack, bad migration stream!
2014-12-19T14:40:46.841538Z qemu-kvm: Illegal RAM offset 87667612e767000
qemu: warning: error while loading state for instance 0x0 of device 'ram'
2014-12-19T14:40:46.841555Z qemu-kvm: load of migration failed: Invalid argument


After upgrading to qemu-kvm-rhev-2.1.2-17.el7.x86_64:

# virsh restore /tmp/rhel7.save 
error: Failed to restore domain from /tmp/rhel7.save
error: internal error: early end of file from monitor: possible problem:
2014-12-19T15:01:47.735475Z qemu-kvm: -numa memdev is not supported by machine rhel6.5.0


After upgrading to patched libvirt:

# virsh restore /tmp/rhel7.save 
Domain restored from /tmp/rhel7.save

Comment 12 Luyao Huang 2015-01-09 03:50:33 UTC
I can reproduce this issue with libvirt-1.2.8-11.el7
(tested with qemu-kvm-rhev-2.1.2-17.el7.x86_64):

Steps:

1. Try to start a VM (-M rhel6.5.0) that has XML like this (no hugepages):
# virsh dumpxml test4

...
<type arch='x86_64' machine='rhel6.5.0'>hvm</type>
...
  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='1024000'/>
    </numa>
  </cpu>
...

2. # virsh start test4
error: Failed to start domain test4
error: internal error: process exited while connecting to monitor: 2015-01-09T02:47:29.444016Z qemu-kvm: -numa memdev is not supported by machine rhel6.5.0

3. Check the QEMU command line:
# vim /var/log/libvirt/qemu/test4.log
-object memory-backend-ram,size=1000M,id=ram-node0 -numa node,nodeid=0,cpus=0-1,memdev=ram-node0



And I cannot reproduce this issue with libvirt-1.2.8-12.el7:

1. # virsh start test4
Domain test4 started

2. # ps aux | grep qemu
...
-numa node,nodeid=0,cpus=0-1,mem=1000
...

3. Test with migration and save:
# virsh migrate test4 --live qemu+ssh://lhuang/system --verbose
Migration: [100 %]

# virsh managedsave test4

Domain test4 state saved by libvirt

# virsh start test4
Domain test4 started

4. If we set hugepages in the XML, the guest will fail to start (from the QEMU side):
# virsh dumpxml test4

  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
    <nosharepages/>
    <locked/>
  </memoryBacking>

<type arch='x86_64' machine='rhel6.5.0'>hvm</type>

  <cpu>
    <numa>
      <cell id='0' cpus='0-1' memory='1024000'/>
    </numa>
  </cpu>


# virsh start test4
error: Failed to start domain test4
error: internal error: early end of file from monitor: possible problem:
2015-01-09T03:23:04.778963Z qemu-kvm: -numa memdev is not supported by machine rhel6.5.0

5. Check the QEMU command line:
# vim /var/log/libvirt/qemu/test4.log
 -object memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,size=1000M,id=ram-node0 -numa node,nodeid=0,cpus=0-1,memdev=ram-node0
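
This hugepage failure is expected: a per-node <page nodeset=.../> mapping forces libvirt to emit a memory-backend-file object, which still requires memdev= support, so guests that need per-node backing have to use a machine type that supports it. A hypothetical sketch of that change (the exact machine type name depends on the installed QEMU; pc-i440fx-rhel7.0.0 is an assumption here, and changing the machine type changes the guest ABI, so it is not migration-compatible with the old one):

```xml
<os>
  <!-- rhel6.x machine types predate memdev support; a newer
       machine type accepts the memory-backend objects -->
  <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
</os>
```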

But I think there are still problems with OVMF enabled; I notice the commit message says this patch can avoid an OVMF migration issue:

    However, depending on the machine type used, there
    might be some issues during migration when OVMF is enabled (see
    QEMU BZ). While this truly is a QEMU bug, we can help avoid it.


But after testing, I think this patch cannot avoid it:

Steps:

1. Install OVMF and add it to the XML:
    <type arch='x86_64' machine='rhel6.5.0'>hvm</type>
    <loader readonly='yes' type='pflash'>/usr/share/OVMF/OVMF_CODE.fd</loader>
    <nvram template='/usr/share/OVMF/OVMF_VARS.fd'/>

2. Start the VM:
# virsh start test4
Domain test4 started

3. Save and restore:
# virsh managedsave test4

Domain test4 state saved by libvirt
# virsh start test4
error: Failed to start domain test4
error: internal error: early end of file from monitor: possible problem:
qemu-kvm: /builddir/build/BUILD/qemu-2.1.2/savevm.c:906: shadow_bios: Assertion `bios != ((void *)0)' failed.

4. Migrate:

# virsh migrate test4 --live qemu+ssh://lhuang/system
error: internal error: early end of file from monitor: possible problem:
qemu-kvm: /builddir/build/BUILD/qemu-2.1.2/savevm.c:906: shadow_bios: Assertion `bios != ((void *)0)' failed.

So Michal, would you please help me check whether we can verify this bug?

Comment 13 Michal Privoznik 2015-01-09 08:17:56 UTC
(In reply to Luyao Huang from comment #12)

> So Michal, would you please help me check whether we can verify this bug?

Yes, I think your steps prove that the bug is fixed.

Comment 14 Luyao Huang 2015-01-09 08:28:16 UTC
(In reply to Michal Privoznik from comment #13)
> (In reply to Luyao Huang from comment #12)
> 
> > So Michal, would you please help me check whether we can verify this bug?
> 
> Yes, I think your steps prove that the bug is fixed.

Thanks Michal!

Verified this bug; steps are in comment 12.

Comment 16 errata-xmlrpc 2015-03-05 07:48:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0323.html

