Bug 1194982

Summary: NUMA-enabled domains cannot be migrated from RHEL hosts older than 7.1 to 7.1

Product: Red Hat Enterprise Linux 7
Reporter: Jan Kurik <jkurik>
Component: libvirt
Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: urgent
Priority: urgent
Docs Contact:
Version: 7.1
CC: bmcclain, dgilbert, dyuan, gklein, jdenemar, jherrman, jmiao, jsuchane, lhuang, michal.skrivanek, mprivozn, mzhan, pm-eus, rbalakri, rgolan, sherold, zhwang, zpeng
Target Milestone: rc
Keywords: Upstream, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: libvirt-1.2.8-16.el7_1.1
Doc Type: Bug Fix
Doc Text:
A prior QEMU update introduced one-to-one Non-Uniform Memory Access (NUMA) memory pinning between guest NUMA nodes and host NUMA nodes, which also introduced a new way of specifying NUMA topology at QEMU startup. However, the libvirt library previously always used the newer NUMA specification, even if one-to-one NUMA pinning was not specified in the libvirt configuration XML file. This caused the guest to have an incompatible application binary interface (ABI), which in turn led to failed migration of NUMA domains from Red Hat Enterprise Linux 6 to Red Hat Enterprise Linux 7. With this update, libvirt uses the newer NUMA specification only when one-to-one pinning is specified in the configuration, and the described NUMA domains migrate correctly.
Story Points: ---
Clone Of: 1191567
Environment:
Last Closed: 2015-03-05 14:09:59 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1191567
Bug Blocks: 1196644

Attachments: libvirtd.log (flags: none)

Description Jan Kurik 2015-02-21 15:33:56 UTC
This bug has been copied from bug #1191567 and has been proposed
to be backported to 7.1 z-stream (EUS).

Comment 5 Jiri Denemark 2015-02-24 12:52:56 UTC
Fixing the bug summary, because this bug is more general and affects all machine types. Migrating any domain with the following XML from RHEL 6 or 7.0 to 7.1 will fail:

   <numatune>
     <memory mode='strict' nodeset='0'/>
   </numatune>
   <cpu>
     <numa>
       <cell id='0' cpus='0-1' memory='2048000'/>
     </numa>
   </cpu>
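
For reference, one way to preview the qemu command line libvirt would build for a given configuration, without starting the guest, is virsh domxml-to-native; a minimal sketch, assuming the full domain XML has been saved as /tmp/r6.xml (an assumed file name):

# virsh domxml-to-native qemu-argv /tmp/r6.xml

On an affected libvirt build, the output already contains the memdev-based '-numa' form even though no per-node pinning was requested.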

Comment 8 Jincheng Miao 2015-02-26 07:20:26 UTC
For guest NUMA, if the domain XML contains neither hugepage nor memnode elements, libvirtd should not generate a qemu command line with '-object':

# virsh dumpxml dummy
<domain type='kvm' id='18'>
  <name>dummy</name>
  <uuid>1fd33000-27cb-425c-94f9-c7454914acdb</uuid>
  <memory unit='KiB'>2097152</memory>
  <currentMemory unit='KiB'>2097152</currentMemory>
  <vcpu placement='auto' current='1'>16</vcpu>
  <numatune>
    <memory mode='strict' nodeset='0-1'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.1.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <cpu>
    <numa>
      <cell id='0' cpus='0-3' memory='524288'/>
      <cell id='1' cpus='4-7' memory='524288'/>
      <cell id='2' cpus='8-11' memory='524288'/>
      <cell id='3' cpus='12-15' memory='524288'/>
    </numa>
  </cpu>
...
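
For contrast, one-to-one per-node pinning, which legitimately requires the '-object' form, would be expressed with a <memnode> element under <numatune>; an illustrative fragment (not part of the dummy guest above):

  <numatune>
    <memory mode='strict' nodeset='0-1'/>
    <memnode cellid='0' mode='strict' nodeset='1'/>
  </numatune>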

With libvirt-1.2.8-16.el7, however, '-object memory-backend-ram' is added to the qemu command line anyway:

# rpm -q libvirt
libvirt-1.2.8-16.el7.x86_64

# virsh start dummy

# ps -ef | grep qemu
qemu     26986     1  4 15:00 ?        00:00:00 /usr/libexec/qemu-kvm -name dummy -S -machine pc-i440fx-rhel7.1.0,accel=kvm,usb=off -m 2048 -realtime mlock=off -smp 1,maxcpus=16,sockets=16,cores=1,threads=1 -object memory-backend-ram,size=512M,id=ram-node0,host-nodes=0-1,policy=bind -numa node,nodeid=0,cpus=0-3,memdev=ram-node0 -object memory-backend-ram,size=512M,id=ram-node1,host-nodes=0-1,policy=bind -numa node,nodeid=1,cpus=4-7,memdev=ram-node1 -object memory-backend-ram,size=512M,id=ram-node2,host-nodes=0-1,policy=bind -numa node,nodeid=2,cpus=8-11,memdev=ram-node2 -object memory-backend-ram,size=512M,id=ram-node3,host-nodes=0-1,policy=bind -numa node,nodeid=3,cpus=12-15,memdev=ram-node3 -uuid 1fd33000-27cb-425c-94f9-c7454914acdb -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/dummy.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -no-acpi -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x2 -msg timestamp=on


But libvirt-1.2.8-16.el7_1.1.x86_64 uses the legacy 'mem=XXX' form instead:

# rpm -q libvirt
libvirt-1.2.8-16.el7_1.1.x86_64

# virsh start dummy

# ps -ef | grep qemu
qemu     27696     1  0 15:13 ?        00:00:01 /usr/libexec/qemu-kvm -name dummy -S -machine pc-i440fx-rhel7.1.0,accel=kvm,usb=off -m 2048 -realtime mlock=off -smp 1,maxcpus=16,sockets=16,cores=1,threads=1 -numa node,nodeid=0,cpus=0-3,mem=512 -numa node,nodeid=1,cpus=4-7,mem=512 -numa node,nodeid=2,cpus=8-11,mem=512 -numa node,nodeid=3,cpus=12-15,mem=512 -uuid 1fd33000-27cb-425c-94f9-c7454914acdb -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/dummy.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown -no-acpi -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x2 -msg timestamp=on
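
The NUMA-relevant difference, extracted from the two command lines above (node 0 shown; nodes 1-3 are analogous):

libvirt-1.2.8-16.el7 (memdev-based form, which changes the guest ABI):
  -object memory-backend-ram,size=512M,id=ram-node0,host-nodes=0-1,policy=bind -numa node,nodeid=0,cpus=0-3,memdev=ram-node0

libvirt-1.2.8-16.el7_1.1 (legacy form, compatible with older hosts):
  -numa node,nodeid=0,cpus=0-3,mem=512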

Comment 9 Luyao Huang 2015-02-26 09:08:04 UTC
I can reproduce this issue with libvirt-1.2.8-16.el7.x86_64:

1. Prepare a running VM on a RHEL 6 host:
# rpm -q libvirt
libvirt-0.10.2-48.el6.x86_64

# virsh list
 Id    Name                           State
----------------------------------------------------
 6     r6                             running

2. Make sure this VM has settings like these:
# virsh dumpxml r6
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>
  <os>
    <type arch='x86_64' machine='rhel6.5.0'>hvm</type>
    <boot dev='hd'/>
  </os>

  <cpu>
    <numa>
      <cell cpus='0-1' memory='1024000'/>
    </numa>
  </cpu>

3. Migrate to a RHEL 7.1 host (version: libvirt-1.2.8-16.el7.x86_64):
# virsh migrate r6 --live qemu+ssh://10.66.6.19/system
root@10.66.6.19's password: 
error: internal error: process exited while connecting to monitor: 2015-02-26T07:57:23.869659Z qemu-kvm: -numa memdev is not supported by machine rhel6.5.0

4. Migrating from a 7.0 host to a 7.1 host was also tested, with the same result.


Now try to verify this bug with libvirt-1.2.8-16.el7_1.1.x86_64 and qemu-kvm-rhev-2.1.2-23.el7_1.1.x86_64:

1. First, rebuild the source RPM to make sure all the tests pass:
# rpm -ivh libvirt-1.2.8-16.el7_1.1.src.rpm

# rpmbuild -bb SPECS/libvirt.spec

...
============================================================================
Testsuite summary for libvirt 1.2.8
============================================================================
# TOTAL: 116
# PASS:  116
# SKIP:  0
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0
============================================================================
...

2. The VM has numatune settings and guest NUMA settings on the RHEL 6.6 host:
# virsh dumpxml r6
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>
...
  <os>
    <type arch='x86_64' machine='rhel6.5.0'>hvm</type>
    <boot dev='hd'/>
  </os>
...
  <cpu>
    <numa>
      <cell cpus='0-1' memory='1024000'/>
    </numa>
  </cpu>


3. Migration succeeds. Then try to migrate back from 7.1 to 6.6 (this sometimes fails; I don't know why):

On the 6.6 host:
# virsh migrate r6 --live qemu+ssh://10.66.6.19/system
root@10.66.6.19's password: 

On the 7.1 host:
# virsh migrate r6 --live qemu+ssh://10.66.100.118/system
root@10.66.100.118's password: 
error: operation failed: migration job: unexpectedly failed

# virsh migrate r6 --live qemu+ssh://10.66.100.118/system
root@10.66.100.118's password: 

4. Check the running VM's qemu command line on the 7.1 host; libvirt does not use memory-backend-file in this case:

# ps aux|grep qemu
...-numa node,nodeid=0,cpus=0-1,mem=1000...
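
To confirm that the strict host-node pinning is still enforced even without memory backend objects, the domain's numatune settings can be queried; a sketch of the expected output, given the XML above:

# virsh numatune r6
numa_mode      : strict
numa_nodeset   : 0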

5. For hugepages (memnode is not supported in RHEL 7.0 and RHEL 6):
# virsh dumpxml test3
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>
...
  <os>
    <type arch='x86_64' machine='rhel6.5.0'>hvm</type>
    <boot dev='hd'/>
  </os>
...
  <cpu>
    <numa>
      <cell id='0' cpus='0,2' memory='512000'/>
      <cell id='1' cpus='1,3' memory='512000'/>
    </numa>
  </cpu>
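
As a prerequisite for hugepage-backed guests, the host needs 2 MiB hugepages allocated and a hugetlbfs mount; a quick sanity check might look like this (a sketch, standard paths assumed):

# grep Huge /proc/meminfo
# mount | grep hugetlbfs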



6. The VM fails to start on the RHEL 7.1 host:

# virsh start test3
error: Failed to start domain test3
error: internal error: early end of file from monitor: possible problem:
2015-02-26T08:34:53.624748Z qemu-kvm: -numa memdev is not supported by machine rhel6.5.0

7. Change the machine type to pc-i440fx-rhel7.0.0, then start it; libvirt will use memory-backend-file in this case:

# virsh edit test3
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.0.0'>hvm</type>
    <boot dev='hd'/>
  </os>
# virsh start test3
Domain test3 started

# ps aux|grep test3
...
-object memory-backend-file,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,size=500M,id=ram-node0,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0,cpus=2,memdev=ram-node0 -object memory-backend-ram,size=500M,id=ram-node1,host-nodes=0,policy=bind -numa node,nodeid=1,cpus=1,cpus=3,memdev=ram-node1
...
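
Note in the command line above that node 0, for which hugepages are configured, is backed by memory-backend-file on the hugetlbfs path, while node 1 falls back to memory-backend-ram. Free pages per host NUMA node can be checked with virsh freepages (a sketch; cell number and page size chosen to match this setup):

# virsh freepages --cellno 0 --pagesize 2048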

8. Cross-migration test for a VM with hugepages settings on the RHEL 6.6 host:
# virsh dumpxml r6
  <memoryBacking>
    <hugepages/>
  </memoryBacking>
...
  <os>
    <type arch='x86_64' machine='rhel6.5.0'>hvm</type>
    <boot dev='hd'/>
  </os>
...
  <cpu>
    <numa>
      <cell cpus='0-1' memory='1024000'/>
    </numa>
  </cpu>
...

9. Migrate to the RHEL 7.1 host (this fails, but it does not look like a NUMA node settings issue):
# virsh migrate r6 --live qemu+ssh://10.66.6.19/system
root@10.66.6.19's password: 
error: internal error: Unable to find any usable hugetlbfs mount for 0 KiB
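
For completeness, allocating 2 MiB hugepages and providing a hugetlbfs mount on the target would normally look like this (a general sketch; note that per comment 12 below, this particular failure was split into its own bug):

# echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
# mount -t hugetlbfs hugetlbfs /dev/hugepages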


This brings up two questions:

Hi Michal,

Could you please look into the following two issues I hit while trying to verify this bug? Either of them may affect the test result:

1. When verifying with libvirt-1.2.8-16.el7_1.1.x86_64, I found that cross migration does not succeed every time (the steps are in verification step 3 above), and the only hint I can find is a warning in libvirtd.log on the target host (RHEL 6):

2015-02-26 08:25:29.796+0000: 302: warning : qemuDomainObjEnterMonitorInternal:1062 : This thread seems to be the async job owner; entering monitor without asking for a nested job is dangerous

Will this issue affect verification of this bug?


2. When testing cross migration with hugepages, I found that I cannot migrate a VM with hugepages settings from RHEL 6 to RHEL 7.1, or from RHEL 7.0 to RHEL 7.1, although migration from RHEL 6 to RHEL 7.0 works well. The reason seems to be that RHEL 7.1 libvirt refuses to start a VM with XML like this:
...
  <memoryBacking>
    <hugepages/>
  </memoryBacking>
...

Does this issue need a fix in rhel7.1.z, and will it affect verification of this bug?

Thanks in advance for your answers!

Comment 10 Luyao Huang 2015-02-26 09:35:42 UTC
r6 XML for issue 2:

<domain type='kvm' id='5'>
  <name>r6</name>
  <uuid>63b566d4-40e9-4152-b784-f46cc953abb0</uuid>
  <memory unit='KiB'>1024000</memory>
  <currentMemory unit='KiB'>1024000</currentMemory>
  <memoryBacking>
    <hugepages/>
  </memoryBacking>
  <vcpu placement='static' cpuset='1-3' current='1'>4</vcpu>
  <numatune>
    <memory mode='strict' nodeset='0'/>
  </numatune>
  <os>
    <type arch='x86_64' machine='rhel6.5.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <pae/>
  </features>
  <cpu>
    <numa>
      <cell cpus='0-1' memory='1024000'/>
    </numa>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup' track='guest'>
      <catchup  threshold='123' slew='120' limit='10000'/>
    </timer>
    <timer name='pit' tickpolicy='delay'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <pm>
    <suspend-to-mem enabled='yes'/>
    <suspend-to-disk enabled='yes'/>
  </pm>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/nfs/lhuang/test3.img'>
        <seclabel model='selinux' relabel='no'/>
      </source>
      <target dev='hda' bus='ide'/>
      <alias name='ide0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>
    <controller type='scsi' index='0' model='virtio-scsi'>
      <alias name='scsi0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0c' function='0x0'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </controller>
    <controller type='ide' index='0'>
      <alias name='ide0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='ccid' index='0'>
      <alias name='ccid0'/>
    </controller>
    <controller type='usb' index='0'>
      <alias name='usb0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <interface type='network'>
      <mac address='52:54:00:cc:a3:82'/>
      <source network='default'/>
      <target dev='vnet0'/>
      <model type='e1000'/>
      <filterref filter='clean-traffic'/>
      <link state='up'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0e' function='0x0' multifunction='on'/>
    </interface>
    <interface type='bridge'>
      <mac address='52:54:00:19:99:53'/>
      <source bridge='virbr0'/>
      <target dev='vnet1'/>
      <model type='rtl8139'/>
      <filterref filter='clean-traffic'/>
      <alias name='net1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </interface>
    <smartcard mode='passthrough' type='spicevmc'>
      <alias name='smartcard0'/>
      <address type='ccid' controller='0' slot='0'/>
    </smartcard>
    <serial type='pty'>
      <source path='/dev/pts/1'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <serial type='pty'>
      <source path='/dev/pts/2'/>
      <target port='0'/>
      <alias name='serial1'/>
    </serial>
    <console type='pty' tty='/dev/pts/1'>
      <source path='/dev/pts/1'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/r6.agent'/>
      <target type='virtio' name='org.qemu.guest_agent.0'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <channel type='spicevmc'>
      <target type='virtio' name='com.redhat.spice.0'/>
      <alias name='channel1'/>
      <address type='virtio-serial' controller='0' bus='0' port='2'/>
    </channel>
    <input type='tablet' bus='usb'>
      <alias name='input0'/>
    </input>
    <input type='mouse' bus='ps2'/>
    <graphics type='vnc' port='5900' autoport='yes' listen='127.0.0.1'>
      <listen type='address' address='127.0.0.1'/>
    </graphics>
    <graphics type='spice' port='5901' autoport='yes' listen='127.0.0.1'>
      <listen type='address' address='127.0.0.1'/>
    </graphics>
    <video>
      <model type='qxl' ram='65536' vram='65536' heads='1'/>
      <alias name='video0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <watchdog model='i6300esb' action='poweroff'>
      <alias name='watchdog0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </watchdog>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
    </memballoon>
    <rng model='virtio'>
      <backend model='random'>/dev/random</backend>
      <alias name='rng0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </rng>
    <panic>
    </panic>
  </devices>
  <seclabel type='dynamic' model='dac' relabel='yes'>
    <label>107:107</label>
    <imagelabel>107:107</imagelabel>
  </seclabel>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>unconfined_u:system_r:svirt_t:s0:c1004,c1016</label>
    <imagelabel>unconfined_u:object_r:svirt_image_t:s0:c1004,c1016</imagelabel>
  </seclabel>
</domain>

Comment 11 Luyao Huang 2015-02-26 10:16:14 UTC
Created attachment 995514 [details]
libvirtd.log

Comment 12 Michal Privoznik 2015-02-26 10:44:43 UTC
(In reply to Luyao Huang from comment #9)

> Hi Michal,
> 
> Could you please look into the following two issues I hit while trying to
> verify this bug? Either of them may affect the test result:
> 
> 1. When verifying with libvirt-1.2.8-16.el7_1.1.x86_64, I found that cross
> migration does not succeed every time (the steps are in verification step 3
> above), and the only hint I can find is a warning in libvirtd.log on the
> target host (RHEL 6):
> 
> 2015-02-26 08:25:29.796+0000: 302: warning :
> qemuDomainObjEnterMonitorInternal:1062 : This thread seems to be the async
> job owner; entering monitor without asking for a nested job is dangerous

Despite what the message says, it's harmless.

> 
> Will this issue affect verification of this bug?

That's okay and probably a qemu bug. If migration sometimes finishes successfully and sometimes does not, it is likely to be a qemu bug anyway. So this is okay on the libvirt side.

> 
> 
> 2. When testing cross migration with hugepages, I found that I cannot
> migrate a VM with hugepages settings from RHEL 6 to RHEL 7.1, or from RHEL
> 7.0 to RHEL 7.1, although migration from RHEL 6 to RHEL 7.0 works well. The
> reason seems to be that RHEL 7.1 libvirt refuses to start a VM with XML
> like this:
> ...
>   <memoryBacking>
>     <hugepages/>
>   </memoryBacking>
> ...
> 
> Does this issue need a fix in rhel7.1.z, and will it affect verification of
> this bug?

This is not okay, but it is not much related to this bug. So I suggest cloning this bug to cover this second part and letting the original one through.

Comment 13 Dr. David Alan Gilbert 2015-02-26 10:57:55 UTC
Luyao:
  For case (1), where it sometimes works and sometimes doesn't, please open a qemu bug with the details; also include /etc/libvirt/qemu/guestname.xml and the log file for a failing migration, from both the source and the destination.
  Please cc me on the bug.

Comment 14 Luyao Huang 2015-02-26 12:53:50 UTC
(In reply to Michal Privoznik from comment #12)
> (In reply to Luyao Huang from comment #9)
> 
> > Hi Michal,
> > 
> > Could you please look into the following two issues I hit while trying to
> > verify this bug? Either of them may affect the test result:
> > 
> > 1. When verifying with libvirt-1.2.8-16.el7_1.1.x86_64, I found that cross
> > migration does not succeed every time (the steps are in verification step
> > 3 above), and the only hint I can find is a warning in libvirtd.log on the
> > target host (RHEL 6):
> > 
> > 2015-02-26 08:25:29.796+0000: 302: warning :
> > qemuDomainObjEnterMonitorInternal:1062 : This thread seems to be the async
> > job owner; entering monitor without asking for a nested job is dangerous
> 
> Despite what the message says, it's harmless.
> 
> > 
> > Will this issue affect verification of this bug?
> 
> That's okay and probably a qemu bug. If migration sometimes finishes
> successfully and sometimes does not, it is likely to be a qemu bug anyway.
> So this is okay on the libvirt side.
> 
> > 
> > 
> > 2. When testing cross migration with hugepages, I found that I cannot
> > migrate a VM with hugepages settings from RHEL 6 to RHEL 7.1, or from RHEL
> > 7.0 to RHEL 7.1, although migration from RHEL 6 to RHEL 7.0 works well.
> > The reason seems to be that RHEL 7.1 libvirt refuses to start a VM with
> > XML like this:
> > ...
> >   <memoryBacking>
> >     <hugepages/>
> >   </memoryBacking>
> > ...
> > 
> > Does this issue need a fix in rhel7.1.z, and will it affect verification
> > of this bug?
> 
> This is not okay, but it is not much related to this bug. So I suggest
> cloning this bug to cover this second part and letting the original one
> through.

Okay, it seems these two issues are not related to this bug, so I will verify this bug and open/clone a new bug for the new issue. Thanks a lot for your reply.

Comment 15 Luyao Huang 2015-02-26 14:40:06 UTC
(In reply to Dr. David Alan Gilbert from comment #13)
> Luyao:
>   For case (1), where it sometimes works and sometimes doesn't, please open
> a qemu bug with the details; also include /etc/libvirt/qemu/guestname.xml
> and the log file for a failing migration, from both the source and the
> destination.
>   Please cc me on the bug.
Hi David,

Okay, I have opened qemu bug 1196692 for this issue. I found some useful log output in /var/log/libvirt/qemu/r6.log on the source and attached it to the new bug, together with the VM XML and libvirtd.log.

Luyao

Comment 16 Dr. David Alan Gilbert 2015-02-26 14:49:51 UTC
Luyao: Thanks. However, migrating from 7.x to 6.x is NOT supported, so it is not expected to work (even for rhel6 machine types); only 6.x->7.x and 7.x->7.x are supported. If you have any cases where migration fails in those directions, please open another qemu bug.

Comment 18 errata-xmlrpc 2015-03-05 14:09:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0625.html