Bug 1175709

Summary: Unable to start guest with hugepages and strict numa pinning

Product: Red Hat Enterprise Linux 7
Component: libvirt
Version: 7.1
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: urgent
Priority: urgent
Target Milestone: rc
Target Release: 7.1
Keywords: TestOnly
Reporter: Daniel Berrangé <berrange>
Assignee: Libvirt Maintainers <libvirt-maint>
QA Contact: Virtualization Bugs <virt-bugs>
CC: dyuan, honzhang, jdenemar, jmiao, jshortt, kimi.zhang, knoel, ovasik, rbalakri, sgordon, tvvcox
Fixed In Version: libvirt-1.2.8-11.el7
Doc Type: Bug Fix
Type: Bug
Last Closed: 2015-03-05 07:48:31 UTC
Bug Depends On: 1170484

Description Daniel Berrangé 2014-12-18 12:47:47 UTC
Description of problem:
# virsh dumpxml serial1
<domain type='qemu'>
  <name>serial1</name>
  <uuid>68e2a27e-2f92-5545-2c8e-38b2cec76487</uuid>
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
      <nova:package version="REDHATNOVAVERSION"/>
      <nova:name>i1</nova:name>
      <nova:creationTime>2014-12-18 10:25:43</nova:creationTime>
      <nova:flavor name="m1.micro">
        <nova:memory>128</nova:memory>
        <nova:disk>0</nova:disk>
        <nova:swap>0</nova:swap>
        <nova:ephemeral>0</nova:ephemeral>
        <nova:vcpus>1</nova:vcpus>
      </nova:flavor>
      <nova:owner>
        <nova:user uuid="c04446deb95f4ffeb7108bde31ac9d86">admin</nova:user>
        <nova:project uuid="db7860f7c8384185a76b695a7ed0051c">demo</nova:project>
      </nova:owner>
      <nova:root type="image" uuid="132be356-3c38-4f20-9324-ccf0aad23da5"/>
    </nova:instance>
  </metadata>
  <memory unit='KiB'>131072</memory>
  <currentMemory unit='KiB'>131072</currentMemory>
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>
  <vcpu placement='static'>1</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0-5'/>
    <emulatorpin cpuset='0-5'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
    <sysinfo type='smbios'>
      <system>
        <entry name='manufacturer'>OpenStack Foundation</entry>
        <entry name='product'>OpenStack Nova</entry>
        <entry name='version'>REDHATNOVAVERSION</entry>
        <entry name='serial'>776b62a6-0e26-4120-827a-ef28c12042b9</entry>
        <entry name='uuid'>68e2a27e-2f92-5545-2c8e-38b2cec76487</entry>
      </system>
    </sysinfo>
  <os>
    <type arch='x86_64' machine='pc-i440fx-rhel7.1.0'>hvm</type>
    <boot dev='hd'/>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu>
    <topology sockets='1' cores='1' threads='1'/>
    <numa>
      <cell id='0' cpus='0' memory='131072'/>
    </numa>
  </cpu>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/var/lib/libvirt/images/demo.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <controller type='usb' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'/>
    <controller type='ide' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <interface type='bridge'>
      <mac address='fa:16:3e:72:b4:6f'/>
      <source bridge='virbr0'/>
      <model type='virtio'/>
      <driver name='qemu'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </interface>
    <serial type='pty'>
      <target port='0'/>
    </serial>
    <serial type='pty'>
      <target port='1'/>
    </serial>
    <console type='pty'>
      <target type='serial' port='0'/>
    </console>
    <input type='mouse' bus='ps2'/>
    <input type='keyboard' bus='ps2'/>
    <graphics type='vnc' port='-1' autoport='yes' listen='127.0.0.1' keymap='en-us'>
      <listen type='address' address='127.0.0.1'/>
    </graphics>
    <video>
      <model type='cirrus' vram='16384' heads='1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </video>
    <memballoon model='virtio'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
      <stats period='10'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'/>
</domain>
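
Note: reproducing the XML above assumes 2 MiB hugepages are reserved on host NUMA node 0 and that a hugetlbfs mount is available to libvirt. A minimal host-prep sketch (the count of 64 pages is an assumption, sized to match the 131072 KiB guest memory):

# echo 64 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
# mount | grep hugetlbfs    # systemd mounts /dev/hugepages by default on RHEL 7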

# virsh start serial1
error: Failed to start domain serial1
error: Failed to create controller cpu for group: No such file or directory


This is a regression in the libvirt-1.2.8-10.el7ost.x86_64 RPM of libvirt; it works fine with libvirt-1.2.8-9.el7ost.x86_64.


The following patch in -10 is the cause:

commit 3dfa22b502f065f5171475b6459a4cc18d74c125
Author: Wang Rui <moon.wangrui>
Date:   Fri Nov 28 14:36:26 2014 +0100

    qemu: fix domain startup failing with 'strict' mode in numatune
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1168866
    
    If the memory mode is specified as 'strict' with a single node, we
    get the following error when starting the domain.
    
    error: Unable to write to '$cgroup_path/cpuset.mems': Device or resource busy
    
    XML is configured with numatune as follows:
      <numatune>
        <memory mode='strict' nodeset='0'/>
      </numatune>
    
    It's broken by Commit 411cea638f6ec8503b7142a31e58b1cd85dbeaba
    which moved qemuSetupCgroupForEmulator() before setting cpuset.mems
    in qemuSetupCgroupPostInit.
    
    Directory '$cgroup_path/emulator/' is created in qemuSetupCgroupForEmulator,
    but '$cgroup_path/emulator/cpuset.mems' is not set and keeps its default
    value (all nodes, e.g. 0-1). Then, in qemuSetupCgroupPostInit, we set
    '$cgroup_path/cpuset.mems' to the nodemask (in this case '0'), which must
    fail because the child cpuset still includes nodes outside that mask.
    
    This patch makes sure '$cgroup_path/emulator/cpuset.mems' is set before
    '$cgroup_path/cpuset.mems'. The approach is similar to the one in
    qemuDomainSetNumaParamsLive.
    
    Signed-off-by: Wang Rui <moon.wangrui>
    (cherry picked from commit c6e90248676126c209b3b6017ad27cf6c6a0ab8f)
    Signed-off-by: Martin Kletzander <mkletzan>
    Signed-off-by: Jiri Denemark <jdenemar>
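
For reference, the parent/child ordering constraint described in the commit message can be demonstrated directly against cgroupfs (a minimal sketch, assuming a RHEL-7-era cgroup v1 cpuset hierarchy mounted at /sys/fs/cgroup/cpuset and a two-node host; the 'demo' group name is hypothetical):

# cd /sys/fs/cgroup/cpuset
# mkdir -p demo/emulator
# echo 0-1 > demo/cpuset.mems            # parent spans both nodes
# echo 0-1 > demo/emulator/cpuset.mems   # child holds the full set, like the emulator group did
# echo 0 > demo/cpuset.mems              # fails with EBUSY: the child still spans node 1
# echo 0 > demo/emulator/cpuset.mems     # restrict the child first...
# echo 0 > demo/cpuset.mems              # ...and the parent write now succeeds

Writing the emulator group's cpuset.mems before the parent's, as the patch does, is what avoids the EBUSY.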


Reverting that patch makes it work.

Interestingly, upstream has that patch and still works.

So there is obviously some other patch upstream, which we lack in the rhel-7 branch, whose absence causes the bad interaction with this fix.



Version-Release number of selected component (if applicable):
libvirt-1.2.8-10.el7ost.x86_64

How reproducible:
Always, with the configuration above.


Steps to Reproduce:
1. Define a guest with <memoryBacking><hugepages> and strict <numatune> pinning (see the XML above).
2. # virsh start serial1


Actual results:
error: Failed to start domain serial1
error: Failed to create controller cpu for group: No such file or directory


Expected results:
The guest starts successfully.


Additional info:

Comment 3 Daniel Berrangé 2014-12-18 13:30:45 UTC
The patch for https://bugzilla.redhat.com/show_bug.cgi?id=1170484 has coincidentally fixed this problem too.

Comment 4 Jiri Denemark 2014-12-18 13:36:33 UTC
This is just another scenario which triggers the same code path fixed for bug 1170484. I'm changing this bug to test-only so that both scenarios are tested.

Comment 11 Jincheng Miao 2014-12-25 15:52:15 UTC
This bug is fixed in the latest libvirt-1.2.8-11.el7ost.x86_64:

# rpm -q libvirt
libvirt-1.2.8-11.el7ost.x86_64

1. Add <numatune> to the guest XML:

# virsh edit a
...
  <vcpu placement='auto'>4</vcpu>
  <numatune>
    <memory mode='strict' placement='auto'/>
  </numatune>
...

2. Start the guest:
# virsh start a
Domain a started

The guest starts successfully, so I am changing the status to VERIFIED.
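
For completeness, the original scenario from the description (hugepages plus strict pinning) can be re-checked the same way. A sketch, assuming the same domain 'a' with 2 MiB hugepages reserved on node 0:

# virsh edit a
...
  <memoryBacking>
    <hugepages>
      <page size='2048' unit='KiB' nodeset='0'/>
    </hugepages>
  </memoryBacking>
  <numatune>
    <memory mode='strict' nodeset='0'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
  </numatune>
...

# virsh start a

With the -11 build this should also start cleanly rather than failing with "Failed to create controller cpu for group".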

Comment 13 errata-xmlrpc 2015-03-05 07:48:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0323.html