Bug 1168672

Summary: "libvirtError: Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dinstance\x2d00000002.scope/cpuset.mems': Device or resource busy"
Product: [Fedora] Fedora Reporter: Kashyap Chamarthy <kchamart>
Component: libvirtAssignee: Martin Kletzander <mkletzan>
Status: CLOSED ERRATA QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 21CC: agedosier, berrange, clalancette, crobinso, itamar, jforbes, laine, libvirt-maint, ndipanov, veillard, virt-maint
Target Milestone: ---Keywords: Reopened, Upstream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-1.2.9.2-1.fc21 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1168866 1168944 (view as bug list) Environment:
Last Closed: 2015-02-15 03:06:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1168866, 1168944    
Attachments:
Description Flags
libvirt debug log with 'virCgroupSetValueStr' failure when Nova instance is launched
none
`virsh dumpxml` of DevStack VM where OpenStack setup is running.
none
libvirt XML Nova attempted to set when trying to boot an instance, which failed. Obtained from DevStack screen-n-cpu.log.
none
Another Nova instance XML (this time with <numa> attribute), attempted to set by Nova libvirt driver none

Description Kashyap Chamarthy 2014-11-27 14:48:28 UTC
Description of problem
----------------------

This occurs when you boot a Nova instance with NUMA topology.

[. . .]
libvirtError: Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dinstance\x2d00000002.scope/cpuset.mems': Device or resource busy
[. . .]

Version
-------

Versions of libvirt, QEMU, systemd in the OpenStack setup (running in a
DevStack VM):

  $ uname -r; rpm -q libvirt-daemon-kvm qemu-system-x86 systemd
  3.18.0-0.rc6.git0.1.fc22.x86_64
  libvirt-daemon-kvm-1.2.10-3.fc22.x86_64
  qemu-system-x86-2.2.0-0.1.rc1.fc22.x86_64
  systemd-216-11.fc21.x86_64


How reproducible: At least twice.


Steps to reproduce
------------------

This occurred when booting a Nova instance in a DevStack (OpenStack
developer setup) environment. The DevStack machine is a KVM guest acting
as a hypervisor, and the Nova guest is a nested guest running on it.

It's a fairly involved test environment, details here:

http://docs-draft.openstack.org/18/131818/1/check/gate-nova-docs/2ddc418/doc/build/html/devref/testing/libvirt-numa.html#testing-basis-non-numa-usage


Actual results
--------------

[. . .]
2014-11-26 20:17:28.722 ERROR nova.compute.manager [-] [instance: bcb53b78-452c-4695-b39d-754389cd3dd5] Instance failed to spawn
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5] Traceback (most recent call last):
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]   File "/home/kashyapc/src/cloud/nova/nova/compute
/manager.py", line 2247, in _build_resources
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]     yield resources
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]   File "/home/kashyapc/src/cloud/nova/nova/compute
/manager.py", line 2117, in _build_and_run_instance
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]     instance_type=instance_type)
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]   File "/home/kashyapc/src/cloud/nova/nova/virt/li
bvirt/driver.py", line 2640, in spawn
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]     block_device_info, disk_info=disk_info)
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]   File "/home/kashyapc/src/cloud/nova/nova/virt/libvirt/driver.py", line 4500, in _create_domain_and_network
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]     power_on=power_on)
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]   File "/home/kashyapc/src/cloud/nova/nova/virt/libvirt/driver.py", line 4433, in _create_domain
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]     LOG.error(err)
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]   File "/usr/lib/python2.7/site-packages/oslo/utils/excutils.py", line 82, in __exit__
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]     six.reraise(self.type_, self.value, self.tb)
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]   File "/home/kashyapc/src/cloud/nova/nova/virt/libvirt/driver.py", line 4423, in _create_domain
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]     domain.createWithFlags(launch_flags)
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]     result = proxy_call(self._autowrap, f, *args, **kwargs)
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]     rv = execute(f, *args, **kwargs)
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]     six.reraise(c, e, tb)
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]     rv = meth(*args, **kwargs)
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1033, in createWithFlags
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5]     if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2014-11-26 20:17:28.722 TRACE nova.compute.manager [instance: bcb53b78-452c-4695-b39d-754389cd3dd5] libvirtError: Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dinstance\x2d00000002.scope/cpuset.mems': Device or resource busy
[. . .]


Expected results
----------------

Nova instance (libvirt nested guest) should boot successfully.


Additional info
---------------

[1] Inventory of available NUMA nodes on the physical host:

$ numactl --hardware
available: 4 nodes (0-3)
node 0 cpus: 0 4 8 12 16 20 24 28 32 36 40 44
node 0 size: 257954 MB
node 0 free: 248486 MB
node 1 cpus: 1 5 9 13 17 21 25 29 33 37 41 45
node 1 size: 258045 MB
node 1 free: 256470 MB
node 2 cpus: 2 6 10 14 18 22 26 30 34 38 42 46
node 2 size: 258045 MB
node 2 free: 256507 MB
node 3 cpus: 3 7 11 15 19 23 27 31 35 39 43 47
node 3 size: 258040 MB
node 3 free: 256457 MB
node distances:
node   0   1   2   3 
  0:  10  20  20  20 
  1:  20  10  20  20 
  2:  20  20  10  20 
  3:  20  20  20  10 

[2] From systemd `journalctl`:

$ sudo journalctl -u libvirtd -l -p err
[. . .]
Nov 26 09:19:49 devstack libvirtd[32697]: driver in virRegisterStorageDriver must not be NULL
Nov 26 09:19:49 devstack libvirtd[32697]: Failed module registration vboxStorageRegister
Nov 26 10:33:51 devstack libvirtd[32697]: End of file while reading data: Input/output error
Nov 26 11:12:36 devstack libvirtd[32697]: End of file while reading data: Input/output error
Nov 26 20:17:28 devstack libvirtd[32697]: Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dinstance\x2d00000002.scope/cpuset.mems': Device or resource busy
Nov 26 20:17:28 devstack libvirtd[32697]: error from service: TerminateMachine: No machine 'qemu-instance-00000002' known
[. . .]


[3] From an existing Nova instance (that was booted _without_ NUMA)

    $ find /sys/fs/cgroup/cpuset/machine.slice/machine-qemu*/ -name cpuset.mems 
    /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dinstance\x2d00000001.scope/vcpu0/cpuset.mems
    /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dinstance\x2d00000001.scope/cpuset.mems
    /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dinstance\x2d00000001.scope/emulator/cpuset.mems
    
    $ cd /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\\x2dinstance\\x2d00000001.scope/
    $ cat cpuset.mems vcpu0/cpuset.mems emulator/cpuset.mems 
    0-2
    0-2
    0-2
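A plausible reading of the hierarchy above (hedged; the kernel's actual cpuset semantics are more involved): the kernel refuses a cpuset.mems write on a parent cgroup that would drop nodes still referenced below it, returning EBUSY. The following is a minimal, simplified simulation of that invariant in pure Python; `parse_mems`, `set_parent_mems`, and `CgroupBusyError` are illustrative names, not libvirt or kernel API:

```python
import errno

class CgroupBusyError(OSError):
    """Mimics the EBUSY the kernel returns for a rejected cpuset.mems write."""
    def __init__(self):
        super().__init__(errno.EBUSY, "Device or resource busy")

def parse_mems(spec):
    """Expand a cpuset list like '0-2' or '0,2-3' into a set of node IDs."""
    nodes = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            nodes.update(range(int(lo), int(hi) + 1))
        else:
            nodes.add(int(part))
    return nodes

def set_parent_mems(parent_spec, child_specs):
    """Refuse to shrink the parent below the union of its children's mems."""
    parent = parse_mems(parent_spec)
    for child in child_specs:
        if not parse_mems(child) <= parent:
            raise CgroupBusyError()
    return parent

# Matches the report: the scope's children use nodes 0-2, so writing
# a bare '0' to the parent scope would be rejected.
set_parent_mems("0-2", ["0-2", "0-2"])   # accepted
try:
    set_parent_mems("0", ["0-2", "0-2"])
except CgroupBusyError as e:
    print(e.strerror)  # Device or resource busy
```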


[4] libvirt developer Martin Kletzander confirmed on IRC (#virt, OFTC) that
this is a bug.

Comment 1 Kashyap Chamarthy 2014-11-27 15:16:02 UTC
Created attachment 962112 [details]
libvirt debug log with 'virCgroupSetValueStr' failure when Nova instance is launched

Contextual snippet related to cgroups from the attached libvirtd debug log:

[. . .]
2014-11-27 14:18:20.835+0000: 25475: debug : virCgroupSetValueStr:718 : Set value '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dinstance\x2d00000003.scope/cpuset.mems' to '0'
2014-11-27 14:18:20.836+0000: 25475: debug : virFileClose:99 : Closed fd 25
2014-11-27 14:18:20.836+0000: 25475: error : virCgroupSetValueStr:728 : Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dinstance\x2d00000003.scope/cpuset.mems': Device or resource busy
2014-11-27 14:18:20.836+0000: 25475: debug : virFileClose:99 : Closed fd 24
[. . .]
2014-11-27 14:18:21.041+0000: 25475: debug : virDBusMessageIterEncode:640 : Popping iter=0x7fd43ef944d0
2014-11-27 14:18:21.042+0000: 25475: error : virDBusCall:1537 : error from service: TerminateMachine: No machine 'qemu-instance-00000003' known
2014-11-27 14:18:21.042+0000: 25475: debug : qemuRemoveCgroup:1222 : Failed to terminate cgroup for instance-00000003
2014-11-27 14:18:21.042+0000: 25475: debug : virObjectUnref:259 : OBJECT_UNREF: obj=0x7fd4300e6890
2014-11-27 14:18:21.042+0000: 25475: debug : virCgroupRemove:3331 : Removing cgroup /machine.slice/machine-qemu\x2dinstance\x2d00000003.scope
2014-11-27 14:18:21.042+0000: 25475: debug : virCgroupRemove:3352 : Removing cgroup /sys/fs/cgroup/cpu,cpuacct/machine.slice/machine-qemu\x2dinstance\x2d00000003.scope/ and all child cgroups
2014-11-27 14:18:21.042+0000: 25475: debug : virCgroupRemove:3352 : Removing cgroup /sys/fs/cgroup/cpu,cpuacct/machine.slice/machine-qemu\x2dinstance\x2d00000003.scope/ and all child cgroups
2014-11-27 14:18:21.042+0000: 25475: debug : virCgroupRemove:3352 : Removing cgroup /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dinstance\x2d00000003.scope/ and all child cgroups
2014-11-27 14:18:21.042+0000: 25475: debug : virCgroupRemoveRecursively:3302 : Removing cgroup /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2dinstance\x2d00000003.scope//emulator
[. . .]

Comment 2 Kashyap Chamarthy 2014-11-27 15:28:09 UTC
Capabilities of the DevStack machine where the Nova Compute service is running, and where a libvirt instance (a nested guest) failed to start:

---------------------
DevStack>$ virsh capabilities
<capabilities>

  <host>
    <uuid>7dae2ee3-8950-9a41-9e24-a463a8563bbd</uuid>
    <cpu>
      <arch>x86_64</arch>
      <model>Nehalem</model>
      <vendor>Intel</vendor>
      <topology sockets='1' cores='8' threads='1'/>
      <feature name='rdtscp'/>
      <feature name='hypervisor'/>
      <feature name='x2apic'/>
      <feature name='vmx'/>
      <feature name='ss'/>
      <feature name='vme'/>
      <pages unit='KiB' size='4'/>
      <pages unit='KiB' size='2048'/>
    </cpu>
    <power_management>
      <suspend_mem/>
      <suspend_disk/>
      <suspend_hybrid/>
    </power_management>
    <migration_features>
      <live/>
      <uri_transports>
        <uri_transport>tcp</uri_transport>
        <uri_transport>rdma</uri_transport>
      </uri_transports>
    </migration_features>
    <topology>
      <cells num='3'>
        <cell id='0'>
          <memory unit='KiB'>3949740</memory>
          <pages unit='KiB' size='4'>987435</pages>
          <pages unit='KiB' size='2048'>0</pages>
          <distances>
            <sibling id='0' value='10'/>
            <sibling id='1' value='20'/>
            <sibling id='2' value='20'/>
          </distances>
          <cpus num='4'>
            <cpu id='0' socket_id='0' core_id='0' siblings='0'/>
            <cpu id='1' socket_id='1' core_id='0' siblings='1'/>
            <cpu id='2' socket_id='2' core_id='0' siblings='2'/>
            <cpu id='3' socket_id='3' core_id='0' siblings='3'/>
          </cpus>
        </cell>
        <cell id='1'>
          <memory unit='KiB'>2016864</memory>
          <pages unit='KiB' size='4'>504216</pages>
          <pages unit='KiB' size='2048'>0</pages>
          <distances>
            <sibling id='0' value='20'/>
            <sibling id='1' value='10'/>
            <sibling id='2' value='20'/>
          </distances>
          <cpus num='2'>
            <cpu id='4' socket_id='4' core_id='0' siblings='4'/>
            <cpu id='5' socket_id='5' core_id='0' siblings='5'/>
          </cpus>
        </cell>
        <cell id='2'>
          <memory unit='KiB'>2014304</memory>
          <pages unit='KiB' size='4'>503576</pages>
          <pages unit='KiB' size='2048'>0</pages>
          <distances>
            <sibling id='0' value='20'/>
            <sibling id='1' value='20'/>
            <sibling id='2' value='10'/>
          </distances>
          <cpus num='2'>
            <cpu id='6' socket_id='6' core_id='0' siblings='6'/>
            <cpu id='7' socket_id='7' core_id='0' siblings='7'/>
          </cpus>
        </cell>
      </cells>
    </topology>
    <secmodel>
      <model>selinux</model>
      <doi>0</doi>
      <baselabel type='kvm'>system_u:system_r:svirt_t:s0</baselabel>
      <baselabel type='qemu'>system_u:system_r:svirt_tcg_t:s0</baselabel>
    </secmodel>
    <secmodel>
      <model>dac</model>
      <doi>0</doi>
      <baselabel type='kvm'>+107:+107</baselabel>
      <baselabel type='qemu'>+107:+107</baselabel>
    </secmodel>
  </host>

  <guest>
    <os_type>hvm</os_type>
    <arch name='i686'>
      <wordsize>32</wordsize>
      <emulator>/usr/bin/qemu-system-i386</emulator>
      <machine canonical='pc-i440fx-2.2' maxCpus='255'>pc</machine>
      <machine maxCpus='255'>pc-0.12</machine>
      <machine maxCpus='255'>pc-1.3</machine>
      <machine maxCpus='255'>pc-q35-1.6</machine>
      <machine maxCpus='255'>pc-q35-1.5</machine>
      <machine maxCpus='255'>pc-i440fx-1.6</machine>
      <machine canonical='pc-q35-2.2' maxCpus='255'>q35</machine>
      <machine maxCpus='1'>xenpv</machine>
      <machine maxCpus='255'>pc-i440fx-1.7</machine>
      <machine maxCpus='255'>pc-q35-2.1</machine>
      <machine maxCpus='255'>pc-0.11</machine>
      <machine maxCpus='255'>pc-0.10</machine>
      <machine maxCpus='255'>pc-1.2</machine>
      <machine maxCpus='1'>isapc</machine>
      <machine maxCpus='255'>pc-q35-1.4</machine>
      <machine maxCpus='128'>xenfv</machine>
      <machine maxCpus='255'>pc-0.15</machine>
      <machine maxCpus='255'>pc-i440fx-1.5</machine>
      <machine maxCpus='255'>pc-0.14</machine>
      <machine maxCpus='255'>pc-q35-2.0</machine>
      <machine maxCpus='255'>pc-i440fx-1.4</machine>
      <machine maxCpus='255'>pc-1.1</machine>
      <machine maxCpus='255'>pc-q35-1.7</machine>
      <machine maxCpus='255'>pc-i440fx-2.1</machine>
      <machine maxCpus='255'>pc-1.0</machine>
      <machine maxCpus='255'>pc-i440fx-2.0</machine>
      <machine maxCpus='255'>pc-0.13</machine>
      <domain type='qemu'>
      </domain>
      <domain type='kvm'>
        <emulator>/usr/bin/qemu-kvm</emulator>
        <machine canonical='pc-i440fx-2.2' maxCpus='255'>pc</machine>
        <machine maxCpus='255'>pc-1.3</machine>
        <machine maxCpus='255'>pc-0.12</machine>
        <machine maxCpus='255'>pc-q35-1.6</machine>
        <machine maxCpus='255'>pc-q35-1.5</machine>
        <machine maxCpus='255'>pc-i440fx-1.6</machine>
        <machine canonical='pc-q35-2.2' maxCpus='255'>q35</machine>
        <machine maxCpus='255'>pc-i440fx-1.7</machine>
        <machine maxCpus='1'>xenpv</machine>
        <machine maxCpus='255'>pc-q35-2.1</machine>
        <machine maxCpus='255'>pc-0.11</machine>
        <machine maxCpus='255'>pc-0.10</machine>
        <machine maxCpus='255'>pc-1.2</machine>
        <machine maxCpus='1'>isapc</machine>
        <machine maxCpus='255'>pc-q35-1.4</machine>
        <machine maxCpus='128'>xenfv</machine>
        <machine maxCpus='255'>pc-0.15</machine>
        <machine maxCpus='255'>pc-i440fx-1.5</machine>
        <machine maxCpus='255'>pc-i440fx-1.4</machine>
        <machine maxCpus='255'>pc-q35-2.0</machine>
        <machine maxCpus='255'>pc-0.14</machine>
        <machine maxCpus='255'>pc-1.1</machine>
        <machine maxCpus='255'>pc-q35-1.7</machine>
        <machine maxCpus='255'>pc-i440fx-2.1</machine>
        <machine maxCpus='255'>pc-1.0</machine>
        <machine maxCpus='255'>pc-i440fx-2.0</machine>
        <machine maxCpus='255'>pc-0.13</machine>
      </domain>
    </arch>
    <features>
      <cpuselection/>
      <deviceboot/>
      <disksnapshot default='on' toggle='no'/>
      <acpi default='on' toggle='yes'/>
      <apic default='on' toggle='no'/>
      <pae/>
      <nonpae/>
    </features>
  </guest>

  <guest>
    <os_type>hvm</os_type>
    <arch name='x86_64'>
      <wordsize>64</wordsize>
      <emulator>/usr/bin/qemu-system-x86_64</emulator>
      <machine canonical='pc-i440fx-2.2' maxCpus='255'>pc</machine>
      <machine maxCpus='255'>pc-1.3</machine>
      <machine maxCpus='255'>pc-0.12</machine>
      <machine maxCpus='255'>pc-q35-1.6</machine>
      <machine maxCpus='255'>pc-q35-1.5</machine>
      <machine maxCpus='255'>pc-i440fx-1.6</machine>
      <machine canonical='pc-q35-2.2' maxCpus='255'>q35</machine>
      <machine maxCpus='255'>pc-i440fx-1.7</machine>
      <machine maxCpus='1'>xenpv</machine>
      <machine maxCpus='255'>pc-q35-2.1</machine>
      <machine maxCpus='255'>pc-0.11</machine>
      <machine maxCpus='255'>pc-0.10</machine>
      <machine maxCpus='255'>pc-1.2</machine>
      <machine maxCpus='1'>isapc</machine>
      <machine maxCpus='255'>pc-q35-1.4</machine>
      <machine maxCpus='128'>xenfv</machine>
      <machine maxCpus='255'>pc-0.15</machine>
      <machine maxCpus='255'>pc-i440fx-1.5</machine>
      <machine maxCpus='255'>pc-i440fx-1.4</machine>
      <machine maxCpus='255'>pc-q35-2.0</machine>
      <machine maxCpus='255'>pc-0.14</machine>
      <machine maxCpus='255'>pc-1.1</machine>
      <machine maxCpus='255'>pc-q35-1.7</machine>
      <machine maxCpus='255'>pc-i440fx-2.1</machine>
      <machine maxCpus='255'>pc-1.0</machine>
      <machine maxCpus='255'>pc-i440fx-2.0</machine>
      <machine maxCpus='255'>pc-0.13</machine>
      <domain type='qemu'>
      </domain>
      <domain type='kvm'>
        <emulator>/usr/bin/qemu-kvm</emulator>
        <machine canonical='pc-i440fx-2.2' maxCpus='255'>pc</machine>
        <machine maxCpus='255'>pc-1.3</machine>
        <machine maxCpus='255'>pc-0.12</machine>
        <machine maxCpus='255'>pc-q35-1.6</machine>
        <machine maxCpus='255'>pc-q35-1.5</machine>
        <machine maxCpus='255'>pc-i440fx-1.6</machine>
        <machine canonical='pc-q35-2.2' maxCpus='255'>q35</machine>
        <machine maxCpus='255'>pc-i440fx-1.7</machine>
        <machine maxCpus='1'>xenpv</machine>
        <machine maxCpus='255'>pc-q35-2.1</machine>
        <machine maxCpus='255'>pc-0.11</machine>
        <machine maxCpus='255'>pc-0.10</machine>
        <machine maxCpus='255'>pc-1.2</machine>
        <machine maxCpus='1'>isapc</machine>
        <machine maxCpus='255'>pc-q35-1.4</machine>
        <machine maxCpus='128'>xenfv</machine>
        <machine maxCpus='255'>pc-0.15</machine>
        <machine maxCpus='255'>pc-i440fx-1.5</machine>
        <machine maxCpus='255'>pc-i440fx-1.4</machine>
        <machine maxCpus='255'>pc-q35-2.0</machine>
        <machine maxCpus='255'>pc-0.14</machine>
        <machine maxCpus='255'>pc-1.1</machine>
        <machine maxCpus='255'>pc-q35-1.7</machine>
        <machine maxCpus='255'>pc-i440fx-2.1</machine>
        <machine maxCpus='255'>pc-1.0</machine>
        <machine maxCpus='255'>pc-i440fx-2.0</machine>
        <machine maxCpus='255'>pc-0.13</machine>
      </domain>
    </arch>
    <features>
      <cpuselection/>
      <deviceboot/>
      <disksnapshot default='on' toggle='no'/>
      <acpi default='on' toggle='yes'/>
      <apic default='on' toggle='no'/>
    </features>
  </guest>

</capabilities>
---------------------

Comment 3 Kashyap Chamarthy 2014-11-27 15:29:07 UTC
Created attachment 962119 [details]
`virsh dumpxml` of DevStack VM where OpenStack setup is running.

Comment 4 Kashyap Chamarthy 2014-11-27 15:33:39 UTC
Created attachment 962126 [details]
libvirt XML Nova attempted to set when trying to boot an instance, which failed. Obtained from DevStack screen-n-cpu.log.

Contextual snippet from the attachment:

[. . .]
  <memory>1048576</memory>
  <numatune>
    <memory mode="strict" nodeset="0"/>
    <memnode cellid="0" mode="strict" nodeset="0"/>
  </numatune>
  <vcpu>4</vcpu>
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
      <nova:package version="2015.1"/>
      <nova:name>cirrvm3</nova:name>
      <nova:creationTime>2014-11-27 14:18:18</nova:creationTime>
      <nova:flavor name="m1.numa">
        <nova:memory>1024</nova:memory>
        <nova:disk>1</nova:disk>
        <nova:swap>0</nova:swap>
        <nova:ephemeral>0</nova:ephemeral>
        <nova:vcpus>4</nova:vcpus>
      </nova:flavor>
      <nova:owner>
        <nova:user uuid="e2ab0e48d003456da53e892366651175">admin</nova:user>
        <nova:project uuid="a9a2cd5511214089a290ccfcac47502c">admin</nova:project>
      </nova:owner>
      <nova:root type="image" uuid="178c675a-d5fb-459f-a850-f7ffa6e2c9d2"/>
    </nova:instance>
  </metadata>
[. . .]
  <cputune>
    <emulatorpin cpuset="0-3"/>
    <vcpupin vcpu="0" cpuset="0-3"/>
    <vcpupin vcpu="1" cpuset="0-3"/>
    <vcpupin vcpu="2" cpuset="0-3"/>
    <vcpupin vcpu="3" cpuset="0-3"/>
  </cputune>
[. . .]
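The <numatune> element above is what requests strict pinning of the guest's memory to host node 0; it can be inspected programmatically. A small sketch using Python's standard xml.etree, with the XML string copied from the snippet above:

```python
import xml.etree.ElementTree as ET

# The <numatune> element from the attached Nova instance XML.
numatune_xml = """
<numatune>
  <memory mode="strict" nodeset="0"/>
  <memnode cellid="0" mode="strict" nodeset="0"/>
</numatune>
"""

root = ET.fromstring(numatune_xml)
memory = root.find("memory")
memnode = root.find("memnode")

# 'strict' means allocation must come from the given nodeset only.
print(memory.get("mode"), memory.get("nodeset"))      # strict 0
print(memnode.get("cellid"), memnode.get("nodeset"))  # 0 0
```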

Comment 5 Nikola Dipanov 2014-11-27 17:10:46 UTC
It might also be relevant that in this case the domain XML also has a <numa> element specified.

Comment 6 Kashyap Chamarthy 2014-11-28 12:58:58 UTC
Created attachment 962491 [details]
Another Nova instance XML (this time with <numa> attribute), attempted to set by Nova libvirt driver

Contextual snippet of Nova guest XML:
[. . .]
  <vcpu placement='static'>4</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0-3'/>
    <vcpupin vcpu='1' cpuset='0-3'/>
    <vcpupin vcpu='2' cpuset='0-3'/>
    <vcpupin vcpu='3' cpuset='0-3'/>
    <emulatorpin cpuset='0-3'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
  </numatune>
[. . .]
  <cpu>
    <topology sockets='4' cores='1' threads='1'/>
    <numa>
      <cell id='0' cpus='0-3' memory='1048576'/>
    </numa>
  </cpu>
[. . .]


I was here in Nova's git when I tested this time:

  nova]$ git describe
  2014.2-995-g5d2ea10


Previously, I was at:

  nova]$ git describe
  2014.2-973-g922ca3c

Comment 7 Martin Kletzander 2014-11-28 13:31:01 UTC
Fixed upstream with v1.2.10-75-gc6e9024:

commit c6e90248676126c209b3b6017ad27cf6c6a0ab8f
Author: Wang Rui <moon.wangrui>
Date:   Mon Nov 10 21:53:19 2014 +0800

    qemu: fix domain startup failing with 'strict' mode in numatune

Comment 8 Kashyap Chamarthy 2014-11-28 18:49:23 UTC
Tested with libvirt RPMs built from git:

  $ git describe
  CVE-2014-7823-193-g6085d91

  $ git rev-parse --short HEAD
  6085d91

which has the commit mentioned in comment #7. 

Re-testing Nova with these RPMs (libvirt-1.2.11, not yet released), a Nova instance with NUMA topology boots successfully:

1. Create a Nova flavor with NUMA topology: 

    $ nova flavor-create m1.numa 999 1024 1 4
    $ nova flavor-key m1.numa set hw:numa_nodes=1
    $ nova flavor-show m1.numa
    +----------------------------+------------------------+
    | Property                   | Value                  |
    +----------------------------+------------------------+
    | OS-FLV-DISABLED:disabled   | False                  |
    | OS-FLV-EXT-DATA:ephemeral  | 0                      |
    | disk                       | 1                      |
    | extra_specs                | {"hw:numa_nodes": "1"} |
    | id                         | 999                    |
    | name                       | m1.numa                |
    | os-flavor-access:is_public | True                   |
    | ram                        | 1024                   |
    | rxtx_factor                | 1.0                    |
    | swap                       |                        |
    | vcpus                      | 4                      |
    +----------------------------+------------------------+


2. Boot a Nova guest:

$ nova boot --image cirros-0.3.1-x86_64-disk --flavor m1.numa cirrvm5


3. Find the Nova instance:

$ nova list | grep cirrvm5
| 5d4c50ff-301c-44cb-826f-ffa07266d85f | cirrvm5 | ACTIVE | -          | Running     | public=172.24.4.6 |


4. Find the libvirt ID for the Nova instance:

$ grep -i 5d4c50ff-301c-44cb-826f-ffa07266d85f /etc/libvirt/qemu/*.xml | grep uuid
/etc/libvirt/qemu/instance-00000004.xml:  <uuid>5d4c50ff-301c-44cb-826f-ffa07266d85f</uuid>
[. . .]


5. Examine Nova instance's libvirt XML:

--------------------
$ sudo virsh dumpxml instance-00000004
<domain type='kvm' id='3'>
  <name>instance-00000004</name>
  <uuid>5d4c50ff-301c-44cb-826f-ffa07266d85f</uuid>
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
      <nova:package version="2015.1"/>
      <nova:name>cirrvm5</nova:name>
      <nova:creationTime>2014-11-28 18:11:27</nova:creationTime>
      <nova:flavor name="m1.numa">
        <nova:memory>1024</nova:memory>
        <nova:disk>1</nova:disk>
        <nova:swap>0</nova:swap>
        <nova:ephemeral>0</nova:ephemeral>
        <nova:vcpus>4</nova:vcpus>
      </nova:flavor>
      <nova:owner>
        <nova:user uuid="f9a0644c9e9540828fbd8249dc9a92a2">admin</nova:user>
        <nova:project uuid="9157c0a4cf194d02bb4aa8023fb9db8b">admin</nova:project>
      </nova:owner>
      <nova:root type="image" uuid="e15c9d88-5de0-4f4b-8d15-708cd32f4ea9"/>
    </nova:instance>
  </metadata>
  <memory unit='KiB'>1048576</memory>
  <currentMemory unit='KiB'>1048576</currentMemory>
  <vcpu placement='static'>4</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0-3'/>
    <vcpupin vcpu='1' cpuset='0-3'/>
    <vcpupin vcpu='2' cpuset='0-3'/>
    <vcpupin vcpu='3' cpuset='0-3'/>
    <emulatorpin cpuset='0-3'/>
  </cputune>
  <numatune>
    <memory mode='strict' nodeset='0'/>
    <memnode cellid='0' mode='strict' nodeset='0'/>
  </numatune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <sysinfo type='smbios'>
    <system>
      <entry name='manufacturer'>OpenStack Foundation</entry>
      <entry name='product'>OpenStack Nova</entry>
      <entry name='version'>2015.1</entry>
      <entry name='serial'>bf6b5391-2390-df4f-b3dc-aa80d05468bb</entry>
      <entry name='uuid'>5d4c50ff-301c-44cb-826f-ffa07266d85f</entry>
    </system>
  </sysinfo>
  <os>
    <type arch='x86_64' machine='pc-i440fx-2.2'>hvm</type>
    <boot dev='hd'/>
    <smbios mode='sysinfo'/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu>
    <topology sockets='4' cores='1' threads='1'/>
    <numa>
      <cell id='0' cpus='0-3' memory='1048576' unit='KiB'/>
    </numa>
  </cpu>
  <clock offset='utc'>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>destroy</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none'/>
      <source file='/home/kashyapc/src/cloud/data/nova/instances/5d4c50ff-301c-44cb-826f-ffa07266d85f/disk'/>
      <backingStore type='file' index='1'>
        <format type='raw'/>
        <source file='/home/kashyapc/src/cloud/data/nova/instances/_base/f187eddcdb76fcfd896c0916b9288e666014ce2b'/>
        <backingStore/>
      </backingStore>
      <target dev='vda' bus='virtio'/>
      <alias name='virtio-disk0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/home/kashyapc/src/cloud/data/nova/instances/5d4c50ff-301c-44cb-826f-ffa07266d85f/disk.config'/>
      <backingStore/>
      <target dev='hdd' bus='ide'/>
      <readonly/>
      <alias name='ide0-1-1'/>
      <address type='drive' controller='0' bus='1' target='0' unit='1'/>
    </disk>
    <controller type='usb' index='0'>
      <alias name='usb0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='ide' index='0'>
      <alias name='ide0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <interface type='bridge'>
      <mac address='fa:16:3e:fd:50:8f'/>
      <source bridge='qbr1ee94b1e-29'/>
      <target dev='tap1ee94b1e-29'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <serial type='file'>
      <source path='/home/kashyapc/src/cloud/data/nova/instances/5d4c50ff-301c-44cb-826f-ffa07266d85f/console.log'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <serial type='pty'>
      <source path='/dev/pts/17'/>
      <target port='1'/>
      <alias name='serial1'/>
    </serial>
    <console type='file'>
      <source path='/home/kashyapc/src/cloud/data/nova/instances/5d4c50ff-301c-44cb-826f-ffa07266d85f/console.log'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
      <stats period='10'/>
    </memballoon>
  </devices>
  <seclabel type='dynamic' model='selinux' relabel='yes'>
    <label>system_u:system_r:svirt_t:s0:c176,c615</label>
    <imagelabel>system_u:object_r:svirt_image_t:s0:c176,c615</imagelabel>
  </seclabel>
</domain>
--------------------

6. Check what Nova recorded in the database (PostgreSQL in this case):

$  sudo -u postgres psql nova
nova=#
nova=# SELECT numa_topology FROM instance_extra;
[. . .]
 {"nova_object.version": "1.1", "nova_object.changes": ["instance_uuid"], "nova_object.name": "InstanceNUMATopology", "nova_object.data": {"instance_u
uid": "5d4c50ff-301c-44cb-826f-ffa07266d85f", "cells": [{"nova_object.version": "1.1", "nova_object.changes": ["cpuset", "id", "pagesize", "memory"], 
"nova_object.name": "InstanceNUMACell", "nova_object.data": {"cpuset": [0, 1, 2, 3], "id": 0, "pagesize": null, "memory": 1024}, "nova_object.namespac
e": "nova"}]}, "nova_object.namespace": "nova"}
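The numa_topology column is serialized JSON, so the cell layout Nova recorded can be checked directly. A sketch, with the JSON literal re-wrapped from the query output above:

```python
import json

# The numa_topology value from the instance_extra row shown above.
row = (
    '{"nova_object.version": "1.1", '
    '"nova_object.changes": ["instance_uuid"], '
    '"nova_object.name": "InstanceNUMATopology", '
    '"nova_object.data": {"instance_uuid": '
    '"5d4c50ff-301c-44cb-826f-ffa07266d85f", "cells": '
    '[{"nova_object.version": "1.1", '
    '"nova_object.changes": ["cpuset", "id", "pagesize", "memory"], '
    '"nova_object.name": "InstanceNUMACell", '
    '"nova_object.data": {"cpuset": [0, 1, 2, 3], "id": 0, '
    '"pagesize": null, "memory": 1024}, '
    '"nova_object.namespace": "nova"}]}, '
    '"nova_object.namespace": "nova"}'
)

topo = json.loads(row)
cells = topo["nova_object.data"]["cells"]
for cell in cells:
    data = cell["nova_object.data"]
    # One guest cell: vCPUs 0-3, 1024 MB, matching the m1.numa flavor.
    print(data["id"], data["cpuset"], data["memory"])  # 0 [0, 1, 2, 3] 1024
```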

Comment 9 Kashyap Chamarthy 2014-12-11 09:13:31 UTC
Additional info
---------------

QEMU CLI of the Nova guest booted with a single NUMA node:

-----------------------------------------------------------------------
/usr/bin/qemu-system-x86_64 -machine accel=kvm -name instance-00000001 -S -machine pc-i440fx-2.2,accel=kvm,usb=off -m 1024 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -object memory-backend-ram,size=1024M,id=ram-node0,host-nodes=0,policy=bind -numa node,nodeid=0,cpus=0-3,memdev=ram-node0 -uuid 7646f836-7cb4-4f8b-bb69-ce4976af1081 -smbios type=1,manufacturer=OpenStack Foundation,product=OpenStack Nova,version=2015.1,serial=7dae2ee3-8950-9a41-9e24-a463a8563bbd,uuid=7646f836-7cb4-4f8b-bb69-ce4976af1081 -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/instance-00000001.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/home/kashyapc/src/cloud/data/nova/instances/7646f836-7cb4-4f8b-bb69-ce4976af1081/disk,if=none,id=drive-virtio-disk0,format=qcow2,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive file=/home/kashyapc/src/cloud/data/nova/instances/7646f836-7cb4-4f8b-bb69-ce4976af1081/disk.config,if=none,id=drive-ide0-1-1,readonly=on,format=raw,cache=none -device ide-cd,bus=ide.1,unit=1,drive=drive-ide0-1-1,id=ide0-1-1 -netdev tap,fd=24,id=hostnet0,vhost=on,vhostfd=25 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:77:84:88,bus=pci.0,addr=0x2 -chardev file,id=charserial0,path=/home/kashyapc/src/cloud/data/nova/instances/7646f836-7cb4-4f8b-bb69-ce4976af1081/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -msg timestamp=on
-----------------------------------------------------------------------
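The NUMA-relevant part of that command line is the -object memory-backend-ram / -numa pair, which binds the guest cell's memory to host node 0 and links it to guest CPUs 0-3. A rough parser sketch over those two arguments as they appear above; `parse_kv` is illustrative and does not handle the quoting/escaping real QEMU option parsing allows:

```python
def parse_kv(arg):
    """Split a QEMU 'name,key=value,...' argument into a dict (rough sketch)."""
    out = {}
    for item in arg.split(","):
        key, _, value = item.partition("=")
        out[key] = value
    return out

backend = parse_kv("memory-backend-ram,size=1024M,id=ram-node0,"
                   "host-nodes=0,policy=bind")
numa = parse_kv("node,nodeid=0,cpus=0-3,memdev=ram-node0")

# Strict numatune nodeset='0' surfaces as host-nodes=0 with policy=bind,
# and the memdev reference ties guest cell 0 to that memory backend.
assert numa["memdev"] == backend["id"]
print(backend["host-nodes"], backend["policy"], numa["cpus"])  # 0 bind 0-3
```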

Comment 10 Fedora Update System 2015-02-08 16:34:14 UTC
libvirt-1.2.9.2-1.fc21 has been submitted as an update for Fedora 21.
https://admin.fedoraproject.org/updates/libvirt-1.2.9.2-1.fc21

Comment 11 Fedora Update System 2015-02-09 05:32:08 UTC
Package libvirt-1.2.9.2-1.fc21:
* should fix your issue,
* was pushed to the Fedora 21 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing libvirt-1.2.9.2-1.fc21'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2015-1892/libvirt-1.2.9.2-1.fc21
then log in and leave karma (feedback).

Comment 12 Fedora Update System 2015-02-15 03:06:21 UTC
libvirt-1.2.9.2-1.fc21 has been pushed to the Fedora 21 stable repository.  If problems still persist, please make note of it in this bug report.