Description of problem:
=======================
While trying to create several nova instances requesting a PCI passthrough device, one of the instances failed to boot with the following error:

2015-04-30 05:19:48.035 27033 ERROR nova.compute.manager [-] [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b] Instance failed to spawn
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b] Traceback (most recent call last):
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2288, in _build_resources
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]     yield resources
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2158, in _build_and_run_instance
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]     block_device_info=block_device_info)
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2635, in spawn
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]     block_device_info, disk_info=disk_info)
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4558, in _create_domain_and_network
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]     power_on=power_on)
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4482, in _create_domain
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]     LOG.error(err)
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]   File "/usr/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 82, in __exit__
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]     six.reraise(self.type_, self.value, self.tb)
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4472, in _create_domain
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]     domain.createWithFlags(launch_flags)
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]     result = proxy_call(self._autowrap, f, *args, **kwargs)
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]     rv = execute(f, *args, **kwargs)
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]     six.reraise(c, e, tb)
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]     rv = meth(*args, **kwargs)
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 996, in createWithFlags
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b]     if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2015-04-30 05:19:48.035 27033 TRACE nova.compute.manager [instance: cb7bda2d-7b92-4c77-94a3-753389991b2b] libvirtError: Requested operation is not valid: PCI device 0000:07:10.3 is in use by driver QEMU, domain instance-00000003

It appears that the database is not tracking which VFs have already been taken.

Version-Release number of selected component (if applicable):
=============================================================
openstack-nova-novncproxy-2014.2.3-9.el7ost.noarch
python-nova-2014.2.3-9.el7ost.noarch
python-novaclient-2.20.0-1.el7ost.noarch
openstack-nova-console-2014.2.3-9.el7ost.noarch
openstack-nova-scheduler-2014.2.3-9.el7ost.noarch
openstack-nova-api-2014.2.3-9.el7ost.noarch
openstack-nova-conductor-2014.2.3-9.el7ost.noarch
openstack-nova-cert-2014.2.3-9.el7ost.noarch
openstack-nova-common-2014.2.3-9.el7ost.noarch
openstack-nova-compute-2014.2.3-9.el7ost.noarch

How reproducible:
=================
Nova appears to select VFs at random, so depending on how many VFs are available for the PCI device, it may take more or fewer boots to hit this. I recommend configuring your ethernet driver not to create too many VFs. For example, on Intel NICs:

modprobe igb max_vfs=2

That creates 2 VFs per PF, so with a two-port NIC you get 4 VFs in total.

Steps to Reproduce:
===================
1. Follow the attached how-to directions
2. Boot instances from the flavor one by one

Actual results:
===============
Nova may try to create an instance using a VF that has already been assigned to another instance, so creation fails.

Expected results:
=================
You should always be able to create as many instances with the requested PCI devices as there are VFs. For example, if your flavor has extra_specs of pci_passthrough:alias="my_igb:2" and you have 8 VFs, you should always be able to create 4 instances with that flavor.
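For completeness, since the repro relies on it: the alias named in the flavor's extra_specs has to be defined in nova.conf (pci_alias on the API node, plus a matching pci_passthrough_whitelist entry on the compute node), and the scheduler needs PciPassthroughFilter enabled. A minimal sketch; the vendor/product IDs here are illustrative and must match your own VFs:

[DEFAULT]
pci_alias = {"vendor_id": "8086", "product_id": "10ed", "name": "my_igb"}
pci_passthrough_whitelist = {"vendor_id": "8086", "product_id": "10ed"}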
Created attachment 1020593 [details] How to do pci passthrough and boot instance
Does it appear to make a difference how many seconds elapse between instance creation requests?
I'm not sure if this helps, but these are my findings:

***** PCI device domain="0x0000" bus="0x06" slot="0x11" function="0x7" (i.e. 0000:06:11.7) is assigned to instance-00000594:

2016-02-19 08:21:59.669 47185 DEBUG nova.virt.libvirt.config [req-74ab5a83-a732-4ff2-bc9a-481ac711dc97 233a8c747d034e3a87b03f10663df397 d86bd63166e0413e97eca88a1cd39639 - - -] Generated XML (pretty-printed below) to_xml /usr/lib/python2.7/site-packages/nova/virt/libvirt/config.py:82

<domain type="kvm">
  <uuid>680dad83-a9f5-418d-a1f9-5938f84c5306</uuid>
  <name>instance-00000594</name>
  <memory>134217728</memory>
  <vcpu>8</vcpu>
  <metadata>
    <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
      <nova:package version="2015.1.1-3.el7ost"/>
      <nova:name>FRIPCRF1v-PCF-L-CI-ENT-MD--1-SM-01</nova:name>
      <nova:creationTime>2016-02-19 16:21:59</nova:creationTime>
      <nova:flavor name="PCF-CI-ENT-SESSIONMGR">
        <nova:memory>131072</nova:memory>
        <nova:disk>0</nova:disk>
        <nova:swap>0</nova:swap>
        <nova:ephemeral>0</nova:ephemeral>
        <nova:vcpus>8</nova:vcpus>
      </nova:flavor>
      <nova:owner>
        <nova:user uuid="233a8c747d034e3a87b03f10663df397">ericsson-1</nova:user>
        <nova:project uuid="d86bd63166e0413e97eca88a1cd39639">ericsson-orchestrator</nova:project>
      </nova:owner>
      <nova:root type="image" uuid="9a07966d-b4ac-479a-9d35-f804eaf5df91"/>
    </nova:instance>
  </metadata>
  <sysinfo type="smbios">
    <system>
      <entry name="manufacturer">Red Hat</entry>
      <entry name="product">OpenStack Compute</entry>
      <entry name="version">2015.1.1-3.el7ost</entry>
      <entry name="serial">21be02b4-12a0-4541-9db7-f9e32150a2dd</entry>
      <entry name="uuid">680dad83-a9f5-418d-a1f9-5938f84c5306</entry>
    </system>
  </sysinfo>
  <os>
    <type>hvm</type>
    <boot dev="hd"/>
    <smbios mode="sysinfo"/>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cputune>
    <shares>8192</shares>
  </cputune>
  <clock offset="utc">
    <timer name="pit" tickpolicy="delay"/>
    <timer name="rtc" tickpolicy="catchup"/>
    <timer name="hpet" present="no"/>
  </clock>
  <cpu mode="host-model" match="exact">
    <topology sockets="8" cores="1" threads="1"/>
  </cpu>
  <devices>
    <disk type="file" device="disk">
      <driver name="qemu" type="qcow2" cache="none"/>
      <source file="/var/lib/nova/instances/680dad83-a9f5-418d-a1f9-5938f84c5306/disk"/>
      <target bus="virtio" dev="vda"/>
    </disk>
    <disk type="file" device="cdrom">
      <driver name="qemu" type="raw" cache="none"/>
      <source file="/var/lib/nova/instances/680dad83-a9f5-418d-a1f9-5938f84c5306/disk.config"/>
      <target bus="ide" dev="hdd"/>
    </disk>
    <interface type="bridge">
      <mac address="fa:16:3e:6b:d0:b6"/>
      <model type="virtio"/>
      <source bridge="qbrbbbaee14-6e"/>
      <target dev="tapbbbaee14-6e"/>
    </interface>
    <interface type="hostdev" managed="yes">
      <mac address="fa:16:3e:99:bf:38"/>
      <source>
        <address type="pci" domain="0x0000" bus="0x06" slot="0x11" function="0x7"/>
      </source>
      <vlan>
        <tag id="0"/>
      </vlan>
    </interface>
    <interface type="hostdev" managed="yes">
      <mac address="fa:16:3e:30:ea:72"/>
      <source>
        <address type="pci" domain="0x0000" bus="0x06" slot="0x12" function="0x5"/>
      </source>
      <vlan>
        <tag id="0"/>
      </vlan>
    </interface>
    <serial type="file">
      <source path="/var/lib/nova/instances/680dad83-a9f5-418d-a1f9-5938f84c5306/console.log"/>
    </serial>
    <serial type="pty"/>
    <input type="tablet" bus="usb"/>
    <graphics type="vnc" autoport="yes" keymap="en-us" listen="0.0.0.0"/>
    <video>
      <model type="cirrus"/>
    </video>
    <memballoon model="virtio">
      <stats period="10"/>
    </memballoon>
  </devices>
</domain>

2016-02-19 08:21:59.705 47185 DEBUG nova.virt.libvirt.driver [req-74ab5a83-a732-4ff2-bc9a-481ac711dc97 233a8c747d034e3a87b03f10663df397 d86bd63166e0413e97eca88a1cd39639 - - -] [instance: 680dad83-a9f5-418d-a1f9-5938f84c5306] End _get_guest_xml xml=<domain type="kvm"> ... (same domain XML as above, down to the first hostdev interface):

<interface type="hostdev" managed="yes">
  <mac address="fa:16:3e:99:bf:38"/>
  <source>
    <address type="pci" domain="0x0000" bus="0x06" slot="0x11" function="0x7"/>

***** resource tracker says 0000:06:11.7 is assignable??
2016-02-19 08:22:44.238 47185 DEBUG nova.compute.resource_tracker [req-9977ce40-b8c5-46d2-816e-bd0c8f74cf7b - - - - -] Hypervisor: assignable PCI devices:
[{"dev_id": "pci_0000_00_00_0", "product_id": "0e00", "dev_type": "type-PCI", "numa_node": 0, "vendor_id": "8086", "label": "label_8086_0e00", "address": "0000:00:00.0"},
 {"dev_id": "pci_0000_00_01_0", "product_id": "0e02", "dev_type": "type-PCI", "numa_node": 0, "vendor_id": "8086", "label": "label_8086_0e02", "address": "0000:00:01.0"},
 {"dev_id": "pci_0000_00_01_1", "product_id": "0e03", "dev_type": "type-PCI", "numa_node": 0, "vendor_id": "8086", "label": "label_8086_0e03", "address": "0000:00:01.1"},
 {"dev_id": "pci_0000_00_02_0", "product_id": "0e04", "dev_type": "type-PCI", "numa_node": 0, "vendor_id": "8086", "label": "label_8086_0e04", "address": "0000:00:02.0"},
 {"dev_id": "pci_0000_06_00_0", "product_id": "10fb", "dev_type": "type-PCI", "numa_node": 0, "vendor_id": "8086", "label": "label_8086_10fb", "address": "0000:06:00.0"},
 {"dev_id": "pci_0000_06_00_1", "product_id": "10fb", "dev_type": "type-PF", "numa_node": 0, "vendor_id": "8086", "label": "label_8086_10fb", "address": "0000:06:00.1"},
 {"dev_id": "pci_0000_06_10_1", "product_id": "10ed", "dev_type": "type-VF", "numa_node": 0, "vendor_id": "8086", "label": "label_8086_10ed", "phys_function": "0000:06:00.1", "address": "0000:06:10.1"},
 {"dev_id": "pci_0000_06_10_3", "product_id": "10ed", "dev_type": "type-VF", "numa_node": 0, "vendor_id": "8086", "label": "label_8086_10ed", "phys_function": "0000:06:00.1", "address": "0000:06:10.3"},
 {"dev_id": "pci_0000_06_10_5", "product_id": "10ed", "dev_type": "type-VF", "numa_node": 0, "vendor_id": "8086", "label": "label_8086_10ed", "phys_function": "0000:06:00.1", "address": "0000:06:10.5"},
 {"dev_id": "pci_0000_06_10_7", "product_id": "10ed", "dev_type": "type-VF", "numa_node": 0, "vendor_id": "8086", "label": "label_8086_10ed", "phys_function": "0000:06:00.1", "address": "0000:06:10.7"},
 {"dev_id": "pci_0000_06_11_1", "product_id": "10ed", "dev_type": "type-VF", "numa_node": 0, "vendor_id": "8086", "label": "label_8086_10ed", "phys_function": "0000:06:00.1", "address": "0000:06:11.1"},
 {"dev_id": "pci_0000_06_11_3", "product_id": "10ed", "dev_type": "type-VF", "numa_node": 0, "vendor_id": "8086", "label": "label_8086_10ed", "phys_function": "0000:06:00.1", "address": "0000:06:11.3"},
 {"dev_id": "pci_0000_06_11_5", "product_id": "10ed", "dev_type": "type-VF", "numa_node": 0, "vendor_id": "8086", "label": "label_8086_10ed", "phys_function": "0000:06:00.1", "address": "0000:06:11.5"},
 {"dev_id": "pci_0000_06_11_7", "product_id": "10ed", "dev_type": "type-VF", "numa_node": 0, "vendor_id": "8086", "label": "label_8086_10ed", "phys_function": "0000:06:00.1", "address": "0000:06:11.7"}

**** 0000:06:11.7 is then picked for the new instance d4a8262f-958a-4802-84e1-c09d22b7d6df, and we get an ERROR!!!
2016-02-19 08:37:16.263 47185 DEBUG nova.compute.utils [req-0949b11d-d9be-4cdb-afc2-b94a981597a6 233a8c747d034e3a87b03f10663df397 d86bd63166e0413e97eca88a1cd39639 - - -] [instance: d4a8262f-958a-4802-84e1-c09d22b7d6df] Requested operation is not valid: PCI device 0000:06:11.7 is in use by driver QEMU, domain instance-00000594 notify_about_instance_usage /usr/lib/python2.7/site-packages/nova/compute/utils.py:310

2016-02-19 08:37:16.263 47185 DEBUG nova.compute.manager [req-0949b11d-d9be-4cdb-afc2-b94a981597a6 233a8c747d034e3a87b03f10663df397 d86bd63166e0413e97eca88a1cd39639 - - -] [instance: d4a8262f-958a-4802-84e1-c09d22b7d6df] Build of instance d4a8262f-958a-4802-84e1-c09d22b7d6df was re-scheduled: Requested operation is not valid: PCI device 0000:06:11.7 is in use by driver QEMU, domain instance-00000594 _do_build_and_run_instance /usr/lib/python2.7/site-packages/nova/compute/manager.py:2268
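One way to see this mismatch directly is to ask libvirt which PCI hostdevs are actually attached to running domains and compare that against what the resource tracker logs as assignable. A minimal sketch, assuming python-libvirt is installed and the script is run as root on the compute node (the script itself is mine, not part of Nova):

import xml.etree.ElementTree as ET

import libvirt

# Read-only connection to the local hypervisor.
conn = libvirt.openReadOnly('qemu:///system')

in_use = {}
for dom in conn.listAllDomains(libvirt.VIR_CONNECT_LIST_DOMAINS_ACTIVE):
    root = ET.fromstring(dom.XMLDesc(0))
    # SR-IOV VFs attached by Nova show up as <interface type='hostdev'>;
    # plain passthrough devices show up as <hostdev>. Collect both.
    addrs = (root.findall(".//interface[@type='hostdev']/source/address") +
             root.findall(".//hostdev/source/address"))
    for addr in addrs:
        pci = '%04x:%02x:%02x.%x' % tuple(
            int(addr.get(k), 16) for k in ('domain', 'bus', 'slot', 'function'))
        in_use[pci] = dom.name()

for pci, name in sorted(in_use.items()):
    print('%s -> %s' % (pci, name))

Any address printed here that still appears in the resource tracker's "assignable PCI devices" list is a candidate for exactly this collision.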
The device-$date files were created using:

mysql -u root nova -e "SELECT hypervisor_hostname, address, instance_uuid, status FROM pci_devices JOIN compute_nodes on compute_nodes.id=compute_node_id" > devices-$(date +%Y%m%d%H%M%S)

on the controller.

Also, it appears there are duplicate PCI device entries in the DB:

MariaDB [nova]> select hypervisor_hostname, address, count(*) from pci_devices JOIN compute_nodes on compute_nodes.id=compute_node_id group by hypervisor_hostname, address having count(*) > 1;
+-----------------------------+--------------+----------+
| hypervisor_hostname         | address      | count(*) |
+-----------------------------+--------------+----------+
| l3-compute1.vz.rhelosp.demo | 0000:06:10.1 |        2 |
| l3-compute1.vz.rhelosp.demo | 0000:06:10.3 |        2 |
| l3-compute1.vz.rhelosp.demo | 0000:06:10.5 |        2 |
| l3-compute1.vz.rhelosp.demo | 0000:06:10.7 |        2 |
| l3-compute1.vz.rhelosp.demo | 0000:06:11.1 |        2 |
| l3-compute1.vz.rhelosp.demo | 0000:06:11.3 |        2 |
| l3-compute1.vz.rhelosp.demo | 0000:06:11.5 |        2 |
| l3-compute1.vz.rhelosp.demo | 0000:06:11.7 |        2 |
| l3-compute1.vz.rhelosp.demo | 0000:06:12.1 |        2 |
| l3-compute1.vz.rhelosp.demo | 0000:06:12.3 |        2 |
| l3-compute1.vz.rhelosp.demo | 0000:06:12.5 |        2 |
| l3-compute1.vz.rhelosp.demo | 0000:06:12.7 |        2 |
| l3-compute1.vz.rhelosp.demo | 0000:06:13.1 |        2 |
| l3-compute1.vz.rhelosp.demo | 0000:06:13.3 |        2 |
| l3-compute1.vz.rhelosp.demo | 0000:06:13.5 |        2 |
| l3-compute1.vz.rhelosp.demo | 0000:06:13.7 |        2 |
| l3-compute2.vz.rhelosp.demo | 0000:06:10.1 |        2 |
| l3-compute2.vz.rhelosp.demo | 0000:06:10.3 |        2 |
| l3-compute2.vz.rhelosp.demo | 0000:06:10.5 |        2 |
| l3-compute2.vz.rhelosp.demo | 0000:06:10.7 |        2 |
| l3-compute2.vz.rhelosp.demo | 0000:06:11.1 |        2 |
| l3-compute2.vz.rhelosp.demo | 0000:06:11.3 |        2 |
| l3-compute2.vz.rhelosp.demo | 0000:06:11.5 |        2 |
| l3-compute2.vz.rhelosp.demo | 0000:06:11.7 |        2 |
| l3-compute2.vz.rhelosp.demo | 0000:06:12.1 |        2 |
| l3-compute2.vz.rhelosp.demo | 0000:06:12.3 |        2 |
| l3-compute2.vz.rhelosp.demo | 0000:06:12.5 |        2 |
| l3-compute2.vz.rhelosp.demo | 0000:06:12.7 |        2 |
| l3-compute2.vz.rhelosp.demo | 0000:06:13.1 |        2 |
| l3-compute2.vz.rhelosp.demo | 0000:06:13.3 |        2 |
| l3-compute2.vz.rhelosp.demo | 0000:06:13.5 |        2 |
| l3-compute2.vz.rhelosp.demo | 0000:06:13.7 |        2 |
+-----------------------------+--------------+----------+
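Side note, an assumption worth ruling out rather than something I've confirmed against this dump: Nova soft-deletes rows in these tables, so a duplicated address can also be a stale soft-deleted record sitting next to a live one. Re-running the query with the deleted column included would tell the two cases apart:

MariaDB [nova]> select hypervisor_hostname, address, deleted, count(*) from pci_devices JOIN compute_nodes on compute_nodes.id=compute_node_id group by hypervisor_hostname, address, deleted having count(*) > 1;

If both copies have deleted = 0, the tracker really is double-booking live devices.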
After looking at the logs again and the db dump, I agree with Sahid. It looks like we are reusing the same pci devices on reschedule. Considering this, there will be no re-allocation of the devices, since the network is already allocated.

It just looked like a weird coincidence that the pci devices being chosen for the instances are exactly the same on all of the nodes, while there are so many available.

Also, I can confirm that the problem is not related to the allocation of the pci devices: the status of the first instance's pci devices (the instance that was successfully spawned on the l3-compute4 host) changed to 'allocated' before the rescheduled instance was launched on the host; later the status gets cleared. (Still need to look into this point.)

l3-compute4.vz.rhelosp.demo 0000:06:11.7 680dad83-a9f5-418d-a1f9-5938f84c5306 allocated
l3-compute4.vz.rhelosp.demo 0000:06:12.5 680dad83-a9f5-418d-a1f9-5938f84c5306 allocated

I will submit a patch upstream to address the deallocation of the networks on reschedule.

Thanks,
Vladik
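For anyone following along, the shape of the fix being described: when a boot attempt fails and raises RescheduledException, the compute manager should release the per-host allocations (networks, and with them the PCI device picks) before the request goes back to the scheduler. A rough pseudocode sketch of the ordering only, not the actual upstream patch; the real methods in nova/compute/manager.py take more arguments:

try:
    self._build_and_run_instance(context, instance, ...)
except exception.RescheduledException:
    # Release claims tied to this attempt so a retry (possibly on the
    # same host) re-allocates instead of reusing stale device picks.
    self._cleanup_allocated_networks(context, instance, requested_networks)
    raise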
(In reply to Vladik Romanovsky from comment #25)
> After looking at the logs again and the db dump, I agree with Sahid.
> It looks like we are reusing the same pci devices on reschedule.
> Considering this, there will be no re-allocation of the devices, since the
> network is already allocated.
>
> It just looked like a weird coincidence that the pci devices being chosen
> for the instances are exactly the same on all of the nodes, while there are
> so many available.
>
> Also, I can confirm that the problem is not related to the allocation of
> the pci devices: the status of the first instance's pci devices (the
> instance that was successfully spawned on the l3-compute4 host) changed to
> 'allocated' before the rescheduled instance was launched on the host;
> later the status gets cleared. (Still need to look into this point.)
>
> l3-compute4.vz.rhelosp.demo 0000:06:11.7 680dad83-a9f5-418d-a1f9-5938f84c5306 allocated
> l3-compute4.vz.rhelosp.demo 0000:06:12.5 680dad83-a9f5-418d-a1f9-5938f84c5306 allocated
>
> I will submit a patch upstream to address the deallocation of the networks
> on reschedule.
>
> Thanks,
> Vladik

The reason the 'allocated' status was cleared for the PCI devices of 680dad83-a9f5-418d-a1f9-5938f84c5306 / instance-00000594 is that the instance was deleted: the orchestration tool being used deletes the entire set of instances it spawned when there is a failure.

Let me know if you need anything else from the environment.

Thanks!
-tom
Test packages are available here [1]. Please let us know any feedback from the customer.

Thanks

[1] https://brewweb.devel.redhat.com/taskinfo?taskID=10538873
Verified as follows: used "modprobe igb max_vfs=2", and since the setup had a two-port NIC there were a total of 4 VFs available. The flavor was set up with "pci_pass_test:1". We were able to create 4 instances successfully (as expected).

********
VERSION
********
[root@rhos-compute-node-02 ~(keystone_admin)]# yum list installed | grep openstack-nova
openstack-nova-api.noarch          2015.1.4-5.el7ost   @rhelosp-7.0-puddle
openstack-nova-cert.noarch         2015.1.4-5.el7ost   @rhelosp-7.0-puddle
openstack-nova-common.noarch       2015.1.4-5.el7ost   @rhelosp-7.0-puddle
openstack-nova-compute.noarch      2015.1.4-5.el7ost   @rhelosp-7.0-puddle
openstack-nova-conductor.noarch    2015.1.4-5.el7ost   @rhelosp-7.0-puddle
openstack-nova-console.noarch      2015.1.4-5.el7ost   @rhelosp-7.0-puddle
openstack-nova-novncproxy.noarch   2015.1.4-5.el7ost   @rhelosp-7.0-puddle
openstack-nova-scheduler.noarch    2015.1.4-5.el7ost   @rhelosp-7.0-puddle

******
LOGS
******
[root@serverA ~(keystone_admin)]# nova flavor-show pci-pass
+----------------------------+----------------------------------------------+
| Property                   | Value                                        |
+----------------------------+----------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                        |
| OS-FLV-EXT-DATA:ephemeral  | 0                                            |
| disk                       | 5                                            |
| extra_specs                | {"pci_passthrough:alias": "pci_pass_test:1"} |
| id                         | 100                                          |
| name                       | pci-pass                                     |
| os-flavor-access:is_public | True                                         |
| ram                        | 512                                          |
| rxtx_factor                | 1.0                                          |
| swap                       |                                              |
| vcpus                      | 1                                            |
+----------------------------+----------------------------------------------+

[root@serverA ~(keystone_admin)]# nova boot --flavor pci-pass --image cirros --nic net-id=8ad820a7-9382-4319-afbb-153058c3271d vm_pci
+--------------------------------------+-----------------------------------------------+
| Property                             | Value                                         |
+--------------------------------------+-----------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                        |
| OS-EXT-AZ:availability_zone          |                                               |
| OS-EXT-SRV-ATTR:host                 | -                                             |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | -                                             |
| OS-EXT-SRV-ATTR:instance_name        | instance-00000003                             |
| OS-EXT-STS:power_state               | 0                                             |
| OS-EXT-STS:task_state                | scheduling                                    |
| OS-EXT-STS:vm_state                  | building                                      |
| OS-SRV-USG:launched_at               | -                                             |
| OS-SRV-USG:terminated_at             | -                                             |
| accessIPv4                           |                                               |
| accessIPv6                           |                                               |
| adminPass                            | v6thAL3ycwJP                                  |
| config_drive                         |                                               |
| created                              | 2016-06-15T18:32:27Z                          |
| flavor                               | pci-pass (100)                                |
| hostId                               |                                               |
| id                                   | 5cbb85af-dbf0-4a0f-b175-64afa147c984          |
| image                                | cirros (1b484e42-8e2d-4f89-aab5-c74d8e874a1b) |
| key_name                             | -                                             |
| metadata                             | {}                                            |
| name                                 | vm_pci                                        |
| os-extended-volumes:volumes_attached | []                                            |
| progress                             | 0                                             |
| security_groups                      | default                                       |
| status                               | BUILD                                         |
| tenant_id                            | 539312767fc24dd78293d660373a7293              |
| updated                              | 2016-06-15T18:32:27Z                          |
| user_id                              | 621dbc15638845779419793637abab66              |
+--------------------------------------+-----------------------------------------------+

[root@serverA ~(keystone_admin)]# nova delete vm_pci
Request to delete server vm_pci has been accepted.

[root@serverA ~(keystone_admin)]# nova list
+----+------+--------+------------+-------------+----------+
| ID | Name | Status | Task State | Power State | Networks |
+----+------+--------+------------+-------------+----------+
+----+------+--------+------------+-------------+----------+

[root@serverA ~(keystone_admin)]# nova boot --flavor pci-pass --image cirros --nic net-id=8ad820a7-9382-4319-afbb-153058c3271d vm_pci
+--------------------------------------+-----------------------------------------------+
| Property                             | Value                                         |
+--------------------------------------+-----------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                        |
| OS-EXT-AZ:availability_zone          |                                               |
| OS-EXT-SRV-ATTR:host                 | -                                             |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | -                                             |
| OS-EXT-SRV-ATTR:instance_name        | instance-00000004                             |
| OS-EXT-STS:power_state               | 0                                             |
| OS-EXT-STS:task_state                | scheduling                                    |
| OS-EXT-STS:vm_state                  | building                                      |
| OS-SRV-USG:launched_at               | -                                             |
| OS-SRV-USG:terminated_at             | -                                             |
| accessIPv4                           |                                               |
| accessIPv6                           |                                               |
| adminPass                            | dHn6Hf42VzWS                                  |
| config_drive                         |                                               |
| created                              | 2016-06-15T18:35:02Z                          |
| flavor                               | pci-pass (100)                                |
| hostId                               |                                               |
| id                                   | fae0b702-5b7b-48ca-b881-1ad5884f289c          |
| image                                | cirros (1b484e42-8e2d-4f89-aab5-c74d8e874a1b) |
| key_name                             | -                                             |
| metadata                             | {}                                            |
| name                                 | vm_pci                                        |
| os-extended-volumes:volumes_attached | []                                            |
| progress                             | 0                                             |
| security_groups                      | default                                       |
| status                               | BUILD                                         |
| tenant_id                            | 539312767fc24dd78293d660373a7293              |
| updated                              | 2016-06-15T18:35:02Z                          |
| user_id                              | 621dbc15638845779419793637abab66              |
+--------------------------------------+-----------------------------------------------+

[root@serverA ~(keystone_admin)]# nova list
+--------------------------------------+--------+--------+------------+-------------+------------------+
| ID                                   | Name   | Status | Task State | Power State | Networks         |
+--------------------------------------+--------+--------+------------+-------------+------------------+
| fae0b702-5b7b-48ca-b881-1ad5884f289c | vm_pci | ACTIVE | -          | Running     | private=10.0.0.6 |
+--------------------------------------+--------+--------+------------+-------------+------------------+

[root@serverA ~(keystone_admin)]# nova boot --flavor pci-pass --image cirros --nic net-id=8ad820a7-9382-4319-afbb-153058c3271d vm_pci1
+--------------------------------------+-----------------------------------------------+
| Property                             | Value                                         |
+--------------------------------------+-----------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                        |
| OS-EXT-AZ:availability_zone          |                                               |
| OS-EXT-SRV-ATTR:host                 | -                                             |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | -                                             |
| OS-EXT-SRV-ATTR:instance_name        | instance-00000005                             |
| OS-EXT-STS:power_state               | 0                                             |
| OS-EXT-STS:task_state                | scheduling                                    |
| OS-EXT-STS:vm_state                  | building                                      |
| OS-SRV-USG:launched_at               | -                                             |
| OS-SRV-USG:terminated_at             | -                                             |
| accessIPv4                           |                                               |
| accessIPv6                           |                                               |
| adminPass                            | Qk86kHT8zsnj                                  |
| config_drive                         |                                               |
| created                              | 2016-06-15T18:38:59Z                          |
| flavor                               | pci-pass (100)                                |
| hostId                               |                                               |
| id                                   | 3b5fbe04-46c0-403b-863b-621d490283f3          |
| image                                | cirros (1b484e42-8e2d-4f89-aab5-c74d8e874a1b) |
| key_name                             | -                                             |
| metadata                             | {}                                            |
| name                                 | vm_pci1                                       |
| os-extended-volumes:volumes_attached | []                                            |
| progress                             | 0                                             |
| security_groups                      | default                                       |
| status                               | BUILD                                         |
| tenant_id                            | 539312767fc24dd78293d660373a7293              |
| updated                              | 2016-06-15T18:38:59Z                          |
| user_id                              | 621dbc15638845779419793637abab66              |
+--------------------------------------+-----------------------------------------------+

[root@serverA ~(keystone_admin)]# nova list
+--------------------------------------+---------+--------+------------+-------------+------------------+
| ID                                   | Name    | Status | Task State | Power State | Networks         |
+--------------------------------------+---------+--------+------------+-------------+------------------+
| fae0b702-5b7b-48ca-b881-1ad5884f289c | vm_pci  | ACTIVE | -          | Running     | private=10.0.0.6 |
| 3b5fbe04-46c0-403b-863b-621d490283f3 | vm_pci1 | BUILD  | spawning   | NOSTATE     | private=10.0.0.7 |
+--------------------------------------+---------+--------+------------+-------------+------------------+
[root@serverA ~(keystone_admin)]# nova list
+--------------------------------------+---------+--------+------------+-------------+------------------+
| ID                                   | Name    | Status | Task State | Power State | Networks         |
+--------------------------------------+---------+--------+------------+-------------+------------------+
| fae0b702-5b7b-48ca-b881-1ad5884f289c | vm_pci  | ACTIVE | -          | Running     | private=10.0.0.6 |
| 3b5fbe04-46c0-403b-863b-621d490283f3 | vm_pci1 | ACTIVE | -          | Running     | private=10.0.0.7 |
+--------------------------------------+---------+--------+------------+-------------+------------------+

[root@serverA ~(keystone_admin)]# nova boot --flavor pci-pass --image cirros --nic net-id=8ad820a7-9382-4319-afbb-153058c3271d vm_pci2
+--------------------------------------+-----------------------------------------------+
| Property                             | Value                                         |
+--------------------------------------+-----------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                        |
| OS-EXT-AZ:availability_zone          |                                               |
| OS-EXT-SRV-ATTR:host                 | -                                             |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | -                                             |
| OS-EXT-SRV-ATTR:instance_name        | instance-00000006                             |
| OS-EXT-STS:power_state               | 0                                             |
| OS-EXT-STS:task_state                | scheduling                                    |
| OS-EXT-STS:vm_state                  | building                                      |
| OS-SRV-USG:launched_at               | -                                             |
| OS-SRV-USG:terminated_at             | -                                             |
| accessIPv4                           |                                               |
| accessIPv6                           |                                               |
| adminPass                            | JHN2a7XtZVny                                  |
| config_drive                         |                                               |
| created                              | 2016-06-15T18:39:17Z                          |
| flavor                               | pci-pass (100)                                |
| hostId                               |                                               |
| id                                   | 67833c76-ebe8-4be6-9480-bc89e3ba8679          |
| image                                | cirros (1b484e42-8e2d-4f89-aab5-c74d8e874a1b) |
| key_name                             | -                                             |
| metadata                             | {}                                            |
| name                                 | vm_pci2                                       |
| os-extended-volumes:volumes_attached | []                                            |
| progress                             | 0                                             |
| security_groups                      | default                                       |
| status                               | BUILD                                         |
| tenant_id                            | 539312767fc24dd78293d660373a7293              |
| updated                              | 2016-06-15T18:39:17Z                          |
| user_id                              | 621dbc15638845779419793637abab66              |
+--------------------------------------+-----------------------------------------------+

[root@serverA ~(keystone_admin)]# nova list
+--------------------------------------+---------+--------+------------+-------------+------------------+
| ID                                   | Name    | Status | Task State | Power State | Networks         |
+--------------------------------------+---------+--------+------------+-------------+------------------+
| fae0b702-5b7b-48ca-b881-1ad5884f289c | vm_pci  | ACTIVE | -          | Running     | private=10.0.0.6 |
| 3b5fbe04-46c0-403b-863b-621d490283f3 | vm_pci1 | ACTIVE | -          | Running     | private=10.0.0.7 |
| 67833c76-ebe8-4be6-9480-bc89e3ba8679 | vm_pci2 | BUILD  | spawning   | NOSTATE     | private=10.0.0.8 |
+--------------------------------------+---------+--------+------------+-------------+------------------+
[root@serverA ~(keystone_admin)]# nova list
+--------------------------------------+---------+--------+------------+-------------+------------------+
| ID                                   | Name    | Status | Task State | Power State | Networks         |
+--------------------------------------+---------+--------+------------+-------------+------------------+
| fae0b702-5b7b-48ca-b881-1ad5884f289c | vm_pci  | ACTIVE | -          | Running     | private=10.0.0.6 |
| 3b5fbe04-46c0-403b-863b-621d490283f3 | vm_pci1 | ACTIVE | -          | Running     | private=10.0.0.7 |
| 67833c76-ebe8-4be6-9480-bc89e3ba8679 | vm_pci2 | ACTIVE | -          | Running     | private=10.0.0.8 |
+--------------------------------------+---------+--------+------------+-------------+------------------+

[root@serverA ~(keystone_admin)]# nova boot --flavor pci-pass --image cirros --nic net-id=8ad820a7-9382-4319-afbb-153058c3271d vm_pci3
+--------------------------------------+-----------------------------------------------+
| Property                             | Value                                         |
+--------------------------------------+-----------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                        |
| OS-EXT-AZ:availability_zone          |                                               |
| OS-EXT-SRV-ATTR:host                 | -                                             |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | -                                             |
| OS-EXT-SRV-ATTR:instance_name        | instance-00000007                             |
| OS-EXT-STS:power_state               | 0                                             |
| OS-EXT-STS:task_state                | scheduling                                    |
| OS-EXT-STS:vm_state                  | building                                      |
| OS-SRV-USG:launched_at               | -                                             |
| OS-SRV-USG:terminated_at             | -                                             |
| accessIPv4                           |                                               |
| accessIPv6                           |                                               |
| adminPass                            | 4DxENjbLC8uo                                  |
| config_drive                         |                                               |
| created                              | 2016-06-15T18:39:58Z                          |
| flavor                               | pci-pass (100)                                |
| hostId                               |                                               |
| id                                   | d8607f43-7c0f-42fb-816a-7f2b0bdcdedb          |
| image                                | cirros (1b484e42-8e2d-4f89-aab5-c74d8e874a1b) |
| key_name                             | -                                             |
| metadata                             | {}                                            |
| name                                 | vm_pci3                                       |
| os-extended-volumes:volumes_attached | []                                            |
| progress                             | 0                                             |
| security_groups                      | default                                       |
| status                               | BUILD                                         |
| tenant_id                            | 539312767fc24dd78293d660373a7293              |
| updated                              | 2016-06-15T18:39:58Z                          |
| user_id                              | 621dbc15638845779419793637abab66              |
+--------------------------------------+-----------------------------------------------+

[root@serverA ~(keystone_admin)]# nova list
+--------------------------------------+---------+--------+------------+-------------+------------------+
| ID                                   | Name    | Status | Task State | Power State | Networks         |
+--------------------------------------+---------+--------+------------+-------------+------------------+
| fae0b702-5b7b-48ca-b881-1ad5884f289c | vm_pci  | ACTIVE | -          | Running     | private=10.0.0.6 |
| 3b5fbe04-46c0-403b-863b-621d490283f3 | vm_pci1 | ACTIVE | -          | Running     | private=10.0.0.7 |
| 67833c76-ebe8-4be6-9480-bc89e3ba8679 | vm_pci2 | ACTIVE | -          | Running     | private=10.0.0.8 |
| d8607f43-7c0f-42fb-816a-7f2b0bdcdedb | vm_pci3 | BUILD  | spawning   | NOSTATE     | private=10.0.0.9 |
+--------------------------------------+---------+--------+------------+-------------+------------------+
[root@serverA ~(keystone_admin)]# nova list
+--------------------------------------+---------+--------+------------+-------------+------------------+
| ID                                   | Name    | Status | Task State | Power State | Networks         |
+--------------------------------------+---------+--------+------------+-------------+------------------+
| fae0b702-5b7b-48ca-b881-1ad5884f289c | vm_pci  | ACTIVE | -          | Running     | private=10.0.0.6 |
| 3b5fbe04-46c0-403b-863b-621d490283f3 | vm_pci1 | ACTIVE | -          | Running     | private=10.0.0.7 |
| 67833c76-ebe8-4be6-9480-bc89e3ba8679 | vm_pci2 | ACTIVE | -          | Running     | private=10.0.0.8 |
| d8607f43-7c0f-42fb-816a-7f2b0bdcdedb | vm_pci3 | ACTIVE | -          | Running     | private=10.0.0.9 |
+--------------------------------------+---------+--------+------------+-------------+------------------+
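As an extra DB-side sanity check after a run like this, the query from the earlier comment can confirm no VF is double-booked: mysql -u root nova -e "select hypervisor_hostname, address, instance_uuid from pci_devices join compute_nodes on compute_nodes.id=compute_node_id where status='allocated'" should list each (hypervisor, address) pair at most once.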
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2016:1313