What problem/issue/behavior are you having trouble with? What do you expect to see?

While deploying our appliance, one VM goes into the ERROR state:

2015-12-03 14:57:01.658 47781 INFO nova.osapi_compute.wsgi.server [req-50ae6529-7e00-4c05-82be-1bcb0637958b None] 198.18.1.21 "GET /v2/1d5d10b642704b17aacbeef3592f1b13/servers/09de0c82-fe4d-4ecd-b077-d6ed9661135a/os-volume_attachments HTTP/1.1" status: 200 len: 221 time: 0.0632861
2015-12-03 14:46:19.088 47991 ERROR nova.scheduler.utils [req-e8f9a434-8d3a-41c1-b4d8-46ca76af07a7 None] [instance: 09de0c82-fe4d-4ecd-b077-d6ed9661135a] Error from last host: vanessa-compute-1.nokia.ncio.localdomain (node vanessa-compute-1.nokia.ncio.localdomain): [u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2091, in _do_build_and_run_instance\n filter_properties)\n', u' File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2185, in _build_and_run_instance\n instance_uuid=instance.uuid, reason=e.format_message())\n', u'RescheduledException: Build of instance 09de0c82-fe4d-4ecd-b077-d6ed9661135a was re-scheduled: Insufficient compute resources: Requested instance NUMA topology together with requested PCI devices cannot fit the given host NUMA topology.\n']
2015-12-03 14:46:19.197 47991 WARNING nova.scheduler.driver [req-e8f9a434-8d3a-41c1-b4d8-46ca76af07a7 None] [instance: 09de0c82-fe4d-4ecd-b077-d6ed9661135a] NoValidHost exception with message: 'No valid host was found.'
2015-12-03 14:46:19.197 47991 WARNING nova.scheduler.driver [req-e8f9a434-8d3a-41c1-b4d8-46ca76af07a7 None] [instance: 09de0c82-fe4d-4ecd-b077-d6ed9661135a] Setting instance to ERROR state.

The instance needs 14 vCPUs, yet 2 computes in this AZ have the available resources:

compute-1: 32/46 vCPUs
compute-2: 24/46 vCPUs
compute-3: 42/46 vCPUs

Our computes also use SR-IOV, so many pci_devices are available:

# mysql -u root nova -e "SELECT * FROM pci_devices;" |wc -l
3781

It seems the scheduler algorithm did not work properly: although 2 computes had free resources, the VM was not scheduled onto either of them. The problem is occasional. There is no workaround; undeployment fails. Usually the next deployment with the exact same configuration (same templates, same computes, same flavors, same VMs) succeeds.

Where are you experiencing the behavior? What environment?

Hardware: BL460 Gen9

When does the behavior occur? Frequently? Repeatedly? At certain times?

Occasionally

What information can you provide around timeframes and urgency?

It is very urgent. It is a blocker for the reporter, who is currently behind their deadline.

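The capacity claim above can be checked with a few lines of Python. This is only a sketch: it assumes the figures are "used / total" vCPUs (the report does not label them), and the name hosts_with_room is made up for illustration. It looks at the raw vCPU count only and ignores NUMA and PCI placement, which is the constraint named in the RescheduledException.

```python
# Usage figures copied from the report, read as (used, total) vCPUs.
# This reading is an assumption; the report does not label the columns.
usage = {
    "compute-1": (32, 46),
    "compute-2": (24, 46),
    "compute-3": (42, 46),
}
requested_vcpus = 14


def hosts_with_room(usage, requested):
    """Return the hosts whose free vCPU count covers the request.

    Hypothetical helper: it compares raw vCPU counts only and does not
    model NUMA cells or PCI device locality.
    """
    return sorted(
        host
        for host, (used, total) in usage.items()
        if total - used >= requested
    )


candidates = hosts_with_room(usage, requested_vcpus)
print(candidates)  # ['compute-1', 'compute-2']
```

Under that reading compute-1 has exactly 14 vCPUs free and compute-2 has 22, which matches the "2 computes" in the report. A host can still be rejected if the free vCPUs and the requested PCI devices do not sit on the same NUMA node.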

From: nova-compute.log

2016-01-05 14:51:51.940 15262 TRACE nova.compute.manager [instance: ee9d5d79-0fe2-4ab7-abdb-bda7bcafcaaf] if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2016-01-05 14:51:51.940 15262 TRACE nova.compute.manager [instance: ee9d5d79-0fe2-4ab7-abdb-bda7bcafcaaf] libvirtError: Requested operation is not valid: PCI device 0000:08:1f.2 is in use by driver QEMU, domain instance-0000046f
2016-01-05 14:51:51.940 15262 TRACE nova.compute.manager [instance: ee9d5d79-0fe2-4ab7-abdb-bda7bcafcaaf]
2016-01-05 14:51:51.978 15262 AUDIT nova.compute.manager [req-5feab602-6d89-4072-8b6f-9c22cda2fbd5 None] [instance: ee9d5d79-0fe2-4ab7-abdb-bda7bcafcaaf] Terminating instance
2016-01-05 14:51:51.988 15262 INFO nova.virt.libvirt.driver [-] [instance: ee9d5d79-0fe2-4ab7-abdb-bda7bcafcaaf] Instance destroyed successfully.

From: sosreport lspci

08:1f.1 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
08:1f.2 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
08:1f.3 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)

It seems the scheduler is trying to use an Ethernet virtual function (08:1f.2) that is already in use by another domain (instance-0000046f).
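
To see how often this conflict occurs, the "is in use by driver" messages can be pulled out of nova-compute.log with a small script. This is a rough sketch, not part of nova: the regular expression is modeled only on the single libvirtError line quoted above, and find_pci_conflicts is a made-up helper name.

```python
import re

# Pattern modeled on the one libvirtError line quoted in this report.
# Other libvirt versions may word the message differently (assumption).
PCI_IN_USE = re.compile(
    r"PCI device (?P<address>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]) "
    r"is in use by driver (?P<driver>\w+), domain (?P<domain>\S+)"
)


def find_pci_conflicts(lines):
    """Return (pci_address, driver, owning_domain) for each matching line."""
    conflicts = []
    for line in lines:
        match = PCI_IN_USE.search(line)
        if match:
            conflicts.append(
                (match.group("address"), match.group("driver"), match.group("domain"))
            )
    return conflicts


# Two lines taken from the nova-compute.log excerpt above.
sample = [
    "2016-01-05 14:51:51.940 15262 TRACE nova.compute.manager "
    "[instance: ee9d5d79-0fe2-4ab7-abdb-bda7bcafcaaf] libvirtError: "
    "Requested operation is not valid: PCI device 0000:08:1f.2 is in use "
    "by driver QEMU, domain instance-0000046f",
    "2016-01-05 14:51:51.978 15262 AUDIT nova.compute.manager "
    "[req-5feab602-6d89-4072-8b6f-9c22cda2fbd5 None] "
    "[instance: ee9d5d79-0fe2-4ab7-abdb-bda7bcafcaaf] Terminating instance",
]

conflicts = find_pci_conflicts(sample)
print(conflicts)  # [('0000:08:1f.2', 'QEMU', 'instance-0000046f')]
```

Running it over the full log of each compute (for example by passing an open file object instead of the sample list) would show whether the same virtual function or the same owning domain keeps appearing.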