Bug 1296959

Summary: scheduler tries to assign an already-in-use SRIOV VF to a new instance and instance fails
Product: Red Hat OpenStack Reporter: GE Scott Knauss <sknauss>
Component: openstack-nova Assignee: Sahid Ferdjaoui <sferdjao>
Status: CLOSED INSUFFICIENT_DATA QA Contact: nlevinki <nlevinki>
Severity: high Docs Contact:
Priority: urgent    
Version: 6.0 (Juno) CC: berrange, dasmith, eglynn, kchamart, mschuppe, ndipanov, pablo.iranzo, rsussman, sbauza, sferdjao, sgordon, vromanso, yeylon
Target Milestone: ---   
Target Release: 6.0 (Juno)   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-01-28 14:35:19 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 743661    

Description GE Scott Knauss 2016-01-08 14:40:17 UTC
What problem/issue/behavior are you having trouble with?  What do you expect to see?

While deploying our appliance, one VM goes into ERROR state:

2015-12-03 14:57:01.658 47781 INFO nova.osapi_compute.wsgi.server [req-50ae6529-7e00-4c05-82be-1bcb0637958b None] 198.18.1.21 "GET /v2/1d5d10b642704b17aacbeef3592f1b13/servers/09de0c82-fe4d-4ecd-b077-d6ed9661135a/os-volume_attachments HTTP/1.1" status: 200 len: 221 time: 0.0632861
2015-12-03 14:46:19.088 47991 ERROR nova.scheduler.utils [req-e8f9a434-8d3a-41c1-b4d8-46ca76af07a7 None] [instance: 09de0c82-fe4d-4ecd-b077-d6ed9661135a] Error from last host: vanessa-compute-1.nokia.ncio.localdomain (node vanessa-compute-1.nokia.ncio.localdomain): [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2091, in _do_build_and_run_instance\n    filter_properties)\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2185, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=e.format_message())\n', u'RescheduledException: Build of instance 09de0c82-fe4d-4ecd-b077-d6ed9661135a was re-scheduled: Insufficient compute resources: Requested instance NUMA topology together with requested PCI devices cannot fit the given host NUMA topology.\n']
2015-12-03 14:46:19.197 47991 WARNING nova.scheduler.driver [req-e8f9a434-8d3a-41c1-b4d8-46ca76af07a7 None] [instance: 09de0c82-fe4d-4ecd-b077-d6ed9661135a] NoValidHost exception with message: 'No valid host was found.'
2015-12-03 14:46:19.197 47991 WARNING nova.scheduler.driver [req-e8f9a434-8d3a-41c1-b4d8-46ca76af07a7 None] [instance: 09de0c82-fe4d-4ecd-b077-d6ed9661135a] Setting instance to ERROR state.

The instance needs 14 vCPUs, and 2 of the computes in this AZ have enough free resources:
compute-1: 32 /46 vCPUs
compute-2: 24 /46 vCPUs
compute-3: 42 /46 vCPUs
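
The claim that 2 computes can host the instance can be checked with a short sketch. It assumes each figure above is "used / total" vCPUs and that no CPU overcommit applies; neither is stated explicitly in this report, and the function name is illustrative only.

```python
# Assumption: each figure in the report is "used / total" vCPUs,
# and the hosts do not overcommit CPUs.
REQUESTED_VCPUS = 14

usage = {
    "compute-1": (32, 46),
    "compute-2": (24, 46),
    "compute-3": (42, 46),
}


def hosts_with_room(usage, requested):
    """Return {host: free_vcpus} for hosts whose free vCPUs cover the request."""
    fits = {}
    for host, (used, total) in usage.items():
        free = total - used
        if free >= requested:
            fits[host] = free
    return fits


candidates = hosts_with_room(usage, REQUESTED_VCPUS)
print(candidates)
```

Under those assumptions only compute-1 and compute-2 have 14 or more free vCPUs, which matches the count of 2 given above. This covers vCPU headroom only; it says nothing about NUMA placement or PCI device availability, which is what the RescheduledException actually complains about.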

Our computes also have SR-IOV enabled, so we have many pci_devices available:
# mysql -u root nova -e "SELECT * FROM pci_devices;" |wc -l
3781
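
The line count above includes the header row and does not distinguish free devices from allocated ones. A per-status breakdown would be more telling. The sketch below assumes the output of a query such as `SELECT address, status, instance_uuid FROM pci_devices WHERE deleted = 0;` run through the same `mysql -e` invocation, which prints tab-separated rows with a header when piped. The column names are assumed from the nova schema and should be verified against the deployed version; the sample rows are hypothetical, not data from this environment.

```python
import csv
import io
from collections import Counter


def count_by_status(tsv_text):
    """Count pci_devices rows per status from tab-separated mysql output."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return Counter(row["status"] for row in reader)


# Hypothetical sample rows, NOT taken from the affected environment.
sample = (
    "address\tstatus\tinstance_uuid\n"
    "0000:00:10.0\tavailable\tNULL\n"
    "0000:00:10.2\tallocated\t11111111-2222-3333-4444-555555555555\n"
    "0000:00:10.4\tavailable\tNULL\n"
)

counts = count_by_status(sample)
print(counts)
```

Comparing the rows marked allocated with the devices libvirt reports as attached on each compute would show whether the database and the hypervisor disagree.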

It seems the scheduler algorithm did not work properly: although we had free resources on 2 computes, the VM was not scheduled to launch on either of them.

It is intermittent. There is no workaround; undeployment fails.

Usually, the next deployment with the exact same configuration (same templates, same computes, same flavors, same VMs) succeeds.

Where are you experiencing the behavior?  What environment?

Hardware: BL460 Gen9

When does the behavior occur? Frequently?  Repeatedly?   At certain times?

Occasionally

What information can you provide around timeframes and urgency?

It is very urgent. It is a blocker for the reporter, who is already behind their deadline.

Comment 2 GE Scott Knauss 2016-01-08 14:46:42 UTC
From: nova-compute.log

2016-01-05 14:51:51.940 15262 TRACE nova.compute.manager [instance: ee9d5d79-0fe2-4ab7-abdb-bda7bcafcaaf]     if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)  
2016-01-05 14:51:51.940 15262 TRACE nova.compute.manager [instance: ee9d5d79-0fe2-4ab7-abdb-bda7bcafcaaf] libvirtError: Requested operation is not valid: PCI device 0000:08:1f.2 is in use by driver QEMU, domain instance-0000046f
2016-01-05 14:51:51.940 15262 TRACE nova.compute.manager [instance: ee9d5d79-0fe2-4ab7-abdb-bda7bcafcaaf] 
2016-01-05 14:51:51.978 15262 AUDIT nova.compute.manager [req-5feab602-6d89-4072-8b6f-9c22cda2fbd5 None] [instance: ee9d5d79-0fe2-4ab7-abdb-bda7bcafcaaf] Terminating instance
2016-01-05 14:51:51.988 15262 INFO nova.virt.libvirt.driver [-] [instance: ee9d5d79-0fe2-4ab7-abdb-bda7bcafcaaf] Instance destroyed successfully.

From: sosreport lspci


08:1f.1 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
08:1f.2 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)
08:1f.3 Ethernet controller: Intel Corporation 82599 Ethernet Controller Virtual Function (rev 01)


It seems the scheduler is trying to assign a virtual function (0000:08:1f.2) that is already in use by another domain (instance-0000046f).
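
To find every occurrence of this conflict in nova-compute.log, the libvirtError message can be matched with a regular expression. This is a minimal sketch: the pattern is derived only from the single message quoted above, and the function name is illustrative, so other libvirt versions may word the error differently.

```python
import re

# Pattern built from the one libvirtError message quoted in this report.
PCI_IN_USE = re.compile(
    r"PCI device (?P<address>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]) "
    r"is in use by driver (?P<driver>[^,\s]+), domain (?P<domain>\S+)"
)


def find_pci_conflicts(log_lines):
    """Return (pci_address, owning_domain) pairs for 'in use' errors."""
    conflicts = []
    for line in log_lines:
        match = PCI_IN_USE.search(line)
        if match:
            conflicts.append((match.group("address"), match.group("domain")))
    return conflicts


line = (
    "2016-01-05 14:51:51.940 15262 TRACE nova.compute.manager "
    "[instance: ee9d5d79-0fe2-4ab7-abdb-bda7bcafcaaf] libvirtError: "
    "Requested operation is not valid: PCI device 0000:08:1f.2 is in use "
    "by driver QEMU, domain instance-0000046f"
)

conflicts = find_pci_conflicts([line])
print(conflicts)
```

Each pair names the VF that was handed out twice and the libvirt domain that already holds it, which gives a starting point for checking that device's record in pci_devices.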