Description of problem:
In resource-limited environments, rebuilding VMs fails in the NUMATopologyFilter because scheduling is not skipped even though the image metadata is unchanged:

nova-scheduler.log:2019-04-25 15:37:07.031 3002 INFO nova.filters [req-c49164d8-95d4-4483-85ac-7cf6eb5e5264 e5a411de82e94a94ae80181bc710696b 17d1fb77d99b4f24a4898222e8c912ab - - -] Filtering removed all hosts for the request with instance ID 'a89e4dff-f247-4c10-b6fb-4d20faa77af8'. Filter results: ['AvailabilityZoneFilter: (start: 1, end: 1)', 'RamFilter: (start: 1, end: 1)', 'ComputeFilter: (start: 1, end: 1)', 'ComputeCapabilitiesFilter: (start: 1, end: 1)', 'ImagePropertiesFilter: (start: 1, end: 1)', 'ServerGroupAntiAffinityFilter: (start: 1, end: 1)', 'ServerGroupAffinityFilter: (start: 1, end: 1)', 'PciPassthroughFilter: (start: 1, end: 1)', 'NUMATopologyFilter: (start: 1, end: 0)']

Version-Release number of selected component (if applicable):
openstack-nova-compute-14.1.0-40.el7ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create VMs with pinned vCPUs until the overcloud is saturated
2. Rebuild one of them while also changing the image (see the CLI sketch under Additional info)

Actual results:
The rebuild fails: the NUMATopologyFilter removes the only candidate host.

Expected results:
The rebuild succeeds if the image metadata is the same.

Additional info:
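For reference, a minimal CLI sketch of the reproduction, assuming a flavor with dedicated CPU pinning; the flavor, image, network and server names below are placeholders, not taken from this report:

$ openstack flavor create --vcpus 4 --ram 4096 --disk 20 pinned.4
$ openstack flavor set --property hw:cpu_policy=dedicated pinned.4

# Boot pinned instances until no compute host has free pinnable CPUs left
$ openstack server create --flavor pinned.4 --image rhel7 --network private pinned-vm-1
  (repeat until the overcloud is saturated)

# Rebuild one instance with a different image that carries the same
# scheduling-relevant metadata (hw_* image properties)
$ openstack server rebuild --image rhel7-copy pinned-vm-1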
FWIW, I think the proper long-term fix for this is to use the Placement update-allocations API [1] along with the standard CPU resource tracking spec [2] and eventually NUMA in placement [3] [4]. I'm not sure what would be an acceptable short-term solution/workaround.

[1] https://developer.openstack.org/api-ref/placement/?expanded=update-allocations-detail#update-allocations
[2] https://review.opendev.org/#/c/555081/
[3] https://review.opendev.org/#/c/662191/
[4] https://review.opendev.org/#/c/658510/
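For context, driving the update-allocations API from [1] by hand looks roughly like the following with the osc-placement CLI plugin. The consumer (instance) UUID, resource provider UUID and the amounts are placeholders; this is only an illustration of the referenced API, not a proposed workaround:

$ openstack resource provider allocation show <instance_uuid>
$ openstack resource provider allocation set <instance_uuid> \
    --allocation rp=<compute_rp_uuid>,VCPU=4,MEMORY_MB=4096,DISK_GB=20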
We think this is a valid bug, but it's still an open question whether we can fix it in OSP10.
We can't fix this in OSP10 unfortunately, but we'd like to keep tracking this for OSP17. It's far away, but realistically it's going to be the first release where we might be able to address this.
*** Bug 1731847 has been marked as a duplicate of this bug. ***
*** This bug has been marked as a duplicate of bug 1700412 ***