Bug 1469073

Summary: [RFE] Implement CPU weighter filter
Product: Red Hat OpenStack
Reporter: Pablo Iranzo Gómez <pablo.iranzo>
Component: openstack-nova
Assignee: Stephen Finucane <stephenfin>
Status: CLOSED ERRATA
QA Contact: Joe H. Rahme <jhakimra>
Severity: high
Docs Contact:
Priority: high
Version: 10.0 (Newton)
CC: aguetta, berrange, brault, dasmith, dwojewod, egallen, eglynn, jraju, kchamart, lyarwood, mbooth, sbauza, sgordon, srevivo, stephenfin, vromanso
Target Milestone: Upstream M2
Keywords: FutureFeature, Triaged
Target Release: 14.0 (Rocky)
Hardware: Unspecified   
OS: Unspecified   
URL: https://blueprints.launchpad.net/nova/+spec/vcpu-weighter
Whiteboard:
Fixed In Version: openstack-nova-18.0.0-0.20180710150340.8469fa7
Doc Type: Enhancement
Doc Text:
Feature: Add the CPUWeigher weigher for nova-scheduler.
Reason: The CPUWeigher allows operators to configure a stacking or spreading policy for vCPUs.
Result: Operators can enable the CPUWeigher and configure a stacking (use all vCPUs on one node first) or spreading (attempt to use a roughly equal amount of vCPUs from all hosts) policy.
Story Points: ---
Clone Of:
: 1565089 1576882
Environment:
Last Closed: 2019-01-11 11:47:34 UTC
Type: Bug
Bug Depends On: 1638095, 1657391    
Bug Blocks: 1476900, 1565089, 1571286, 1573209, 1573210, 1576882, 1576885, 1576887, 1576888, 1582406, 1582409, 1582412, 1582413, 1732909    

Description Pablo Iranzo Gómez 2017-07-10 10:56:38 UTC
Description of problem:

Currently, hosts are not filtered based on the vCPUs already in use on the system (a change for this was proposed upstream and abandoned at https://review.openstack.org/#/c/379525/ ).

As a result, systems can become oversubscribed.


Version-Release number of selected component (if applicable):


How reproducible:

Initial status:
compute-0: source host, migrating its instances away (4 instances, 2 vCPUs each)
compute-1: 2 free vCPUs
compute-2: 2 free vCPUs
compute-3: 4 free vCPUs

Results:
- 1 VM (2 vCPUs) to compute-1  (could hold 1 VM, got 1)   ← OK:    should have got 1 VM, got 1
- 0 VMs to compute-2           (could hold 1 VM, got 0)   ← WRONG: should have got 1 VM, not 0
- 3 VMs (6 vCPUs) to compute-3 (could hold 2 VMs, got 3)  ← WRONG: should have got 2 VMs, not 3
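
The arithmetic behind this scenario can be checked with a short sketch. It only models the numbers reported above (free vCPUs per destination and the instances each one received); the function names are hypothetical and nothing here calls nova.

```python
# Numbers taken from the report above; every instance uses 2 vCPUs.
VCPUS_PER_INSTANCE = 2
free_vcpus = {"compute-1": 2, "compute-2": 2, "compute-3": 4}
placed_instances = {"compute-1": 1, "compute-2": 0, "compute-3": 3}


def capacity(free, per_instance=VCPUS_PER_INSTANCE):
    """How many instances fit on each host without oversubscription."""
    return {host: vcpus // per_instance for host, vcpus in free.items()}


def oversubscribed_vcpus(free, placed, per_instance=VCPUS_PER_INSTANCE):
    """vCPUs requested beyond the free capacity of each host (0 if it fits)."""
    return {
        host: max(0, placed[host] * per_instance - free[host])
        for host in free
    }


print(capacity(free_vcpus))
# {'compute-1': 1, 'compute-2': 1, 'compute-3': 2}
print(oversubscribed_vcpus(free_vcpus, placed_instances))
# {'compute-1': 0, 'compute-2': 0, 'compute-3': 2}
```

The three destinations could hold exactly the four migrating instances (1 + 1 + 2), yet compute-3 ends up 2 vCPUs over its free capacity while compute-2 stays empty.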


Actual results:

Hosts get oversubscribed

Expected results:

No oversubscription of vCPUs

Additional info:

Comment 3 Aviv Guetta 2017-10-29 09:01:31 UTC
https://review.openstack.org/#/c/379525/ was abandoned.
Reason (from Launchpad - sfinucan):
The real concern I see with that change is that we're working on using the placement API for providing the best destinations related to the existing resources (CPU, memory, disk). While it's fine to merge that one, it would mean that it's not waiting for the scheduler using the placement API, which could be a bit sad. The real issue is that it's really easy to add a custom weigher out-of-tree without needing to add it in the Nova tree, so that's not really needed, see?
I did this fairly quickly while working on the PCI-NUMA filter here. The reason for doing this is that it seemed like a filter that had a clear use case and one I really expected to see there already. If you don't think it's as useful as I think it is, then I'm fine to abandon this :)

Comment 4 Aviv Guetta 2017-10-31 07:27:07 UTC
Following the abandonment of the upstream change (see comment #3), can engineering provide more info about how to achieve a better match of placement destinations in OSP 10?

Comment 7 Stephen Finucane 2017-11-21 14:59:21 UTC
Apologies for the delay - I have been on PTO.

I have discussed this upstream and it seems this is something that will *not* be covered by placement. As such, I have reopened the review. However, this kind of feature requires a blueprint and the deadline for this during the current cycle closed some time ago. We do not carry features that are not available upstream, so this is something that would not be available until OSP 14 at the earliest. In addition, we do not tend to backport features (compared to bugfixes), though this might (might) qualify as an exception. I will ask.

Comment 8 Aviv Guetta 2017-11-28 10:32:37 UTC
Hi Stephen,
Any update?

Aviv

Comment 11 Stephen Finucane 2017-11-30 10:09:55 UTC
@Aviv. Unfortunately this is considered a feature so it would require an approved blueprint to continue work on it. The window for this during the Queens cycle (what will become OSP 17) has already closed, so this will be delayed until Rocky (OSP 18). Fortunately this is something that would be easily backported, at least from a code perspective, and once merged upstream we could assess the viability of doing so.

I'll also note that weighers and filters are an area where the nova team explicitly allows custom code to run. It's possible that the customer could use the upstream weigher in their own code. I don't know if this is supported from a Red Hat perspective though, so you'll need to ask around for info on this.
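
As a rough illustration of what such a weigher computes, here is a minimal, self-contained sketch. It does not import nova: the `Host` class, `free_vcpus`, and `pick_host` are hypothetical names, the host figures are invented, and the sketch skips the weight normalization that the real scheduler applies. It assumes the policy is selected by the sign of a multiplier (positive spreads, negative stacks).

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical stand-in for the host state a weigher would inspect."""
    name: str
    vcpus_total: int
    vcpus_used: int
    cpu_allocation_ratio: float = 1.0


def free_vcpus(host: Host) -> float:
    """vCPUs still available once the allocation ratio is applied."""
    return host.vcpus_total * host.cpu_allocation_ratio - host.vcpus_used


def pick_host(hosts: list[Host], multiplier: float) -> Host:
    """Pick the host with the highest weight.

    A positive multiplier favours hosts with more free vCPUs (spreading);
    a negative one favours hosts with fewer free vCPUs (stacking).
    """
    return max(hosts, key=lambda h: multiplier * free_vcpus(h))


hosts = [
    Host("host-a", vcpus_total=8, vcpus_used=6),  # 2 free
    Host("host-b", vcpus_total=8, vcpus_used=4),  # 4 free
    Host("host-c", vcpus_total=8, vcpus_used=2),  # 6 free
]

print(pick_host(hosts, multiplier=1.0).name)   # host-c (spreading)
print(pick_host(hosts, multiplier=-1.0).name)  # host-a (stacking)
```

A weigher only orders the candidate hosts; it does not reject any. Excluding hosts that lack enough free vCPUs remains the job of the scheduler's filters.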

Comment 12 Aviv Guetta 2017-12-07 08:25:10 UTC
Hi Stephen, 
I think you missed the OSP release numbers, Queens will be OSP 13, Rocky 14.

Using upstream code is an option for the customer, but it is unsupported. We're urging them not to do that, and we'll try to avoid it.

Is a one-off backport for OSP 10 feasible somehow?

Aviv

Comment 13 Stephen Finucane 2017-12-08 13:48:34 UTC
Oops - I was using the upstream nova release numbers.

A one-off backport might be possible and is something I need to discuss. However, just to set expectations: for there to be a backport, there must first be a feature to backport. As noted above, this can't happen until OSP 14.

Comment 14 Aviv Guetta 2018-03-08 09:31:51 UTC
Hi Stephen,
Can you suggest (if any) a workaround for OSP 10?

Aviv

Comment 15 Stephen Finucane 2018-03-08 18:18:03 UTC
I think the weigher is the only practical approach, unfortunately. Given that the Rocky window is open upstream, I will see if we get this approved for the upcoming release.

Comment 17 Sylvain Bauza 2018-04-09 12:31:08 UTC
*** Bug 1246461 has been marked as a duplicate of this bug. ***

Comment 25 errata-xmlrpc 2019-01-11 11:47:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:0045