Bug 1262069

Summary: [RFE] Dynamically apply an IOPS limit based on a ceilometer trigger to limit a "noisy" VM
Product: Red Hat OpenStack
Reporter: Neil Levine <nlevine>
Component: openstack-nova
Assignee: OSP DFG:Compute <osp-dfg-compute>
Status: CLOSED WONTFIX
QA Contact: OSP DFG:Compute <osp-dfg-compute>
Severity: low
Docs Contact:
Priority: low
Version: 9.0 (Mitaka)
CC: berrange, dasmith, eglynn, flucifre, jdurgin, kchamart, nlevine, sbauza, sferdjao, sgordon, srevivo, vromanso
Target Milestone: ---
Keywords: FutureFeature
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-04-05 16:52:56 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1261092, 1369482
Bug Blocks:

Description Neil Levine 2015-09-10 18:39:21 UTC
If we see a VM consuming more IOPS than allowed by a threshold set in Ceilometer, we want to automatically limit it so it doesn't affect other tenants. This will provide a basic QoS facility.

Comment 3 Stephen Gordon 2016-01-29 14:54:35 UTC
(In reply to Neil Levine from comment #0)
> If we see a VM consuming more IOPS than allowed by a threshold set in
> Ceilometer, we want to automatically limit it so it doesn't affect other
> tenants. This will provide a basic QoS facility.

Can you elaborate on what is needed over and above instance resource quotas:

    https://wiki.openstack.org/wiki/InstanceResourceQuota

Thanks,

Steve
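For context, the disk I/O quotas on that wiki page are flavor extra specs with a `quota:` prefix (for example `quota:disk_total_iops_sec`), which the libvirt driver turns into an `<iotune>` element on the guest disk. The sketch below only illustrates that mapping; the function name `iotune_from_extra_specs` and the sample flavor values are hypothetical, and this is not Nova's actual code.

```python
import xml.etree.ElementTree as ET

# Disk I/O keys listed on the InstanceResourceQuota wiki page.
DISK_QUOTA_KEYS = (
    "disk_read_bytes_sec",
    "disk_read_iops_sec",
    "disk_write_bytes_sec",
    "disk_write_iops_sec",
    "disk_total_bytes_sec",
    "disk_total_iops_sec",
)


def iotune_from_extra_specs(extra_specs):
    """Build a libvirt-style <iotune> element from flavor extra specs.

    Hypothetical helper for illustration only.
    """
    iotune = ET.Element("iotune")
    for key in DISK_QUOTA_KEYS:
        value = extra_specs.get("quota:" + key)
        if value is not None:
            # The libvirt element name is the flavor key without "disk_".
            child = ET.SubElement(iotune, key[len("disk_"):])
            child.text = str(int(value))
    return iotune


# Sample flavor extra specs (assumed values); unrelated keys are ignored.
specs = {"quota:disk_total_iops_sec": "500", "hw:cpu_policy": "shared"}
xml_text = ET.tostring(iotune_from_extra_specs(specs), encoding="unicode")
print(xml_text)
```

These limits are static: they are fixed by the flavor when the instance is created, which is the gap this RFE is about.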

Comment 4 Neil Levine 2016-01-29 19:01:48 UTC
Those settings are the ones we need, but I think this is actually a qemu-kvm issue, not a nova one, i.e. qemu-kvm doesn't honor the IOPS limits when applied to RBD volumes.

Josh, is that correct?

If so, we need to reassign the ticket to the qemu-kvm component.

Comment 5 Josh Durgin 2016-01-29 19:19:22 UTC
qemu does honor I/O limits for rbd, using its own throttling. cgroups limits do not work with librbd of course. There's a bug in previous nova versions that would not pass the disk io limits to qemu for rbd.

What I'm not sure about is whether these limits can be changed after a device is attached. Nova doesn't support that, and I'm not sure whether qemu does.
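On the qemu side, QMP has a `block_set_io_throttle` command that takes a device name plus six rate fields (`bps`, `bps_rd`, `bps_wr`, `iops`, `iops_rd`, `iops_wr`), with 0 meaning "no limit". The sketch below only builds the JSON message for such a call; the helper name and the device name `drive-virtio-disk0` are assumptions, and nothing in this bug confirms how the command behaves for an attached RBD disk.

```python
import json


def build_throttle_command(device, total_iops):
    """Return a QMP block_set_io_throttle message as a dict.

    Hypothetical helper: only the total IOPS limit is set, every other
    rate field is left at 0 (unlimited).
    """
    return {
        "execute": "block_set_io_throttle",
        "arguments": {
            "device": device,
            "bps": 0,
            "bps_rd": 0,
            "bps_wr": 0,
            "iops": total_iops,
            "iops_rd": 0,
            "iops_wr": 0,
        },
    }


# "drive-virtio-disk0" is an assumed device name, not taken from this bug.
command = build_throttle_command("drive-virtio-disk0", 500)
message = json.dumps(command)
print(message)
```

Sending this to a running guest would still need a QMP connection (or the libvirt equivalent), which Nova does not expose today.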

Comment 6 Neil Levine 2016-01-29 19:28:17 UTC
So is this a bug BZ and not an RFE against nova?

Comment 7 Josh Durgin 2016-01-29 19:35:07 UTC
If you don't mind statically defining the limits, it's a bug. Specifically this one:

https://bugs.launchpad.net/nova/+bug/1405367

If you wanted to adjust the limits dynamically, it'd be a nova RFE and possibly also a qemu RFE.

Comment 8 Neil Levine 2016-01-29 19:40:10 UTC
We need the dynamic limits, as the user story around this feature is to help admins clamp down on noisy VMs.

Just having the static limits work for now would be a win though.

Comment 9 Stephen Gordon 2016-01-29 22:27:33 UTC
(In reply to Josh Durgin from comment #7)
> If you don't mind statically defining the limits, it's a bug. Specifically
> this one:
> 
> https://bugs.launchpad.net/nova/+bug/1405367

OK, we were actually already tracking this under Bug # 1261092 which has been verified for RHEL OpenStack Platform 8.

> If you wanted to adjust the limits dynamically, it'd be a nova RFE and
> possibly also a qemu RFE.

This is more complicated, since as a general comment Nova instances are currently "fire and forget" and there isn't really an interface to change things after the fact.

There is a proposal up around live resize, which is more specifically targeted at adding CPU/RAM/disk to a running instance, but I am wondering if this type of change would fit as an extension of that or be shouted down.

Comment 10 Stephen Gordon 2016-01-29 22:30:30 UTC
One other comment: the full use case - "apply an IOPS limit based on a ceilometer trigger to limit a "noisy" VM" - sounds like it should be orchestrated by a higher-level system (e.g. CloudForms). I think what we need in Ceilometer/Nova is to ensure we are exposing the right information and APIs for such a system to use, though.
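As a rough sketch of what such a higher-level system would decide, assuming it can read per-instance IOPS samples (e.g. from Ceilometer) and has some API for applying a cap: the function name, instance IDs, threshold and cap below are all hypothetical.

```python
def plan_throttles(iops_by_instance, threshold, cap):
    """Return {instance_id: cap} for each instance whose IOPS exceeds threshold.

    Hypothetical decision step only; reading the samples and applying
    the cap are left to whatever APIs Ceilometer and Nova expose.
    """
    return {
        instance_id: cap
        for instance_id, iops in sorted(iops_by_instance.items())
        if iops > threshold
    }


# Made-up samples: only vm-b is above the 1000 IOPS threshold.
samples = {"vm-a": 120.0, "vm-b": 4800.0, "vm-c": 950.0}
plan = plan_throttles(samples, threshold=1000.0, cap=500)
print(plan)
```

The decision itself is trivial; the missing pieces are the metric feed and a way to change the limit on a running instance.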

Comment 11 Neil Levine 2016-03-17 23:25:09 UTC
(In reply to Stephen Gordon from comment #10)
> One other comment: the full use case - "apply an IOPS limit based on a
> ceilometer trigger to limit a "noisy" VM" - sounds like it should be
> orchestrated by a higher-level system (e.g. CloudForms). I think what we
> need in Ceilometer/Nova is to ensure we are exposing the right information
> and APIs for such a system to use, though.

Yeah, this makes sense. We're starting to look at native QoS in Ceph, which will address the higher-level use case, but VM-centric logging of IOPS is always going to be necessary.

Is there info on what Nova (or QEMU) logs for VM IOPS currently?