| Summary: | [heat] potential autoscaling headroom remains unused | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Eoghan Glynn <eglynn> | ||||
| Component: | openstack-heat | Assignee: | Eoghan Glynn <eglynn> | ||||
| Status: | CLOSED ERRATA | QA Contact: | Kevin Whitney <kwhitney> | ||||
| Severity: | medium | Docs Contact: | |||||
| Priority: | medium | ||||||
| Version: | 4.0 | CC: | breeler, ddomingo, eglynn, hateya, sbaker, sdake, shardy, srevivo, yeylon | ||||
| Target Milestone: | rc | Keywords: | OtherQA, Triaged | ||||
| Target Release: | 4.0 | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | openstack-heat-engine-2013.2-2.0.el6ost | Doc Type: | Bug Fix | ||||
| Doc Text: |
Previously, the Orchestration engine used counterintuitive logic when calculating server group size changes during autoscaling. Specifically, autoscaling always applied the full configured scaling increment, regardless of the configured maximum or minimum group size.
As a result, certain scaling increment settings prevented the autoscaling feature from ever reaching the minimum or maximum group size. For example, with a scale-up setting of 2, the only possible autoscaling maximum would be 4 if the configured maximum group size is 5.
With this release, the autoscaling feature truncates scaling adjustments to the upper or lower bound in the case of an overshoot. This allows the Orchestration engine to automatically scale to the maximum and minimum group sizes, regardless of the configured scaling increments.
|
Story Points: | --- | ||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2013-12-20 00:39:08 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Attachments: |
|
||||||
Created attachment 829191 [details]
Heat template that reproduces the issue.
Fix proposed to master upstream: https://review.openstack.org/58343
Fix landed on master upstream: http://github.com/openstack/heat/commit/2c25616e
Fix proposed to stable/havana upstream: https://review.openstack.org/58552
Fix landed on stable/havana upstream: https://github.com/openstack/heat/commit/a8c0b110
Fix backported to internal rhos-4.0-rhel-6-patches branch: https://code.engineering.redhat.com/gerrit/16394

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2013-1859.html |
Description of problem:
Potential headroom for autoscaling group growth or shrinkage will remain unused if the adjustment doesn't *exactly* hit the max or min size, respectively.

Take for example an instance group with:

* MinSize=1
* MaxSize=6
* ScaleupPolicy ScalingAdjustment=2
* ScaledownPolicy ScalingAdjustment=-3

When the under-scaled alarm fires, the group will grow in increments of 2 from 1->3->5 and then grow no further, even if the under-scaled alarm condition persists. So the max group size is never reached.

Then, if the over-scaled alarm fires subsequently, the group will shrink in one decrement of 3 from 5->2 and then shrink no further, even if the over-scaled alarm condition persists. So the min group size is never resumed.

This may seem like an edge case, but it is actually quite likely to be hit, especially if the adjustment type is set to PercentChangeInCapacity, in which case it is non-trivial to choose min and max sizes such that a compounded application of the percentage delta always lands exactly on the upper and lower bounds.

More intuitive behavior would be to truncate the adjustment to the upper or lower bound in the case of an overshoot.

Version-Release number of selected component (if applicable):
openstack-heat-engine-2013.2-1.0.el6ost.noarch

How reproducible:
100%

Steps to Reproduce:

0. Install OpenStack, including heat & ceilometer. Ensure that the ceilometer compute agent is measuring cpu_util at a reasonable frequency (every minute as opposed to the default 10 minutes):

   sudo sed -i '/^ *name: cpu_pipeline$/ { n ; s/interval: 600$/interval: 60/ }' /etc/ceilometer/pipeline.yaml
   sudo service openstack-ceilometer-compute restart

1. Upload the cirros images if not already present in glance:

   sudo yum install -y wget
   wget http://launchpad.net/cirros/trunk/0.3.0/+download/cirros-0.3.0-x86_64-uec.tar.gz
   tar zxvf cirros-0.3.0-x86_64-uec.tar.gz
   glance add name=cirros-aki is_public=true container_format=aki disk_format=aki < cirros-0.3.0-x86_64-vmlinuz
   glance add name=cirros-ari is_public=true container_format=ari disk_format=ari < cirros-0.3.0-x86_64-initrd
   glance add name=cirros-ami is_public=true container_format=ami disk_format=ami \
     "kernel_id=$(glance index | awk '/cirros-aki/ {print $1}')" \
     "ramdisk_id=$(glance index | awk '/cirros-ari/ {print $1}')" < cirros-0.3.0-x86_64-blank.img

2. Add a user key if not already present in nova:

   nova keypair-add --pub_key ~/.ssh/id_rsa.pub userkey

3. Create a stack with the attached template:

   heat stack-create test_stack --template-file=template.yaml --parameters="KeyName=userkey;InstanceType=m1.tiny;ImageId=$CIRROS_AMI_IMAGE"

4. Wait for the stack creation to complete:

   watch "heat stack-show test_stack | grep status"

5. Check that the high and low CPU alarms transition into the alarm and ok states respectively within a couple of minutes:

   watch "ceilometer alarm-list | grep test_stack"

6. Verify that the peak number of servers never goes beyond 5 (whereas the declared MaxSize is 6):

   watch "nova list | grep ServerGroup | wc -l"

Actual results:
The autoscaling group maxes out at 5 instances, regardless of how long the under-scaled alarm persists.

Expected results:
The autoscaling group should max out at 6 instances.

Additional info:
This issue would also occur with native CloudWatch-style alarming, as opposed to ceilometer alarming.
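The truncate-on-overshoot behavior proposed above can be sketched in a few lines of Python. This is a hypothetical illustration of the logic, not the actual Heat engine code; the `resize` helper and its parameter names are invented for clarity:

```python
def resize(current, adjustment, min_size, max_size):
    """Return the new group size after applying a scaling adjustment,
    truncating to the [min_size, max_size] bounds on overshoot
    instead of refusing the adjustment entirely."""
    new_size = current + adjustment
    return max(min_size, min(new_size, max_size))

# With MinSize=1, MaxSize=6 and ScalingAdjustment=+2, the group now
# grows 1 -> 3 -> 5 -> 6 instead of stalling at 5:
sizes = [1]
while sizes[-1] < 6:
    sizes.append(resize(sizes[-1], 2, 1, 6))
print(sizes)  # [1, 3, 5, 6]
```

Scaling down behaves symmetrically: `resize(5, -3, 1, 6)` gives 2, and a further `resize(2, -3, 1, 6)` is truncated to the MinSize of 1 rather than being skipped.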