Bug 1430753
Summary: | Heat not able to maintain loadbalancer minimum member count | |
---|---|---|---
Product: | Red Hat OpenStack | Reporter: | VIKRANT <vaggarwa>
Component: | openstack-heat | Assignee: | Zane Bitter <zbitter>
Status: | CLOSED ERRATA | QA Contact: | Ronnie Rasouli <rrasouli>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 9.0 (Mitaka) | CC: | aarapov, mburns, pkundal, rhel-osp-director-maint, sbaker, shardy, srevivo, therve, tvignaud, zbitter
Target Milestone: | rc | Keywords: | Triaged
Target Release: | 12.0 (Pike) | Flags: | tvignaud: needinfo+
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | openstack-heat-9.0.0-0.20170728194225.cc4fdce.el7ost | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2017-12-13 21:13:42 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
VIKRANT
2017-03-09 13:56:06 UTC
stack-show on the nested (autoscaling group) stack will tell you why it failed. You can get the UUID of the nested stack by running "openstack stack resource show WebServer-Stack WebServerASG"; it's listed as the physical_resource_id. So the error is:

resources.tdhf2cznzqnd: StackValidationFailed: resources.member: Property error: member.Properties.address: Error validating value ''

So it looks like the scaling unit is a nested stack, and that it contains a resource named 'member' with a property called 'address', and the address is resolving to an empty string when it needs to be a valid IP address. This could be a problem with the template, or it could be a bug in Heat. (At the validation stage, intrinsic functions like {get_attr: } don't return valid values, but Heat ought to cope with that gracefully.) Could you attach the lb_server.yaml template?

Here is the content of the lb_server.yaml template:

~~~
heat_template_version: 2013-05-23
description: A load-balancer server
parameters:
  image:
    type: string
    description: Image used for servers
  key_name:
    type: string
    description: SSH key to connect to the servers
  flavor:
    type: string
    description: flavor used by the servers
  pool_id:
    type: string
    description: Pool to contact
  user_data:
    type: string
    description: Server user_data
  metadata:
    type: json
  network:
    type: string
    description: Network used by the server
resources:
  server:
    type: OS::Nova::Server
    properties:
      flavor: {get_param: flavor}
      image: {get_param: image}
      key_name: {get_param: key_name}
      metadata: {get_param: metadata}
      user_data: {get_param: user_data}
      user_data_format: RAW
      networks: [{network: {get_param: network} }]
  member:
    type: OS::Neutron::PoolMember
    properties:
      pool_id: {get_param: pool_id}
      address: {get_attr: [server, first_address]}
      protocol_port: 80
outputs:
  server_ip:
    description: IP Address of the load-balanced server.
    value: { get_attr: [server, first_address] }
  lb_member:
    description: LB member details.
    value: { get_attr: [member, show] }
~~~

OK, that looks like a Heat bug then... {get_attr: [server, first_address]} probably returns an empty string during validation, and the validation ought to be able to handle that, but apparently it's complaining. I wonder how it managed to create the autoscaling group in the first place without running into this issue for the initial members...

I suspect it may be failing to get the IP addresses of the _existing_ members. For resources that aren't created yet, we should always get None returned for their attribute values without even asking the resource. An empty string (which is returned by the resource itself) suggests that the resource was in the created state but getting the server's address failed for some reason. That also explains how the group could be created initially but updating it fails.

The first_address attribute is deprecated. You should probably replace that line with:

address: {get_attr: [server, networks, {get_param: network}, 0]}

That might actually resolve the problem.

Thanks Zane. Suggested the same to the customer. Awaiting the customer's response.

Zane, as per the latest update from the customer, they are still hitting the issue after making the suggested change.
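The distinction drawn above between None (a resource not yet created) and an empty string (a created resource whose address lookup failed) can be modelled with a minimal Python sketch. This is not Heat's actual code; the class and names are made up purely to illustrate the reasoning.

```python
# Minimal sketch (NOT Heat's real implementation) of the attribute
# resolution behaviour described above: a resource that has not been
# created yet resolves every attribute to None without being asked,
# while a created resource whose backing server is gone returns its
# own default value, here an empty string.

class Resource:
    def __init__(self, created, address=None):
        self.created = created
        self._address = address

    def get_attr(self, name):
        if not self.created:
            # Not created yet: placeholder value, resource never consulted.
            return None
        if name == 'first_address':
            # Created in Heat, but the address lookup failed (e.g. the
            # server was deleted out-of-band): fall back to ''.
            return self._address if self._address is not None else ''
        raise KeyError(name)

new_server = Resource(created=False)
deleted_server = Resource(created=True)  # exists in Heat, gone from Nova

assert new_server.get_attr('first_address') is None
assert deleted_server.get_attr('first_address') == ''  # rejected as an IP
```

The empty string is what then trips the pool member's address validation, which is consistent with the group creating fine initially but failing on update.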
Hello Zane, here's the updated template received from the customer:

~~~
heat_template_version: 2013-05-23
description: A load-balancer instance
parameters:
  image:
    type: string
    description: Image used for instances
  key_name:
    type: string
    description: SSH key to connect to the instances
  flavor:
    type: string
    description: flavor used by the instances
  pool_id:
    type: string
    description: Pool to contact
  user_data:
    type: string
    description: Server user_data
  metadata:
    type: json
  network:
    type: string
    description: Network used by the instance
  #security_groups:
  #  type: string
  #  description: Webinstance Security group
resources:
  server:
    type: OS::Nova::Server
    properties:
      flavor: {get_param: flavor}
      image: {get_param: image}
      key_name: {get_param: key_name}
      metadata: {get_param: metadata}
      user_data: {get_param: user_data}
      #security_groups: webserverSG
      #security_groups: [{security_groups: {get_param: security_groups}}]
      user_data_format: RAW
      networks: [{network: {get_param: network} }]
  member:
    type: OS::Neutron::PoolMember
    properties:
      pool_id: {get_param: pool_id}
      address: {get_attr: [server, networks, {get_param: network}, 0]}
      #address: {get_attr: [instance, first_address]}
      protocol_port: 80
outputs:
  # instance_ip:
  #   description: IP Address of the load-balanced instance.
  #   value: { get_attr: [instance, first_address] }
  lb_member:
    description: LB member details.
    value: { get_attr: [member, show] }
~~~

As mentioned by Vikrant, the customer is still hitting the same issue. Thanks.

OK, after reading more carefully here, I see the cause of the problem. You're deleting a server from Nova manually, but Heat doesn't know that it's missing, so when it comes to validate the template the server is not found. This causes it to return a default value for the IP address (an empty string, as it happens), and that is being rejected as a valid IP address by the pool member. I'm not sure why this would have worked in Liberty but not Mitaka. Possibly the validation became more robust in Mitaka.
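The updated template's address lookup walks a path through the server's resolved attributes: [server, networks, {get_param: network}, 0]. A small Python sketch of that kind of path traversal may help; the network name and address below are invented for illustration and do not come from the bug report.

```python
# Hypothetical sketch of resolving a get_attr path like
# [server, networks, <network-name>, 0] against a server's attribute
# data. The structure and values here are illustrative only.
attributes = {
    'networks': {
        'private-net': ['10.0.0.5'],  # list of addresses on that network
    },
    'first_address': '10.0.0.5',      # the deprecated flat attribute
}

def resolve_attr_path(attrs, path):
    """Follow each key/index in `path` into the nested attribute data."""
    value = attrs
    for key in path:
        value = value[key]
    return value

assert resolve_attr_path(attributes, ['networks', 'private-net', 0]) == '10.0.0.5'
```

If the server is missing, there is no such nested structure to walk, so the whole expression resolves to a placeholder instead of an address, which matches the failure described above.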
One thing you can do is mark the server resource that you've deleted as FAILED using the 'resource mark-unhealthy' command. That should convince the autoscaling template generator to remove that resource from the template. I'll continue investigating to see if there's a way we can avoid the error in this case.

Hello Zane, many thanks for suggesting the resource mark-unhealthy command. This suggestion has worked for the customer. After marking the deleted resource as unhealthy, a new instance was spawned automatically. Could you please advise if there is a way in which this can be incorporated in the Heat template itself? Thanks and Regards, Punit

This should get fixed in Pike by https://review.openstack.org/#/c/422983/ when it merges. For current releases... we _could_ fix the {get_attr: [instance, first_address]} attribute by changing the default value that it returns to something that will pass the IP address constraint, i.e. '0.0.0.0' instead of ''. But this attribute is already deprecated. There's no sane way to make {get_attr: [server, networks, {get_param: network}, 0]} not return None. I haven't seen the error message for this case, but I suspect it's failing in a different spot: it's a required property with a value of None (which reads as nothing specified) at a time when Heat is expecting to have resolved the real value. I can't think of anything we can do to resolve that.
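The suggestion above, returning '0.0.0.0' instead of '' as the default, works because '0.0.0.0' satisfies an IP address constraint while the empty string does not. A hedged illustration using Python's standard ipaddress module (not Heat's own constraint code; the 192.0.2.10 address is a made-up example):

```python
# Illustration of why '' fails address validation while a fallback of
# '0.0.0.0' would pass. This uses the stdlib ipaddress module as a
# stand-in for Heat's IP address constraint, which it approximates.
import ipaddress

def passes_ip_constraint(value):
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False

assert not passes_ip_constraint('')        # the StackValidationFailed case
assert passes_ip_constraint('0.0.0.0')     # proposed fallback would validate
assert passes_ip_constraint('192.0.2.10')  # an ordinary member address
```

This only helps the deprecated first_address attribute, though; as noted above, the networks-path form resolves to None, which fails for a different reason (a required property with no value at all).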
Configured the following stack templates:

~~~
heat_template_version: pike
resources:
  server:
    type: OS::Nova::Server
    properties:
      image: cirros-0.3.5-x86_64-disk
      flavor: m1.nano
      networks:
        - network: heat-net
        - subnet: heat-subnet
  value:
    type: OS::Heat::Value
    properties:
      value: {get_attr: [server, first_address]}
      type: string
~~~

~~~
heat_template_version: pike
resources:
  asg:
    type: OS::Heat::AutoScalingGroup
    properties:
      resource:
        type: server.yaml
      min_size: 2
      desired_capacity: 3
      max_size: 5
  scale_up_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: asg}
      cooldown: 60
      scaling_adjustment: 1
  scale_dn_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: asg}
      cooldown: 60
      scaling_adjustment: '-1'
outputs:
  scale_up_url:
    value: {get_attr: [scale_up_policy, alarm_url]}
  scale_dn_url:
    value: {get_attr: [scale_dn_policy, alarm_url]}
~~~

A server was deleted, and by creating this stack three new servers were scaled with the server stack attributes.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462