Bug 1236372
Summary: | HAProxy health check for nova_ec2 fails for all backend nodes | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Marius Cornea <mcornea> |
Component: | rhosp-director | Assignee: | Jiri Stransky <jstransk> |
Status: | CLOSED ERRATA | QA Contact: | Giulio Fidente <gfidente> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 7.0 (Kilo) | CC: | abeekhof, achernet, bperkins, dmacpher, eglynn, fdinitto, gfidente, hbrock, jstransk, kbasil, mburns, nbarcet, rhel-osp-director-maint, sgordon |
Target Milestone: | --- | Keywords: | Triaged, ZStream |
Target Release: | 8.0 (Liberty) | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | A misconfiguration of the health check for Nova EC2 API caused HAProxy to believe the API was down. This meant the API was unreachable through HAProxy. This fix corrects the health check to query the API service state correctly. Now the Nova EC2 API is reachable through HAProxy. |
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2016-04-07 21:37:52 UTC | Type: | Bug |
Description
Marius Cornea
2015-06-28 10:53:36 UTC
Giulio, please let us know once you have had a chance to look at this to assess whether it's really a blocking issue.

JK. Jiri, please take a look and let us know.

Thanks Marius for excellent info in the bug description. EC2 API returns 400 unless authentication details are provided, and HAProxy then thinks there's some problem with the service. However, the core issue for us here is that EC2 API probably shouldn't be running at all, as it's not present in the Pacemaker HA docs and we don't have it pacemakerized. It's getting started and haproxied because the puppet modules take over the defaults from non-pacemaker deployment. I don't think having it running would negatively affect the other services, but we probably shouldn't run services which we don't support. I think disabling it should be easy.

(Correction of my comment #5 -- we do pacemakerize the EC2 API because the APIs are served by a single service openstack-nova-api, it's just listening on multiple ports.)

The Pacemaker HA docs probably don't disable EC2 API because they don't set the enabled_apis config option and the default is "ec2,osapi_compute,metadata" [1]. But there's no mention of port 8773 in the loadbalancer doc [2]. So in the ref arch the EC2 API is enabled (accessible on physical IPs) but not HAProxied (not accessible on VIP)? I can update the config to do exactly that but I'm not sure if that's the expected correct state. Andrew can you please shed some light on this? Maybe we should disable the EC2 API entirely, or add it to HAProxy?

[1] https://github.com/beekhof/osp-ha-deploy/blob/f73eec96ddd9c7f2c85dbb0348ff909f144631ec/pcmk/nova.scenario
[2] https://github.com/beekhof/osp-ha-deploy/blob/f73eec96ddd9c7f2c85dbb0348ff909f144631ec/pcmk/lb.scenario

(In reply to Jiri Stransky from comment #7)
> (Correction of my comment #5 -- we do pacemakerize the EC2 API because the
> APIs are served by a single service openstack-nova-api, it's just listening
> on multiple ports.)
>
> The Pacemaker HA docs probably don't disable EC2 API because they don't set
> the enabled_apis config option and the default is
> "ec2,osapi_compute,metadata" [1]. But there's no mention of port 8773 in the
> loadbalancer doc [2]. So in the ref arch the EC2 API is enabled (accessible
> on physical IPs) but not HAProxied (not accessible on VIP)? I can update the
> config to do exactly that but i'm not sure if that's the expected correct
> state. Andrew can you please shed some light on this? Maybe we should
> disable the EC2 API entirely, or add it to HAProxy?

I can narrow the question down. Do we want to support the EC2 API.

yes - add port 8773 to the load balancer.
no - disable EC2 with the enabled_apis config option.

The fact that EC2 is enabled yet not load balanced indicates that this was an oversight on our part. The decision as to whether or not we are interested in exposing the EC2 API is not something I can answer.

Talked to jayg on daily scrum - Astapor doesn't expose EC2 API because it doesn't behave well under HA. There's no port 8773 in Astapor's Nova loadbalancer manifest either [1], maybe the right thing could be to stay consistent with Astapor then.

[1] https://github.com/redhat-openstack/astapor/blob/d793eeb5f559874bab95177189aacbdcf06c092e/puppet/modules/quickstack/manifests/load_balancer/nova.pp

> I can narrow the question down. Do we want to support the EC2 API.
>
> yes - add port 8773 to the load balancer.
> no - disable EC2 with the enabled_apis config option.
>
> The fact that EC2 is enabled yet not load balanced indicates that this was
> an oversight on our part. The decision as to whether or not we are
> interested in exposing the EC2 API is not something I can answer.
The EC2 API is supported from a Nova point of view, albeit deprecated at this point in the hope that we will eventually move to the newer out-of-tree EC2 API implementation, which includes broader coverage of the relevant APIs. We do have a number of customers that use it.
I do not however believe this is a release blocker, as long as at worst we release with a similar level of inclusivity with regards to the EC2 API as we did in the OpenStack Platform 6 deployment architecture.
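
Both options raised in this thread (load balance port 8773, or disable EC2) hinge on the value of the enabled_apis option. The following is a minimal Python sketch of that decision, assuming the comma-separated default quoted earlier in the thread. The helper names are hypothetical and are not part of Nova or the puppet modules.

```python
from typing import List

# Default value of enabled_apis as quoted in this bug thread.
DEFAULT_ENABLED_APIS = "ec2,osapi_compute,metadata"


def parse_enabled_apis(value: str = DEFAULT_ENABLED_APIS) -> List[str]:
    """Split a comma-separated enabled_apis value into a list of API names."""
    return [api.strip() for api in value.split(",") if api.strip()]


def ec2_is_enabled(value: str = DEFAULT_ENABLED_APIS) -> bool:
    """Return True when the EC2 API appears in the enabled_apis value."""
    return "ec2" in parse_enabled_apis(value)


def without_ec2(value: str = DEFAULT_ENABLED_APIS) -> str:
    """Return an enabled_apis value with the EC2 API removed (the "no" option)."""
    return ",".join(api for api in parse_enabled_apis(value) if api != "ec2")


if __name__ == "__main__":
    print(ec2_is_enabled())   # EC2 is on under the default
    print(without_ec2())      # value to set if EC2 should be disabled
```

If `ec2_is_enabled()` is true, the "yes" option applies and port 8773 needs a load balancer entry. Otherwise the value from `without_ec2()` is what the "no" option would write to the Nova configuration.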
Done upstream. Can be backported once we get acks.

Using openstack-puppet-modules-7.0.16-1.el7ost.noarch.rpm the HAProxy listener is not using httpchk anymore for the nova_ec2 listener; backends are still seen DOWN though because we don't enable the ec2 API by default. I think the BZ can be closed as the problem won't be seen anymore if the service is enabled.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0604.html
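
For illustration, the original failure can be modeled in a few lines. By default, HAProxy's `option httpchk` treats only 2xx and 3xx responses as healthy, so the 400 that the EC2 API returns to an unauthenticated request marks every backend DOWN. This is a minimal Python sketch of that verdict, not HAProxy's implementation, and the node names are hypothetical.

```python
from typing import Dict


def httpchk_is_up(status_code: int) -> bool:
    """Mimic the default `option httpchk` verdict: 2xx and 3xx pass."""
    return 200 <= status_code < 400


def backend_states(responses: Dict[str, int]) -> Dict[str, str]:
    """Map each backend name to "UP" or "DOWN" from its check response code."""
    return {
        name: "UP" if httpchk_is_up(code) else "DOWN"
        for name, code in responses.items()
    }


if __name__ == "__main__":
    # Hypothetical controller names; 400 is the status reported in this bug
    # for EC2 API requests sent without authentication details.
    ec2_responses = {"controller-0": 400, "controller-1": 400, "controller-2": 400}
    print(backend_states(ec2_responses))
```

Under this model, a service that answers 400 to an unauthenticated probe can never pass the HTTP check, which is consistent with the fix removing httpchk from the nova_ec2 listener.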