Bug 1208373 - 503 error in CFME when connecting RHELOSP with no Swift service
Summary: 503 error in CFME when connecting RHELOSP with no Swift service
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Providers
Version: 5.3.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: GA
Target Release: 5.5.0
Assignee: Ladislav Smola
QA Contact: Milan Falešník
URL:
Whiteboard: openstack
Depends On: 1208579
Blocks:
Reported: 2015-04-02 06:35 UTC by Jerome Marc
Modified: 2015-12-08 13:04 UTC (History)

Fixed In Version: 5.5.0.12
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-12-08 13:04:45 UTC
Category: ---
Cloudforms Team: ---
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2015:2551 | 0 | normal | SHIPPED_LIVE | Moderate: CFME 5.5.0 bug fixes and enhancement update | 2015-12-08 17:58:09 UTC

Description Jerome Marc 2015-04-02 06:35:16 UTC
Description of problem:
Deploying RHELOSP5 (and 6) with Foreman no longer provides the option to deploy Swift. In 5 the Swift proxy is deployed, but it is not in front of anything and just throws errors (6 could behave the same way).

When connecting CFME to an RHELOSP5 install (and presumably 6 as well) the initial state gathering (Refresh power relationships & states) fails on the swift service.

That failure can be either a 503 error (because the Swift proxy is up but there are no devices behind it) or a flat-out connection failure (if the Swift proxy is down).

Looking at /var/www/miq/vmdb/app/models/ems_refresh/parsers/openstack.rb, it looks like CFME is hardcoded to go and fetch storage resources.

CFME reports all details once the storage resource lines in that file are commented out:

       # @storage_service      = @os_handle.detect_storage_service
       # @storage_service_name = @os_handle.storage_service_name 

Could we ensure that CFME handles this error more gracefully (e.g. without throwing an exception and abandoning the inspection)? Alternatively, CFME could ask Keystone which services are offered and running before querying them.
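A minimal sketch of the more graceful handling requested above, assuming the two calls from openstack.rb can simply be wrapped in a rescue. FakeHandle and detect_storage are hypothetical names used only for illustration; the real handle class and its error types are not shown in this report.

```ruby
# Hypothetical stand-in for the OpenStack handle (assumption, not CFME code).
class FakeHandle
  def initialize(swift_up)
    @swift_up = swift_up
  end

  # Mirrors the call named in the report; fails when Swift is unusable.
  def detect_storage_service
    raise IOError, "503 Service Unavailable" unless @swift_up
    :swift_connection
  end

  def storage_service_name
    :swift
  end
end

# Returns [service, name], or [nil, nil] when the storage service cannot be
# reached. The failure is recorded as a warning instead of aborting the refresh.
def detect_storage(os_handle, warnings = [])
  [os_handle.detect_storage_service, os_handle.storage_service_name]
rescue StandardError => err
  warnings << "storage service unavailable, skipping: #{err.message}"
  [nil, nil]
end
```

With this shape, the parser would have to tolerate a nil storage service further down, which is the part that likely needs the most care in real code.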

Version-Release number of selected component (if applicable):
CloudForms 3.1
CFME 5.3.3.2.20150217120931_a465215 

How reproducible:
100%

Comment 2 Greg Blomquist 2015-04-02 14:47:33 UTC
Jerome,

CFME actually does query Keystone for the list of available services.  The OpenStack installer is leaving the OpenStack environment in an invalid state.  It sets up a Swift proxy, it registers the proxy with Keystone, but it never actually sets up the Swift service.

This means that when CFME asks Keystone if Swift is available, Keystone reports back that Swift *is* available, then when CFME attempts to connect to Swift, the service is not actually available and we get back a 503 from the proxy.
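The mismatch described here (registered in the catalog, yet not usable) can be illustrated with a small sketch. Endpoint, usable_endpoints, the "controller" host name, and the probe lambda are all assumptions made for the example; they are not Keystone or CFME APIs.

```ruby
# Hypothetical catalog entry; the field names are illustrative only.
Endpoint = Struct.new(:service, :url)

# Keeps only the catalog entries whose endpoint answers a probe with a status
# below 500. `probe` is any callable that takes a URL and returns an HTTP
# status code, or raises when the connection fails.
def usable_endpoints(catalog, probe)
  catalog.select do |endpoint|
    begin
      probe.call(endpoint.url) < 500
    rescue StandardError
      false
    end
  end
end

catalog = [
  Endpoint.new("compute", "http://controller:8774"),
  Endpoint.new("object-store", "http://controller:8080")
]

# Simulated probe: the Swift proxy on port 8080 is registered but answers 503.
probe = ->(url) { url.end_with?(":8080") ? 503 : 200 }

usable = usable_endpoints(catalog, probe).map(&:service)
```

Here the catalog lists two services, but only "compute" survives the probe, which is the situation the installer leaves behind.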

It looks like you used StayPuft to install OpenStack.  Can you confirm that?

We will have to open a bug against StayPuft for this situation.

Comment 3 August Simonelli 2015-04-06 22:31:30 UTC
Hi Greg ... I worked with Jerome on this and did the installs, etc. 

Yes, it is indeed the openstack installer (aka staypuft) used to install. I agree that the installer leaves openstack in an unfinished state but not invalid. Since we no longer ship/encourage a local disk based swift we basically leave the proxy waiting for a backend to be added. 

However, in troubleshooting the error I did a few things that make me think it wouldn't have mattered anyway ... 

First I turned off the swift proxy entirely, but CFME got no-connect errors (not 503s) and still failed to connect the services.

Next I tried removing the swift endpoint altogether (which I agree should not be there if the user doesn't want swift). Unfortunately, even with the endpoint gone, CFME still seems to go after the controller's IP on port 8080 (and gets 503s or no-connect errors). 

Based on the code workaround I found (which Jerome lists above) as well as my tests, it seems more like CFME is just assuming storage on 8080 and asking for it.

Regardless of how CFME is getting the info, it'd be great if a failure on one service didn't stop the whole thing in its tracks. Is there a way it can just skip it and still connect up CFME to OpenStack?

Comment 4 Ladislav Smola 2015-07-20 08:09:33 UTC
@greg can we turn this into an RFE: make certain services in refresh optional (Swift, Cinder, ...)? This is how it is designed: all refreshed services are mandatory, so we will need to refactor the refreshes a bit.

Or is this bug needed because we want to backport a small fix?

Comment 5 Greg Blomquist 2015-07-22 01:59:36 UTC
@ladas, I'm fine with just having the openstack_handle deal with errors from the services.

A nice longer-term fix might be to keep a relatively up-to-date status of the services reported by Keystone.  I think we have this for TripleO-based OpenStack Infrastructure installations.  However, if we had this for OSP-based OpenStack Cloud installations, we could simply look at our cached status of the services before trying to connect. 
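One way the cached-status idea could look, as a rough sketch only. ServiceStatusCache, its TTL, and the injectable clock are assumptions made for illustration; nothing here reflects how the TripleO-based status tracking is actually implemented.

```ruby
# Hypothetical cache of per-service availability (not an existing CFME class).
class ServiceStatusCache
  # ttl_seconds: how long a recorded status is trusted.
  # clock: callable returning the current time in seconds; injectable for tests.
  def initialize(ttl_seconds, clock = -> { Time.now.to_f })
    @ttl = ttl_seconds
    @clock = clock
    @entries = {}
  end

  # Stores the latest known availability of a service.
  def record(service, available)
    @entries[service] = { available: available, at: @clock.call }
  end

  # Returns true or false for a fresh entry, or nil when the status is
  # unknown or older than the TTL (the caller then has to probe again).
  def available?(service)
    entry = @entries[service]
    return nil if entry.nil? || @clock.call - entry[:at] > @ttl
    entry[:available]
  end
end
```

A refresh could consult available?("swift") first and skip the connection attempt on false, falling back to a live check when the answer is nil.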

I suggest going with the easier fix first, then we could have a longer term fix come later.

Comment 6 August Simonelli 2015-08-20 08:40:55 UTC
I've just done a CFME to RHELOSP7 install and this is still an issue. I've had to do this workaround to get past swift as there is a swift proxy but no backend. Is there a plan to remove this requirement?

Comment 7 Ladislav Smola 2015-08-20 09:12:48 UTC
Yes, I will do a quick fix for Cinder and Swift, for deployments without storage. This is only medium priority, though, meaning it is far down my TODO list. :-) Talk to Greg about the priority if you think this is needed quickly.

Comment 8 Marius Cornea 2015-10-09 08:45:06 UTC
I did the following test on CFME 5.4 with OSP7:

1. Stop openstack-swift-proxy.service on controller node. Refresh failed with end of file reached (EOFError) error message. This is probably due to haproxy listening on port 8080, otherwise it would have been a timeout. 
http://paste.openstack.org/show/475836/

2. Stop all other Swift services except the openstack-swift-proxy.service. This resulted in a failed refresh and a 503 error message. 
http://paste.openstack.org/show/475835/

Refresh completed fine in both scenarios after deleting the Swift Keystone endpoint.
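The failure modes observed in this test can be told apart by exception type, as in the sketch below. HttpStatusError and classify_swift_failure are hypothetical names; real client libraries raise their own error classes, so the mapping is an assumption rather than what CFME does.

```ruby
require "timeout"

# Hypothetical error carrying an HTTP status code (assumption for the example).
class HttpStatusError < StandardError
  attr_reader :status

  def initialize(status)
    @status = status
    super("HTTP #{status}")
  end
end

# Maps an exception from a Swift request to the scenario it most likely
# indicates, following the two test cases described above.
def classify_swift_failure(error)
  case error
  when EOFError
    # Scenario 1: proxy stopped, haproxy still accepts and then closes.
    :proxy_down_behind_haproxy
  when Timeout::Error
    # Nothing listening on port 8080 at all.
    :no_listener
  when HttpStatusError
    # Scenario 2: proxy running, but no Swift services behind it.
    error.status == 503 ? :proxy_up_without_backend : :other_http_error
  else
    :unknown
  end
end
```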

Comment 9 Ladislav Smola 2015-11-04 10:08:19 UTC
Changing to 5.6. @Greg, do we need to backport this to 5.4? As stated in comment 8, 'Refresh completed fine in both scenarios after deleting the Swift Keystone endpoint.' That is the actual fix for the issue described in comment 3.


https://github.com/ManageIQ/manageiq/pull/5274 brings a nicely formatted exception. That probably doesn't need to be backported to 5.4.

Comment 10 Ladislav Smola 2015-11-04 10:28:26 UTC
@marius, to test the nicely formatted exception for the scenarios in comment 8, you will also need this PR: https://github.com/ManageIQ/manageiq/pull/5275 . Swift was missing from the generic handling.

Comment 11 CFME Bot 2015-11-13 16:20:29 UTC
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/7f18a21fe0b061c41920b9440b5f72bab7d02984

commit 7f18a21fe0b061c41920b9440b5f72bab7d02984
Author:     Ladislav Smola <lsmola>
AuthorDate: Wed Nov 4 10:40:21 2015 +0100
Commit:     Ladislav Smola <lsmola>
CommitDate: Thu Nov 5 08:40:10 2015 +0100

    OpenStack check all list related exceptions
    
    OpenStack check all list related exceptions and throw specific
    exception, that has a nice format message. This message shows
    what exactly is wrong in the refresh last state.
    
    Fixes BZ
    https://bugzilla.redhat.com/show_bug.cgi?id=1208373

 gems/pending/openstack/openstack_handle/handled_list.rb | 12 ++++++++++++
 1 file changed, 12 insertions(+)
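
The idea in this commit message (catch any failure while listing and re-raise it with a readable message) might look roughly like the sketch below. ServiceListError, handled_list_sketch, and the message wording are assumptions; the actual 12 lines added to handled_list.rb are not shown in this report.

```ruby
# Hypothetical exception class; the name used in the real commit is not shown here.
class ServiceListError < StandardError; end

# Runs the block that lists a collection. Any failure is re-raised with the
# service and collection names in the message, so the last refresh state
# shows what exactly went wrong.
def handled_list_sketch(service_name, collection_name)
  yield
rescue StandardError => err
  raise ServiceListError,
        "Unable to list #{collection_name} from #{service_name} service: " \
        "#{err.class}: #{err.message}"
end
```

A successful call simply returns the block's result, for example handled_list_sketch("Compute", "servers") { [1, 2] } returns [1, 2].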

Comment 13 CFME Bot 2015-11-23 15:03:59 UTC
New commit detected on cfme/5.5.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=7abd5b7af96cf69d7b4a3b0bd5cbe1036407b000

commit 7abd5b7af96cf69d7b4a3b0bd5cbe1036407b000
Author:     Ladislav Smola <lsmola>
AuthorDate: Wed Nov 4 10:40:21 2015 +0100
Commit:     Ladislav Smola <lsmola>
CommitDate: Mon Nov 23 08:37:35 2015 +0100

    OpenStack check all list related exceptions
    
    OpenStack check all list related exceptions and throw specific
    exception, that has a nice format message. This message shows
    what exactly is wrong in the refresh last state.
    
    Fixes BZ
    https://bugzilla.redhat.com/show_bug.cgi?id=1208373

 gems/pending/openstack/openstack_handle/handled_list.rb | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comment 14 CFME Bot 2015-11-23 15:04:05 UTC
New commit detected on cfme/5.5.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=de893177b8a6ce0418e4f38618cadbb458558768

commit de893177b8a6ce0418e4f38618cadbb458558768
Merge: e527788 737daf9
Author:     Greg Blomquist <gblomqui>
AuthorDate: Mon Nov 23 09:21:01 2015 -0500
Commit:     Greg Blomquist <gblomqui>
CommitDate: Mon Nov 23 09:21:01 2015 -0500

    Merge branch 'bz1208373' into '5.5.z'
    
    Bz1208373
    
    Check presence of required service and throw nicely formatted exception. When list method fails, throw nicely formatted exception.
    
    Convert swift refresh to handled_list. Requires project(tenant) to be set on each object, since the object obtained from API doesn't contain the tenant owner.
    
    Fixes BZ
    https://bugzilla.redhat.com/show_bug.cgi?id=1208373
    
    Clean cherry pick of:
    https://github.com/ManageIQ/manageiq/pull/5274
    
    Conflict in cherry-pick of:
    https://github.com/ManageIQ/manageiq/pull/5275
    
    Conflict was in commit ff5a86898d18325b3ecdcbd042528ae673d60364, that has refreshed VCRs. This commit was replaced by freshly refreshed VCRs
    
    See merge request !518

 .../openstack/cloud_manager/refresh_parser.rb      |    16 +
 .../openstack/infra_manager/refresh_parser.rb      |    20 +
 .../openstack/refresh_parser_common/objects.rb     |    11 +-
 gems/pending/openstack/openstack_handle/handle.rb  |     7 +-
 .../openstack/openstack_handle/handled_list.rb     |    12 +
 .../spec/openstack/openstack_handle/handle_spec.rb |     3 +-
 gems/pending/util/miq-exception.rb                 |     8 +
 .../openstack/cloud_manager/refresh_spec_common.rb |     4 +
 .../cloud_manager/refresher_rhos_grizzly.yml       |   944 +-
 .../cloud_manager/refresher_rhos_havana.yml        |  1869 +-
 .../cloud_manager/refresher_rhos_icehouse.yml      |  8337 +++----
 .../cloud_manager/refresher_rhos_juno.yml          |  2265 +-
 .../cloud_manager/refresher_rhos_kilo.yml          |  8797 ++++----
 .../refresher_rhos_kilo_keystone_v3.yml            |  3331 +--
 .../infra_manager/refresher_rhos_juno.yml          | 21823 ++++++++-----------
 15 files changed, 22640 insertions(+), 24807 deletions(-)

Comment 15 Milan Falešník 2015-11-26 14:53:39 UTC
Checked in 5.5.0.12: I deployed my OpenStack instance without Swift and it refreshed.

Comment 17 errata-xmlrpc 2015-12-08 13:04:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:2551

