Description of problem:
Deploying RHELOSP5 (and 6) with Foreman no longer provides the option to deploy Swift. In 5 the Swift proxy is deployed, but it is not in front of anything and just throws errors (6 could behave the same way). When connecting CFME to an RHELOSP5 install (and presumably 6 as well), the initial state gathering (Refresh power relationships & states) fails on the Swift service. That failure can be either a 503 error (because the Swift proxy is up but there are no devices behind it) or a flat-out connection failure (if the Swift proxy is down).
Looking at /var/www/miq/vmdb/app/models/ems_refresh/parsers/openstack.rb, it looks like CFME is simply hardcoded to go and get storage resources. CFME reports all details when the storage resource lines in that file are commented out:
# @storage_service = @os_handle.detect_storage_service
# @storage_service_name = @os_handle.storage_service_name
Could we ensure that CFME handles the error in a better way (e.g. not throwing an exception and abandoning the inspection), or asks Keystone which services are offered and running before querying them?
Version-Release number of selected component (if applicable):
CloudForms 3.1
CFME 5.3.3.2.20150217120931_a465215
How reproducible: 100%
Jerome, CFME actually does query Keystone for the list of available services. The OpenStack installer is leaving the OpenStack environment in an invalid state. It sets up a Swift proxy and registers the proxy with Keystone, but it never actually sets up the Swift service. This means that when CFME asks Keystone whether Swift is available, Keystone reports back that Swift *is* available; then, when CFME attempts to connect to Swift, the service is not actually available and we get back a 503 from the proxy. It looks like you used StayPuft to install OpenStack. Can you confirm that? We will have to open a bug against StayPuft for this situation.
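The gap described above, between a service being registered in Keystone and actually answering, can be illustrated with a minimal Ruby sketch. This is not CFME code: `service_usable?`, `ServiceUnavailable`, the catalog array and the probe lambda are all hypothetical stand-ins, and no real Keystone or Swift call is made.

```ruby
# Hypothetical stand-in for a "503 from the proxy" error; not a CFME or fog class.
ServiceUnavailable = Class.new(StandardError)

# catalog: service types as Keystone would report them (assumed shape).
# probe:   callable that performs one cheap request against the service.
def service_usable?(catalog, type, probe)
  return false unless catalog.include?(type) # not registered at all
  probe.call(type)                           # registered: check that it answers
  true
rescue ServiceUnavailable, EOFError, Errno::ECONNREFUSED
  false                                      # registered, but not answering
end

catalog = ["compute", "image", "object-store"]
probe = lambda do |type|
  # Simulates a Swift proxy that is registered but has no backend.
  raise ServiceUnavailable, "503 from proxy" if type == "object-store"
end

puts service_usable?(catalog, "compute", probe)      # true
puts service_usable?(catalog, "object-store", probe) # false
puts service_usable?(catalog, "volume", probe)       # false
```

The point of the sketch is only that a catalog entry alone does not prove the service works; a cheap probe call is one possible way to tell the two cases apart.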
Hi Greg ... I worked with Jerome on this and did the installs, etc. Yes, the OpenStack installer (aka StayPuft) was indeed used for the install. I agree that the installer leaves OpenStack in an unfinished state, but not an invalid one. Since we no longer ship or encourage a local-disk-based Swift, we basically leave the proxy waiting for a backend to be added. However, in troubleshooting the error I did a few things that make me think it wouldn't have mattered anyway ... First I turned off the Swift proxy entirely, but CFME got no-connect errors (not 503s) and still failed to connect the services. Next I tried removing the Swift endpoint altogether (which I agree should not be there if the user doesn't want Swift). Unfortunately, even with the endpoint gone, CFME still seems to go after the controller's IP on port 8080 (and gets 503s or no-connect errors). Based on the code workaround I found (which Jerome lists above) as well as my tests, it seems more like CFME is just assuming storage on 8080 and asking for it. Regardless of how CFME is getting the info, it'd be great if a failure on one service didn't stop the whole thing in its tracks. Is there a way it can just skip it and still connect CFME to OpenStack?
@greg, can we turn this into an RFE: make certain services in refresh optional (Swift, Cinder, ...)? This is how it is designed: all refreshed services are mandatory, so we will need to refactor the refreshes a bit. Or is this bug needed because we want to backport a small fix?
@ladas, I'm fine with just having the openstack_handle deal with errors from the services. A nice longer-term fix might be to keep a relatively up-to-date status of the services reported by Keystone. I think we have this for TripleO-based OpenStack Infrastructure installations. If we had it for OSP-based OpenStack Cloud installations as well, we could simply look at our cached status of the services before trying to connect. I suggest going with the easier fix first; the longer-term fix can come later.
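As a rough illustration of the "easier fix" (tolerating errors from individual services), here is a minimal Ruby sketch. It is an assumption-laden toy, not the openstack_handle implementation: `refresh_inventory`, the `collectors` hash and the `optional:` parameter are hypothetical names introduced only for this example.

```ruby
# Hypothetical sketch: collect inventory per service and skip optional
# services whose list call fails, instead of aborting the whole refresh.
def refresh_inventory(collectors, optional: [])
  inventory = {}
  warnings  = []
  collectors.each do |name, collector|
    begin
      inventory[name] = collector.call
    rescue StandardError => e
      raise unless optional.include?(name) # mandatory services still abort
      warnings << "#{name}: #{e.class}: #{e.message}"
      inventory[name] = []                 # record "nothing found" and move on
    end
  end
  [inventory, warnings]
end

collectors = {
  compute: -> { ["vm-1", "vm-2"] },
  storage: -> { raise EOFError, "end of file reached" } # simulated broken Swift
}

inventory, warnings = refresh_inventory(collectors, optional: [:storage])
puts inventory.inspect
puts warnings.inspect
```

In this sketch a failing optional service yields an empty list plus a warning, while a failure in a service not listed as optional is re-raised unchanged, so the refresh still stops for services that are genuinely required.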
I've just done a CFME to RHELOSP7 install and this is still an issue. I've had to do this workaround to get past swift as there is a swift proxy but no backend. Is there a plan to remove this requirement?
Yes, I will do a quick fix for Cinder and Swift, for deployments without storage. This is just a medium priority, though, meaning it's far down my TODO list. :-) Talk to Greg about the priority if you think this is needed quickly.
I did the following test on CFME 5.4 with OSP7:
1. Stop openstack-swift-proxy.service on controller node. Refresh failed with end of file reached (EOFError) error message. This is probably due to haproxy listening on port 8080, otherwise it would have been a timeout. http://paste.openstack.org/show/475836/
2. Stop all other Swift services except the openstack-swift-proxy.service. This resulted in a failed refresh and a 503 error message. http://paste.openstack.org/show/475835/
Refresh completed fine in both scenarios after deleting the Swift Keystone endpoint.
Changing to 5.6. @Greg, do we need to backport this to 5.4? As stated in comment 4, 'Refresh completed fine in both scenarios after deleting the Swift Keystone endpoint.' That is the actual fix for the issue stated in comment 3. https://github.com/ManageIQ/manageiq/pull/5274 brings a nicely formatted exception, which probably doesn't need to be backported to 5.4.
@marius, to test the nice exception for the scenarios in comment 8, you will also need this PR: https://github.com/ManageIQ/manageiq/pull/5275 . Swift was missing from the generic handling.
New commit detected on ManageIQ/manageiq/master:
https://github.com/ManageIQ/manageiq/commit/7f18a21fe0b061c41920b9440b5f72bab7d02984
commit 7f18a21fe0b061c41920b9440b5f72bab7d02984
Author: Ladislav Smola <lsmola>
AuthorDate: Wed Nov 4 10:40:21 2015 +0100
Commit: Ladislav Smola <lsmola>
CommitDate: Thu Nov 5 08:40:10 2015 +0100
OpenStack check all list related exceptions
OpenStack check all list related exceptions and throw specific exception, that has a nice format message. This message shows what exactly is wrong in the refresh last state.
Fixes BZ https://bugzilla.redhat.com/show_bug.cgi?id=1208373
gems/pending/openstack/openstack_handle/handled_list.rb | 12 ++++++++++++
1 file changed, 12 insertions(+)
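The idea in this commit message (catch whatever a list call raises and re-raise one specific exception with a readable message) might look roughly like the following Ruby sketch. It is not the code from handled_list.rb: `ServiceListError`, `handled_list` and the message format are hypothetical names chosen for illustration.

```ruby
# Hypothetical exception class; the real one is defined in the commit above.
class ServiceListError < StandardError; end

# Runs the given block (a list call against a service) and converts any
# failure into a single exception type whose message names the service and
# the collection that could not be listed.
def handled_list(service_name, collection_name)
  yield
rescue StandardError => e
  raise ServiceListError,
        "Unable to list #{collection_name} from #{service_name} service: " \
        "#{e.class}: #{e.message}"
end

begin
  # Simulates the EOFError seen when the Swift proxy is stopped.
  handled_list("Swift", "directories") { raise EOFError, "end of file reached" }
rescue ServiceListError => e
  puts e.message
end
```

With a wrapper like this, the refresh status can show which service and which collection failed rather than a bare low-level error.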
http://gitlab.cloudforms.lab.eng.rdu2.redhat.com/cloudforms/cfme/merge_requests/518
New commit detected on cfme/5.5.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=7abd5b7af96cf69d7b4a3b0bd5cbe1036407b000
commit 7abd5b7af96cf69d7b4a3b0bd5cbe1036407b000
Author: Ladislav Smola <lsmola>
AuthorDate: Wed Nov 4 10:40:21 2015 +0100
Commit: Ladislav Smola <lsmola>
CommitDate: Mon Nov 23 08:37:35 2015 +0100
OpenStack check all list related exceptions
OpenStack check all list related exceptions and throw specific exception, that has a nice format message. This message shows what exactly is wrong in the refresh last state.
Fixes BZ https://bugzilla.redhat.com/show_bug.cgi?id=1208373
gems/pending/openstack/openstack_handle/handled_list.rb | 12 ++++++++++++
1 file changed, 12 insertions(+)
New commit detected on cfme/5.5.z:
https://code.engineering.redhat.com/gerrit/gitweb?p=cfme.git;a=commitdiff;h=de893177b8a6ce0418e4f38618cadbb458558768
commit de893177b8a6ce0418e4f38618cadbb458558768
Merge: e527788 737daf9
Author: Greg Blomquist <gblomqui>
AuthorDate: Mon Nov 23 09:21:01 2015 -0500
Commit: Greg Blomquist <gblomqui>
CommitDate: Mon Nov 23 09:21:01 2015 -0500
Merge branch 'bz1208373' into '5.5.z'
Bz1208373
Check presence of required service and throw nicely formatted exception.
When list method fails, throw nicely formatted exception.
Convert swift refresh to handled_list. Requires project(tenant) to be set on each object, since the object obtained from APi doesn't contain the tenant owner.
Fixes BZ https://bugzilla.redhat.com/show_bug.cgi?id=1208373
Clean cherry pick of: https://github.com/ManageIQ/manageiq/pull/5274
Conflict in cherry-pick of: https://github.com/ManageIQ/manageiq/pull/5275
Conflict was in commit ff5a86898d18325b3ecdcbd042528ae673d60364, that has refreshed VCRs. This commit was replaced by freshly refreshed VCRs
See merge request !518
.../openstack/cloud_manager/refresh_parser.rb | 16 +
.../openstack/infra_manager/refresh_parser.rb | 20 +
.../openstack/refresh_parser_common/objects.rb | 11 +-
gems/pending/openstack/openstack_handle/handle.rb | 7 +-
.../openstack/openstack_handle/handled_list.rb | 12 +
.../spec/openstack/openstack_handle/handle_spec.rb | 3 +-
gems/pending/util/miq-exception.rb | 8 +
.../openstack/cloud_manager/refresh_spec_common.rb | 4 +
.../cloud_manager/refresher_rhos_grizzly.yml | 944 +-
.../cloud_manager/refresher_rhos_havana.yml | 1869 +-
.../cloud_manager/refresher_rhos_icehouse.yml | 8337 +++----
.../cloud_manager/refresher_rhos_juno.yml | 2265 +-
.../cloud_manager/refresher_rhos_kilo.yml | 8797 ++++----
.../cloud_manager/refresher_rhos_kilo_keystone_v3.yml | 3331 +--
.../infra_manager/refresher_rhos_juno.yml | 21823 ++++++++-----------
15 files changed, 22640 insertions(+), 24807 deletions(-)
Checked in 5.5.0.12: I deployed my OpenStack instance without Swift and it refreshed.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2015:2551