Bug 809621 - Conductor shouldn't attempt to deploy if configserver is unavailable
Status: CLOSED ERRATA
Product: CloudForms Cloud Engine
Classification: Red Hat
Component: aeolus-conductor
Version: 1.0.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Matt Wagner
QA Contact: wes hayutin
Keywords: Triaged, ZStream
Depends On:
Blocks: 826116
Reported: 2012-04-03 16:17 EDT by James Laska
Modified: 2014-08-17 18:27 EDT

Doc Type: Bug Fix
Doc Text:
Launching an instance while the configserver was inactive created instances that remained in a pending state indefinitely and could not be deleted. This bug fix updates deployment.rb so that the configserver returns a 503 error. As a result, instances receive a create_failed status when the configserver is not reachable.
Clones: 826116
Last Closed: 2012-12-04 10:02:36 EST
Type: Bug

External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2012:1516 normal SHIPPED_LIVE CloudForms Cloud Engine 1.1 update 2012-12-04 14:51:45 EST

Description James Laska 2012-04-03 16:17:29 EDT
Description of problem:

If a working configserver becomes inactive, any attempts to launch a new application deployment will result in stuck deployments.  The stuck deployments cannot be stopped or deleted.  To prevent this, conductor should check to see if the configserver is operational, before launching.
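A pre-launch check of the kind requested here could look roughly like the following. This is only a minimal sketch, not code from aeolus-conductor: the helper names `config_server_reachable?` and `launch_allowed?`, the `:host`/`:port` hash keys, and the 5-second default timeout are all assumptions made for illustration. It only tests that a TCP connection can be opened, which would catch the "No route to host" case but not a configserver that accepts connections and then misbehaves.

```ruby
require 'socket'

# Hypothetical helper (not part of aeolus-conductor): returns true when a TCP
# connection to the configserver can be opened within `timeout` seconds.
def config_server_reachable?(host, port, timeout: 5)
  Socket.tcp(host, port, connect_timeout: timeout) { |_sock| true }
rescue SystemCallError, SocketError, IOError
  # Covers "No route to host" (Errno::EHOSTUNREACH), refused connections,
  # connect timeouts, and name-resolution failures.
  false
end

# Hypothetical guard a launch path could call before creating any instances.
# `config_server` is assumed to be nil or a hash with :host and :port keys.
def launch_allowed?(config_server)
  return true if config_server.nil? # deployment does not use a configserver

  config_server_reachable?(config_server[:host], config_server[:port])
end
```

Refusing the launch up front avoids creating a deployment record at all, which is the first of the two outcomes asked for in this report.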

Version-Release number of selected component (if applicable):
 * aeolus-conductor-0.8.7-1.el6.src.rpm
 * aeolus-configure-2.5.2-1.el6.src.rpm
 * imagefactory-1.0.0rc11-1.el6.src.rpm
 * oz-0.8.0-5.el6.src.rpm
 * rubygem-aeolus-cli-0.3.1-1.el6.src.rpm
 * rubygem-aeolus-image-0.3.0-12.el6.src.rpm

How reproducible:
 * 2 out of 2 attempts

Steps to Reproduce:
1. Install and configure Aeolus conductor capable
2. Deploy and configure a working configserver
3. Update the cloud provider account information with valid configserver information
4. Make the configserver go away (block all traffic with iptables, or shut it down)
5. Attempt to launch an application that relies on configserver
  
Actual results:

The UI provides the following notifications:

> Warnings
> Failed to launch following component blueprints:

> Errors
> system: No route to host - connect(2)

 * At this point, conductor shows a deployment in the 'new' state.  It never leaves that state, and I cannot delete the application.

Expected results:

I'd expect to either ...
 1) not be allowed to deploy when the configserver is out of reach
 2) or be able to delete failed deployments that resulted from a missing configserver

Additional info:

 * See attached debug tarball
Comment 1 wes hayutin 2012-04-03 16:42:50 EDT
related to https://bugzilla.redhat.com/show_bug.cgi?id=796528 possibly
Comment 2 Jan Provaznik 2012-05-23 10:41:30 EDT
I believe that patch for https://bugzilla.redhat.com/show_bug.cgi?id=796528 fixes this too.
Comment 3 Matt Wagner 2012-05-23 15:12:58 EDT
Confirmed -- the patch for #796528 does resolve this issue. With an unreachable config server, instances go directly to create_failed state. It's on master, but not backported anywhere yet. I'm setting this to "modified" to match that bug.
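The behaviour described above (a network failure moves the instance straight to create_failed instead of leaving it stuck) can be illustrated with a small sketch. This is not the code from the #796528 patch: `SketchInstance`, `launch_instance`, and the injected `send_config` callable are hypothetical names, and the 'new' and 'pending' state strings are used only for illustration.

```ruby
# Hypothetical stand-in for a conductor instance record; it only tracks state.
SketchInstance = Struct.new(:name, :state)

# Tries to send configuration for one instance. `send_config` is any callable
# that talks to the configserver and raises on network failure (an assumption
# made so that the sketch stays self-contained).
def launch_instance(instance, send_config)
  begin
    send_config.call(instance)
    instance.state = 'pending'
  rescue SystemCallError, SocketError => e
    # Fail fast so the instance does not linger in the 'new' state.
    instance.state = 'create_failed'
    warn "launch of #{instance.name} failed: #{e.message}"
  end
  instance
end
```

Because the failure is recorded as a terminal state, the deployment no longer stays in a state from which it can be neither stopped nor deleted.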
Comment 4 Matt Wagner 2012-05-25 13:59:55 EDT
The relevant commits on https://bugzilla.redhat.com/show_bug.cgi?id=796528 are:
7a8502b846a819c27fa141621220ca0bbaeac23c
56016671e651cf17bb0bc5c29b49c5aa55e94536
3dd5f304b8458528d15ddf462e0db7622b56dd09
86987cd9194c344272c0cfff313edcdf66df80c0

Though it sounds like QE isn't pleased with 796528 yet.
Comment 7 dgao 2012-09-20 13:56:08 EDT
This bug is no longer an issue since deployment can be made even with a disabled configserver. The audrey agent simply receives a 503 status. 
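For illustration, a client that receives a 503 could treat it as "configuration not available yet" and try again later. The sketch below is an assumption about how such handling might look, not the audrey agent's actual logic: `fetch_config`, its retry count and delay, and the injectable `http_get` parameter are all hypothetical.

```ruby
require 'net/http'
require 'uri'

# Hypothetical polling helper: fetches `url` and retries while the server
# answers 503 Service Unavailable. Returns the response body on 200, or nil
# when every attempt got a 503. `http_get` is injectable so the logic can be
# exercised without a network; it defaults to Net::HTTP.get_response.
def fetch_config(url, attempts: 3, delay: 1,
                 http_get: Net::HTTP.method(:get_response))
  uri = URI(url)
  attempts.times do |i|
    response = http_get.call(uri)
    case response.code
    when '200'
      return response.body
    when '503'
      # Configserver unavailable: wait before the next attempt, if any.
      sleep(delay) if i < attempts - 1
    else
      raise "unexpected HTTP status #{response.code}"
    end
  end
  nil
end
```

Returning nil rather than raising on repeated 503s leaves the decision about what to do next (keep polling, give up, report failure) to the caller.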

aeolus-conductor-0.13.7-1.el6cf.noarch
Comment 9 errata-xmlrpc 2012-12-04 10:02:36 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2012-1516.html
