Bug 838726 - Failed deployments can't be removed
Summary: Failed deployments can't be removed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: CloudForms Cloud Engine
Classification: Retired
Component: aeolus-conductor
Version: 1.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: rc
Assignee: Jan Provaznik
QA Contact: Rehana
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-07-09 23:37 UTC by Justin Clift
Modified: 2015-07-13 04:35 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cloud Engine sometimes failed to delete applications that had not launched successfully, so the faulty application remained indefinitely. This bug fix sets a CREATE_FAILED state for failed instances, which allows failed applications to be deleted.
Clone Of:
Environment:
Last Closed: 2012-12-04 15:12:27 UTC
Embargoed:


Attachments
Screencast showing deployment removal problem. (8.07 MB, video/mp4)
2012-07-09 23:37 UTC, Justin Clift
Screenshot showing flash message about deployment not being delete-able. (194.84 KB, image/png)
2012-07-14 06:33 UTC, Justin Clift
History tab for the instance that refuses to go away. (202.02 KB, image/png)
2012-07-14 06:34 UTC, Justin Clift
Failed deployment listed (154.05 KB, image/png)
2012-09-21 19:49 UTC, Ronelle Landy
Failed deployment deleted (142.93 KB, image/png)
2012-09-21 19:50 UTC, Ronelle Landy


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2012:1516 0 normal SHIPPED_LIVE CloudForms Cloud Engine 1.1 update 2012-12-04 19:51:45 UTC

Description Justin Clift 2012-07-09 23:37:49 UTC
Created attachment 597180
Screencast showing deployment removal problem.

Description of problem:

  With upstream Aeolus 0.10.x rpms (on F16),
  it's not possible to remove failed deployments.

  For example, with a recent setup, I have 3 wordpress
  deployments which failed when being started up.

  All three completely refuse to go away, no matter
  what I try to do to delete them.

  Screencast showing the problem (.mp4) attached.
  The deployment instance history and other info tabs
  are shown in the screencast.


Version-Release number of selected component (if applicable):

  aeolus-all-0.10.4-1.fc16.noarch
  aeolus-conductor-0.10.4-1.fc16.noarch
  aeolus-conductor-daemons-0.10.4-1.fc16.noarch
  aeolus-conductor-devel-0.10.4-1.fc16.noarch
  aeolus-conductor-doc-0.10.4-1.fc16.noarch
  aeolus-configure-2.6.0-1.fc16.noarch
  rubygem-aeolus-cli-0.5.0-1.fc16.noarch
  rubygem-aeolus-image-0.5.0-1.fc16.noarch

How reproducible:

  Every time.


Steps to Reproduce:
1. Probably the easiest way to cause a failed
   deployment is to manually remove a
   VMware image from the backend NFS storage
   once it's been pushed.
2. After that, Aeolus will attempt to launch
   the deployment, but the instance with
   missing storage will have status
   "create_failed".  Its deployment will not
   be removable.
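
To make the symptom in step 2 concrete, here is a minimal Python sketch of a deletability check keyed on instance state. It is not conductor's actual code (conductor is a Ruby application and its source is not quoted in this report). The class names, the "running" and "stopped" states, and the rule that a deployment is deletable only when every instance is in an allowed state are all assumptions made for illustration; only the "create_failed" state name comes from this report.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List

# "create_failed" is the state named in this report.
# "running" and "stopped" are hypothetical states added for illustration.
STATE_RUNNING = "running"
STATE_STOPPED = "stopped"
STATE_CREATE_FAILED = "create_failed"

# Assumed model of the broken behaviour: failed instances are not deletable.
DELETABLE_WITHOUT_FAILED: FrozenSet[str] = frozenset({STATE_STOPPED})

# Assumed model of the fixed behaviour: failed instances are deletable too.
DELETABLE_WITH_FAILED: FrozenSet[str] = frozenset(
    {STATE_STOPPED, STATE_CREATE_FAILED}
)


@dataclass
class Instance:
    name: str
    state: str


@dataclass
class Deployment:
    name: str
    instances: List[Instance] = field(default_factory=list)

    def can_be_deleted(self, deletable_states: FrozenSet[str]) -> bool:
        # Hypothetical rule: every instance must be in an allowed state.
        return all(i.state in deletable_states for i in self.instances)


if __name__ == "__main__":
    failed = Deployment(
        "wordpress", [Instance("wordpress/frontend", STATE_CREATE_FAILED)]
    )
    # With the failed state excluded, the deployment is stuck.
    print(failed.can_be_deleted(DELETABLE_WITHOUT_FAILED))
    # With the failed state included, it can be removed.
    print(failed.can_be_deleted(DELETABLE_WITH_FAILED))
```

Under these assumptions, a deployment whose only instance is "create_failed" stays stuck until that state is added to the allowed set, which matches the behaviour reported here.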

Comment 1 Jan Provaznik 2012-07-12 18:46:17 UTC
a patch sent: https://fedorahosted.org/pipermail/aeolus-devel/2012-July/011266.html

Comment 2 Jan Provaznik 2012-07-13 14:27:04 UTC
pushed to master, commits:
8290aa1cfef19e54252256ea84f81120f46a6b99
94b243ac4af6b416e48528457d8bb4b5bcdc05b6

Comment 3 Justin Clift 2012-07-14 06:32:58 UTC
Found a case where deployments still can be killed.  Screenshots attached showing the problem, the History tab for the instance in question, and the related error in the VMware screen that I think caused it.

Looks like a follow up patch is needed. ;)

Comment 4 Justin Clift 2012-07-14 06:33:47 UTC
Created attachment 598220
Screenshot showing flash message about deployment not being delete-able.

Comment 5 Justin Clift 2012-07-14 06:34:26 UTC
Created attachment 598221
History tab for the instance that refuses to go away.

Comment 6 Justin Clift 2012-07-23 08:36:47 UTC
s/still can be killed/still can't be killed/

Comment 7 Jan Provaznik 2012-07-24 12:24:21 UTC
The issue described in Comment 3 is not conductor-related; it's probably a dc-api bug. I created a new bug for it here: https://issues.apache.org/jira/browse/DTACLOUD-287

Comment 9 Ronelle Landy 2012-09-21 19:49:11 UTC
Tested rpms:

>> rpm -qa |grep aeolus
aeolus-configure-2.8.6-1.el6cf.noarch
rubygem-aeolus-image-0.3.0-12.el6.noarch
rubygem-aeolus-cli-0.7.1-1.el6cf.noarch
aeolus-conductor-0.13.8-1.el6cf.noarch
aeolus-conductor-daemons-0.13.8-1.el6cf.noarch
aeolus-conductor-doc-0.13.8-1.el6cf.noarch
aeolus-all-0.13.8-1.el6cf.noarch

Created a failed deployment by launching an instance in a rhevm realm with no hosts to start the vm. I was able to delete the deployment from the /conductor/deployments page.

See the attached screenshots.

Marking this BZ as 'verified'

Comment 10 Ronelle Landy 2012-09-21 19:49:58 UTC
Created attachment 615603
Failed deployment listed

Comment 11 Ronelle Landy 2012-09-21 19:50:34 UTC
Created attachment 615604
Failed deployment deleted

Comment 13 errata-xmlrpc 2012-12-04 15:12:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2012-1516.html

