Bug 820457 - Deleting a scaling app left gears behind
Summary: Deleting a scaling app left gears behind
Keywords:
Status: CLOSED DUPLICATE of bug 834663
Alias: None
Product: OKD
Classification: Red Hat
Component: Pod
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Rob Millner
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-05-10 01:56 UTC by Rob Millner
Modified: 2015-05-15 01:53 UTC
CC List: 3 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-25 19:52:41 UTC
Target Upstream Version:
Embargoed:


Attachments
Run this on your dev instance to reproduce the problem (1.19 KB, application/x-shellscript)
2012-05-17 01:15 UTC, Rob Millner


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 834663 1 None None None 2021-01-20 06:05:38 UTC

Internal Links: 834663

Description Rob Millner 2012-05-10 01:56:42 UTC
Description of problem:
While I was exploring another bug, a scalable application was created through the REST API and scaled up 30 times. It was then deleted through the REST API.

Deletion failed and reported:
      <severity>error</severity>
      <text>Failed to delete application pscale due to:Application gears already at zero for 'rmillner'</text>
      <exit-code>135</exit-code>

According to "rhc domain show", my application is still alive.  The primary gear is gone but one of the scale gears remains:
# ls -l /var/lib/stickshift/
total 4
drwxr-x---. 7 root e8b2ef55168440ef864a7334853e6338 4096 May  9 20:21 e8b2ef55168440ef864a7334853e6338
lrwxrwxrwx. 1 root root                               52 May  9 20:21 e8b2ef5516-rmillner0140 -> /var/lib/stickshift/e8b2ef55168440ef864a7334853e6338

Neither the broker logs nor the mcollective logs show any attempt to deconfigure the gear.



Version-Release number of selected component (if applicable):
rhc-broker-0.92.4-1
rubygem-stickshift-controller-0.10.5-1
rubygem-stickshift-node-0.10.4-1.el6_2

How reproducible:
Unsure.

Steps to Reproduce:
1. Upgrade your account to 100 gears on a dev instance.
2. Create a scalable php application.
3. Issue 30 scale-up events.
4. Delete the application.

Actual results:
Deletion fails and gears remain.  The application can no longer be deleted.

Expected results:
Nice clean deletion.

Additional info:

Severity set as medium but I'm going to tag this as a future feature since we don't expect to offer high scaling apps this (or even next?) sprint.

Comment 1 Rob Millner 2012-05-10 22:16:16 UTC
This behaviour does not reproduce when the REST API is accessed directly via curl; it only occurs when the rhc tools are used.

Comment 2 Rob Millner 2012-05-17 01:15:01 UTC
Created attachment 585078
Run this on your dev instance to reproduce the problem

Comment 3 Rob Millner 2012-05-17 01:15:55 UTC
Was able to reproduce this behaviour via the REST API again.

A scalable jboss app was created and scaled up 30 times. After deletion, two gears remained.

Comment 4 Rob Millner 2012-06-18 17:56:44 UTC
Setting severity low since this is well outside our offered scaling.

Comment 5 Rob Millner 2012-06-20 00:33:09 UTC
Was able to show that, in one instance of this test, both lingering gears were created back-to-back, and both creations failed with the same error in the broker.

[REQ_ID=2ef6a5210b80474e990b0c14ee582dc6] ACTION=SCALE_UP_APPLICATION Application event 'scale-up' failed: Query condition failed to update application 'phtest' for 'rmillner'
Completed 422 Unprocessable Entity in 9986ms (Views: 2.6ms)


Neither gear is being destroyed afterwards.

The error is being raised in:
stickshift/controller/lib/stickshift-controller/lib/stickshift/mongo_data_store.rb

At line 389, in StickShift::MongoDataStore.put_app, find_and_modify returns nil.
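
The failure mode described above can be illustrated with a minimal Python sketch. This is not the StickShift code: `FakeStore`, its `find_and_modify` method, `scale_up`, and the gear-list condition are hypothetical stand-ins, assuming only that the store performs a conditional update that returns nothing when the query condition no longer matches. It shows why a caller that creates a gear before the conditional update probably needs to destroy that gear when the update comes back empty.

```python
class FakeStore:
    """Hypothetical in-memory stand-in for the application data store."""

    def __init__(self):
        self.apps = {}

    def find_and_modify(self, name, expected_gears, new_gears):
        # Conditional update: apply only if the stored gear list still
        # matches what the caller read earlier. Return None otherwise.
        if self.apps.get(name) != expected_gears:
            return None
        self.apps[name] = list(new_gears)
        return self.apps[name]


def scale_up(store, name, seen_gears, create_gear, destroy_gear):
    """Create a gear, record it, and roll the gear back if recording fails."""
    gear = create_gear()
    updated = store.find_and_modify(name, seen_gears, seen_gears + [gear])
    if updated is None:
        # Without this cleanup the freshly created gear would be orphaned.
        destroy_gear(gear)
        raise RuntimeError(
            "Query condition failed to update application %r" % name
        )
    return gear
```

In this sketch a stale `seen_gears` list makes the update return `None`, and the gear is destroyed before the error is reported, so nothing is left behind on the node.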

Comment 6 Rob Millner 2012-06-22 17:38:20 UTC
The problem seems to have changed in this sprint. Running the test in a loop hit a failure after 9 create/destroy cycles, caused by a temporary DNS failure.

We should probably handle any exceptions from DYN by finishing gear deletion and then trying again.
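
A rough Python sketch of that idea, under stated assumptions: `DnsError`, `delete_application`, and the `destroy_gear` / `deregister_dns` callbacks are hypothetical names used for illustration, not part of the broker. The sketch keeps destroying gears when DNS removal fails, then retries only the DNS step for the gears that failed.

```python
class DnsError(Exception):
    """Hypothetical stand-in for an exception raised by the DNS provider."""


def delete_application(gears, destroy_gear, deregister_dns, attempts=3):
    """Destroy every gear, then retry DNS removal for gears where it failed.

    Returns the gears whose DNS entry could not be removed.
    """
    pending_dns = []
    for gear in gears:
        try:
            deregister_dns(gear)
        except DnsError:
            # Remember the failure, but do not abort the deletion.
            pending_dns.append(gear)
        destroy_gear(gear)

    for _ in range(attempts):
        if not pending_dns:
            break
        still_failing = []
        for gear in pending_dns:
            try:
                deregister_dns(gear)
            except DnsError:
                still_failing.append(gear)
        pending_dns = still_failing
    return pending_dns
```

Whatever is returned would still need DNS cleanup later, but no gear directories remain on the node because of a transient DNS error.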

From the delete failure:
<text>Failed to delete application rlmtmp14508 due to:Error communicating with DNS system.  If the problem persists please contact Red Hat support.</text>

Fragments of gears were left behind:
ls -l /var/lib/stickshift/

drwxr-x---. 10 root 05642e020d1148559ff3d923fb061c57 4096 Jun 21 23:26 05642e020d1148559ff3d923fb061c57
lrwxrwxrwx.  1 root root                               52 Jun 21 23:26 05642e020d-rlmtmp14508 -> /var/lib/stickshift/05642e020d1148559ff3d923fb061c57
drwxr-x---.  5 root 07bdd02189434c0884a1f99f7fa31a71 4096 Jun 21 23:30 07bdd02189434c0884a1f99f7fa31a71
drwxr-x---.  5 root 1abbbe24f3b1480b968c24385a8939f2 4096 Jun 21 23:29 1abbbe24f3b1480b968c24385a8939f2
drwxr-x---.  5 root 2c35596b893a4fdf94096118925587c5 4096 Jun 21 23:15 2c35596b893a4fdf94096118925587c5
drwxr-x---.  5 root 3321f48dcaed44c6bf742115edce799b 4096 Jun 21 23:20 3321f48dcaed44c6bf742115edce799b
drwxr-x---.  5 root a9841cfb626b42ef87e963b04ea037cf 4096 Jun 21 23:12 a9841cfb626b42ef87e963b04ea037cf
drwxr-x---.  5 root b3c942e6d7c044e19571336fc593b525 4096 Jun 21 23:16 b3c942e6d7c044e19571336fc593b525
drwxr-x---.  5 root b986e06e495849bb98c292c8d1df5674 4096 Jun 21 23:32 b986e06e495849bb98c292c8d1df5674
drwxr-x---.  5 root bfc18ce087dd486b89c3012add46ade2 4096 Jun 21 23:31 bfc18ce087dd486b89c3012add46ade2


 ls -l /var/lib/stickshift/bfc18ce087dd486b89c3012add46ade2
total 4
drwxr-xr-x. 4 root bfc18ce087dd486b89c3012add46ade2 4096 Jun 21 23:31 app-root
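
A check like the listing above could be scripted. The following Python sketch assumes only what the listings show: gear directories under /var/lib/stickshift are named with 32 hexadecimal characters, and an intact gear also has a symlink pointing at its directory. The function name `list_gear_dirs` is made up for this example.

```python
import os
import re

GEAR_DIR = re.compile(r"^[0-9a-f]{32}$")


def list_gear_dirs(base="/var/lib/stickshift"):
    """Map each 32-hex gear directory under `base` to the symlinks pointing at it."""
    gears = {}
    links = []
    for name in sorted(os.listdir(base)):
        path = os.path.join(base, name)
        if os.path.islink(path):
            # Record the link and the directory it resolves to.
            links.append((name, os.path.realpath(path)))
        elif os.path.isdir(path) and GEAR_DIR.match(name):
            gears[name] = []
    for link_name, target in links:
        uuid = os.path.basename(target)
        if uuid in gears:
            gears[uuid].append(link_name)
    return gears
```

Directories that map to an empty list have no symlink; in the listing above that would be every directory except 05642e020d1148559ff3d923fb061c57.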

Comment 7 Rob Millner 2012-06-25 19:52:41 UTC
After more than 100 experiments, the only source of failure I ran into was bug 834663.

*** This bug has been marked as a duplicate of bug 834663 ***

