Bug 820457 - Deleting a scaling app left gears behind
Status: CLOSED DUPLICATE of bug 834663
Product: OpenShift Origin
Classification: Red Hat
Component: Pod
Version: 2.x
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Assigned To: Rob Millner
QA Contact: libra bugs
Keywords: Triaged
Depends On:
Blocks:
Reported: 2012-05-09 21:56 EDT by Rob Millner
Modified: 2015-05-14 21:53 EDT

Doc Type: Bug Fix
Last Closed: 2012-06-25 15:52:41 EDT
Type: Bug


Attachments
Run this on your dev instance to reproduce the problem (1.19 KB, application/x-shellscript)
2012-05-16 21:15 EDT, Rob Millner

Description Rob Millner 2012-05-09 21:56:42 EDT
Description of problem:
While exploring another bug, I created a scalable application through the REST API and scaled it up 30 times.  I then deleted it through the REST API.

Deletion failed and reported:
      <severity>error</severity>
      <text>Failed to delete application pscale due to:Application gears already at zero for 'rmillner'</text>
      <exit-code>135</exit-code>

According to "rhc domain show", my application is still alive.  The primary gear is gone but one of the scale gears remains:
# ls -l /var/lib/stickshift/
total 4
drwxr-x---. 7 root e8b2ef55168440ef864a7334853e6338 4096 May  9 20:21 e8b2ef55168440ef864a7334853e6338
lrwxrwxrwx. 1 root root                               52 May  9 20:21 e8b2ef5516-rmillner0140 -> /var/lib/stickshift/e8b2ef55168440ef864a7334853e6338

Neither the broker nor the mcollective logs show any attempt to deconfigure the gear.



Version-Release number of selected component (if applicable):
rhc-broker-0.92.4-1
rubygem-stickshift-controller-0.10.5-1
rubygem-stickshift-node-0.10.4-1.el6_2

How reproducible:
Unsure.

Steps to Reproduce:
1. Upgrade your account to 100 gears on a dev instance.
2. Create a scalable php application.
3. Issue 30 scale-up events.
4. Delete the application.
  
Actual results:
Deletion fails and gears remain.  The application can no longer be deleted.

Expected results:
Nice clean deletion.

Additional info:

Severity set as medium but I'm going to tag this as a future feature since we don't expect to offer high scaling apps this (or even next?) sprint.
Comment 1 Rob Millner 2012-05-10 18:16:16 EDT
This behaviour doesn't repeat if you access the REST API directly via curl; it occurs only when the rhc tools are used.
Comment 2 Rob Millner 2012-05-16 21:15:01 EDT
Created attachment 585078
Run this on your dev instance to reproduce the problem
Comment 3 Rob Millner 2012-05-16 21:15:55 EDT
Was able to reproduce this behaviour via the REST API again.

A scalable jboss app was created and scaled up 30 times.  After deletion, two gears remained.
Comment 4 Rob Millner 2012-06-18 13:56:44 EDT
Setting severity low since this is well outside our offered scaling.
Comment 5 Rob Millner 2012-06-19 20:33:09 EDT
Was able to show that, in one instance of this test, both lingering gears were created back-to-back and both creations failed with the same error in the broker.

[REQ_ID=2ef6a5210b80474e990b0c14ee582dc6] ACTION=SCALE_UP_APPLICATION Application event 'scale-up' failed: Query condition failed to update application 'phtest' for 'rmillner@redhat.com'
Completed 422 Unprocessable Entity in 9986ms (Views: 2.6ms)


Neither gear is being destroyed afterwards.

The error is being raised in:
stickshift/controller/lib/stickshift-controller/lib/stickshift/mongo_data_store.rb

Line 389 in StickShift::MongoDataStore.put_app; find_and_modify returns nil.
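For readers unfamiliar with this failure mode, the following is a minimal Python sketch of a conditional update that returns nothing when its query condition no longer matches, and of a caller that destroys the gear it just created in that case. It is not the StickShift code: `InMemoryStore`, `scale_up`, the `gears` field, and the `create_gear`/`destroy_gear` callbacks are hypothetical names chosen for illustration, and the actual condition used by `put_app` is not shown in this report.

```python
from typing import Optional


class InMemoryStore:
    """Toy stand-in for a document store; not the StickShift data store."""

    def __init__(self):
        self._docs = {}

    def insert(self, key: str, doc: dict) -> None:
        self._docs[key] = dict(doc)

    def find_and_modify(self, key: str, condition: dict, update: dict) -> Optional[dict]:
        """Apply `update` only if every field in `condition` matches.

        Returns a copy of the updated document, or None when the document
        is missing or the condition does not hold.
        """
        doc = self._docs.get(key)
        if doc is None or any(doc.get(k) != v for k, v in condition.items()):
            return None
        doc.update(update)
        return dict(doc)


def scale_up(store, app, expected_gears, create_gear, destroy_gear):
    """Create a gear, then record it; roll the gear back if the record fails."""
    gear_id = create_gear()
    updated = store.find_and_modify(
        app, {"gears": expected_gears}, {"gears": expected_gears + 1}
    )
    if updated is None:
        # Assumption: the caller is responsible for cleaning up the gear
        # when the conditional update is rejected.
        destroy_gear(gear_id)
        return False
    return True
```

In this sketch, two scale-up requests that both read the same gear count would see the second update rejected, and the rejected request removes its own gear instead of leaving it behind.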
Comment 6 Rob Millner 2012-06-22 13:38:20 EDT
The problem seems to have changed in this sprint.  Running the test in a loop showed a failure 9 create/destroy cycles in, caused by a temporary DNS failure.

We should probably handle any exceptions from DYN by finishing gear deletion and then trying again.
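One possible shape for that handling, as a hedged Python sketch rather than broker code: destroy every gear first, then deregister DNS entries with a bounded retry, so that a temporary DNS failure cannot stop node-side cleanup. `DnsError`, `delete_application`, and the `destroy_gear`/`deregister_dns` callbacks are hypothetical names; the real broker and DYN client interfaces are not shown in this report.

```python
import time


class DnsError(Exception):
    """Hypothetical exception standing in for a DNS provider failure."""


def delete_application(gears, destroy_gear, deregister_dns, retries=3, delay=0.0):
    """Destroy every gear first, then deregister DNS with a bounded retry.

    Returns the list of gears whose DNS entries could not be removed.
    """
    pending = []
    for gear in gears:
        destroy_gear(gear)  # gear cleanup never waits on DNS
        pending.append(gear)

    for attempt in range(retries):
        still_failing = []
        for gear in pending:
            try:
                deregister_dns(gear)
            except DnsError:
                still_failing.append(gear)
        pending = still_failing
        if not pending:
            break
        if attempt < retries - 1:
            time.sleep(delay)  # pause before the next DNS attempt
    return pending
```

A caller could report the returned names to the user as DNS entries that still need cleanup, while the gears themselves are already gone.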

From the delete failure:
<text>Failed to delete application rlmtmp14508 due to:Error communicating with DNS system.  If the problem persists please contact Red Hat support.</text>

Fragments of gears were left behind:
ls -l /var/lib/stickshift/

drwxr-x---. 10 root 05642e020d1148559ff3d923fb061c57 4096 Jun 21 23:26 05642e020d1148559ff3d923fb061c57
lrwxrwxrwx.  1 root root                               52 Jun 21 23:26 05642e020d-rlmtmp14508 -> /var/lib/stickshift/05642e020d1148559ff3d923fb061c57
drwxr-x---.  5 root 07bdd02189434c0884a1f99f7fa31a71 4096 Jun 21 23:30 07bdd02189434c0884a1f99f7fa31a71
drwxr-x---.  5 root 1abbbe24f3b1480b968c24385a8939f2 4096 Jun 21 23:29 1abbbe24f3b1480b968c24385a8939f2
drwxr-x---.  5 root 2c35596b893a4fdf94096118925587c5 4096 Jun 21 23:15 2c35596b893a4fdf94096118925587c5
drwxr-x---.  5 root 3321f48dcaed44c6bf742115edce799b 4096 Jun 21 23:20 3321f48dcaed44c6bf742115edce799b
drwxr-x---.  5 root a9841cfb626b42ef87e963b04ea037cf 4096 Jun 21 23:12 a9841cfb626b42ef87e963b04ea037cf
drwxr-x---.  5 root b3c942e6d7c044e19571336fc593b525 4096 Jun 21 23:16 b3c942e6d7c044e19571336fc593b525
drwxr-x---.  5 root b986e06e495849bb98c292c8d1df5674 4096 Jun 21 23:32 b986e06e495849bb98c292c8d1df5674
drwxr-x---.  5 root bfc18ce087dd486b89c3012add46ade2 4096 Jun 21 23:31 bfc18ce087dd486b89c3012add46ade2


 ls -l /var/lib/stickshift/bfc18ce087dd486b89c3012add46ade2
total 4
drwxr-xr-x. 4 root bfc18ce087dd486b89c3012add46ade2 4096 Jun 21 23:31 app-root
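A listing like the one above can be checked mechanically. The Python sketch below maps each directory with a 32-character hexadecimal name to the symlinks that point at it. Treating a directory without a symlink as a likely fragment is a heuristic assumed here from the listing, not a rule stated by StickShift, and `find_gear_dirs` is a hypothetical helper name.

```python
import os
import re

# Gear directories in the listing are named with 32 lowercase hex characters.
GEAR_UUID = re.compile(r"^[0-9a-f]{32}$")


def find_gear_dirs(base):
    """Map each gear-like directory under `base` to the symlink names
    that point at it (an empty list means no symlink was found)."""
    gears = {}
    links = []
    for entry in sorted(os.listdir(base)):
        path = os.path.join(base, entry)
        if os.path.islink(path):
            links.append((entry, os.path.realpath(path)))
        elif os.path.isdir(path) and GEAR_UUID.match(entry):
            gears[entry] = []
    real_base = os.path.realpath(base)
    for name, target in links:
        uuid = os.path.basename(target)
        if uuid in gears and os.path.dirname(target) == real_base:
            gears[uuid].append(name)
    return gears
```

Calling `find_gear_dirs("/var/lib/stickshift")` on a node would return a dictionary in which entries with an empty list are candidates for manual inspection.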
Comment 7 Rob Millner 2012-06-25 15:52:41 EDT
After more than 100 experiments, the only source of failure I ran into was Bug 834663.

*** This bug has been marked as a duplicate of bug 834663 ***
