Bug 809251 - Failed app creates sometimes leave around fragments
Status: CLOSED CURRENTRELEASE
Product: OpenShift Origin
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: Unspecified Unspecified
Priority: medium
Severity: high
Assigned To: Rajat Chopra
QA Contact: libra bugs
Keywords: Triaged
Depends On:
Blocks: 811192
Reported: 2012-04-02 17:00 EDT by Thomas Wiest
Modified: 2015-05-14 18:53 EDT

Doc Type: Bug Fix
Last Closed: 2012-04-13 14:35:00 EDT


Attachments: None
Description Thomas Wiest 2012-04-02 17:00:53 EDT
Description of problem:
Failed app creates sometimes leave around fragments.

This morning we got some e-mail from STG that showed that rhc-last-access is failing with this error:
EXCEPTION: No such file or directory - /var/lib/stickshift/2300b6a3f8ef4aa0834847ec326730c6/.env/OPENSHIFT_GEAR_DNS

In total, there are 9 apps that are failing with this error in STG.

When we go to look at these apps, here's what we see:
* The app _does not_ have an entry in Mongo
* The app _does not_ have a ProxyPass file
* The app _has_ a user on the machine
* The app _has_ a directory under /var/lib/stickshift


Under the /var/lib/stickshift/uuid directory, it looks like this:
[root@ex-std-node1 2300b6a3f8ef4aa0834847ec326730c6]# ll -a
total 24
drwxr-x---.   4 root 2300b6a3f8ef4aa0834847ec326730c6  4096 Mar 31 05:08 .
drwxr-x--x. 171 root root                             12288 Apr  2 16:51 ..
drwxr-x---.   2 root 2300b6a3f8ef4aa0834847ec326730c6  4096 Mar 31 05:08 .env
d---------.   3 root root                              4096 Mar 31 12:46 .tmp
[root@ex-std-node1 2300b6a3f8ef4aa0834847ec326730c6]#


Again, the other apps are in a similar situation.
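The filesystem side of the check above can be sketched in Python. This is a minimal sketch, not a tool that ships with rhc-node: the function name `find_fragment_gears` is hypothetical, and it assumes that every gear directory under /var/lib/stickshift is named with a 32-character hex uuid (as in the example above) and that a healthy gear always has .env/OPENSHIFT_GEAR_DNS. It does not check Mongo, the ProxyPass file, or the system user.

```python
import os
import re

# Assumption: gear directories are named with a 32-character hex uuid,
# as in the example directory shown in this report.
GEAR_UUID = re.compile(r"[0-9a-f]{32}")

# The file that rhc-last-access failed to read.
REQUIRED_ENV_FILE = "OPENSHIFT_GEAR_DNS"


def find_fragment_gears(base_dir):
    """Return the names of gear directories that lack .env/OPENSHIFT_GEAR_DNS."""
    fragments = []
    for name in sorted(os.listdir(base_dir)):
        gear_dir = os.path.join(base_dir, name)
        # Skip anything that is not a uuid-named directory.
        if not GEAR_UUID.fullmatch(name) or not os.path.isdir(gear_dir):
            continue
        env_file = os.path.join(gear_dir, ".env", REQUIRED_ENV_FILE)
        if not os.path.isfile(env_file):
            fragments.append(name)
    return fragments
```

Calling `find_fragment_gears("/var/lib/stickshift")` as root on a node would list candidate leftovers; each one still needs to be compared against Mongo before anything is removed.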

This is the only thing in the mcollective log referring to this uuid:

D, [2012-03-31T05:08:39.403655 #10995] DEBUG -- : libra.rb:60:in `cartridge_do_action' cartridge_do_action call / request = #<MCollective::RPC::Request:0x7f03903a54b8
 @action="cartridge_do",
 @agent="libra",
 @caller="cert=mcollective-public",
 @data=
  {:cartridge=>"stickshift-node",
   :args=>
    "--with-app-uuid '419a0a962d0243a980bc99892341b65f' --with-container-uuid '2300b6a3f8ef4aa0834847ec326730c6' -i '5185' --named 'scalephp' --with-namespace 'bmeng7s'",
   :action=>"app-create",
   :process_results=>true},
 @sender="mcollect.cloud.redhat.com",
 @time=1333184919,
 @uniqid="c88b6e359f3e824eb5ea5f7dbf86c491">

D, [2012-03-31T05:08:39.403945 #10995] DEBUG -- : libra.rb:61:in `cartridge_do_action' cartridge_do_action validation = stickshift-node app-create --with-app-uuid '419a0a962d0243a980bc99892341b65f' --with-container-uuid '2300b6a3f8ef4aa0834847ec326730c6' -i '5185' --named 'scalephp' --with-namespace 'bmeng7s'
D, [2012-03-31T05:08:40.368987 #10995] DEBUG -- : libra.rb:102:in `cartridge_do_action' cartridge_do_action (0)
------
CART_DATA: PROXY_HOST=b5053c3948-bmeng8s.stg.rhcloud.com
CART_DATA: PROXY_PORT=58906
CART_DATA: HOST=127.10.27.129
CART_DATA: PORT=8080

------)
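When comparing many of these log entries, it can help to pull the CART_DATA lines into a dictionary. The following is a small sketch that assumes every relevant line has the form `CART_DATA: KEY=VALUE`, as in the output above; `parse_cart_data` is a hypothetical helper name, not part of mcollective or stickshift.

```python
def parse_cart_data(output):
    """Collect 'CART_DATA: KEY=VALUE' lines from hook output into a dict."""
    prefix = "CART_DATA:"
    data = {}
    for line in output.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        key, sep, value = line[len(prefix):].strip().partition("=")
        if sep:  # ignore lines without an '=' sign
            data[key] = value
    return data


# Sample taken from the log entry above.
sample = """------
CART_DATA: PROXY_HOST=b5053c3948-bmeng8s.stg.rhcloud.com
CART_DATA: PROXY_PORT=58906
CART_DATA: HOST=127.10.27.129
CART_DATA: PORT=8080

------)"""

cart_data = parse_cart_data(sample)
```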



Version-Release number of selected component (if applicable):
rhc-node-0.89.2-1.el6_2.x86_64


How reproducible:
Very sporadic, but since it's happening in STG with the latest release, I think the bug still exists.


Steps to Reproduce:
1. unknown

  
Actual results:
A failed app create leaves fragments behind.


Expected results:
Nothing should be left around.
Comment 1 Dan McPherson 2012-04-02 18:18:24 EDT
Rajat, can you take a look at this and make sure we don't have a case where deconfigure and destroy aren't being called?
Comment 2 Rajat Chopra 2012-04-02 19:10:08 EDT
Some more fixes with rev#04dff1e6fdb228ebea0aba2678658e8d94c5688c
The app's gears were not being cleaned up properly in case mongo save failed.

Keeping this open still for other issues lurking.
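The general shape of this kind of fix (destroy whatever was created when a later step such as the save fails) can be illustrated as follows. This is only a sketch of the pattern in Python, not the code from the revision above; `create_gear`, `destroy_gear`, and `save_app` are hypothetical callables standing in for the real node and Mongo operations, and `destroy_gear` is assumed to be safe to call on a partially created gear.

```python
def create_app_with_rollback(gear_ids, create_gear, destroy_gear, save_app):
    """Create gears, then persist the app; destroy the gears if anything fails."""
    attempted = []
    try:
        for gear_id in gear_ids:
            # Record the gear before creating it so that a half-created
            # gear is also cleaned up if create_gear raises midway.
            attempted.append(gear_id)
            create_gear(gear_id)
        save_app(gear_ids)
    except Exception:
        # Roll back in reverse order; keep going if one destroy fails.
        for gear_id in reversed(attempted):
            try:
                destroy_gear(gear_id)
            except Exception:
                pass  # best effort; a real implementation would log this
        raise
    return attempted
```

Re-raising after the cleanup keeps the original failure visible to the caller, so the create is still reported as failed.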
Comment 3 Rajat Chopra 2012-04-11 15:17:08 EDT
Found a code path where, if the max_gears limit is reached and the application is denied creation, gears are left behind. Fixed that with rev#1be5f271b378b8425702997b80e02f9ab346c464.
Comment 4 Johnny Liu 2012-04-13 12:15:01 EDT
Patch 04dff1e6fdb228ebea0aba2678658e8d94c5688c has already been verified in Bug 807045.

Patch 1be5f271b378b8425702997b80e02f9ab346c464 was verified on devenv_stage_169: PASS.

Steps to reproduce:
1). Launch an instance with the sprint 8 release version, devenv_stage_157.
2). Create 2 apps, so that only 1 more gear is allowed to be created.
3). Try to create a scalable app, which consumes at least 2 gears. Creation fails because max_gears is reached.
4). On the instance, check /var/lib/stickshift. Leftover gear directories are seen.
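The arithmetic behind these steps can be written as a quota check that runs before any gear is created, so that a denied request has nothing to clean up. This is a hedged sketch, not the actual fix: `check_gear_quota` and `GearLimitReached` are hypothetical names, and a max_gears value of 3 is only inferred from step 2 (2 apps created, 1 gear left).

```python
class GearLimitReached(Exception):
    """Raised when a request would exceed the user's max_gears limit."""


def check_gear_quota(consumed_gears, max_gears, gears_needed):
    """Raise GearLimitReached before any gear is created if the quota is too small."""
    if consumed_gears + gears_needed > max_gears:
        raise GearLimitReached(
            "need %d gears, but only %d of %d remain"
            % (gears_needed, max_gears - consumed_gears, max_gears)
        )


# Scenario from the steps above: 2 gears in use, assumed limit of 3,
# and a scalable app that needs at least 2 gears.
try:
    check_gear_quota(consumed_gears=2, max_gears=3, gears_needed=2)
    denied = False
except GearLimitReached:
    denied = True
```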
