Bug 800188
| Summary: | Failed app create doesn't always clean up properly | | |
|---|---|---|---|
| Product: | OKD | Reporter: | Thomas Wiest <twiest> |
| Component: | Pod | Assignee: | Dan McPherson <dmcphers> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | libra bugs <libra-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 2.x | CC: | ansilva, dmcphers, jhonce, jhou, jialiu, junpark, mmcgrath, rmillner, xtian |
| Target Milestone: | --- | Keywords: | Security, Triaged |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | devenv_1858 | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2012-06-25 18:27:23 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 767033 | | |
Description
Thomas Wiest, 2012-03-05 22:32:06 UTC
This is hard to reproduce; I have not been able to do so yet. As part of other bug fixes, what's better handled now is the deletion of the gear (unix user, home dir, etc.). As far as the entry in httpd conf files goes, that's really the responsibility of the deconfigure hook. I am still combing through the hooks to see what could be handled differently. Lazy man's proposal for the meanwhile: shall we put some cleanup code in mcollective as we destroy gears, so that if there is a destroy-gear call with the uuid, we just go and do a rm -rf /etc/httpd/conf.d/stickshift/$uuid_* ?

Thanks Rajat. I know this is hard to repro; I hate sporadic bugs like this. For our part, I've added a check to rhc-accept-node that will cause failures when the proxy pass files are left around. This will enable our monitoring to alert us when this specific case is happening, and I can then tell you how often it happens and hopefully show you the problem. Right now, unfortunately, we've just been running across these leftover fragments, or sometimes seeing our monitoring checks fail because of them.

The check that I added to rhc-accept-node went live with the last release, and on the first day we cleaned up all of the errors that we saw. We have a monitoring check for rhc-accept-node that tells us when it starts failing. When it fails, we go in and clean up / fix whatever caused it to fail. This bug is still around because we're still seeing new errors from partially destroyed apps. Here's the latest failure from rhc-accept-node:

    FAIL: httpd config file c2e8f2b6db114b54852882551ab21f25_openshiftnagios_chkexsrv2.conf doesn't have an associated user

As you can see, this partially destroyed app is actually from our monitoring check_create_app check. Basically, that check just creates an app, makes sure it's there, and then removes the app. I'm going to manually clean this one up. For next time, let me know if there's any information you would like me to collect from the machine before I clean up the app.

I think this bug is the same as https://bugzilla.redhat.com/show_bug.cgi?id=807638. I gave clear steps there on how to reproduce it and what the issue is.

Fixed with rev#dbe8ad2f7cf5b5869871feb58228a84e2d42b563. The steps given in https://bugzilla.redhat.com/show_bug.cgi?id=807638 did indeed create leftovers; the fix does not let that happen.

Tested this on devenv_1679 by causing a creation failure intentionally; it's fixed now.

This bug definitely still exists with the new code that's currently in STG (rhc-node-0.89.2-1.el6_2.x86_64). This morning there are 22 rhc-accept-node failures on ex-std-nodes 1 and 2 in STG. The errors look like this (the file is different for each error, of course):

    FAIL: httpd config file 015d9c4cea6549039d953cfe13a5f0fc_jizhao37777_wsgi.conf doesn't have an associated user

Friday there were 0 rhc-accept-node failures on these nodes, which leads me to believe that QE's testing over the weekend led to these failures. Re-opening this bug.

Just saw this again in STG with the new upgrade:

    [root@ex-std-node1 ~]# rhc-accept-node
    FAIL: httpd config file 45b16df29d0646c1aae8205059e01e7d_testfotios_45b16df29d.conf doesn't have an associated user
    FAIL: httpd config file dc82422b7221489388a549a2126bb73b_testfotios_testscale.conf doesn't have an associated user
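For context, the rhc-accept-node check quoted above boils down to: for each httpd proxy fragment, parse the gear uuid out of the file name and verify the matching unix user still exists. Here is a minimal sketch of that idea, not the actual rhc-accept-node code; the conf directory and the <gear_uuid>_<namespace>_<app>.conf naming are taken from the paths and FAIL messages in this bug.

    #!/usr/bin/env ruby
    # Sketch of an accept-node-style leftover check (hypothetical script).
    require 'etc'

    CONF_DIR = '/etc/httpd/conf.d/stickshift'  # path from the cleanup proposal above

    Dir.glob(File.join(CONF_DIR, '*.conf')).each do |conf|
      uuid = File.basename(conf).split('_').first
      begin
        Etc.getpwnam(uuid)   # each gear uuid doubles as a unix user name
      rescue ArgumentError   # raised when no such user exists
        puts "FAIL: httpd config file #{File.basename(conf)} doesn't have an associated user"
      end
    end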
App creates are still failing in PROD with the latest code. We found this app: e7ce1e7004f34fe1a456fae7d190ac51

- No mongo entry.
- The gear home dir only had a .env and a .tmp directory in it.

Here's the relevant log entry from mcollective on the host:

    D, [2012-04-20T08:31:08.922103 #1095] DEBUG -- : libra.rb:60:in `cartridge_do_action' cartridge_do_action call / request = #<MCollective::RPC::Request:0x7fea3a10ab40 @action="cartridge_do", @agent="libra", @caller="cert=mcollective-public", @data={:cartridge=>"stickshift-node", :args=>"--with-app-uuid 'e7ce1e7004f34fe1a456fae7d190ac51' --with-container-uuid 'e7ce1e7004f34fe1a456fae7d190ac51' -i '6441' --named 'chkexsrv2' --with-namespace 'openshiftnagios'", :action=>"app-create", :process_results=>true}, @sender="mcollect.cloud.redhat.com", @time=1334925068, @uniqid="8bed61646e8b332950010b4577d28209">
    D, [2012-04-20T08:31:08.922361 #1095] DEBUG -- : libra.rb:61:in `cartridge_do_action' cartridge_do_action validation = stickshift-node app-create --with-app-uuid 'e7ce1e7004f34fe1a456fae7d190ac51' --with-container-uuid 'e7ce1e7004f34fe1a456fae7d190ac51' -i '6441' --named 'chkexsrv2' --with-namespace 'openshiftnagios'
    D, [2012-04-20T08:31:09.723126 #1095] DEBUG -- : libra.rb:102:in `cartridge_do_action' cartridge_do_action (0) ------ ------)

Note: this is the ONLY entry for this gear uuid in the mcollective log.

It would really help if, in such cases, a corresponding snippet of the broker logs is provided (wrt that gear uuid or the application). That would help us see why the proper deconfigure/destroy hooks are not called on certain gears.

Oh, OK, no problem.

We got an alert in our monitoring that gear 4a0e3154d50e4b8bb62a39e851eda756 was in a bad state. After investigating, I determined that this was a half-created gear. Here's its directory:

    # ll -a
    total 32
    drwxr-x---.   4 root 4a0e3154d50e4b8bb62a39e851eda756  4096 Apr 27 04:20 .
    drwxr-x--x. 211 root root                              20480 Apr 27 09:36 ..
    drwxr-x---.   2 root 4a0e3154d50e4b8bb62a39e851eda756  4096 Apr 27 04:20 .env
    d---------.   2 root root                               4096 Apr 27 04:20 .tmp
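A home directory containing only .env and .tmp is the recurring signature of these half-created gears. A minimal detection sketch follows; the script is hypothetical, and the base path is the /var/lib/stickshift directory shown later in this bug.

    #!/usr/bin/env ruby
    # Hypothetical helper: flag gear homes matching the half-created signature.
    GEAR_BASE_DIR = '/var/lib/stickshift'

    Dir.glob(File.join(GEAR_BASE_DIR, '*')).each do |home|
      next unless File.directory?(home)
      entries = Dir.entries(home) - %w[. ..]
      if (entries - %w[.env .tmp]).empty?
        puts "possible half-created gear: #{home} (contains only #{entries.join(', ')})"
      end
    end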
This is still a problem with the latest code in STG as of this morning.

Here is the log from the ex-srv:

    Started POST "/broker/rest/domains/bmeng5s/applications/py1s/events" for 10.77.7.46 at Fri Apr 27 04:20:30 -0400 2012
    Processing by AppEventsController#create as JSON
    Parameters: {"broker_auth_iv"=>"[FILTERED]", "broker_auth_key"=>"[FILTERED]", "application_id"=>"py1s", "domain_id"=>"bmeng5s", "event"=>"scale-up"}
    MongoDataStore.find(CloudUser, bmeng+5, bmeng+5)
    Adding user bmeng+5...inside base_controller
    MongoDataStore.find(CloudUser, bmeng+5, bmeng+5)
    DEBUG: find_available_impl: district_uuid: 61ecace7bada4b2b9d14ddc0fc511ef0
    DEBUG: rpc_get_fact: fact=active_capacity
    DEBUG: rpc_exec: rpc_client=#<MCollective::RPC::Client:0x7f4608891638>
    Next server: ex-std-node2.stg.rhcloud.com active capacity: 33.75
    Current server: ex-std-node2.stg.rhcloud.com active capacity: 33.75
    Next server: ex-std-node1.stg.rhcloud.com active capacity: 31.25
    Current server: ex-std-node1.stg.rhcloud.com active capacity: 31.25
    CURRENT SERVER: ex-std-node1.stg.rhcloud.com
    DEBUG: find_available_impl: current_server: ex-std-node1.stg.rhcloud.com: 31.25
    MongoDataStore.reserve_district_uid(61ecace7bada4b2b9d14ddc0fc511ef0)
    DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7f4608bcab88>
    DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"app-create", :cartridge=>"stickshift-node", :args=>"--with-app-uuid 'b2dd26b7809640b084e15f434294613d' --with-container-uuid '4a0e3154d50e4b8bb62a39e851eda756' -i '4505' --named 'py1s' --with-namespace 'bmeng5s'"}, @id, {'identity' => @id})
    DEBUG: [#<MCollective::RPC::Result:0x7f46089acab8 @action="cartridge_do", @agent="libra", @results={:statuscode=>0, :data=>{:exitcode=>1, :output=>"CLIENT_ERROR: \nCLIENT_ERROR: Could not add job 'jbosstest-build' in Jenkins server:\nCLIENT_ERROR: \nCLIENT_ERROR: You'll need to correct this error before attempting to embed the Jenkins client again.\n"}, :sender=>"ex-std-node1.stg.rhcloud.com", :statusmsg=>"OK"}>]
    uninitialized constant GroupInstance::NodeException
    Completed 500 Internal Server Error in 10353ms
    NoMethodError (undefined method `code' for #<NameError: uninitialized constant GroupInstance::NodeException>):

Here is the log from the ex-node (note: this is the only output with this gear uuid in the log):

    D, [2012-04-27T04:20:34.233545 #21673] DEBUG -- : libra.rb:60:in `cartridge_do_action' cartridge_do_action call / request = #<MCollective::RPC::Request:0x7f5cc75c65e0 @action="cartridge_do", @agent="libra", @caller="cert=mcollective-public", @data={:cartridge=>"stickshift-node", :action=>"app-create", :process_results=>true, :args=>"--with-app-uuid 'b2dd26b7809640b084e15f434294613d' --with-container-uuid '4a0e3154d50e4b8bb62a39e851eda756' -i '4505' --named 'py1s' --with-namespace 'bmeng5s'"}, @sender="mcollect.cloud.redhat.com", @time=1335514834, @uniqid="df2dd5b0d80d5840d73d9ebcfd0e2517">
    D, [2012-04-27T04:20:34.233803 #21673] DEBUG -- : libra.rb:61:in `cartridge_do_action' cartridge_do_action validation = stickshift-node app-create --with-app-uuid 'b2dd26b7809640b084e15f434294613d' --with-container-uuid '4a0e3154d50e4b8bb62a39e851eda756' -i '4505' --named 'py1s' --with-namespace 'bmeng5s'

More fixes with rev#d47868db10f5c84d0613c6fee93690aa0d2a0046.

Situation: make gear create fail towards the end (non-zero exit code). This half-creates the gear, which fails the action on the app. Everything else recovers, but the created gear is never destroyed.
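The failure mode described here (an app-create that returns a non-zero exit code and is never followed by an app-destroy) suggests a compensating-cleanup pattern on the broker side. A minimal sketch of that idea follows; request_cartridge_do is a hypothetical stand-in for the broker's mcollective dispatch (the logs above show the real custom_request('cartridge_do', ...) form), stubbed so the sketch runs.

    # Hypothetical dispatch helper; the real broker goes through MCollective.
    def request_cartridge_do(opts)
      # ... would call rpc_client.custom_request('cartridge_do', opts, ...) ...
      { :exitcode => 0 }  # stubbed node result
    end

    # Compensating cleanup: if app-create returns non-zero, immediately fire an
    # app-destroy for the same gear so no half-created gear is left behind.
    def create_gear_with_cleanup(app_uuid, container_uuid, create_args)
      result = request_cartridge_do(:action => 'app-create',
                                    :cartridge => 'stickshift-node',
                                    :args => create_args)
      return result if result[:exitcode] == 0

      request_cartridge_do(:action => 'app-destroy',
                           :cartridge => 'stickshift-node',
                           :args => "--with-app-uuid '#{app_uuid}' --with-container-uuid '#{container_uuid}'")
      raise "app-create failed for gear #{container_uuid}; compensating app-destroy sent"
    end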
According to comment 12, I reproduced this bug on devenv-stage_157. Steps:

1. Modify /usr/lib/ruby/gems/1.8/gems/stickshift-node-0.7.4/bin/ss-app-create, replacing the line `exit 0` with `exit 1`.
2. Try to create an app.
3. App creation fails, and the app gear is left in a bad state:

    [root@ip-10-100-229-82 stickshift]# ls -la c5356021a43f403eaf01dc74b69b62b6
    total 16
    drwxr-x---. 4 root c5356021a43f403eaf01dc74b69b62b6 4096 May  3 04:56 .
    drwxr-x--x. 6 root root                             4096 May  3 04:56 ..
    drwxr-x---. 2 root c5356021a43f403eaf01dc74b69b62b6 4096 May  3 04:56 .env
    d---------. 2 root root                             4096 May  3 04:56 .tmp

Verified this bug on devenv_1752, and PASS.

1. Modify ss-app-create to make it fail on purpose, as in the reproduction steps.
2. Whether creating a scalable or non-scalable app, even when creation fails, no leftover for the app is seen.

    $ create_php_app
    Submitting form:
    debug: true
    rhlogin: jialiu
    Contacting https://ec2-107-21-67-181.compute-1.amazonaws.com
    Creating application: phptest in jialiu
    Contacting https://ec2-107-21-67-181.compute-1.amazonaws.com
    Problem reported from server. Response code was 500.
    DEBUG: Exit Code: 1
    broker_c: namespacerhloginsshapp_uuiddebugaltercartridgecart_typeactionapp_nameapi
    api_c: placeholder
    API version: 1.1.3
    RESULT: Unable to create gear on node

    $ curl -k -X POST -H 'Accept: application/xml' -d name=myapp -d cartridge=php-5.3 -d scale=true --user jialiu:214214 https://ec2-107-21-67-181.compute-1.amazonaws.com/broker/rest/domains/jialiu/applications
    <?xml version="1.0" encoding="UTF-8"?>
    <response>
      <type nil="true"></type>
      <data>
        <datum nil="true"></datum>
      </data>
      <version>1.0</version>
      <messages>
        <message>
          <exit-code nil="true"></exit-code>
          <severity>error</severity>
          <text>Failed to create application myapp due to:Unable to create gear on node</text>
          <field nil="true"></field>
        </message>
      </messages>
      <status>internal_server_error</status>
      <supported-api-versions>
        <supported-api-version>1.0</supported-api-version>
      </supported-api-versions>
    </response>
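For reference, the fault injection used in both the reproduction and the verification above amounts to changing the tail of the ss-app-create Ruby script. A minimal sketch of the modified ending, with the surrounding code elided:

    # Tail of bin/ss-app-create after the fault injection described above.
    # Everything before this point runs normally and creates the gear; the
    # forced non-zero exit makes the broker treat the create as failed, leaving
    # a half-created gear behind unless cleanup kicks in.
    exit 1  # was: exit 0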
This is still a problem in STG with the latest code:

    rhc-node-0.92.5-1.el6_2.x86_64
    rhc-broker-0.92.8-1.el6_2.noarch

Here's the partially destroyed app:

    [ccb8bc46eddd4fcd9cc302a617948c60]# ls -la
    total 36
    drwxr-x---.   4 root ccb8bc46eddd4fcd9cc302a617948c60  4096 May 12 00:15 .
    drwxr-x--x. 278 root root                              24576 May 12 13:03 ..
    drwxr-x---.   2 root ccb8bc46eddd4fcd9cc302a617948c60  4096 May 12 00:15 .env
    d---------.   3 root root                               4096 May 12 00:25 .tmp

This is from the ex-srv broker logs:

* Note: I've e-mailed this directly to Rajat since it might have sensitive information in it.

This is from the ex-node mcollective logs:

    D, [2012-05-12T00:15:06.029439 #10488] DEBUG -- : libra.rb:303:in `cartridge_do_action' cartridge_do_action call / request = #<MCollective::RPC::Request:0x7f1431d0ce50 @action="cartridge_do", @agent="libra", @caller="cert=mcollective-public", @data={:cartridge=>"stickshift-node", :action=>"app-create", :args=>{"--named"=>"wsgitest", "--with-uid"=>2879, "--with-app-uuid"=>"ccb8bc46eddd4fcd9cc302a617948c60", "--with-container-uuid"=>"ccb8bc46eddd4fcd9cc302a617948c60", "--with-namespace"=>"jialiu1"}, :process_results=>true}, @sender="mcollect.cloud.redhat.com", @time=1336796105, @uniqid="22fa0fad26984a0860e76b787d90aa20">
    D, [2012-05-12T00:15:06.029962 #10488] DEBUG -- : libra.rb:304:in `cartridge_do_action' cartridge_do_action validation = stickshift-node app-create --namedwsgitest--with-uid2879--with-app-uuidccb8bc46eddd4fcd9cc302a617948c60--with-container-uuidccb8bc46eddd4fcd9cc302a617948c60--with-namespacejialiu1
    D, [2012-05-12T00:15:06.030238 #10488] DEBUG -- : libra.rb:59:in `ss_app_create' COMMAND: ss-app-create
    D, [2012-05-12T00:15:07.088597 #10488] DEBUG -- : amqp.rb:91:in `receive' Received message [...snip...]
    D, [2012-05-12T00:15:07.226191 #10488] DEBUG -- : libra.rb:303:in `cartridge_do_action' cartridge_do_action call / request = #<MCollective::RPC::Request:0x7f1431ce3190 @action="cartridge_do", @agent="libra", @caller="cert=mcollective-public", @data={:cartridge=>"stickshift-node", :action=>"app-destroy", :args=>{"--with-app-uuid"=>"ccb8bc46eddd4fcd9cc302a617948c60", "--with-container-uuid"=>"ccb8bc46eddd4fcd9cc302a617948c60"}, :process_results=>true}, @sender="mcollect.cloud.redhat.com", @time=1336796107, @uniqid="0cac21641ae2920fae565b8597b32501">
    D, [2012-05-12T00:15:07.226506 #10488] DEBUG -- : libra.rb:304:in `cartridge_do_action' cartridge_do_action validation = stickshift-node app-destroy --with-app-uuidccb8bc46eddd4fcd9cc302a617948c60--with-container-uuidccb8bc46eddd4fcd9cc302a617948c60
    D, [2012-05-12T00:15:07.226847 #10488] DEBUG -- : libra.rb:86:in `ss_app_destroy' COMMAND: ss-app-destroy
    D, [2012-05-12T00:15:07.227423 #10488] DEBUG -- : libra.rb:95:in `ss_app_destroy' ERROR: unable to destroy user account ccb8bc46eddd4fcd9cc302a617948c60

I met this issue on stage as well. Failed to create scalephp2:

    rhc app create --app scalephp2 --type php-5.3 -s
    Password:
    Creating application: scalephp2 in testssh0
    /usr/lib/ruby/gems/1.8/gems/rhc-0.92.11/lib/rhc-common.rb:445:in `create_app': undefined method `uuid' for []:Array (NoMethodError)
        from /usr/lib/ruby/gems/1.8/gems/rhc-0.92.11/bin/rhc-app:226:in `create_app'
        from /usr/lib/ruby/gems/1.8/gems/rhc-0.92.11/bin/rhc-app:565
        from /usr/bin/rhc-app:19:in `load'
        from /usr/bin/rhc-app:19

But it's listed in rhc domain show:

    scalephp2
        Framework: php-5.3
        Creation: 2012-05-14T03:39:25-04:00
        UUID: 285839a125414a50bd1b26557f1e6e69
        Git URL: ssh://285839a125414a50bd1b26557f1e6e69.rhcloud.com/~/git/scalephp2.git/
        Public URL: http://scalephp2-testssh0.stg.rhcloud.com/
        Embedded: haproxy-1.4

But if you ping the URL, it's an unknown host:

    # ping scalephp2-testssh0.stg.rhcloud.com
    ping: unknown host scalephp2-testssh0.stg.rhcloud.com

Fix needed in unix_user.rb, two items (see the sketch below):

1. Handle the destroy in a step fashion, looking at uuid, uid, and filesystem separately and logging their failures separately.
2. Serialize unix_user.create and unix_user.destroy so that there is no race condition when mcollective barfs.
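A minimal sketch of both items, with the serialization done via a per-gear advisory file lock so an app-destroy that lands mid-create waits instead of failing with "unable to destroy user account". The class shape, lock path, and step helpers are assumptions for illustration, not the actual stickshift-node unix_user.rb:

    # Sketch only: illustrates the two proposed fixes, not the real code.
    class UnixUser
      LOCK_DIR = '/var/lock/stickshift'  # assumed lock location

      def initialize(uuid)
        @uuid = uuid
      end

      # Item 2: serialize create/destroy per gear.
      def create
        with_gear_lock { create_account_and_homedir }
      end

      def destroy
        with_gear_lock do
          # Item 1: tear down step by step, logging each failure separately
          # instead of aborting the whole destroy on the first error.
          [:remove_unix_account, :remove_home_dir, :remove_httpd_fragments].each do |step|
            begin
              send(step)
            rescue => e
              warn "destroy step #{step} failed for #{@uuid}: #{e.message}"
            end
          end
        end
      end

      private

      def with_gear_lock
        File.open(File.join(LOCK_DIR, "#{@uuid}.lock"), File::RDWR | File::CREAT, 0600) do |f|
          f.flock(File::LOCK_EX)  # held for the whole create or destroy
          yield
        end
      end

      # The real work (useradd/userdel, rm -rf of the gear home, removal of
      # the httpd conf fragments) is elided; stubs keep the sketch runnable.
      def create_account_and_homedir; end
      def remove_unix_account;        end
      def remove_home_dir;            end
      def remove_httpd_fragments;     end
    end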
Just for an update: over the weekend of May 25th through May 29th there were over 30 instances of this issue in PROD. Most of them were related to some .conf file being left behind on ex-{std,lg}-node. AS

    FAIL: httpd config file 42b5b7ff73c04513816121cc33ac2d2f_imasen_cashloggerbldr.conf doesn't have an associated user
    FAIL: httpd config file cfaea2b1e63c4616b04741b1e488549d_imasen_cashloggerbldr.conf doesn't have an associated user

This *_imasen_cashloggerbldr (user/namespace) has had over 6 failures in the last 2 days. AS

I think I have fixed a lot of the issues with 825354. I would like to know how much better this issue is after this release is out.

Met this again on int.openshift.redhat.com. Failed to create an app:

    [xiaoli@localhost int]$ rhc app create -a rubyap1 -t ruby-1.8 -l xtian+test1 -p 123456
    Creating application: rubyap1 in z5eusdurut
    Problem reported from server. Response code was 500.
    Re-run with -d for more information.

But it's listed in the domain info:

    [xiaoli@localhost int]$ rhc domain show -l xtian+test1 -p 123456
    User Info
    =========
    Namespace: z5eusdurut
    RHLogin: xtian+test1

    Application Info
    ================
    rubyap1
        Framework: ruby-1.8
        Creation: 2012-06-01T07:19:56-04:00
        UUID: 68e357b50b124be287f154319e73b76e
        Git URL: ssh://68e357b50b124be287f154319e73b76e.rhcloud.com/~/git/rubyap1.git/
        Public URL: http://rubyap1-z5eusdurut.int.rhcloud.com/
        Embedded: None

SSH to it fails:

    [xiaoli@localhost int]$ ssh 68e357b50b124be287f154319e73b76e.rhcloud.com
    ssh: Could not resolve hostname rubyap1-z5eusdurut.int.rhcloud.com: Name or service not known
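The rubyap1 case shows the orphan signature from the client side: the app still appears in rhc domain show, but its public hostname never made it into DNS. That is cheap to test for. A stdlib-only sketch (hypothetical helper; the app and namespace values are the ones from this comment, and the domain suffix varies by environment):

    require 'resolv'

    # Flag apps whose public hostname doesn't resolve, the client-visible
    # symptom of a half-created app.
    def orphaned?(app, namespace, suffix = 'int.rhcloud.com')
      Resolv.getaddress("#{app}-#{namespace}.#{suffix}")
      false
    rescue Resolv::ResolvError
      true  # listed in the domain but not resolvable: likely a leftover record
    end

    puts orphaned?('rubyap1', 'z5eusdurut')  # => true for the case above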
Met this again on the current stage. The information is shown in rhc domain show, but the app cannot be destroyed.

    # rhc domain show
    Password: ******
    User Info
    =========
    Namespace: jhou
    RHLogin: jhou

    Application Info
    ================
    sa
        Framework: jbossas-7
        Creation: 2012-06-11T21:43:46-04:00
        UUID: 6929c459471145d58f462c092cefa699
        Git URL: ssh://6929c459471145d58f462c092cefa699.rhcloud.com/~/git/sa.git/
        Public URL: http://sa-jhou.stg.rhcloud.com/
        Embedded: haproxy-1.4

    [hjw@localhost test]$ rhc app destroy -a sa
    Password: ******
    !!!! WARNING !!!! WARNING !!!! WARNING !!!!
    You are about to destroy the sa application.
    This is NOT reversible, all remote data for this application will be removed.
    Do you want to destroy this application (y/n): y
    Problem reported from server. Response code was 400.
    Re-run with -d for more information.
    RESULT: Application gears already at zero for 'jhou'

    [hjw@localhost test]$ ssh 6929c459471145d58f462c092cefa699.rhcloud.com
    ssh: Could not resolve hostname sa-jhou.stg.rhcloud.com: Name or service not known

If the app fails to create, as bug 832745 says, the failed app data is not removed from mongo, but it cannot be destroyed via the client.

Some log from the broker:

    DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"deconfigure", :args=>"'b69f0e064b' 'freedom3' 'b69f0e064bf64e88abc75bb14180668b'", :cartridge=>"php-5.3"}, @id, {'identity' => @id})
    DEBUG: [#<MCollective::RPC::Result:0x7f1859680830 @results={:sender=>"ip-10-62-91-176", :statusmsg=>"OK", :data=>{:exitcode=>0, :output=>"Waiting for stop to finish\n"}, :statuscode=>0}, @action="cartridge_do", @agent="libra">]
    DEBUG: Cartridge command php-5.3::deconfigure exitcode = 0
    MongoDataStore.save(CloudUser, xtian+b105, xtian+b105, #hidden)
    DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7f185966b6b0>
    DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"deconfigure", :args=>"'b3dd6a8997' 'freedom3' 'b3dd6a8997eb409c83764530a2f0342d'", :cartridge=>"php-5.3"}, @id, {'identity' => @id})
    DEBUG: [#<MCollective::RPC::Result:0x7f185961ffa8 @results={:sender=>"ip-10-62-91-176", :statusmsg=>"OK", :data=>{:exitcode=>0, :output=>"Waiting for stop to finish\n"}, :statuscode=>0}, @action="cartridge_do", @agent="libra">]
    DEBUG: Cartridge command php-5.3::deconfigure exitcode = 0
    MongoDataStore.save(CloudUser, xtian+b105, xtian+b105, #hidden)
    DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7f18596afef0>
    DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"deconfigure", :args=>"'854dae0d92' 'freedom3' '854dae0d923545c690f9705895d3a7e7'", :cartridge=>"php-5.3"}, @id, {'identity' => @id})
    DEBUG: [#<MCollective::RPC::Result:0x7f18597a1408 @results={:sender=>"ip-10-62-91-176", :statusmsg=>"OK", :data=>{:exitcode=>0, :output=>"Waiting for stop to finish\n"}, :statuscode=>0}, @action="cartridge_do", @agent="libra">]
    DEBUG: Cartridge command php-5.3::deconfigure exitcode = 0
    MongoDataStore.save(CloudUser, xtian+b105, xtian+b105, #hidden)
    DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7f185976bc90>
    DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"deconfigure", :args=>"'fbcdc69857' 'freedom3' 'fbcdc69857254724b3bb94f7444903ed'", :cartridge=>"php-5.3"}, @id, {'identity' => @id})
    DEBUG: [#<MCollective::RPC::Result:0x7f18596e6608 @results={:sender=>"ip-10-62-91-176", :statusmsg=>"OK", :data=>{:exitcode=>0, :output=>"Waiting for stop to finish\n"}, :statuscode=>0}, @action="cartridge_do", @agent="libra">]
    DEBUG: Cartridge command php-5.3::deconfigure exitcode = 0
    MongoDataStore.save(CloudUser, xtian+b105, xtian+b105, #hidden)
    DEBUG: Deconfiguring embedded application 'haproxy-1.4' in application 'phpapp5' on node 'ip-10-62-91-176'
    DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7f18596c4e18>
    DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"deconfigure", :args=>"'phpapp5' 'freedom3' 'c0d1bec318104c7e9569cf9d29d35ec6'", :cartridge=>"embedded/haproxy-1.4"}, @id, {'identity' => @id})
    DEBUG: [#<MCollective::RPC::Result:0x7f1859640f78 @results={:sender=>"ip-10-62-91-176", :statusmsg=>"OK", :data=>{:exitcode=>0, :output=>"/usr/libexec/stickshift/cartridges/embedded/haproxy-1.4/info/hooks/deconfigure: line 62: kill: (14315) - No such process\nSSH_KEY_REMOVE: \n"}, :statuscode=>0}, @action="cartridge_do", @agent="libra">]
    DEBUG: Cartridge command embedded/haproxy-1.4::deconfigure exitcode = 0
    MongoDataStore.save(CloudUser, xtian+b105, xtian+b105, #hidden)
    DEBUG: rpc_exec_direct: rpc_client=#<MCollective::RPC::Client:0x7f18597a0f30>
    DEBUG: rpc_client.custom_request('cartridge_do', {:action=>"deconfigure", :args=>"'phpapp5' 'freedom3' 'c0d1bec318104c7e9569cf9d29d35ec6'", :cartridge=>"php-5.3"}, @id, {'identity' => @id})
    DEBUG: [#<MCollective::RPC::Result:0x7f18596fbc88 @results={:sender=>"ip-10-62-91-176", :statusmsg=>"OK", :data=>{:exitcode=>0, :output=>"Waiting for stop to finish\n"}, :statuscode=>0}, @action="cartridge_do", @agent="libra">]
    DEBUG: Cartridge command php-5.3::deconfigure exitcode = 0
    MongoDataStore.save(CloudUser, xtian+b105, xtian+b105, #hidden)
    MongoDataStore.save(Application, xtian+b105, phpapp5, #hidden)
    Completed 500 Internal Server Error in 96585ms
    StickShift::UserException (Application limit has reached for 'xtian+b105')

Comments 20, 21, and 22 appear to be different issues. If you don't believe they are handled in other existing bugs, please feel free to open new bugs. This bug, however, is about cases where data is left on the node and there is nothing in mongo. I have made changes to address the issues the bug was opened for. I would like to know if we have any additional cases of httpd conf files or user home dirs left around.

Re-tested the bug with devenv_1857: failed; an httpd conf is left around after an app-creation failure. Steps:

1. On the instance:

    # rhc-admin-ctl-user -l jialiu --setmaxgears 1

2. On the client:

    $ rhc-create-app -a myapp -t php-5.3 -px -s
    Creating application: myapp in jialiu
    /usr/local/share/gems/gems/rhc-0.93.18/lib/rhc-rest.rb:134:in `raise': exception object expected (TypeError)
        from /usr/local/share/gems/gems/rhc-0.93.18/lib/rhc-rest.rb:134:in `process_error_response'
        from /usr/local/share/gems/gems/rhc-0.93.18/lib/rhc-rest.rb:86:in `rescue in send'
        from /usr/local/share/gems/gems/rhc-0.93.18/lib/rhc-rest.rb:71:in `send'
        from /usr/local/share/gems/gems/rhc-0.93.18/lib/rhc-rest/domain.rb:30:in `add_application'
        from /usr/local/share/gems/gems/rhc-0.93.18/lib/rhc-common.rb:511:in `create_app'
        from /usr/local/share/gems/gems/rhc-0.93.18/bin/rhc-create-app:226:in `<top (required)>'
        from /usr/local/bin/rhc-create-app:23:in `load'
        from /usr/local/bin/rhc-create-app:23:in `<main>'

3. On the instance:

    [root@ip-10-85-3-53 stickshift]# pwd
    /var/lib/stickshift
    [root@ip-10-85-3-53 stickshift]# ls .httpd.d/
    11e5d4d172074502b8f9d42e8bfeaec7_jialiu_myapp  bcbf670ed51d45a881b0a674cfc4d0ce_jialiu_bcbf670ed5

Fixed In Version says devenv_1858; it was fixed after the last build came out.

Verified this bug with devenv_1859, and PASS.
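Since every regression in this bug was caught by looking for the same few artifacts, a post-failure check for QE could assert all of them at once after a forced-failure create. A minimal sketch using only the paths and naming seen in this bug; the script itself is hypothetical:

    #!/usr/bin/env ruby
    # Hypothetical QE helper: after forcing an app create to fail, assert that
    # none of the leftovers reported in this bug remain on the node. Paths and
    # the <uuid>_<namespace>_<app> fragment naming come from the listings above;
    # pass the gear uuid printed by the failed create.
    require 'etc'

    uuid = ARGV.fetch(0) { abort 'usage: check_no_leftovers <gear_uuid>' }
    leftovers = []

    begin
      Etc.getpwnam(uuid)
      leftovers << "unix user #{uuid} still exists"
    rescue ArgumentError
      # good: user is gone
    end

    home = File.join('/var/lib/stickshift', uuid)
    leftovers << "home dir #{home} still exists" if File.directory?(home)

    frags = Dir.glob("/var/lib/stickshift/.httpd.d/#{uuid}_*") +
            Dir.glob("/etc/httpd/conf.d/stickshift/#{uuid}_*")
    frags.each { |f| leftovers << "httpd fragment #{f} left behind" }

    if leftovers.empty?
      puts 'PASS: no leftovers'
    else
      leftovers.each { |l| puts "FAIL: #{l}" }
      exit 1
    end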