Description of problem:
When moving an application, the move failed because the httpd.pid file was empty. After removing this file, we were able to move the application with all of its gears. The file in question was phpmyadmin-3.4/run/httpd.pid.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
The app could not be moved.

Expected results:
The cartridges should detect this condition and continue moving the application.

Additional info:
ramr will add more details as he investigates.
I, [2012-08-28T11:13:45.116245 #18287] INFO -- : stickshift.rb:315:in `cartridge_do_action' cartridge_do_action call / request = #<MCollective::RPC::Request:0x7f264133a090 @action="cartridge_do", @agent="stickshift", @caller="uid=0", @data= {:cartridge=>"embedded/phpmyadmin-3.4", :args=>"'cake' 'esolvesapp' '00a496dc0d96405d888a02bc0020297d'", :action=>"stop", :process_results=>true}, @sender="mcollect.cloud.redhat.com", @time=1346166825, @uniqid="96616a35c0b4853533894471e6c83c0f">
I, [2012-08-28T11:13:45.116527 #18287] INFO -- : stickshift.rb:316:in `cartridge_do_action' cartridge_do_action validation = embedded/phpmyadmin-3.4 stop 'cake' 'esolvesapp' '00a496dc0d96405d888a02bc0020297d'
I, [2012-08-28T11:13:45.437418 #18287] INFO -- : stickshift.rb:373:in `cartridge_do_action' cartridge_do_action ERROR (1)
------
(20014)Internal error: Error retrieving pid file run/httpd.pid
Remove it before continuing if it is corrupted.
------)
Issue with phpmyadmin (and possibly rockmongo/phpmoadmin). To reproduce this, just edit run/httpd.pid and blank out its contents or add a comment (leading #). A related issue exists with php/perl/python/ruby* if you edit the pid file (this affects the idler).
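The two corrupt states described above (an empty pid file and one holding a comment line) can be reproduced and detected with a minimal sketch like the following. It works in a scratch directory rather than a real gear, and the helper name `pidfile_is_valid` is hypothetical; it is not taken from the cartridge scripts.

```shell
#!/bin/sh
# Hypothetical helper (not from the cartridge code): succeeds only if the
# pid file exists and holds nothing but a single run of digits.
pidfile_is_valid() {
    pidfile="$1"
    [ -s "$pidfile" ] || return 1          # missing or empty file
    pid=$(cat "$pidfile" 2>/dev/null)
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;           # blank, comment, or other non-numeric content
    esac
    return 0
}

# Reproduce the corrupt states in a scratch directory (assumption: this
# stands in for a cartridge's run/ directory).
scratch=$(mktemp -d)
: > "$scratch/empty.pid"                    # blanked-out pid file
printf '# comment\n' > "$scratch/comment.pid"  # pid file with a leading '#'
printf '12345\n' > "$scratch/good.pid"      # well-formed pid file

for name in empty comment good; do
    if pidfile_is_valid "$scratch/$name.pid"; then
        echo "$name: valid"
    else
        echo "$name: corrupt"
    fi
done
rm -rf "$scratch"
```

The check only looks at the file contents; it does not verify that a process with that pid is actually running.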
Will fix the related issue under a separate bug so that QE can test a simpler case here.

Fixed this issue with git commits:
b137c82c49004cc8eae4dcd1c484ba622a6e0159 in li and
3e8c713e0b7eb2844b98a8269edce68fe01fc737 in crankcase.

Waiting on pull requests:
https://github.com/openshift/li/pull/326 and
https://github.com/openshift/crankcase/pull/443
Created attachment 608132 [details]
development.log

Tested on devenv_2097.
The application fails to move when httpd.pid is empty.

Steps to Reproduce:
1. Set up a multi-node environment
2. Create a php application
3. Move this app from one node to another:
   rhc-admin-move --gear_uuid 0057251d5b5248abb4b419ab06b4d049
   The move was OK.
4. Empty the httpd.pid file and move again

The move fails:
DEBUG: Starting cartridge 'php-5.3' in 'php1' after move on ip-10-191-178-229
DEBUG: Moving failed. Rolling back gear 'php1' 'php1' with remove-httpd-proxy on 'ip-10-191-178-229'
DEBUG: Moving failed. Rolling back gear 'php1' in 'php1' with destroy on 'ip-10-191-178-229'
/usr/lib/ruby/gems/1.8/gems/gearchanger-mcollective-plugin-0.3.1/lib/gearchanger-mcollective-plugin/gearchanger/mcollective_application_container_proxy.rb:1324:in `run_cartridge_command_old': Node execution failure (invalid exit code from node). If the problem persists please contact Red Hat support. (StickShift::NodeException)
    from /var/www/stickshift/broker/lib/express/broker/mcollective_ext.rb:13:in `run_cartridge_command'
    from /usr/lib/ruby/gems/1.8/gems/gearchanger-mcollective-plugin-0.3.1/lib/gearchanger-mcollective-plugin/gearchanger/mcollective_application_container_proxy.rb:881:in `move_gear'
    from /usr/lib/ruby/gems/1.8/gems/gearchanger-mcollective-plugin-0.3.1/lib/gearchanger-mcollective-plugin/gearchanger/mcollective_application_container_proxy.rb:875:in `each'
    from /usr/lib/ruby/gems/1.8/gems/gearchanger-mcollective-plugin-0.3.1/lib/gearchanger-mcollective-plugin/gearchanger/mcollective_application_container_proxy.rb:875:in `move_gear'
    from /usr/bin/rhc-admin-move:109

Additional:
It also failed with phpmyadmin and perl, so other cartridges may have the same problem.
After removing httpd.pid and moving again, the move succeeds.
development.log is attached.
@Hou, can you please attach the mcollective log as well? Also, when you emptied the httpd.pid file, did you do that on php-5.3/run/httpd.pid or on phpmyadmin-3.4/run/httpd.pid?

The fix here was a simple case done to address only phpmyadmin, rockmongo, phpmoadmin and metrics.

The other apache-based apps (php/python/perl/ruby etc.) need a lil' more involved fix, as it's the primary app control script. Will fix that as part of bug fixes next week.
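For illustration only, one way a control script could tolerate a corrupt pid file is to drop it before asking httpd to stop. This is a sketch under that assumption and not the content of the commits referenced in this bug; the function name `remove_corrupt_pidfile` is hypothetical.

```shell
#!/bin/sh
# Hypothetical guard (an assumption, not the actual fix): remove the pid
# file when its contents are not a plain number, so that a later stop
# action does not abort on "Error retrieving pid file".
remove_corrupt_pidfile() {
    pidfile="$1"
    [ -e "$pidfile" ] || return 0          # nothing to clean up
    pid=$(cat "$pidfile" 2>/dev/null)
    case "$pid" in
        ''|*[!0-9]*)
            echo "Removing corrupt pid file: $pidfile" >&2
            rm -f "$pidfile"
            ;;
    esac
}

# Demonstration in a scratch directory standing in for a cartridge's run/.
scratch=$(mktemp -d)
: > "$scratch/httpd.pid"                   # simulate the emptied pid file
remove_corrupt_pidfile "$scratch/httpd.pid"
if [ -e "$scratch/httpd.pid" ]; then
    echo "pid file still present"
else
    echo "pid file removed"
fi
rm -rf "$scratch"
```

A well-formed pid file is left untouched, so a normal stop would still find the running process.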
(In reply to comment #5)
> @Hou, can you please attach the mcollective log as well. Also the httpd.pid
> file emptying -- did you do that on the php-5.3/run/httpd.pid or the
> phpmyadmin-3.4/run/httpd.pid file?
>
> The fix here was a simple case done to only address: phpmyadmin, rockmongo,
> phpmoadmin and metrics.
>
> The other apache based apps ones (php/python/perl/ruby etc) needs a lil'
> more involved fix as its the primary app control script. Will fix that as
> part of bug fixes next week.

I did it on both applications: a php-5.3 application and an embedded phpmyadmin-3.4 cartridge. Both failed.

Since this fix is only for phpmyadmin, rockmongo, phpmoadmin and metrics, I won't focus on other cartridges any more. I have tested phpmyadmin, rockmongo, phpmoadmin and metrics; they all fail when httpd.pid is empty.

Added development.log, mcollective.log and the error messages seen when the problem is encountered.

For the other apache-based primary cartridges, I have filed Bug 853372 to keep track.
Created attachment 608447 [details] development.log(2012-08-31)
Created attachment 608448 [details] mcollective.log
Created attachment 608449 [details] error messages
Fixed with pull requests:
https://github.com/openshift/crankcase/pull/452
https://github.com/openshift/li/pull/342

Waiting for merge+test.
Verified on devenv_2115

Steps:
1. Set up a multi-node environment
2. Create applications and embed phpmyadmin/rockmongo/phpmoadmin/metrics
3. Empty httpd.pid for the above cartridges:
   :> /var/lib/stickshift/$UUID/$cartridge/run/httpd.pid
4. Move the app:
   rhc-admin-move --gear_uuid $UUID -i $target_server_identity

Result:
The move succeeds when httpd.pid for phpmyadmin/rockmongo/phpmoadmin/metrics is empty.
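The pid-emptying step from the verification above can be repeated across several cartridges with a small helper. This is a minimal sketch; the function name `empty_pidfiles` is hypothetical, and the cartridge directory names must be supplied by the caller, since only phpmyadmin-3.4 is named in this bug.

```shell
#!/bin/sh
# Hypothetical helper: truncate run/httpd.pid for each named cartridge
# directory under a gear directory. Cartridges without a pid file are
# reported on stderr and skipped.
empty_pidfiles() {
    gear_dir="$1"
    shift
    for cartridge in "$@"; do
        pidfile="$gear_dir/$cartridge/run/httpd.pid"
        if [ -e "$pidfile" ]; then
            : > "$pidfile"                 # same truncation as ':> file'
            echo "emptied $pidfile"
        else
            echo "skipped $pidfile (not found)" >&2
        fi
    done
}
```

On a node this could be called as `empty_pidfiles "/var/lib/stickshift/$UUID" phpmyadmin-3.4`, followed by the `rhc-admin-move --gear_uuid $UUID -i $target_server_identity` command from the verification steps.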