Bug 852518 - Failed move due to httpd.pid file being empty
Summary: Failed move due to httpd.pid file being empty
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OKD
Classification: Red Hat
Component: Containers
Version: 2.x
Hardware: All
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Ram Ranganathan
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-08-28 19:07 UTC by Kenny Woodson
Modified: 2015-05-14 22:58 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-09-17 21:29:33 UTC
Target Upstream Version:
Embargoed:
ramr: needinfo+


Attachments
development.log (9.40 KB, text/x-log), 2012-08-30 10:09 UTC, Jianwei Hou
development.log(2012-08-31) (12.16 KB, text/x-log), 2012-08-31 09:34 UTC, Jianwei Hou
mcollective.log (243.68 KB, text/x-log), 2012-08-31 09:35 UTC, Jianwei Hou
error messages (3.08 KB, text/x-log), 2012-08-31 09:36 UTC, Jianwei Hou

Description Kenny Woodson 2012-08-28 19:07:36 UTC
Description of problem:

An application failed to move because its httpd.pid file was empty.  After we removed the file, we were able to move the application with all of its gears.

Specifically, the file was phpmyadmin-3.4/run/httpd.pid.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:

App was not able to be moved.

Expected results:

The cartridges should detect the empty pid file and continue moving the application.

Additional info:

ramr will add more details as he's investigating.

Comment 1 Kenny Woodson 2012-08-28 19:08:48 UTC
I, [2012-08-28T11:13:45.116245 #18287]  INFO -- : stickshift.rb:315:in `cartridge_do_action' cartridge_do_action call / request = #<MCollective::RPC::Request:0x7f264133a090
 @action="cartridge_do",
 @agent="stickshift",
 @caller="uid=0",
 @data=
  {:cartridge=>"embedded/phpmyadmin-3.4",
   :args=>"'cake' 'esolvesapp' '00a496dc0d96405d888a02bc0020297d'",
   :action=>"stop",
   :process_results=>true},
 @sender="mcollect.cloud.redhat.com",
 @time=1346166825,
 @uniqid="96616a35c0b4853533894471e6c83c0f">

I, [2012-08-28T11:13:45.116527 #18287]  INFO -- : stickshift.rb:316:in `cartridge_do_action' cartridge_do_action validation = embedded/phpmyadmin-3.4 stop 'cake' 'esolvesapp' '00a496dc0d96405d888a02bc0020297d'
I, [2012-08-28T11:13:45.437418 #18287]  INFO -- : stickshift.rb:373:in `cartridge_do_action' cartridge_do_action ERROR (1)
------
(20014)Internal error: Error retrieving pid file run/httpd.pid
Remove it before continuing if it is corrupted.

------)

Comment 2 Ram Ranganathan 2012-08-28 19:12:16 UTC
The issue is with phpmyadmin (and possibly rockmongo/phpmoadmin). To reproduce it, edit run/httpd.pid and either blank out the contents or add a comment line (leading #).

A related issue exists with php/perl/python/ruby* when the pid file is edited (it affects the idler).
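
For illustration, the failure mode described above could be guarded against with a check along these lines. This is a minimal sketch, not the code from the actual fix: the function name sanitize_pid_file is hypothetical, and the choice to delete a corrupt pid file is an assumption based only on the error message's advice to remove the file if it is corrupted.

```shell
#!/bin/bash
# Hypothetical helper, NOT the code from the actual fix.
# Prints the pid stored in a pid file if it looks valid; otherwise removes
# the file (empty, comment-only, or non-numeric content) and prints nothing.
sanitize_pid_file() {
    local pidfile="$1"
    local pid=""

    # A missing file is not an error: there is nothing to clean up.
    [ -f "$pidfile" ] || return 0

    # Read the first line only. `read` returns non-zero on an empty file
    # or a file without a trailing newline, so ignore its exit status.
    IFS= read -r pid < "$pidfile" || true

    case "$pid" in
        ''|*[!0-9]*)
            # Empty, "# comment", or otherwise non-numeric: treat as corrupt.
            rm -f "$pidfile"
            ;;
        *)
            printf '%s\n' "$pid"
            ;;
    esac
}
```

A stop hook could call something like this on run/httpd.pid before invoking httpd, so that a blanked or commented-out pid file no longer aborts the stop action.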

Comment 3 Ram Ranganathan 2012-08-28 22:08:38 UTC
Will fix the related issue in a separate bug so that QE can test a simpler case here.

Fixed this issue with git commits: b137c82c49004cc8eae4dcd1c484ba622a6e0159 in li and 3e8c713e0b7eb2844b98a8269edce68fe01fc737 in crankcase. 

Waiting on pull requests: https://github.com/openshift/li/pull/326 and https://github.com/openshift/crankcase/pull/443

Comment 4 Jianwei Hou 2012-08-30 10:09:25 UTC
Created attachment 608132
development.log

Tested on devenv_2097

Failed to move application when the httpd.pid is empty

Steps to Reproduce:
1. Set up a multi-node environment
2. Create a php application
3. Move this app from one node to another
   rhc-admin-move --gear_uuid 0057251d5b5248abb4b419ab06b4d049
   The move was OK
4. Empty the httpd.pid file and move again

Move fails:

DEBUG: Starting cartridge 'php-5.3' in 'php1' after move on ip-10-191-178-229
DEBUG: Moving failed.  Rolling back gear 'php1' 'php1' with remove-httpd-proxy on 'ip-10-191-178-229'
DEBUG: Moving failed.  Rolling back gear 'php1' in 'php1' with destroy on 'ip-10-191-178-229'
/usr/lib/ruby/gems/1.8/gems/gearchanger-mcollective-plugin-0.3.1/lib/gearchanger-mcollective-plugin/gearchanger/mcollective_application_container_proxy.rb:1324:in `run_cartridge_command_old': Node execution failure (invalid exit code from node).  If the problem persists please contact Red Hat support. (StickShift::NodeException)
	from /var/www/stickshift/broker/lib/express/broker/mcollective_ext.rb:13:in `run_cartridge_command'
	from /usr/lib/ruby/gems/1.8/gems/gearchanger-mcollective-plugin-0.3.1/lib/gearchanger-mcollective-plugin/gearchanger/mcollective_application_container_proxy.rb:881:in `move_gear'
	from /usr/lib/ruby/gems/1.8/gems/gearchanger-mcollective-plugin-0.3.1/lib/gearchanger-mcollective-plugin/gearchanger/mcollective_application_container_proxy.rb:875:in `each'
	from /usr/lib/ruby/gems/1.8/gems/gearchanger-mcollective-plugin-0.3.1/lib/gearchanger-mcollective-plugin/gearchanger/mcollective_application_container_proxy.rb:875:in `move_gear'
	from /usr/bin/rhc-admin-move:109


Additional:
The move also failed with phpmyadmin and perl, so other cartridges may have the same problem.
After removing httpd.pid and moving again, the move succeeds.

Attached development.log
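
Since removing the empty httpd.pid is the workaround, it may help to locate such files before attempting a move. The following is a rough sketch under stated assumptions: the function name find_empty_pid_files is hypothetical, and the base directory to search is whatever directory holds the gears on the node. It only finds zero-length files, not pid files that contain a comment or other junk.

```shell
#!/bin/bash
# Sketch only: lists zero-length httpd.pid files below a base directory.
# The function name is hypothetical; pass the directory that holds the gears.
find_empty_pid_files() {
    local base_dir="$1"
    # -size 0 matches only files with no content at all.
    find "$base_dir" -type f -name httpd.pid -size 0
}
```

Each path printed is a candidate for removal before running the move again.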

Comment 5 Ram Ranganathan 2012-08-30 23:20:18 UTC
@Hou, can you please attach the mcollective log as well? Also, when you emptied the httpd.pid file, did you do that on php-5.3/run/httpd.pid or on phpmyadmin-3.4/run/httpd.pid?

The fix here was a simple one that only addresses phpmyadmin, rockmongo, phpmoadmin and metrics.


The other Apache-based apps (php/python/perl/ruby etc.) need a slightly more involved fix, as it is in the primary app control script. Will fix that as part of bug fixes next week.

Comment 6 Jianwei Hou 2012-08-31 09:34:02 UTC
(In reply to comment #5)
> @Hou, can you please attach the mcollective log as well.  Also the httpd.pid
> file 
> emptying -- did you do that on the php-5.3/run/httpd.pid or the
> phpmyadmin-3.4/run/httpd.pid file?
> 
> The fix here was a simple case done to only address: phpmyadmin, rockmongo,
> phpmoadmin and metrics.
> 
> 
> The other apache based apps ones (php/python/perl/ruby etc) needs a lil'
> more involved fix as its the primary app control script. Will fix that as
> part of bug fixes next week.

I did it on both: a php-5.3 application and an embedded phpmyadmin-3.4 cartridge. Both failed.
Since this fix covers only phpmyadmin, rockmongo, phpmoadmin and metrics, I won't focus on other cartridges any more.

I have tested phpmyadmin, rockmongo, phpmoadmin and metrics; they all fail when httpd.pid is empty.
I have attached development.log, mcollective.log and the error messages captured when the problem was encountered.

For the other Apache-based primary cartridges, I have filed Bug 853372 to keep track.

Comment 7 Jianwei Hou 2012-08-31 09:34:58 UTC
Created attachment 608447
development.log(2012-08-31)

Comment 8 Jianwei Hou 2012-08-31 09:35:39 UTC
Created attachment 608448
mcollective.log

Comment 9 Jianwei Hou 2012-08-31 09:36:32 UTC
Created attachment 608449
error messages

Comment 10 Ram Ranganathan 2012-09-05 02:31:09 UTC
Fixed with pull requests: 
https://github.com/openshift/crankcase/pull/452
https://github.com/openshift/li/pull/342

Waiting for merge and test.

Comment 11 Jianwei Hou 2012-09-05 10:50:49 UTC
Verified on devenv_2115

Steps:
1. Set up a multi-node environment
2. Create applications and embed phpmyadmin/rockmongo/phpmoadmin/metrics
3. Empty httpd.pid for the above cartridges
   :> /var/lib/stickshift/$UUID/$cartridge/run/httpd.pid
4. Move this app
   rhc-admin-move --gear_uuid $UUID -i $target_server_identity

Result:
Move is successful when httpd.pid for phpmyadmin/rockmongo/phpmoadmin/metrics is empty
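
Step 3 above could be scripted roughly as follows when several cartridges need the same treatment. This is only a sketch: the function name empty_pid_files is hypothetical, and the cartridge directory names are passed in by the caller because only phpmyadmin-3.4 is named in this bug; the versioned directory names of the other cartridges are not assumed here.

```shell
#!/bin/bash
# Sketch only: truncates <gear_dir>/<cartridge>/run/httpd.pid for each
# cartridge directory name given. Function name is hypothetical.
empty_pid_files() {
    local gear_dir="$1"; shift
    local cart pidfile
    for cart in "$@"; do
        pidfile="$gear_dir/$cart/run/httpd.pid"
        if [ -f "$pidfile" ]; then
            # Same truncation as the `:>` redirect used in step 3.
            : > "$pidfile"
            echo "emptied $pidfile"
        else
            echo "skipped $cart (no run/httpd.pid)" >&2
        fi
    done
}

# Example usage, with the path layout from step 3:
#   empty_pid_files "/var/lib/stickshift/$UUID" phpmyadmin-3.4
#   rhc-admin-move --gear_uuid "$UUID" -i "$target_server_identity"
```

Passing the cartridge names explicitly keeps the primary cartridge's pid file untouched, which matters here because the primary Apache-based cartridges are tracked separately in Bug 853372.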

