Bug 1007704

Summary: "Warning: Application XXX supervisor PID does not match '$OPENSHIFT_NODEJS_PID_DIR/supervisor.pid'" sometimes appears when stopping nodejs-0.10 and nodejs-0.6 apps
Product: OpenShift Online
Reporter: Zhe Wang <zhewang>
Component: Containers
Assignee: Fotios Lindiakos <fotios>
Status: CLOSED CURRENTRELEASE
QA Contact: libra bugs <libra-bugs>
Severity: medium
Priority: medium
Version: 2.x
CC: alex, fotios, jkeck, lzhang, xtian
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2013-09-19 16:51:12 UTC
Type: Bug

Description Zhe Wang 2013-09-13 07:33:49 UTC
Description of problem:
When deploying changes to a Node.js-0.10 app, it sometimes shows the warning:

supervisor PID does not match '$OPENSHIFT_NODEJS_PID_DIR/supervisor.pid'

and the deployment then fails with the error:

Error message: Failed to execute: 'control stop' for /var/lib/openshift/5232b3ec6cec0e43ea0000e0/nodejs

However, this bug was not caught in yesterday's test against INT.

Version-Release number of selected component (if applicable):
INT(devenv_3779)
devenv-stage_471

How reproducible:
6 of 19 nodejs-0.10 cases failed due to this problem during today's acceptance automation test against INT, and there was one similar failure in devenv-stage_471. Note that 9 jobs were running against INT.

Steps to Reproduce:
1. Create a Node.js-0.10 app
2. Make some local changes and push them to the remote repo

Actual results:
The deployment failed with the errors below:

remote: Stopping NodeJS cartridge 
remote: Warning: Application 'app1' supervisor PID does not match '$OPENSHIFT_NODEJS_PID_DIR/supervisor.pid'. Use force-stop to kill. 
remote: An error occurred executing 'gear prereceive' (exit code: 141) 
remote: Error message: Failed to execute: 'control stop' for /var/lib/openshift/5232b3ec6cec0e43ea0000e0/nodejs 
remote:
remote: For more details about the problem, try running the command again with the '--trace' option. 
To ssh://5232b3ec6cec0e43ea0000e0.rhcloud.com/~/git/app1.git/ 
! [remote rejected] master -> master (pre-receive hook declined) 
error: failed to push some refs to 'ssh://5232b3ec6cec0e43ea0000e0.rhcloud.com/~/git/app1.git/'
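
For triaging automation logs, the failure signature above can be picked out mechanically. The following is only a minimal sketch: the helper name summarize_push_output and the returned keys are hypothetical, and the patterns are taken solely from the log lines quoted in this report, so other releases may word them differently.

```python
import re

# Patterns copied from the push output quoted in this report (assumed stable).
PID_MISMATCH = re.compile(
    r"Warning: Application '([^']+)' supervisor PID does not match"
)
FAILED_COMMAND = re.compile(
    r"An error occurred executing '([^']+)' \(exit code: (\d+)\)"
)
REJECTED = re.compile(r"! \[remote rejected\]")


def summarize_push_output(output):
    """Summarize a captured `git push` log (hypothetical helper)."""
    mismatch = PID_MISMATCH.search(output)
    failure = FAILED_COMMAND.search(output)
    return {
        # Application named in the warning, if the warning is present.
        "app": mismatch.group(1) if mismatch else None,
        "pid_mismatch": mismatch is not None,
        # Command and exit code reported by the node, if any.
        "failed_command": failure.group(1) if failure else None,
        "exit_code": int(failure.group(2)) if failure else None,
        # True when git reports that the remote declined the push.
        "rejected": REJECTED.search(output) is not None,
    }
```

A log is counted as an instance of this bug only when pid_mismatch and rejected are both true; a warning on its own would not necessarily mean the deployment failed.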

Expected results:
The deployment should succeed.

Additional info:
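
The cartridge's control script is not quoted in this report, so the following is only an illustrative sketch of the kind of check the warning text implies: compare the PID recorded in supervisor.pid with the PID of the supervisor that is actually running. The function names and return values are assumptions, not the cartridge's real code.

```python
def read_pid_file(path):
    """Return the PID stored in `path`, or None if it is missing or malformed."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def check_supervisor_pid(pid_file, running_pid):
    """Compare the recorded PID with the PID observed for the supervisor.

    Returns "match", "mismatch", or "no-pid-file" (hypothetical labels).
    """
    recorded = read_pid_file(pid_file)
    if recorded is None:
        return "no-pid-file"
    return "match" if recorded == running_pid else "mismatch"
```

Under this sketch, a "mismatch" result corresponds to the situation in which the warning is printed and 'control stop' exits with an error instead of stopping the process.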

Comment 1 Zhe Wang 2013-09-16 10:35:42 UTC
This bug is reproducible in STG(devenv-stage_472), for example,

Comment 2 Zhe Wang 2013-09-16 10:37:17 UTC
This bug is also reproducible in STG (devenv-stage_472) when restoring a snapshot, causing the restore to fail:

rhc snapshot restore dbscale2 -f /home/slave1/workdir/2013/09/15/22:40:04/Snapshot_Restore_database_of_scalable_app-nodejs-0_10/dbscale1.tar.gz -l jizhao+3 -p 'redhat' --server stg.openshift.redhat.com

Warning: Application 'dbscale2' supervisor PID does not match '$OPENSHIFT_NODEJS_PID_DIR/supervisor.pid'. Use force-stop to kill. 

/opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.14.7/lib/openshift-origin-node/model/v2_cart_model.rb:1160:in `block in do_control_with_directory': Failed to execute: 'control stop' for /var/lib/openshift/5236c30adbd93c6ebe0001e2/nodejs (OpenShift::Runtime::Utils::ShellExecutionException)
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.14.7/lib/openshift-origin-node/model/v2_cart_model.rb:969:in `process_cartridges'
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.14.7/lib/openshift-origin-node/model/v2_cart_model.rb:1128:in `do_control_with_directory'
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.14.7/lib/openshift-origin-node/model/v2_cart_model.rb:991:in `do_control'
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.14.7/lib/openshift-origin-node/model/v2_cart_model.rb:1382:in `stop_cartridge'
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.14.7/lib/openshift-origin-node/model/v2_cart_model.rb:1237:in `block in stop_gear'
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.14.7/lib/openshift-origin-node/model/v2_cart_model.rb:78:in `block in each_cartridge'
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.14.7/lib/openshift-origin-node/model/v2_cart_model.rb:975:in `block in process_cartridges'
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.14.7/lib/openshift-origin-node/model/v2_cart_model.rb:973:in `each'
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.14.7/lib/openshift-origin-node/model/v2_cart_model.rb:973:in `process_cartridges'
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.14.7/lib/openshift-origin-node/model/v2_cart_model.rb:76:in `each_cartridge'
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.14.7/lib/openshift-origin-node/model/v2_cart_model.rb:1236:in `stop_gear'
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.14.7/lib/openshift-origin-node/model/application_container.rb:398:in `stop_gear'
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.14.7/lib/openshift-origin-node/model/application_container_ext/cartridge_actions.rb:201:in `pre_receive'
    from /opt/rh/ruby193/root/usr/share/gems/gems/openshift-origin-node-1.14.7/lib/openshift-origin-node/model/application_container_ext/snapshots.rb:127:in `restore'
    from /usr/bin/gear:306:in `block (2 levels) in '
    from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:180:in `call'
    from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:180:in `call'
    from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/command.rb:155:in `run'
    from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/runner.rb:385:in `run_active_command'
    from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/runner.rb:62:in `run!'
    from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/delegates.rb:11:in `run!'
    from /opt/rh/ruby193/root/usr/share/gems/gems/commander-4.0.3/lib/commander/import.rb:10:in `block in '

Restoring from snapshot /home/slave1/workdir/2013/09/15/22:40:04/Snapshot_Restore_database_of_scalable_app-nodejs-0_10/dbscale1.tar.gz...
Error in trying to restore snapshot. You can try to restore manually by running:
cat '/home/slave1/workdir/2013/09/15/22:40:04/Snapshot_Restore_database_of_scalable_app-nodejs-0_10/dbscale1.tar.gz' | ssh 5236c30adbd93c6ebe0001e2.rhcloud.com 'restore INCLUDE_GIT'

Comment 3 Fotios Lindiakos 2013-09-16 14:31:21 UTC
Testing and merging a PR for this: 
https://github.com/openshift/origin-server/pull/3640

Since this is after stage cut, there is a separate stage PR: 
https://github.com/openshift/origin-server/pull/3642

Comment 4 Zhe Wang 2013-09-17 09:47:12 UTC
This problem no longer occurs in tests against devenv-stage_477 and STG (devenv-stage_475). Moving this bug to VERIFIED.

Thanks,
z.

Comment 5 Alex Knol 2013-10-01 14:19:14 UTC
Does this mean it should be fixed in the current version that is live on openshift.com?