Description of problem:
If the JBoss process terminates between this check:

    local should_be_gone_pid=$(ps -o pid -p ${_pid} --no-headers)

and this kill command:

    kill -TERM ${_pid}

then the restart will fail.

Reported error:

    Error: Failed to execute: 'control restart' for /var/lib/openshift/<gear_id>/jbossas
    Stopping jbossas cartridge
    Sending SIGTERM to jboss:54974 ...
    /var/lib/openshift/<gear_id>/jbossas/bin/control: line 132: kill: (54974) - No such process
    -- Unable to complete the requested operation due to: Failed to correctly execute all parallel operations - ["RestartCompOp"].

Version-Release number of selected component (if applicable):
Occurs in PROD

Expected results:
Since the process has already terminated, the restart should continue and attempt to start JBoss.
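For illustration, the race between the ps check and kill -TERM can be tolerated by treating a failed kill as success when the process is already gone. This is a minimal sketch, not the code from the actual fix: the function name term_if_running and its messages are hypothetical, and it assumes a ps that accepts -p, as the control script's does.

```shell
#!/bin/bash
# Hypothetical helper (not the cartridge's actual code): send SIGTERM to a
# PID, and treat "process already exited" as success instead of an error.
term_if_running() {
    local pid="$1"

    # kill returns non-zero when the signal could not be delivered,
    # for example because the process exited after an earlier ps check.
    if kill -TERM "$pid" 2>/dev/null; then
        echo "Sent SIGTERM to $pid"
        return 0
    fi

    # kill failed: find out whether the process is simply gone.
    if ! ps -p "$pid" >/dev/null 2>&1; then
        echo "Process $pid already terminated"
        return 0
    fi

    # The process still exists but could not be signalled (e.g. permissions).
    echo "Failed to signal $pid" >&2
    return 1
}
```

With a helper like this, a stop routine can call term_if_running "${_pid}" and carry on to the start step whenever it returns 0, which matches the expected result above.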
https://github.com/openshift/origin-server/pull/4523
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/e54d18afa7868f9bc8ca14a162816fbefa7f3ee9
Bug 1055646 - [new relic] JBossAS cart restart fails if kill -TERM is called when process has already terminated
Checked on devenv_4248. To make the issue easy to simulate, added "sleep 15" after the line

    local should_be_gone_pid=$(ps -o pid -p ${_pid} --no-headers)

During the app restart, running kill -TERM on the java process no longer breaks the operation.

2014-01-21 03:42:44.767 [DEBUG] Execute RestartCompOp (pid:2693)
2014-01-21 03:44:11.310 [DEBUG] DEBUG: Output of parallel execute: [{:tag=>{"op_id"=>"52de33045bb9a6a072000020"}, :gear=>"52de32755bb9a6a072000006", :job=>{:cartridge=>"openshift-origin-node", :action=>"restart", :args=>{"--with-app-uuid"=>"52de32755bb9a6a072000006", "--with-app-name"=>"jbas1", "--with-container-uuid"=>"52de32755bb9a6a072000006", "--with-container-name"=>"jbas1", "--with-namespace"=>"bmengdev", "--with-request-id"=>"92f17c1338dc83445de57ebefc967051", "--cart-name"=>"jbossas-7", "--component-name"=>"jbossas-7", "--with-software-version"=>"7", "--cartridge-vendor"=>"redhat", "--all"=>false, "--parallel_concurrency_ratio"=>0.5}}, :result_stdout=>"", :result_stderr=>"", :result_exit_code=>0, :result_addtl_params=>nil}], exitcode: 0, from: ip-10-238-239-6 (Request ID: 92f17c1338dc83445de57ebefc967051) (pid:2693)
2014-01-21 03:44:11.312 [DEBUG] DEBUG: MCollective Response Time (execute_parallel): 86542ms (Request ID: 92f17c1338dc83445de57ebefc967051) (pid:2693)
2014-01-21 03:44:11.379 [DEBUG] SUCCESS ACTION=RESTART_APPLICATION USER_ID=52de32695bb9a6a072000001 LOGIN=bmeng APP_UUID=52de32755bb9a6a072000006 DOMAIN=bmengdev Application jbas1 has restarted (pid:2693)
2014-01-21 03:44:11.396 [INFO ] Completed 200 OK in 86653ms (Views: 13.8ms) (pid:2693)