Bug 1055646 - [new relic] JBossAS cart restart fails if kill -TERM is called when process has already terminated
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Image
Version: 1.x
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Ben Parees
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-01-20 16:40 UTC by Jessica Forrester
Modified: 2014-01-30 00:56 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-01-30 00:56:37 UTC
Target Upstream Version:
Embargoed:



Description Jessica Forrester 2014-01-20 16:40:35 UTC
Description of problem:
If the JBoss process terminates between this check:

local should_be_gone_pid=$(ps -o pid -p ${_pid} --no-headers)

and this kill -TERM command:

kill -TERM ${_pid}

then kill fails with "No such process" and the restart fails.

Reported error:
Error: Failed to execute: 'control restart' for /var/lib/openshift/<gear_id>/jbossas
Stopping jbossas cartridge
Sending SIGTERM to jboss:54974 ...
/var/lib/openshift/<gear_id>/jbossas/bin/control: line 132: kill: (54974) - No such process
-- Unable to complete the requested operation due to: Failed to correctly execute all parallel operations - ["RestartCompOp"].

Version-Release number of selected component (if applicable):
Occurs in PROD


Expected results:
Since the process has already terminated, the restart should continue and attempt to start JBoss.
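
One way to get this behaviour is to treat a failed kill -TERM as harmless when the process is already gone. The following is only a minimal sketch under that assumption, not the change that was committed for this bug. The function name term_if_running is hypothetical; the ps invocation is the same check quoted in the description.

```shell
#!/bin/bash
# Hypothetical helper (not from the cartridge): send SIGTERM to a pid,
# but do not report failure if the process has already exited.
term_if_running() {
  local _pid=$1

  # Normal case: the process is still there and accepts the signal.
  if kill -TERM "${_pid}" 2>/dev/null; then
    return 0
  fi

  # kill failed. Re-run the same ps check used in the control script:
  # empty output means no process with that pid exists any more.
  if [ -z "$(ps -o pid -p "${_pid}" --no-headers 2>/dev/null)" ]; then
    # The process exited on its own; there is nothing left to stop.
    return 0
  fi

  # The process still exists but could not be signalled: a real error.
  echo "Unable to send SIGTERM to ${_pid}" >&2
  return 1
}
```

With a helper like this, the stop step returns success in the race described above, so the restart can go on to the start step.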

Comment 3 openshift-github-bot 2014-01-20 23:08:08 UTC
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/e54d18afa7868f9bc8ca14a162816fbefa7f3ee9
Bug 1055646 - [new relic] JBossAS cart restart fails if kill -TERM is called when process has already terminated

Comment 4 Meng Bo 2014-01-21 08:48:20 UTC
Checked on devenv_4248.

To make the issue easy to simulate, add "sleep 15" after the line
local should_be_gone_pid=$(ps -o pid -p ${_pid} --no-headers)

With that delay in place, running kill -TERM on the java process during the app restart no longer breaks the operation.
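
The underlying race can also be illustrated outside the cartridge. This is a standalone sketch, not the control script: a short-lived sleep stands in for the JBoss process, and wait plays the role of the artificial delay by guaranteeing the process has exited before kill runs. The variable race_status is introduced here for illustration only.

```shell
#!/bin/bash
# Standalone illustration of the race; not the cartridge control script.
sleep 1 &                       # stand-in for the JBoss process
_pid=$!

# The check from the control script: at this point the process is still alive.
should_be_gone_pid=$(ps -o pid -p "${_pid}" --no-headers 2>/dev/null)

# Stand-in for the delay: block until the process has exited on its own.
wait "${_pid}"

# The pid no longer exists, so kill fails instead of stopping anything.
race_status=0
kill -TERM "${_pid}" || race_status=$?
echo "kill exit status: ${race_status}"
```

A non-zero status from kill at this point is the condition that used to abort the restart.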



2014-01-21 03:42:44.767 [DEBUG] Execute RestartCompOp (pid:2693)
2014-01-21 03:44:11.310 [DEBUG] DEBUG: Output of parallel execute: [{:tag=>{"op_id"=>"52de33045bb9a6a072000020"}, :gear=>"52de32755bb9a6a072000006", :job=>{:cartridge=>"openshift-origin-node", :action=>"restart", :args=>{"--with-app-uuid"=>"52de32755bb9a6a072000006", "--with-app-name"=>"jbas1", "--with-container-uuid"=>"52de32755bb9a6a072000006", "--with-container-name"=>"jbas1", "--with-namespace"=>"bmengdev", "--with-request-id"=>"92f17c1338dc83445de57ebefc967051", "--cart-name"=>"jbossas-7", "--component-name"=>"jbossas-7", "--with-software-version"=>"7", "--cartridge-vendor"=>"redhat", "--all"=>false, "--parallel_concurrency_ratio"=>0.5}}, :result_stdout=>"", :result_stderr=>"", :result_exit_code=>0, :result_addtl_params=>nil}], exitcode: 0, from: ip-10-238-239-6  (Request ID: 92f17c1338dc83445de57ebefc967051) (pid:2693)
2014-01-21 03:44:11.312 [DEBUG] DEBUG: MCollective Response Time (execute_parallel): 86542ms  (Request ID: 92f17c1338dc83445de57ebefc967051) (pid:2693)
2014-01-21 03:44:11.379 [DEBUG] SUCCESS ACTION=RESTART_APPLICATION USER_ID=52de32695bb9a6a072000001 LOGIN=bmeng APP_UUID=52de32755bb9a6a072000006 DOMAIN=bmengdev Application jbas1 has restarted (pid:2693)
2014-01-21 03:44:11.396 [INFO ] Completed 200 OK in 86653ms (Views: 13.8ms) (pid:2693)

