Bug 1028205 - 'oo-admin-ctl-gears forcestopgear' fails to stop locked gears
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Online
Classification: Red Hat
Component: Containers
Version: 1.x
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Dan McPherson
QA Contact: libra bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2013-11-07 21:58 UTC by Stefanie Forrester
Modified: 2014-01-30 00:49 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-01-30 00:49:54 UTC
Target Upstream Version:
Embargoed:



Description Stefanie Forrester 2013-11-07 21:58:21 UTC
Description of problem:

Sometimes processes are left running after a gear is "stopped". (Specifically, we're seeing this with Java processes.)

A process is running under the gear's UUID, but the gear is in a "locked" state, as if it had tried to terminate. Attempting to terminate the process using 'oo-admin-ctl-gears forcestopgear <uuid>' fails.


Version-Release number of selected component (if applicable): 
openshift-origin-node-util-1.16.3-1.el6oso.noarch


How reproducible:

Very reproducible, if you can find one of these locked gears (which appear somewhat frequently in production). So far, 'forcestopgear' has failed to stop the gear every time.

Steps to Reproduce:
1. Look at 'top' to identify processes consuming the most swap.
2. Try to restart the gear associated with that process. The restart fails and reports that the gear is locked.

[sedgar ~]$ sudo oo-admin-ctl-gears restartgear  2364857a1c3e443fbb31bc00107e0d5d
Gear is locked: 2364857a1c3e443fbb31bc00107e0d5d

3. Stopping and force-stopping the gear also fail. The process can only be killed with SIGKILL.

[sedgar ~]$ sudo oo-admin-ctl-gears stopgear 2364857a1c3e443fbb31bc00107e0d5d
Gear is locked: 2364857a1c3e443fbb31bc00107e0d5d

[sedgar ~]$ sudo oo-admin-ctl-gears forcestopgear 2364857a1c3e443fbb31bc00107e0d5d
Gear is locked: 2364857a1c3e443fbb31bc00107e0d5d
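
As an alternative to reading 'top' by eye in step 1, per-process swap usage can be read from the VmSwap field of /proc/<pid>/status on Linux. The following is only a rough sketch, not part of oo-admin-ctl-gears: the function name swap_by_process and its optional directory argument are made up for illustration (the argument exists only so the function can be pointed at a test directory instead of /proc).

```shell
# Print "swap_kB pid uid name" for every process that has pages in swap,
# highest swap usage first. Hypothetical helper, not an OpenShift tool.
swap_by_process() {
    proc_root="${1:-/proc}"          # assumption: override only for testing
    for status in "$proc_root"/[0-9]*/status; do
        [ -r "$status" ] || continue # process may have exited already
        awk '
            /^Name:/   { name = $2 }
            /^Pid:/    { pid = $2 }
            /^Uid:/    { uid = $2 }  # real UID of the owning user
            /^VmSwap:/ { swap = $2 } # value is reported in kB
            END { if (swap > 0) printf "%d %s %s %s\n", swap, pid, uid, name }
        ' "$status" 2>/dev/null
    done | sort -rn
}
```

Running `swap_by_process | head -n 10` would list the ten heaviest swap users. The third column is a numeric UID; `getent passwd <uid>` maps it back to a user name, which should correspond to the gear, since the report describes the process as running under the gear's UUID.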

Actual results:
'oo-admin-ctl-gears forcestopgear' is unable to stop the gear.

Expected results:
'oo-admin-ctl-gears forcestopgear' should be successful in terminating the remaining gear processes. 

Additional info:

Comment 1 Dan McPherson 2013-11-09 00:13:08 UTC
https://github.com/openshift/origin-server/pull/4141

Comment 2 Dan McPherson 2013-11-09 00:16:30 UTC
I changed both stop and force stop to ignore the stop lock. The stop lock should only be used to prevent an admin from starting/restarting a gear.

Comment 4 Meng Bo 2013-11-11 05:47:16 UTC
Checked on devenv_4016; the issue has been fixed.

restartgear, stopgear, and forcestopgear now ignore the stop_lock.


# oo-admin-ctl-gears startgear 52806dc76f25e958b0000004
Gear is locked: 52806dc76f25e958b0000004
# oo-admin-ctl-gears restartgear 52806dc76f25e958b0000004
Restarting gear 52806dc76f25e958b0000004... [ OK ]

# oo-admin-ctl-gears startgear 52806dc76f25e958b0000004
Gear is locked: 52806dc76f25e958b0000004
# oo-admin-ctl-gears stopgear 52806dc76f25e958b0000004
Stopping gear 52806dc76f25e958b0000004... [ OK ]

# oo-admin-ctl-gears startgear 52806dc76f25e958b0000004
Gear is locked: 52806dc76f25e958b0000004
# oo-admin-ctl-gears forcestopgear 52806dc76f25e958b0000004
Stopping gear 52806dc76f25e958b0000004... [ OK ]

Moving bug to VERIFIED.
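
The behavior observed in the transcript above can be summarized as a small decision function. This is a hypothetical model written only to restate what the verification shows (startgear is refused on a locked gear; restartgear, stopgear, and forcestopgear proceed). It is not the oo-admin-ctl-gears source, and the function name action_allowed and its "yes"/"no" lock argument are invented for the sketch.

```shell
# Return 0 if the action may proceed, 1 if the stop lock blocks it,
# 2 for an action this sketch does not model.
action_allowed() {
    action="$1"
    locked="$2"    # "yes" when a stop lock is present (assumed convention)
    case "$action" in
        restartgear|stopgear|forcestopgear)
            return 0 ;;                   # stop lock is ignored
        startgear)
            [ "$locked" != "yes" ] ;;     # stop lock blocks a plain start
        *)
            echo "unknown action: $action" >&2
            return 2 ;;
    esac
}
```

For example, `action_allowed forcestopgear yes` succeeds, while `action_allowed startgear yes` fails, matching the "Gear is locked" output seen for startgear.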

