Bug 1508366

Summary: docker agent ignores existing but stopped containers
Product: Red Hat Enterprise Linux 7 Reporter: Andrew Beekhof <abeekhof>
Component: resource-agents Assignee: Oyvind Albrigtsen <oalbrigt>
Status: CLOSED ERRATA QA Contact: Udi Shkalim <ushkalim>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 7.4 CC: agk, cluster-maint, fdinitto, mnovacek
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: resource-agents-3.9.5-111.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-04-10 12:09:28 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Andrew Beekhof 2017-11-01 10:14:55 UTC
Description of problem:

When the 'reuse' parameter is false, the agent must treat the existence of a terminated container as an error.

Otherwise, a subsequent attempt to start it will result in an error because a container with the same name already exists.

Version-Release number of selected component (if applicable):

resource-agents-3.9.5-105.el7.x86_64

How reproducible:

100%

Steps to Reproduce:
1. Manually create a container named 'badContainer'
2. docker stop badContainer
3. Create a docker cluster resource called badContainer and arrange for it to start on the above node

Actual results:

monitor reports 'not running' and start fails

Expected results:

monitor detects an error, stop is called, and start then succeeds

Additional info:

Comment 3 Andrew Beekhof 2017-11-01 10:54:37 UTC
Patch against 3.9.5-105.el7.x86_64:


--- docker	2017-11-01 21:50:21.423600738 +1100
+++ docker.beekhof	2017-11-01 21:48:10.570475949 +1100
 #######################################################################
 
@@ -234,16 +235,16 @@ docker_simple_status()
 
 	# retrieve the 'Running' attribute for the container
 	val=$(docker inspect --format {{.State.Running}} $CONTAINER 2>/dev/null)
-	if [ $? -ne 0 ]; then
-		#not running as a result of container not being found
-		return $OCF_NOT_RUNNING
-	fi
-
-	if ocf_is_true "$val"; then
+	if [ $? -eq 0 ]; then
+	    if ocf_is_true "$val" ; then
 		# container exists and is running
 		return $OCF_SUCCESS
 	fi
-
+	fi
+	# Known but in a stopped state
+	if ! ocf_is_true "$OCF_RESKEY_reuse"; then
+	    return $OCF_ERR_GENERIC
+	fi
 	return $OCF_NOT_RUNNING
 }
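
For readers who want to see the intended decision table outside the diff, here is a minimal standalone sketch. It is not the shipped fix. The names simple_status, inspect_running and is_true are hypothetical stand-ins for docker_simple_status, the docker inspect call and ocf_is_true. The sketch also assumes that a container which does not exist at all should still report "not running"; the hunk above does not separate that case from the stopped case, so treat the split as an assumption about the desired behaviour.

```shell
#!/bin/sh
# OCF return codes as defined by the OCF resource agent API.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# Hypothetical stand-in for the agent's ocf_is_true helper.
is_true() {
	case "$1" in
		1|yes|true|on|YES|TRUE|ON|True) return 0 ;;
		*) return 1 ;;
	esac
}

# Hypothetical wrapper: prints "true" or "false" for a known container
# and fails when docker does not know the container.
inspect_running() {
	docker inspect --format '{{.State.Running}}' "$1" 2>/dev/null
}

# Usage: simple_status <container-name> <reuse-flag>
simple_status() {
	container=$1
	reuse=$2

	if ! val=$(inspect_running "$container"); then
		# container not found: nothing is left over, so start may proceed
		return $OCF_NOT_RUNNING
	fi

	if is_true "$val"; then
		# container exists and is running
		return $OCF_SUCCESS
	fi

	# container exists but is stopped
	if ! is_true "$reuse"; then
		# assumed intent: report an error so that stop removes the
		# leftover container before start is attempted
		return $OCF_ERR_GENERIC
	fi
	return $OCF_NOT_RUNNING
}
```

With reuse set to false, a stopped badContainer yields OCF_ERR_GENERIC, which matches the "monitor detects an error" outcome described under Expected results.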

Comment 7 errata-xmlrpc 2018-04-10 12:09:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0757