Bug 1491040
| Summary: | After openshift node process was restarted, pods with init-containers show in pending state | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Steven Walter <stwalter> |
| Component: | Node | Assignee: | Seth Jennings <sjenning> |
| Status: | CLOSED ERRATA | QA Contact: | DeShuai Ma <dma> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 3.5.0 | CC: | aos-bugs, byount, decarr, emahoney, jokerman, mmccomas, nbhatt, rhowe |
| Target Milestone: | --- | | |
| Target Release: | 3.6.z | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | After an atomic-openshift-node restart, pods with init containers may have their status reset when an attempt is made to reread the init container status. If the container has been deleted, this reread fails, resetting the status. This fix prevents rereading the init container status if the current status on the pod resource indicates the init container is terminated. | | |
| Story Points: | --- | | |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-10-25 13:06:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
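The Doc Text above describes the fix only at a high level. The sketch below is plain Go, not the actual kubelet code; the `ContainerStatus` type and `shouldRereadInitStatus` function are invented for illustration. It shows the shape of the guard: if the status already recorded on the pod resource says the init container terminated, skip rereading it from the container runtime (where the container may already have been deleted).

```go
package main

import "fmt"

// ContainerStatus is a simplified stand-in for the kubelet's container
// status record (hypothetical type, for illustration only).
type ContainerStatus struct {
	Name       string
	Terminated bool // true if the recorded state is Terminated
	ExitCode   int
}

// shouldRereadInitStatus sketches the guard described in the Doc Text:
// when the status already on the pod resource shows the init container
// as terminated, keep it rather than querying the runtime again, since
// the container may have been deleted and the lookup would fail,
// resetting the pod back to a pending-like state.
func shouldRereadInitStatus(recorded *ContainerStatus) bool {
	if recorded != nil && recorded.Terminated {
		return false // preserve the recorded terminated status
	}
	return true // no terminated record yet; query the runtime
}

func main() {
	done := &ContainerStatus{Name: "init", Terminated: true, ExitCode: 0}
	fmt.Println(shouldRereadInitStatus(done)) // terminated: do not reread
	fmt.Println(shouldRereadInitStatus(nil))  // unknown: must reread
}
```

Before the fix, the kubelet effectively took the "reread" path unconditionally, which is why a node restart after init-container cleanup could flip a Running pod's status back to pending.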
Description
Steven Walter
2017-09-12 21:06:49 UTC
Verify on openshift v3.6.173.0.45

```
[root@ip-172-18-5-103 ~]# openshift version
openshift v3.6.173.0.45
kubernetes v1.6.1+5115d708d7
etcd 3.2.1
```

1. Create a pod with an initContainer:

```
[root@ip-172-18-5-103 ~]# oc new-project dma
Now using project "dma" on server "https://ip-172-18-5-103.ec2.internal:8443".

You can add applications to this project with the 'new-app' command. For example, try:

    oc new-app centos/ruby-22-centos7~https://github.com/openshift/ruby-ex.git

to build a new example application in Ruby.
[root@ip-172-18-5-103 ~]# oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/pods/initContainers/Promote_InitContainers/pod-init-containers-success.yaml
pod "init-success" created
[root@ip-172-18-5-103 ~]# oc get po -w
NAME           READY     STATUS            RESTARTS   AGE
init-success   0/1       Init:0/1          0          9s
init-success   0/1       PodInitializing   0          26s
init-success   1/1       Running           0          55s
[root@ip-172-18-5-103 ~]# oc get po -o wide
NAME           READY     STATUS    RESTARTS   AGE       IP            NODE
init-success   1/1       Running   0          2m        10.128.0.12   ip-172-18-8-39.ec2.internal
```

2. Once the pod is running, restart the node the pod is located on:

```
[root@ip-172-18-8-39 ~]# systemctl restart atomic-openshift-node
[root@ip-172-18-8-39 ~]#
```

3. Check the pod status again: the initContainer is Terminated and the app container is running:

```
[root@ip-172-18-5-103 ~]# oc describe po init-success
Name:            init-success
Namespace:       dma
Security Policy: anyuid
Node:            ip-172-18-8-39.ec2.internal/172.18.8.39
Start Time:      Tue, 10 Oct 2017 05:54:20 -0400
Labels:          <none>
Annotations:     openshift.io/scc=anyuid
Status:          Running
IP:              10.128.0.12
Controllers:     <none>
Init Containers:
  init:
    Container ID:  docker://1ca1097e96ece80e41a9cfcc89624e0f70d693ee0bfa5331ff0483cea1e7784e
    Image:         centos:centos7
    Image ID:      docker-pullable://docker.io/centos@sha256:eba772bac22c86d7d6e72421b4700c3f894ab6e35475a34014ff8de74c10872e
    Port:
    Command:
      /bin/true
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 10 Oct 2017 05:54:46 -0400
      Finished:     Tue, 10 Oct 2017 05:54:46 -0400
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-g562k (ro)
Containers:
  hello-pod:
    Container ID:  docker://0d963e9f67d5f254679c03d449bc8b1c07555ff729fe830d42cd7d5236dcad3f
    Image:         docker.io/ocpqe/hello-pod
    Image ID:      docker-pullable://docker.io/ocpqe/hello-pod@sha256:289953c559120c7d2ca92d92810885887ee45c871c373a1e492e845eca575b8c
    Port:          80/TCP
    State:         Running
      Started:     Tue, 10 Oct 2017 05:55:15 -0400
    Ready:         True
    Restart Count: 0
    Environment:   <none>
    Mounts:
      /usr/share/nginx/html from workdir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-g562k (ro)
Conditions:
  Type          Status
  Initialized   True
  Ready         True
  PodScheduled  True
Volumes:
  workdir:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  default-token-g562k:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-g562k
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  FirstSeen  LastSeen  Count  From                                  SubObjectPath               Type    Reason     Message
  ---------  --------  -----  ----                                  -------------               ------  ------     -------
  5m         5m        1      default-scheduler                                                 Normal  Scheduled  Successfully assigned init-success to ip-172-18-8-39.ec2.internal
  5m         5m        1      kubelet, ip-172-18-8-39.ec2.internal  spec.initContainers{init}   Normal  Pulling    pulling image "centos:centos7"
  4m         4m        1      kubelet, ip-172-18-8-39.ec2.internal  spec.initContainers{init}   Normal  Pulled     Successfully pulled image "centos:centos7"
  4m         4m        1      kubelet, ip-172-18-8-39.ec2.internal  spec.initContainers{init}   Normal  Created    Created container
  4m         4m        1      kubelet, ip-172-18-8-39.ec2.internal  spec.initContainers{init}   Normal  Started    Started container
  4m         4m        1      kubelet, ip-172-18-8-39.ec2.internal  spec.containers{hello-pod}  Normal  Pulling    pulling image "docker.io/ocpqe/hello-pod"
  4m         4m        1      kubelet, ip-172-18-8-39.ec2.internal  spec.containers{hello-pod}  Normal  Pulled     Successfully pulled image "docker.io/ocpqe/hello-pod"
  4m         4m        1      kubelet, ip-172-18-8-39.ec2.internal  spec.containers{hello-pod}  Normal  Created    Created container
  4m         4m        1      kubelet, ip-172-18-8-39.ec2.internal  spec.containers{hello-pod}  Normal  Started    Started container
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3049