Bug 1318681 - The pod's state is different from web UI and CLI
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Management Console
Version: 3.1.0
Hardware: Unspecified Unspecified
Priority: medium
Severity: medium
Assigned To: Samuel Padgett
QA Contact: Yadan Pei
Depends On: 1318497
Blocks:
Reported: 2016-03-17 09:40 EDT by Eric Rich
Modified: 2018-03-29 22:23 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1318497
Environment:
Last Closed: 2016-05-12 12:33:19 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
webconsole output (105.80 KB, application/pdf)
2016-03-17 17:45 EDT, Avesh Agarwal


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 2317401 | None | None | None | 2018-03-29 22:23 EDT
Red Hat Product Errata | RHSA-2016:1064 | normal | SHIPPED_LIVE | Important: Red Hat OpenShift Enterprise 3.2 security, bug fix, and enhancement update | 2016-05-12 16:19:17 EDT

Description Eric Rich 2016-03-17 09:40:59 EDT
+++ This bug was initially created as a clone of Bug #1318497 +++

Description of problem:

According to the user's report, the CLI shows the pod as "Terminating" while the web UI shows it as "Pending".

Version-Release number of selected component (if applicable):

- v3.1.1.6

How reproducible:

- Unclear

Actual results:

- The pod remains in "Terminating" status in the CLI, while the UI shows "Pending"

Expected results:

- The pod state shown in the UI and the CLI should match

Additional info:
Comment 1 Andy Goldstein 2016-03-17 15:04:41 EDT
Avesh, please review the support case, triage, and let me know if you think this is something we need to handle or if it's a console (UI) issue.
Comment 2 Avesh Agarwal 2016-03-17 16:35:39 EDT
I tested with the latest origin using both the UI and the CLI. I can reproduce it to some extent, but with a difference: the CLI (oc get and oc describe) shows Terminating, which is correct, whereas the UI shows "creatingcontainer" (I forgot to capture the screen and will post it shortly). So right now it really seems to be a UI issue, unless there is another way to reproduce it.

Output from CLI:

[root@localhost origin]# oc get pods
NAME        READY     STATUS        RESTARTS   AGE
hello-pod   0/1       Terminating   0          37s
[root@localhost origin]# oc describe pod
Name:                           hello-pod
Namespace:                      default
Node:                           192.168.122.253/192.168.122.253
Start Time:                     Thu, 17 Mar 2016 16:16:00 -0400
Labels:                         name=hello-pod
Status:                         Terminating (expires Thu, 17 Mar 2016 16:16:40 -0400)
Termination Grace Period:       30s
IP:
Controllers:                    <none>
Containers:
  hello-pod:
    Container ID:
    Image:              pweil/hello-nginx-docker
    Image ID:
    Port:
    QoS Tier:
      cpu:              BestEffort
      memory:           BestEffort
    State:              Waiting
      Reason:           ContainerCreating
    Ready:              False
    Restart Count:      0
    Environment Variables:
Conditions:
  Type          Status
  Ready         False
Volumes:
  default-token-28x1v:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-28x1v
Events:
  FirstSeen     LastSeen        Count   From                            SubobjectPath                   Type            Reason          Message
  ---------     --------        -----   ----                            -------------                   --------        ------          -------
  38s           38s             1       {default-scheduler }                                            Normal          Scheduled       Successfully assigned hello-pod to 192.168.122.253
  34s           34s             1       {kubelet 192.168.122.253}       spec.containers{hello-pod}      Normal          Pulling         pulling image "pweil/hello-nginx-docker"
Comment 3 Avesh Agarwal 2016-03-17 16:36:10 EDT
I will test soon on 3.1.1.6 but wanted to test UI first with latest origin.
Comment 4 Andy Goldstein 2016-03-17 16:52:20 EDT
We need an sosreport before we can debug much more
Comment 5 Avesh Agarwal 2016-03-17 17:42:27 EDT
Summary: I tried different steps to reproduce it with the latest origin. This time the pod (running a busybox container with a sleep command with a very high value such as 9999999) was in the Running state. When I deleted it with a grace period of 200 seconds, both oc get and oc describe showed the pod as Terminating, but the web UI showed the "Running" state.

Steps:
1. One master and one node with latest origin
2. busybox image was already pulled. 
3. oc create -f /root/test-pod.yaml
cat /root/test-pod.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: busybox
  namespace: test
spec:  # specification of the pod's contents
  restartPolicy: Never
  containers:
  - name: busybox
    image: "busybox"
    command: ["sleep", "999999"]
4. oc delete -f /root/test-pod.yaml  --grace-period=200 
  

The difference from the previous test is that there the image was still being pulled when oc delete was executed, whereas here the image already exists and the pod is in the Running state. After oc delete, the pod stays in Terminating until the grace period expires, because the container does not exit before then.
Output:

[root@localhost origin]# oc describe pod -n test
Name:				busybox
Namespace:			test
Node:				192.168.122.253/192.168.122.253
Start Time:			Thu, 17 Mar 2016 17:30:11 -0400
Labels:				<none>
Status:				Terminating (expires Thu, 17 Mar 2016 17:34:22 -0400)
Termination Grace Period:	200s
IP:				172.17.0.2
Controllers:			<none>
Containers:
  busybox:
    Container ID:	docker://2c385747bee291749e19f531a9c75c75bd58f78ec1c6430ebed188221f85b912
    Image:		busybox
    Image ID:		docker://559d41a5eba1c166e63a27a766321d494946ad220c1636be26c83b23ff7549f2
    Port:		
    Command:
      sleep
      999999
    QoS Tier:
      memory:		BestEffort
      cpu:		BestEffort
    State:		Running
      Started:		Thu, 17 Mar 2016 17:30:14 -0400
    Ready:		True
    Restart Count:	0
    Environment Variables:
Conditions:
  Type		Status
  Ready 	True 
Volumes:
  default-token-mrbqp:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	default-token-mrbqp
Events:
  FirstSeen	LastSeen	Count	From				SubobjectPath			Type		Reason		Message
  ---------	--------	-----	----				-------------			--------	------		-------
  2m		2m		1	{default-scheduler }						Normal		Scheduled	Successfully assigned busybox to 192.168.122.253
  2m		2m		1	{kubelet 192.168.122.253}	spec.containers{busybox}	Normal		Pulling		pulling image "busybox"
  2m		2m		1	{kubelet 192.168.122.253}	spec.containers{busybox}	Normal		Pulled		Successfully pulled image "busybox"
  2m		2m		1	{kubelet 192.168.122.253}	spec.containers{busybox}	Normal		Created		Created container with docker id 2c385747bee2
  2m		2m		1	{kubelet 192.168.122.253}	spec.containers{busybox}	Normal		Started		Started container with docker id 2c385747bee2


[root@localhost origin]# oc get pod -n test
NAME      READY     STATUS        RESTARTS   AGE
busybox   1/1       Terminating   0          2m
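
Editorial note: the "expires" time in the oc describe output above can be related back to when the delete was requested. The following is a minimal sketch, assuming that the expiry shown is the delete request time plus the grace period; the function name delete_requested_at is hypothetical and not part of any oc tooling.

```python
from datetime import datetime, timedelta

# Timestamp layout used by the `oc describe` output quoted in this comment.
FMT = "%a, %d %b %Y %H:%M:%S %z"

def delete_requested_at(expires: str, grace_seconds: int) -> datetime:
    """Estimate when the delete was issued (assumption: expiry = request time + grace period)."""
    return datetime.strptime(expires, FMT) - timedelta(seconds=grace_seconds)

# Values copied from the busybox pod output above.
start = datetime.strptime("Thu, 17 Mar 2016 17:30:11 -0400", FMT)
requested = delete_requested_at("Thu, 17 Mar 2016 17:34:22 -0400", 200)

print(requested.isoformat())                 # 2016-03-17T17:31:02-04:00
print((requested - start).total_seconds())   # 51.0
```

Under that assumption the delete was issued about 51 seconds after the pod started, well after the container reached Running at 17:30:14.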
Comment 6 Avesh Agarwal 2016-03-17 17:45:25 EDT
I have attached a PDF of the web console output showing the Running state.
Comment 7 Avesh Agarwal 2016-03-17 17:45 EDT
Created attachment 1137530
webconsole output
Comment 8 Avesh Agarwal 2016-03-17 17:47:54 EDT
If you compare the timestamps in the oc describe output and the web console, you can see that they show different states at the same time (around 17:30:14).
Comment 9 Avesh Agarwal 2016-03-17 17:56:10 EDT
So it seems to me that in both tests the web console keeps reporting the state from before oc delete was run and does not update to Terminating afterwards.
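
Editorial note: the behaviour described here is consistent with a display rule that ignores pending deletion. The following is a minimal sketch of a rule that would match the CLI output quoted in this bug, assuming the pod is available as a parsed API object (a plain dict here). The function name display_status is hypothetical; this is not the web console's code and not necessarily what the eventual fix does. The field names metadata.deletionTimestamp, status.containerStatuses and status.phase are from the Kubernetes v1 Pod API.

```python
def display_status(pod: dict) -> str:
    """Pick a status string for a pod, giving pending deletion the highest priority."""
    metadata = pod.get("metadata", {})
    status = pod.get("status", {})

    # A pod that has been asked to delete carries a deletionTimestamp,
    # even while its containers are still running or still being created.
    if metadata.get("deletionTimestamp"):
        return "Terminating"

    # Otherwise surface a waiting reason such as ContainerCreating, if any.
    for container in status.get("containerStatuses", []):
        waiting = container.get("state", {}).get("waiting")
        if waiting and waiting.get("reason"):
            return waiting["reason"]

    # Fall back to the coarse pod phase (Pending, Running, ...).
    return status.get("phase", "Unknown")


# Illustrative pod resembling the second test: still Running, but deletion requested.
deleting = {
    "metadata": {"name": "busybox", "deletionTimestamp": "2016-03-17T21:34:22Z"},
    "status": {"phase": "Running"},
}
print(display_status(deleting))  # Terminating
```

A rule that looks only at status.phase or at the container state would report "Running" or "ContainerCreating" for the same object, which is the mismatch seen in both tests.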
Comment 10 Andy Goldstein 2016-03-18 09:01:58 EDT
Jessica, sending this your way since it appears to be UI related
Comment 11 Samuel Padgett 2016-03-18 12:21:13 EDT
https://github.com/openshift/origin/pull/8127
Comment 12 Yadan Pei 2016-03-21 02:28:38 EDT
Checked on the latest 3.2 puddle, AtomicOpenShift-errata/3.2/latest/RH7-RHAOS-3.2/x86_64/os/Packages/

The fix is not merged yet; I will check again when a new puddle is ready.
Comment 13 Yadan Pei 2016-03-23 02:42:39 EDT
Checked on the 2016-03-21.4 puddle:

1) Create "busybox" pod
$ oc create -f test-pod.yaml

2) Check pod status through CLI
$ oc get pods
NAME             READY     STATUS              RESTARTS   AGE
busybox          0/1       ContainerCreating   0          14s

On the web console, it is also ContainerCreating.

3) Delete pod 
$ oc delete pod busybox --grace-period=200

4) Check pod status after deleting
$ oc get pods
NAME             READY     STATUS        RESTARTS   AGE
busybox          1/1       Terminating   0          1m

5) On the web console, it is also shown as "Terminating"

Move to VERIFIED
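
Editorial note: the CLI half of the check in steps 2 and 4 can be scripted by reading the STATUS column from captured oc get pods output. The following is a minimal sketch, assuming the table text has already been captured as a string and that no column value contains whitespace; the helper name pod_statuses is hypothetical.

```python
def pod_statuses(table: str) -> dict:
    """Map pod name to STATUS from the text of an `oc get pods` table."""
    rows = [line.split() for line in table.splitlines() if line.strip()]
    header, body = rows[0], rows[1:]
    name_col, status_col = header.index("NAME"), header.index("STATUS")
    return {row[name_col]: row[status_col] for row in body}


# Output copied from step 4 above.
after_delete = """\
NAME             READY     STATUS        RESTARTS   AGE
busybox          1/1       Terminating   0          1m
"""
print(pod_statuses(after_delete))  # {'busybox': 'Terminating'}
```

The value returned for a pod can then be compared by hand with what the web console shows for the same pod.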
Comment 15 errata-xmlrpc 2016-05-12 12:33:19 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2016:1064
