Bug 1343157 - Project delete leads to unexpected items in namespace
Summary: Project delete leads to unexpected items in namespace
Keywords:
Status: CLOSED DUPLICATE of bug 1364243
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 3.2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Derek Carr
QA Contact: DeShuai Ma
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-06-06 16:02 UTC by Vikas Laad
Modified: 2016-08-15 17:23 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-15 17:23:23 UTC
Target Upstream Version:
Embargoed:


Attachments
journalctl_atomic-openshift-master (15.05 MB, application/x-gzip)
2016-06-06 16:02 UTC, Vikas Laad
no flags Details
journalctl_atomic-openshift-node (48.89 KB, application/x-gzip)
2016-06-06 16:05 UTC, Vikas Laad
no flags Details
grep for project eap64-mysql-s2i-user-402-171-469-216 in /var/log/messages (6.54 MB, application/x-gzip)
2016-06-06 16:07 UTC, Vikas Laad
no flags Details
different pods stuck on the cluster (10.78 KB, text/plain)
2016-06-06 16:11 UTC, Vikas Laad
no flags Details

Description Vikas Laad 2016-06-06 16:02:54 UTC
Created attachment 1165272 [details]
journalctl_atomic-openshift-master

Description of problem:
Pods are stuck in the Terminating state, and the master logs are filled with the following error:

Jun  6 00:42:24 ip-172-31-39-29 atomic-openshift-master: E0606 00:42:24.950699   11186 namespace_controller.go:139] unexpected items still remain in namespace: eap64-mysql-s2i-user-402-171-469-216 for gvr: { v1 pods}
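
To gauge how many namespaces and resource types are affected, the error lines can be tallied from the master journal. This is a minimal Python sketch, assuming every message has the same shape as the single line quoted above. The function name `count_stuck` and the regular expression are illustrative and not part of OpenShift.

```python
import re
from collections import Counter

# Pattern assumed from the one log line quoted in this report;
# other releases may word the message differently.
STUCK_RE = re.compile(
    r"unexpected items still remain in namespace: (?P<ns>\S+) "
    r"for gvr: \{(?P<gvr>[^}]*)\}"
)


def count_stuck(lines):
    """Count error occurrences per (namespace, resource) pair."""
    counts = Counter()
    for line in lines:
        match = STUCK_RE.search(line)
        if match is None:
            continue
        # The gvr prints as "{group version resource}"; the group is
        # empty for core resources, so the resource is the last token.
        parts = match.group("gvr").split()
        resource = parts[-1] if parts else ""
        counts[(match.group("ns"), resource)] += 1
    return counts
```

One way to use it would be to pipe `journalctl -u atomic-openshift-master` into a small script that calls `count_stuck(sys.stdin)` and prints the result; the unit name here is taken from the attachment title and may differ on other installs.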

Version-Release number of selected component (if applicable):
openshift v3.2.0.45
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5

Docker version 1.10.3-25.el7 

How reproducible:
Seen on the first attempt to run the reliability tests against Docker 1.10.

Steps to Reproduce:
1. Create a cluster with 1 infra, 1 master and 2 nodes
2. Start the reliability tests, which continuously create, access, and delete projects
3. Monitor the cluster for the duration of tests

Actual results:
Pods remain stuck in the Terminating state after "oc delete project" is issued.

Expected results:
The project should be deleted without problems.

Additional info:
This bug is similar to https://bugzilla.redhat.com/show_bug.cgi?id=1322538

Comment 1 Vikas Laad 2016-06-06 16:05:18 UTC
Created attachment 1165274 [details]
journalctl_atomic-openshift-node

Comment 2 Vikas Laad 2016-06-06 16:07:39 UTC
Created attachment 1165276 [details]
grep for project eap64-mysql-s2i-user-402-171-469-216 in /var/log/messages

Comment 3 Vikas Laad 2016-06-06 16:11:07 UTC
Created attachment 1165278 [details]
different pods stuck on the cluster

Comment 4 Derek Carr 2016-08-15 17:23:23 UTC
The logs showed a single pod stuck in Terminating status: eap-app-5-s05at

The master logs showed the namespace controller repeatedly observing that the pod was not yet deleted (it was stuck pending deletion by the kubelet, or by the node controller in the case that the node was no longer healthy).

There was no specific log action from the kubelet for pod: eap-app-5-s05at.  

Given the current logs, it's not possible to know whether this was the only pod stuck terminating on this node.  If so, it's possible the symptom for this bug is the same as https://bugzilla.redhat.com/show_bug.cgi?id=1364243, where pods in Terminating status would not get deleted by the node controller if they were the only pod on the node.
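
The "only pod on the node" condition could be checked from a pod listing if one is captured while the symptom is present. A rough Python sketch follows, assuming the listing is the parsed JSON from `oc get pods --all-namespaces -o json` and that a pod shown as Terminating has `metadata.deletionTimestamp` set. The function name `sole_terminating_pods` is hypothetical.

```python
from collections import defaultdict


def sole_terminating_pods(pod_list):
    """Return (node, namespace, name) for each terminating pod that is
    the only pod assigned to its node."""
    by_node = defaultdict(list)
    for pod in pod_list.get("items", []):
        node = pod.get("spec", {}).get("nodeName")
        if node:  # pods that were never scheduled have no nodeName
            by_node[node].append(pod)

    hits = []
    for node, pods in by_node.items():
        meta = pods[0].get("metadata", {})
        # A set deletionTimestamp is what marks a pod as pending deletion.
        if len(pods) == 1 and meta.get("deletionTimestamp"):
            hits.append((node, meta.get("namespace"), meta.get("name")))
    return sorted(hits)
```

An empty result would suggest the stuck pod shared its node with other pods, which would argue against the duplicate; a non-empty result is consistent with it but does not prove it.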

I am closing this issue as duplicate of 1364243.

If the symptom repeats itself, please include the YAML output for `oc get pods --all-namespaces`, and the YAML output for `oc get nodes` so we can check heartbeats and pod->node assignment.
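
For the heartbeat side of that check, the node listing can be scanned for nodes whose Ready condition is not True or has gone stale. This is only a sketch, assuming parsed JSON from `oc get nodes -o json` and timestamps in the `YYYY-MM-DDTHH:MM:SSZ` form. The function name `stale_nodes` and the one-minute `max_age` default are arbitrary choices for illustration, not values taken from the node controller.

```python
from datetime import datetime, timedelta, timezone


def stale_nodes(node_list, now, max_age=timedelta(minutes=1)):
    """Return names of nodes that are not Ready or whose last heartbeat
    is older than max_age, in listing order."""
    stale = []
    for node in node_list.get("items", []):
        name = node["metadata"]["name"]
        conditions = node.get("status", {}).get("conditions", [])
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        if ready is None or ready.get("status") != "True":
            stale.append(name)
            continue
        # Heartbeats are reported in UTC.
        beat = datetime.strptime(
            ready["lastHeartbeatTime"], "%Y-%m-%dT%H:%M:%SZ"
        ).replace(tzinfo=timezone.utc)
        if now - beat > max_age:
            stale.append(name)
    return stale
```

Passing `now=datetime.now(timezone.utc)` checks against the current time; passing the capture time instead makes the check reproducible when reviewing saved output.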

*** This bug has been marked as a duplicate of bug 1364243 ***

