Created attachment 1165272 [details]
journalctl_atomic-openshift-master

Description of problem:
Pods are stuck in Terminating state, and the logs are filled with the following error:

Jun 6 00:42:24 ip-172-31-39-29 atomic-openshift-master: E0606 00:42:24.950699 11186 namespace_controller.go:139] unexpected items still remain in namespace: eap64-mysql-s2i-user-402-171-469-216 for gvr: { v1 pods}

Version-Release number of selected component (if applicable):
openshift v3.2.0.45
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5
Docker version 1.10.3-25.el7

How reproducible:
First attempt to run reliability tests against docker 1.10

Steps to Reproduce:
1. Create a cluster with 1 infra, 1 master and 2 nodes
2. Start reliability tests, which continuously create, access and delete projects
3. Monitor the cluster for the duration of the tests

Actual results:
Pods are stuck in Terminating state after "oc delete project" was issued.

Expected results:
The project should be deleted without problems.

Additional info:
This bug is similar to https://bugzilla.redhat.com/show_bug.cgi?id=1322538
Created attachment 1165274 [details] journalctl_atomic-openshift-node
Created attachment 1165276 [details] grep for project eap64-mysql-s2i-user-402-171-469-216 in /var/log/messages
Created attachment 1165278 [details] different pods stuck on the cluster
The logs showed a single pod stuck in Terminating status: eap-app-5-s05at

The master logs showed the namespace controller repeatedly observing that the pod was not yet deleted (it was stuck pending deletion by the kubelet, or by the node controller in the case that the node was no longer healthy). There was no specific log action from the kubelet for pod eap-app-5-s05at.

It's not possible to tell from the current logs whether this was the only pod stuck terminating on this node. If so, it's possible the symptom for this bug is the same as https://bugzilla.redhat.com/show_bug.cgi?id=1364243, where pods in terminating status would not get deleted by the node controller if they were the only pod on the node.

I am closing this issue as a duplicate of 1364243. If the symptom repeats itself, please include the YAML output for `oc get pods --all-namespaces` and the YAML output for `oc get nodes` so we can check heartbeats and pod->node assignment.

*** This bug has been marked as a duplicate of bug 1364243 ***
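
For anyone who hits this again, the pod->node check requested above could be roughly sketched as follows. This is only an illustration, not a tool that ships with OpenShift: the function name `terminating_pods_by_node` is made up here, and it assumes the pod list has been saved as JSON (for example with `oc get pods --all-namespaces -o json`) and parsed into a dict. It treats a pod as terminating when `metadata.deletionTimestamp` is set, and flags nodes where that pod is the only pod assigned.

```python
from collections import defaultdict


def terminating_pods_by_node(pod_list):
    """Report pods pending deletion, grouped by the node they are assigned to.

    pod_list is assumed to be a parsed Kubernetes PodList (a dict with an
    "items" list). The function name and report layout are hypothetical.
    """
    # Group every pod by spec.nodeName so we can tell how many pods share a node.
    pods_on_node = defaultdict(list)
    for pod in pod_list.get("items", []):
        node = pod.get("spec", {}).get("nodeName", "<unscheduled>")
        pods_on_node[node].append(pod)

    report = {}
    for node, pods in pods_on_node.items():
        # A pod with deletionTimestamp set has been asked to terminate.
        stuck = [
            "{}/{}".format(p["metadata"]["namespace"], p["metadata"]["name"])
            for p in pods
            if p["metadata"].get("deletionTimestamp")
        ]
        if stuck:
            report[node] = {
                "terminating": sorted(stuck),
                # True when the terminating pod is the only pod on the node,
                # which is the condition described in bug 1364243.
                "only_pod_on_node": len(pods) == 1,
            }
    return report
```

Feeding it the result of `json.load()` on the saved pod list returns a dict keyed by node name. The node heartbeats would still need to be checked separately from the `oc get nodes` output.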