Bug 2038507 - Namespaces stuck terminating: Failed to delete all resource types, 1 remaining: unexpected items still remain in namespace (With ovn-kubernetes)
Summary: Namespaces stuck terminating: Failed to delete all resource types, 1 remaini...
Keywords:
Status: CLOSED DUPLICATE of bug 2038780
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Sai Ramesh Vanka
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-01-08 02:13 UTC by Alex Krzos
Modified: 2022-02-10 08:34 UTC
CC List: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-02-10 08:34:01 UTC
Target Upstream Version:
Embargoed:


Attachments

Description Alex Krzos 2022-01-08 02:13:04 UTC
Description of problem:
While running multiple benchmarks to check the 250 pods/node max-pods limit on a bare-metal cluster with OVN-Kubernetes, several namespaces get stuck in the Terminating state.

In this testing we are attempting to validate that a bare-metal cluster with OVN-Kubernetes can create, run, and delete pods concurrently up to the max-pods capacity of several worker nodes. The cluster has 13 nodes (3 control-plane nodes, 10 worker nodes). Currently 5 of the 10 worker nodes are labeled for the workload (jetlag=true) and carry an additional node-role label of nodedensity to make it easier to parse the pod and resource capacity of those nodes. Prior to running the benchmark, the 5 workload nodes are drained and then uncordoned so that they only host the pods required to keep the node running and connected to the cluster (14 pods/node).

Thus the capacity we test is:
5 nodes * 250 max-pods = 1250 total capacity
5 nodes * 14 openshift pods = 70 steady-state openshift pods
5 nodes * 234 workload pods = 1170 workload pods

The sum of steady-state and workload pods (70 + 1170 = 1240 pods) leaves headroom for 10 extra pods in case a job pod happens to schedule on the workload nodes.

** Thus we are actually testing to just below the 250 max-pods-per-node limit **
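
For reference, the workload-node preparation described above can be approximated with standard oc commands. This is a rough sketch only; the exact labels and drain flags used by the benchmark tooling may differ:

# oc label node jetlag-bm13 jetlag=true node-role.kubernetes.io/nodedensity=""
# oc adm drain jetlag-bm13 --ignore-daemonsets --delete-emptydir-data
# oc adm uncordon jetlag-bm13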

Version-Release number of selected component (if applicable):
4.10.0-0.nightly-2021-12-23-153012

How reproducible:
"Always" in the sense it typically requires a few benchmark runs before a node becomes stuck with many pods stuck terminating and subsequent namespaces stuck terminating.

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Node status while multiple namespaces are stuck terminating
# oc get no
NAME          STATUS   ROLES                AGE    VERSION
jetlag-bm10   Ready    master               2d4h   v1.22.1+6859754
jetlag-bm11   Ready    master               2d4h   v1.22.1+6859754
jetlag-bm12   Ready    master               2d4h   v1.22.1+6859754
jetlag-bm13   Ready    nodedensity,worker   2d4h   v1.22.1+6859754
jetlag-bm14   Ready    nodedensity,worker   2d4h   v1.22.1+6859754
jetlag-bm15   Ready    nodedensity,worker   2d4h   v1.22.1+6859754
jetlag-bm16   Ready    nodedensity,worker   2d4h   v1.22.1+6859754
jetlag-bm17   Ready    nodedensity,worker   2d4h   v1.22.1+6859754
jetlag-bm18   Ready    worker               2d4h   v1.22.1+6859754
jetlag-bm19   Ready    worker               2d4h   v1.22.1+6859754
jetlag-bm20   Ready    worker               2d4h   v1.22.1+6859754
jetlag-bm21   Ready    worker               2d4h   v1.22.1+6859754
jetlag-bm22   Ready    worker               2d4h   v1.22.1+6859754

Namespaces stuck terminating:

# oc get ns | head
NAME                                               STATUS        AGE
assisted-installer                                 Active        2d4h
boatload-1037                                      Terminating   160m
boatload-1039                                      Terminating   160m
boatload-1051                                      Terminating   159m
boatload-1055                                      Terminating   159m
boatload-1060                                      Terminating   159m
boatload-1063                                      Terminating   159m
boatload-1066                                      Terminating   159m
boatload-1068                                      Terminating   159m

# oc get ns | grep "Terminating"  -c
115

# oc get po -A | grep boatload |grep Terminating -c
115

# oc get po -A | grep boatload | head
boatload-1037                                      boatload-1037-1-boatload-774d9fb978-4sf6g                   0/1     Terminating   0              161m
boatload-1039                                      boatload-1039-1-boatload-84d9bf6964-8xr22                   0/1     Terminating   0              160m
boatload-1051                                      boatload-1051-1-boatload-67dd588b74-vqr9n                   0/1     Terminating   0              160m
boatload-1055                                      boatload-1055-1-boatload-9f5d6b6c8-7qbqx                    0/1     Terminating   0              160m
boatload-1060                                      boatload-1060-1-boatload-9bdd489c8-tqqbj                    0/1     Terminating   0              160m
boatload-1063                                      boatload-1063-1-boatload-6d5d96cc89-lbnm9                   0/1     Terminating   0              160m
boatload-1066                                      boatload-1066-1-boatload-6dc666fbdd-mbztf                   0/1     Terminating   0              160m
boatload-1068                                      boatload-1068-1-boatload-569fbb78f8-zkxtr                   0/1     Terminating   0              160m
boatload-113                                       boatload-113-1-boatload-7cc9f7887b-kv9r4                    0/1     Terminating   0              175m
boatload-1130                                      boatload-1130-1-boatload-6476bd98dd-999w9                   0/1     Terminating   0              159m

Distribution of pods across nodes:
#  oc get po -A -o wide | grep boatload | awk '{print $8}' | sort | uniq -c
    115 jetlag-bm13

* Just one node is hosting the pods that are stuck

Looking at one stuck pod:

# oc get po -n boatload-113
NAME                                       READY   STATUS        RESTARTS   AGE
boatload-113-1-boatload-7cc9f7887b-kv9r4   0/1     Terminating   0          176m


# oc describe po -n boatload-113
Name:                      boatload-113-1-boatload-7cc9f7887b-kv9r4
Namespace:                 boatload-113
Priority:                  0
Node:                      jetlag-bm13/10.5.190.42
Start Time:                Fri, 07 Jan 2022 17:13:21 -0600
Labels:                    app=boatload-113-1
                           pod-template-hash=7cc9f7887b
Annotations:               k8s.ovn.org/pod-networks:
                             {"default":{"ip_addresses":["10.130.19.53/21"],"mac_address":"0a:58:0a:82:13:35","gateway_ips":["10.130.16.1"],"ip_address":"10.130.19.53/...
                           k8s.v1.cni.cncf.io/network-status:
                             [{
                                 "name": "ovn-kubernetes",
                                 "interface": "eth0",
                                 "ips": [
                                     "10.130.19.53"
                                 ],
                                 "mac": "0a:58:0a:82:13:35",
                                 "default": true,
                                 "dns": {}
                             }]
                           k8s.v1.cni.cncf.io/networks-status:
                             [{
                                 "name": "ovn-kubernetes",
                                 "interface": "eth0",
                                 "ips": [
                                     "10.130.19.53"
                                 ],
                                 "mac": "0a:58:0a:82:13:35",
                                 "default": true,
                                 "dns": {}
                             }]
                           openshift.io/scc: restricted
Status:                    Terminating (lasts 128m)
Termination Grace Period:  30s
IP:                        10.130.19.53
IPs:
  IP:           10.130.19.53
Controlled By:  ReplicaSet/boatload-113-1-boatload-7cc9f7887b
Containers:
  boatload-1:
    Container ID:   cri-o://6d677870614812541683d7ef2d9766c025eb7c575a9fc8e09925a41f5103e36b
    Image:          quay.io/redhat-performance/test-gohttp-probe:v0.0.2
    Image ID:       99b026db9534b7ede003ab26a626e1ce90e0a5d41ba2191f615601695afccfae
    Port:           8000/TCP
    Host Port:      0/TCP
    State:          Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Fri, 07 Jan 2022 17:13:24 -0600
      Finished:     Fri, 07 Jan 2022 17:37:20 -0600
    Ready:          False
    Restart Count:  0
    Environment:
      PORT:                         8000
      LISTEN_DELAY_SECONDS:         0
      LIVENESS_DELAY_SECONDS:       0
      READINESS_DELAY_SECONDS:      0
      RESPONSE_DELAY_MILLISECONDS:  0
      LIVENESS_SUCCESS_MAX:         0
      READINESS_SUCCESS_MAX:        0
    Mounts:
      /etc/cm-1 from cm-1 (rw)
      /etc/cm-2 from cm-2 (rw)
      /etc/cm-3 from cm-3 (rw)
      /etc/cm-4 from cm-4 (rw)
      /etc/cm-5 from cm-5 (rw)
      /etc/cm-6 from cm-6 (rw)
      /etc/cm-7 from cm-7 (rw)
      /etc/cm-8 from cm-8 (rw)
      /etc/secret-1 from secret-1 (rw)
      /etc/secret-2 from secret-2 (rw)
      /etc/secret-3 from secret-3 (rw)
      /etc/secret-4 from secret-4 (rw)
      /etc/secret-5 from secret-5 (rw)
      /etc/secret-6 from secret-6 (rw)
      /etc/secret-7 from secret-7 (rw)
      /etc/secret-8 from secret-8 (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-b6gbq (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  cm-1:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      boatload-113-1-boatload
    Optional:  false
  cm-2:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      boatload-113-2-boatload
    Optional:  false
  cm-3:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      boatload-113-3-boatload
    Optional:  false
  cm-4:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      boatload-113-4-boatload
    Optional:  false
  cm-5:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      boatload-113-5-boatload
    Optional:  false
  cm-6:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      boatload-113-6-boatload
    Optional:  false
  cm-7:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      boatload-113-7-boatload
    Optional:  false
  cm-8:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      boatload-113-8-boatload
    Optional:  false
  secret-1:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  boatload-113-1-boatload
    Optional:    false
  secret-2:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  boatload-113-2-boatload
    Optional:    false
  secret-3:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  boatload-113-3-boatload
    Optional:    false
  secret-4:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  boatload-113-4-boatload
    Optional:    false
  secret-5:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  boatload-113-5-boatload
    Optional:    false
  secret-6:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  boatload-113-6-boatload
    Optional:    false
  secret-7:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  boatload-113-7-boatload
    Optional:    false
  secret-8:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  boatload-113-8-boatload
    Optional:    false
  kube-api-access-b6gbq:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   BestEffort
Node-Selectors:              jetlag=true
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:                      <none>

Namespace of pod stuck:

# oc get ns boatload-113 -o yaml
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    openshift.io/sa.scc.mcs: s0:c532,c54
    openshift.io/sa.scc.supplemental-groups: 1282600000/10000
    openshift.io/sa.scc.uid-range: 1282600000/10000
  creationTimestamp: "2022-01-07T23:13:20Z"
  deletionTimestamp: "2022-01-07T23:35:25Z"
  labels:
    kube-burner-job: boatload
    kube-burner-uuid: dc493669-588d-4e4f-85f0-544a76c21d4d
    kubernetes.io/metadata.name: boatload-113
    name: boatload-113
  name: boatload-113
  resourceVersion: "8187751"
  uid: cc374d37-1e93-4c6a-a39d-6e8fc0ad841d
spec:
  finalizers:
  - kubernetes
status:
  conditions:
  - lastTransitionTime: "2022-01-07T23:35:31Z"
    message: All resources successfully discovered
    reason: ResourcesDiscovered
    status: "False"
    type: NamespaceDeletionDiscoveryFailure
  - lastTransitionTime: "2022-01-07T23:35:31Z"
    message: All legacy kube types successfully parsed
    reason: ParsedGroupVersions
    status: "False"
    type: NamespaceDeletionGroupVersionParsingFailure
  - lastTransitionTime: "2022-01-07T23:36:02Z"
    message: 'Failed to delete all resource types, 1 remaining: unexpected items still
      remain in namespace: boatload-113 for gvr: /v1, Resource=pods'
    reason: ContentDeletionFailed
    status: "True"
    type: NamespaceDeletionContentFailure
  - lastTransitionTime: "2022-01-07T23:35:31Z"
    message: 'Some resources are remaining: pods. has 1 resource instances'
    reason: SomeResourcesRemain
    status: "True"
    type: NamespaceContentRemaining
  - lastTransitionTime: "2022-01-07T23:35:31Z"
    message: All content-preserving finalizers finished
    reason: ContentHasNoFinalizers
    status: "False"
    type: NamespaceFinalizersRemaining
  phase: Terminating
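
For completeness, the objects still blocking deletion of a stuck namespace can be enumerated directly; a hedged sketch using the namespace above:

# oc get ns boatload-113 -o jsonpath='{.status.conditions[?(@.type=="NamespaceContentRemaining")].message}'
# oc api-resources --verbs=list --namespaced -o name | xargs -n 1 oc get -n boatload-113 --ignore-not-found --no-headers
# oc get po -n boatload-113 -o jsonpath='{.items[*].metadata.finalizers}'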

Comment 7 Peter Hunt 2022-01-12 20:11:11 UTC
FWIW, I have taken a look at a node that Alex put together that had this issue happening. I have concluded that it is not a cri-o problem. The container that ends up stuck in Terminating is never referenced in cri-o after it is removed, nor is there a runtime process or any container storage artifacts left. However, the kubelet seems to think the pod still has some resources that need cleaning up. Tossing over to Harshal for further investigation.

Comment 10 Sai Ramesh Vanka 2022-01-18 16:17:19 UTC
Hello Alex,

From the logs, I see the issue is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=2038780

A sample log shows the pod in an indefinite loop while mounting its required volumes: setting up the volumes requires ConfigMaps that have already been deleted.
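
A quick way to look for this loop on the affected node is to grep the kubelet journal for volume setup failures; a rough sketch (the exact message wording may vary by version):

# oc adm node-logs jetlag-bm13 -u kubelet | grep -i 'MountVolume.SetUp failed'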

The following upstream links are helpful.

Issue Link: https://github.com/kubernetes/kubernetes/issues/96635

A PR with a possible resolution is also open.
PR Link: https://github.com/kubernetes/kubernetes/pull/96790

Thanks,
Ramesh

Comment 11 Alex Krzos 2022-01-19 14:57:04 UTC
(In reply to Sai Ramesh Vanka from comment #10)

Thanks for sharing. It seems this has been a long-standing issue that is difficult to reproduce.

FWIW, I was able to reproduce this on 4.10.0-0.nightly-2022-01-15-092722, and once it reproduced, simply restarting the kubelet on the affected node resolved the stuck pods and allowed the namespaces to terminate. (This also happens when cri-o is restarted, but that is probably because restarting cri-o causes the kubelet to restart as well.)
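
For reference, the kubelet restart can be done from the cluster with oc debug; a minimal sketch, assuming cluster-admin access and the affected node from this report:

# oc debug node/jetlag-bm13 -- chroot /host systemctl restart kubelet.service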

Comment 13 Sai Ramesh Vanka 2022-02-10 08:34:01 UTC

*** This bug has been marked as a duplicate of bug 2038780 ***

