Bug 1834927 - revision-pruner killed during OCP 4.3 installation
Summary: revision-pruner killed during OCP 4.3 installation
Keywords:
Status: CLOSED DUPLICATE of bug 1800609
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Ryan Phillips
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-05-12 16:54 UTC by Andreas Karis
Modified: 2023-10-06 19:59 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-13 16:28:42 UTC
Target Upstream Version:
Embargoed:



Comment 1 Andreas Karis 2020-05-12 16:59:02 UTC
One of the revision-pruner pods shows as OOMKilled:

[root@jumpserver ~]# oc get pods -A -o wide | grep -iv runn | grep -iv compl
NAMESPACE                                               NAME                                                              READY   STATUS      RESTARTS   AGE     IP               NODE                                     NOMINATED NODE   READINESS GATES
openshift-kube-scheduler                                revision-pruner-7-master01.example.com          0/1     OOMKilled   0          13d     192.168.3.96      master01.example.com   <none>           <none>
[root@jumpserver ~]# oc describe pod -n openshift-kube-scheduler revision-pruner-7-master01.example.com
Name:                 revision-pruner-7-master01.example.com
Namespace:            openshift-kube-scheduler
Priority:             2000001000
Priority Class Name:  system-node-critical
Node:                 master01.example.com/10.0.0.251
Start Time:           Wed, 29 Apr 2020 10:29:15 +0200
Labels:               app=pruner
Annotations:          k8s.v1.cni.cncf.io/networks-status:
Status:               Succeeded
IP:                   192.168.3.96
IPs:
  IP:  192.168.3.96
Containers:
  pruner:
    Container ID:  cri-o://<container id>
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<checksum>
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:<checksum>
    Port:          <none>
    Host Port:     <none>
    Command:
      cluster-kube-scheduler-operator
      prune
    Args:
      -v=4
      --max-eligible-revision=7
      --protected-revisions=1,2,3,4,5,6,7
      --resource-dir=/etc/kubernetes/static-pod-resources
      --static-pod-name=kube-scheduler-pod
    State:          Terminated
      Reason:       OOMKilled
      Exit Code:    0
      Started:      Wed, 29 Apr 2020 10:29:18 +0200
      Finished:     Wed, 29 Apr 2020 10:29:18 +0200
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /etc/kubernetes/ from kubelet-dir (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from installer-sa-token-aaaaa (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kubelet-dir:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/kubernetes/
    HostPathType:
  installer-sa-token-aaaaa:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  installer-sa-token-aaaaa
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:
Events:          <none>
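Note the combination of Reason: OOMKilled with Exit Code: 0 and a BestEffort QoS class. As a rough cross-check (hypothetical session; node name taken from the output above, exact kernel log wording varies by kernel version), the OOM event should be visible in the node's kernel log:

# hypothetical cross-check, run from the jump host:
oc debug node/master01.example.com
# then, inside the debug pod:
chroot /host
journalctl -k --no-pager | grep -iE 'out of memory|oom' | tail    # kernel OOM-killer messages, if any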

---------------

[root@jumpserver ~]# oc logs -n openshift-kube-scheduler revision-pruner-7-master01.example.com
unable to retrieve container logs for cri-o://<container id>
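The error suggests CRI-O has likely already garbage-collected the exited container, so oc logs has nothing left to fetch. If the container still exists on the node, the logs could in principle be read directly with crictl (hedged sketch; the container ID is the redacted one from the describe output above):

# hypothetical recovery attempt, only works while CRI-O still keeps the exited container:
oc debug node/master01.example.com
# then, inside the debug pod:
chroot /host
crictl ps -a | grep revision-pruner-7    # locate the exited pruner container, if it is still around
crictl logs <container id>               # use the real (redacted above) container ID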



This could be related to any of the following, so please close as a duplicate if that's the case: https://bugzilla.redhat.com/show_bug.cgi?id=1800609 ; https://bugzilla.redhat.com/show_bug.cgi?id=1799079 ; https://bugzilla.redhat.com/show_bug.cgi?id=1792501 ; https://github.com/openshift/origin/pull/24596
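For reference, the BestEffort QoS class in the describe output means that no container in the pod declares any CPU or memory requests or limits, which makes these one-shot pruner pods early targets for the kernel OOM killer under memory pressure. A quick way to confirm the missing requests on an affected cluster (sketch, standard oc jsonpath; pod name as above):

# print the QoS class and the (empty) resources block of the pruner container:
oc get pod -n openshift-kube-scheduler revision-pruner-7-master01.example.com \
  -o jsonpath='{.status.qosClass}{"\n"}{.spec.containers[0].resources}{"\n"}'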

Comment 2 Maciej Szulik 2020-05-13 10:00:42 UTC
Yes, these look similar to the aforementioned cases, but I'll let the node team make the final call on whether it's a duplicate.

Comment 3 Ryan Phillips 2020-05-13 16:28:42 UTC

*** This bug has been marked as a duplicate of bug 1800609 ***

