Bug 1670833

Summary: cloud-credential-operator pod in CrashLoopBackOff
Product: OpenShift Container Platform Reporter: Jaspreet Kaur <jkaur>
Component: Cloud Compute Assignee: Devan Goodwin <dgoodwin>
Status: CLOSED CURRENTRELEASE QA Contact: Jianwei Hou <jhou>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.1.0 CC: aos-bugs, dgoodwin, jiazha, jokerman, mmccomas, wking
Target Milestone: ---   
Target Release: 4.1.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-04-13 06:10:30 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1664187    

Description Jaspreet Kaur 2019-01-30 12:21:52 UTC
Description of problem: The cloud-credential-operator pod goes into CrashLoopBackOff and is eventually OOMKilled.

 oc describe pod/cloud-credential-operator-54c8757c48-kth6l
Name:               cloud-credential-operator-54c8757c48-kth6l
Namespace:          openshift-cloud-credential-operator
Priority:           2000000000
PriorityClassName:  system-cluster-critical
Node:               ip-10-0-14-1.us-east-2.compute.internal/10.0.14.1
Start Time:         Mon, 28 Jan 2019 06:19:41 +0000
Labels:             control-plane=controller-manager
                    controller-tools.k8s.io=1.0
                    pod-template-hash=1074313704
Annotations:        openshift.io/scc=restricted
Status:             Running
IP:                 10.128.0.19
Controlled By:      ReplicaSet/cloud-credential-operator-54c8757c48
Containers:
  manager:
    Container ID:  cri-o://196c8c085918972947080e4613498d97b6cb723840b25eec5d461f11199c8b9c
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:5202e9057a0bc7ccbcae6bc0e997c3ae71767d02c1ee590d6cb15b0bfba891b3
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:5202e9057a0bc7ccbcae6bc0e997c3ae71767d02c1ee590d6cb15b0bfba891b3
    Port:          9876/TCP
    Host Port:     0/TCP
    Command:
      /root/manager
      --log-level
      debug
    State:          Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Wed, 30 Jan 2019 12:19:21 +0000
      Finished:     Wed, 30 Jan 2019 12:19:26 +0000
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Wed, 30 Jan 2019 12:14:09 +0000
      Finished:     Wed, 30 Jan 2019 12:14:15 +0000
    Ready:          False
    Restart Count:  128
    Limits:
      cpu:     100m
      memory:  100Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-lq982 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  default-token-lq982:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-lq982
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  node-role.kubernetes.io/master=
Tolerations:     
                 node.kubernetes.io/memory-pressure:NoSchedule
Events:
  Type     Reason   Age                 From                                              Message
  ----     ------   ----                ----                                              -------
  Warning  BackOff  5m (x2621 over 1d)  kubelet, ip-10-0-14-1.us-east-2.compute.internal  Back-off restarting failed container
  Normal   Pulling  8s (x129 over 2d)   kubelet, ip-10-0-14-1.us-east-2.compute.internal  pulling image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:5202e9057a0bc7ccbcae6bc0e997c3ae71767d02c1ee590d6cb15b0bfba891b3"
[root@ip-10-0-27-69 ~]# oc get pods
NAME                                         READY     STATUS      RESTARTS   AGE
cloud-credential-operator-54c8757c48-kth6l   0/1       OOMKilled   128        2d
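
For readers triaging similar output: the "Exit Code: 137" shown with "Reason: OOMKilled" follows the common convention that exit codes above 128 mean 128 plus the number of the terminating signal, and 137 - 128 = 9 is SIGKILL on Linux. The snippet below is only an illustrative sketch of that arithmetic; the helper name describe_exit_code is hypothetical and is not part of oc or the operator.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a container exit code using the 128 + N signal convention.

    Hypothetical helper for illustration; not part of any OpenShift tooling.
    """
    if code > 128:
        signum = code - 128
        try:
            # Look up the symbolic name of the signal on this platform.
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not defined on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with status {code}"


if __name__ == "__main__":
    # 137 is the exit code reported in the pod description above.
    print(describe_exit_code(137))
```

Exit code 137 alone only says the process was killed with signal 9; it is the "Reason: OOMKilled" field that ties the kill to the container's 100Mi memory limit.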



Version-Release number of selected component (if applicable):


How reproducible:

oc version
oc v4.0.0-0.147.0
kubernetes v1.11.0+dde478551e
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://jastry-api.aws.cee.redhat.com:6443
kubernetes v1.11.0+8868a98a7b


Steps to Reproduce:
1.
2.
3.

Actual results: The pod fails repeatedly and is eventually OOMKilled.


Expected results: The pod should be in the Running state.


Additional info:

Comment 1 Jian Zhang 2019-01-31 02:25:41 UTC
Thanks for reporting this issue.
However, the "openshift-cloud-credential-operator" is managed by the CVO, not by OLM,
so I am transferring this issue to the "Installer" component. Correct me if I'm wrong.

I actually hit this issue too, as shown below:
[jzhang@dhcp-140-18 ocp-30]$ oc get pods -n openshift-cloud-credential-operator
NAME                                         READY     STATUS             RESTARTS   AGE
cloud-credential-operator-7f9b57cc46-gmzj5   0/1       CrashLoopBackOff   98         24h
[jzhang@dhcp-140-18 ocp-30]$ oc logs cloud-credential-operator-7f9b57cc46-gmzj5 -n  openshift-cloud-credential-operator
time="2019-01-31T02:17:16Z" level=debug msg="debug logging enabled"
time="2019-01-31T02:17:16Z" level=info msg="setting up client for manager"
time="2019-01-31T02:17:16Z" level=info msg="setting up manager"
time="2019-01-31T02:17:17Z" level=info msg="registering components"
time="2019-01-31T02:17:17Z" level=info msg="setting up scheme"
time="2019-01-31T02:17:17Z" level=info msg="setting up controller"
time="2019-01-31T02:17:17Z" level=warning msg="apiVersion: v1beta1\nbaseDomain: qe.devcluster.openshift.com\nmachines:\n- name: master\n  platform: {}\n  replicas: 3\n- name: worker\n  platform: {}\n  replicas: 3\nmetadata:\n  creationTimestamp: null\n  name: jiazha-3\nnetworking:\n  clusterNetworks:\n  - cidr: 10.128.0.0/14\n    hostSubnetLength: 9\n  machineCIDR: 10.0.0.0/16\n  serviceCIDR: 172.30.0.0/16\n  type: OpenshiftSDN\nplatform:\n  aws:\n    region: us-east-2\npullSecret: '{\"auths\":{\"cloud.openshift.com\":{\"auth\":\"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2ppYXpoYTFlc3Fvc3owcXhiaGt6Yzd3Ym94YnFib29ydDpMUjgyVUVEWjgxMVg3Vlg2Qlk2NVJZT0FWQzYxMEhENDdVVVZKR0VYWEZNU05GTTJJOVBDT0FPOTFXRTJVU1c3\",\"email\":\"jiazha\"},\"quay.io\":{\"auth\":\"b3BlbnNoaWZ0LXJlbGVhc2UtZGV2K2ppYXpoYTFlc3Fvc3owcXhiaGt6Yzd3Ym94YnFib29ydDpMUjgyVUVEWjgxMVg3Vlg2Qlk2NVJZT0FWQzYxMEhENDdVVVZKR0VYWEZNU05GTTJJOVBDT0FPOTFXRTJVU1c3\",\"email\":\"jiazha\"},\"registry.svc.ci.openshift.org\":{\"auth\":\"amlhbnpoYW5nYmp6OmRxVFRHMExjbmVLbWFQTThtSE1mUHl6Y2FPaEdEX3VNNDBXVTd6SGd5QUU=\"}}}'\nsshKey: |\n  ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDUq7W38xCZ9WGSWCvustaMGMT04tRohw6AKGzI7P7xql5lhCAReyt72n9qWQRZsE1YiCSQuTfXI1oc8NpSM7+lMLwj12G8z3I1YT31JHr9LLYg/XIcExkzfBI920CaS82VqmKOpI9+ARHSJBdIbKRI0f5Y+u4xbc5UzKCJX8jcKGG7nEiw8zm+cvAlfOgssMK+qJppIbVcb2iZNTsw5i2aX6FDMyC+b17DQHzBGpNbhZYxuoERZVRcnYctgIzuo6fD60gniX0fVvrchlOnubB1sRYbloP2r6UE22w/dpLKOFE5i7CA0ZzNBERZ94cIKumIH9MiJs1a6bMe89VOjjNV\n"
time="2019-01-31T02:17:17Z" level=info msg="initializing AWS actuator"
time="2019-01-31T02:17:17Z" level=info msg="starting the cmd"
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-authentication
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-kube-scheduler-operator
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=plyml
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=default
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=kube-system
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-cluster-network-operator
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-kube-apiserver-operator
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-operators
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=dd0fi
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-cluster-storage-operator
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-cluster-version
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-core-operators
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-logging
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=1egzc
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=26zzd
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-cluster-node-tuning-operator
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-image-registry
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-kube-controller-manager
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-node
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-cluster-api
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-controller-manager-operator
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-dns-operator
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-kube-scheduler
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-cloud-credential-operator
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-kube-controller-manager-operator
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=g3z70
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-config-managed
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-dns
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-marketplace
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-sdn
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-apiserver
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-apiserver-operator
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-cluster-machine-approver
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-cluster-samples-operator
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-monitoring
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=yapei
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=etf31
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=ewit8
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=kube-public
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-controller-manager
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-kube-apiserver
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=qubt4
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-operator-lifecycle-manager
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-authentication-operator
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-config
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-console
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-infra
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-service-cert-signer
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=0xlmv
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=o983m
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=prozyp
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=uhupq
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=cgzzq
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=hjgjr
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-ingress
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-ingress-operator
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-machine-config-operator

oc version:
Server https://jiazha-3-api.qe.devcluster.openshift.com:6443
kubernetes v1.12.4+50c2f2340a

Comment 2 W. Trevor King 2019-02-01 09:01:03 UTC
I'm not sure what the right component is for the credentials operator, but "Cloud Compute" feels like a better match than "Installer".

Comment 3 Devan Goodwin 2019-02-01 11:54:03 UTC
Looks like some randomly generated namespaces? How many were there on that cluster? I suspect my memory limits were much too stringent; I will bump them up immediately.
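
One rough way to answer the namespace-count question from a saved copy of the debug log in Comment 1 is to count the distinct "checking for credentials requests targeting namespace" lines. The sketch below assumes the log format shown in that comment. The function name summarize_namespaces and the prefix list used to separate platform namespaces from the rest are assumptions made for illustration, not anything taken from the operator.

```python
import re

# Matches the debug lines shown in Comment 1 and captures the namespace name.
NAMESPACE_LINE = re.compile(
    r'msg="checking for credentials requests targeting namespace" namespace=(\S+)'
)

# Assumption: names with these prefixes are treated as platform namespaces.
PLATFORM_PREFIXES = ("openshift", "kube-", "default")


def summarize_namespaces(log_text):
    """Return (number of distinct namespaces, sorted non-platform namespaces)."""
    names = sorted(set(NAMESPACE_LINE.findall(log_text)))
    other = [n for n in names if not n.startswith(PLATFORM_PREFIXES)]
    return len(names), other


# A few lines copied from the log in Comment 1, used as sample input.
SAMPLE = """\
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=openshift-authentication
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=plyml
time="2019-01-31T02:17:17Z" level=debug msg="checking for credentials requests targeting namespace" namespace=default
time="2019-01-31T02:17:17Z" level=info msg="starting the cmd"
"""

if __name__ == "__main__":
    total, other = summarize_namespaces(SAMPLE)
    print(total, other)
```

The log from a crashing pod may be cut short by the OOM kill, so a count taken this way is a lower bound on the number of namespaces in the cluster.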

Comment 4 Devan Goodwin 2019-02-01 12:00:02 UTC
The warning in the logs is leftover debug logging. I will remove this as well.

Comment 6 Jianwei Hou 2019-02-02 07:57:00 UTC
Verified with 4.0.0-0.nightly-2019-01-30-145955. The pod has not gone into CrashLoopBackOff after running for a while.

NAME                                         READY     STATUS    RESTARTS   AGE
cloud-credential-operator-64bc855b98-lsjsz   1/1       Running   0          4h15m

Comment 8 W. Trevor King 2019-04-13 06:10:30 UTC
> Verified with 4.0.0-0.nightly-2019-01-30-145955.

This was a long time ago, and we've had a number of releases since.  Moving to CURRENTRELEASE, although feel free to reopen if I'm misunderstanding the workflow.