Bug 1810896 - [4.5] Install OCP 4.5 with ovn kubernetes failed
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.5.0
Assignee: Ricardo Carrillo Cruz
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-03-06 06:59 UTC by gaoshang
Modified: 2020-03-09 11:29 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-03-09 11:29:30 UTC
Target Upstream Version:
Embargoed:


Description gaoshang 2020-03-06 06:59:19 UTC
Description of problem:
Installing OCP 4.5 with OVN-Kubernetes fails. Checked from the bootstrap machine, all ovnkube-master-* pods are in CrashLoopBackOff status.

Version-Release number of selected component (if applicable):
OCP 4.5.0-0.nightly-2020-03-05-190442

How reproducible:
Always

Steps to Reproduce:
1. Install OCP 4.5 with the OVNKubernetes network type

Actual results:
The installation fails.

Expected results:
The installation succeeds.


Additional info:
[root@ip-10-0-4-83 ~]# oc describe pod/ovnkube-master-6nrx4
Name:                 ovnkube-master-6nrx4
Namespace:            openshift-ovn-kubernetes
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 ip-10-0-150-151.us-east-2.compute.internal/10.0.150.151
Start Time:           Fri, 06 Mar 2020 04:50:17 +0000
Labels:               app=ovnkube-master
                      component=network
                      controller-revision-hash=f778c8785
                      kubernetes.io/os=linux
                      openshift.io/component=network
                      pod-template-generation=2
                      type=infra
Annotations:          <none>
Status:               Running
IP:                   10.0.150.151
IPs:
  IP:           10.0.150.151
Controlled By:  DaemonSet/ovnkube-master
Containers:
  northd:
    Container ID:  cri-o://924b1caf0b2b6336437639343584465d262403a84acf380ab0e64b9a73495b69
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:66ffaa52ca844214ef09a0a18222ccc2a248d4ddb69e9fe11ac11de24a747290
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:66ffaa52ca844214ef09a0a18222ccc2a248d4ddb69e9fe11ac11de24a747290
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/bash
      -c
      set -xe
      if [[ -f /env/_master ]]; then
        set -o allexport
        source /env/_master
        set +o allexport
      fi
      
      exec ovn-northd \
        --no-chdir "-vconsole:${OVN_LOG_LEVEL}" -vfile:off \
        --ovnnb-db "ssl:10.0.130.166:9641,ssl:10.0.150.151:9641,ssl:10.0.164.31:9641" \
        --ovnsb-db "ssl:10.0.130.166:9642,ssl:10.0.150.151:9642,ssl:10.0.164.31:9642" \
        -p /ovn-cert/tls.key \
        -c /ovn-cert/tls.crt \
        -C /ovn-ca/ca-bundle.crt 
      
    State:          Running
      Started:      Fri, 06 Mar 2020 04:50:18 +0000
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:     100m
      memory:  300Mi
    Environment:
      OVN_LOG_LEVEL:  info
    Mounts:
      /env from env-overrides (rw)
      /etc/openvswitch/ from etc-openvswitch (rw)
      /ovn-ca from ovn-ca (rw)
      /ovn-cert from ovn-cert (rw)
      /run/openvswitch/ from run-openvswitch (rw)
      /run/ovn/ from run-ovn (rw)
      /var/lib/openvswitch/ from var-lib-openvswitch (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from ovn-kubernetes-controller-token-nsr77 (ro)
  nbdb:
    Container ID:  cri-o://7951dd43cef22ac16e2232e616f4597b7bdf4f30f401bdf3b0cf96bc88278ea6
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:66ffaa52ca844214ef09a0a18222ccc2a248d4ddb69e9fe11ac11de24a747290
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:66ffaa52ca844214ef09a0a18222ccc2a248d4ddb69e9fe11ac11de24a747290
    Ports:         9641/TCP, 9643/TCP
    Host Ports:    9641/TCP, 9643/TCP
    Command:
      /bin/bash
      -c
      set -xe
      if [[ -f /env/_master ]]; then
        set -o allexport
        source /env/_master
        set +o allexport
      fi
      
      bracketify() { case "$1" in *:*) echo "[$1]" ;; *) echo "$1" ;; esac }
      
      MASTER_IP="10.0.130.166"
      if [[ "${K8S_NODE_IP}" == "${MASTER_IP}" ]]; then
        exec /usr/share/ovn/scripts/ovn-ctl \
        --db-nb-cluster-local-port=9643 \
        --db-nb-cluster-local-addr=$(bracketify ${K8S_NODE_IP}) \
        --no-monitor \
        --db-nb-cluster-local-proto=ssl \
        --ovn-nb-db-ssl-key=/ovn-cert/tls.key \
        --ovn-nb-db-ssl-cert=/ovn-cert/tls.crt \
        --ovn-nb-db-ssl-ca-cert=/ovn-ca/ca-bundle.crt \
        --ovn-nb-log="-vconsole:${OVN_LOG_LEVEL} -vfile:off" \
        run_nb_ovsdb
      else
        exec /usr/share/ovn/scripts/ovn-ctl \
        --db-nb-cluster-local-port=9643 \
        --db-nb-cluster-remote-port=9643 \
        --db-nb-cluster-local-addr=$(bracketify ${K8S_NODE_IP}) \
        --db-nb-cluster-remote-addr=$(bracketify ${MASTER_IP}) \
        --no-monitor \
        --db-nb-cluster-local-proto=ssl \
        --db-nb-cluster-remote-proto=ssl \
        --ovn-nb-db-ssl-key=/ovn-cert/tls.key \
        --ovn-nb-db-ssl-cert=/ovn-cert/tls.crt \
        --ovn-nb-db-ssl-ca-cert=/ovn-ca/ca-bundle.crt \
        --ovn-nb-log="-vconsole:${OVN_LOG_LEVEL} -vfile:off" \
        run_nb_ovsdb
      fi
      
    State:       Waiting
      Reason:    CrashLoopBackOff
    Last State:  Terminated
      Reason:    Error
      Message:   + [[ -f /env/_master ]]
+ MASTER_IP=10.0.130.166
+ [[ 10.0.150.151 == \1\0\.\0\.\1\3\0\.\1\6\6 ]]
++ bracketify 10.0.150.151
++ case "$1" in
++ echo 10.0.150.151
++ bracketify 10.0.130.166
++ case "$1" in
++ echo 10.0.130.166
+ exec /usr/share/ovn/scripts/ovn-ctl --db-nb-cluster-local-port=9643 --db-nb-cluster-remote-port=9643 --db-nb-cluster-local-addr=10.0.150.151 --db-nb-cluster-remote-addr=10.0.130.166 --no-monitor --db-nb-cluster-local-proto=ssl --db-nb-cluster-remote-proto=ssl --ovn-nb-db-ssl-key=/ovn-cert/tls.key --ovn-nb-db-ssl-cert=/ovn-cert/tls.crt --ovn-nb-db-ssl-ca-cert=/ovn-ca/ca-bundle.crt '--ovn-nb-log=-vconsole:info -vfile:off' run_nb_ovsdb
/bin/bash: line 22: /usr/share/ovn/scripts/ovn-ctl: No such file or directory

      Exit Code:    1
      Started:      Fri, 06 Mar 2020 06:43:07 +0000
      Finished:     Fri, 06 Mar 2020 06:43:07 +0000
    Ready:          False
    Restart Count:  27
    Requests:
      cpu:      100m
      memory:   300Mi
    Readiness:  exec [/bin/bash -c set -xe
exec /usr/bin/ovn-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound  2>/dev/null | grep ${K8S_NODE_IP} | grep -v Address -q
] delay=30s timeout=1s period=10s #success=1 #failure=3
    Environment:
      OVN_LOG_LEVEL:  info
      K8S_NODE_IP:     (v1:status.hostIP)
    Mounts:
      /env from env-overrides (rw)
      /etc/openvswitch/ from etc-openvswitch (rw)
      /etc/ovn/ from etc-openvswitch (rw)
      /ovn-ca from ovn-ca (rw)
      /ovn-cert from ovn-cert (rw)
      /run/openvswitch/ from run-openvswitch (rw)
      /run/ovn/ from run-ovn (rw)
      /var/lib/openvswitch/ from var-lib-openvswitch (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from ovn-kubernetes-controller-token-nsr77 (ro)
  sbdb:
    Container ID:  cri-o://216573141b91e988c330f577cfa3e59e0d38b139ab26ea4483d07f8ffae6637b
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:66ffaa52ca844214ef09a0a18222ccc2a248d4ddb69e9fe11ac11de24a747290
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:66ffaa52ca844214ef09a0a18222ccc2a248d4ddb69e9fe11ac11de24a747290
    Ports:         9642/TCP, 9644/TCP
    Host Ports:    9642/TCP, 9644/TCP
    Command:
      /bin/bash
      -c
      set -xe
      if [[ -f /env/_master ]]; then
        set -o allexport
        source /env/_master
        set +o allexport
      fi
      
      bracketify() { case "$1" in *:*) echo "[$1]" ;; *) echo "$1" ;; esac }
      
      MASTER_IP="10.0.130.166"
      if [[ "${K8S_NODE_IP}" == "${MASTER_IP}" ]]; then
        exec /usr/share/ovn/scripts/ovn-ctl \
        --db-sb-cluster-local-port=9644 \
        --db-sb-cluster-local-addr=$(bracketify ${K8S_NODE_IP}) \
        --no-monitor \
        --db-sb-cluster-local-proto=ssl \
        --ovn-sb-db-ssl-key=/ovn-cert/tls.key \
        --ovn-sb-db-ssl-cert=/ovn-cert/tls.crt \
        --ovn-sb-db-ssl-ca-cert=/ovn-ca/ca-bundle.crt \
        --ovn-sb-log="-vconsole:${OVN_LOG_LEVEL} -vfile:off" \
        run_sb_ovsdb
      else
        echo "joining cluster at ${MASTER_IP}"
        exec /usr/share/ovn/scripts/ovn-ctl \
        --db-sb-cluster-local-port=9644 \
        --db-sb-cluster-remote-port=9644 \
        --db-sb-cluster-local-addr=$(bracketify ${K8S_NODE_IP}) \
        --db-sb-cluster-remote-addr=$(bracketify ${MASTER_IP}) \
        --no-monitor \
        --db-sb-cluster-local-proto=ssl \
        --db-sb-cluster-remote-proto=ssl \
        --ovn-sb-db-ssl-key=/ovn-cert/tls.key \
        --ovn-sb-db-ssl-cert=/ovn-cert/tls.crt \
        --ovn-sb-db-ssl-ca-cert=/ovn-ca/ca-bundle.crt \
        --ovn-sb-log="-vconsole:${OVN_LOG_LEVEL} -vfile:off" \
        run_sb_ovsdb
      fi
      
    State:       Waiting
      Reason:    CrashLoopBackOff
    Last State:  Terminated
      Reason:    Error
      Message:   + [[ -f /env/_master ]]
+ MASTER_IP=10.0.130.166
+ [[ 10.0.150.151 == \1\0\.\0\.\1\3\0\.\1\6\6 ]]
+ echo 'joining cluster at 10.0.130.166'
joining cluster at 10.0.130.166
++ bracketify 10.0.150.151
++ case "$1" in
++ echo 10.0.150.151
++ bracketify 10.0.130.166
++ case "$1" in
++ echo 10.0.130.166
+ exec /usr/share/ovn/scripts/ovn-ctl --db-sb-cluster-local-port=9644 --db-sb-cluster-remote-port=9644 --db-sb-cluster-local-addr=10.0.150.151 --db-sb-cluster-remote-addr=10.0.130.166 --no-monitor --db-sb-cluster-local-proto=ssl --db-sb-cluster-remote-proto=ssl --ovn-sb-db-ssl-key=/ovn-cert/tls.key --ovn-sb-db-ssl-cert=/ovn-cert/tls.crt --ovn-sb-db-ssl-ca-cert=/ovn-ca/ca-bundle.crt '--ovn-sb-log=-vconsole:info -vfile:off' run_sb_ovsdb
/bin/bash: line 23: /usr/share/ovn/scripts/ovn-ctl: No such file or directory

      Exit Code:    1
      Started:      Fri, 06 Mar 2020 06:43:08 +0000
      Finished:     Fri, 06 Mar 2020 06:43:08 +0000
    Ready:          False
    Restart Count:  27
    Readiness:      exec [/bin/bash -c set -xe
exec /usr/bin/ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound  2>/dev/null | grep ${K8S_NODE_IP} | grep -v Address -q
] delay=30s timeout=1s period=10s #success=1 #failure=3
    Environment:
      OVN_LOG_LEVEL:  info
      K8S_NODE_IP:     (v1:status.hostIP)
    Mounts:
      /env from env-overrides (rw)
      /etc/openvswitch/ from etc-openvswitch (rw)
      /etc/ovn/ from etc-openvswitch (rw)
      /ovn-ca from ovn-ca (rw)
      /ovn-cert from ovn-cert (rw)
      /run/openvswitch/ from run-openvswitch (rw)
      /run/ovn/ from run-ovn (rw)
      /var/lib/openvswitch/ from var-lib-openvswitch (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from ovn-kubernetes-controller-token-nsr77 (ro)
  ovnkube-master:
    Container ID:  cri-o://8ea92f8479639fcd21bed6520430b789f195db0181f8b0b8ca9f4d5a192faf65
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:66ffaa52ca844214ef09a0a18222ccc2a248d4ddb69e9fe11ac11de24a747290
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:66ffaa52ca844214ef09a0a18222ccc2a248d4ddb69e9fe11ac11de24a747290
    Port:          9102/TCP
    Host Port:     9102/TCP
    Command:
      /bin/bash
      -c
      set -xe
      if [[ -f "/env/_master" ]]; then
        set -o allexport
        source "/env/_master"
        set +o allexport
      fi
      
      hybrid_overlay_flags=
      if [[ -n "" ]]; then
        hybrid_overlay_flags="--enable-hybrid-overlay --no-hostsubnet-nodes=\"kubernetes.io/os=windows\""
        if [[ -n "" ]]; then
          hybrid_overlay_flags="${hybrid_overlay_flags} --hybrid-overlay-cluster-subnets="
        fi
      fi
      
      # start nbctl daemon for caching
      export OVN_NB_DAEMON=$(ovn-nbctl --pidfile=/tmp/ovnk-nbctl.pid \
        --detach \
        -p /ovn-cert/tls.key -c /ovn-cert/tls.crt -C /ovn-ca/ca-bundle.crt \
        --db "ssl:10.0.130.166:9641,ssl:10.0.150.151:9641,ssl:10.0.164.31:9641")
      
      exec /usr/bin/ovnkube \
        --init-master "${K8S_NODE}" \
        --config-file=/run/ovnkube-config/ovnkube.conf \
        --ovn-empty-lb-events \
        --loglevel "${OVN_KUBE_LOG_LEVEL}" \
        ${hybrid_overlay_flags} \
        --metrics-bind-address "0.0.0.0:9102" \
        --sb-address "ssl://10.0.130.166:9642,ssl://10.0.150.151:9642,ssl://10.0.164.31:9642" \
        --sb-client-privkey /ovn-cert/tls.key \
        --sb-client-cert /ovn-cert/tls.crt \
        --sb-client-cacert /ovn-ca/ca-bundle.crt
      
    State:          Running
      Started:      Fri, 06 Mar 2020 04:50:19 +0000
    Ready:          True
    Restart Count:  0
    Requests:
      cpu:     100m
      memory:  300Mi
    Environment:
      OVN_KUBE_LOG_LEVEL:  4
      K8S_NODE:             (v1:spec.nodeName)
    Mounts:
      /env from env-overrides (rw)
      /etc/openvswitch/ from etc-openvswitch (rw)
      /etc/ovn/ from etc-openvswitch (rw)
      /ovn-ca from ovn-ca (rw)
      /ovn-cert from ovn-cert (rw)
      /run/openvswitch/ from run-openvswitch (rw)
      /run/ovn/ from run-ovn (rw)
      /run/ovnkube-config/ from ovnkube-config (rw)
      /var/lib/openvswitch/ from var-lib-openvswitch (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from ovn-kubernetes-controller-token-nsr77 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  etc-openvswitch:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/ovn/etc
    HostPathType:  
  var-lib-openvswitch:
    Type:          HostPath (bare host directory volume)
    Path:          /var/lib/ovn/data
    HostPathType:  
  run-openvswitch:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  run-ovn:
    Type:          HostPath (bare host directory volume)
    Path:          /var/run/ovn
    HostPathType:  
  ovnkube-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ovnkube-config
    Optional:  false
  env-overrides:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      env-overrides
    Optional:  true
  ovn-ca:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      ovn-ca
    Optional:  false
  ovn-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ovn-cert
    Optional:    false
  ovn-kubernetes-controller-token-nsr77:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  ovn-kubernetes-controller-token-nsr77
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  beta.kubernetes.io/os=linux
                 node-role.kubernetes.io/master=
Tolerations:     node-role.kubernetes.io/master
                 node.kubernetes.io/disk-pressure:NoSchedule
                 node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/network-unavailable
                 node.kubernetes.io/not-ready
                 node.kubernetes.io/pid-pressure:NoSchedule
                 node.kubernetes.io/unreachable
                 node.kubernetes.io/unschedulable:NoSchedule
Events:
  Type     Reason               Age                  From                                                 Message
  ----     ------               ----                 ----                                                 -------
  Normal   Scheduled            <unknown>            default-scheduler                                    Successfully assigned openshift-ovn-kubernetes/ovnkube-master-6nrx4 to ip-10-0-150-151.us-east-2.compute.internal
  Normal   Pulled               117m                 kubelet, ip-10-0-150-151.us-east-2.compute.internal  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:66ffaa52ca844214ef09a0a18222ccc2a248d4ddb69e9fe11ac11de24a747290" already present on machine
  Normal   Created              117m                 kubelet, ip-10-0-150-151.us-east-2.compute.internal  Created container northd
  Normal   Started              117m                 kubelet, ip-10-0-150-151.us-east-2.compute.internal  Started container northd
  Normal   Pulled               117m                 kubelet, ip-10-0-150-151.us-east-2.compute.internal  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:66ffaa52ca844214ef09a0a18222ccc2a248d4ddb69e9fe11ac11de24a747290" already present on machine
  Normal   Killing              117m                 kubelet, ip-10-0-150-151.us-east-2.compute.internal  FailedPostStartHook
  Normal   Pulled               117m (x2 over 117m)  kubelet, ip-10-0-150-151.us-east-2.compute.internal  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:66ffaa52ca844214ef09a0a18222ccc2a248d4ddb69e9fe11ac11de24a747290" already present on machine
  Warning  FailedPostStartHook  117m (x2 over 117m)  kubelet, ip-10-0-150-151.us-east-2.compute.internal  Exec lifecycle hook ([/bin/bash -c MASTER_IP="10.0.130.166"
if [[ "${K8S_NODE_IP}" == "${MASTER_IP}" ]]; then
  retries=0
  while ! ovn-nbctl --no-leader-only -t 5 set-connection pssl:9641 -- set connection . inactivity_probe=0; do
    (( retries += 1 ))
  if [[ "${retries}" -gt 40 ]]; then
    echo "too many failed ovn-nbctl attempts, giving up"
      exit 1
  fi
  sleep 2
  done
fi
]) for Container "nbdb" in Pod "ovnkube-master-6nrx4_openshift-ovn-kubernetes(fcfb5d16-88b1-4781-82fc-e73ba17279e9)" failed - error: rpc error: code = Unknown desc = container is not created or running, message: ""
  Normal   Killing              117m (x2 over 117m)  kubelet, ip-10-0-150-151.us-east-2.compute.internal  FailedPostStartHook
  Normal   Started              117m (x2 over 117m)  kubelet, ip-10-0-150-151.us-east-2.compute.internal  Started container nbdb
  Normal   Created              117m (x2 over 117m)  kubelet, ip-10-0-150-151.us-east-2.compute.internal  Created container sbdb
  Normal   Started              117m                 kubelet, ip-10-0-150-151.us-east-2.compute.internal  Started container ovnkube-master
  Normal   Created              117m                 kubelet, ip-10-0-150-151.us-east-2.compute.internal  Created container ovnkube-master
  Normal   Created              117m (x2 over 117m)  kubelet, ip-10-0-150-151.us-east-2.compute.internal  Created container nbdb
  Normal   Pulled               117m (x2 over 117m)  kubelet, ip-10-0-150-151.us-east-2.compute.internal  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:66ffaa52ca844214ef09a0a18222ccc2a248d4ddb69e9fe11ac11de24a747290" already present on machine
  Warning  FailedPostStartHook  117m (x2 over 117m)  kubelet, ip-10-0-150-151.us-east-2.compute.internal  Exec lifecycle hook ([/bin/bash -c MASTER_IP="10.0.130.166"
if [[ "${K8S_NODE_IP}" == "${MASTER_IP}" ]]; then
  retries=0
  while ! ovn-sbctl --no-leader-only -t 5 set-connection pssl:9642 -- set connection . inactivity_probe=0; do
    (( retries += 1 ))
  if [[ "${retries}" -gt 40 ]]; then
    echo "too many failed ovn-sbctl attempts, giving up"
      exit 1
  fi
  sleep 2
  done
fi
]) for Container "sbdb" in Pod "ovnkube-master-6nrx4_openshift-ovn-kubernetes(fcfb5d16-88b1-4781-82fc-e73ba17279e9)" failed - error: rpc error: code = Unknown desc = container is not created or running, message: ""
  Normal   Started  117m (x2 over 117m)     kubelet, ip-10-0-150-151.us-east-2.compute.internal  Started container sbdb
  Warning  BackOff  32m (x411 over 117m)    kubelet, ip-10-0-150-151.us-east-2.compute.internal  Back-off restarting failed container
  Warning  BackOff  2m19s (x552 over 117m)  kubelet, ip-10-0-150-151.us-east-2.compute.internal  Back-off restarting failed container
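
Both crashing containers (nbdb and sbdb) terminate on the same line in the traces above: `exec` cannot find `/usr/share/ovn/scripts/ovn-ctl` inside the image. A minimal sketch of a pre-flight check that would surface this before the `exec` is shown below. `require_executable` is a hypothetical helper and is not part of the actual DaemonSet command; the path is the one from the traces.

```shell
# Hypothetical helper (not in the original manifest): return non-zero with a
# clear message unless the given path is a regular, executable file.
require_executable() {
  if [ ! -f "$1" ] || [ ! -x "$1" ]; then
    echo "required executable not found: $1" >&2
    return 1
  fi
}

# Path taken from the nbdb/sbdb traces in this report.
OVN_CTL="/usr/share/ovn/scripts/ovn-ctl"

if require_executable "${OVN_CTL}"; then
  echo "ovn-ctl present at ${OVN_CTL}"
else
  echo "ovn-ctl missing at ${OVN_CTL}"
fi
```

Run inside the affected image, this would only confirm what the trace already reports; it does not explain why the script is absent.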


[root@ip-10-0-4-83 ~]# oc describe node ip-10-0-150-151.us-east-2.compute.internal
Name:               ip-10-0-150-151.us-east-2.compute.internal
Roles:              master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=m4.xlarge
                    beta.kubernetes.io/os=linux
                    failure-domain.beta.kubernetes.io/region=us-east-2
                    failure-domain.beta.kubernetes.io/zone=us-east-2b
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ip-10-0-150-151
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/master=
                    node.kubernetes.io/instance-type=m4.xlarge
                    node.openshift.io/os_id=rhcos
                    topology.kubernetes.io/region=us-east-2
                    topology.kubernetes.io/zone=us-east-2b
Annotations:        volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Fri, 06 Mar 2020 04:43:49 +0000
Taints:             node-role.kubernetes.io/master:NoSchedule
                    node.kubernetes.io/not-ready:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  ip-10-0-150-151.us-east-2.compute.internal
  AcquireTime:     <unset>
  RenewTime:       Fri, 06 Mar 2020 06:53:15 +0000
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Fri, 06 Mar 2020 06:50:36 +0000   Fri, 06 Mar 2020 04:43:49 +0000   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Fri, 06 Mar 2020 06:50:36 +0000   Fri, 06 Mar 2020 04:43:49 +0000   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Fri, 06 Mar 2020 06:50:36 +0000   Fri, 06 Mar 2020 04:43:49 +0000   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Fri, 06 Mar 2020 06:50:36 +0000   Fri, 06 Mar 2020 04:43:49 +0000   KubeletNotReady              runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: Missing CNI default network
Addresses:
  InternalIP:   10.0.150.151
  Hostname:     ip-10-0-150-151.us-east-2.compute.internal
  InternalDNS:  ip-10-0-150-151.us-east-2.compute.internal
Capacity:
  attachable-volumes-aws-ebs:  39
  cpu:                         4
  ephemeral-storage:           125277164Ki
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      16419376Ki
  pods:                        250
Allocatable:
  attachable-volumes-aws-ebs:  39
  cpu:                         3500m
  ephemeral-storage:           114381692328
  hugepages-1Gi:               0
  hugepages-2Mi:               0
  memory:                      15268400Ki
  pods:                        250
System Info:
  Machine ID:                 dda37d76906a4d5abd79dbcd1e372156
  System UUID:                ec29f38f-1072-56b4-a458-591bb321f1fc
  Boot ID:                    dee4005e-25a0-4340-8db5-8fd1c4e6de0f
  Kernel Version:             4.18.0-147.5.1.el8_1.x86_64
  OS Image:                   Red Hat Enterprise Linux CoreOS 44.81.202002190130-0 (Ootpa)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  cri-o://1.17.0-4.dev.rhaos4.4.gitc3436cc.el8
  Kubelet Version:            v1.17.1
  Kube-Proxy Version:         v1.17.1
ProviderID:                   aws:///us-east-2b/i-062e23a89609460ad
Non-terminated Pods:          (5 in total)
  Namespace                   Name                                 CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                   ----                                 ------------  ----------  ---------------  -------------  ---
  openshift-multus            multus-bdfzb                         10m (0%)      0 (0%)      150Mi (1%)       0 (0%)         129m
  openshift-network-operator  network-operator-5bf64d5d6c-bzb72    10m (0%)      0 (0%)      50Mi (0%)        0 (0%)         131m
  openshift-ovn-kubernetes    ovnkube-master-6nrx4                 300m (8%)     0 (0%)      900Mi (6%)       0 (0%)         123m
  openshift-ovn-kubernetes    ovnkube-node-22djj                   200m (5%)     0 (0%)      600Mi (4%)       0 (0%)         123m
  openshift-ovn-kubernetes    ovs-node-bkfdq                       100m (2%)     0 (0%)      300Mi (2%)       0 (0%)         128m
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests      Limits
  --------                    --------      ------
  cpu                         620m (17%)    0 (0%)
  memory                      2000Mi (13%)  0 (0%)
  ephemeral-storage           0 (0%)        0 (0%)
  attachable-volumes-aws-ebs  0             0
Events:
  Type    Reason                   Age                  From                                                 Message
  ----    ------                   ----                 ----                                                 -------
  Normal  NodeHasSufficientMemory  129m (x8 over 129m)  kubelet, ip-10-0-150-151.us-east-2.compute.internal  Node ip-10-0-150-151.us-east-2.compute.internal status is now: NodeHasSufficientMemory
  Normal  NodeHasNoDiskPressure    129m (x8 over 129m)  kubelet, ip-10-0-150-151.us-east-2.compute.internal  Node ip-10-0-150-151.us-east-2.compute.internal status is now: NodeHasNoDiskPressure
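
A note on the `bracketify` calls in the nbdb and sbdb traces of the pod description earlier in this report: the helper wraps an address in square brackets only when it contains a colon, that is, when it looks like an IPv6 literal. This is why both IPv4 addresses are echoed unchanged. The function below has the same logic as the one in the container command, reflowed across lines; the IPv6 address is an illustrative value and does not come from this cluster.

```shell
# Same logic as the bracketify helper in the nbdb/sbdb container command.
bracketify() {
  case "$1" in
    *:*) echo "[$1]" ;;   # contains a colon: treat as IPv6, add brackets
    *)   echo "$1" ;;     # anything else: print unchanged
  esac
}

# IPv4 address from the trace: printed unchanged.
bracketify 10.0.150.151
# Illustrative IPv6 address (assumption, not from this cluster): bracketed.
bracketify fd00::1
```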

Comment 1 Ricardo Carrillo Cruz 2020-03-09 11:29:30 UTC
I just created a 4.5 cluster without any problems:

[ricky@ricky-laptop openshift-installer]$ oc get clusterversion
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.ci-2020-03-09-035935   True        False         18m     Cluster version is 4.5.0-0.ci-2020-03-09-035935


Closing this. If it recurs, please provide fresh logs.
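
For a re-report, the most relevant logs would probably be those of the crashing `nbdb` and `sbdb` containers from their previous (terminated) run. A minimal sketch follows that only prints the `oc logs` commands for review instead of running them. `print_log_commands` is a hypothetical helper; the namespace, pod, and container names are the ones from this report, and the pod name will differ on a fresh cluster.

```shell
# Hypothetical helper: print (not run) one `oc logs --previous` command per
# container so the commands can be reviewed before being executed.
print_log_commands() {
  namespace="$1"
  pod="$2"
  shift 2
  for container in "$@"; do
    echo "oc logs -n ${namespace} ${pod} -c ${container} --previous"
  done
}

# Namespace, pod and container names as they appear in this report.
print_log_commands openshift-ovn-kubernetes ovnkube-master-6nrx4 nbdb sbdb
```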

