Bug 1794775
| Summary: | [Bare Metal] OVN install fails on UPI | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Anurag saxena <anusaxen> |
| Component: | Networking | Assignee: | Ricardo Carrillo Cruz <ricarril> |
| Networking sub component: | ovn-kubernetes | QA Contact: | Anurag saxena <anusaxen> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | anbhat, bbennett, danw, mcambria, mifiedle, pcameron, rbrattai, ricarril, weliang, wsun, xtian, zshi, zzhao |
| Version: | 4.4 | Flags: | weliang: needinfo-, anusaxen: needinfo- |
| Target Milestone: | --- | | |
| Target Release: | 4.5.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1751274 | Environment: | |
| Last Closed: | 2020-07-13 17:13:28 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description
Anurag saxena
2020-01-24 16:10:02 UTC

*** Bug 1751274 has been marked as a duplicate of this bug. ***

Assigning to Ricky.

Hi there, can you please reproduce this and give me a link to the environment? Thanks.

The kube-apiserver static pods died:
Name: kube-apiserver
Namespace:
Labels: <none>
Annotations: <none>
API Version: config.openshift.io/v1
Kind: ClusterOperator
Metadata:
Creation Timestamp: 2020-02-03T15:29:11Z
Generation: 1
Resource Version: 278447
Self Link: /apis/config.openshift.io/v1/clusteroperators/kube-apiserver
UID: 160bc85d-87e6-4a2b-84a9-6133866285f7
Spec:
Status:
Conditions:
Last Transition Time: 2020-02-03T15:31:25Z
Message: NodeInstallerDegraded: 1 nodes are failing on revision 4:
NodeInstallerDegraded:
StaticPodsDegraded: pods "kube-apiserver-ip-10-0-55-223" not found
StaticPodsDegraded: pods "kube-apiserver-ip-10-0-76-10" not found
StaticPodsDegraded: pods "kube-apiserver-ip-10-0-53-159" not found
Reason: NodeInstaller_InstallerPodFailed::StaticPods_Error
Status: True
Type: Degraded
Last Transition Time: 2020-02-03T15:29:17Z
Message: Progressing: 3 nodes are at revision 0; 0 nodes have achieved new revision 5
Status: True
Type: Progressing
Last Transition Time: 2020-02-03T15:29:11Z
Message: Available: 0 nodes are active; 3 nodes are at revision 0; 0 nodes have achieved new revision 5
Reason: _ZeroNodesActive
Status: False
Type: Available
Last Transition Time: 2020-02-03T15:29:11Z
Reason: AsExpected
Status: True
Type: Upgradeable
Extension: <nil>
Related Objects:
Group: operator.openshift.io
Name: cluster
Resource: kubeapiservers
Group: apiextensions.k8s.io
Name:
Resource: customresourcedefinitions
Group:
Name: openshift-config
Resource: namespaces
Group:
Name: openshift-config-managed
Resource: namespaces
Group:
Name: openshift-kube-apiserver-operator
Resource: namespaces
Group:
Name: openshift-kube-apiserver
Resource: namespaces
Versions:
Name: raw-internal
Version: 4.4.0-0.nightly-2020-02-03-081920
Events: <none>
The network was working, though:
Name: network
Namespace:
Labels: <none>
Annotations: network.operator.openshift.io/last-seen-state: {"DaemonsetStates":[],"DeploymentStates":[]}
API Version: config.openshift.io/v1
Kind: ClusterOperator
Metadata:
Creation Timestamp: 2020-02-03T15:26:49Z
Generation: 1
Resource Version: 86143
Self Link: /apis/config.openshift.io/v1/clusteroperators/network
UID: 22cb8503-d6d6-4d15-ba32-8dd964f2c6f3
Spec:
Status:
Conditions:
Last Transition Time: 2020-02-03T20:31:56Z
Status: False
Type: Degraded
Last Transition Time: 2020-02-03T15:26:49Z
Status: True
Type: Upgradeable
Last Transition Time: 2020-02-03T15:35:40Z
Status: False
Type: Progressing
Last Transition Time: 2020-02-03T15:28:17Z
Status: True
Type: Available
Extension: <nil>
Related Objects:
Group:
Name: applied-cluster
Namespace: openshift-network-operator
Resource: configmaps
Group: apiextensions.k8s.io
Name: network-attachment-definitions.k8s.cni.cncf.io
Resource: customresourcedefinitions
Group:
Name: openshift-multus
Resource: namespaces
Group: rbac.authorization.k8s.io
Name: multus
Resource: clusterroles
Group:
Name: multus
Namespace: openshift-multus
Resource: serviceaccounts
Group: rbac.authorization.k8s.io
Name: multus
Resource: clusterrolebindings
Group: apps
Name: multus
Namespace: openshift-multus
Resource: daemonsets
Group:
Name: multus-admission-controller
Namespace: openshift-multus
Resource: services
Group: rbac.authorization.k8s.io
Name: multus-admission-controller-webhook
Resource: clusterroles
Group: rbac.authorization.k8s.io
Name: multus-admission-controller-webhook
Resource: clusterrolebindings
Group: admissionregistration.k8s.io
Name: multus.openshift.io
Resource: validatingwebhookconfigurations
Group:
Name: openshift-service-ca
Namespace: openshift-network-operator
Resource: configmaps
Group: apps
Name: multus-admission-controller
Namespace: openshift-multus
Resource: daemonsets
Group:
Name: multus-admission-controller-monitor-service
Namespace: openshift-multus
Resource: services
Group: rbac.authorization.k8s.io
Name: prometheus-k8s
Namespace: openshift-multus
Resource: roles
Group: rbac.authorization.k8s.io
Name: prometheus-k8s
Namespace: openshift-multus
Resource: rolebindings
Group:
Name: openshift-ovn-kubernetes
Resource: namespaces
Group:
Name: ovn-kubernetes-node
Namespace: openshift-ovn-kubernetes
Resource: serviceaccounts
Group: rbac.authorization.k8s.io
Name: openshift-ovn-kubernetes-node
Resource: clusterroles
Group: rbac.authorization.k8s.io
Name: openshift-ovn-kubernetes-node
Resource: clusterrolebindings
Group:
Name: ovn-kubernetes-controller
Namespace: openshift-ovn-kubernetes
Resource: serviceaccounts
Group: rbac.authorization.k8s.io
Name: openshift-ovn-kubernetes-controller
Resource: clusterroles
Group: rbac.authorization.k8s.io
Name: openshift-ovn-kubernetes-controller
Resource: clusterrolebindings
Group: rbac.authorization.k8s.io
Name: openshift-ovn-kubernetes-sbdb
Namespace: openshift-ovn-kubernetes
Resource: roles
Group: rbac.authorization.k8s.io
Name: openshift-ovn-kubernetes-sbdb
Namespace: openshift-ovn-kubernetes
Resource: rolebindings
Group:
Name: ovnkube-config
Namespace: openshift-ovn-kubernetes
Resource: configmaps
Group:
Name: ovnkube-db
Namespace: openshift-ovn-kubernetes
Resource: services
Group: apps
Name: ovs-node
Namespace: openshift-ovn-kubernetes
Resource: daemonsets
Group: network.operator.openshift.io
Name: ovn
Namespace: openshift-ovn-kubernetes
Resource: operatorpkis
Group:
Name: ovn-kubernetes-master
Namespace: openshift-ovn-kubernetes
Resource: services
Group:
Name: ovn-kubernetes-node
Namespace: openshift-ovn-kubernetes
Resource: services
Group: rbac.authorization.k8s.io
Name: prometheus-k8s
Namespace: openshift-ovn-kubernetes
Resource: roles
Group: rbac.authorization.k8s.io
Name: prometheus-k8s
Namespace: openshift-ovn-kubernetes
Resource: rolebindings
Group: policy
Name: ovn-raft-quorum-guard
Namespace: openshift-ovn-kubernetes
Resource: poddisruptionbudgets
Group: apps
Name: ovnkube-master
Namespace: openshift-ovn-kubernetes
Resource: daemonsets
Group: apps
Name: ovnkube-node
Namespace: openshift-ovn-kubernetes
Resource: daemonsets
Group:
Name: openshift-network-operator
Resource: namespaces
Versions:
Name: operator
Version: 4.4.0-0.nightly-2020-02-03-081920
Events: <none>
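The two ClusterOperator dumps above can also be cross-checked from the CLI, and the failed revision-4 installer that the Degraded condition points at can be drilled into. A minimal sketch, assuming cluster-admin access; the installer pod name is an illustrative guess built from the revision number and a node name in the Degraded message, not taken from the environment:

```shell
# Compare the operator states reported above.
oc get clusteroperators kube-apiserver network

# List pods in the kube-apiserver namespace; the Degraded message says one
# node is failing on revision 4, so look for its installer pod.
oc -n openshift-kube-apiserver get pods -o wide

# If a revision-4 installer pod is still present, pull its logs
# (pod name below is an assumed example).
oc -n openshift-kube-apiserver logs installer-4-ip-10-0-55-223 --all-containers
```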
We need to look at the node logs to ascertain why the apiserver died.
Is it possible to ssh to them?
Thanks
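On a bare-metal UPI cluster the control-plane nodes are normally reachable over SSH as the `core` user, so the kubelet journal and the static pod containers can be inspected directly on a node. A minimal sketch, assuming SSH access to one of the masters named in the StaticPodsDegraded messages; the container ID is a placeholder:

```shell
# SSH to one of the affected masters (hostname taken from the Degraded condition above).
ssh core@ip-10-0-55-223

# Check the kubelet journal around the time the static pod disappeared.
sudo journalctl -u kubelet --since "2020-02-03 15:00" | less

# List all kube-apiserver containers, including exited ones, then read their logs.
sudo crictl ps -a --name kube-apiserver
sudo crictl logs <container-id>
```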
I couldn't find the kube-apiserver logs; they must have been garbage-collected. I'd need someone from QE closer to my region to spin up the environment so I can jump on it quickly and tail the logs before they are lost (see the must-gather sketch at the end of this report).

Hi Ricardo, I still see this issue in the latest 4.4 image. QE can retest it after https://bugzilla.redhat.com/show_bug.cgi?id=1796844 gets fixed.

Moving to 4.5 since this is an ovn-kubernetes issue.

This issue was not reproduced with the latest 4.4 and 4.5 nightly builds; it appears to have been fixed by a PR that has since merged. Moving this bug to verified.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409
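As noted above, the kube-apiserver logs had already been garbage-collected by the time anyone could look at the environment. On a future reproduction, capturing a must-gather right after the install fails would preserve the operator and node logs for offline analysis. A minimal sketch; the destination directory and archive name are arbitrary examples:

```shell
# Capture cluster state and operator logs before they rotate or get GC'd.
oc adm must-gather --dest-dir=./must-gather-ovn-upi

# Bundle the result so it can be attached to the bug.
tar czf must-gather-ovn-upi.tar.gz must-gather-ovn-upi
```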