Bug 1890130

Summary: multitenant mode consistently fails CI
Product: OpenShift Container Platform Reporter: Casey Callendrello <cdc>
Component: Networking Assignee: Casey Callendrello <cdc>
Networking sub component: openshift-sdn QA Contact: zhaozhanqi <zzhao>
Status: CLOSED ERRATA Docs Contact:
Severity: urgent    
Priority: urgent CC: weliang, wking
Version: 4.7 Keywords: Upgrades
Target Milestone: ---   
Target Release: 4.7.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-02-24 15:27:22 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1890297, 1893809    

Description Casey Callendrello 2020-10-21 13:47:54 UTC
Looking at this failed CI run [1] (which is on 4.5, but this hasn't changed), etcd never comes up. It is not yet entirely clear why, but one definite problem is that openshift-etcd-operator cannot reach etcd because it is missing a NetworkNamespace.



1: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-network-operator/842/pull-ci-openshift-cluster-network-operator-release-4.5-e2e-aws-sdn-multi/1318584974605029376
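
One way to confirm the missing-NetworkNamespace symptom on a live cluster is to compare the namespace list against the NetNamespace list. The following is only a minimal sketch, not the check used in CI. It assumes the two inputs are the JSON output of `oc get namespaces -o json` and `oc get netnamespaces -o json`, and that each openshift-sdn NetNamespace object carries the same `metadata.name` as the namespace it belongs to. The function name is hypothetical.

```python
import json
from typing import List


def namespaces_missing_netnamespace(namespaces_json: str,
                                    netnamespaces_json: str) -> List[str]:
    """Return namespaces that have no NetNamespace object of the same name.

    Assumption: both arguments are JSON list documents with an "items"
    array, as printed by `oc get <resource> -o json`.
    """
    namespaces = {
        item["metadata"]["name"]
        for item in json.loads(namespaces_json)["items"]
    }
    netnamespaces = {
        item["metadata"]["name"]
        for item in json.loads(netnamespaces_json)["items"]
    }
    # Anything present as a namespace but absent as a NetNamespace is suspect.
    return sorted(namespaces - netnamespaces)
```

If the problem described above is present, openshift-etcd-operator would be expected to show up in the returned list.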

Comment 8 Weibin Liang 2020-11-03 19:54:19 UTC
There have been no accepted v4.7 nightly images available for several days, so I verified this bug in 4.7.0-0.ci-2020-11-03-102229.
[weliang@weliang tools]$ oc get network.operator.openshift.io cluster -o jsonpath='{.spec.defaultNetwork.openshiftSDNConfig.mode}'
Multitenant

[weliang@weliang tools]$ oc get clusterversion
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.ci-2020-11-03-102229   True        False         4h3m    Cluster version is 4.7.0-0.ci-2020-11-03-102229
[weliang@weliang tools]$ oc get project | grep etcd
openshift-etcd                                                    Active
openshift-etcd-operator                                           Active
[weliang@weliang tools]$ oc get all -n openshift-etcd
NAME                                                               READY   STATUS      RESTARTS   AGE
pod/etcd-ip-10-0-146-46.us-east-2.compute.internal                 3/3     Running     0          4h16m
pod/etcd-ip-10-0-170-179.us-east-2.compute.internal                3/3     Running     0          4h14m
pod/etcd-ip-10-0-205-197.us-east-2.compute.internal                3/3     Running     0          4h17m
pod/etcd-quorum-guard-9cbd67bdd-44cdg                              1/1     Running     0          4h2m
pod/etcd-quorum-guard-9cbd67bdd-87knb                              1/1     Running     0          3h58m
pod/etcd-quorum-guard-9cbd67bdd-scs88                              1/1     Running     0          4h
pod/revision-pruner-3-ip-10-0-146-46.us-east-2.compute.internal    0/1     Completed   0          4h
pod/revision-pruner-3-ip-10-0-170-179.us-east-2.compute.internal   0/1     Completed   0          4h1m
pod/revision-pruner-3-ip-10-0-205-197.us-east-2.compute.internal   0/1     Completed   0          3h58m

NAME                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/etcd          ClusterIP   172.30.132.189   <none>        2379/TCP,9979/TCP   4h34m
service/host-etcd-2   ClusterIP   None             <none>        2379/TCP            4h34m

NAME                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/etcd-quorum-guard   3/3     3            3           4h22m

NAME                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/etcd-quorum-guard-9cbd67bdd   3         3         3       4h22m
[weliang@weliang tools]$ oc get all -n openshift-etcd-operator
NAME                                 READY   STATUS    RESTARTS   AGE
pod/etcd-operator-5c5bbf95bf-qpjg7   1/1     Running   0          4h

NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
service/metrics   ClusterIP   172.30.79.39   <none>        443/TCP   4h34m

NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/etcd-operator   1/1     1            1           4h33m

NAME                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/etcd-operator-5c5bbf95bf   1         1         1       4h33m
[weliang@weliang tools]$ oc get ns openshift-etcd-operator
NAME                      STATUS   AGE
openshift-etcd-operator   Active   4h34m
[weliang@weliang tools]$ oc get ns openshift-etcd
NAME             STATUS   AGE
openshift-etcd   Active   4h35m
[weliang@weliang tools]$
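
The pod checks in the transcript above were read by eye. A rough sketch of the same check in script form follows. It assumes the input is the JSON output of `oc get pods -n <namespace> -o json`, treats phase `Succeeded` (shown as `Completed` in the table view) as healthy, and requires every container of a `Running` pod to be ready. The function name is hypothetical and this is not part of the actual verification procedure.

```python
import json
from typing import List


def unhealthy_pods(pods_json: str) -> List[str]:
    """Return names of pods that are neither completed nor fully ready."""
    bad = []
    for pod in json.loads(pods_json)["items"]:
        name = pod["metadata"]["name"]
        status = pod.get("status", {})
        phase = status.get("phase")
        if phase == "Succeeded":
            # Finished pods such as revision-pruner count as healthy.
            continue
        containers = status.get("containerStatuses", [])
        all_ready = bool(containers) and all(
            c.get("ready", False) for c in containers
        )
        if phase != "Running" or not all_ready:
            bad.append(name)
    return sorted(bad)
```

An empty list for both openshift-etcd and openshift-etcd-operator would match what the transcript shows.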

Comment 11 errata-xmlrpc 2021-02-24 15:27:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633