Bug 1890130 - multitenant mode consistently fails CI
Summary: multitenant mode consistently fails CI
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.7.0
Assignee: Casey Callendrello
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks: 1890297 1893809
Reported: 2020-10-21 13:47 UTC by Casey Callendrello
Modified: 2021-02-24 15:27 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-24 15:27:22 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-network-operator pull 844 | 0 | None | closed | Bug 1890130: openshift-sdn: multitenant: join openshift-etcd-operator to etcd | 2021-02-15 21:53:11 UTC
Github | openshift sdn pull 209 | 0 | None | closed | Bug 1890130: fix pod creation deadlock | 2021-02-15 21:53:11 UTC
Red Hat Product Errata | RHSA-2020:5633 | 0 | None | None | None | 2021-02-24 15:27:46 UTC

Description Casey Callendrello 2020-10-21 13:47:54 UTC
Looking at this failed CI run [1] (which is on 4.5, but this hasn't changed), etcd never comes up. I'm not yet entirely sure why, but one definite problem is that openshift-etcd-operator cannot reach etcd because it is missing a NetworkNamespace.

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-network-operator/842/pull-ci-openshift-cluster-network-operator-release-4.5-e2e-aws-sdn-multi/1318584974605029376
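
For anyone reproducing this, a rough way to check the isolation described above is to compare the NETIDs that openshift-sdn assigns to the two namespaces. The sketch below is not taken from either fix; it assumes the usual `oc get netnamespaces` column layout (NAME, NETID, EGRESS IPS) and that in multitenant mode two namespaces can talk when they share a NETID or when either one has NETID 0 (global). The function name `netid_relation` is made up for illustration.

```shell
# netid_relation NS_A NS_B
# Reads `oc get netnamespaces` output on stdin and prints one of:
#   shared   - both namespaces have the same NETID
#   global   - at least one namespace has NETID 0 (assumed to be the global VNID)
#   isolated - NETIDs differ, so multitenant isolation applies
#   missing  - at least one namespace has no NetNamespace entry
netid_relation() {
  awk -v a="$1" -v b="$2" '
    $1 == a { ida = $2; seen_a = 1 }
    $1 == b { idb = $2; seen_b = 1 }
    END {
      if (!seen_a || !seen_b) { print "missing"; exit 2 }
      if (ida == 0 || idb == 0) { print "global"; exit 0 }
      if (ida == idb)           { print "shared"; exit 0 }
      print "isolated"; exit 1
    }'
}

# Usage against a live cluster (needs permission to list netnamespaces):
#   oc get netnamespaces | netid_relation openshift-etcd openshift-etcd-operator
```

A result of "missing" or "isolated" for openshift-etcd and openshift-etcd-operator would match the failure mode in the description.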

Comment 8 Weibin Liang 2020-11-03 19:54:19 UTC
No accepted v4.7 nightly images have been available for several days, so I verified this bug in 4.7.0-0.ci-2020-11-03-102229.
[weliang@weliang tools]$ oc get network.operator.openshift.io cluster -o jsonpath='{.spec.defaultNetwork.openshiftSDNConfig.mode}'
Multitenant

[weliang@weliang tools]$ oc get clusterversion
NAME      VERSION                        AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.ci-2020-11-03-102229   True        False         4h3m    Cluster version is 4.7.0-0.ci-2020-11-03-102229
[weliang@weliang tools]$ oc get project | grep etcd
openshift-etcd                                                    Active
openshift-etcd-operator                                           Active
[weliang@weliang tools]$ oc get all -n openshift-etcd
NAME                                                               READY   STATUS      RESTARTS   AGE
pod/etcd-ip-10-0-146-46.us-east-2.compute.internal                 3/3     Running     0          4h16m
pod/etcd-ip-10-0-170-179.us-east-2.compute.internal                3/3     Running     0          4h14m
pod/etcd-ip-10-0-205-197.us-east-2.compute.internal                3/3     Running     0          4h17m
pod/etcd-quorum-guard-9cbd67bdd-44cdg                              1/1     Running     0          4h2m
pod/etcd-quorum-guard-9cbd67bdd-87knb                              1/1     Running     0          3h58m
pod/etcd-quorum-guard-9cbd67bdd-scs88                              1/1     Running     0          4h
pod/revision-pruner-3-ip-10-0-146-46.us-east-2.compute.internal    0/1     Completed   0          4h
pod/revision-pruner-3-ip-10-0-170-179.us-east-2.compute.internal   0/1     Completed   0          4h1m
pod/revision-pruner-3-ip-10-0-205-197.us-east-2.compute.internal   0/1     Completed   0          3h58m

NAME                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/etcd          ClusterIP   172.30.132.189   <none>        2379/TCP,9979/TCP   4h34m
service/host-etcd-2   ClusterIP   None             <none>        2379/TCP            4h34m

NAME                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/etcd-quorum-guard   3/3     3            3           4h22m

NAME                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/etcd-quorum-guard-9cbd67bdd   3         3         3       4h22m
[weliang@weliang tools]$ oc get all -n openshift-etcd-operator
NAME                                 READY   STATUS    RESTARTS   AGE
pod/etcd-operator-5c5bbf95bf-qpjg7   1/1     Running   0          4h

NAME              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
service/metrics   ClusterIP   172.30.79.39   <none>        443/TCP   4h34m

NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/etcd-operator   1/1     1            1           4h33m

NAME                                       DESIRED   CURRENT   READY   AGE
replicaset.apps/etcd-operator-5c5bbf95bf   1         1         1       4h33m
[weliang@weliang tools]$ oc get ns openshift-etcd-operator
NAME                      STATUS   AGE
openshift-etcd-operator   Active   4h34m
[weliang@weliang tools]$ oc get ns openshift-etcd
NAME             STATUS   AGE
openshift-etcd   Active   4h35m
[weliang@weliang tools]$
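
The pod listings above were checked by eye. The same check could be scripted roughly as below. This is only a sketch: the helper name `pods_healthy` is hypothetical, and it assumes the five-column `oc get pods --no-headers` layout (NAME, READY, STATUS, RESTARTS, AGE), treating Completed pods as fine and requiring every other pod to be Running with all containers ready.

```shell
# pods_healthy
# Reads `oc get pods --no-headers` output on stdin.
# Prints "unhealthy: <pod>" for each pod that is neither Completed nor
# Running with READY n/n, and returns non-zero if any such pod was found.
pods_healthy() {
  awk '
    NF == 0           { next }   # skip blank lines
    $3 == "Completed" { next }   # finished one-shot pods are fine
    {
      split($2, r, "/")          # READY column, e.g. "3/3"
      if ($3 != "Running" || r[1] != r[2]) {
        print "unhealthy: " $1
        bad = 1
      }
    }
    END { exit (bad ? 1 : 0) }'
}

# Usage against a live cluster:
#   oc get pods -n openshift-etcd --no-headers | pods_healthy
#   oc get pods -n openshift-etcd-operator --no-headers | pods_healthy
```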

Comment 11 errata-xmlrpc 2021-02-24 15:27:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

