Description of problem:
-----------------------
The cluster is built with the Multitenant plugin via a customized manifest:

$ cat manifests/cluster-network-03-config.yml
apiVersion: operator.openshift.io/v1
kind: Network
metadata:
  name: cluster
spec:
  defaultNetwork:
    type: OpenShiftSDN
    openshiftSDNConfig:
      mode: Multitenant

After a successful build, the authentication operator goes into degraded mode:

[alchan-redhat.com@clientvm 0 ~]$ oc get co authentication
NAME             VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication   4.4.0     True        False         True       55m

[alchan-redhat.com@clientvm 0 ~]$ oc get co authentication -o json | jq '.status.conditions[0]'
{
  "lastTransitionTime": "2020-06-28T20:43:19Z",
  "message": "IngressStateEndpointsDegraded: Unhealthy addresses found: 10.129.0.30:Get https://10.129.0.30:6443/healthz: dial tcp 10.129.0.30:6443: connect: connection timed out,10.130.0.29:Get https://10.130.0.29:6443/healthz: dial tcp 10.130.0.29:6443: connect: connection timed out",
  "reason": "IngressStateEndpoints_UnhealthyAddresses",
  "status": "True",
  "type": "Degraded"
}

The IPs 10.129.0.30 and 10.130.0.29 belong to the oauth-openshift pods in the openshift-authentication namespace.

[alchan-redhat.com@clientvm 0 ~]$ oc get netnamespaces | grep authentication
openshift-authentication            9296695
openshift-authentication-operator   7693696

Because the two namespaces have different netids, the authentication-operator pod cannot connect to the oauth-openshift pods.

The workaround appears to be joining the two projects:

$ oc adm pod-network join-projects --to=openshift-authentication openshift-authentication-operator

After that, the authentication operator is no longer degraded.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
- 4.4.0 has this issue.
- The latest 4.4.9 appears to be fine and does NOT have this issue. In 4.4.9, both projects are in netid 1:

[alchan-redhat.com@clientvm 0 ~]$ oc get co authentication
NAME             VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication   4.4.9     True        False         False      9m44s

[alchan-redhat.com@clientvm 0 ~]$ oc get netnamespaces | grep authentication
openshift-authentication            1
openshift-authentication-operator   1

- I have not tested any other version between 4.4.0 and 4.4.9.

Questions:
----------
- In which 4.4.z version is this fixed?
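The netid comparison in the description above can be automated. The following is a minimal sketch, not part of any OpenShift tooling: the helper names `parse_netids` and `isolated_pairs` are hypothetical, and the sample text is copied from the `oc get netnamespaces | grep authentication` output in this report. It assumes the simplified rule used in the report (two namespaces can reach each other only when their netids match) and does not model projects that have been made globally reachable.

```python
def parse_netids(text):
    """Map namespace name -> netid from `oc get netnamespaces` style rows."""
    netids = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and any header row whose second column is not numeric.
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        netids[fields[0]] = int(fields[1])
    return netids


def isolated_pairs(netids, pairs):
    """Return the (a, b) pairs whose netids differ (simplified isolation rule)."""
    return [(a, b) for a, b in pairs if netids[a] != netids[b]]


# Sample rows copied from this report (4.4.0 and 4.4.9 clusters).
OUTPUT_4_4_0 = """\
openshift-authentication            9296695
openshift-authentication-operator   7693696
"""

OUTPUT_4_4_9 = """\
openshift-authentication            1
openshift-authentication-operator   1
"""

# Namespace pair that must be able to talk for the operator health check.
PAIRS = [("openshift-authentication-operator", "openshift-authentication")]

if __name__ == "__main__":
    for label, output in (("4.4.0", OUTPUT_4_4_0), ("4.4.9", OUTPUT_4_4_9)):
        broken = isolated_pairs(parse_netids(output), PAIRS)
        print(label, "isolated pairs:", broken)
```

On a live cluster, the text could come from the output of `oc get netnamespaces` instead of the hard-coded samples.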
What action(s) are expected of the api/auth team that suggested assignment to me? It's not at all clear to me from the comments that appear on this bz.
It was fixed in 4.4.8 with https://github.com/openshift/cluster-network-operator/pull/657 related to https://bugzilla.redhat.com/show_bug.cgi?id=1841507. The question about what happens for upgrades if someone worked around the problem (comment 4) is best addressed by the SDN team. Reassigning.
This bz is a duplicate of [1]. The fix is already merged for 4.5 [2] and backported to 4.4 [3]. For future reference, the list of namespaces to join when running in multitenant mode is maintained by the SDN team (openshift-sdn component).

1: https://bugzilla.redhat.com/show_bug.cgi?id=1837575
2: https://github.com/openshift/cluster-network-operator/pull/650
3: https://github.com/openshift/cluster-network-operator/pull/657

*** This bug has been marked as a duplicate of bug 1837575 ***
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days