Bug 1980135
Summary: On an IPv6 single stack cluster traffic between master nodes is sent via default gw instead of local subnet

Product: OpenShift Container Platform
Component: Networking
Networking sub component: ovn-kubernetes
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Severity: urgent
Priority: unspecified
Status: CLOSED ERRATA
Target Milestone: ---
Target Release: 4.9.0
Reporter: Marius Cornea <mcornea>
Assignee: Jaime Caamaño Ruiz <jcaamano>
QA Contact: Marius Cornea <mcornea>
Docs Contact: Padraig O'Grady <pogrady>
CC: achernet, agurenko, astoycos, danw, dphillip, ealcaniz, jcaamano, jdee, mifiedle, pogrady, rolove, yboaron, yprokule, yroblamo
Doc Type: Bug Fix
Doc Text:
Cause: When using IPv6 DHCP, node interface addresses might be leased with a /128 prefix.
Consequence: OVN-Kubernetes uses that same prefix to infer the node's network, and therefore routes traffic to any other address, including traffic to other cluster nodes, through the gateway.
Fix: OVN-Kubernetes now inspects the node's routing table, looks for the wider routing entry covering the node's interface address, and uses that entry's prefix to infer the node's network.
Result: Traffic to other cluster nodes is no longer routed through the gateway.
Clones: 1994624 (view as bug list)
Bug Blocks: 1994624
Last Closed: 2021-10-18 17:38:24 UTC
Type: Bug
Description
Marius Cornea
2021-07-07 21:14:04 UTC
To validate, I tried blocking the address of one of the master nodes in the FORWARD chain on the router:

ip6tables -I FORWARD -s 2620:52:0:11c::20 -j DROP

As a result, the authentication operator became degraded:

I0707 21:47:51.882426 1 request.go:668] Waited for 1.194851493s due to client-side throttling, not priority and fairness, request: GET:https://[fd02::1]:443/api/v1/namespaces/openshift-oauth-apiserver/services/api
I0707 21:47:53.150944 1 status_controller.go:211] clusteroperator/authentication diff {"status":{"conditions":[{"lastTransitionTime":"2021-07-07T21:45:26Z","message":"All is well","reason":"AsExpected","status":"False","type":"Degraded"},{"lastTransitionTime":"2021-07-07T21:10:41Z","message":"AuthenticatorCertKeyProgressing: All is well","reason":"AsExpected","status":"False","type":"Progressing"},{"lastTransitionTime":"2021-07-07T21:47:53Z","message":"WellKnownAvailable: The well-known endpoint is not yet available: failed to GET kube-apiserver oauth endpoint https://[2620:52:0:11c::20]:6443/.well-known/oauth-authorization-server: dial tcp [2620:52:0:11c::20]:6443: i/o timeout","reason":"WellKnown_NotReady","status":"False","type":"Available"},{"lastTransitionTime":"2021-07-05T09:21:37Z","message":"All is well","reason":"AsExpected","status":"True","type":"Upgradeable"}]}}
I0707 21:47:53.155857 1 event.go:282] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-authentication-operator", Name:"authentication-operator", UID:"f7c40d64-aa1e-4560-920f-16c361819931", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/authentication changed: Available changed from True to False ("WellKnownAvailable: The well-known endpoint is not yet available: failed to GET kube-apiserver oauth endpoint https://[2620:52:0:11c::20]:6443/.well-known/oauth-authorization-server: dial tcp [2620:52:0:11c::20]:6443: i/o timeout")
E0707 21:47:53.173068 1 base_controller.go:266] WellKnownReadyController reconciliation failed: failed to GET kube-apiserver oauth endpoint https://[2620:52:0:11c::20]:6443/.well-known/oauth-authorization-server: dial tcp [2620:52:0:11c::20]:6443: i/o timeout
I0707 21:47:53.173877 1 status_controller.go:211] clusteroperator/authentication diff {"status":{"conditions":[{"lastTransitionTime":"2021-07-07T21:45:26Z","message":"WellKnownReadyControllerDegraded: failed to GET kube-apiserver oauth endpoint https://[2620:52:0:11c::20]:6443/.well-known/oauth-authorization-server: dial tcp [2620:52:0:11c::20]:6443: i/o timeout","reason":"AsExpected","status":"False","type":"Degraded"},{"lastTransitionTime":"2021-07-07T21:10:41Z","message":"AuthenticatorCertKeyProgressing: All is well","reason":"AsExpected","status":"False","type":"Progressing"},{"lastTransitionTime":"2021-07-07T21:47:53Z","message":"WellKnownAvailable: The well-known endpoint is not yet available: failed to GET kube-apiserver oauth endpoint https://[2620:52:0:11c::20]:6443/.well-known/oauth-authorization-server: dial tcp [2620:52:0:11c::20]:6443: i/o timeout","reason":"WellKnown_NotReady","status":"False","type":"Available"},{"lastTransitionTime":"2021-07-05T09:21:37Z","message":"All is well","reason":"AsExpected","status":"True","type":"Upgradeable"}]}}
I0707 21:47:53.178825 1 event.go:282] Event(v1.ObjectReference{Kind:"Deployment", Namespace:"openshift-authentication-operator", Name:"authentication-operator", UID:"f7c40d64-aa1e-4560-920f-16c361819931", APIVersion:"apps/v1", ResourceVersion:"", FieldPath:""}): type: 'Normal' reason: 'OperatorStatusChanged' Status for clusteroperator/authentication changed: Degraded message changed from "All is well" to "WellKnownReadyControllerDegraded: failed to GET kube-apiserver oauth endpoint https://[2620:52:0:11c::20]:6443/.well-known/oauth-authorization-server: dial tcp [2620:52:0:11c::20]:6443: i/o timeout"
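A side note on the underlying condition, as a hedged illustration (the interface name br-ex and the example outputs are assumptions patterned on this environment, not captured from it): on an affected node the DHCPv6 lease installs a bare /128 address on the interface, while the kernel routing table still holds the wider on-link route learned from router advertisements. That wider route is exactly the information OVN-Kubernetes is not using here.

ip -6 addr show dev br-ex scope global
# inet6 2620:52:0:11c::20/128 scope global dynamic noprefixroute
ip -6 route show dev br-ex proto ra
# 2620:52:0:11c::/64 metric 100 pref medium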
This is a consequence of using stateful DHCPv6 IA_NA address allocation for the host interfaces, which assigns them a /128 prefix. OVN-Kubernetes uses these same addresses to configure the internal gateway routers, and since they do not provide any subnet information, pod-to-node traffic is routed through the host's gateway instead. Unfortunately, OVN does not support RA, so it is not aware of other routing information.

An alternative configuration that should work is to use static addressing or SLAAC instead of stateful DHCPv6. That way, the host interface address would have, or would acquire, the on-link prefix advertised through RA, and OVN-Kubernetes would be aware of it.

If stateful DHCPv6 IA_NA allocation is required and traversing the gateway is not acceptable, then we might have to add support to pass on the CNO machineNetwork configuration to OVN-Kubernetes so that we can add it as a static route on the internal gateway routers.

As a reference, please see https://bugzilla.redhat.com/show_bug.cgi?id=1973704 ... it's the same problem, but with IPv6 DHCP served by an IT router (Juniper). The Juniper sends DHCP addresses with /128 and /64 masks; OVN takes the /128 mask and creates routes with that /128 mask instead of the /64 one. As a consequence, the nodes cannot communicate with each other, and the deployment fails.

(In reply to Jaime Caamaño Ruiz from comment #6)
> If stateful DHCPv6 IA_NA allocation is required and traversing the gateway
> is not acceptable, then we might have to add support to pass on the CNO
> machineNetwork configuration to OVN kubernetes so that we can add it as
> static route on the internal gateway routers.

This won't work because the machineNetwork isn't guaranteed to be a single subnet. But anyway, https://github.com/ovn-org/ovn-kubernetes/pull/2338 looks like the right fix to me.

*** Bug 1973704 has been marked as a duplicate of this bug. ***
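To make the fix's approach concrete, here is a hedged sketch expressed as shell commands (the actual change lives in the ovn-kubernetes Go code behind the PR linked above; NODE_IP and the example route reuse addresses from this report and are otherwise assumptions): instead of trusting the /128 leased onto the interface, consult the kernel routing table for the wider entry covering the node address and reuse that entry's prefix.

# Node address as leased by stateful DHCPv6 (a bare /128):
NODE_IP=2620:52:0:11c::20

# List the non-default kernel routes covering that address; the on-link
# /64 learned from RA (or configured statically) shows up here even
# though the address itself was leased as /128.
ip -6 route show match "$NODE_IP" | grep -v '^default'
# e.g. 2620:52:0:11c::/64 dev br-ex proto ra metric 100

# The fix adopts that prefix, so the gateway router port is configured
# with 2620:52:0:11c::20/64 instead of 2620:52:0:11c::20/128, and
# traffic to other cluster nodes stays on the local subnet.

This mirrors what the verification below confirms from the OVN side: the rtoe-GR_* logical router ports carry /64 networks rather than /128.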
[kni@sealusa2 ~]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-08-26-013855   True        False         105m    Cluster version is 4.9.0-0.nightly-2021-08-26-013855

[kni@sealusa2 ~]$ oc -n openshift-ovn-kubernetes exec -it ovnkube-master-85nhm -c ovnkube-master -- ovn-nbctl find Logical_Router_Port | grep -A1 rtoe-GR
name                : rtoe-GR_worker-0-0.ocp-edge-cluster-0.qe.lab.redhat.com
networks            : ["fd2e:6f44:5dd8::47/64"]
--
name                : rtoe-GR_master-0-0.ocp-edge-cluster-0.qe.lab.redhat.com
networks            : ["fd2e:6f44:5dd8::58/64"]
--
name                : rtoe-GR_worker-0-1.ocp-edge-cluster-0.qe.lab.redhat.com
networks            : ["fd2e:6f44:5dd8::34/64"]
--
name                : rtoe-GR_master-0-1.ocp-edge-cluster-0.qe.lab.redhat.com
networks            : ["fd2e:6f44:5dd8::5b/64"]
--
name                : rtoe-GR_master-0-2.ocp-edge-cluster-0.qe.lab.redhat.com
networks            : ["fd2e:6f44:5dd8::8f/64"]

ip link show baremetal-0
70: baremetal-0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 52:54:00:1e:8e:e3 brd ff:ff:ff:ff:ff:ff

[kni@sealusa2 ~]$ sudo tcpdump -i baremetal-0 -ennn ether host 52:54:00:1e:8e:e3 and tcp port 6443
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on baremetal-0, link-type EN10MB (Ethernet), capture size 262144 bytes

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759
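As a closing note for anyone re-running this verification on another cluster, here is a hedged variant of the QA check above (the ovnkube-master pod name comes from this report and will differ per deployment): count the prefix lengths configured on the gateway router external ports; on a fixed cluster they should all match the node subnet (/64 here), with no /128 entries.

oc -n openshift-ovn-kubernetes exec ovnkube-master-85nhm -c ovnkube-master -- \
    ovn-nbctl find Logical_Router_Port \
    | grep -A1 rtoe-GR | grep -o '/[0-9]*' | sort | uniq -c
# expected on a fixed cluster: only /64 entries, no /128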