Bug 1942488 - [4.7] dual stack nodes with OVN single ipv6 fails on bootstrap phase
Summary: [4.7] dual stack nodes with OVN single ipv6 fails on bootstrap phase
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.6
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.7.z
Assignee: Dan Winship
QA Contact: Daniel Del Ciancio
URL:
Whiteboard:
Depends On: 1939740
Blocks: 1942506
 
Reported: 2021-03-24 12:59 UTC by Dan Winship
Modified: 2021-11-23 21:56 UTC
CC: 10 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: When bringing up a single-stack IPv6 cluster on nodes with IPv4 addresses, kubelet might use the IPv4 IP as the node IP rather than the IPv6 IP. Consequence: host-network pods have IPv4 IPs rather than IPv6, making them unreachable from IPv6-only pods. Fix: the node-IP-picking code was fixed to handle this case. Result: nodes will have IPv6 IPs, not IPv4.
Clone Of: 1939740
Clones: 1942506
Environment:
Last Closed: 2021-05-24 17:14:37 UTC
Target Upstream Version:
Embargoed:


Attachments
co-network -o yaml (9.80 KB, text/plain) - 2021-05-18 20:37 UTC, raj.sarvaiya@bell.ca
Logs from the ovnkube-master in crashloopbackoff (4.09 MB, text/plain) - 2021-05-19 13:21 UTC, raj.sarvaiya@bell.ca
install-config-4.6-v6-ovn (1.20 KB, text/plain) - 2021-07-26 15:27 UTC, raj.sarvaiya@bell.ca


Links
Github openshift baremetal-runtimecfg pull 132 (open): Bug 1942488: sort AddressesDefault by route priority, ifindex, and IPv4/IPv6 preference - last updated 2021-04-13 14:03:29 UTC
Github openshift machine-config-operator pull 2525 (open): Bug 1942488: Use new --prefer-ipv6 flag to "runtimecfg node-ip" as appropriate - last updated 2021-04-13 14:07:32 UTC
Red Hat Product Errata RHSA-2021:1561 - last updated 2021-05-24 17:15:10 UTC

Comment 4 zhaozhanqi 2021-05-06 00:59:44 UTC
@ddelcian, any chance you can help verify this bug? QE does not have this kind of environment immediately available.

Comment 6 Daniel Del Ciancio 2021-05-10 13:14:09 UTC
Hi,

Can you provide details as to how the customer could test this?

Thanks!

Comment 7 zhaozhanqi 2021-05-10 13:31:20 UTC
(In reply to Daniel Del Ciancio from comment #6)
> Hi,
> 
> Can you provide details as to how the customer could test this?
> 
> Thanks!

Try this build: quay.io/openshift-release-dev/ocp-release:4.7.10-x86_64

Comment 8 Daniel Del Ciancio 2021-05-10 14:07:12 UTC
Hi again,
The customer is planning to skip 4.7 and move straight to 4.8 for testing.

Can you confirm which 4.8 release includes this fix? Does the candidate-4.8 channel include it at this time?

Comment 9 Dan Winship 2021-05-10 16:28:07 UTC
I don't know what's in candidate-4.8, but it's fixed in any 4.8.0-fc.* release

Comment 11 zhaozhanqi 2021-05-14 05:53:25 UTC
@Daniel

Could the customer try the quay.io/openshift-release-dev/ocp-release:4.8.0-fc.3-x86_64 build? Thanks.

Comment 14 raj.sarvaiya@bell.ca 2021-05-18 15:11:26 UTC
I was able to get partially through. @daniel, I am seeing the same issue I saw with the forced/workaround upgrade on an existing cluster (network CIDRs).

```bash
oc get co
NAME                                       VERSION      AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.8.0-fc.3   True        False         False      6m37s
baremetal                                  4.8.0-fc.3   True        False         False      26m
cloud-credential                           4.8.0-fc.3   True        False         False      35m
cluster-autoscaler                         4.8.0-fc.3   True        False         False      25m
config-operator                            4.8.0-fc.3   True        False         False      27m
console                                    4.8.0-fc.3   True        False         False      13m
csi-snapshot-controller                    4.8.0-fc.3   True        False         False      26m
dns                                        4.8.0-fc.3   True        False         False      26m
etcd                                       4.8.0-fc.3   True        False         False      25m
image-registry                             4.8.0-fc.3   True        False         False      21m
ingress                                    4.8.0-fc.3   True        False         False      11m
insights                                   4.8.0-fc.3   True        False         False      25m
kube-apiserver                             4.8.0-fc.3   True        False         False      24m
kube-controller-manager                    4.8.0-fc.3   True        False         False      25m
kube-scheduler                             4.8.0-fc.3   True        False         False      25m
kube-storage-version-migrator              4.8.0-fc.3   True        False         False      19m
machine-api                                4.8.0-fc.3   True        False         False      26m
machine-approver                           4.8.0-fc.3   True        False         False      25m
machine-config                             4.8.0-fc.3   True        False         False      25m
marketplace                                4.8.0-fc.3   True        False         False      25m
monitoring                                 4.8.0-fc.3   True        False         False      14m
network                                                 False       True          True       31m
node-tuning                                4.8.0-fc.3   True        False         False      25m
openshift-apiserver                        4.8.0-fc.3   True        False         False      21m
openshift-controller-manager               4.8.0-fc.3   True        False         False      26m
openshift-samples                          4.8.0-fc.3   True        False         False      21m
operator-lifecycle-manager                 4.8.0-fc.3   True        False         False      25m
operator-lifecycle-manager-catalog         4.8.0-fc.3   True        False         False      25m
operator-lifecycle-manager-packageserver   4.8.0-fc.3   True        False         False      22m
service-ca                                 4.8.0-fc.3   True        False         False      27m
storage                                    4.8.0-fc.3   True        False         False      26m
```

Can someone tell me what the CIDRs should be for my network settings below so I don't run into this issue where the network won't come up?
```yaml
networking:
  stack: IPV6
  vlan:
    cluster: 603
    storage: 540
  networkType: OVNKubernetes
  machineCIDR: 2605:b100:0000:4::/64
  clusterNetwork:
  - cidr: 2605:b100:283::/56
    hostPrefix: 64
  serviceNetwork:
  - 2605:b100:283:104::/112
```

Comment 15 Dan Winship 2021-05-18 20:08:06 UTC
Can you get an "oc adm must-gather" output from the cluster?

Or if that doesn't work, at least get "oc get clusteroperator network -o yaml"

FWIW, as the install got past the bootstrap phase, that means that the specific bug that was previously reported is now fixed, and we are now seeing a new bug. I'm going to mark this VERIFIED so that the 4.6 backport of the original bugfix can proceed, but we can continue to debug the new problem here for now.

Comment 16 raj.sarvaiya@bell.ca 2021-05-18 20:37:55 UTC
Created attachment 1784593 [details]
co-network -o yaml

Omitted managed fields.

Comment 17 raj.sarvaiya@bell.ca 2021-05-18 20:40:27 UTC
```log
  Normal   AddedInterface          72s                 multus             Add eth0 [2605:b100:283:5::e/64]
  Warning  FailedCreatePodSandBox  55s (x52 over 14m)  kubelet            (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_envoyv4v6-6584c9b6b7-xxxtt_bell-services_f78acab5-54eb-4ec1-bf22-1ae01e4982c8_0(3a44006cee034a1a2d53a99173f06ee82ba6c71140e1a8c1bbb733833dc60963): [bell-services/envoyv4v6-6584c9b6b7-xxxtt:envoyv4]: error adding container to network "envoyv4": failed to create macvlan: device or resource busy
  Normal   AddedInterface          55s
```


Hey Daniel and RH, this ^^^ is also the same error seen on the pod trying to attach a network attachment for the IPv4-IPv6 conversion with the in-place upgrade workaround.

Comment 18 Dan Winship 2021-05-19 13:07:38 UTC
>       DaemonSet "openshift-ovn-kubernetes/ovnkube-master" rollout is not making progress - pod ovnkube-master-hhxcm is in CrashLoopBackOff State

can you get

  oc logs -n openshift-ovn-kubernetes ovnkube-master --all-containers

?
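
For reference, a hedged sketch of gathering these logs from every ovnkube-master pod in one pass (the app=ovnkube-master label selector is an assumption, not something stated in this bug):

```bash
# Hedged sketch: dump all container logs from each ovnkube-master pod.
# The app=ovnkube-master label is assumed; adjust the selector if the pods are labeled differently.
for pod in $(oc get pods -n openshift-ovn-kubernetes -l app=ovnkube-master -o name); do
  echo "===== ${pod} ====="
  oc logs -n openshift-ovn-kubernetes "${pod}" --all-containers
done
```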

Comment 19 raj.sarvaiya@bell.ca 2021-05-19 13:18:07 UTC
@danw

Do I need to run it on a particular ovnkube-master pod, or can it be any ovnkube-master pod?

Comment 20 raj.sarvaiya@bell.ca 2021-05-19 13:21:25 UTC
Created attachment 1784813 [details]
Logs from the ovnkube-master in crashloopbackoff

Attached logs from the ovnkube-master in crashloopbackoff

Comment 21 raj.sarvaiya@bell.ca 2021-05-21 13:39:05 UTC
Also, I tried deploying a new cluster with the following settings:

```yaml
  networkType: OVNKubernetes
  machineCIDR: 2605:b100:0000:9::/64
  clusterNetwork:
  - cidr: 2605:b100:283::/64
    hostPrefix: 64
  serviceNetwork:
  - 2605:b100:283:104::/112
```


This resulted in:

```log
level=info msg=Waiting up to 30m0s for bootstrapping to complete...
level=error msg=Cluster operator network Degraded is True with RolloutHung: DaemonSet "openshift-ovn-kubernetes/ovnkube-master" rollout is not making progress - pod ovnkube-master-ndzr9 is in CrashLoopBackOff State
level=error msg=DaemonSet "openshift-ovn-kubernetes/ovnkube-master" rollout is not making progress - pod ovnkube-master-b2mb4 is in CrashLoopBackOff State
level=error msg=DaemonSet "openshift-ovn-kubernetes/ovnkube-master" rollout is not making progress - pod ovnkube-master-qlswg is in CrashLoopBackOff State
level=error msg=DaemonSet "openshift-ovn-kubernetes/ovnkube-master" rollout is not making progress - last change 2021-05-21T12:58:00Z
level=error msg=DaemonSet "openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - last change 2021-05-21T12:58:00Z
level=info msg=Cluster operator network ManagementStateDegraded is False with : 
level=info msg=Cluster operator network Progressing is True with Deploying: DaemonSet "openshift-multus/network-metrics-daemon" is waiting for other operators to become ready
level=info msg=DaemonSet "openshift-multus/multus-admission-controller" is waiting for other operators to become ready
level=info msg=DaemonSet "openshift-ovn-kubernetes/ovnkube-master" is not available (awaiting 3 nodes)
level=info msg=DaemonSet "openshift-ovn-kubernetes/ovnkube-node" is not available (awaiting 3 nodes)
level=info msg=DaemonSet "openshift-network-diagnostics/network-check-target" is waiting for other operators to become ready
level=info msg=Deployment "openshift-network-diagnostics/network-check-source" is waiting for other operators to become ready
level=info msg=Cluster operator network Available is False with Startup: The network is starting up
level=info msg=Use the following commands to gather logs from the cluster
level=info msg=openshift-install gather bootstrap --help
level=error msg=Bootstrap failed to complete: timed out waiting for the condition
level=error msg=Failed to wait for bootstrapping to complete. This error usually happens when there is a problem with control plane hosts that prevents the control plane operators from creating the control plane.
level=fatal msg=Bootstrap failed to complete
```

Is anything wrong with my networking config shown above?

Comment 22 Dan Winship 2021-05-21 20:06:31 UTC
>  clusterNetwork:
>  - cidr: 2605:b100:283::/64
>    hostPrefix: 64

This says that each node should get a /64 (hostPrefix), but also that the entire cluster gets a /64 (cidr). So that won't work. The cidr needs to be big enough to contain at least as many /64s as you will have nodes.
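
To make the sizing concrete, a minimal sketch of the arithmetic (the /56 and /64 values are taken from the configs above; the snippet itself is illustrative, not a command from this bug):

```bash
# Number of /64 node subnets available in a clusterNetwork cidr of a given prefix length.
# A /56 cidr yields 2^(64-56) = 256 node subnets; a /64 cidr with hostPrefix 64 yields only 1.
for cidr_len in 56 64; do
  echo "clusterNetwork /${cidr_len} with hostPrefix 64 -> $(( 1 << (64 - cidr_len) )) node subnet(s)"
done
```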

Comment 23 raj.sarvaiya@bell.ca 2021-05-23 02:10:46 UTC
Ah right. So on the flip side, we're running into the limitation that a /56 clusterNetwork CIDR is too big for IPv6, correct? Or am I confusing that with another issue?

I do see in the upstream Kubernetes docs that /56 for pod-network-cidr is an acceptable setting, albeit for dual-stack k8s:
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/dual-stack-support/

Comment 24 raj.sarvaiya@bell.ca 2021-05-23 02:26:03 UTC
I am also re-creating the first deployment this time on another cluster, and will try my best to get must-gather logs.

The settings from the first deployment are

```yaml
  networkType: OVNKubernetes
  machineCIDR: 2605:b100:0000:4::/64
  clusterNetwork:
  - cidr: 2605:b100:283::/56
    hostPrefix: 64
  serviceNetwork:
  - 2605:b100:283:104::/112
```

Comment 25 raj.sarvaiya@bell.ca 2021-05-24 00:14:20 UTC
So here's the status of the cluster.

```bash
oc get co
NAME                                       VERSION      AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.8.0-fc.3   False       False         True       20h
baremetal                                  4.8.0-fc.3   True        False         False      20h
cloud-credential                           4.8.0-fc.3   True        False         False      20h
cluster-autoscaler                         4.8.0-fc.3   True        False         False      20h
config-operator                            4.8.0-fc.3   True        False         False      20h
console                                    4.8.0-fc.3   False       True          False      20h
csi-snapshot-controller                    4.8.0-fc.3   True        False         False      20h
dns                                        4.8.0-fc.3   True        False         False      20h
etcd                                       4.8.0-fc.3   True        False         False      20h
image-registry                             4.8.0-fc.3   True        False         False      20h
ingress                                    4.8.0-fc.3   True        False         True       19h
insights                                   4.8.0-fc.3   True        False         True       20h
kube-apiserver                             4.8.0-fc.3   True        False         False      20h
kube-controller-manager                    4.8.0-fc.3   True        False         False      20h
kube-scheduler                             4.8.0-fc.3   True        False         False      20h
kube-storage-version-migrator              4.8.0-fc.3   True        False         False      20h
machine-api                                4.8.0-fc.3   True        False         False      20h
machine-approver                           4.8.0-fc.3   True        False         False      20h
machine-config                             4.8.0-fc.3   True        False         False      20h
marketplace                                4.8.0-fc.3   True        False         False      20h
monitoring                                 4.8.0-fc.3   True        False         False      7h36m
network                                    4.8.0-fc.3   True        False         False      20h
node-tuning                                4.8.0-fc.3   True        False         False      20h
openshift-apiserver                        4.8.0-fc.3   True        False         False      20h
openshift-controller-manager               4.8.0-fc.3   True        False         False      20h
openshift-samples                          4.8.0-fc.3   True        False         False      20h
operator-lifecycle-manager                 4.8.0-fc.3   True        False         False      20h
operator-lifecycle-manager-catalog         4.8.0-fc.3   True        False         False      20h
operator-lifecycle-manager-packageserver   4.8.0-fc.3   True        False         False      20h
service-ca                                 4.8.0-fc.3   True        False         False      20h
storage                                    4.8.0-fc.3   True        False         False      20h
```

I am unable to collect must-gather because the pod fails to start. I will upload sosreports for one of the master nodes and for one of the nodes that is failing to assign an IPv4 macvlan network attachment for the IPv4<->IPv6 ingress envoy pod. This is likely preventing the console operator from coming online, because it blocks communication between the authentication pods and the rest of the cluster components.

Here's the error for the envoy pod
```bash
  Warning  FailedCreatePodSandBox  2m16s (x4836 over 20h)  kubelet  (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_envoyv4v6-6584c9b6b7-kr5pg_bell-services_0db3f8ee-d765-4045-b946-d8566a66dd38_0(18ea57de08c04cde8b2f35ccad5496ab6f2cd9c8f66a3b4feaeefaf177824c2c): [bell-services/envoyv4v6-6584c9b6b7-kr5pg:envoyv4]: error adding container to network "envoyv4": failed to create macvlan: device or resource busy
```
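
In case it helps with the macvlan error above, a hedged troubleshooting sketch (the bell-services namespace comes from the log; the node name is a placeholder, not a value from this bug):

```bash
# Hedged sketch: inspect the macvlan NetworkAttachmentDefinition and the host
# interface it uses as its master; "device or resource busy" often relates to
# the state of that master interface rather than to OVN itself.
oc get network-attachment-definitions -n bell-services -o yaml
oc debug node/<node-name> -- chroot /host ip -d link show
```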

Comment 26 Dan Winship 2021-05-24 13:30:15 UTC
(In reply to raj.sarvaiya from comment #23)
> Ah right. So on the flip side, we're running into the limitation that /56
> clusternetwork CIDR is too big for IPv6, correct? Or am i confusing that for
> another issue?

There is no "too big" restriction on clusternetwork. (Hm... well, some upstream documentation might say that it can't be bigger than /48, but that doesn't apply to ovn-kubernetes anyway. There should definitely not be a problem with a /56 anywhere).

The restrictions are:

  - serviceNetwork should be exactly /112
  - the clusterNetwork hostPrefix must be exactly 64
  - the clusterNetwork length therefore must be < 64

(In reply to raj.sarvaiya from comment #25)
> So here's the status of the cluster.

"oc get co -o yaml" would be more useful since that would have more detailed status

> one of the nodes which is
> failing to assign an IPv4 macvlan network attachment for the IPv4<->IPv6
> ingress envoy pod.

There is no "IPv4<->IPv6 ingress envoy pod" in a default OCP install, so this would be a problem with something you are running, not a problem with OCP itself...

-- Dan

Comment 28 errata-xmlrpc 2021-05-24 17:14:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.12 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:1561

Comment 29 raj.sarvaiya@bell.ca 2021-05-25 00:13:21 UTC
(In reply to Dan Winship from comment #26)
> (In reply to raj.sarvaiya from comment #23)
> > Ah right. So on the flip side, we're running into the limitation that /56
> > clusternetwork CIDR is too big for IPv6, correct? Or am i confusing that for
> > another issue?
> 
> There is no "too big" restriction on clusternetwork. (Hm... well, some
> upstream documentation might say that it can't be bigger than /48, but that
> doesn't apply to ovn-kubernetes anyway. There should definitely not be a
> problem with a /56 anywhere).
> 
> The restrictions are:
> 
>   - serviceNetwork should be exactly /112
>   - the clusterNetwork hostPrefix must be exactly 64
>   - the clusterNetwork length therefore must be < 64
 
Thanks for clearing that up! Really helps

> 
> "oc get co -o yaml" would be more useful since that would have more detailed
> status

Will this contain any sensitive info that needs to be scrubbed?


> > one of the nodes which is
> > failing to assign an IPv4 macvlan network attachment for the IPv4<->IPv6
> > ingress envoy pod.
> 
> There is no "IPv4<->IPv6 ingress envoy pod" in a default OCP install, so
> this would be a problem with something you are running, not a problem with
> OCP itself...
> 
> -- Dan


Yes, that's something we need in certain environments to enable communication from outside over IPv4. For that we need the extra network attachments from Multus to be functional.

Comment 30 Dan Winship 2021-06-01 12:22:47 UTC
(In reply to raj.sarvaiya from comment #29)
> > "oc get co -o yaml" would be more useful since that would have more detailed
> > status
> 
> Will this contain any sensitive info that needs to be scrubbed?

nothing beyond node hostnames

> > > one of the nodes which is
> > > failing to assign an IPv4 macvlan network attachment for the IPv4<->IPv6
> > > ingress envoy pod.
> > 
> > There is no "IPv4<->IPv6 ingress envoy pod" in a default OCP install, so
> > this would be a problem with something you are running, not a problem with
> > OCP itself...
> 
> Yes that's something we need for certain envs to enable communication from
> outside over IPv4. For this we need the extra network attachments from
> multus to be functional

Yes, I just mean that if the cluster is failing because of that, you'd need to figure it out yourself, since we don't know any of the details of what that pod is doing; it's something you created, not something we created.

Comment 31 raj.sarvaiya@bell.ca 2021-06-01 12:31:52 UTC
(In reply to Dan Winship from comment #30)

> > Will this contain any sensitive info that needs to be scrubbed?
> 
> nothing beyond node hostnames
> 

Alright, I'm attempting to recreate the conditions and the deployment, and will try to get you this.


> Yes, I just mean, if the cluster is failing because of that though, then
> you'd need to figure that out yourself, since we don't know any of the
> details of what that pod is doing, because it's something you created, not
> something we created.

If we can figure out and solve why Multus is failing to assign a network attachment there, then I'm fairly certain the cluster will be at the desired functionality level. Basically, the pod doesn't even start, and the issue so far looks to be with Multus. When using another CNI such as Cilium, Multus does work.

Comment 32 Daniel Del Ciancio 2021-07-22 17:31:49 UTC
Hi @raj.sarvaiya, can you provide the install-config.yaml for the OVN cluster? Is there any difference between the Cilium and OVN network configs?

Comment 33 raj.sarvaiya@bell.ca 2021-07-22 17:38:51 UTC
Hey Daniel, I will have to regenerate this via a pipeline run. I can do it tomorrow afternoon; I'm in the middle of a few things at the moment.

Comment 34 raj.sarvaiya@bell.ca 2021-07-26 15:27:17 UTC
Created attachment 1805968 [details]
install-config-4.6-v6-ovn

install-config for a 4.6 ipv6 deployment

Comment 35 raj.sarvaiya@bell.ca 2021-07-26 15:30:37 UTC
I have attached the OVN install-config. There's no difference between it and the Cilium install-config, except that the Cilium one sets networkType to Cilium.

