Upstream PR: https://github.com/ovn-org/ovn-kubernetes/pull/2123
My customer reached out to me for an ETA on the fix. Any ideas? Thanks!
(In reply to Daniel Del Ciancio from comment #5)
> My customer reached out to me for an ETA on the fix. Any ideas?
>
> Thanks!

Hi all,

The customer reached out to me again on this. Any updates you can share on an ETA?

Thanks!
This is a blocker issue preventing them from upgrading to 4.6, and they need to be on 4.6 in order to stabilize OVN and resolve some IPv6 issues they are facing.

So I've increased the severity and priority of this bug to align with the customer's expectations.

Can you provide an approximate ETA for when we can expect the fix? They have been repeatedly asking me for an update. I'd appreciate your help in getting this prioritized.

Thanks!
Upstream PR https://github.com/ovn-org/ovn-kubernetes/pull/2134 is waiting on reviews.
(In reply to Daniel Del Ciancio from comment #8)
> This is a blocker issue preventing them from upgrading to 4.6, and they need
> to be on 4.6 in order to stabilize OVN and resolve some IPv6 issues they are
> facing.
>
> So I've increased the severity and priority of this bug to align with the
> customer's expectations.
>
> Can you provide an approximate ETA for when we can expect the fix? They have
> been repeatedly asking me for an update. I'd appreciate your help in getting
> this prioritized.
>
> Thanks!

Daniel,

I would like to explain how IPv6 pod IP assignment happens so we know which IPs to expect.

Valid IPv6 pod IPs range from base to base + 65536, where base is the IPv6 CIDR address + 1; this is a known limitation of the current pod IP allocation algorithm, and the PR enforces this range limitation. As long as an IP address fits in that range there should be no issues; an IP address outside the range won't work.

For example, if the CIDR is 2605:b100:283:1::/64, the valid IPv6 pod IPs are 2605:b100:283:1::1 through 2605:b100:283:1::ffff.
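For illustration only, here is a minimal Go sketch (not the actual ovn-kubernetes allocator code; the helper name inAllocatableRange is made up, and the upper bound follows the ::1 .. ::ffff window from the example above) that checks whether a given pod IP falls inside that window:

package main

import (
	"fmt"
	"math/big"
	"net"
)

// inAllocatableRange is an illustrative helper: given a cluster CIDR, it
// reports whether ip falls between <network address>+1 and
// <network address>+0xffff, mirroring the example range above
// (2605:b100:283:1::1 .. 2605:b100:283:1::ffff for 2605:b100:283:1::/64).
func inAllocatableRange(cidr, ip string) (bool, error) {
	_, ipNet, err := net.ParseCIDR(cidr)
	if err != nil {
		return false, err
	}
	addr := net.ParseIP(ip)
	if addr == nil {
		return false, fmt.Errorf("invalid IP %q", ip)
	}
	if !ipNet.Contains(addr) {
		return false, nil
	}
	// Compute the offset of the address from the network address.
	base := new(big.Int).SetBytes(ipNet.IP.To16())
	offset := new(big.Int).Sub(new(big.Int).SetBytes(addr.To16()), base)
	// Valid offsets are 1 .. 0xffff.
	return offset.Sign() > 0 && offset.Cmp(big.NewInt(0xffff)) <= 0, nil
}

func main() {
	for _, ip := range []string{"2605:b100:283:1::1", "2605:b100:283:1::ffff", "2605:b100:283:1::1:0"} {
		ok, _ := inAllocatableRange("2605:b100:283:1::/64", ip)
		fmt.Printf("%-22s in range: %v\n", ip, ok)
	}
}

Running this prints true for ::1 and ::ffff and false for ::1:0, which is outside the window even though it is inside the /64.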
UPDATE: The customer was able to update a dev cluster from 4.5.24 to 4.6.21.

While the rollout was failing on the network operator update phase, they saw the following issue in the newest ovnkube-node pod, which was in a CrashLoopBackOff state:

F0330 17:30:51.212374 4188378 ovnkube.go:130] failed to add neighbour entry 2605:b100:283:2::1 0a:58:4b:3a:df:53: file exists

First they restarted all the pods in the openshift-ovn-kubernetes namespace, after which all the ovnkube-node pods started hitting the same issue. They then force-rebooted the nodes one by one, which made the error go away. After that the update reached 100%, but the monitoring and image-registry operators were showing as failing.

The image registry failure is due to their egress pod not being able to come up, with the following macvlan/multus issue:

Pod envoyv4v6-9558b6d4f-j2zcr
Namespace bell-services
4 minutes ago
Generated from kubelet on ocp87-worker-node-2
148 times in the last 42 minutes (combined from similar events):
Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_envoyv4v6-9558b6d4f-j2zcr_bell-services_eb356843-f32c-4fe8-a9da-ae5534edd06e_0(5676e80d23c639dcfbbc8c77a9ef8d5fab7e706d7a8eefc1490bd46a35a3eff7): [bell-services/envoyv4v6-9558b6d4f-j2zcr:envoyv4]: error adding container to network "envoyv4": failed to create macvlan: device or resource busy

Any ideas?
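For context on the "file exists" failure above: this is a minimal Go sketch, assuming the node process adds neighbour entries through the vishvananda/netlink package, showing why an exclusive neighbour add returns EEXIST ("file exists") when an entry is already present and how a create-or-replace call avoids it. The helper name ensureNeighbour and the interface name ovn-k8s-mp0 are assumptions for illustration, not the actual ovnkube-node code path or the proposed fix:

package main

import (
	"fmt"
	"net"
	"os"

	"github.com/vishvananda/netlink"
)

// ensureNeighbour is illustrative only: NeighAdd is exclusive, so it fails
// with "file exists" if an entry for the IP already exists (e.g. left over
// from a previous run); NeighSet creates or replaces the entry instead.
func ensureNeighbour(linkName, ip, mac string) error {
	link, err := netlink.LinkByName(linkName)
	if err != nil {
		return fmt.Errorf("lookup link %s: %v", linkName, err)
	}
	hwAddr, err := net.ParseMAC(mac)
	if err != nil {
		return fmt.Errorf("parse MAC %s: %v", mac, err)
	}
	neigh := &netlink.Neigh{
		LinkIndex:    link.Attrs().Index,
		State:        netlink.NUD_PERMANENT,
		IP:           net.ParseIP(ip),
		HardwareAddr: hwAddr,
	}
	if err := netlink.NeighAdd(neigh); err != nil {
		if os.IsExist(err) {
			// Entry already present ("file exists"); replace it idempotently.
			return netlink.NeighSet(neigh)
		}
		return err
	}
	return nil
}

func main() {
	// IP and MAC taken from the log above; the interface name is an assumption.
	if err := ensureNeighbour("ovn-k8s-mp0", "2605:b100:283:2::1", "0a:58:4b:3a:df:53"); err != nil {
		fmt.Println("error:", err)
	}
}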
Let us wait for the fix to the bootstrap issue and then try a fresh cluster with 4.6.