Now that bz#2034527 is fixed, the cluster in dual-stack jobs is provisioning, but most (not all) of the Feature:IPv6DualStack tests are failing, e.g. from https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi-ovn-dualstack/1481806814159835136 9 passed and 10 failed (all tests below are in [Suite:openshift/conformance/parallel] [Suite:k8s]):

Passed:
[Feature:IPv6DualStack] should be able to reach pod on ipv4 and ipv6 ip
[Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] should be able to handle large requests: udp
[Feature:IPv6DualStack] should create a single stack service with cluster ip from primary service range
[Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] should update endpoints: http
[Feature:IPv6DualStack] should create pod, add ipv6 and ipv4 ip to pod ips
[Feature:IPv6DualStack] should create service with ipv6 cluster ip
[Feature:IPv6DualStack] should create service with ipv4 cluster ip
[Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] should update endpoints: udp
[Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] should be able to handle large requests: http

Failed:
[Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] should function for pod-Service: udp
[Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] should function for node-Service: udp
[Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] should function for node-Service: http
[Feature:IPv6DualStack] should create service with ipv6,v4 cluster ip
[Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] should function for endpoint-Service: udp
[Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] should function for service endpoints using hostNetwork
[Feature:IPv6DualStack] should create service with ipv4,v6 cluster ip
[Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] should function for endpoint-Service: http
[Feature:IPv6DualStack] Granular Checks: Services Secondary IP Family [LinuxOnly] should function for pod-Service: http
[Feature:IPv6DualStack] should have ipv4 and ipv6 internal node ip
I saw a couple of fixes for this in 1.23. We might want to wait for https://github.com/openshift/origin/pull/26711 before judging the overall failure rate. I'm not sure it will clear all of the failures, but we can at least be sure there is a real problem.
Looking at the kernel command line, we have:

Jan 14 02:55:33.953747 localhost kernel: Command line: BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-b0c17d2741adcd571b08d69dbae84aa880469604b4ea9d5e7b6ccbe53c3a3cf2/vmlinuz-4.18.0-305.28.1.el8_4.x86_64 random.trust_cpu=on console=tty0 console=ttyS0,115200n8 ignition.platform.id=metal ignition.firstboot ostree=/ostree/boot.1/rhcos/b0c17d2741adcd571b08d69dbae84aa880469604b4ea9d5e7b6ccbe53c3a3cf2/0 ip=dhcp

This is on both control plane and workers.

We don't pass any ip= arg for dual-stack in CBO for workers:
https://github.com/openshift/cluster-baremetal-operator/blob/master/provisioning/baremetal_pod.go#L326-L327
although we do in the installer for the control plane:
https://github.com/openshift/installer/blob/master/data/data/bootstrap/baremetal/files/usr/local/bin/startironic.sh.template#L100-L104

It's not clear where this is coming from. It could be that because we are booting the live ISO with this arg (for IPA) it is getting copied automatically to the installed image on disk? Or possibly it's a default arg?
Trying out setting ip=dhcp,dhcp6: https://github.com/openshift/cluster-baremetal-operator/pull/236
That had no effect. I wonder if the MCO sets the kernel command line flags after the initial boot+update+reboot.
(In reply to Zane Bitter from comment #5)
> We don't pass any ip= arg for dual-stack in CBO for workers:

Looking at the image-customization-controller, it's being started with:

    { "name": "IP_OPTIONS", "value": "ip=dhcp" },

It looks to me like this is based on what api-int resolves to:
https://github.com/openshift/cluster-baremetal-operator/blob/2655e07/controllers/provisioning_controller.go#L489-L495

> I0114 02:43:20.103478 1 provisioning_controller.go:495] "Network stack calculation" APIServerInternalHost="api-int.ostest.test.metalkube.org" NetworkStack=1

Are we expecting NetworkStack=3 for dualstack?
(In reply to Derek Higgins from comment #6)
> > I0114 02:43:20.103478 1 provisioning_controller.go:495] "Network stack calculation" APIServerInternalHost="api-int.ostest.test.metalkube.org" NetworkStack=1
>
> Are we expecting NetworkStack=3 for dualstack?

I'd have thought so. There's no other time when NetworkStack should be 3, so why else would it exist?
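For reference, here's a minimal sketch (not the actual CBO code, just an assumption about how the value might be derived) of a bitmask-style NetworkStack computed from what api-int resolves to, with IPv4 contributing 1 and IPv6 contributing 2, so a dual-stack host would come out as 3:

package main

import (
	"fmt"
	"net"
)

// Assumed bit values: dual-stack would be NetworkStackV4|NetworkStackV6 == 3.
const (
	NetworkStackV4 = 1 << 0
	NetworkStackV6 = 1 << 1
)

// networkStackFor resolves the host and ORs in a bit per address family found.
func networkStackFor(host string) (int, error) {
	ips, err := net.LookupIP(host)
	if err != nil {
		return 0, err
	}
	stack := 0
	for _, ip := range ips {
		if ip.To4() != nil {
			stack |= NetworkStackV4
		} else {
			stack |= NetworkStackV6
		}
	}
	return stack, nil
}

func main() {
	stack, err := networkStackFor("api-int.ostest.test.metalkube.org")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	// NetworkStack=1 would mean only an IPv4 record was found; 3 would mean dual-stack.
	fmt.Println("NetworkStack =", stack)
}

If the calculation works anything like this, NetworkStack=1 in the log above would mean api-int only resolved to an IPv4 address at that point.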
I believe we set ip=dhcp in dual stack to ensure all of the nodes used the same ip version to resolve their hostnames. Otherwise we got some nodes using short names and others using fqdns depending on which DHCP server responded fastest.
Original patch adding this was here (for bug 1946079):
https://github.com/openshift/cluster-baremetal-operator/pull/148#issue-903275273

> This should be equal to
> "ip=dhcp" (when ipv4 only)
> "ip=dhcp6" (when ipv6 only)
> "" (when dual stack)

It appears that leaving it empty for dual-stack was based on this comment:
https://bugzilla.redhat.com/show_bug.cgi?id=1931852#c8
that "ip=dhcp,dhcp6" is not a thing, which was probably true at the time but NetworkManager has since been changed to accept it beginning with 1.34:
https://networkmanager.dev/blog/networkmanager-1-34/

There were several rounds of patches to get it to work:
https://github.com/openshift/cluster-baremetal-operator/pull/151
https://github.com/openshift/cluster-baremetal-operator/pull/158
https://github.com/openshift/cluster-baremetal-operator/pull/163
but none of these state as a goal that we should pass ip=dhcp on dual-stack. In fact they make it more explicit that we should pass "" for dual-stack:
https://github.com/openshift/cluster-baremetal-operator/pull/158/commits/4589e4c490154ea20d6b28c69283ce3a6efd278f#diff-1575ce96065be1a97bee923445ae60115c8ce02b4a2736788012df8162407100R273-R274

Maybe @sdasu can comment, since she wrote https://github.com/openshift/cluster-baremetal-operator/pull/158/commits/5d37ad30322b511361c352ade461c96f5540190b

I suspect that if we actually pass ip=dhcp,dhcp6 (now that it's supported) we will probably get consistent hostnames, since we'll wait for DHCP on *both* IPv4 and IPv6 networks, and everything will work.
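For illustration, a hypothetical helper (not the actual CBO change; the package name and constants are assumptions matching the sketch in the earlier comment) showing what the mapping would look like if dual-stack switched from "" to the now-supported ip=dhcp,dhcp6:

// Package provisioning is only a name for this sketch; it is not the real CBO layout.
package provisioning

// Assumed bit values for the detected network stack (IPv4=1, IPv6=2, dual-stack=3).
const (
	networkStackV4 = 1 << 0
	networkStackV6 = 1 << 1
)

// ipOptionForStack is a hypothetical helper mapping the detected stack to the
// IP_OPTIONS value handed to the image-customization-controller.
func ipOptionForStack(stack int) string {
	switch stack {
	case networkStackV4:
		return "ip=dhcp"
	case networkStackV6:
		return "ip=dhcp6"
	case networkStackV4 | networkStackV6:
		// Previously "" for dual-stack; "ip=dhcp,dhcp6" needs NetworkManager >= 1.34.
		return "ip=dhcp,dhcp6"
	default:
		return ""
	}
}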
Interesting. That doesn't match my recollection of the discussions I was involved in, but it's entirely possible I'm just wrong.

I'm not sure ip=dhcp,dhcp6 helps here because it's not a question of all interfaces getting addresses, it's a question of what order they get addresses in. That can affect which hostname the system chooses (I think, anyway). If one system gets a DHCP response first with a shortname, and another gets a DHCP6 response first with an FQDN, then their hostname formats will differ. I think this can be a problem even within a single system, if DHCP returns first during inspection and DHCP6 returns first after deployment: the CSR name may not match.

However, this reminds me that ip=dhcp doesn't do what you expect anyway. Example:

$ /usr/libexec/nm-initrd-generator ip=dhcp -s
*** Connection 'default_connection' ***

[connection]
id=Wired Connection
uuid=b4a119b4-ebfb-4ea9-8de2-93b24f276076
type=ethernet
autoconnect-retries=1
multi-connect=3
permissions=

[ipv4]
dhcp-timeout=90
dns-search=
may-fail=false
method=auto

[ipv6]
addr-gen-mode=eui64
dhcp-timeout=90
dns-search=
method=auto

...

Note that even with ip=dhcp, ipv6 is still set to auto (I guess it just doesn't set may-fail to false?). That matches what I saw on the nodes in my investigation of https://bugzilla.redhat.com/show_bug.cgi?id=1982821. With ip=dhcp6 the behavior is as expected. So I'm not sure ip=dhcp would help this situation anyway.

And I'm hoping someone else still remembers some context about this since I clearly don't. :-)
(In reply to Ben Nemec from comment #10)
> Interesting. That doesn't match my recollection of the discussions I was
> involved in, but it's entirely possible I'm just wrong. I'm not sure
> ip=dhcp,dhcp6 helps here because it's not a question of all interfaces
> getting addresses, it's a question of what order they get addresses in. That
> can affect which hostname the system chooses (I think, anyway).

Won't https://github.com/openshift/image-customization-controller/blob/main/pkg/ignition/builder.go#L96-L99 ensure that we always use the IPv6 hostname if we get a DHCPv6 response?

> If one system gets a DHCP response first with a shortname, and another gets
> a DHCP6 response first with an FQDN then their hostname formats will differ.
> I think this can be a problem even within a single system if DHCP returns
> first during inspection and the DHCP6 returns first after deployment. The
> CSR name may not match.

That could be a problem if the provisioning network is v4 and we get a different hostname for v6.

> However, this reminds me that ip=dhcp doesn't do what you expect anyway.

I expect it to wait until we have an IPv4 address from DHCP before declaring the network up. I didn't expect it to disable IPv6.
Sandhya will take a look at https://bugzilla.redhat.com/show_bug.cgi?id=2040671#c9 to see if we can try that change.
There is some suggestion that the test failures might be nothing to do with the metal platform, and will be resolved by bug 2033751.
(In reply to Zane Bitter from comment #13)
> There is some suggestion that the test failures might be nothing to do with
> the metal platform, and will be resolved by bug 2033751.

At least "[sig-network] [Feature:IPv6DualStack] should have ipv4 and ipv6 internal node ip" is still likely a platform issue, because that failure means kubelet doesn't know about both node addresses well enough to post them to the apiserver in its node.Status.Addresses.
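For anyone reproducing this, a rough sketch (assuming in-cluster client-go access; this is not part of the test suite itself) of checking what kubelet actually posted in node.Status.Addresses, which is what that test looks at:

package main

import (
	"context"
	"fmt"
	"net"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	nodes, err := client.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, node := range nodes.Items {
		var hasV4, hasV6 bool
		for _, addr := range node.Status.Addresses {
			if addr.Type != "InternalIP" {
				continue
			}
			ip := net.ParseIP(addr.Address)
			if ip == nil {
				continue
			}
			if ip.To4() != nil {
				hasV4 = true
			} else {
				hasV6 = true
			}
		}
		// A dual-stack node should report both an IPv4 and an IPv6 InternalIP.
		fmt.Printf("%s: IPv4 InternalIP=%v IPv6 InternalIP=%v\n", node.Name, hasV4, hasV6)
	}
}

On the failing nodes, presumably only the IPv4 InternalIP would show up.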
(In reply to Zane Bitter from comment #13)
> There is some suggestion that the test failures might be nothing to do with
> the metal platform, and will be resolved by bug 2033751.

2 of the 10 tests that were failing now appear to be passing, I'm guessing due to the bump in kubernetes version (https://bugzilla.redhat.com/show_bug.cgi?id=2033751):

[Feature:IPv6DualStack] should create service with ipv4,v6 cluster ip [Suite:openshift/conformance/parallel]
[Feature:IPv6DualStack] should create service with ipv6,v4 cluster ip [Suite:openshift/conformance/parallel]

They started passing yesterday evening:
https://testgrid.k8s.io/redhat-openshift-ocp-release-4.10-informing#periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi-ovn-dualstack
I think this may be a timing thing. Check out these journal logs:

Jan 22 00:04:45 master-0.ostest.test.metalkube.org NetworkManager[3935]: <info> [1642809885.0510] dhcp4 (br-ex): state changed unknown -> bound, address=192.168.111.20
[snip]
Jan 22 00:04:46 master-0.ostest.test.metalkube.org NetworkManager[3935]: <info> [1642809886.2684] dhcp6 (br-ex): state changed unknown -> bound, address=fd2e:6f44:5dd8:c956::14

Meanwhile, in nodeip-configuration:

Jan 22 00:04:45 master-0.ostest.test.metalkube.org bash[5361]: time="2022-01-22T00:04:45Z" level=debug msg="retrieved Address map map[0xc0002c1e60:[127.0.0.1/8 lo ::1/128] 0xc0003d4000:[fd00:1101::893:937e:62cd:8cf8/128] 0xc0003d46c0:[1>

We somehow managed to retrieve the address map at the exact moment configure-ovs was reconfiguring the interface so it only had the v4 address and not the v6 one. This is happening consistently in my local env, and I see the same thing in a journal log from ci.

Theoretically we should be able to fix this by enforcing an ordering on the two services, but I'm having trouble with that locally. When I add a dependency one of the services seems to not run at all. :-/
(In reply to Ben Nemec from comment #16)
> We somehow managed to retrieve the address map at the exact moment
> configure-ovs was reconfiguring the interface so it only had the v4 address
> and not the v6 one. This is happening consistently in my local env, and I
> see the same thing in a journal log from ci.
>
> Theoretically we should be able to fix this by enforcing an ordering on the
> two services, but I'm having trouble with that locally. When I add a
> dependency one of the services seems to not run at all. :-/

It looks to me like configure-ovs is the last thing to run before network-online.target:
https://github.com/openshift/machine-config-operator/blob/master/templates/common/_base/units/ovs-configuration.service.yaml#L12
and nodeip-configuration is one of the first things to run after network-online.target:
https://github.com/openshift/machine-config-operator/blob/master/templates/common/_base/units/nodeip-configuration.service.yaml#L7
So they are ordered.

configure-ovs restarts NetworkManager and waits for connections to come up, but if it doesn't wait for DHCP to get both IP addresses again (which certainly appears to be the case) then that would explain why it's quite consistently not getting both IPs.

Comparing to the last successful periodic job run, it seems quite possible that the difference was that the configure-ovs script used to contain an unconditional 5s sleep, which was removed:
https://github.com/openshift/machine-config-operator/commit/9cc7ac42a69474566a6930f80f72190769319f30#diff-afb45a3711a77d94f26471d9d94a7f7a03d931d9e72bdf849f2e26e2711d6fd7L340
by https://github.com/openshift/machine-config-operator/pull/2864

The last successful periodic job used the commit immediately before that PR merged. The metal dualstack job for the PR failed with the same symptoms as we see here ("should have ipv4 and ipv6 internal node ip"):
https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/2864/pull-ci-openshift-machine-config-operator-master-e2e-metal-ipi-ovn-dualstack/1470852830548987904
That test ran on 15 Dec, when the periodic job was still passing, and the metal platform changes didn't merge until 17 Dec.

The day after the PR merged, and before the next periodic job ran, the metal platform changes that caused bug 2034527 broke building the cluster at all in the metal-dualstack job, thus masking this problem until it was fixed.
Testing confirms it: https://github.com/openshift/machine-config-operator/pull/2924

Obviously just adding a 5s sleep isn't really a solution; we should actually wait for DHCP to succeed or time out.
Will the timeout cause a retry? There might be cases where DHCP is unavailable, even for a few hours, and then comes back.
It should use the same logic NetworkManager uses at startup to determine when the network is ready, instead of just declaring success as soon as the port is up without waiting for DHCP.
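As a rough illustration only (the real fix would live in the configure-ovs shell script; "br-ex" and the 90s bound are just taken from the logs and the dhcp-timeout shown earlier), waiting for both address families rather than sleeping a fixed 5s could look something like:

package main

import (
	"fmt"
	"net"
	"time"
)

// waitForDualStack polls an interface until it has both a global IPv4 and a
// global IPv6 address, or gives up after the timeout.
func waitForDualStack(ifname string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		var hasV4, hasV6 bool
		if iface, err := net.InterfaceByName(ifname); err == nil {
			addrs, _ := iface.Addrs()
			for _, a := range addrs {
				ipnet, ok := a.(*net.IPNet)
				if !ok || !ipnet.IP.IsGlobalUnicast() {
					continue // skip loopback and link-local addresses
				}
				if ipnet.IP.To4() != nil {
					hasV4 = true
				} else {
					hasV6 = true
				}
			}
		}
		if hasV4 && hasV6 {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("%s did not get both an IPv4 and an IPv6 address within %s", ifname, timeout)
		}
		time.Sleep(time.Second)
	}
}

func main() {
	if err := waitForDualStack("br-ex", 90*time.Second); err != nil {
		fmt.Println(err)
	}
}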
That makes sense, but in the past we used to add ip=dhcp if the API IP is IPv4 and ip=dhcp6 if it's IPv6, because otherwise what happened in many cases (at least in our lab) is that the node received an IPv6 address (but not IPv4), NM declared the network ready, and the rest of the deployment failed when trying to fetch something from the API over IPv4 (because it was not configured).
From the latest job logs (https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.10-e2e-metal-ipi-ovn-dualstack/1486826255205535744), these cases pass. Moving this bug to verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056