Description of problem:

On on-prem platforms (bare metal, OpenStack, oVirt, vSphere) it is currently not possible to deploy additional compute nodes on separate subnets, due to a few assumptions about the node being on a subnet that includes the VIP. The issues preventing additional workers from running on a separate subnet are:

- baremetal-runtimecfg render complains when the node is not on a subnet on which the VIP is routable.
- baremetal-runtimecfg node-ip accepts running without a VIP being provided, but that means we need to update the templates in MCO, and kubelet doesn't allow changing the initial node role when registering it with the cluster.
- We need to find a way to delete the keepalived manifest before kubelet starts.

Steps to Reproduce:
1. Deploy OpenShift as usual with one compute machine pool.
2. After installation completes, create a new subnet with a route to the initial node subnet.
3. Create a MachineSet with the additional nodes being on the new subnet (an illustrative MachineSet sketch follows below).

Actual results:
baremetal-runtimecfg fails to generate the configuration file, and even if it did, the keepalived pod would interfere with the node's networking.

Expected results:
New compute nodes can join the cluster when deployed on separate subnets.

Additional info:
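For illustration only, a minimal vSphere MachineSet sketch for step 3, assuming the cluster infra ID, template, vCenter details, and the portgroup attached to the new subnet are filled in (all bracketed values are placeholders, not taken from this report):

  apiVersion: machine.openshift.io/v1beta1
  kind: MachineSet
  metadata:
    name: <infra-id>-worker-subnet2
    namespace: openshift-machine-api
    labels:
      machine.openshift.io/cluster-api-cluster: <infra-id>
  spec:
    replicas: 1
    selector:
      matchLabels:
        machine.openshift.io/cluster-api-cluster: <infra-id>
        machine.openshift.io/cluster-api-machineset: <infra-id>-worker-subnet2
    template:
      metadata:
        labels:
          machine.openshift.io/cluster-api-cluster: <infra-id>
          machine.openshift.io/cluster-api-machine-role: worker
          machine.openshift.io/cluster-api-machine-type: worker
          machine.openshift.io/cluster-api-machineset: <infra-id>-worker-subnet2
      spec:
        providerSpec:
          value:
            apiVersion: vsphereprovider.openshift.io/v1beta1
            kind: VSphereMachineProviderSpec
            credentialsSecret:
              name: vsphere-cloud-credentials
            userDataSecret:
              name: worker-user-data
            template: <rhcos-template>
            numCPUs: 4
            memoryMiB: 8192
            diskGiB: 120
            network:
              devices:
              # portgroup/network backing the new, routed subnet
              - networkName: <portgroup-on-new-subnet>
            workspace:
              server: <vcenter-server>
              datacenter: <datacenter>
              datastore: <datastore>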
*** Bug 1905134 has been marked as a duplicate of this bug. ***
Hi, I'm facing this issue in 4.6 IPI on VMware. Is this being treated as a bug to be fixed? If so, since 4.6 is EUS, will it be fixed in 4.6? Thank you.
Another comment: this is not just about two subnets; it could be one subnet for masters, one subnet for infra, and several subnets for workers. Maybe it should be configurable which nodes are used for the VIPs.
Hi Javier, there are currently no plans to backport these changes to 4.6 and earlier versions unless there is a strong business case, as it would require a lot of testing. There is also work going on to make this architecture more flexible: https://github.com/openshift/enhancements/pull/524.
Hi Martin, I see this as a bug in the current version. Since 4.6 is still in full support, I was assuming this would be fixed in 4.6, also taking into account that 4.6 is EUS. Being able to install across different VLANs is a feature of OpenShift, and it is not working with IPI. I don't have a business case beyond the fact that customers with this requirement of using different VLANs want to stay on 4.6 for a period of stability thanks to the EUS, and this bug is preventing them from using IPI.
Hi, I have created the following MachineConfig. It overwrites the files related to the keepalived pod and its configuration, and also disables the "nodeip-configuration.service" service.

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: masters-chrony-configuration
spec:
  config:
    ignition:
      config: {}
      security:
        tls: {}
      timeouts: {}
      version: 3.1.0
    networkd: {}
    passwd: {}
    systemd:
      units:
      - contents: |
          [Unit]
          Description=Writes IP address configuration so that kubelet and crio services select a valid node IP
          # This only applies to VIP managing environments where the kubelet and crio IP
          # address picking logic is flawed and may end up selecting an address from a
          # different subnet or a deprecated address
          Wants=network-online.target
          After=network-online.target ignition-firstboot-complete.service
          Before=kubelet.service crio.service

          [Service]
          # Need oneshot to delay kubelet
          Type=oneshot
          # Would prefer to do Restart=on-failure instead of this bash retry loop, but
          # the version of systemd we have right now doesn't support it. It should be
          # available in systemd v244 and higher.
          ExecStart=/bin/bash -c " \
            until \
            /usr/bin/podman run --rm \
            --authfile /var/lib/kubelet/config.json \
            --net=host \
            --volume /etc/systemd/system:/etc/systemd/system:z \
            quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:44dffacfd1b61252df317adcdd1b549c06dfd9d98436adce60fd9a6ec72c7f97 \
            node-ip \
            set --retry-on-failure \
            192.168.1.247; \
            do \
            sleep 5; \
            done"
          ExecStart=/bin/systemctl daemon-reload

          [Install]
          WantedBy=multi-user.target
        enabled: false
        name: nodeip-configuration.service
    storage:
      files:
      - filesystem: root
        overwrite: true
        path: "/etc/kubernetes/static-pod-resources/keepalived/keepalived.conf.tmpl"
        contents:
          source: data:,foo
        mode: 420
      - filesystem: root
        overwrite: true
        path: "/etc/kubernetes/manifests/keepalived.yaml"
        contents:
          source: data:,kind%3A%20Pod%0AapiVersion%3A%20v1%0Ametadata%3A%0A%20%20name%3A%20foo-keepalived%0A%20%20namespace%3A%20openshift-vsphere-infra%20%0A%20%20labels%3A%0A%20%20%20%20app%3A%20vsphere-infra-vrrp%0Aspec%3A%0A%20%20containers%3A%0A%20%20-%20name%3A%20foo-keepalived%20%20%20%20%0A%20%20%20%20image%3A%20docker.io%2Fbusybox%20%20%0A%20%20hostNetwork%3A%20true%0A%20%20tolerations%3A%0A%20%20-%20operator%3A%20Exists%0A%20%20priorityClassName%3A%20system-node-critical
        mode: 420
  osImageURL: ""

I overwrite the files because I couldn't find how to remove a file using an Ignition config. With this MachineConfig the worker nodes start fine. There is no VIP for the workers, but that is not a problem because a load balancer is needed for production anyway; the Ingress VIP is only "temporary" for the installation. What do you think, could this be a valid/supported workaround for OCP 4.6? Thank you.
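For reference, applying a MachineConfig like this and watching it roll out would look roughly as follows (a generic sketch, not from the comment above; the file name is arbitrary):

  $ oc apply -f 99-worker-disable-keepalived.yaml
  $ oc get machineconfigpool worker      # wait until the worker pool reports UPDATED=True
  $ oc get nodes -o wide                 # new workers on the other subnet should then register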
Apart from a few typos, I forgot to mention that I tested this by creating a cluster with 0 workers and master scheduling disabled; after the masters came up I applied the MachineConfig and created MachineSets using a network different from the masters' network. It could also be possible to add it to the <install folder>/openshift folder to deploy it at cluster creation time (see the sketch below).
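For illustration, the day-0 flow described above would look roughly like this, assuming the MachineConfig has been saved as 99-worker-disable-keepalived.yaml (the file name is a placeholder):

  $ openshift-install create manifests --dir <install folder>
  $ cp 99-worker-disable-keepalived.yaml <install folder>/openshift/
  $ openshift-install create cluster --dir <install folder>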
In case the Ingress VIP is mandatory, it would be possible to not apply this MachineConfig to a pair of worker nodes deployed in the same VLAN as the masters, and apply it only to the workers deployed in different VLANs (a possible way to scope this is sketched below).
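One way to scope the MachineConfig like that, not tested here and only a sketch, would be a custom pool for the remote-subnet workers: label those nodes (or their MachineSet) with a hypothetical node-role.kubernetes.io/worker-remote role, set the MachineConfig's role label to worker-remote instead of worker, and create a pool such as:

  apiVersion: machineconfiguration.openshift.io/v1
  kind: MachineConfigPool
  metadata:
    name: worker-remote
  spec:
    machineConfigSelector:
      matchExpressions:
      # render both the base worker configs and the worker-remote override
      - key: machineconfiguration.openshift.io/role
        operator: In
        values: [worker, worker-remote]
    nodeSelector:
      matchLabels:
        node-role.kubernetes.io/worker-remote: ""

With that, only the nodes carrying the worker-remote label would get the keepalived override, while the workers in the masters' VLAN would keep the stock configuration and continue hosting the Ingress VIP.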
Checked with 4.7.0-0.nightly-2020-12-17-201522, and it works well now.

$ oc get machineset -A
NAMESPACE               NAME                          DESIRED   CURRENT   READY   AVAILABLE   AGE
openshift-machine-api   wj47ios1218a-rnntt-addit-0    1         1         1       1           8m9s
openshift-machine-api   wj47ios1218a-rnntt-worker-0   3         3         3       3           96m

$ oc get nodes -o wide
NAME                                STATUS   ROLES    AGE   VERSION           INTERNAL-IP      EXTERNAL-IP   OS-IMAGE                                                        KERNEL-VERSION                CONTAINER-RUNTIME
wj47ios1218a-rnntt-addit-0-drz5s    Ready    worker   59s   v1.20.0+87544c5   192.168.66.124   <none>        Red Hat Enterprise Linux CoreOS 47.83.202012171642-0 (Ootpa)   4.18.0-240.8.1.el8_3.x86_64   cri-o://1.20.0-0.rhaos4.7.gitd388528.el8.39
wj47ios1218a-rnntt-master-0         Ready    master   94m   v1.20.0+87544c5   192.168.0.79     <none>        Red Hat Enterprise Linux CoreOS 47.83.202012171642-0 (Ootpa)   4.18.0-240.8.1.el8_3.x86_64   cri-o://1.20.0-0.rhaos4.7.gitd388528.el8.39
wj47ios1218a-rnntt-master-1         Ready    master   94m   v1.20.0+87544c5   192.168.3.90     <none>        Red Hat Enterprise Linux CoreOS 47.83.202012171642-0 (Ootpa)   4.18.0-240.8.1.el8_3.x86_64   cri-o://1.20.0-0.rhaos4.7.gitd388528.el8.39
wj47ios1218a-rnntt-master-2         Ready    master   94m   v1.20.0+87544c5   192.168.3.194    <none>        Red Hat Enterprise Linux CoreOS 47.83.202012171642-0 (Ootpa)   4.18.0-240.8.1.el8_3.x86_64   cri-o://1.20.0-0.rhaos4.7.gitd388528.el8.39
wj47ios1218a-rnntt-worker-0-ftt9z   Ready    worker   80m   v1.20.0+87544c5   192.168.0.250    <none>        Red Hat Enterprise Linux CoreOS 47.83.202012171642-0 (Ootpa)   4.18.0-240.8.1.el8_3.x86_64   cri-o://1.20.0-0.rhaos4.7.gitd388528.el8.39
wj47ios1218a-rnntt-worker-0-gtsns   Ready    worker   80m   v1.20.0+87544c5   192.168.0.108    <none>        Red Hat Enterprise Linux CoreOS 47.83.202012171642-0 (Ootpa)   4.18.0-240.8.1.el8_3.x86_64   cri-o://1.20.0-0.rhaos4.7.gitd388528.el8.39
wj47ios1218a-rnntt-worker-0-zxcbd   Ready    worker   80m   v1.20.0+87544c5   192.168.1.103    <none>        Red Hat Enterprise Linux CoreOS 47.83.202012171642-0 (Ootpa)   4.18.0-240.8.1.el8_3.x86_64   cri-o://1.20.0-0.rhaos4.7.gitd388528.el8.39
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633