openshift-gcp-routes is required to send traffic from GCP load balancers, because it is what actually connects the VIP to the host networking. Right now the service stops *before* networking stops, which means we kill the VIP while kube-apiserver is still draining, and that causes disruption. We should terminate the openshift-gcp-routes service when the network is shutting down, not before.
Verified on 4.8.0-0.nightly-2021-05-07-075528. openshift-gcp-routes.service is stopped after network online target.

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-05-07-075528   True        False         32m     Cluster version is 4.8.0-0.nightly-2021-05-07-075528

$ oc get nodes
NAME                                       STATUS   ROLES    AGE   VERSION
ci-ln-9wr6012-f76d1-z7bjv-master-0         Ready    master   53m   v1.21.0-rc.0+291e731
ci-ln-9wr6012-f76d1-z7bjv-master-1         Ready    master   53m   v1.21.0-rc.0+291e731
ci-ln-9wr6012-f76d1-z7bjv-master-2         Ready    master   53m   v1.21.0-rc.0+291e731
ci-ln-9wr6012-f76d1-z7bjv-worker-b-gwn8c   Ready    worker   44m   v1.21.0-rc.0+291e731
ci-ln-9wr6012-f76d1-z7bjv-worker-c-c2ndb   Ready    worker   44m   v1.21.0-rc.0+291e731
ci-ln-9wr6012-f76d1-z7bjv-worker-d-2sc2x   Ready    worker   44m   v1.21.0-rc.0+291e731

$ oc debug node/ci-ln-9wr6012-f76d1-z7bjv-master-0
Starting pod/ci-ln-9wr6012-f76d1-z7bjv-master-0-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# systemctl cat openshift-gcp-routes
# /etc/systemd/system/openshift-gcp-routes.service
[Unit]
Description=Update GCP routes for forwarded IPs.
ConditionKernelCommandLine=|ignition.platform.id=gce
ConditionKernelCommandLine=|ignition.platform.id=gcp
Before=network-online.target

[Service]
Type=simple
ExecStart=/bin/bash /opt/libexec/openshift-gcp-routes.sh start
ExecStopPost=/bin/bash /opt/libexec/openshift-gcp-routes.sh cleanup
User=root
RestartSec=30
Restart=always

[Install]
WantedBy=multi-user.target
# Ensure that network-online.target will not complete until the node has working external LBs.
RequiredBy=network-online.target

sh-4.4# journalctl
...snip...
May 07 17:15:11 ci-ln-9wr6012-f76d1-z7bjv-master-0.c.openshift-gce-devel-ci.inte systemd[1]: Stopped target Network is Online.
May 07 17:15:11 ci-ln-9wr6012-f76d1-z7bjv-master-0.c.openshift-gce-devel-ci.inte systemd[1]: node-valid-hostname.service: Succeeded.
May 07 17:15:11 ci-ln-9wr6012-f76d1-z7bjv-master-0.c.openshift-gce-devel-ci.inte systemd[1]: Stopped Ensure the node hostname is valid for the cluster.
May 07 17:15:11 ci-ln-9wr6012-f76d1-z7bjv-master-0.c.openshift-gce-devel-ci.inte systemd[1]: node-valid-hostname.service: Consumed 0 CPU time
May 07 17:15:11 ci-ln-9wr6012-f76d1-z7bjv-master-0.c.openshift-gce-devel-ci.inte systemd[1]: Stopping Update GCP routes for forwarded IPs....
May 07 17:15:11 ci-ln-9wr6012-f76d1-z7bjv-master-0.c.openshift-gce-devel-ci.inte systemd[1]: NetworkManager-wait-online.service: Succeeded.
May 07 17:15:11 ci-ln-9wr6012-f76d1-z7bjv-master-0.c.openshift-gce-devel-ci.inte systemd[1]: Stopped Network Manager Wait Online.
May 07 17:15:11 ci-ln-9wr6012-f76d1-z7bjv-master-0.c.openshift-gce-devel-ci.inte systemd[1]: NetworkManager-wait-online.service: Consumed 0 CPU time
May 07 17:15:11 ci-ln-9wr6012-f76d1-z7bjv-master-0.c.openshift-gce-devel-ci.inte systemd[1]: Stopped target sshd-keygen.target.
May 07 17:15:11 ci-ln-9wr6012-f76d1-z7bjv-master-0.c.openshift-gce-devel-ci.inte systemd[1]: systemd-user-sessions.service: Succeeded.
May 07 17:15:11 ci-ln-9wr6012-f76d1-z7bjv-master-0.c.openshift-gce-devel-ci.inte systemd[1]: Stopped Permit User Sessions.
May 07 17:15:11 ci-ln-9wr6012-f76d1-z7bjv-master-0.c.openshift-gce-devel-ci.inte systemd[1]: systemd-user-sessions.service: Consumed 13ms CPU time
May 07 17:15:11 ci-ln-9wr6012-f76d1-z7bjv-master-0.c.openshift-gce-devel-ci.inte systemd[1]: Stopped target Remote File Systems.
May 07 17:15:11 ci-ln-9wr6012-f76d1-z7bjv-master-0.c.openshift-gce-devel-ci.inte systemd[1]: Stopped target Network.
May 07 17:15:11 ci-ln-9wr6012-f76d1-z7bjv-master-0.c.openshift-gce-devel-ci.inte systemd[1]: Stopping Network Manager...
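
For anyone re-verifying on another build, the ordering visible in the journal above (network-online.target stops first, then openshift-gcp-routes begins stopping) could be checked mechanically rather than by eye. The following is only a sketch: the function name check_stop_order is hypothetical, and it assumes the shutdown journal has already been saved to a plain text file. It matches on the two message strings that appear in the excerpt above.

```shell
# Hypothetical helper (not part of the fix): confirm that the saved journal
# shows network-online.target stopping before openshift-gcp-routes.service
# begins to stop.
# $1: path to a text file containing the shutdown journal output.
check_stop_order() {
    # Line number of the first occurrence of each message, empty if absent.
    target_line=$(grep -n -m1 'Stopped target Network is Online' "$1" | cut -d: -f1)
    routes_line=$(grep -n -m1 'Stopping Update GCP routes for forwarded IPs' "$1" | cut -d: -f1)

    if [ -z "$target_line" ] || [ -z "$routes_line" ]; then
        echo "missing"
        return 2
    fi

    if [ "$target_line" -lt "$routes_line" ]; then
        echo "ok"
    else
        echo "wrong-order"
        return 1
    fi
}
```

Against the excerpt above this should report "ok"; on a build without the fix, where the routes service stops first, it would be expected to report "wrong-order".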
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438