Description of problem:
As discussed in bug https://bugzilla.redhat.com/show_bug.cgi?id=1848374, opening this bug to track the new issue in 4.6.

Version-Release number of selected component (if applicable):
4.6.0-0.nightly-2020-07-06-180619

How reproducible:
Always

Steps to Reproduce:
After killing ovs-vswitchd on the node, the sdn pod goes into CrashLoopBackOff status. Tested on two nodes; both sdn pods went into CrashLoopBackOff.

sh-4.4# pgrep ovs-vswitchd
1291
sh-4.4# pkill ovs-vswitchd
sh-4.4# pgrep ovs-vswitchd
sh-4.4#

$ oc get pods -n openshift-sdn
NAME                   READY   STATUS             RESTARTS   AGE
ovs-274sx              1/1     Running            0          23m
ovs-dsd2g              1/1     Running            0          34m
ovs-s4mr8              1/1     Running            0          23m
ovs-tlggz              1/1     Running            0          33m
ovs-vrxtn              1/1     Running            0          23m
ovs-xw8mf              1/1     Running            0          34m
sdn-454ss              0/1     CrashLoopBackOff   6          23m
sdn-8dnhs              1/1     Running            0          33m
sdn-controller-6gbz7   1/1     Running            0          34m
sdn-controller-6kpdw   1/1     Running            1          34m
sdn-controller-ftrdn   1/1     Running            0          33m
sdn-hhzpv              1/1     Running            0          23m
sdn-k6xdc              1/1     Running            0          34m
sdn-kw7zn              0/1     CrashLoopBackOff   5          23m
sdn-ntbmv              1/1     Running            0          34m

$ oc logs sdn-kw7zn -n openshift-sdn
I0707 01:30:47.634694   38474 cmd.go:121] Reading proxy configuration from /config/kube-proxy-config.yaml
I0707 01:30:47.635974   38474 feature_gate.go:243] feature gates: &{map[]}
I0707 01:30:47.636032   38474 cmd.go:216] Watching config file /config/kube-proxy-config.yaml for changes
I0707 01:30:47.636078   38474 cmd.go:216] Watching config file /config/..2020_07_07_01_11_15.728054255/kube-proxy-config.yaml for changes
I0707 01:30:47.670497   38474 node.go:150] Initializing SDN node "ip-10-0-139-165.us-east-2.compute.internal" (10.0.139.165) of type "redhat/openshift-ovs-networkpolicy"
I0707 01:30:47.675538   38474 cmd.go:159] Starting node networking (v0.0.0-alpha.0-181-ga461a1f6)
I0707 01:30:47.675557   38474 node.go:338] Starting openshift-sdn network plugin
I0707 01:30:47.773151   38474 ovs.go:176] Error executing ovs-ofctl: ovs-ofctl: br0 is not a bridge or a socket
I0707 01:30:47.773379   38474 sdn_controller.go:139] [SDN setup] full SDN setup required (plugin is not setup)
I0707 01:31:17.808174   38474 ovs.go:176] Error executing ovs-vsctl: 2020-07-07T01:31:17Z|00002|fatal_signal|WARN|terminating with signal 14 (Alarm clock)
F0707 01:31:17.808256   38474 cmd.go:111] Failed to start sdn: node SDN setup failed: signal: alarm clock

Actual results:
sdn pods go into CrashLoopBackOff.

Expected results:
All pods in openshift-sdn should be in Running status.

Additional info:
ovs-vswitchd is set to Restart=on-failure. According to https://www.freedesktop.org/software/systemd/man/systemd.service.html#Restart=, "on-failure" does not restart the service on a clean exit, so it will not restart after pkill (which sends SIGTERM, treated as a clean exit). It looks like ovs-vswitchd.service ships in RHCOS, so we need to modify it to Restart=always.
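For illustration, one way to change the policy without rebuilding the RHCOS image would be a systemd drop-in on the node. This is a sketch only; the drop-in filename is hypothetical, and it assumes the unit is named ovs-vswitchd.service (a `systemctl daemon-reload` is needed afterwards):

```ini
# /etc/systemd/system/ovs-vswitchd.service.d/10-restart-always.conf
# Hypothetical drop-in: override only the Restart= policy of the
# shipped unit, leaving the rest of the unit file untouched.
[Service]
Restart=always
```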
The expectation that pkill ovs-vswitchd should not permanently stop ovs-vswitchd is inherited from the previous behavior, where CNO and the OVS DaemonSet would restart OVS pods that were terminated. Does CNO have the ability to monitor the host OVS to enforce the same policy?
When moving to system OVS, we should rely on the --monitor flag to ensure that any such inadvertent (or in some cases intentional) kill of the process does not result in vswitchd staying offline. I see that our RHCOS images don't have the flag set; i.e., I see --no-monitor in the service files.
I'm no systemd expert, but I thought the plan was to move all daemon-monitoring responsibilities to systemd and not use daemon-monitoring mechanisms like --monitor. Looking at the original BZ behind the test case, https://bugzilla.redhat.com/show_bug.cgi?id=1669311, the original problem was OVS getting OOM-killed, which is either SIGTERM or SIGKILL. I'm not sure how the current systemd unit will react when OOM-killed. It is possible the --monitor process could itself be OOM-killed, whereas systemd itself I imagine is immune and thus always present to restart daemons.
Scratch my previous comment. It looks like the service files do indeed have Restart=on-failure set. However, SIGTERM is not one of the signals to which it will respond. From the systemd documentation:

Takes one of no, on-success, on-failure, on-abnormal, on-watchdog, on-abort, or always.
- no (the default): the service will not be restarted.
- on-success: restarted only when the service process exits cleanly. In this context, a clean exit means an exit code of 0, or one of the signals SIGHUP, SIGINT, SIGTERM or SIGPIPE, and additionally, exit statuses and signals specified in SuccessExitStatus=.
- on-failure: restarted when the process exits with a non-zero exit code, is terminated by a signal (including on core dump, but excluding the aforementioned four signals), when an operation (such as service reload) times out, and when the configured watchdog timeout is triggered.
- on-abnormal: restarted when the process is terminated by a signal (including on core dump, excluding the aforementioned four signals), when an operation times out, or when the watchdog timeout is triggered.
- on-abort: restarted only if the service process exits due to an uncaught signal not specified as a clean exit status.
- on-watchdog: restarted only if the watchdog timeout for the service expires.
- always: restarted regardless of whether it exited cleanly or not, got terminated abnormally by a signal, or hit a timeout.

So back to the question Ross is asking: do we want to support pkill as one of the modes? If so, we need to set Restart=always.
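To make the on-failure behavior concrete, here is a small sketch (not from the report, names are illustrative) of how Restart=on-failure classifies a signal-terminated service per the documentation quoted above: SIGHUP, SIGINT, SIGTERM and SIGPIPE count as clean exits and do not trigger a restart, while any other fatal signal does.

```shell
# Sketch of systemd's Restart=on-failure decision for a service killed
# by a signal. The four "clean" signals are treated like a successful
# exit; everything else counts as a failure and triggers a restart.
on_failure_restarts() {
  case "$1" in
    SIGHUP|SIGINT|SIGTERM|SIGPIPE) echo "no restart (clean exit)" ;;
    *) echo "restart" ;;
  esac
}

on_failure_restarts SIGTERM   # pkill's default signal -> no restart (clean exit)
on_failure_restarts SIGKILL   # e.g. an OOM kill -> restart
```

This matches what we saw: pkill (SIGTERM) leaves ovs-vswitchd down, while a hard kill would have been restarted.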
Just for reference, here are the restart policies for other services on 4.6.0-0.nightly-2020-07-15-091743:

sh-4.4# systemctl show '*' -p Restart -p Names --type=service --state=active | grep -A1 'Restart=[^n]'
Restart=always
Names=getty
--
Restart=always
Names=systemd-udevd.service
--
Restart=always
Names=systemd-logind.service
--
Restart=on-failure
Names=sshd.service
--
Restart=on-failure
Names=ovs-vswitchd.service
--
Restart=on-abnormal
Names=crio.service
--
Restart=always
Names=systemd-journald.service
--
Restart=always
Names=kubelet.service
--
Restart=on-failure
Names=ovsdb-server.service
--
Restart=on-failure
Names=sssd.service
--
Restart=always
Names=serial-getty
--
Restart=on-failure
Names=NetworkManager.service
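As a convenience when scanning that output, a small awk filter (hypothetical helper, not part of the report) can list only the units whose policy is something other than Restart=always, i.e. the ones that would not survive a clean SIGTERM:

```shell
# Read the "Restart=" / "Names=" pairs produced by the systemctl|grep
# pipeline above and print only units not set to Restart=always.
not_always() {
  awk -F= '
    $1 == "Restart" { policy = $2 }
    $1 == "Names" && policy != "always" { print $2 " (" policy ")" }
  '
}

printf 'Restart=on-failure\nNames=ovs-vswitchd.service\n--\nRestart=always\nNames=kubelet.service\n' | not_always
# -> ovs-vswitchd.service (on-failure)
```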
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4196