Description of problem:

atomic-openshift-node: I0926 18:36:18.268991    6814 kuberuntime_manager.go:513] Container {Name:openvswitch Image:registry.access.redhat.com/openshift3/ose-node:v3.10 Command:[/bin/bash -c
#!/bin/bash
set -euo pipefail

# if another process is listening on the cni-server socket, wait until it exits
trap 'kill $(jobs -p); exit 0' TERM
retries=0
while true; do
  if /usr/share/openvswitch/scripts/ovs-ctl status &>/dev/null; then
    echo "warning: Another process is currently managing OVS, waiting 15s ..." 2>&1
    sleep 15 & wait
    (( retries += 1 ))
  else
    break
  fi
  if [[ "${retries}" -gt 40 ]]; then
    echo "error: Another process is currently managing OVS, exiting" 2>&1
    exit 1
  fi
done

# launch OVS
function quit {
    /usr/share/openvswitch/scripts/ovs-ctl stop
    exit 0
}
trap quit SIGTERM
/usr/share/openvswitch/scripts/ovs-ctl start --system-id=random

# Restrict the number of pthreads ovs-vswitchd creates to reduce the
# amount of RSS it uses on hosts with many cores
# https://bugzilla.redhat.com/show_bug.cgi?id=1571379
# https://bugzilla.redhat.com/show_bug.cgi?id=1572797
if [[ `nproc` -gt 12 ]]; then
    ovs-vsctl set Open_vSwitch . other_config:n-revalidator-threads=4
    ovs-vsctl set Open_vSwitch . other_config:n-handler-threads=10
fi
while true; do sleep 5; done
] Args:[] WorkingDir: Ports:[] EnvFrom:[] Env:[] Resources:{Limits:map[cpu:{i:{value:200 scale:-3} d:{Dec:<nil>} s:200m Format:DecimalSI} memory:{i:{value:419430400 scale:0} d:{Dec:<nil>} s: Format:BinarySI}] Requests:map[cpu:{i:{value:100 scale:-3} d:{Dec:<nil>} s:100m Format:DecimalSI} memory:{i:{value:314572800 scale:0} d:{Dec:<nil>} s:300Mi Format:BinarySI}]} VolumeMounts:[{Name:host-modules ReadOnly:true MountPath:/lib/modules SubPath: MountPropagation:<nil>} {Name:host-run-ovs ReadOnly:false MountPath:/run/openvswitch SubPath: MountPropagation:<nil>} {Name:host-run-ovs ReadOnly:false MountPath:/var/run/openvswitch SubPath: MountPropagation:<nil>} {Name:host-sys ReadOnly:true MountPath:/sys SubPath: MountPropagation:<nil>} {Name:host-config-openvswitch ReadOnly:false MountPath:/etc/openvswitch SubPath: MountPropagation:<nil>} {Name:sdn-token-q47s4 ReadOnly:true MountPath:/var/run/secrets/kubernetes.io/serviceaccount SubPath: MountPropagation:<nil>}] VolumeDevices:[] LivenessProbe:nil ReadinessProbe:nil Lifecycle:nil TerminationMessagePath:/dev/termination-log TerminationMessagePolicy:File ImagePullPolicy:IfNotPresent SecurityContext:&SecurityContext{Capabilities:nil,Privileged:*true,SELinuxOptions:nil,RunAsUser:*0,RunAsNonRoot:nil,ReadOnlyRootFilesystem:nil,AllowPrivilegeEscalation:nil,RunAsGroup:nil,} Stdin:false StdinOnce:false TTY:false} is dead, but RestartPolicy says that we should restart it.

Version-Release number of selected component (if applicable):
OCP 3.9 being used (v3.9.30)

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
"...OpenShift SDN network process is not (yet?) available..." continually re-occurs.

Expected results:
SDN network to start

Additional info:
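The retry loop in the container command above only fires when something else is already managing OVS on the host. A quick hedged check on the affected node (assuming the RHEL openvswitch package's service name and the default script path):

  # Check whether a host-level OVS is running and competing with the
  # containerized openvswitch container
  systemctl status openvswitch
  /usr/share/openvswitch/scripts/ovs-ctl status   # same check the container script performs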
That is just the kubelet echoing out the command it's about to run; it's not an error. You'll need to debug the SDN the same way you'd debug any other process: look it up with "oc get pods", check its logs with "oc logs", and watch for events with "oc get events".
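For example, a minimal sketch of those steps (the openshift-sdn namespace and the sdn/openvswitch container names are assumptions based on a default 3.10 DaemonSet layout; substitute the real pod name):

  # Find the SDN/OVS daemonset pod running on the affected node
  oc get pods -n openshift-sdn -o wide

  # Read the logs of each container in that pod (<sdn-pod-name> is a placeholder)
  oc logs <sdn-pod-name> -n openshift-sdn -c sdn
  oc logs <sdn-pod-name> -n openshift-sdn -c openvswitch

  # Look for crash/scheduling events in the namespace
  oc get events -n openshift-sdn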
Please confirm whether the default openshift-sdn plugin is being used as the SDN plugin, per the Known Issues when upgrading to OpenShift 3.10 [1]. Currently, using anything other than the default openshift-sdn plugin encounters a known issue that will not be addressed until "mid/late October 2018".

Diagnostic Steps:

  grep os_sdn_network_plugin_name -r * 2>/dev/null
  <file>:os_sdn_network_plugin_name='redhat/openshift-ovs-multitenant'

[1] https://access.redhat.com/solutions/3631141
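As a cross-check against the Ansible inventory, the plugin actually configured on the cluster can be read from the API and from the node configuration (a sketch; the object name and file path assume a default 3.10 layout):

  # pluginName on the ClusterNetwork object, e.g. redhat/openshift-ovs-subnet
  # or redhat/openshift-ovs-multitenant
  oc get clusternetwork default -o yaml | grep pluginName

  # networkPluginName as seen by a node
  grep networkPluginName /etc/origin/node/node-config.yaml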