Bug 1701806

Summary: sdn pod enters CrashLoopBackOff after running for some time
Product: OpenShift Container Platform
Reporter: Wang Haoran <haowang>
Component: Networking
Assignee: Casey Callendrello <cdc>
Status: CLOSED ERRATA
QA Contact: Meng Bo <bmeng>
Severity: medium
Priority: unspecified
Version: 3.11.0
CC: aos-bugs, bbennett, weliang
Target Release: 3.11.z
Hardware: Unspecified
OS: Unspecified
Last Closed: 2019-06-26 09:08:06 UTC
Type: Bug

Description Wang Haoran 2019-04-22 03:00:38 UTC
Description of problem:

The sdn pod goes into CrashLoopBackOff after it has been running for some time.

Version-Release number of selected component (if applicable):

openshift v3.11.43

How reproducible:

Sometimes (intermittent)

Additional info:

1. sdn pod logs:
oc logs sdn-rlpxh -n openshift-sdn
2019/04/22 02:40:38 socat[26065] E connect(5, AF=1 "/var/run/openshift-sdn/cni-server.sock", 40): Connection refused
User "sa" set.
Context "default-context" modified.
which: no openshift-sdn in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin)
I0422 02:40:39.281073   26040 start_network.go:200] Reading node configuration from /etc/origin/node/node-config.yaml
I0422 02:40:39.284989   26040 start_network.go:207] Starting node networking ip-10-107-58-0.sa-east-1.compute.internal (v3.11.43)
W0422 02:40:39.285157   26040 server.go:195] WARNING: all flags other than --config, --write-config-to, and --cleanup are deprecated. Please begin using a config file ASAP.
I0422 02:40:39.285197   26040 feature_gate.go:230] feature gates: &{map[]}
I0422 02:40:39.286445   26040 transport.go:160] Refreshing client certificate from store
I0422 02:40:39.286472   26040 certificate_store.go:131] Loading cert/key pair from "/etc/origin/node/certificates/kubelet-client-current.pem".
I0422 02:40:39.301211   26040 node.go:147] Initializing SDN node of type "redhat/openshift-ovs-multitenant" with configured hostname "ip-10-107-58-0.sa-east-1.compute.internal" (IP ""), iptables sync period "30s"
I0422 02:40:39.311230   26040 node.go:289] Starting openshift-sdn network plugin
I0422 02:40:39.380773   26040 ovs.go:166] Error executing ovs-ofctl: ovs-ofctl: /var/run/openvswitch/br0.mgmt: failed to open socket (Connection refused)
I0422 02:40:39.380793   26040 sdn_controller.go:139] [SDN setup] full SDN setup required (plugin is not setup)
I0422 02:41:09.401652   26040 ovs.go:166] Error executing ovs-vsctl: 2019-04-22T02:41:09Z|00002|fatal_signal|WARN|terminating with signal 14 (Alarm clock)
F0422 02:41:09.401700   26040 network.go:46] SDN node startup failed: node SDN setup failed: signal: alarm clock

2. ovs pod logs:
oc logs ovs-26gks -n openshift-sdn
Starting ovsdb-server [  OK  ]
Configuring Open vSwitch system IDs [  OK  ]
Starting ovs-vswitchd [  OK  ]
Enabling remote OVSDB managers [  OK  ]

3. ovs logs inside the pod:


sh-4.2# tail -100 /var/log/openvswitch/ovs-vswitchd.log
2019-04-22T00:55:03.942Z|15644|bridge|INFO|bridge br0: added interface veth6b1dc255 on port 2093
2019-04-22T00:55:04.042Z|15645|connmgr|INFO|br0<->unix#38236: 4 flow_mods in the last 0 s (4 adds)
2019-04-22T00:55:04.064Z|15646|connmgr|INFO|br0<->unix#38238: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T00:55:27.630Z|15647|connmgr|INFO|br0<->unix#38243: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T00:55:27.643Z|15648|connmgr|INFO|br0<->unix#38245: 4 flow_mods in the last 0 s (4 deletes)
2019-04-22T00:55:27.663Z|15649|bridge|INFO|bridge br0: deleted interface veth6b1dc255 on port 2093
2019-04-22T01:00:09.258Z|15650|bridge|INFO|bridge br0: added interface veth51bb6ff5 on port 2094
2019-04-22T01:00:09.344Z|15651|connmgr|INFO|br0<->unix#38267: 4 flow_mods in the last 0 s (4 adds)
2019-04-22T01:00:09.374Z|15652|connmgr|INFO|br0<->unix#38269: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:00:09.539Z|15653|bridge|INFO|bridge br0: added interface veth4afab9d1 on port 2095
2019-04-22T01:00:09.648Z|15654|connmgr|INFO|br0<->unix#38271: 4 flow_mods in the last 0 s (4 adds)
2019-04-22T01:00:09.671Z|15655|connmgr|INFO|br0<->unix#38273: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:00:10.136Z|15656|bridge|INFO|bridge br0: added interface veth24517e72 on port 2096
2019-04-22T01:00:10.147Z|15657|connmgr|INFO|br0<->unix#38275: 4 flow_mods in the last 0 s (4 adds)
2019-04-22T01:00:10.246Z|15658|connmgr|INFO|br0<->unix#38277: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:00:15.197Z|15659|connmgr|INFO|br0<->unix#38281: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:00:15.218Z|15660|connmgr|INFO|br0<->unix#38283: 4 flow_mods in the last 0 s (4 deletes)
2019-04-22T01:00:15.251Z|15661|bridge|INFO|bridge br0: deleted interface veth51bb6ff5 on port 2094
2019-04-22T01:00:20.606Z|15662|connmgr|INFO|br0<->unix#38285: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:00:20.630Z|15663|connmgr|INFO|br0<->unix#38287: 4 flow_mods in the last 0 s (4 deletes)
2019-04-22T01:00:20.681Z|15664|bridge|INFO|bridge br0: deleted interface veth4afab9d1 on port 2095
2019-04-22T01:00:26.538Z|15665|connmgr|INFO|br0<->unix#38289: 3 flow_mods in the last 0 s (3 adds)
2019-04-22T01:00:48.361Z|15666|connmgr|INFO|br0<->unix#38293: 1 flow_mods in the last 0 s (1 deletes)
2019-04-22T01:01:32.065Z|15667|connmgr|INFO|br0<->unix#38298: 2 flow_mods in the last 0 s (2 adds)
2019-04-22T01:02:10.325Z|15668|connmgr|INFO|br0<->unix#38302: 1 flow_mods in the last 0 s (1 deletes)
2019-04-22T01:05:06.141Z|15669|bridge|INFO|bridge br0: added interface veth63c7081a on port 2097
2019-04-22T01:05:06.241Z|15670|connmgr|INFO|br0<->unix#38317: 4 flow_mods in the last 0 s (4 adds)
2019-04-22T01:05:06.278Z|15671|connmgr|INFO|br0<->unix#38319: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:06:08.619Z|15672|connmgr|INFO|br0<->unix#38326: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:06:08.631Z|15673|connmgr|INFO|br0<->unix#38328: 4 flow_mods in the last 0 s (4 deletes)
2019-04-22T01:06:08.654Z|15674|bridge|INFO|bridge br0: deleted interface veth63c7081a on port 2097
2019-04-22T01:07:19.965Z|15675|connmgr|INFO|br0<->unix#38337: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:07:19.978Z|15676|connmgr|INFO|br0<->unix#38339: 4 flow_mods in the last 0 s (4 deletes)
2019-04-22T01:07:19.999Z|15677|bridge|INFO|bridge br0: deleted interface veth24517e72 on port 2096
2019-04-22T01:10:10.944Z|15678|bridge|INFO|bridge br0: added interface veth258d8676 on port 2098
2019-04-22T01:10:11.058Z|15679|connmgr|INFO|br0<->unix#38352: 4 flow_mods in the last 0 s (4 adds)
2019-04-22T01:10:11.093Z|15680|connmgr|INFO|br0<->unix#38354: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:10:11.456Z|15681|bridge|INFO|bridge br0: added interface veth5545143c on port 2099
2019-04-22T01:10:11.555Z|15682|connmgr|INFO|br0<->unix#38356: 4 flow_mods in the last 0 s (4 adds)
2019-04-22T01:10:11.582Z|15683|connmgr|INFO|br0<->unix#38358: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:10:11.839Z|15684|bridge|INFO|bridge br0: added interface veth3d21a202 on port 2100
2019-04-22T01:10:11.944Z|15685|connmgr|INFO|br0<->unix#38360: 4 flow_mods in the last 0 s (4 adds)
2019-04-22T01:10:11.964Z|15686|connmgr|INFO|br0<->unix#38362: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:10:12.037Z|15687|bridge|INFO|bridge br0: added interface veth6f500ede on port 2101
2019-04-22T01:10:12.050Z|15688|connmgr|INFO|br0<->unix#38364: 4 flow_mods in the last 0 s (4 adds)
2019-04-22T01:10:12.143Z|15689|connmgr|INFO|br0<->unix#38366: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:10:18.704Z|15690|connmgr|INFO|br0<->unix#38370: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:10:18.745Z|15691|connmgr|INFO|br0<->unix#38372: 4 flow_mods in the last 0 s (4 deletes)
2019-04-22T01:10:18.788Z|15692|bridge|INFO|bridge br0: deleted interface veth258d8676 on port 2098
2019-04-22T01:10:19.761Z|15693|connmgr|INFO|br0<->unix#38374: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:10:19.782Z|15694|connmgr|INFO|br0<->unix#38376: 4 flow_mods in the last 0 s (4 deletes)
2019-04-22T01:10:19.815Z|15695|bridge|INFO|bridge br0: deleted interface veth5545143c on port 2099
2019-04-22T01:10:19.839Z|15696|connmgr|INFO|br0<->unix#38378: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:10:19.867Z|15697|connmgr|INFO|br0<->unix#38380: 4 flow_mods in the last 0 s (4 deletes)
2019-04-22T01:10:19.905Z|15698|bridge|INFO|bridge br0: deleted interface veth3d21a202 on port 2100
2019-04-22T01:10:25.342Z|15699|connmgr|INFO|br0<->unix#38382: 3 flow_mods in the last 0 s (3 adds)
2019-04-22T01:10:47.290Z|15700|connmgr|INFO|br0<->unix#38386: 1 flow_mods in the last 0 s (1 deletes)
2019-04-22T01:16:52.076Z|15701|connmgr|INFO|br0<->unix#38415: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:16:52.091Z|15702|connmgr|INFO|br0<->unix#38417: 4 flow_mods in the last 0 s (4 deletes)
2019-04-22T01:16:52.113Z|15703|bridge|INFO|bridge br0: deleted interface veth6f500ede on port 2101
2019-04-22T01:20:03.343Z|15704|bridge|INFO|bridge br0: added interface veth23d1afcb on port 2102
2019-04-22T01:20:03.436Z|15705|connmgr|INFO|br0<->unix#38433: 4 flow_mods in the last 0 s (4 adds)
2019-04-22T01:20:03.455Z|15706|connmgr|INFO|br0<->unix#38435: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:20:03.478Z|15707|bridge|INFO|bridge br0: added interface vethf792a655 on port 2103
2019-04-22T01:20:03.538Z|15708|connmgr|INFO|br0<->unix#38437: 4 flow_mods in the last 0 s (4 adds)
2019-04-22T01:20:03.561Z|15709|connmgr|INFO|br0<->unix#38439: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:20:04.839Z|15710|bridge|INFO|bridge br0: added interface veth7b1020b5 on port 2104
2019-04-22T01:20:04.936Z|15711|connmgr|INFO|br0<->unix#38441: 4 flow_mods in the last 0 s (4 adds)
2019-04-22T01:20:04.961Z|15712|connmgr|INFO|br0<->unix#38443: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:20:05.442Z|15713|bridge|INFO|bridge br0: added interface veth848802cf on port 2105
2019-04-22T01:20:05.545Z|15714|connmgr|INFO|br0<->unix#38445: 4 flow_mods in the last 0 s (4 adds)
2019-04-22T01:20:05.586Z|15715|connmgr|INFO|br0<->unix#38447: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:20:08.200Z|15716|connmgr|INFO|br0<->unix#38449: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:20:08.216Z|15717|connmgr|INFO|br0<->unix#38451: 4 flow_mods in the last 0 s (4 deletes)
2019-04-22T01:20:08.238Z|15718|bridge|INFO|bridge br0: deleted interface vethf792a655 on port 2103
2019-04-22T01:20:15.783Z|15719|connmgr|INFO|br0<->unix#38455: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:20:15.808Z|15720|connmgr|INFO|br0<->unix#38457: 4 flow_mods in the last 0 s (4 deletes)
2019-04-22T01:20:15.864Z|15721|bridge|INFO|bridge br0: deleted interface veth23d1afcb on port 2102
2019-04-22T01:20:18.002Z|15722|connmgr|INFO|br0<->unix#38459: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:20:18.019Z|15723|connmgr|INFO|br0<->unix#38461: 4 flow_mods in the last 0 s (4 deletes)
2019-04-22T01:20:18.043Z|15724|bridge|INFO|bridge br0: deleted interface veth7b1020b5 on port 2104
2019-04-22T01:20:24.429Z|15725|connmgr|INFO|br0<->unix#38463: 3 flow_mods in the last 0 s (3 adds)
2019-04-22T01:20:46.015Z|15726|connmgr|INFO|br0<->unix#38467: 1 flow_mods in the last 0 s (1 deletes)
2019-04-22T01:20:47.073Z|15727|bridge|INFO|bridge br0: added interface veth9d63c9be on port 2106
2019-04-22T01:20:47.151Z|15728|connmgr|INFO|br0<->unix#38469: 4 flow_mods in the last 0 s (4 adds)
2019-04-22T01:20:47.237Z|15729|bridge|INFO|bridge br0: deleted interface veth9d63c9be on port 2106
2019-04-22T01:20:47.242Z|15730|bridge|WARN|could not open network device veth9d63c9be (No such device)
2019-04-22T01:20:47.248Z|15731|connmgr|INFO|br0<->unix#38471: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:20:47.266Z|15732|connmgr|INFO|br0<->unix#38473: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:20:47.336Z|15733|connmgr|INFO|br0<->unix#38475: 4 flow_mods in the last 0 s (4 deletes)
2019-04-22T01:30:06.245Z|15734|bridge|INFO|bridge br0: added interface veth691c72c0 on port 2107
2019-04-22T01:30:06.337Z|15735|connmgr|INFO|br0<->unix#38519: 4 flow_mods in the last 0 s (4 adds)
2019-04-22T01:30:06.364Z|15736|connmgr|INFO|br0<->unix#38521: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:30:06.644Z|15737|bridge|INFO|bridge br0: added interface veth8205971f on port 2108
2019-04-22T01:30:06.739Z|15738|connmgr|INFO|br0<->unix#38523: 4 flow_mods in the last 0 s (4 adds)
2019-04-22T01:30:06.769Z|15739|connmgr|INFO|br0<->unix#38525: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:30:06.846Z|15740|bridge|INFO|bridge br0: added interface vethb3f6d779 on port 2109
2019-04-22T01:30:06.949Z|15741|connmgr|INFO|br0<->unix#38527: 4 flow_mods in the last 0 s (4 adds)
2019-04-22T01:30:06.970Z|15742|connmgr|INFO|br0<->unix#38529: 2 flow_mods in the last 0 s (2 deletes)
2019-04-22T01:30:07.167Z|00002|daemon_unix(monitor)|INFO|pid 26778 died, killed (Killed), exiting

sh-4.2# tail -100 /var/log/openvswitch/ovsdb-server.log
2019-04-19T03:26:40.249Z|00001|vlog|INFO|opened log file /var/log/openvswitch/ovsdb-server.log
2019-04-19T03:26:42.334Z|00002|ovsdb_server|INFO|ovsdb-server (Open vSwitch) 2.9.0
2019-04-19T03:26:50.365Z|00003|memory|INFO|3092 kB peak resident set size after 10.1 seconds
2019-04-19T03:26:50.365Z|00004|memory|INFO|cells:453 json-caches:1 monitors:1 sessions:2
2019-04-19T20:40:09.746Z|00005|ovsdb_file|INFO|/etc/openvswitch/conf.db: compacting database online (1545382217.129 seconds old, 10274 transactions)
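
The ovs-vswitchd.log lines above appear to follow a pipe-separated layout (timestamp, sequence number, module, level, message), and the last record shows the monitor reporting that pid 26778 was killed. A minimal sketch, assuming that layout holds for the lines of interest, for pulling such "died" records out of a log when triaging a similar failure. The names VlogRecord, parse_vlog_line and find_daemon_deaths are hypothetical helpers written for this illustration; they are not part of Open vSwitch or OpenShift.

```python
import re
from datetime import datetime, timezone
from typing import Iterable, List, NamedTuple, Optional


class VlogRecord(NamedTuple):
    """One parsed log line (hypothetical helper type)."""
    timestamp: datetime
    seq: int
    module: str
    level: str
    message: str


def parse_vlog_line(line: str) -> Optional[VlogRecord]:
    """Parse 'timestamp|seq|module|level|message'; return None if it does not fit."""
    # Split at most 4 times so that any '|' inside the message is preserved.
    parts = line.strip().split("|", 4)
    if len(parts) != 5:
        return None
    ts, seq, module, level, message = parts
    try:
        when = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")
        number = int(seq)
    except ValueError:
        # Shell prompts, klog lines, etc. do not start with this timestamp form.
        return None
    return VlogRecord(when.replace(tzinfo=timezone.utc), number, module, level, message)


# Assumption: the monitor message looks like the one in this report,
# e.g. "pid 26778 died, killed (Killed), exiting".
_DIED = re.compile(r"pid (\d+) died, (.+), exiting")


def find_daemon_deaths(lines: Iterable[str]) -> List[VlogRecord]:
    """Return records where a daemon_unix module reports that a pid died."""
    deaths = []
    for line in lines:
        record = parse_vlog_line(line)
        if record is None:
            continue
        if record.module.startswith("daemon_unix") and _DIED.search(record.message):
            deaths.append(record)
    return deaths
```

Fed the ovs-vswitchd.log tail from this report, find_daemon_deaths should pick out the single daemon_unix(monitor) record logged at 2019-04-22T01:30:07.167Z, which precedes the sdn pod's "failed to open socket (Connection refused)" error on br0.mgmt by roughly 70 minutes.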

Comment 1 Wang Haoran 2019-04-22 03:02:11 UTC
Workaround: delete the ovs and sdn pods and let them be recreated.

Comment 4 Weibin Liang 2019-06-12 19:26:12 UTC
Installed a new cluster using v3.11.117 and left it running for several hours; the sdn pod did not go into CrashLoopBackOff.

Comment 6 errata-xmlrpc 2019-06-26 09:08:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:1605