1702194 – [sdn-254]Application pod should not be killed after ovs restart

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	4.1.0
Hardware:	All
OS:	All
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	4.1.0
Assignee:	Dan Winship
QA Contact:	Meng Bo
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2019-04-23 08:05 UTC by zhaozhanqi
Modified:	2019-06-04 10:47 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:	2019-06-04 10:47:50 UTC
Target Upstream Version:
Embargoed:

Attachments
crio_kubelet_loc_logs_previous_sdn logs (21.77 KB, application/gzip) 2019-04-24 02:47 UTC, zhaozhanqi
oc logs for ovs and sdn pod (8.26 KB, application/gzip) 2019-04-24 03:01 UTC, zhaozhanqi

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift origin pull 22652	0	None	None	None	2019-04-24 13:38:56 UTC
Red Hat Product Errata	RHBA-2019:0758	0	None	None	None	2019-06-04 10:47:58 UTC

Description zhaozhanqi 2019-04-23 08:05:45 UTC

Description of problem:
When the ovs pod is restarted, the application pod is also assigned a new IP and its OVS ports are recreated.

Version-Release number of selected component (if applicable):
4.1.0-0.nightly-2019-04-22-192604

How reproducible:
always

Steps to Reproduce:
1. Create a new project and the test pod
  oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/networking/list_for_pods.json
2. Check which node the test pod is located on
3. Delete the ovs pod on the node where the test pod is located
4. Check the test pod again after the ovs pod is recreated
5. Check the logs of the sdn pod
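
Steps 2 and 4 come down to comparing the test pod's node and IP before and after the ovs pod is deleted. The following is a minimal sketch of that comparison, not part of the original report. It assumes a logged-in `oc` client on the PATH, and the helper names (`parse_pod_state`, `get_pod_state`, `ip_changed`) are hypothetical. It reads the standard `spec.nodeName` and `status.podIP` fields from `oc get pod -o json`.

```python
import json
import subprocess


def parse_pod_state(pod_json: str) -> dict:
    """Extract the node name and pod IP from `oc get pod <name> -o json` output."""
    pod = json.loads(pod_json)
    return {
        "node": pod.get("spec", {}).get("nodeName"),
        "ip": pod.get("status", {}).get("podIP"),
    }


def get_pod_state(name: str, namespace: str) -> dict:
    """Query the cluster for one pod (assumes a logged-in `oc` client)."""
    out = subprocess.run(
        ["oc", "get", "pod", name, "-n", namespace, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_pod_state(out)


def ip_changed(before: dict, after: dict) -> bool:
    """True if the pod stayed on the same node but now reports a different IP."""
    return before["node"] == after["node"] and before["ip"] != after["ip"]
```

Calling `get_pod_state` once before step 3 and once after step 4, then passing both results to `ip_changed`, would flag the behavior described in this report.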

Actual results:
 The test pod IP is reassigned after the ovs pod is recreated.
 
 Step 5 logs:

  $ oc logs sdn-7xt8g -n openshift-sdn | grep node.go
I0423 06:27:00.453236   96643 node.go:148] Initializing SDN node of type "redhat/openshift-ovs-networkpolicy" with configured hostname "ip-172-31-143-62.eu-central-1.compute.internal" (IP ""), iptables sync period "30s"
I0423 06:27:00.464969   96643 node.go:267] Starting openshift-sdn network plugin
I0423 06:27:01.124169   96643 node.go:326] Starting openshift-sdn pod manager
I0423 06:27:01.128289   96643 node.go:334] OVS bridge has been recreated. Will reattach 12 existing pods...
I0423 06:27:01.137441   96643 node.go:389] Interface vethba581459 for pod 'openshift-ingress/router-default-bf9fcfb87-9zdfp' no longer exists
I0423 06:27:01.145792   96643 node.go:389] Interface veth5df0805e for pod 'openshift-monitoring/prometheus-k8s-0' no longer exists
I0423 06:27:01.152091   96643 node.go:389] Interface vethe86f2bf7 for pod 'openshift-monitoring/prometheus-adapter-54fd64868b-59xx2' no longer exists
I0423 06:27:01.160647   96643 node.go:389] Interface vetha5a4b482 for pod 'openshift-marketplace/redhat-operators-5cb954dc98-q2z27' no longer exists
I0423 06:27:01.165979   96643 node.go:389] Interface veth6150ab23 for pod 'openshift-marketplace/community-operators-544d88b5c4-qcr62' no longer exists
I0423 06:27:01.170942   96643 node.go:389] Interface veth3ca85a6d for pod 'openshift-monitoring/alertmanager-main-2' no longer exists
I0423 06:27:01.175796   96643 node.go:389] Interface veth037815a3 for pod 'openshift-dns/dns-default-78rsw' no longer exists
I0423 06:27:01.180541   96643 node.go:389] Interface veth490e9e59 for pod 'openshift-ssh-bastion/ssh-bastion-f4d5bbcbd-799lp' no longer exists
I0423 06:27:01.185259   96643 node.go:389] Interface vethfbf11a61 for pod 'openshift-image-registry/node-ca-k6qhv' no longer exists
I0423 06:27:01.190231   96643 node.go:389] Interface veth13354f47 for pod 'z1/test-rc-qfwjv' no longer exists
I0423 06:27:01.195061   96643 node.go:389] Interface veth69098196 for pod 'openshift-image-registry/image-registry-57b5d46fb-tjs66' no longer exists
I0423 06:27:01.200207   96643 node.go:389] Interface vethf811a976 for pod 'openshift-monitoring/alertmanager-main-0' no longer exists
I0423 06:27:01.200237   96643 node.go:350] openshift-sdn network plugin registering startup
I0423 06:27:01.200389   96643 node.go:368] openshift-sdn network plugin ready
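
The log above can be summarized mechanically. The following is a rough sketch, not part of the original report, that compares the number of pods the SDN says it will reattach with the pods whose interface it reports as gone. The function name `summarize_reattach` is hypothetical, and the regular expressions assume the exact message wording shown in this log.

```python
import re

# Patterns match the message text seen in the log above (an assumption:
# other openshift-sdn versions may word these messages differently).
REATTACH_RE = re.compile(r"Will reattach (\d+) existing pods")
MISSING_RE = re.compile(r"Interface (\S+) for pod '([^']+)' no longer exists")


def summarize_reattach(log_text: str) -> dict:
    """Report how many pods were to be reattached and which ones lost their veth."""
    expected = None
    missing = []
    for line in log_text.splitlines():
        m = REATTACH_RE.search(line)
        if m:
            expected = int(m.group(1))
            continue
        m = MISSING_RE.search(line)
        if m:
            # Store as (namespace/pod, veth interface name).
            missing.append((m.group(2), m.group(1)))
    return {"expected": expected, "missing": missing}
```

Applied to the log above, this reports 12 pods expected and 12 entries in `missing`, including `z1/test-rc-qfwjv`, so none of the existing pods kept its interface.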

Expected results:

test pod should be killed after the ovs pod is restarted.

Additional info:

Comment 1 Casey Callendrello 2019-04-23 08:57:29 UTC

Dan (Winship), can you take a look?

Comment 2 zhaozhanqi 2019-04-23 09:10:30 UTC

Sorry for the typo in the Expected results:

test pod should NOT be killed after the ovs pod is restarted.

Comment 3 Dan Winship 2019-04-23 12:08:21 UTC

Can you attach the kubelet and crio logs from that node?

Also, the "oc logs" and "oc logs --previous" for openvswitch on that node.

Actually, also the "oc logs --previous" for sdn on that node

Comment 4 zhaozhanqi 2019-04-24 02:47:42 UTC

Created attachment 1557938 [details]
crio_kubelet_loc_logs_previous_sdn logs

Comment 5 zhaozhanqi 2019-04-24 03:01:34 UTC

Created attachment 1557950 [details]
oc logs for ovs and sdn pod

Comment 6 zhaozhanqi 2019-04-24 03:04:58 UTC

Hi Dan Winship,

I added the following logs in the attachment:
oc logs --previous sdn-xxx
oc logs sdn-xxx
oc logs ovs-xxx
journalctl -u kubelet
journalctl -u crio

Please let me know if you need more.

Comment 8 zhaozhanqi 2019-04-25 06:05:17 UTC

Verified this bug on 4.1.0-0.nightly-2019-04-25-002910

Comment 10 errata-xmlrpc 2019-06-04 10:47:50 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758
