Bug 1702194 - [sdn-254]Application pod should not be killed after ovs restart
Summary: [sdn-254]Application pod should not be killed after ovs restart
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.1.0
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.1.0
Assignee: Dan Winship
QA Contact: Meng Bo
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-04-23 08:05 UTC by zhaozhanqi
Modified: 2019-06-04 10:47 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:47:50 UTC
Target Upstream Version:
Embargoed:


Attachments
crio_kubelet_loc_logs_previous_sdn logs (21.77 KB, application/gzip)
2019-04-24 02:47 UTC, zhaozhanqi
oc logs for ovs and sdn pod (8.26 KB, application/gzip)
2019-04-24 03:01 UTC, zhaozhanqi


Links
System ID Private Priority Status Summary Last Updated
Github openshift origin pull 22652 0 None None None 2019-04-24 13:38:56 UTC
Red Hat Product Errata RHBA-2019:0758 0 None None None 2019-06-04 10:47:58 UTC

Description zhaozhanqi 2019-04-23 08:05:45 UTC
Description of problem:
When the ovs pod is restarted, the application pod is assigned a new IP and its OVS ports are recreated.

Version-Release number of selected component (if applicable):
4.1.0-0.nightly-2019-04-22-192604

How reproducible:
always

Steps to Reproduce:
1. Create a new project and create the test pod:
  oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/networking/list_for_pods.json
2. Check which node the test pod is running on.
3. Delete the ovs pod on the node where the test pod is running.
4. Check the test pod again after the ovs pod is recreated.
5. Check the logs of the sdn pod.
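
The check in step 4 can be made mechanical by saving the output of `oc get pod <name> -o json` before step 3 and again after the ovs pod is back, then comparing the two. The following is only a sketch: the function names are hypothetical and not part of any OpenShift tooling, and it assumes the standard `metadata.uid` and `status.podIP` fields are present in the saved JSON.

```python
import json


def pod_identity(pod_json):
    """Return (uid, podIP) from the text of `oc get pod <name> -o json`."""
    pod = json.loads(pod_json)
    # status.podIP may be absent while the pod sandbox is being recreated.
    return pod["metadata"]["uid"], pod.get("status", {}).get("podIP")


def describe_change(before_json, after_json):
    """Classify what happened to the pod between the two snapshots."""
    uid_before, ip_before = pod_identity(before_json)
    uid_after, ip_after = pod_identity(after_json)
    if uid_before != uid_after:
        return "pod was replaced"
    if ip_before != ip_after:
        return "pod IP changed"
    return "unchanged"
```

With the behavior reported here, the expectation would be "pod IP changed" before the fix and "unchanged" after it.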

Actual results:
 The test pod's IP is reassigned after the ovs pod is recreated.
 
 Step 5 logs:

  $ oc logs sdn-7xt8g -n openshift-sdn | grep node.go
I0423 06:27:00.453236   96643 node.go:148] Initializing SDN node of type "redhat/openshift-ovs-networkpolicy" with configured hostname "ip-172-31-143-62.eu-central-1.compute.internal" (IP ""), iptables sync period "30s"
I0423 06:27:00.464969   96643 node.go:267] Starting openshift-sdn network plugin
I0423 06:27:01.124169   96643 node.go:326] Starting openshift-sdn pod manager
I0423 06:27:01.128289   96643 node.go:334] OVS bridge has been recreated. Will reattach 12 existing pods...
I0423 06:27:01.137441   96643 node.go:389] Interface vethba581459 for pod 'openshift-ingress/router-default-bf9fcfb87-9zdfp' no longer exists
I0423 06:27:01.145792   96643 node.go:389] Interface veth5df0805e for pod 'openshift-monitoring/prometheus-k8s-0' no longer exists
I0423 06:27:01.152091   96643 node.go:389] Interface vethe86f2bf7 for pod 'openshift-monitoring/prometheus-adapter-54fd64868b-59xx2' no longer exists
I0423 06:27:01.160647   96643 node.go:389] Interface vetha5a4b482 for pod 'openshift-marketplace/redhat-operators-5cb954dc98-q2z27' no longer exists
I0423 06:27:01.165979   96643 node.go:389] Interface veth6150ab23 for pod 'openshift-marketplace/community-operators-544d88b5c4-qcr62' no longer exists
I0423 06:27:01.170942   96643 node.go:389] Interface veth3ca85a6d for pod 'openshift-monitoring/alertmanager-main-2' no longer exists
I0423 06:27:01.175796   96643 node.go:389] Interface veth037815a3 for pod 'openshift-dns/dns-default-78rsw' no longer exists
I0423 06:27:01.180541   96643 node.go:389] Interface veth490e9e59 for pod 'openshift-ssh-bastion/ssh-bastion-f4d5bbcbd-799lp' no longer exists
I0423 06:27:01.185259   96643 node.go:389] Interface vethfbf11a61 for pod 'openshift-image-registry/node-ca-k6qhv' no longer exists
I0423 06:27:01.190231   96643 node.go:389] Interface veth13354f47 for pod 'z1/test-rc-qfwjv' no longer exists
I0423 06:27:01.195061   96643 node.go:389] Interface veth69098196 for pod 'openshift-image-registry/image-registry-57b5d46fb-tjs66' no longer exists
I0423 06:27:01.200207   96643 node.go:389] Interface vethf811a976 for pod 'openshift-monitoring/alertmanager-main-0' no longer exists
I0423 06:27:01.200237   96643 node.go:350] openshift-sdn network plugin registering startup
I0423 06:27:01.200389   96643 node.go:368] openshift-sdn network plugin ready
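
To see at a glance which pods were affected, the "no longer exists" lines above can be extracted from the sdn log. This is a minimal sketch, assuming the message format shown in this report (it may differ in other openshift-sdn versions); the function name is hypothetical.

```python
import re

# Matches lines such as:
#   ... node.go:389] Interface veth13354f47 for pod 'z1/test-rc-qfwjv' no longer exists
MISSING_IFACE = re.compile(
    r"Interface (?P<iface>\S+) for pod '(?P<ns>[^/']+)/(?P<pod>[^']+)' no longer exists"
)


def pods_with_missing_interfaces(log_text):
    """Return (namespace, pod, interface) for each 'no longer exists' line."""
    found = []
    for line in log_text.splitlines():
        match = MISSING_IFACE.search(line)
        if match:
            found.append((match.group("ns"), match.group("pod"), match.group("iface")))
    return found
```

Feeding it the saved output of `oc logs sdn-7xt8g -n openshift-sdn` should list the test pod `z1/test-rc-qfwjv` among the pods whose veth interface disappeared.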

Expected results:

test pod should be killed after the ovs pod is restarted.

Additional info:

Comment 1 Casey Callendrello 2019-04-23 08:57:29 UTC
Dan (Winship), can you take a look?

Comment 2 zhaozhanqi 2019-04-23 09:10:30 UTC
Sorry for the typo in the Expected results. It should read:

test pod should NOT be killed after the ovs pod is restarted.

Comment 3 Dan Winship 2019-04-23 12:08:21 UTC
Can you attach the kubelet and crio logs from that node?

Also, the "oc logs" and "oc logs --previous" for openvswitch on that node.

Actually, also the "oc logs --previous" for sdn on that node

Comment 4 zhaozhanqi 2019-04-24 02:47:42 UTC
Created attachment 1557938
crio_kubelet_loc_logs_previous_sdn logs

Comment 5 zhaozhanqi 2019-04-24 03:01:34 UTC
Created attachment 1557950
oc logs for ovs and sdn pod

Comment 6 zhaozhanqi 2019-04-24 03:04:58 UTC
hi, Dan Winship

I added the following logs in the attachment:
oc logs --previous sdn-xxx
oc logs sdn-xxx
oc logs ovs-xxx
journalctl -u kubelet
journalctl -u crio

please let me know if you need more.

Comment 8 zhaozhanqi 2019-04-25 06:05:17 UTC
Verified this bug on 4.1.0-0.nightly-2019-04-25-002910

Comment 10 errata-xmlrpc 2019-06-04 10:47:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

