Bug 1272435

Summary: Cannot reach the pod from other nodes in the same cluster with openshift-ovs-subnet plugin
Product: OKD
Component: Networking
Version: 3.x
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Reporter: Meng Bo <bmeng>
Assignee: Dan Winship <danw>
QA Contact: Meng Bo <bmeng>
Docs Contact:
CC: aos-bugs
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-23 21:15:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Meng Bo 2015-10-16 12:01:48 UTC
Description of problem:
With the openshift-ovs-subnet plugin, try to ping a pod from every node in the cluster.
Only the node hosting the pod can reach it; none of the other nodes can.

In the other direction, the pod can reach all the nodes.

Version-Release number of selected component (if applicable):
# openshift version
openshift v1.0.6-644-ga034e2f
kubernetes v1.1.0-alpha.1-653-g86b4e77


How reproducible:
always

Steps to Reproduce:
1. Set up a multi-node environment with 1 master and 3 nodes
2. Create a user and a project
3. Create a pod as that user in the user's project
4. Try to ping the node IPs from inside the pod
5. Try to ping the pod IP from each of the nodes
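
For repeated checks, step 5 can be scripted. The following is only a minimal sketch, assuming it is run on each node in turn and that a Linux `ping` supporting `-c` (probe count) and `-W` (reply timeout in seconds) is installed. The function names `ping_command` and `is_reachable` are hypothetical and not part of OpenShift.

```python
import subprocess


def ping_command(ip, count=1, timeout_s=2):
    """Build a Linux ping command line (hypothetical helper).

    -c sets the number of probes, -W the seconds to wait for a reply.
    """
    return ["ping", "-c", str(count), "-W", str(timeout_s), ip]


def is_reachable(ip):
    """Return True if ping exits with status 0, i.e. a reply was received."""
    result = subprocess.run(
        ping_command(ip),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Calling `is_reachable("10.1.0.2")` on each node would reproduce step 5; the address 10.1.0.2 is an assumption here, inferred from the node1 flow dump in this report.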

Actual results:
4. All the nodes are accessible from the pod
5. Only the node hosting the pod can reach the pod

Expected results:
5. All the nodes in the cluster should be able to reach the pod.

Additional info:
Flow dump from the node hosting the pod:
[root@node1 ~]# ovs-ofctl dump-flows br0 -O openflow13
OFPST_FLOW reply (OF1.3) (xid=0x2):
 cookie=0x0, duration=1116.113s, table=0, n_packets=22, n_bytes=1816, priority=50 actions=output:2
 cookie=0x3, duration=1109.883s, table=0, n_packets=14, n_bytes=588, priority=100,arp,arp_tpa=10.1.0.2 actions=output:3
 cookie=0x3, duration=1109.885s, table=0, n_packets=32, n_bytes=2693, priority=100,ip,nw_dst=10.1.0.2 actions=output:3
 cookie=0x0, duration=1116.106s, table=0, n_packets=8, n_bytes=336, priority=100,arp,arp_tpa=10.1.0.1 actions=output:2
 cookie=0x0, duration=1116.101s, table=0, n_packets=6, n_bytes=599, priority=100,ip,nw_dst=10.1.0.1 actions=output:2
 cookie=0xa428039, duration=1115.567s, table=0, n_packets=3, n_bytes=126, priority=100,arp,arp_tpa=10.1.2.0/24 actions=set_field:10.66.128.57->tun_dst,output:1
 cookie=0xa428039, duration=1115.569s, table=0, n_packets=12, n_bytes=888, priority=100,ip,nw_dst=10.1.2.0/24 actions=set_field:10.66.128.57->tun_dst,output:1
 cookie=0xa42803d, duration=1115.572s, table=0, n_packets=3, n_bytes=126, priority=100,arp,arp_tpa=10.1.1.0/24 actions=set_field:10.66.128.61->tun_dst,output:1
 cookie=0xa42803d, duration=1115.575s, table=0, n_packets=14, n_bytes=1084, priority=100,ip,nw_dst=10.1.1.0/24 actions=set_field:10.66.128.61->tun_dst,output:1
 cookie=0xa42803c, duration=1115.580s, table=0, n_packets=0, n_bytes=0, priority=75,ip,nw_dst=10.1.0.0/24 actions=output:9
 cookie=0xa42803c, duration=1115.578s, table=0, n_packets=0, n_bytes=0, priority=75,arp,arp_tpa=10.1.0.0/24 actions=output:9


Flow dump from one of the other nodes:
[root@node2 ~]# ovs-ofctl dump-flows br0 -O openflow13 
OFPST_FLOW reply (OF1.3) (xid=0x2):
 cookie=0x0, duration=1157.090s, table=0, n_packets=7, n_bytes=510, priority=50 actions=output:2
 cookie=0x0, duration=1157.088s, table=0, n_packets=5, n_bytes=210, priority=100,arp,arp_tpa=10.1.1.1 actions=output:2
 cookie=0x0, duration=1157.086s, table=0, n_packets=14, n_bytes=1084, priority=100,ip,nw_dst=10.1.1.1 actions=output:2
 cookie=0xa428039, duration=1156.563s, table=0, n_packets=0, n_bytes=0, priority=100,arp,arp_tpa=10.1.2.0/24 actions=set_field:10.66.128.57->tun_dst,output:1
 cookie=0xa428039, duration=1156.566s, table=0, n_packets=0, n_bytes=0, priority=100,ip,nw_dst=10.1.2.0/24 actions=set_field:10.66.128.57->tun_dst,output:1
 cookie=0xa42803d, duration=1156.568s, table=0, n_packets=0, n_bytes=0, priority=75,arp,arp_tpa=10.1.1.0/24 actions=output:9
 cookie=0xa42803d, duration=1156.571s, table=0, n_packets=0, n_bytes=0, priority=75,ip,nw_dst=10.1.1.0/24 actions=output:9
 cookie=0xa42803c, duration=1156.576s, table=0, n_packets=9, n_bytes=714, priority=100,ip,nw_dst=10.1.0.0/24 actions=set_field:10.66.128.60->tun_dst,output:1
 cookie=0xa42803c, duration=1156.573s, table=0, n_packets=3, n_bytes=126, priority=100,arp,arp_tpa=10.1.0.0/24 actions=set_field:10.66.128.60->tun_dst,output:1
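
To read dumps like these more quickly, it can help to ask which rule a packet to a given address would hit on each node. The sketch below is a deliberately simplified reading aid, not a reimplementation of Open vSwitch matching: it handles only the fields that appear in the dumps above (`priority`, `ip`/`arp`, `nw_dst`, `arp_tpa`), picks the highest-priority matching rule, and does not resolve ties. The function names and the choice of 10.1.0.2 as the pod address are assumptions, and the result does not by itself identify the root cause.

```python
import ipaddress

# A few flows copied from the dumps above (leading counters kept as-is).
NODE1_FLOWS = [
    "cookie=0x0, duration=1116.113s, table=0, n_packets=22, n_bytes=1816, priority=50 actions=output:2",
    "cookie=0x3, duration=1109.885s, table=0, n_packets=32, n_bytes=2693, priority=100,ip,nw_dst=10.1.0.2 actions=output:3",
    "cookie=0xa42803c, duration=1115.580s, table=0, n_packets=0, n_bytes=0, priority=75,ip,nw_dst=10.1.0.0/24 actions=output:9",
]
NODE2_FLOWS = [
    "cookie=0x0, duration=1157.090s, table=0, n_packets=7, n_bytes=510, priority=50 actions=output:2",
    "cookie=0xa42803d, duration=1156.571s, table=0, n_packets=0, n_bytes=0, priority=75,ip,nw_dst=10.1.1.0/24 actions=output:9",
    "cookie=0xa42803c, duration=1156.576s, table=0, n_packets=9, n_bytes=714, priority=100,ip,nw_dst=10.1.0.0/24 actions=set_field:10.66.128.60->tun_dst,output:1",
]


def parse_flow(line):
    """Split one dump-flows line into priority, protocol, destination, actions."""
    head, _, actions = line.strip().partition(" actions=")
    # The match starts at "priority=" and is comma-separated without spaces.
    fields = head[head.index("priority="):].split(",")
    flow = {
        "priority": int(fields[0].split("=", 1)[1]),
        "proto": None,   # None means the rule does not restrict the protocol
        "dst": None,     # None means the rule does not restrict the destination
        "actions": actions,
    }
    for field in fields[1:]:
        if field in ("ip", "arp"):
            flow["proto"] = field
        elif field.startswith(("nw_dst=", "arp_tpa=")):
            flow["dst"] = ipaddress.ip_network(field.split("=", 1)[1])
    return flow


def best_ip_flow(lines, dst_ip):
    """Return the highest-priority flow an IP packet to dst_ip would match."""
    addr = ipaddress.ip_address(dst_ip)
    candidates = []
    for line in lines:
        if "priority=" not in line or " actions=" not in line:
            continue  # skip the OFPST_FLOW header and blank lines
        flow = parse_flow(line)
        if flow["proto"] not in (None, "ip"):
            continue
        if flow["dst"] is not None and addr not in flow["dst"]:
            continue
        candidates.append(flow)
    return max(candidates, key=lambda f: f["priority"], default=None)


print(best_ip_flow(NODE1_FLOWS, "10.1.0.2")["actions"])  # output:3
print(best_ip_flow(NODE2_FLOWS, "10.1.0.2")["actions"])  # set_field:10.66.128.60->tun_dst,output:1
```

On these sample lines, node1 delivers traffic for 10.1.0.2 to a local port, while node2 sends it over the tunnel toward 10.66.128.60.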

Comment 1 Dan Winship 2015-10-16 13:43:17 UTC
This was also just reported as https://github.com/openshift/origin/issues/5152.
It will be fixed by https://github.com/openshift/openshift-sdn/pull/183.

Comment 2 Meng Bo 2015-10-19 03:06:48 UTC
Checked on latest origin code with openshift-sdn version "57af6adcf067052b7ad97d4747ed4d3390c3a94a".

The issue has been fixed.

Comment 3 Dan Winship 2015-10-19 13:48:18 UTC
This has been merged into origin now.

Comment 4 Meng Bo 2015-10-22 05:55:40 UTC
Verified as fixed with the latest origin code.