Bug 1544455

Summary: [3.7] [egressip] Incorrect openflow rule added after deleting namespace and then reusing egress IP
Product: OpenShift Container Platform
Reporter: Dan Winship <danw>
Component: Networking
Assignee: Ben Bennett <bbennett>
Status: CLOSED CURRENTRELEASE
QA Contact: Meng Bo <bmeng>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.7.1
CC: aos-bugs, bbennett, bmeng, vlaad, wsun
Target Milestone: ---   
Target Release: 3.7.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Due to incorrect cleanup of internal state, if you deleted a static per-project egress IP from one project and then tried to reuse that IP for a different project, the OVS rules for the new project would be created incorrectly.
Consequence: The egress IP would not be used for the new project, and might start being used again for some traffic from the old project.
Fix: The internal state is now cleaned up correctly when removing an egress IP.
Result: Egress traffic works as expected.
Story Points: ---
Clone Of: 1543786
Environment:
Last Closed: 2018-10-08 12:45:47 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1543786    
Bug Blocks: 1544454    

Description Dan Winship 2018-02-12 14:34:00 UTC
+++ This bug was initially created as a clone of Bug #1543786 +++

Description of problem:
An incorrect OpenFlow rule is added to table 100 when adding the egressIP to the hostsubnet object:
table=100, priority=100,ip,reg0=0x4f4cf0 actions=set_field:4a:bb:0a:9e:0f:ab->eth_dst,set_field:0x4f4cf0->pkt_mark,goto_table:101
(where 0x4f4cf0 is the netID of a deleted namespace)
As a result, the egressIP does not work at all.


Version-Release number of selected component (if applicable):
v3.9.0-0.41.0

How reproducible:
always

Steps to Reproduce:
1. Set up a multi-node environment with the multitenant or networkpolicy plugin
2. Create a project named b1
3. Patch the hostsubnet of node1 to make it an egress node
4. Patch the netnamespace of project b1
5. Delete the project b1
6. Remove the egressIP value from the hostsubnet of node1
7. Create another project named b2
8. Patch the hostsubnet of any node
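
The steps above can be sketched as a dry run that only prints the `oc` commands, so it is safe to execute without a cluster. The egress IP value `192.0.2.10` is a placeholder (the report does not give the address used), and the sketch assumes that step 4 sets `egressIPs` on the netnamespace and that step 8 reuses node1; the report only says "any node".

```python
import json
import shlex

EGRESS_IP = "192.0.2.10"  # placeholder; the real address is not in the report


def patch(kind, name, ips):
    """Build an `oc patch` command that sets egressIPs on an object."""
    return ["oc", "patch", kind, name, "-p", json.dumps({"egressIPs": ips})]


# Steps 2-8 from the list above (step 1 is environment setup).
cmds = [
    ["oc", "new-project", "b1"],               # step 2
    patch("hostsubnet", "node1", [EGRESS_IP]),  # step 3
    patch("netnamespace", "b1", [EGRESS_IP]),   # step 4 (assumed payload)
    ["oc", "delete", "project", "b1"],         # step 5
    patch("hostsubnet", "node1", []),           # step 6
    ["oc", "new-project", "b2"],               # step 7
    patch("hostsubnet", "node1", [EGRESS_IP]),  # step 8 (same egress IP)
]

for cmd in cmds:
    print(shlex.join(cmd))
```

After each step, the flow table can be inspected on the node to compare against the dumps under "Additional info".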

Actual results:
After step 8, an incorrect OpenFlow rule is added to table 100:

table=100, priority=100,ip,reg0=0x2fbc89 actions=move:NXM_NX_REG0[]->NXM_NX_TUN_ID[0..31],set_field:10.66.140.15->tun_dst,output:1

The value of reg0 is the netid of the deleted project b1, and the egressIP does not work after that.

Expected results:
No incorrect rule should be added when adding the egressIP to the hostsubnet.

Additional info:
Netid of project b1
[root@ose-master ~]# oc get netnamespace
NAME              NETID      EGRESS IPS
b1                3128457    []
default           0          []
kube-public       13569059   []
kube-system       4330111    []
openshift         721723     []
openshift-infra   8764350    []
openshift-node    13969432   []

Netid of project b2
[root@ose-master ~]# oc get netnamespace 
NAME              NETID      EGRESS IPS
b2                2059132    []
default           0          []
kube-public       13569059   []
kube-system       4330111    []
openshift         721723     []
openshift-infra   8764350    []
openshift-node    13969432   []
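
The `reg0` match in the flow rules shows the NETID in hexadecimal, while `oc get netnamespace` prints it in decimal. A minimal sketch of the conversion, using the two NETIDs listed above (the helper name is made up for illustration):

```python
def netid_to_reg0(netid: int) -> str:
    """Format a decimal NETID the way it appears in a reg0 match."""
    return hex(netid)


# NETIDs copied from the `oc get netnamespace` listings above.
netids = {"b1": 3128457, "b2": 2059132}

for project, netid in netids.items():
    print(f"{project}: reg0={netid_to_reg0(netid)}")
# b1: reg0=0x2fbc89
# b2: reg0=0x1f6b7c
```

So a table 100 rule matching `reg0=0x2fbc89` refers to the deleted project b1, not to b2.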


OpenFlow rules after step 4:
table=80, priority=300,ip,nw_src=10.128.0.1 actions=output:NXM_NX_REG2[]
table=80, priority=200,ct_state=+rpl,ip actions=output:NXM_NX_REG2[]
table=80, priority=0 actions=drop
table=90, priority=100,ip,nw_dst=10.128.3.0/24 actions=move:NXM_NX_REG0[]->NXM_NX_TUN_ID[0..31],set_field:10.66.140.15->tun_dst,output:1
table=90, priority=0 actions=drop
table=100, priority=100,ip,reg0=0x2fbc89 actions=set_field:96:56:c4:62:03:f2->eth_dst,set_field:0x12fbc88->pkt_mark,goto_table:101
table=100, priority=0 actions=goto_table:101
table=101, priority=51,tcp,nw_dst=10.66.141.128,tp_dst=53 actions=output:2
table=101, priority=51,udp,nw_dst=10.66.141.128,tp_dst=53 actions=output:2
table=101, priority=0 actions=output:2
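
In the table 100 rule above, `pkt_mark` (0x12fbc88) differs from `reg0` (0x2fbc89). The two values are consistent with clearing the low bit of the NETID and setting bit 24 instead. That this is how the SDN derives the mark is an assumption inferred from these values, not something this report states; the helper below is hypothetical.

```python
def mark_for_netid(netid: int) -> int:
    """Hypothetical helper: move an odd NETID's low bit up to bit 24.

    Assumed encoding, inferred from the rule in this report. Leaving
    even NETIDs unchanged is likewise an assumption.
    """
    if netid & 0x1:
        return (netid & ~0x1) | 0x01000000
    return netid


print(hex(mark_for_netid(0x2FBC89)))  # 0x12fbc88
```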

OpenFlow rules after step 6:
table=80, priority=300,ip,nw_src=10.128.0.1 actions=output:NXM_NX_REG2[]
table=80, priority=200,ct_state=+rpl,ip actions=output:NXM_NX_REG2[]
table=80, priority=0 actions=drop
table=90, priority=100,ip,nw_dst=10.128.3.0/24 actions=move:NXM_NX_REG0[]->NXM_NX_TUN_ID[0..31],set_field:10.66.140.15->tun_dst,output:1
table=90, priority=0 actions=drop
table=100, priority=0 actions=goto_table:101
table=101, priority=51,tcp,nw_dst=10.66.141.128,tp_dst=53 actions=output:2
table=101, priority=51,udp,nw_dst=10.66.141.128,tp_dst=53 actions=output:2
table=101, priority=0 actions=output:2

OpenFlow rules after step 7:
table=80, priority=300,ip,nw_src=10.128.0.1 actions=output:NXM_NX_REG2[]
table=80, priority=200,ct_state=+rpl,ip actions=output:NXM_NX_REG2[]
table=80, priority=0 actions=drop
table=90, priority=100,ip,nw_dst=10.128.3.0/24 actions=move:NXM_NX_REG0[]->NXM_NX_TUN_ID[0..31],set_field:10.66.140.15->tun_dst,output:1
table=90, priority=0 actions=drop
table=100, priority=0 actions=goto_table:101
table=101, priority=51,tcp,nw_dst=10.66.141.128,tp_dst=53 actions=output:2
table=101, priority=51,udp,nw_dst=10.66.141.128,tp_dst=53 actions=output:2
table=101, priority=0 actions=output:2

OpenFlow rules after step 8:
table=80, priority=300,ip,nw_src=10.128.0.1 actions=output:NXM_NX_REG2[]
table=80, priority=200,ct_state=+rpl,ip actions=output:NXM_NX_REG2[]
table=80, priority=0 actions=drop
table=90, priority=100,ip,nw_dst=10.128.3.0/24 actions=move:NXM_NX_REG0[]->NXM_NX_TUN_ID[0..31],set_field:10.66.140.15->tun_dst,output:1
table=90, priority=0 actions=drop
table=100, priority=100,ip,reg0=0x2fbc89 actions=set_field:96:56:c4:62:03:f2->eth_dst,set_field:0x12fbc88->pkt_mark,goto_table:101
table=100, priority=0 actions=goto_table:101
table=101, priority=51,tcp,nw_dst=10.66.141.128,tp_dst=53 actions=output:2
table=101, priority=51,udp,nw_dst=10.66.141.128,tp_dst=53 actions=output:2
table=101, priority=0 actions=output:2
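
One way to spot the problem in a dump like the one above is to compare each `reg0` match in table 100 against the NETIDs that currently exist. The following is a rough sketch, not part of the fix; the function name and regular expression are illustrative, and the input lines are copied from the step 8 dump.

```python
import re

# Matches the reg0 value of a table 100 rule.
REG0_RE = re.compile(r"table=100,.*\breg0=(0x[0-9a-fA-F]+)")


def stale_egress_rules(flows, live_netids):
    """Return table=100 rules whose reg0 is not a live NETID."""
    stale = []
    for line in flows:
        m = REG0_RE.search(line)
        if m and int(m.group(1), 16) not in live_netids:
            stale.append(line)
    return stale


# Table 100 rules from the step 8 dump above.
flows = [
    "table=100, priority=100,ip,reg0=0x2fbc89 actions=set_field:96:56:c4:62:03:f2->eth_dst,set_field:0x12fbc88->pkt_mark,goto_table:101",
    "table=100, priority=0 actions=goto_table:101",
]
# NETIDs listed by `oc get netnamespace` after b2 was created.
live = {2059132, 0, 13569059, 4330111, 721723, 8764350, 13969432}

for rule in stale_egress_rules(flows, live):
    print("stale:", rule)
```

With the data from this report, the first rule is flagged because 0x2fbc89 (3128457) belonged to the deleted project b1.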

--- Additional comment from Dan Winship on 2018-02-09 09:52:25 EST ---

https://github.com/openshift/origin/pull/18547

--- Additional comment from Dan Winship on 2018-02-09 09:54:11 EST ---

> 8. Patch the hostsubnet of any node 

(using the same egressIP as before)

Comment 2 Meng Bo 2018-03-06 10:29:02 UTC
The problem still exists; please refer to:

https://bugzilla.redhat.com/show_bug.cgi?id=1543786#c5

Comment 5 Meng Bo 2018-03-21 07:33:56 UTC
The bug has been fixed in build v3.7.39.