Bug 1543786 - [egressip] Incorrect openflow rule added after deleting namespace and then reusing egress IP
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.10.0
Assignee: Dan Winship
QA Contact: Meng Bo
URL:
Whiteboard:
Duplicates: 1544454
Depends On:
Blocks: 1544454 1544455
 
Reported: 2018-02-09 10:18 UTC by Meng Bo
Modified: 2018-12-20 21:45 UTC (History)
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Due to incorrect cleanup of internal state, if you deleted a static per-project egress IP from one project and then tried to reuse that IP for a different project, the OVS rules for the new project would be created incorrectly. Consequence: The egress IP would not be used for the new project, and might start being used again for some traffic from the old project. Fix: The internal state is now cleaned up correctly when removing an egress IP. Result: Egress traffic works as expected.
Clone Of:
Clones: 1544454 1544455
Environment:
Last Closed: 2018-12-20 21:11:13 UTC
Target Upstream Version:
Embargoed:




Links
System           ID     Private  Priority  Status  Summary  Last Updated
Origin (Github)  18547  0        None      None    None     2018-02-09 14:52:25 UTC
Origin (Github)  18808  0        None      None    None     2018-03-06 18:56:56 UTC

Description Meng Bo 2018-02-09 10:18:09 UTC
Description of problem:
An incorrect openflow rule is added to table 100 when the egressIP is added to the hostsubnet object:
table=100, priority=100,ip,reg0=0x4f4cf0 actions=set_field:4a:bb:0a:9e:0f:ab->eth_dst,set_field:0x4f4cf0->pkt_mark,goto_table:101
(where 0x4f4cf0 is the netID of a deleted namespace)
As a result, the egressIP does not work at all.


Version-Release number of selected component (if applicable):
v3.9.0-0.41.0

How reproducible:
always

Steps to Reproduce:
1. Set up a multinode env with the multitenant or networkpolicy plugin
2. Create a project named b1
3. Patch the hostsubnet of node1 to make it an egress node
4. Patch the netnamespace of project b1
5. Delete the project b1
6. Remove the egressIP value from the hostsubnet of node1
7. Create another project named b2
8. Patch the hostsubnet of any node
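
The steps above can be sketched as a sequence of `oc` invocations. This is only an illustration under assumptions: the report does not state the egress IP that was used, so `EGRESS_IP` is a hypothetical placeholder, the node name `node1` comes from step 3, and step 8 reuses the same IP as clarified in Comment 2. The helper names `patch_cmd` and `repro_commands` are made up for this sketch. Step 1 (cluster setup) is not covered.

```python
import json
import shlex

# Hypothetical placeholder: the report does not say which address was used.
EGRESS_IP = "192.0.2.10"


def patch_cmd(kind, name, egress_ips):
    """Build the argv for an `oc patch` call that sets egressIPs on an object."""
    payload = json.dumps({"egressIPs": list(egress_ips)})
    return ["oc", "patch", kind, name, "-p", payload]


def repro_commands(node="node1", ip=EGRESS_IP):
    """Return the commands for steps 2-8, in order, without running them."""
    return [
        ["oc", "new-project", "b1"],            # step 2
        patch_cmd("hostsubnet", node, [ip]),    # step 3
        patch_cmd("netnamespace", "b1", [ip]),  # step 4
        ["oc", "delete", "project", "b1"],      # step 5
        patch_cmd("hostsubnet", node, []),      # step 6
        ["oc", "new-project", "b2"],            # step 7
        patch_cmd("hostsubnet", node, [ip]),    # step 8, same IP as before
    ]


if __name__ == "__main__":
    for argv in repro_commands():
        print(shlex.join(argv))
```

The sketch only prints the commands, so they can be reviewed before being run against a test cluster.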

Actual results:
After step 8, an incorrect openflow rule is added to table 100:

table=100, priority=100,ip,reg0=0x2fbc89 actions=move:NXM_NX_REG0[]->NXM_NX_TUN_ID[0..31],set_field:10.66.140.15->tun_dst,output:1

The value of reg0 is the netid of the deleted project b1,
and the egressIP does not work after that.

Expected results:
No incorrect rule should be added when the egressIP is added to the hostsubnet.

Additional info:
Netid of project b1
[root@ose-master ~]# oc get netnamespace
NAME              NETID      EGRESS IPS
b1                3128457    []
default           0          []
kube-public       13569059   []
kube-system       4330111    []
openshift         721723     []
openshift-infra   8764350    []
openshift-node    13969432   []

Netid of project b2
[root@ose-master ~]# oc get netnamespace 
NAME              NETID      EGRESS IPS
b2                2059132    []
default           0          []
kube-public       13569059   []
kube-system       4330111    []
openshift         721723     []
openshift-infra   8764350    []
openshift-node    13969432   []
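
The NETID column is printed in decimal, while the flow dumps show reg0 in hexadecimal. A small conversion sketch makes the correspondence explicit; the function names are made up for this illustration.

```python
def netid_to_reg0(netid: int) -> str:
    """Format a decimal NETID the way it appears in a reg0= match."""
    return hex(netid)


def reg0_to_netid(reg0: str) -> int:
    """Parse a hexadecimal reg0 value such as '0x2fbc89' back to decimal."""
    return int(reg0, 16)


if __name__ == "__main__":
    print(netid_to_reg0(3128457))  # NETID of b1
    print(netid_to_reg0(2059132))  # NETID of b2
```

With the values listed above, b1 (3128457) corresponds to reg0=0x2fbc89 and b2 (2059132) to 0x1f6b7c, so a table 100 rule matching 0x2fbc89 refers to the deleted project b1.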


Openflow rules after step 4:
table=80, priority=300,ip,nw_src=10.128.0.1 actions=output:NXM_NX_REG2[]
table=80, priority=200,ct_state=+rpl,ip actions=output:NXM_NX_REG2[]
table=80, priority=0 actions=drop
table=90, priority=100,ip,nw_dst=10.128.3.0/24 actions=move:NXM_NX_REG0[]->NXM_NX_TUN_ID[0..31],set_field:10.66.140.15->tun_dst,output:1
table=90, priority=0 actions=drop
table=100, priority=100,ip,reg0=0x2fbc89 actions=set_field:96:56:c4:62:03:f2->eth_dst,set_field:0x12fbc88->pkt_mark,goto_table:101
table=100, priority=0 actions=goto_table:101
table=101, priority=51,tcp,nw_dst=10.66.141.128,tp_dst=53 actions=output:2
table=101, priority=51,udp,nw_dst=10.66.141.128,tp_dst=53 actions=output:2
table=101, priority=0 actions=output:2

Openflow rules after step 6:
table=80, priority=300,ip,nw_src=10.128.0.1 actions=output:NXM_NX_REG2[]
table=80, priority=200,ct_state=+rpl,ip actions=output:NXM_NX_REG2[]
table=80, priority=0 actions=drop
table=90, priority=100,ip,nw_dst=10.128.3.0/24 actions=move:NXM_NX_REG0[]->NXM_NX_TUN_ID[0..31],set_field:10.66.140.15->tun_dst,output:1
table=90, priority=0 actions=drop
table=100, priority=0 actions=goto_table:101
table=101, priority=51,tcp,nw_dst=10.66.141.128,tp_dst=53 actions=output:2
table=101, priority=51,udp,nw_dst=10.66.141.128,tp_dst=53 actions=output:2
table=101, priority=0 actions=output:2

Openflow rules after step 7:
table=80, priority=300,ip,nw_src=10.128.0.1 actions=output:NXM_NX_REG2[]
table=80, priority=200,ct_state=+rpl,ip actions=output:NXM_NX_REG2[]
table=80, priority=0 actions=drop
table=90, priority=100,ip,nw_dst=10.128.3.0/24 actions=move:NXM_NX_REG0[]->NXM_NX_TUN_ID[0..31],set_field:10.66.140.15->tun_dst,output:1
table=90, priority=0 actions=drop
table=100, priority=0 actions=goto_table:101
table=101, priority=51,tcp,nw_dst=10.66.141.128,tp_dst=53 actions=output:2
table=101, priority=51,udp,nw_dst=10.66.141.128,tp_dst=53 actions=output:2
table=101, priority=0 actions=output:2

Openflow rules after step 8:
table=80, priority=300,ip,nw_src=10.128.0.1 actions=output:NXM_NX_REG2[]
table=80, priority=200,ct_state=+rpl,ip actions=output:NXM_NX_REG2[]
table=80, priority=0 actions=drop
table=90, priority=100,ip,nw_dst=10.128.3.0/24 actions=move:NXM_NX_REG0[]->NXM_NX_TUN_ID[0..31],set_field:10.66.140.15->tun_dst,output:1
table=90, priority=0 actions=drop
table=100, priority=100,ip,reg0=0x2fbc89 actions=set_field:96:56:c4:62:03:f2->eth_dst,set_field:0x12fbc88->pkt_mark,goto_table:101
table=100, priority=0 actions=goto_table:101
table=101, priority=51,tcp,nw_dst=10.66.141.128,tp_dst=53 actions=output:2
table=101, priority=51,udp,nw_dst=10.66.141.128,tp_dst=53 actions=output:2
table=101, priority=0 actions=output:2
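
The comparison between these dumps can be automated. The following is a minimal sketch, assuming the flow dump is available as text lines in the format shown above and that the current NETIDs have been collected separately (for example from `oc get netnamespace`). It flags table 100 rules whose reg0 does not belong to any existing netnamespace; the function name `stale_egress_rules` is hypothetical.

```python
import re

# Matches table 100 rules that carry a reg0= match and captures the hex value.
REG0_RE = re.compile(r"^\s*table=100,.*?\breg0=(0x[0-9a-fA-F]+)")


def stale_egress_rules(flow_lines, live_netids):
    """Return table 100 rules whose reg0 is not a currently existing NETID."""
    stale = []
    for line in flow_lines:
        match = REG0_RE.search(line)
        if match and int(match.group(1), 16) not in live_netids:
            stale.append(line.strip())
    return stale


if __name__ == "__main__":
    # NETIDs present after step 8 (b1 has been deleted, b2 exists).
    live = {2059132, 0, 13569059, 4330111, 721723, 8764350, 13969432}
    flows = [
        "table=100, priority=100,ip,reg0=0x2fbc89 actions=set_field:96:56:c4:62:03:f2->eth_dst,set_field:0x12fbc88->pkt_mark,goto_table:101",
        "table=100, priority=0 actions=goto_table:101",
    ]
    for rule in stale_egress_rules(flows, live):
        print("stale:", rule)
```

This check only covers rules that reference a NETID which no longer exists; it does not decide which project a rule ought to reference.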

Comment 1 Dan Winship 2018-02-09 14:52:25 UTC
https://github.com/openshift/origin/pull/18547

Comment 2 Dan Winship 2018-02-09 14:54:11 UTC
> 8. Patch the hostsubnet of any node 

(using the same egressIP as before)

Comment 4 Meng Bo 2018-02-22 08:14:21 UTC
Tested on OCP v3.9.0-0.47.0; the issue has been fixed.

After deleting the netnamespace and adding the deleted egress IP back, the egress IP works fine for the other netnamespace.

Comment 5 Meng Bo 2018-03-06 10:28:29 UTC
There is still a problem when adding the deleted egressIP back to the node.
The issue does not occur the first time it is added back; the user needs to repeat steps 5-8 again.
Checking the openflow rules then shows that the 2nd vnid is used for the pkt_mark.


The openflow rules:
The netnamespace with vnid 0xa1103e was added after the one with 0x39a7b4,
but table 100 is still using the old one:

table=80, priority=100,reg0=0x46f52c,reg1=0x46f52c actions=output:NXM_NX_REG2[]
table=80, priority=100,reg0=0x39a7b4,reg1=0x39a7b4 actions=output:NXM_NX_REG2[]
table=80, priority=100,reg0=0xa1103e,reg1=0xa1103e actions=output:NXM_NX_REG2[]
table=80, priority=0 actions=drop
table=90, priority=100,ip,nw_dst=10.129.2.0/23 actions=move:NXM_NX_REG0[]->NXM_NX_TUN_ID[0..31],set_field:10.1.1.4->tun_dst,output:1
table=90, priority=0 actions=drop
table=100, priority=100,ip,reg0=0x39a7b4 actions=set_field:4e:22:d7:60:ae:cc->eth_dst,set_field:0x39a7b4->pkt_mark,goto_table:101
table=100, priority=0 actions=goto_table:101
table=101, priority=51,tcp,nw_dst=10.1.1.3,tp_dst=53 actions=output:2
table=101, priority=51,udp,nw_dst=10.1.1.3,tp_dst=53 actions=output:2

Comment 6 Dan Winship 2018-03-06 14:00:59 UTC
Fixed by https://github.com/openshift/origin/pull/18808

Comment 7 Meng Bo 2018-03-08 06:21:02 UTC
*** Bug 1544454 has been marked as a duplicate of this bug. ***

Comment 10 Meng Bo 2018-06-05 09:48:56 UTC
Tested on v3.10.0-0.58.0

Issue has been fixed.

