Description of problem:
A customer creates an EgressNetworkPolicy in a project, but the system removes all rules for the netnamespace from OVS and adds a single "drop all" rule.

In the node logs we see:
atomic-openshift-node[39469]: E0106 17:40:05.187734 39469 controller.go:506] multiple EgressNetworkPolicies in same network namespace (vwc-rec:default, m4d-rec:default) is not allowed; dropping all traffic

We have checked:
- no global projects have an egress policy defined.
- there are no joined projects.
- none of the projects has more than one egress policy defined.

Two separate environments (3.3.1.7 and 3.3.1.3) suffer from the issue. At first only one node was affected, but now both nodes have this issue. It looks more project-related. Please advise on what to check or trace.
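When triaging this on a node, it can help to pull the conflicting policy names out of the error line. Below is a minimal sketch, assuming each entry inside the parentheses is project:policy-name (the function name and regular expression are illustrative and not part of OpenShift, and the message wording may differ in other releases):

```python
import re

# Matches the error text quoted in the report above. The wording is copied
# from that log line; other releases may phrase it differently.
CONFLICT_RE = re.compile(
    r"multiple EgressNetworkPolicies in same network namespace "
    r"\((?P<policies>[^)]*)\) is not allowed"
)


def conflicting_policies(log_line):
    """Return [(project, policy_name), ...] named in a conflict error, or []."""
    match = CONFLICT_RE.search(log_line)
    if not match:
        return []
    pairs = []
    for item in match.group("policies").split(","):
        # Assumption: each entry has the form "project:policy-name".
        project, _, name = item.strip().partition(":")
        pairs.append((project, name))
    return pairs


line = (
    "atomic-openshift-node[39469]: E0106 17:40:05.187734 39469 controller.go:506] "
    "multiple EgressNetworkPolicies in same network namespace "
    "(vwc-rec:default, m4d-rec:default) is not allowed; dropping all traffic"
)
print(conflicting_policies(line))
# [('vwc-rec', 'default'), ('m4d-rec', 'default')]
```

Here the two entries name different projects, which matches the check that no single project has more than one egress policy defined.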
This is https://github.com/openshift/origin/pull/12045 and it's fixed in 3.4 (v3.4.0.32). We did not backport the fix to 3.3. The relevant code didn't change much between 3.3 and 3.4 so it would be possible to do, but I don't know what the policy is for 3.3 bugfixes at this point... (There is no way to work around the bug other than backporting the bugfix.)
Tested on OCP 3.3.1.11.

After adding multiple egress policies to a single namespace, the existing OpenFlow rules are not affected, and a new rule is added to drop the traffic for that specific project.

From the node log:
Jan 22 03:03:10 node1 atomic-openshift-node[27026]: E0122 03:03:10.885757 27026 controller.go:506] multiple EgressNetworkPolicies in same network namespace (bmengp1:default, bmengp1:default2) is not allowed; dropping all traffic
Jan 22 03:03:10 node1 atomic-openshift-node[27026]: I0122 03:03:10.885809 27026 ovs.go:37] Executing: /usr/bin/ovs-ofctl -O OpenFlow13 del-flows br0 table=9, reg0=720494
Jan 22 03:03:10 node1 atomic-openshift-node[27026]: I0122 03:03:10.891489 27026 ovs.go:37] Executing: /usr/bin/ovs-ofctl -O OpenFlow13 add-flow br0 table=9, reg0=720494, priority=1, actions=drop

Check the OpenFlow rules:
# ovs-ofctl dump-flows br0 -O openflow13
OFPST_FLOW reply (OF1.3) (xid=0x2):
 cookie=0x0, duration=221.113s, table=0, n_packets=0, n_bytes=0, priority=200,arp,in_port=1,arp_spa=10.1.0.0/16,arp_tpa=10.1.1.0/24 actions=move:NXM_NX_TUN_ID[0..31]->NXM_NX_REG0[],goto_table:1
 cookie=0x0, duration=221.110s, table=0, n_packets=0, n_bytes=0, priority=200,ip,in_port=1,nw_src=10.1.0.0/16,nw_dst=10.1.1.0/24 actions=move:NXM_NX_TUN_ID[0..31]->NXM_NX_REG0[],goto_table:1
 cookie=0x0, duration=221.105s, table=0, n_packets=45, n_bytes=1890, priority=200,arp,in_port=2,arp_spa=10.1.1.1,arp_tpa=10.1.0.0/16 actions=goto_table:5
 cookie=0x0, duration=221.102s, table=0, n_packets=3871, n_bytes=2493051, priority=200,ip,in_port=2 actions=goto_table:5
 cookie=0x0, duration=221.095s, table=0, n_packets=2, n_bytes=84, priority=200,arp,in_port=3,arp_spa=10.1.1.0/24 actions=goto_table:5
 cookie=0x0, duration=221.085s, table=0, n_packets=0, n_bytes=0, priority=200,ip,in_port=3,nw_src=10.1.1.0/24 actions=goto_table:5
 cookie=0x0, duration=221.108s, table=0, n_packets=0, n_bytes=0, priority=150,in_port=1 actions=drop
 cookie=0x0, duration=221.098s, table=0, n_packets=16, n_bytes=1296, priority=150,in_port=2 actions=drop
 cookie=0x0, duration=221.058s, table=0, n_packets=38, n_bytes=3132, priority=150,in_port=3 actions=drop
 cookie=0x0, duration=221.050s, table=0, n_packets=41, n_bytes=1722, priority=100,arp actions=goto_table:2
 cookie=0x0, duration=221.044s, table=0, n_packets=2231, n_bytes=239877, priority=100,ip actions=goto_table:2
 cookie=0x0, duration=221.004s, table=0, n_packets=45, n_bytes=3558, priority=0 actions=drop
 cookie=0x0, duration=220.782s, table=1, n_packets=0, n_bytes=0, priority=100,tun_src=10.8.174.9 actions=goto_table:5
 cookie=0x0, duration=221.001s, table=1, n_packets=0, n_bytes=0, priority=0 actions=drop
 cookie=0x0, duration=220.613s, table=2, n_packets=2, n_bytes=84, priority=100,arp,in_port=11,arp_spa=10.1.1.5,arp_sha=02:42:0a:01:01:05 actions=load:0->NXM_NX_REG0[],goto_table:5
 cookie=0x0, duration=220.604s, table=2, n_packets=318, n_bytes=28460, priority=100,ip,in_port=11,nw_src=10.1.1.5 actions=load:0->NXM_NX_REG0[],goto_table:3
 cookie=0x0, duration=220.992s, table=2, n_packets=0, n_bytes=0, priority=0 actions=drop
 cookie=0x0, duration=220.990s, table=3, n_packets=299, n_bytes=75681, priority=100,ip,nw_dst=172.30.0.0/16 actions=goto_table:4
 cookie=0x0, duration=220.981s, table=3, n_packets=1932, n_bytes=164196, priority=0 actions=goto_table:5
 cookie=0x0, duration=220.958s, table=4, n_packets=299, n_bytes=75681, priority=200,reg0=0 actions=output:2
 ... ... ...
 cookie=0x0, duration=220.913s, table=8, n_packets=0, n_bytes=0, priority=0 actions=drop
 cookie=0x0, duration=22.721s, table=9, n_packets=0, n_bytes=0, priority=1,reg0=0xafe6e actions=drop
 cookie=0x0, duration=220.911s, table=9, n_packets=496, n_bytes=35890, priority=0 actions=output:2
 cookie=0x0, duration=220.822s, table=253, n_packets=0, n_bytes=0, actions=note:01.01.00.00.00.00
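Note that the node log prints the VNID in decimal (reg0=720494) while the flow dump prints it in hex (reg0=0xafe6e); these are the same value. A small sketch for matching the two and picking out one project's table 9 flows from a saved dump follows. The helper names are hypothetical, and it assumes the dump text has one flow per line as ovs-ofctl prints it:

```python
import re


def vnid_to_reg0(vnid):
    """Format a decimal VNID the way the flow dump above shows reg0."""
    return hex(vnid)


def _parse_reg0(text):
    # reg0 appears as hex (0xafe6e) or as a plain decimal such as 0.
    return int(text, 16) if text.lower().startswith("0x") else int(text)


def table_flows(dump_text, table, vnid=None):
    """Return flow lines for one table, optionally limited to one VNID."""
    selected = []
    for line in dump_text.splitlines():
        t = re.search(r"table=(\d+)", line)
        if not t or int(t.group(1)) != table:
            continue
        if vnid is not None:
            r = re.search(r"reg0=(0x[0-9a-fA-F]+|\d+)", line)
            if not r or _parse_reg0(r.group(1)) != vnid:
                continue
        selected.append(line.strip())
    return selected


# Two table 9 lines and one table 8 line copied from the dump above.
sample = """\
 cookie=0x0, duration=220.913s, table=8, n_packets=0, n_bytes=0, priority=0 actions=drop
 cookie=0x0, duration=22.721s, table=9, n_packets=0, n_bytes=0, priority=1,reg0=0xafe6e actions=drop
 cookie=0x0, duration=220.911s, table=9, n_packets=496, n_bytes=35890, priority=0 actions=output:2
"""

print(vnid_to_reg0(720494))  # 0xafe6e
for flow in table_flows(sample, table=9, vnid=720494):
    print(flow)
```

With the sample above this prints only the priority=1 drop rule that the node added for the project.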
Please ignore comment #11 above.

To reproduce, tested on build 3.3.1.9 with the following steps:
1. Create 10 projects
2. Add an egress policy to each project
3. Check the OpenFlow rules
4. Restart the openshift node service
5. Check the OpenFlow rules again

Result:
In step 3, the OpenFlow rules for the project are created in table 9 with the following contents:
 cookie=0x0, duration=1.722s, table=9, n_packets=0, n_bytes=0, priority=2,ip,reg0=0x5d687c,nw_dst=172.16.120.0/24 actions=output:2
 cookie=0x0, duration=1.715s, table=9, n_packets=0, n_bytes=0, priority=1,ip,reg0=0x5d687c,nw_dst=10.66.140.0/24 actions=drop

In step 5, the restart has changed the OpenFlow rules to:
 cookie=0x0, duration=1.704s, table=9, n_packets=0, n_bytes=0, priority=1,reg0=0x5d687c actions=drop

To verify, tested with the same steps on build 3.3.1.11. The OpenFlow rules for the project with an EgressNetworkPolicy are not corrupted by the restart.
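The before/after comparison in steps 3 and 5 can be scripted when many projects are involved. The following is only a sketch under stated assumptions: the dumps are saved as text with one flow per line, egress rules live in table 9 as shown above, and the function names are made up for illustration. It reports VNIDs that had destination-specific rules before the restart and only a blanket drop rule afterwards:

```python
import re
from collections import defaultdict

# Captures the match fields (starting at priority=) and the action of a table 9 flow.
FLOW_RE = re.compile(r"table=9,.*?(priority=\S+)\s+actions=(\S+)")


def table9_rules(dump_text):
    """Map reg0 -> list of (has_nw_dst, action) for table 9 flows."""
    rules = defaultdict(list)
    for line in dump_text.splitlines():
        m = FLOW_RE.search(line)
        if not m:
            continue
        fields = m.group(1).split(",")
        reg0 = next((f.split("=", 1)[1] for f in fields if f.startswith("reg0=")), None)
        if reg0 is None:
            continue  # flows without reg0 (the table default) are not per-project
        has_dst = any(f.startswith("nw_dst=") for f in fields)
        rules[reg0].append((has_dst, m.group(2)))
    return dict(rules)


def collapsed_vnids(before_dump, after_dump):
    """VNIDs with nw_dst rules before and only blanket drop rules after."""
    before = table9_rules(before_dump)
    after = table9_rules(after_dump)
    return sorted(
        vnid
        for vnid, rules in after.items()
        if all(not has_dst and action == "drop" for has_dst, action in rules)
        and any(has_dst for has_dst, _ in before.get(vnid, []))
    )


# Flows copied from steps 3 and 5 above.
BEFORE = """\
 cookie=0x0, duration=1.722s, table=9, n_packets=0, n_bytes=0, priority=2,ip,reg0=0x5d687c,nw_dst=172.16.120.0/24 actions=output:2
 cookie=0x0, duration=1.715s, table=9, n_packets=0, n_bytes=0, priority=1,ip,reg0=0x5d687c,nw_dst=10.66.140.0/24 actions=drop
"""
AFTER = """\
 cookie=0x0, duration=1.704s, table=9, n_packets=0, n_bytes=0, priority=1,reg0=0x5d687c actions=drop
"""

print(collapsed_vnids(BEFORE, AFTER))  # ['0x5d687c']
```

An empty list after a node restart would correspond to the behaviour seen on 3.3.1.11.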
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0199