Bug 1985336 - OpenShift SDN doesn't add NOTRACK rule to raw iptables table to prevent vxlan from reaching conntrack
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.9.0
Assignee: Andrew Stoycos
QA Contact: Ying Wang
URL:
Whiteboard:
Duplicates: 1973864 2005733
Depends On:
Blocks: 1995871
 
Reported: 2021-07-23 12:09 UTC by Pablo Alonso Rodriguez
Modified: 2022-02-08 21:13 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-10-18 17:40:56 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift sdn pull 335 | 0 | None | None | None | 2021-08-16 17:21:10 UTC
Github | openshift sdn pull 339 | 0 | None | Merged | [release-4.6] Bug 1995873: Disable conntrack for vxlan traffic | 2021-09-22 17:33:51 UTC
Red Hat Knowledge Base (Solution) | 6217361 | 0 | None | None | None | 2021-07-28 08:32:22 UTC
Red Hat Product Errata | RHSA-2021:3759 | 0 | None | None | None | 2021-10-18 17:41:15 UTC

Description Pablo Alonso Rodriguez 2021-07-23 12:09:05 UTC
Description of problem:

OpenShift SDN uses VXLAN to encapsulate node-to-node traffic, but it does not add NOTRACK iptables rules to the raw table, so conntrack still processes the encapsulated OCP traffic even though tracking it is not necessary.

On a heavily loaded cluster this can degrade performance severely enough to leave the environment non-working.
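
One rough way to see whether conntrack is tracking VXLAN traffic on a node is to count conntrack entries whose original destination port is 4789. The sketch below is only an illustration and is not part of openshift-sdn: `count_vxlan_conntrack` is a hypothetical helper name, and it assumes the one-entry-per-line output with `dport=` fields that `conntrack -L` normally prints.

```shell
#!/bin/sh
# Hypothetical helper: count conntrack entries whose original-direction
# destination port equals the given port (default 4789, the VXLAN port).
# Reads `conntrack -L -p udp` style output on stdin.
# Assumption: one entry per line, fields written as "dport=<port>".
count_vxlan_conntrack() {
    port="${1:-4789}"
    awk -v port="$port" '
        {
            # The first dport= field on a line is the original direction;
            # the second one belongs to the reply direction and is ignored.
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^dport=/) {
                    if ($i == "dport=" port) n++
                    break
                }
            }
        }
        END { print n + 0 }
    '
}
```

On a node it could be fed with something like `conntrack -L -p udp 2>/dev/null | count_vxlan_conntrack`. A non-zero count that keeps growing under load would suggest that the encapsulated traffic is still being tracked.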

Version-Release number of selected component (if applicable):

Found on 4.6, but it should apply to any OCP version that uses openshift-sdn, including 3.11 and any 4.y release.

How reproducible:

Always

Steps to Reproduce:
1. iptables -t raw -S


Actual results:

-P PREROUTING ACCEPT
-P OUTPUT ACCEPT

Expected results:

-P PREROUTING ACCEPT
-P OUTPUT ACCEPT
-A PREROUTING -p udp -m udp --dport 4789 -j NOTRACK
-A OUTPUT -p udp -m udp --dport 4789 -j NOTRACK
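
The comparison between the actual and expected results can be scripted. The following is a minimal sketch, not an official check: `has_notrack_rule` is a hypothetical helper that reads the output of `iptables -t raw -S` on stdin and succeeds if any rule sends the given UDP destination port to NOTRACK. It also accepts the `-j CT --notrack` spelling, on the assumption that some iptables builds list the target that way. It only detects that such a rule exists and does not verify that it is reachable from both PREROUTING and OUTPUT.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the rules on stdin (in the format printed
# by `iptables -t raw -S`) contain a NOTRACK rule for the given UDP
# destination port (default 4789).
has_notrack_rule() {
    port="${1:-4789}"
    # "--" stops option parsing because the pattern itself starts with "-".
    grep -Eq -- "-p udp .*--dport ${port} .*-j (NOTRACK|CT --notrack)"
}
```

A possible use on a node would be `iptables -t raw -S | has_notrack_rule && echo "vxlan bypasses conntrack"`.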

Additional info:

More context on performance impact in comments.

In addition, had these rules been in place, OpenShift might not have been affected by several of the conntrack bugs that impacted OCP in the past.

Comment 2 Pablo Alonso Rodriguez 2021-07-26 12:47:28 UTC
I have opened a pull request with a fix for the master branch: https://github.com/openshift/sdn/pull/324

Can you please review?

Thanks in advance and regards.

Comment 3 Stephen Cuppett 2021-07-28 01:27:57 UTC
*** Bug 1973864 has been marked as a duplicate of this bug. ***

Comment 4 Ying Wang 2021-07-29 06:10:28 UTC
I deployed a cluster from cluster-bot using https://github.com/openshift/sdn/pull/324 and checked the nodes; the iptables rules have been added.

sh-4.4# iptables -t raw -S
-P PREROUTING ACCEPT
-P OUTPUT ACCEPT
-N OPENSHIFT-VXLAN-NOTRACK
-A PREROUTING -m comment --comment "disable conntrack for vxlan" -j OPENSHIFT-VXLAN-NOTRACK
-A OUTPUT -m comment --comment "disable conntrack for vxlan" -j OPENSHIFT-VXLAN-NOTRACK
-A OPENSHIFT-VXLAN-NOTRACK -p udp -m udp --dport 4789 -j NOTRACK
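
For readers who want to reproduce the layout listed above by hand, the sketch below prints the equivalent iptables commands without running them. This is an assumption about a manual equivalent, not the code in the pull request: `emit_notrack_commands` is a hypothetical helper, and the chain name and port simply default to the values shown in the listing.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) iptables commands that would
# create a dedicated raw-table chain, jump to it from PREROUTING and OUTPUT,
# and mark UDP traffic to the given port as NOTRACK.
emit_notrack_commands() {
    chain="${1:-OPENSHIFT-VXLAN-NOTRACK}"
    port="${2:-4789}"
    echo "iptables -t raw -N ${chain}"
    for hook in PREROUTING OUTPUT; do
        echo "iptables -t raw -A ${hook} -m comment --comment \"disable conntrack for vxlan\" -j ${chain}"
    done
    echo "iptables -t raw -A ${chain} -p udp -m udp --dport ${port} -j NOTRACK"
}
```

The printed commands are not idempotent: running them twice would fail on `-N` and append duplicate jump rules, so a real deployment would need an existence check (for example with `iptables -C`) before each append.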


lilia@liliadeMacBook-Pro Downloads % oc version
Client Version: 4.7.5
Server Version: 4.8.0-0.ci.test-2021-07-29-030151-ci-ln-210rw9t-latest
Kubernetes Version: v1.21.1-1394+051ac4f6786868-dirty
lilia@liliadeMacBook-Pro Downloads %

Comment 5 Surya Seetharaman 2021-08-03 11:48:01 UTC
@trozet: Checked in OVN-K. We are already disabling conntrack for geneve. All good (https://github.com/openshift/cluster-network-operator/blob/f202ceea725fc3cf315b3883206462a2b4defadd/bindata/network/ovn-kubernetes/ovnkube-node.yaml#L215).

2021-06-04T19:21:11.970092353Z + echo 'I0604 19:21:11.969676557 - disable conntrack on geneve port'
2021-06-04T19:21:11.970104775Z I0604 19:21:11.969676557 - disable conntrack on geneve port
2021-06-04T19:21:11.970127952Z + iptables -t raw -A PREROUTING -p udp --dport 6081 -j NOTRACK
2021-06-04T19:21:11.983552565Z + iptables -t raw -A OUTPUT -p udp --dport 6081 -j NOTRACK

Comment 13 Ying Wang 2021-08-23 09:11:41 UTC
Checked version 4.9.0-0.nightly-2021-08-22-070405 with the SDN network; the iptables rules on the nodes are as follows.


sh-4.4# iptables -t raw -S
-P PREROUTING ACCEPT
-P OUTPUT ACCEPT
-N OPENSHIFT-NOTRACK
-A PREROUTING -m comment --comment "disable conntrack for vxlan" -j OPENSHIFT-NOTRACK
-A OUTPUT -m comment --comment "disable conntrack for vxlan" -j OPENSHIFT-NOTRACK
-A OPENSHIFT-NOTRACK -p udp -m udp --dport 4789 -j NOTRACK


% oc version
Client Version: 4.9.0-0.nightly-2021-08-18-144658
Server Version: 4.9.0-0.nightly-2021-08-22-070405
Kubernetes Version: v1.22.0-rc.0+5c2f7cd

Comment 15 Jacob Tanenbaum 2021-09-22 17:33:52 UTC
*** Bug 2005733 has been marked as a duplicate of this bug. ***

Comment 17 errata-xmlrpc 2021-10-18 17:40:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759

