Bug 2005733 - conntrack [UNREPLIED] state for UDP 4789
Summary: conntrack [UNREPLIED] state for UDP 4789
Keywords:
Status: CLOSED DUPLICATE of bug 1985336
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.6.z
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Jacob Tanenbaum
QA Contact: zhaozhanqi
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-09-20 00:05 UTC by Robin Cernin
Modified: 2024-12-20 21:07 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-09-22 17:33:52 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift sdn pull 339 | 0 | None | Merged | [release-4.6] Bug 1995873: Disable conntrack for vxlan traffic | 2021-09-22 17:33:51 UTC

Description Robin Cernin 2021-09-20 00:05:53 UTC
Description of problem:

node-exporter triggers the NodeHighNumberConntrackEntriesUsed alert.
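
For context, here is a minimal sketch of the kind of check that alert performs, assuming it compares the conntrack entry count against the table limit. The 75% threshold is an assumption taken from the upstream node-exporter mixin defaults and may differ in a given cluster. The function names are hypothetical.

```python
from pathlib import Path

# Assumption: the alert fires when more than 75% of the table is in use.
ALERT_THRESHOLD = 0.75

# Standard netfilter sysctl directory (present only when conntrack is loaded).
SYSCTL_DIR = "/proc/sys/net/netfilter"


def conntrack_usage(count: int, limit: int) -> float:
    """Return the fraction of the conntrack table currently in use."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return count / limit


def read_int(path: str) -> int:
    """Read a single integer value from a procfs/sysctl file."""
    return int(Path(path).read_text().strip())


def report() -> str:
    """Summarise conntrack usage on the local node (hypothetical helper)."""
    count = read_int(f"{SYSCTL_DIR}/nf_conntrack_count")
    limit = read_int(f"{SYSCTL_DIR}/nf_conntrack_max")
    usage = conntrack_usage(count, limit)
    state = "ABOVE" if usage > ALERT_THRESHOLD else "below"
    return f"{count}/{limit} entries ({usage:.1%}), {state} threshold"
```

Calling `report()` on an affected node should show how close the table is to the limit. It needs read access to the two sysctl files.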

We think it may be related to https://github.com/torvalds/linux/commit/e15d4cdf27cb0c1e977270270b2cea12e0955edd. Basically, host-to-host communication over UDP port 4789 (VXLAN traffic) is getting dropped somewhere in the network.

UDP is unreliable by design, and that is normally not a problem because retries are handled by the TCP layer above it. However, because of this bug, the conntrack entry goes stale and is never cleaned up.
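
To gauge how many of these entries exist, something like the following sketch could count [UNREPLIED] UDP entries whose original-direction destination port is 4789. It assumes the text layout printed by `conntrack -L` (or `/proc/net/nf_conntrack`). The sample lines and addresses are made up for illustration and are not taken from the affected cluster.

```python
from collections import Counter

VXLAN_PORT = 4789


def parse_fields(line):
    """Split one conntrack line into tokens and first-seen key=value pairs.

    Each entry lists the original tuple first and the reply tuple second,
    so keeping only the first occurrence of each key gives the original
    direction.
    """
    tokens = line.split()
    fields = {}
    for tok in tokens:
        if "=" in tok:
            key, value = tok.split("=", 1)
            fields.setdefault(key, value)
    return tokens, fields


def unreplied_vxlan(lines, port=VXLAN_PORT):
    """Count [UNREPLIED] UDP entries to `port`, grouped by (src, dst)."""
    hits = Counter()
    for line in lines:
        tokens, fields = parse_fields(line)
        if "udp" not in tokens or "[UNREPLIED]" not in tokens:
            continue
        if fields.get("dport") != str(port):
            continue
        hits[(fields.get("src"), fields.get("dst"))] += 1
    return hits


# Hypothetical sample data in the assumed `conntrack -L` layout.
SAMPLE = """\
udp      17 29 src=10.0.0.1 dst=10.0.0.2 sport=51234 dport=4789 [UNREPLIED] src=10.0.0.2 dst=10.0.0.1 sport=4789 dport=51234 mark=0 use=1
udp      17 25 src=10.0.0.1 dst=10.0.0.2 sport=40000 dport=4789 [UNREPLIED] src=10.0.0.2 dst=10.0.0.1 sport=4789 dport=40000 mark=0 use=1
udp      17 170 src=10.0.0.3 dst=10.0.0.1 sport=5353 dport=53 src=10.0.0.1 dst=10.0.0.3 sport=53 dport=5353 [ASSURED] mark=0 use=1
tcp      6 100 SYN_SENT src=10.0.0.1 dst=10.0.0.9 sport=4000 dport=4789 [UNREPLIED] src=10.0.0.9 dst=10.0.0.1 sport=4789 dport=4000 mark=0 use=1
""".splitlines()

if __name__ == "__main__":
    for (src, dst), n in unreplied_vxlan(SAMPLE).most_common():
        print(f"{src} -> {dst}: {n} unreplied VXLAN entries")
```

On a real node, the `lines` argument could come from the captured output of `conntrack -L`. Grouping by host pair helps show which node-to-node paths are dropping the traffic.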

From https://github.com/torvalds/linux/commit/e15d4cdf27cb0c1e977270270b2cea12e0955edd:
~~~
netfilter: conntrack: do not renew entry stuck in tcp SYN_SENT state
Consider:
  client -----> conntrack ---> Host

client sends a SYN, but $Host is unreachable/silent.
Client eventually gives up and the conntrack entry will time out.

However, if the client is restarted with same addr/port pair, it
may prevent the conntrack entry from timing out.

This is noticeable when the existing conntrack entry has no NAT
transformation or an outdated one and port reuse happens either
on client or due to a NAT middlebox.

This change prevents refresh of the timeout for SYN retransmits,
so entry is going away after nf_conntrack_tcp_timeout_syn_sent
seconds (default: 60).

Entry will be re-created on next connection attempt, but then
nat rules will be evaluated again.
~~~


Version-Release number of selected component (if applicable):

4.6

