Description of problem:
When pods associated with a service are removed, the endpoint objects are updated immediately, but there is a significant delay before this change is reflected in the NAT table on the cluster nodes. This delay causes connectivity problems for applications running within OpenShift, because traffic is still being routed to stale pod IPs.

Version-Release number of selected component (if applicable):
OCP 3.11.161

How reproducible:
Consistently

Actual results:
A lag is observed in NAT table updates related to endpoint changes.

Expected results:
NAT table updates happen within 1-2 seconds of an endpoint change.
Many problems with iptables in 3.11 can be resolved by setting `iptablesSyncPeriod` in the node-config to something large like '1h'. There is also a large set of iptables performance fixes making their way toward 3.11; bug 1795416 is the tracking bug for them. We don't know exactly when these will ship in an errata. At the moment we are finalizing the 4.2 backport, and 3.11 is next.
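
For reference, a minimal sketch of the workaround as it might appear in the node configuration. This assumes `iptablesSyncPeriod` is a top-level key in node-config.yaml and that the current value is the stock "30s"; check both against your own cluster. On 3.11 the node configuration is usually managed through the node group ConfigMaps rather than edited on the host, so where to make the change is also an assumption to verify.

```yaml
# Fragment of node-config.yaml (OCP 3.11); all other keys omitted.
# Assumption: the existing value is the default "30s".
# "1h" is the example value suggested in the comment above.
iptablesSyncPeriod: "1h"
```

The node service most likely has to pick up the new configuration before the setting takes effect.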
*** This bug has been marked as a duplicate of bug 1795416 ***