Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1481782

Summary: 3.5 check for and use new iptables-restore 'wait' argument
Product: OpenShift Container Platform    Reporter: Steven Walter <stwalter>
Component: Networking    Assignee: Ben Bennett <bbennett>
Status: CLOSED ERRATA    QA Contact: Meng Bo <bmeng>
Severity: urgent    Docs Contact:
Priority: unspecified
Version: 3.5.1    CC: aos-bugs, danw, dcbw, eparis, erich, pdwyer, yadu
Target Milestone: ---   
Target Release: 3.5.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:    Doc Type: Bug Fix
Doc Text:
Cause: The iptables proxy was not properly locking its use of iptables.
Consequence: The iptables proxy could conflict with docker and the openshift-node process and cause a failure to start containers.
Fix: The iptables proxy now locks its use of iptables.
Result: Pod creation failures due to improper locking of iptables should no longer occur.
Story Points: ---
Clone Of:
: 1484133 (view as bug list) Environment:
Last Closed: 2017-09-07 19:13:34 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Steven Walter 2017-08-15 17:17:25 UTC
Description of problem:

From GitHub pull request https://github.com/kubernetes/kubernetes/pull/43575

Starting in OCP 3.6 we check for and use the "wait" argument that RHEL 7.4 adds to iptables-restore. This allows us to avoid failures when the xtables lock is held by another process. RHEL is not planning to backport the fix to 7.3, per the discussion in https://bugzilla.redhat.com/show_bug.cgi?id=1438597

This bug is a request that we check for the flag in OCP 3.5. It needs to be possible to run without the flag (if on 7.3) or with the flag (if on 7.4). Because it is possible to run OCP 3.5 on RHEL 7.4, we need to take advantage of this flag.
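For illustration only, the idea is roughly the following shell sketch (this is not the actual openshift-node detection code; probing the usage text is just a heuristic, and /tmp/rules.txt is a placeholder):

# Probe whether this iptables-restore understands the wait flag.
if iptables-restore --help 2>&1 | grep -q -- '--wait'; then
    WAIT_FLAG="--wait"    # RHEL 7.4: let iptables-restore wait for the xtables lock
else
    WAIT_FLAG=""          # RHEL 7.3: no flag; the caller has to serialize iptables access itself
fi

# Apply a saved rule set without flushing unrelated chains.
iptables-restore $WAIT_FLAG --noflush < /tmp/rules.txt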

xtables locking can leave a node with many pods (including builds, etc.) stuck in ContainerCreating that do not recover on their own.

Version-Release number of selected component (if applicable):
3.5

Consider these four scenarios:

3.5 on rhel7.3 -- vulnerable to xtables lock
3.5 on rhel7.4 -- currently vulnerable to xtables lock, bug would resolve this
3.6 on rhel7.3 -- vulnerable to xtables lock, we advise upgrade to 7.4
3.6 on rhel7.4 -- safe

If this bug is resolved, customers would *only* need to upgrade RHEL from 7.3 to 7.4, rather than also upgrading to OCP 3.6.

Comment 1 Steven Walter 2017-08-15 17:23:36 UTC
For Bugzilla search purposes, the messages associated with this bug are:

Error syncing pod, skipping: failed to "SetupNetwork" for "package-event-source-139-deploy_vibe-develop" with SetupNetworkError: "Failed to setup network for pod \"example\" using network plugins \"cni\": CNI request failed with status 400: 'Failed to ensure that nat chain POSTROUTING jumps to MASQUERADE: error checking rule: exit status 4: 

Another app is currently holding the xtables lock; waiting (7s) for it to exit...\nAnother app is currently holding the xtables lock; waiting (9s) for it to exit...\n


Note that the first message also appears in https://bugzilla.redhat.com/1417234 -- in that case it is just log spam.

Comment 2 Dan Williams 2017-08-15 17:48:07 UTC
(In reply to Steven Walter from comment #0)
> Considering these 4 scenarios:
> 
> 3.5 on rhel7.3 -- vulnerable to xtables lock
> 3.5 on rhel7.4 -- currently vulnerable to xtables lock, bug would resolve
> this
> 3.6 on rhel7.3 -- vulnerable to xtables lock, we advise upgrade to 7.4

Correction: 3.6 on rhel7.3 should also be safe.  The code has fallback locking if iptables-restore doesn't support the wait argument.
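Conceptually, the fallback amounts to explicitly taking the xtables lock before invoking iptables-restore. On systems where that lock is the flock on /run/xtables.lock, a shell equivalent would look something like this sketch (not the actual kube-proxy code; /tmp/rules.txt is a placeholder):

(
    # Block until no other iptables/iptables-restore invocation holds the lock.
    flock 200
    iptables-restore --noflush < /tmp/rules.txt
) 200>/run/xtables.lock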

Comment 3 Steven Walter 2017-08-16 16:08:36 UTC
For recovery purposes, if a cluster hits this, what is the best way to release the lock when it is stuck? We should be able to determine what is holding the lock with:

# lsof /run/xtables.lock

Or

# find /proc -regex '\/proc\/[0-9]+\/fd\/.*' -type l -lname "*xtables.lock*" -printf "%p -> %l\n" 2> /dev/null

But I'm not sure how to release the lock (restart certain services?). The customer ended up scaling everything down, rebooting all the hosts, and scaling back up, but there is surely an easier way.
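Since the xtables lock is an flock(2) on /run/xtables.lock, the kernel drops it as soon as the holding process exits, so identifying the owner (and, if needed, restarting its service) should be the main step. For example (the PID is hypothetical):

# The actual holder of the exclusive lock is marked with a trailing 'W'
# in lsof's FD column (e.g. "3rW"); other entries merely have the file open.
lsof +c0 /run/xtables.lock

# Map that PID to its systemd unit (e.g. docker or atomic-openshift-node)
# before deciding which service to restart.
systemctl status 12345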

Comment 6 Yan Du 2017-08-30 07:20:38 UTC
Tested on OCP 3.5 + RHEL 7.3 and OCP 3.5 + RHEL 7.4:
oc v3.5.5.31.24
kubernetes v1.5.2+43a9be4
iptables v1.4.21

On the master side, use an infinite loop to keep creating services that all point at the same endpoint, along the lines of the sketch below.
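For example (the dc and service names are placeholders, not the exact objects used in this run):

# Each new service selects the same pods, so every iteration forces the
# proxy to resync iptables rules on the nodes.
i=0
while true; do
    oc expose dc/myapp --name=svc-$i --port=8080
    i=$((i+1))
done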

On the node side, check which process has the /run/xtables.lock file open.

[root@ip-172-18-11-127 ~]# while true ; do lsof +c0 /run/xtables.lock ; done
COMMAND           PID USER   FD   TYPE DEVICE SIZE/OFF  NODE NAME
iptables-restor 18773 root    3r   REG   0,18        0 24967 /run/xtables.lock
COMMAND           PID USER   FD   TYPE DEVICE SIZE/OFF  NODE NAME
iptables-restor 18886 root    3r   REG   0,18        0 24967 /run/xtables.lock
COMMAND           PID USER   FD   TYPE DEVICE SIZE/OFF  NODE NAME
iptables-restor 19039 root    3r   REG   0,18        0 24967 /run/xtables.lock
COMMAND           PID USER   FD   TYPE DEVICE SIZE/OFF  NODE NAME
iptables-restor 19199 root    3r   REG   0,18        0 24967 /run/xtables.lock
COMMAND           PID USER   FD   TYPE DEVICE SIZE/OFF  NODE NAME
iptables-restor 19279 root    3r   REG   0,18        0 24967 /run/xtables.lock
COMMAND           PID USER   FD   TYPE DEVICE SIZE/OFF  NODE NAME
iptables-restor 19612 root    3rW  REG   0,18        0 24967 /run/xtables.lock

And no "Resource temporarily unavailable (exit status 4)" in node log.

Comment 8 errata-xmlrpc 2017-09-07 19:13:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2670