Bug 1269454 - openshift-node should wait for xtables lock to be released
Summary: openshift-node should wait for xtables lock to be released
Keywords:
Status: CLOSED DUPLICATE of bug 1267670
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Containers
Version: 3.0.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Paul Weil
QA Contact: Chao Yang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-10-07 11:53 UTC by Evgheni Dereveanchin
Modified: 2019-08-15 05:37 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-10-14 11:31:48 UTC
Target Upstream Version:
Embargoed:



Description Evgheni Dereveanchin 2015-10-07 11:53:16 UTC
Description of problem:
openshift-node uses the iptables binary to set up rules at boot. Since RHEL 7.1, iptables has a locking mechanism (the xtables lock) that prevents two instances from running simultaneously. Currently, if the lock is held when openshift-node invokes iptables, the call fails and the resulting ruleset is inconsistent. This can happen when another script or service (firewalld, iptables, etc.) is started at the same time as openshift-node.

Version-Release number of selected component (if applicable):
3.0.2

How reproducible:
Rarely, when another script that uses the iptables binary starts at the same time as openshift-node.

Steps to Reproduce:
No clear reproducer at the moment. In some configurations another script that also invokes iptables may be running.

Actual results:

Sep 30 15:15:15 node1.demo.lan openshift-node[2587]: ++ iptables -nvL INPUT --line-numbers
Sep 30 15:46:27 node1.demo.lan openshift-node[2587]: ++ grep 'state RELATED,ESTABLISHED'
Sep 30 15:46:27 node1.demo.lan openshift-node[2587]: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
Sep 30 15:46:27 node1.demo.lan openshift-node[2587]: ++ awk '{print $1}'
Sep 30 15:46:27 node1.demo.lan openshift-node[2587]: + lineno=
Sep 30 15:46:27 node1.demo.lan openshift-node[2587]: + iptables -I INPUT -p udp -m multiport --dports 4789 -m comment --comment '001 vxlan incoming' -j ACCEPT
Sep 30 15:46:27 node1.demo.lan openshift-node[2587]: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
Sep 30 15:46:27 node1.demo.lan openshift-node[2587]: E0930 15:46:27.927232    2587 kube.go:39] Error executing setup script.

Expected results:

Sep 30 15:15:15 node1.demo.lan openshift-node[2200]: ++ iptables -nvL INPUT --line-numbers
Sep 30 15:15:15 node1.demo.lan openshift-node[2200]: ++ grep 'state RELATED,ESTABLISHED'
Sep 30 15:15:15 node1.demo.lan openshift-node[2200]: ++ awk '{print $1}'
Sep 30 15:15:15 node1.demo.lan openshift-node[2200]: + lineno=1
Sep 30 15:15:15 node1.demo.lan openshift-node[2200]: + iptables -I INPUT 1 -p udp -m multiport --dports 4789 -m comment --comment '001 vxlan incoming' -j ACCEPT
Sep 30 15:15:15 node1.demo.lan openshift-node[2200]: + iptables -I INPUT 2 -i tun0 -m comment --comment 'traffic from docker for internet' -j ACCEPT

Additional info:

Note that because the listing command fails on the lock, the parsing that follows gets no input, so lineno is not set, which breaks the subsequent rules.
A solution (as noted in the error message) is to add the -w option to wait for the lock to be released, or -w2 to wait for 2 seconds and then fail. Another option would be to catch these errors and either wait or stop processing completely, instead of trying to run incorrect iptables commands.
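
The two suggestions above could be combined roughly as in the sketch below. This is not the actual openshift-node setup script: the names IPTABLES, ipt and insert_vxlan_rule are hypothetical, and treating a missing RELATED,ESTABLISHED rule as an error is an assumption made for illustration. Only the iptables arguments are taken from the log output above.

```shell
#!/bin/sh
# Sketch only: IPTABLES, ipt and insert_vxlan_rule are hypothetical names,
# not taken from the real openshift-node setup script.
IPTABLES="${IPTABLES:-iptables}"

# Always pass -w so iptables waits for the xtables lock instead of failing.
ipt() {
    "$IPTABLES" -w "$@"
}

insert_vxlan_rule() {
    # Capture the listing first, so that a failed listing is detected
    # instead of being parsed as empty output.
    listing=$(ipt -nvL INPUT --line-numbers) || {
        echo "could not list INPUT chain; aborting" >&2
        return 1
    }

    # Line number of the first RELATED,ESTABLISHED rule.
    lineno=$(printf '%s\n' "$listing" |
        awk '/state RELATED,ESTABLISHED/ {print $1; exit}')

    # Assumption: stop rather than insert at an unknown position.
    if [ -z "$lineno" ]; then
        echo "no RELATED,ESTABLISHED rule found; aborting" >&2
        return 1
    fi

    ipt -I INPUT "$lineno" -p udp -m multiport --dports 4789 \
        -m comment --comment '001 vxlan incoming' -j ACCEPT
}
```

With this shape, a lock conflict either resolves itself (iptables waits) or surfaces as a non-zero return from insert_vxlan_rule, so the caller can stop before any rule is inserted with an empty lineno.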

Comment 1 Josep 'Pep' Turro Mauri 2015-10-14 11:31:48 UTC

*** This bug has been marked as a duplicate of bug 1267670 ***

