Bug 1269454
| Summary: | openshift-node should wait for xtables lock to be released | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Evgheni Dereveanchin <ederevea> |
| Component: | Containers | Assignee: | Paul Weil <pweil> |
| Status: | CLOSED DUPLICATE | QA Contact: | Chao Yang <chaoyang> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 3.0.0 | CC: | aos-bugs, jokerman, mmccomas, pep |
| Target Milestone: | --- | ||
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2015-10-14 11:31:48 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
*** This bug has been marked as a duplicate of bug 1267670 ***
Description of problem:
openshift-node uses the iptables binary to set up rules at boot. Since RHEL 7.1 there is a locking mechanism that prevents two instances of iptables from running simultaneously. Currently, if the lock is held, the iptables call fails and the resulting ruleset is inconsistent. This can happen when another script (firewalld/iptables/etc) is started at the same time as openshift-node.

Version-Release number of selected component (if applicable):
3.0.2

How reproducible:
Rarely, when some other script that uses the iptables binary starts at the same time as openshift-node.

Steps to Reproduce:
No clear reproducer at the moment. In some configurations another script that also invokes iptables may be running.

Actual results:
Sep 30 15:15:15 node1.demo.lan openshift-node[2587]: ++ iptables -nvL INPUT --line-numbers
Sep 30 15:46:27 node1.demo.lan openshift-node[2587]: ++ grep 'state RELATED,ESTABLISHED'
Sep 30 15:46:27 node1.demo.lan openshift-node[2587]: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
Sep 30 15:46:27 node1.demo.lan openshift-node[2587]: ++ awk '{print $1}'
Sep 30 15:46:27 node1.demo.lan openshift-node[2587]: + lineno=
Sep 30 15:46:27 node1.demo.lan openshift-node[2587]: + iptables -I INPUT -p udp -m multiport --dports 4789 -m comment --comment '001 vxlan incoming' -j ACCEPT
Sep 30 15:46:27 node1.demo.lan openshift-node[2587]: Another app is currently holding the xtables lock. Perhaps you want to use the -w option?
Sep 30 15:46:27 node1.demo.lan openshift-node[2587]: E0930 15:46:27.927232 2587 kube.go:39] Error executing setup script.

Expected results:
Sep 30 15:15:15 node1.demo.lan openshift-node[2200]: ++ iptables -nvL INPUT --line-numbers
Sep 30 15:15:15 node1.demo.lan openshift-node[2200]: ++ grep 'state RELATED,ESTABLISHED'
Sep 30 15:15:15 node1.demo.lan openshift-node[2200]: ++ awk '{print $1}'
Sep 30 15:15:15 node1.demo.lan openshift-node[2200]: + lineno=1
Sep 30 15:15:15 node1.demo.lan openshift-node[2200]: + iptables -I INPUT 1 -p udp -m multiport --dports 4789 -m comment --comment '001 vxlan incoming' -j ACCEPT
Sep 30 15:15:15 node1.demo.lan openshift-node[2200]: + iptables -I INPUT 2 -i tun0 -m comment --comment 'traffic from docker for internet' -j ACCEPT

Additional info:
Note that when the lock is held, the rule listing fails, so the subsequent parsing yields nothing and lineno is left empty, which breaks the rules that follow. One solution (as noted in the error message) is to add the -w option to wait for the lock to be released, or -w2 to wait for 2 seconds and then fail. Another option would be to catch these errors and either wait or stop processing completely instead of running incorrect iptables commands.
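
A minimal sketch of how the two suggestions above could be combined in a shell setup script. It is not the actual openshift-node script: the names IPTABLES_BIN, ipt, rule_lineno and insert_vxlan_rule are hypothetical, and only the iptables arguments are taken from the log above. It assumes an iptables build that supports -w.

```shell
# Hypothetical helper names; this is an illustration, not the shipped script.
# IPTABLES_BIN can be overridden, e.g. to point at a stub for testing.
IPTABLES_BIN="${IPTABLES_BIN:-iptables}"

# Always pass -w so iptables waits for the xtables lock instead of failing.
ipt() {
    "$IPTABLES_BIN" -w "$@"
}

# Print the line number of the first INPUT rule containing the given text.
# Returns non-zero if the listing itself fails or if no rule matches,
# so the caller never continues with an empty line number.
rule_lineno() {
    local pattern="$1" listing lineno
    listing="$(ipt -nvL INPUT --line-numbers)" || return 1
    lineno="$(printf '%s\n' "$listing" |
        awk -v pat="$pattern" 'index($0, pat) { print $1; exit }')"
    [ -n "$lineno" ] || return 1
    printf '%s\n' "$lineno"
}

# Insert the VXLAN rule at the position of the RELATED,ESTABLISHED rule,
# or stop without touching the ruleset if that position is unknown.
insert_vxlan_rule() {
    local lineno
    lineno="$(rule_lineno 'state RELATED,ESTABLISHED')" || {
        echo "cannot determine INPUT rule position, not modifying rules" >&2
        return 1
    }
    ipt -I INPUT "$lineno" -p udp -m multiport --dports 4789 \
        -m comment --comment '001 vxlan incoming' -j ACCEPT
}
```

With -w alone iptables waits indefinitely for the lock; the explicit check in rule_lineno covers the case where the listing still fails (for example with -w2 after the timeout), so an insert with an empty position is never attempted.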