Bug 1156200

Summary: oo-admin-ctl-iptables-port-proxy is needlessly slow under DNS failures
Product: OpenShift Container Platform
Reporter: Luke Meyer <lmeyer>
Component: Containers
Assignee: Luke Meyer <lmeyer>
Status: CLOSED ERRATA
QA Contact: libra bugs <libra-bugs>
Severity: medium
Docs Contact:
Priority: high
Version: 2.2.0
CC: anli, jialiu, jokerman, libra-onpremise-devel, mmccomas
Target Milestone: ---
Keywords: EasyFix
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: rubygem-openshift-origin-node-1.31.3.5-1.el6op
Doc Type: Bug Fix
Doc Text:
Cause: oo-admin-ctl-iptables-port-proxy uses the "iptables -L" command in several places, which by default attempts to reverse-resolve the IPs listed to hostnames. Since most of the IPs on an OSE node are internal and have no hostname associated, all configured nameservers may be consulted once for each rule in the tables.
Consequence: If all nameservers are appropriately configured, "service openshift-iptables-port-proxy restart" tends to take a few seconds. If any nameservers are unreachable or slow in responding, resolving several thousand internal IPs was reported to take over an hour.
Fix: Add the -n flag to iptables so that it does not attempt to reverse-resolve IPs, which was unnecessary to begin with.
Result: "service openshift-iptables-port-proxy restart" completes in sub-second timing under either condition.
Story Points: ---
Clone Of:
: 1159997
Environment:
Last Closed: 2014-11-03 19:55:43 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1159997

Description Luke Meyer 2014-10-23 19:30:27 UTC
Description of problem:
If there is an unreachable DNS server in /etc/resolv.conf, "iptables -L" is extremely slow because it tries to reverse-resolve the IPs. This command is run in several places in oo-admin-ctl-iptables-port-proxy where we don't appear to need the reverse resolution and could get effectively the same results with "iptables -nL", which does not try to resolve anything. As it is, the script takes an extraordinarily long time to run under these conditions.

Comment 1 Anping Li 2014-10-24 00:08:57 UTC
The issue can be reproduced as follows.
1. Find an OSE 2.2 OpenShift environment.
2. Enable a district for the node.
3. On the node, enable the firewall for gears:
   time oo-gear-firewall -i enable -s enable
   (Hint: the command may run for hours if there is an unreachable DNS server.)
4. Add an unreachable DNS server to /etc/resolv.conf.
5. Run iptables -L (the command is very slow with an unreachable DNS server).
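
The slowdown in step 5 can be measured with a small timing harness such as the sketch below. It is only an illustration: time_command is a hypothetical helper, not part of OpenShift, and the iptables invocations shown in the comment assume iptables is on PATH and the script runs as root on the node.

```python
import subprocess
import time


def time_command(argv):
    """Run argv, discard its output, and return elapsed wall-clock seconds."""
    start = time.monotonic()
    subprocess.run(
        argv,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,  # only the timing matters here, not the exit status
    )
    return time.monotonic() - start


# Example use on a node (assumption: run as root, iptables installed):
#   slow = time_command(["iptables", "-L"])        # reverse-resolves each IP
#   fast = time_command(["iptables", "-n", "-L"])  # numeric output, no DNS
#   print("with DNS: %.3f s, numeric: %.3f s" % (slow, fast))
```

Comparing the two timings before and after step 4 shows how much of the run time is spent waiting on DNS rather than listing rules.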

Comment 2 Luke Meyer 2014-10-28 17:03:20 UTC
https://github.com/openshift/origin-server/pull/5912 in Origin. Even on a devenv with nothing wrong, it saved a second or two when restarting the service. With a bogus DNS server added to the top of /etc/resolv.conf, the difference grew to about 15-30 seconds. Depending on how long each lookup takes to give up on the DNS server, this could make a big difference.
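
To get a feel for how this delay scales, here is a rough back-of-the-envelope model, not a measurement. It assumes each reverse lookup waits out one full resolver timeout for every unreachable nameserver listed ahead of the first working one, and it uses the glibc default timeout of 5 seconds. The function name and the lookup counts are hypothetical.

```python
def estimated_delay_seconds(lookups, dead_servers_first=1, timeout_s=5.0):
    """Rough added delay from reverse DNS lookups that hit dead nameservers.

    Simplified model (an assumption, not how the node was measured):
    every lookup waits timeout_s for each unreachable nameserver that is
    listed before the first reachable one, and the reachable server then
    answers instantly.
    """
    return lookups * dead_servers_first * timeout_s


# Hypothetical lookup counts, for illustration only.
for lookups in (5, 1000, 3000):
    seconds = estimated_delay_seconds(lookups)
    print("%5d lookups: about %.0f s (%.1f min)" % (lookups, seconds, seconds / 60))
```

Under these assumptions a handful of lookups costs tens of seconds and a few thousand cost well over an hour, which is in line with the figures reported in this bug. A shorter timeout in /etc/resolv.conf would shrink the delay, but skipping the lookups with -n removes it entirely.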

Comment 3 openshift-github-bot 2014-10-28 17:56:50 UTC
Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/57d46aa5d9e14cc54f3f64052a32c7c96947110a
iptables-port-proxy: use -n on iptables -L

Since there's no evident need to have iptables perform reverse DNS
resolution on the entries in the tables, don't do that. It saves time
even when nothing is wrong with DNS config.

Bug 1156200 - oo-admin-ctl-iptables-port-proxy is needlessly slow under DNS failures
https://bugzilla.redhat.com/show_bug.cgi?id=1156200

Comment 6 Anping Li 2014-10-29 03:05:03 UTC
Verified; passes on OSE-2.2 puddle 2014-10-28.1.

[root@node2 ~]# time service openshift-iptables-port-proxy restart
192.168.0.2
192.168.0.2/32

real	0m0.132s
user	0m0.039s
sys	0m0.045s

Comment 8 errata-xmlrpc 2014-11-03 19:55:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2014-1796.html