Bug 1156200

Summary:	oo-admin-ctl-iptables-port-proxy is needlessly slow under DNS failures
Product:	OpenShift Container Platform	Reporter:	Luke Meyer <lmeyer>
Component:	Containers	Assignee:	Luke Meyer <lmeyer>
Status:	CLOSED ERRATA	QA Contact:	libra bugs <libra-bugs>
Severity:	medium	Docs Contact:
Priority:	high
Version:	2.2.0	CC:	anli, jialiu, jokerman, libra-onpremise-devel, mmccomas
Target Milestone:	---	Keywords:	EasyFix
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	rubygem-openshift-origin-node-1.31.3.5-1.el6op	Doc Type:	Bug Fix
Doc Text:	Cause: oo-admin-ctl-iptables-port-proxy uses the "iptables -L" command in several places, which by default attempts to reverse-resolve the IPs listed to hostnames. Since most of the IPs on an OSE node are internal and have no hostname associated, all configured nameservers may be consulted once for each rule in the tables. Consequence: If all nameservers are appropriately configured, "service openshift-iptables-port-proxy restart" tends to take a few seconds. If any nameservers are unreachable or slow in responding, resolving several thousand internal IPs was reported to take over an hour. Fix: Add the -n flag to iptables so that it does not attempt to reverse-resolve IPs, which was unnecessary to begin with. Result: "service openshift-iptables-port-proxy restart" completes in sub-second timing under either condition.	Story Points:	---
Clone Of:
Clones:	1159997 (view as bug list)		Environment:
Last Closed:	2014-11-03 19:55:43 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1159997

Description Luke Meyer 2014-10-23 19:30:27 UTC

Description of problem:
If /etc/resolv.conf lists an unreachable DNS server, "iptables -L" is extremely slow because it tries to reverse-resolve every IP it lists. oo-admin-ctl-iptables-port-proxy runs this command in several places where we don't appear to need the reverse resolution and could get effectively the same results from "iptables -nL", which does not try to resolve anything. As it stands, the script takes an extraordinarily long time to run under these conditions.
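
To illustrate the proposed change, here is a minimal shell sketch of a listing wrapper that always passes -n. The function name list_rules is hypothetical and is not taken from oo-admin-ctl-iptables-port-proxy; it only shows where the flag goes.

```shell
# Hypothetical wrapper (not from the actual script): list rules numerically.
# -n prints addresses and ports as numbers, so iptables performs no
# reverse DNS lookups while listing.
list_rules() {
    iptables -n -L "$@"
}
```

For example, "list_rules -t nat" would run "iptables -n -L -t nat", and "list_rules INPUT" would list only the INPUT chain.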

Comment 1 Anping Li 2014-10-24 00:08:57 UTC

The issue can be reproduced as follows:
1. Find an OSE 2.2 OpenShift environment.
2. Enable a district for the node.
3. On the node, enable the firewall for gears:
   time oo-gear-firewall -i enable -s enable
   (Hint: the command may run for hours if there is an unreachable DNS server.)
4. Add an unreachable DNS server to /etc/resolv.conf.
5. Run iptables -L (the command is very slow with an unreachable DNS server).
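
Step 4 above could be scripted roughly as follows. This is only a sketch: the helper name prepend_nameserver is hypothetical, and the address used in the usage note (192.0.2.1, from the range reserved for documentation) is an assumption chosen because it should not answer DNS queries.

```shell
# Hypothetical helper: put a nameserver entry at the top of a
# resolv.conf-style file, saving the original as <file>.bak first.
prepend_nameserver() {
    file=$1
    addr=$2
    cp -p "$file" "$file.bak"
    # New entry first, followed by the original contents.
    { printf 'nameserver %s\n' "$addr"; cat "$file.bak"; } > "$file"
}
```

As root on a test node, "prepend_nameserver /etc/resolv.conf 192.0.2.1" followed by "time iptables -L" and "time iptables -nL" should show the difference. Restore the original afterwards with "mv /etc/resolv.conf.bak /etc/resolv.conf".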

Comment 2 Luke Meyer 2014-10-28 17:03:20 UTC

https://github.com/openshift/origin-server/pull/5912 in Origin. Even on a devenv with nothing wrong, it saved a second or two when restarting the service. With a bogus DNS server added to the top of /etc/resolv.conf, the difference was larger, around 15-30 seconds. Depending on how long it takes to give up on the DNS server each time, this could make a big difference.
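
A back-of-the-envelope estimate shows how this scales. The sketch below assumes each reverse lookup waits one full resolver timeout on the dead server before another server answers. The function name estimate_wait is hypothetical, and the figures in the usage note are illustrative rather than measured.

```shell
# Rough estimate (assumption: one full timeout per lookup) of the
# seconds spent waiting on an unreachable nameserver.
#   $1 = number of addresses iptables tries to reverse-resolve
#   $2 = resolver timeout in seconds
estimate_wait() {
    echo $(( $1 * $2 ))
}
```

With the glibc resolver's default timeout of 5 seconds, "estimate_wait 1000 5" prints 5000, or roughly 83 minutes for 1000 lookups.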

Comment 3 openshift-github-bot 2014-10-28 17:56:50 UTC

Commit pushed to master at https://github.com/openshift/origin-server

https://github.com/openshift/origin-server/commit/57d46aa5d9e14cc54f3f64052a32c7c96947110a
iptables-port-proxy: use -n on iptables -L

Since there's no evident need to have iptables perform reverse DNS
resolution on the entries in the tables, don't do that. It saves time
even when nothing is wrong with DNS config.

Bug 1156200 - oo-admin-ctl-iptables-port-proxy is needlessly slow under DNS failures
https://bugzilla.redhat.com/show_bug.cgi?id=1156200

Comment 6 Anping Li 2014-10-29 03:05:03 UTC

Verified and passed on OSE-2.2 puddle 2014-10-28.1

[root@node2 ~]# time service openshift-iptables-port-proxy restart
192.168.0.2
192.168.0.2/32

real	0m0.132s
user	0m0.039s
sys	0m0.045s

Comment 8 errata-xmlrpc 2014-11-03 19:55:43 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2014-1796.html