1174469 – whois lookups hanging with high CPU usage

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	jwhois
Sub Component:
Version:	6.6
Hardware:	x86_64
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Vitezslav Crhonek
QA Contact:	BaseOS QE - Apps
Docs Contact:
URL:
Whiteboard:
Depends On:	469412
Blocks:

Reported:	2014-12-15 21:56 UTC by Orion Poplawski
Modified:	2017-12-06 10:35 UTC
CC List:	11 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:	469412
Environment:
Last Closed:	2017-12-06 10:35:45 UTC
Target Upstream Version:
Embargoed:

Description Orion Poplawski 2014-12-15 21:56:08 UTC

+++ This bug was initially created as a clone of Bug #469412 +++

Description of problem:
I am having intermittent problems with whois lookups hanging, and using large amounts of CPU time when they do. This seems to affect looking up against different servers, but doesn't happen all of the time - but when it is happening, it affects a high proportion of the lookups I do. The problem is not reproducible at will but happens to me on a fairly regular basis - perhaps every few weeks for a period of a few days.

My machine makes several tens of whois lookups on IP addresses per day (I'm using Fail2ban with an action to complain to the ISPs of offending IPs - http://www.gloomytrousers.co.uk/open_source/fail2ban.shtml), so I'm guessing I may occasionally be falling foul of rate-limiting for whois lookups against particular servers - but I would not expect whois to hang and consume excessive CPU in this case. I suspect it is busy-waiting for a response, which is only noticeable when a response does not come quickly.

During periods when it's happening, I can reproduce it by running the same lookups manually, and I can also perform other, apparently unrelated, lookups, some of which also fail and some succeed - against domain names as well as IP addresses.

Here are some results from top showing hung lookups from yesterday and today (the IP addresses are sources of spam, SSH brute force attacks, and the like; the last is a manual lookup):
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                 
21068 root      20   0 89860  904  676 R 72.1  0.0 164:20.51 whois 125.167.105.130

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                 
  700 root      20   0 89860 1048  820 R 25.1  0.1  31:35.02 whois 211.214.161.93
26946 root      20   0 89860 1048  820 R 23.4  0.1 244:51.37 whois 122.167.13.13
30168 root      20   0 89860 1048  820 R 21.7  0.1 100:07.98 whois 222.208.183.218

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                           
 5268 root      20   0 89860 1040  796 R 80.6  0.1   1:50.47 whois gloomytrousers.co.uk
 
I can perform the same lookups on another host (running CentOS4 and jwhois-3.2.2-6.EL4.1.i386) with no problems, and this host has never experienced the same problem of hung lookups despite running fail2ban in an identical config.

Version-Release number of selected component (if applicable):
[root@detritus ~]# rpm -q jwhois
jwhois-4.0-4.fc8.x86_64
[root@detritus ~]# uname -a
Linux detritus.local 2.6.25.11-60.fc8 #1 SMP Mon Jul 21 01:40:51 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:
Intermittent, periodically easy to reproduce, but long periods with no occurrence.

Steps to Reproduce:
1. (unable to determine the conditions which cause this)
  
Actual results:
whois lookups hang using high CPU

Expected results:
Lookup should either succeed, or fail and exit, and not use excessive CPU while running/waiting for a response.

Additional info:
I'd like to collect some additional info to aid tracking this one down - anyone got any suggestions for how to collect anything which might help?

I've tried setting "connect-timeout" in /etc/jwhois.conf but it made no difference.

--- Additional comment from Russell Odom on 2008-11-01 05:47:06 EDT ---

OK, it appears to have started behaving again - the above lookups are working once again.

By way of a test, I tried this...

[root@detritus ~]# host whois.nic.uk
whois.nic.uk has address 213.248.210.12
[root@detritus ~]# iptables -I OUTPUT --dst 213.248.210.12 -j DROP
[root@detritus ~]# whois gloomytrousers.co.uk
[Querying whois.nic.uk]
^C

...and although the lookup hung, as expected, CPU usage was 0.

--- Additional comment from Joshua Roys on 2009-04-29 11:57:56 EDT ---

... because it sits in a do/while loop calling read() on a non-blocking socket.

This appears to fix it for me.

--- Additional comment from Joshua Roys on 2009-04-29 11:59:24 EDT ---

Because I can and hopefully it makes it easier for whoever may need to apply it.

--- Additional comment from Michael Breuer on 2010-01-08 02:32:52 EST ---

FWIW, I'm seeing this too - but only when IPv6 is involved.
From strace whois 74.220.121.126:
<normal stuff I suppose>
connect(3, {sa_family=AF_INET, sin_port=htons(43), sin_addr=inet_addr("199.212.0.43")}, 16) = 0
getsockname(3, {sa_family=AF_INET6, sin6_port=htons(40891), inet_pton(AF_INET6, "::ffff:68.192.13.200", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0
close(3)                                = 0
socket(PF_INET6, SOCK_STREAM, IPPROTO_TCP) = 3
fcntl(3, F_GETFL)                       = 0x2 (flags O_RDWR)
fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK)    = 0
connect(3, {sa_family=AF_INET6, sin6_port=htons(43), inet_pton(AF_INET6, "2001:500:4:1::81", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EINPROGRESS (Operation now in progress)
select(1024, NULL, [3], NULL, {75, 0})  = 1 (out [3], left {74, 979698})
getsockopt(3, SOL_SOCKET, SO_ERROR, [-4985675827644465152], [4]) = 0
write(3, "74.220.121.126\r\n", 16)      = 16
read(3, 0x7fffbacf4d30, 1023)           = -1 EAGAIN (Resource temporarily unavailable)
read(3, 0x7fffbacf4d30, 1023)           = -1 EAGAIN (Resource temporarily unavailable)
<continues forever>

--- Additional comment from Fedora Update System on 2010-01-26 10:03:56 EST ---

jwhois-4.0-19.fc12 has been submitted as an update for Fedora 12.
http://admin.fedoraproject.org/updates/jwhois-4.0-19.fc12

--- Additional comment from Fedora Update System on 2010-01-27 20:03:40 EST ---

jwhois-4.0-19.fc12 has been pushed to the Fedora 12 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update jwhois'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/F12/FEDORA-2010-1172

--- Additional comment from David J. Schwartz on 2010-01-29 19:08:46 EST ---

Fedora 11 has this bug in its latest jwhois package. I believe the best fix is simply to turn off non-blocking mode after connecting. I suggest the following patch. (I will try to contact the jwhois maintainers too.)

--- jwhois-4.0/src/utils_old.c  2010-01-29 16:00:21.261869369 -0800
+++ jwhois-4.0/src/utils.c      2010-01-29 16:00:43.007869124 -0800
@@ -298,6 +298,11 @@ make_connect(const char *host, int port)
       break;
     }
 #endif
+  flags = fcntl(sockfd, F_GETFL, 0);
+  if (fcntl(sockfd, F_SETFL, flags&~O_NONBLOCK) == -1)
+    {
+      return -1;
+    }
 
   return sockfd;
 }
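The same idea as the patch above can be written as a small standalone function, which makes it easier to try outside of jwhois. This is a sketch only: the name set_blocking is hypothetical, and unlike the patch as posted it also checks the return value of the F_GETFL call before using it.

```c
#include <fcntl.h>

/*
 * Hypothetical helper (not part of jwhois): clear O_NONBLOCK on fd so
 * that later read() calls block instead of returning EAGAIN.
 * Returns 0 on success, -1 on error (errno is set by fcntl).
 */
int set_blocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;      /* fd is invalid or flags cannot be read */

    if (fcntl(fd, F_SETFL, flags & ~O_NONBLOCK) == -1)
        return -1;

    return 0;
}
```

Note that a plain blocking read() has no timeout of its own, so a lookup against a silent server would still hang, only without consuming CPU.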

--- Additional comment from Vitezslav Crhonek on 2010-02-01 06:55:53 EST ---

(In reply to comment #15)
> Fedora 11 has this bug in its latest jwhois package. I believe the best fix is
> simply to turn off non-blocking mode after connecting. I suggest the following
> patch. (I will try to contact the jwhois maintainers too.)
> 
> --- jwhois-4.0/src/utils_old.c  2010-01-29 16:00:21.261869369 -0800
> +++ jwhois-4.0/src/utils.c      2010-01-29 16:00:43.007869124 -0800
> @@ -298,6 +298,11 @@ make_connect(const char *host, int port)
>        break;
>      }
>  #endif
> +  flags = fcntl(sockfd, F_GETFL, 0);
> +  if (fcntl(sockfd, F_SETFL, flags&~O_NONBLOCK) == -1)
> +    {
> +      return -1;
> +    }
> 
>    return sockfd;
>  }    

Did you try jwhois-4.0-14.fc11 from the testing repository? It should be fixed there...

--- Additional comment from Fedora Update System on 2010-02-11 09:46:15 EST ---

jwhois-4.0-19.fc12 has been pushed to the Fedora 12 stable repository.  If problems still persist, please make note of it in this bug report.

--- Additional comment from Kai on 2012-12-21 14:18:39 EST ---

I'm having the very same issue on CentOS 6.3 with jwhois 4.0-19.el6, attached strace output.

Comment 1 Orion Poplawski 2014-12-15 21:57:40 UTC

Looks like jwhois-4.0-19.el6.x86_64 is missing this fix, and possibly others.  I'm seeing it effectively shut down fail2ban while it waits for whois to complete.

Comment 3 Vitezslav Crhonek 2015-01-19 15:09:26 UTC

(In reply to Orion Poplawski from comment #1)
> Looks like jwhois-4.0-19.el6.x86_64 is missing this fix, and possibly
> others.  I'm seeing it effectively shut down fail2ban while it waits for
> whois to complete.

Yes, it's missing this patch (named 'jwhois-4.0-select.patch' in newer releases) - this one should suffice to fix the issue.

Comment 5 Jan Kurik 2017-12-06 10:35:45 UTC

Red Hat Enterprise Linux 6 is in the Production 3 Phase. During the Production 3 Phase, Critical impact Security Advisories (RHSAs) and selected Urgent Priority Bug Fix Advisories (RHBAs) may be released as they become available.

The official life cycle policy can be reviewed here:

http://redhat.com/rhel/lifecycle

This issue does not meet the inclusion criteria for the Production 3 Phase and will be marked as CLOSED/WONTFIX. If this remains a critical requirement, please contact Red Hat Customer Support to request a re-evaluation of the issue, citing a clear business justification. Note that a strong business justification will be required for re-evaluation. Red Hat Customer Support can be contacted via the Red Hat Customer Portal at the following URL:

https://access.redhat.com/
