Bug 104733

Summary: Most programs hang on DNS lookups
Product: [Retired] Red Hat Raw Hide
Reporter: Tom Lofgren <rh-bugs>
Component: glibc
Assignee: Jakub Jelinek <jakub>
Status: CLOSED WORKSFORME
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 1.0
CC: fweimer
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2004-09-28 06:12:23 UTC

Description Tom Lofgren 2003-09-19 21:32:53 UTC
Description of problem:
Lots of applications hang on DNS lookups.  Most applications that do DNS queries
are affected, such as dig, nslookup, mozilla, etc.  Interestingly enough, the
telnet and ssh command-line clients work fine.

Version-Release number of selected component (if applicable):
glibc-2.3.2-82.i686.rpm, but I believe I saw it with slightly older rawhide
versions as well.  The one in stock RH 9, glibc-2.3.2-11.9.i686.rpm, is good.

How reproducible:
I can reproduce it every time on my machine.  In fact, I can't get the programs
to *not* hang.

Steps to Reproduce:
1. dig www.cnn.com
    
Actual results:
No output: process is hung.

Expected results:
Regular dig output of IP addresses.

Additional info:
Here is the last part of an strace of the dig:

open("/etc/resolv.conf", O_RDONLY)      = 5
fstat64(5, {st_mode=S_IFREG|0644, st_size=105, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40016000
read(5, "; generated by /sbin/dhclient-sc"..., 4096) = 105
read(5, "", 4096)                       = 0
close(5)                                = 0
munmap(0x40016000, 4096)                = 0
rt_sigaction(SIGHUP, {0x402111d0, ~[RTMIN], SA_RESTORER, 0x402479c8}, NULL, 8) = 0
rt_sigsuspend([] <unfinished ...>

So it hangs in rt_sigsuspend().  When I first googled for this, I came across
 http://www.ussg.iu.edu/hypermail/linux/kernel/0308.0/0742.html
This was what made me downgrade glibc and try again.  Using the stock RH9,
everything works fine.  It seems to me that this is an old bug in the threading
code that has been reintroduced, but I'll let you guys draw the conclusions.

It also struck me as a bit odd that no one else has noticed this, since a
commonly used app like mozilla won't run.  Maybe it's something that's specific
to my configuration?

Comment 1 Tom Lofgren 2003-09-22 18:05:23 UTC
I forgot to mention that, just like in the issue described on the linux kernel
list (linked above), if I hit Ctrl-Z and then either fg or bg the job, it
temporarily unhangs.  dig, for example, then actually returns the IP addresses
and then hangs again (where one would expect it to exit).  Mozilla loads the
page, and then hangs.  Subsequent attempts at doing the same thing seem
ineffective.
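
The Ctrl-Z followed by fg/bg sequence described in this comment can also be driven from a script, which may help when reproducing the behaviour without an interactive terminal. This is only a minimal sketch: it assumes that Ctrl-Z corresponds to SIGTSTP and that fg/bg correspond to SIGCONT (the usual job-control signals on Linux), and `nudge` with its parameters is a hypothetical helper, not part of any tool mentioned in this report.

```python
import os
import signal
import time


def nudge(pid, pause_s=0.2, stop_sig=signal.SIGTSTP):
    """Send stop_sig, wait briefly, then send SIGCONT to pid.

    Mimics pressing Ctrl-Z and then running fg/bg on the job.
    Returns False if the process no longer exists, True otherwise.
    """
    try:
        os.kill(pid, stop_sig)          # what Ctrl-Z would deliver
        time.sleep(pause_s)             # leave the process stopped for a moment
        os.kill(pid, signal.SIGCONT)    # what fg/bg would deliver
    except ProcessLookupError:
        return False
    return True
```

For example, `nudge(12345)` would stop and resume the process with PID 12345 (a placeholder). Whether that actually unblocks a process stuck in rt_sigsuspend() depends on the bug being reproduced; the sketch only automates the keystrokes.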


Comment 2 Ulrich Drepper 2004-09-28 06:12:23 UTC
There have been several bugs fixed in the thread library.  I cannot
say whether you'll find your problems are gone since I never had such
problems myself and nobody else reported anything like this.  And the
fact that your strace shows the use of SIGHUP makes it unlikely that
this is a libpthread problem: we do not use SIGHUP in libpthread.

Anyway, try the code in FC3t2 or maybe the last FC2 update.  If there
is still a problem, reopen the bug with some more details and
determine where the process got stuck.  As mentioned above, I do not
think this call got stuck in libpthread or anywhere in libc.
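
One possible way to collect the extra detail requested above is to run the failing command under a timeout and, if it is still alive when the timeout expires, record what each of its threads is blocked on. The following is a minimal sketch under stated assumptions: it relies on a Linux /proc layout with /proc/<pid>/task/<tid>/status and /proc/<pid>/task/<tid>/wchan (which differs from the thread model of the glibc versions discussed here, and wchan may read as 0 without sufficient privileges), and `probe_hang` and `thread_states` are hypothetical helper names.

```python
import os
import subprocess


def thread_states(pid):
    """Return [(tid, state, wchan)] for each thread of pid, read from /proc."""
    rows = []
    task_dir = f"/proc/{pid}/task"
    try:
        tids = sorted(os.listdir(task_dir), key=int)
    except OSError:
        return rows  # no /proc entry: not Linux, or the process is gone
    for tid in tids:
        state, wchan = "?", "?"
        try:
            with open(f"{task_dir}/{tid}/status") as f:
                for line in f:
                    if line.startswith("State:"):
                        state = line.split(":", 1)[1].strip()
                        break
            with open(f"{task_dir}/{tid}/wchan") as f:
                wchan = f.read().strip() or "?"
        except OSError:
            pass  # thread exited or file unreadable; keep the placeholders
        rows.append((int(tid), state, wchan))
    return rows


def probe_hang(cmd, timeout_s=10.0):
    """Run cmd; return (status, returncode, thread snapshot).

    status is "exited" if cmd finished within timeout_s, otherwise "hung",
    in which case the snapshot is taken before the process is killed.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    try:
        return ("exited", proc.wait(timeout=timeout_s), [])
    except subprocess.TimeoutExpired:
        snapshot = thread_states(proc.pid)
        proc.kill()
        proc.wait()
        return ("hung", None, snapshot)
```

Calling `probe_hang(["dig", "www.cnn.com"])` would apply this to the reproduction step from the description. A kernel wait channel only narrows things down; a debugger backtrace of the stuck process would still be needed to tell which library the blocking call came from.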