Bug 104733 - Most programs hang on DNS lookups
Summary: Most programs hang on DNS lookups
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Raw Hide
Classification: Retired
Component: glibc
Version: 1.0
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2003-09-19 21:32 UTC by Tom Lofgren
Modified: 2016-11-24 15:05 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-09-28 06:12:23 UTC
Embargoed:



Description Tom Lofgren 2003-09-19 21:32:53 UTC
Description of problem:
Lots of applications hang on DNS lookups.  Most applications that do DNS queries
are affected, such as dig, nslookup, mozilla, etc.  Interestingly enough, the
telnet and ssh command-line clients work fine.

Version-Release number of selected component (if applicable):
glibc-2.3.2-82.i686.rpm, but I believe I saw it with slightly older rawhide
versions also.  The one in stock RH 9, glibc-2.3.2-11.9.i686.rpm, is good.

How reproducible:
I can reproduce it every time on my machine.  In fact, I can't get the programs
to *not* hang.

Steps to Reproduce:
1. dig www.cnn.com
    
Actual results:
No output: process is hung.

Expected results:
Regular dig output of IP addresses.

Additional info:
Here is the last part of an strace of the dig:

open("/etc/resolv.conf", O_RDONLY)      = 5
fstat64(5, {st_mode=S_IFREG|0644, st_size=105, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40016000
read(5, "; generated by /sbin/dhclient-sc"..., 4096) = 105
read(5, "", 4096)                       = 0
close(5)                                = 0
munmap(0x40016000, 4096)                = 0
rt_sigaction(SIGHUP, {0x402111d0, ~[RTMIN], SA_RESTORER, 0x402479c8}, NULL, 8) = 0
rt_sigsuspend([] <unfinished ...>

So it hangs in rt_sigsuspend().  When I first googled for this, I came across
 http://www.ussg.iu.edu/hypermail/linux/kernel/0308.0/0742.html
This was what made me downgrade glibc and try again.  Using the stock RH9,
everything works fine.  It seems to me that this is an old bug in the threading
code that has been reintroduced, but I'll let you guys draw the conclusions.

It also struck me as a bit odd that no one else has noticed this, since a
commonly used app like mozilla won't run.  Maybe it's something that's specific
to my configuration?

Comment 1 Tom Lofgren 2003-09-22 18:05:23 UTC
I forgot to mention that just like in the case of the issue mentioned on the
linux kernel list (linked), if I hit Ctrl-Z and either fg or bg the job after
that, it temporarily unhangs.  dig, for example, then actually returns the IP
addresses, and then hangs again (where one would expect it to exit).  Mozilla
loads the page, and then hangs.  Subsequent attempts at doing the same thing
seem ineffective.


Comment 2 Ulrich Drepper 2004-09-28 06:12:23 UTC
There have been several bugs fixed in the thread library.  I cannot
say whether you'll find your problems are gone since I never had such
problems myself and nobody else reported anything like this.  And the
fact that your strace shows the use of SIGHUP makes it unlikely that
this is a libpthread problem: we do not use SIGHUP in libpthread.

Anyway, try the code in FC3t2 or maybe the last FC2 update.  If there
is still a problem, reopen the bug with some more details and
determine where the process got stuck.  As mentioned above, I do not
think this call got stuck in libpthread or anywhere in libc.
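
One possible way to narrow down where the hang lives is sketched below.
dig and nslookup ship with BIND and, as far as is generally known, carry
their own resolver code, whereas telnet and ssh go through the C library.
The sketch runs a lookup through the C library's getaddrinfo() (which
Python's socket.getaddrinfo() wraps) in a child process, and kills the
child if it does not finish in time.  The helper name lookup_with_timeout
and the 5-second default are made up for illustration, and Python 3.7 or
later is assumed; none of this comes from the original report.

```python
import socket
import subprocess
import sys

# Source of the child process: resolve argv[1] via the C library's
# getaddrinfo() and print each distinct IPv4 address on its own line.
_CHILD = (
    "import socket, sys\n"
    "infos = socket.getaddrinfo(sys.argv[1], None, socket.AF_INET, socket.SOCK_STREAM)\n"
    "print('\\n'.join(sorted({info[4][0] for info in infos})))\n"
)


def lookup_with_timeout(host, timeout=5.0):
    """Resolve `host` in a child process (hypothetical helper).

    Returns ("ok", [addresses]) if the lookup finished,
    ("hang", []) if the child had to be killed after `timeout` seconds,
    ("error", []) if the child exited with a failure (e.g. unknown host).
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", _CHILD, host],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed the child at this point.
        return ("hang", [])
    if result.returncode != 0:
        return ("error", [])
    return ("ok", result.stdout.split())
```

Calling lookup_with_timeout("www.cnn.com") on the affected machine and
getting ("hang", []) would point at the C library path; getting "ok" while
dig still hangs would suggest the problem is specific to dig and similar
tools rather than to glibc's resolver.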

