Bug 533367

Summary: [RHEL5] Segfault after DNS name resolution
Product: Red Hat Enterprise Linux 5
Reporter: Tomas Smetana <tsmetana>
Component: glibc
Assignee: Andreas Schwab <schwab>
Status: CLOSED ERRATA
QA Contact: qe-baseos-tools-bugs
Severity: high
Docs Contact:
Priority: high
Version: 5.4
CC: drepper, ebachalo, fweimer, jbardin, mlichvar, pmuller, rvokal, spoyarek
Target Milestone: rc
Target Release: ---
Hardware: i386
OS: Linux
Whiteboard:
Fixed In Version: glibc-2.5-54
Doc Type: Bug Fix
Doc Text:
Prior to this update, a DNS resolver could fail to report an appropriate error when the supplied buffer was too small. This resulted in a truncated response instead of asking the caller to resize the buffer and try again. With this update, small buffers are handled correctly and the DNS resolver no longer fails.
Last Closed: 2011-01-14 00:03:24 UTC
Attachments:
Description          Flags
Proposed patch       none
simple reproducer    none

Description Tomas Smetana 2009-11-06 11:53:10 UTC
Description of problem:
Sendmail uses the gethostbyname(3) function call to resolve host names and tries to get an IPv6 address first.  A hostname may be valid yet have no address of the requested type.  Sendmail does not account for this situation and segfaults in the makeconnection() function.

The backtrace of one of the crashes looks like this:
(gdb) bt
#0  0x00cb7f7d in strchr () from /lib/i686/nosegneg/libc.so.6
#1  0x003c02fe in makeconnection (host=0x46ca60 "123456789.com.", port=0, mci=0x8114c7c, e=0x47a5a0, enough=0) at daemon.c:2383
#2  0x003c93ba in deliver (e=0x47a5a0, firstto=0x810fe7c) at deliver.c:2136
#3  0x003cd019 in sendenvelope (e=0x47a5a0, mode=105) at deliver.c:919
#4  0x003cdd74 in sendall (e=0x47a5a0, mode=105) at deliver.c:765
#5  0x004039ab in dowork (qgrp=0, qdir=0, id=0x810e91a "nA2K71Gn032210", forkflag=0, requeueflag=0, e=0x47a5a0) at queue.c:3662
#6  0x00403f6f in runner_work (e=0x47a5a0, sequenceno=1, didfork=0, skip=1, njobs=1) at queue.c:1813
#7  0x004049f1 in run_work_group (wgrp=0, flags=17) at queue.c:2268
#8  0x0040505f in runqueue (forkflag=1, verbose=0, persistent=0, runall=1) at queue.c:1533
#9  0x003af369 in main (argc=3, argv=0xbfee6830, envp=0xbfee6834) at main.c:2363

The error comes from line 2383 in daemon.c:
2381 #if NETINET6
2382         case AF_INET6:
2383             memmove(&addr.sin6.sin6_addr,
2384                 hp->h_addr,
2385                 IN6ADDRSZ);
2386             break;

I think the problem is that the address list for IPv6 is empty:
(gdb) print hp->h_addr_list[0]
$76 = 0x0
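
The guard that the memmove() above lacks can be sketched as follows. This is only an illustration, not sendmail code: copy_first_in6 is a hypothetical helper name, and it assumes the caller wants the first IPv6 address or a clean failure.

```c
#include <netdb.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of sendmail): copy the first IPv6 address
 * out of a hostent, refusing to dereference an empty address list.
 * Returns 0 on success, -1 if hp carries no usable IPv6 address. */
int copy_first_in6(const struct hostent *hp, struct in6_addr *out)
{
    /* Non-NULL hostent with an empty list: the state printed by gdb above. */
    if (hp == NULL || hp->h_addr_list == NULL || hp->h_addr_list[0] == NULL)
        return -1;
    /* Wrong family or unexpected address size. */
    if (hp->h_addrtype != AF_INET6 || hp->h_length != (int)sizeof(*out))
        return -1;
    memcpy(out, hp->h_addr_list[0], sizeof(*out));
    return 0;
}
```

A caller would treat -1 like a failed lookup instead of passing hp->h_addr to memmove().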

Version-Release number of selected component (if applicable):
sendmail-8.13.8-2.el5

How reproducible:
Always

Steps to Reproduce:
1. telnet localhost 25
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
220 localhost.localdomain ESMTP Sendmail 8.13.8/8.13.8; Fri, 6 Nov 2009 12:14:58 +0100
mail from: root@localhost
250 2.1.0 root@localhost... Sender ok
rcpt to: hope_this_doesnt_exist
250 2.1.5 hope_this_doesnt_exist... Recipient ok
data
354 Enter mail, end with "." on a line by itself
test
.
250 2.0.0 nA6BEweJ002418 Message accepted for delivery
quit

2. sendmail -v -q

Actual results:
Running /var/spool/mqueue/nA6BEweJ002418  (sequence 1 of 3)
hope_this_doesnt_exist... Connecting to goodtimesdot.com. via esmtp...
Segmentation fault

Expected results:
Mail gets sent, or any resulting error is handled cleanly.

Additional info:
I used the goodtimesdot.com domain for the tests.  This domain has a long DNS record that has to be re-sent over TCP.  I am not sure whether this is a sufficient condition for the error to occur.

Comment 1 Tomas Smetana 2009-11-06 11:56:49 UTC
Created attachment 367818 [details]
Proposed patch

This appears to fix the reproducer in my testing environment.  Basically it just tries to detect the empty IPv6 address list and re-runs the query for IPv4 if needed.  The patch is not very pretty but shows where the problem is.
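
The idea described above (detect the empty IPv6 list, then retry for IPv4) might look roughly like this. It is a sketch and not the attached patch: has_address and lookup_v6_then_v4 are hypothetical names, and gethostbyname2() is a GNU/BSD extension rather than a POSIX call.

```c
#include <netdb.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical predicate: true only if the hostent carries an address. */
int has_address(const struct hostent *hp)
{
    return hp != NULL && hp->h_addr_list != NULL && hp->h_addr_list[0] != NULL;
}

/* Hypothetical wrapper: try IPv6 first, fall back to IPv4 when the IPv6
 * answer is missing or empty. Returns NULL if neither family yields an
 * address. The result points to static storage, as with gethostbyname(). */
struct hostent *lookup_v6_then_v4(const char *name)
{
    struct hostent *hp = gethostbyname2(name, AF_INET6);
    if (has_address(hp))
        return hp;
    hp = gethostbyname2(name, AF_INET);
    return has_address(hp) ? hp : NULL;
}
```

With such a wrapper the caller never sees a non-NULL hostent whose address list is empty.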

Comment 2 Tomas Smetana 2009-11-06 13:39:51 UTC
Just a note.  This is probably more a problem of glibc -- the gethostbyname(3) function is deprecated and does not work well with IPv6.  The mere fact that the (hp && !hp->h_addr) condition can be true is not quite OK.  I think the correct way of solving the issue would be to replace the gethostby* functions with getaddrinfo in sendmail.
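
For reference, a getaddrinfo(3) based lookup could be sketched as below. This is an assumption about how a replacement might look, not a proposed sendmail change; resolve_first is a hypothetical name, and taking only the first result is a simplification (a real client would try each entry in turn).

```c
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolve host/service and copy the first socket
 * address into *out. Returns 0 on success or a getaddrinfo() error code
 * (printable with gai_strerror()). */
int resolve_first(const char *host, const char *service,
                  struct sockaddr_storage *out, socklen_t *outlen)
{
    struct addrinfo hints, *res = NULL;
    int rc;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* accept IPv6 or IPv4 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP, as used for SMTP */

    rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0)
        return rc;                    /* failure is reported explicitly */
    if (res == NULL)
        return EAI_NONAME;            /* defensive: success with no entries */

    memcpy(out, res->ai_addr, res->ai_addrlen);
    *outlen = res->ai_addrlen;
    freeaddrinfo(res);
    return 0;
}
```

Because each result carries its own family and length, the caller does not need to guess the address type the way the AF_INET6 branch in daemon.c does.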

Comment 4 Tomas Smetana 2009-11-09 10:57:36 UTC
Sorry, I forgot to mention that the problem is reproducible only on 32-bit x86.

Comment 5 Miroslav Lichvar 2009-11-09 12:15:21 UTC
OK, this really is better fixed in glibc. The gethostbyname call should return NULL if there is no address for the name, instead of an empty list.

It seems to happen only when the DNS response doesn't fit in a UDP packet.

Comment 6 Andreas Schwab 2009-11-09 14:48:14 UTC
I cannot reproduce that.

Comment 7 Miroslav Lichvar 2009-11-09 15:29:18 UTC
Created attachment 368242 [details]
simple reproducer

Comment 8 Tomas Smetana 2009-11-09 15:36:45 UTC
Hi Andreas, as for the original problem, I can't add anything beyond what is in comment #0 (it really needs to be reproduced on i386).  Mirek told me he had a simpler reproducer, so I asked him to post it here (comment #7).

Regards.

Comment 9 Siddhesh Poyarekar 2009-11-10 13:34:16 UTC
The problem happens only when all of the following are true:

0) The arch is i386
1) The response is larger than what would fit in a single DNS packet
2) The request is IPv6
3) The DNS query is done over TCP (either with a retry or due to RES_USEVC)

So if one sets RES_IGNTC in _res.options, this works fine. The result comes out as an IPv4 address formatted as IPv6. I'm not sure that is correct either, since it is actually returning an IPv4 address. So maybe h_addrtype should be updated?
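
Setting that option could look like the sketch below. It only illustrates the observation above and is not a recommended fix: ignore_truncation is a hypothetical name, and with RES_IGNTC the resolver accepts truncated UDP answers instead of retrying over TCP, so results may be incomplete.

```c
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>

/* Hypothetical helper: initialise the resolver state and tell it to ignore
 * the truncation bit, so no TCP retry is made.
 * Returns 0 on success, -1 if res_init() fails. */
int ignore_truncation(void)
{
    if (res_init() != 0)
        return -1;
    _res.options |= RES_IGNTC;
    return 0;
}
```

It would be called once before the gethostbyname() lookups that should skip the TCP retry.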

Comment 10 Ulrich Drepper 2009-11-10 13:40:44 UTC
What do you expect from the test case?  It fails with return value 1 as I think it should.  This is with x86-64 and x86 on F12.

Comment 11 Siddhesh Poyarekar 2009-11-10 17:56:10 UTC
The problem is on RHEL-5.4. The difference is in the value returned in hostent under a number of conditions.

1) On x86_64 it returns an IPv4 address list inside the hostent object
2) On x86 it returns an empty address list inside the hostent object
3) On x86 with RES_IGNTC it returns the IPv4 list inside the hostent object

In all the above cases h_errno is set to 1, so this could be worked around by checking h_errno regardless of the value of hostent. But that breaks applications that assume a non-NULL hostent means a successful name lookup. The man page sort of leads one to think that way, so I assume there are a number of such applications out there.

gethostbyname seems to behave consistently on my F-11 box for x86 as well as x86_64: it fails with h_errno=2 *and* a NULL hostent. But I had gotten the test case to crash on another F-11 x86 box, so something must have been fixed in F-11.

Comment 12 Siddhesh Poyarekar 2009-11-12 06:12:22 UTC
Correction to the F-11 observation I made in comment 11. It does segfault on x86 and gives an incorrect result on x86_64. In both cases h_errno is set to 1 despite the contents of the returned hostent object.

My previous observations were probably a result of some DNS server problems since I was testing remotely.

Comment 14 Andreas Schwab 2009-12-14 17:18:15 UTC
*** Bug 545160 has been marked as a duplicate of this bug. ***

Comment 18 Martin Prpič 2010-12-02 11:17:45 UTC
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
Prior to this update, a DNS resolver could fail to report an appropriate error when the supplied buffer was too small. This resulted in a truncated response instead of asking the caller to resize the buffer and try again. With this update, small buffers are handled correctly and the DNS resolver no longer fails.

Comment 20 errata-xmlrpc 2011-01-14 00:03:24 UTC
An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0109.html