Bug 172175
Summary: | segfault in getaddrinfo()
---|---
Product: | Red Hat Enterprise Linux 4
Component: | glibc
Version: | 4.0
Hardware: | i686
OS: | Linux
Status: | CLOSED NOTABUG
Severity: | high
Priority: | medium
Reporter: | Joseph Shraibman <jks>
Assignee: | Jakub Jelinek <jakub>
QA Contact: | Brian Brock <bbrock>
CC: | drepper
Doc Type: | Bug Fix
Last Closed: | 2005-11-03 00:44:20 UTC

Description
Joseph Shraibman
2005-11-01 03:13:59 UTC
How can I build a glibc rpm with debuginfo?

Just grab it from ftp://people.redhat.com/jakub/glibc/2.3.4-2.13/

OK thanks. That should really be easier to find. Anyway, now I have 4 different backtraces all showing the same thing.

(gdb) bt
#0 *__GI_getaddrinfo (name=0x7bcfffd8 "web64.miraclehosting.com", service=0x7d1fcc8a "domain", hints=0x7cb03628, pai=0x7cb03624) at ../sysdeps/posix/getaddrinfo.c:1593
#1 0x7d1f0f08 in Java_java_net_Inet6AddressImpl_lookupAllHostAddr () from /usr/local/jdk1.5.0_05/jre/lib/i386/libnet.so
#2 0xb22b6838 in ?? ()
#3 0x7bd221dc in ?? ()
#4 0x7cb036b4 in ?? ()
#5 0x7cb036b0 in ?? ()
#6 0x7cb03684 in ?? ()
#7 0x00000000 in ?? ()

(gdb) bt
#0 *__GI_getaddrinfo (name=0x74f80ca0 "web64.miraclehosting.com", service=0x7d3dac8a "domain", hints=0x7b0b5a28, pai=0x7b0b5a24) at ../sysdeps/posix/getaddrinfo.c:1593
#1 0x7d3cef08 in Java_java_net_Inet6AddressImpl_lookupAllHostAddr () from /usr/local/jdk1.5.0_05/jre/lib/i386/libnet.so
#2 0xb296557d in ?? ()
#3 0x7a914ed4 in ?? ()
#4 0x7b0b5a88 in ?? ()
#5 0x7b0b5a84 in ?? ()
#6 0xae314830 in ?? ()
#7 0x82249198 in ?? ()
#8 0x00000000 in ?? ()

(gdb) bt
#0 *__GI_getaddrinfo (name=0x81e0a50 "web64.miraclehosting.com", service=0x7d77ac8a "domain", hints=0x7c6646f4, pai=0x7c6646f0) at ../sysdeps/posix/getaddrinfo.c:1593
#1 0x7d76ef08 in Java_java_net_Inet6AddressImpl_lookupAllHostAddr () from /usr/local/jdk1.5.0_05/jre/lib/i386/libnet.so
#2 0xb22b6838 in ?? ()
#3 0x7b6ea55c in ?? ()
#4 0x7c664780 in ?? ()
#5 0x7c66477c in ?? ()
#6 0x7c664750 in ?? ()
#7 0x00000000 in ?? ()

(gdb) bt
#0 *__GI_getaddrinfo (name=0x87262d8 "web64.miraclehosting.com", service=0x7d108c8a "domain", hints=0x7a81a9ac, pai=0x7a81a9a8) at ../sysdeps/posix/getaddrinfo.c:1593
#1 0x7d0fcf08 in Java_java_net_Inet6AddressImpl_lookupAllHostAddr () from /usr/local/jdk1.5.0_05/jre/lib/i386/libnet.so
#2 0xb22b6838 in ?? ()
#3 0x7d211f34 in ?? ()
#4 0x7a81aa38 in ?? ()
#5 0x7a81aa34 in ?? ()
#6 0x7a81aa08 in ?? ()
#7 0x00000000 in ?? ()
(gdb)

I don't understand two things.
1) How can the memory of results not be accessible? It is declared right there on the stack.
2) Why does the segfault happen on line 1593 and not on the line above it?

results[i].dest_addr = q; <== this is fine
results[i].got_source_addr = false; <== this causes a segfault

I made a small test program to try to replicate the bug, but in my test program the call to getaddrinfo() returns EAI_SOCKTYPE.

(gdb) p *hints
$4 = {ai_flags = 2, ai_family = 0, ai_socktype = 0, ai_protocol = 0, ai_addrlen = 0, ai_addr = 0x0, ai_canonname = 0x0, ai_next = 0x0}
(gdb) p *pai
$5 = (struct addrinfo *) 0x14
(gdb) p **pai
$6 = {ai_flags = 0, ai_family = 0, ai_socktype = 0, ai_protocol = 0, ai_addrlen = 0, ai_addr = 0x0, ai_canonname = 0x0, ai_next = 0x0}
(gdb) p i
$12 = 0
(gdb) p results[i]
Cannot access memory at address 0x7a812600
(gdb) p results
Cannot access memory at address 0x7a812600
(gdb) p nresults
$13 = 246
(gdb) p results[i].dest_addr
Cannot access memory at address 0x7a812600
(gdb) p results
Cannot access memory at address 0x7a812600
(gdb) list
1593          results[i].got_source_addr = false;
1594
1595          /* If we just looked up the address for a different
1596             protocol, reuse the result. */
1597          if (last != NULL && last->ai_addrlen == q->ai_addrlen
1598              && memcmp (last->ai_addr, q->ai_addr, q->ai_addrlen) == 0)
1599            {
1600              memcpy (&results[i].source_addr, &results[i - 1].source_addr,
1601                      results[i - 1].source_addr_len);
1602              results[i].source_addr_len = results[i - 1].source_addr_len;
(gdb) list -
1583    {
1584      /* Sort results according to RFC 3484. */
1585      struct sort_result results[nresults];
1586      struct addrinfo *q;
1587      struct addrinfo *last = NULL;
1588      char *canonname = NULL;
1589
1590      for (i = 0, q = p; q != NULL; ++i, last = q, q = q->ai_next)
1591        {
1592          results[i].dest_addr = q;

Created attachment 120656
small c program that does not produce the same segfault
The output of this program is:
result of call is -7
EAI_SOCKTYPE
0: name: ���
Segmentation fault
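
The output above shows the call returning -7 (EAI_SOCKTYPE) and the program then crashing on the first print, which is what one would expect if the result pointer is read after a failed call. The following is a minimal sketch of a test that avoids that, not the attached program. The function name `lookup`, the choice of SOCK_STREAM, and the "(none)" placeholder are assumptions made for illustration.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolve host/service and print each canonical name.
   Returns the number of results, or -1 if getaddrinfo() failed. */
int lookup(const char *host, const char *service, int flags)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;

    /* Zero every member first, then set only the four input fields. */
    memset(&hints, 0, sizeof hints);
    hints.ai_flags = flags;
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;   /* assumption: a stream lookup */

    int rc = getaddrinfo(host, service, &hints, &res);
    if (rc != 0) {
        /* On failure the result pointer must not be read or freed. */
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return -1;
    }

    int n = 0;
    for (struct addrinfo *p = res; p != NULL; p = p->ai_next) {
        /* ai_canonname may be NULL, so never pass it to %s unchecked. */
        printf("%d: name: %s\n", n,
               p->ai_canonname != NULL ? p->ai_canonname : "(none)");
        n++;
    }

    freeaddrinfo(res);
    return n;
}
```

Calling `lookup("web64.miraclehosting.com", "domain", AI_CANONNAME)` would mirror the arguments seen in the backtraces. Whether that lookup succeeds depends on the resolver configuration of the machine running it.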
The testcase in #5 has many bugs. One is that when getaddrinfo fails, *res is undefined. Another is that, according to http://www.opengroup.org/onlinepubs/009695399/functions/freeaddrinfo.html: "In this hints structure every member other than ai_flags, ai_family, ai_socktype, and ai_protocol shall be set to zero or a null pointer." Plus those 4 fields of course need to be set to meaningful values.

As for the segfault in #4, my guess would be that the JDK calls getaddrinfo with a prohibitively small thread stack. The results array is a VLA, and sizeof (results[0]) == 136 bytes if I count correctly, so for a huge nresults (246 in this case) that is a ~32KB allocation on the stack. So, if the JDK limits the thread stack size to 32K or smaller and calls getaddrinfo, it would obviously crash. We could consider using __libc_use_alloca here, I guess (then on i?86 it would use at most max (4KB, thread_stack_size / 4) of stack for the array and otherwise fall back to malloc), but I'd still say that the JDK is severely broken if it calls glibc functions that need a lot of stack with such a limited stack size (examples would be *printf, *scanf, getaddrinfo and various others).

That may not be the JVM's fault. I specifically set the stack size to be small because I wanted to squeeze the largest possible number of threads into the JVM. I assumed the worst that would happen would be a java.lang.OutOfMemoryError, which I would catch and handle. Is there any way to tell from the core whether it ran out of stack? I'm not very adept at using gdb.

BTW, the reason I set the input the way I did in my testcase is that this is how it appeared to be set in the core, according to gdb.

OK, I can reliably recreate the problem using a test Java program and a small stack size. There still might be something to be done here about allocating that array off the stack, but I'm going to mark this as NOTABUG for now.

BTW, I can replicate the problem on Fedora Core 3 and Red Hat 9 too, but they fail in different places. Here is a backtrace from FC3:

(gdb) bt
#0 0x003f85f1 in phys_pages_info () from /lib/tls/libc.so.6
#1 0x003be735 in sysconf () from /lib/tls/libc.so.6
#2 0x0035bae1 in qsort () from /lib/tls/libc.so.6
#3 0x003e2468 in getaddrinfo () from /lib/tls/libc.so.6
#4 0xb229af08 in Java_java_net_Inet6AddressImpl_lookupAllHostAddr () from /mnt/space/fc3local/jdk1.5.0_04/jre/lib/i386/libnet.so
#5 0xb289242b in ?? ()
#6 0x0940f934 in ?? ()
#7 0xbf878ac8 in ?? ()
#8 0xbf878ac4 in ?? ()
#9 0xbf878a98 in ?? ()
#10 0x8cc861c0 in ?? ()
#11 0xbf878ac8 in ?? ()
#12 0x8cc86758 in ?? ()
#13 0x00000000 in ?? ()
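
The stack arithmetic from the discussion above (136 bytes per entry times 246 results) and the suggested stack-or-heap fallback can be sketched in plain C. This is only an illustration of the general pattern, not glibc's code: `__libc_use_alloca` is internal to glibc and is not called here, and `struct entry`, `fits_on_stack`, `process_entries` and the fixed 4096-byte budget are hypothetical stand-ins.

```c
#include <alloca.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: 136 bytes per entry, the figure quoted in the discussion. */
#define ENTRY_BYTES 136
/* Hypothetical fixed budget; glibc's real policy is not reproduced here. */
#define STACK_BUDGET_BYTES 4096

struct entry { unsigned char payload[ENTRY_BYTES]; };

/* Bytes needed for a scratch array of nresults entries. */
size_t scratch_bytes(size_t nresults)
{
    return nresults * sizeof(struct entry);
}

/* Stand-in for a "may I use the stack?" check. */
bool fits_on_stack(size_t bytes)
{
    return bytes <= STACK_BUDGET_BYTES;
}

/* Allocates and zero-fills a scratch array of nresults entries.
   Returns 1 if the heap was used, 0 if the stack was used,
   and -1 if the heap allocation failed. */
int process_entries(size_t nresults)
{
    size_t bytes = scratch_bytes(nresults);
    bool on_heap = !fits_on_stack(bytes);
    struct entry *scratch;

    if (on_heap) {
        scratch = malloc(bytes);
        if (scratch == NULL)
            return -1;
    } else {
        /* Small request: take it from the current stack frame. */
        scratch = alloca(bytes);
    }

    memset(scratch, 0, bytes);
    /* ... the per-entry work (for example, sorting) would go here ... */

    if (on_heap)
        free(scratch);
    return on_heap ? 1 : 0;
}
```

With these assumed numbers, `scratch_bytes(246)` is 33456 bytes, so `process_entries(246)` takes the heap path, while a request for 10 entries (1360 bytes) stays on the stack.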