Bug 2147595 - crash in hostname resolution by NIS when address sanitizer is in use
Summary: crash in hostname resolution by NIS when address sanitizer is in use
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: libnsl2
Version: 37
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Ondřej Sloup
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-11-24 11:00 UTC by Jochen
Modified: 2023-06-27 12:10 UTC
CC List: 4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:


Attachments
small test program calling gethostbyname (3.62 KB, text/x-csrc)
2022-11-24 11:00 UTC, Jochen
small test program calling getpwuid_r (3.97 KB, text/x-csrc)
2022-11-25 10:38 UTC, Jochen

Description Jochen 2022-11-24 11:00:05 UTC
Created attachment 1926963
small test program calling gethostbyname

Description of problem:
My minimal test program crashes inside `gethostbyname` with a DEADLYSIGNAL (SIGSEGV) when built with clang's AddressSanitizer (`-fsanitize=address`), but _only_ if the hostname has to be resolved through libnsl2 (the machine is configured as a NIS (YP) client).
I'm not 100% sure whether this is really caused by libnsl2 or whether the issue lies in clang or the underlying ASan libraries.
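
The attached libnsl2crash.c is not reproduced in this report. As a rough idea of what such a test does, here is a minimal sketch. The helper names `format_ipv4` and `resolve_and_print` are made up for illustration and are not taken from the attachment; only the `gethostbyname` call and the output layout follow the report.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: format one raw IPv4 address (as stored in
 * h_addr_list) into `out`, which should hold INET_ADDRSTRLEN bytes.
 * Returns 0 on success, -1 on failure. */
int format_ipv4(const void *addr, char *out, size_t outlen)
{
    return inet_ntop(AF_INET, addr, out, (socklen_t)outlen) != NULL ? 0 : -1;
}

/* Hypothetical helper: resolve `name` with gethostbyname() and print the
 * result in the layout shown under "Expected results".
 * Returns the number of addresses printed, or -1 if resolution failed. */
int resolve_and_print(const char *name)
{
    printf("gethostbyname(\"%s\") ... ", name);
    fflush(stdout); /* keep the prefix visible even if the call crashes */

    struct hostent *he = gethostbyname(name);
    if (he == NULL) {
        printf("failed (h_errno=%d)\n", h_errno);
        return -1;
    }

    printf("official name=%s\n", he->h_name);
    int n = 0;
    if (he->h_addrtype == AF_INET) {
        for (char **p = he->h_addr_list; *p != NULL; ++p, ++n) {
            char buf[INET_ADDRSTRLEN];
            if (format_ipv4(*p, buf, sizeof buf) == 0)
                printf("  h_addr_list[%d]=%s\n", n, buf);
        }
    }
    puts("done");
    return n;
}
```

Calling `resolve_and_print(argv[1])` from `main` and building with `-fsanitize=address` should exercise the same lookup path, assuming NIS is listed for the hosts database.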

Version-Release number of selected component (if applicable):
2.0.0-4

How reproducible:
# /usr/bin/clang -g -fsanitize=address -fno-omit-frame-pointer libnsl2crash.c
# ./a.out example.com
gethostbyname("example.com")...AddressSanitizer:DEADLYSIGNAL
=================================================================
==248166==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000000000 bp 0x7ffc28ba64c0 sp 0x7ffc28ba5c78 T0)
==248166==Hint: pc points to the zero page.
==248166==The signal is caused by a READ memory access.
==248166==Hint: address points to the zero page.
    #0 0x0  (<unknown module>)
    #1 0x7f5d20e05e42  (/lib64/libnsl.so.3+0x3e42) (BuildId: 9486128142acf0b2aab30643ec361f2d7836d19c)
    #2 0x7f5d20e0624d  (/lib64/libnsl.so.3+0x424d) (BuildId: 9486128142acf0b2aab30643ec361f2d7836d19c)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (<unknown module>) 
==248166==ABORTING


Steps to Reproduce:
1. compile the example code with clang as outlined above
2. configure the machine as NIS client
3. invoke the compiled program with a non-local hostname to resolve
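
Since the crash only shows up when NIS is actually consulted, it can help to confirm that the relevant database in /etc/nsswitch.conf lists the `nis` service before running the test. The following is a small sketch of such a check. The function names are hypothetical, and the parsing is deliberately simplistic (it assumes one database per line and lines shorter than 512 bytes).

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if `line` is the nsswitch.conf entry for database `db`
 * (e.g. "hosts" or "passwd") and lists the "nis" service, else 0. */
int line_uses_nis(const char *line, const char *db)
{
    size_t dblen = strlen(db);

    while (*line == ' ' || *line == '\t')
        line++;
    if (strncmp(line, db, dblen) != 0 || line[dblen] != ':')
        return 0;

    char copy[512];
    snprintf(copy, sizeof copy, "%s", line + dblen + 1);
    copy[strcspn(copy, "#\n")] = '\0'; /* drop trailing comment and newline */

    /* strtok is not thread-safe; acceptable for a single-threaded check. */
    for (char *tok = strtok(copy, " \t"); tok != NULL; tok = strtok(NULL, " \t"))
        if (strcmp(tok, "nis") == 0)
            return 1;
    return 0;
}

/* Scan the file at `path` for the entry of `db`.
 * Returns 1 if it lists "nis", 0 if not, -1 if the file cannot be opened. */
int database_uses_nis(const char *path, const char *db)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    char line[512];
    int found = 0;
    while (!found && fgets(line, sizeof line, f) != NULL)
        found = line_uses_nis(line, db);
    fclose(f);
    return found;
}
```

For example, `database_uses_nis("/etc/nsswitch.conf", "hosts")` returning 1 would suggest that `gethostbyname` can end up in libnss_nis and libnsl on that machine.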

Actual results:
gethostbyname("example.com")...AddressSanitizer:DEADLYSIGNAL
...crash (SEGV)


Expected results:
gethostbyname("example.com") ... official name=example.com
  h_addr_list[0]=93.184.216.34
done


Additional info:
# gdb -ex r --args ./a.out example.com
gethostbyname("example.com")...
Program received signal SIGSEGV, Segmentation fault.
0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x0000000000457e6d in __interceptor_xdrstdio_create.part.0 ()
#2  0x00007ffff75dae43 in __yp_bind.part.0 () from /lib64/libnsl.so.3
#3  0x00007ffff75db24e in do_ypcall () from /lib64/libnsl.so.3
#4  0x00007ffff75dbae7 in yp_match () from /lib64/libnsl.so.3
#5  0x00007ffff79091a0 in internal_gethostbyname2_r () from /lib64/libnss_nis.so.2
#6  0x00007ffff790b360 in _nss_nis_gethostbyname_r () from /lib64/libnss_nis.so.2
#7  0x00007ffff7dceada in gethostbyname_r@@GLIBC_2.2.5 () from /lib64/libc.so.6
#8  0x00007ffff7dce1e9 in gethostbyname () from /lib64/libc.so.6
#9  0x0000000000463683 in gethostbyname ()
#10 0x00000000005158f1 in main (argc=2, argv=0x7fffffffd948) at libnsl2crash.c:88

# cat /etc/fedora-release
Fedora release 37 (Thirty Seven)

# uname -a
Linux fedora 6.0.9-300.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Nov 16 17:36:22 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

# clang --version
clang version 15.0.4 (Fedora 15.0.4-1.fc37)
Target: x86_64-redhat-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

Comment 1 Jochen 2022-11-25 10:37:09 UTC
The issue is even a bit worse: the crash is triggered not only indirectly via `gethostbyname`, but also via `getpwuid_r` and similar functions. But again, it only crashes if NIS is consulted.

I've created another example program to demonstrate that: libnsl2crash2.c
After installing the debug symbols for libnsl2-2.0.0-4.fc37, I get a slightly better stack trace:

#0  0x0000000000000000 in ?? ()
#1  0x0000000000457e3d in __interceptor_xdrstdio_create.part.0 ()
#2  0x00007ffff75d9e43 in yp_bind_file (ysd=0x612000000340, domain=0x7ffff75de020 <ypdomainname> "XXXXXXXXXXXXXXXXXXX")
    at /usr/src/debug/libnsl2-2.0.0-4.fc37.x86_64/src/do_ypcall.c:109
#3  __yp_bind (domain=domain@entry=0x7ffff75de020 <ypdomainname> "XXXXXXXXXXXXXXXXXXX", ypdb=ypdb@entry=0x7fffffffccd0)
    at /usr/src/debug/libnsl2-2.0.0-4.fc37.x86_64/src/do_ypcall.c:275
#4  0x00007ffff75da24e in __yp_bind (ypdb=0x7fffffffccd0, domain=0x7ffff75de020 <ypdomainname> "XXXXXXXXXXXXXXXXXXX")
    at /usr/src/debug/libnsl2-2.0.0-4.fc37.x86_64/src/do_ypcall.c:254
#5  do_ypcall (domain=0x7ffff75de020 <ypdomainname> "XXXXXXXXXXXXXXXXXXX", prog=prog@entry=3, xargs=xargs@entry=0x7ffff75d8ce0 <xdr_ypreq_key>, 
    req=req@entry=0x7fffffffcd40 " \340]\367\377\177", xres=xres@entry=0x7ffff75d8d50 <xdr_ypresp_val>, resp=resp@entry=0x7fffffffcd20 "")
    at /usr/src/debug/libnsl2-2.0.0-4.fc37.x86_64/src/do_ypcall.c:442
#6  0x00007ffff75daae7 in do_ypcall_tr (resp=0x7fffffffcd20, xres=0x7ffff75d8d50 <xdr_ypresp_val>, req=0x7fffffffcd40 " \340]\367\377\177", 
    xargs=0x7ffff75d8ce0 <xdr_ypreq_key>, prog=3, domain=<optimized out>) at /usr/src/debug/libnsl2-2.0.0-4.fc37.x86_64/src/do_ypcall.c:475
#7  yp_match (indomain=<optimized out>, inmap=<optimized out>, inkey=<optimized out>, inkeylen=<optimized out>, outval=0x7fffffffcdd0, outvallen=0x7fffffffcdc4)
    at /usr/src/debug/libnsl2-2.0.0-4.fc37.x86_64/src/yp_match.c:48
#8  0x00007ffff790ae4b in _nss_nis_getpwuid_r () from /lib64/libnss_nis.so.2
#9  0x00007ffff7d85061 in getpwuid_r@@GLIBC_2.2.5 () from /lib64/libc.so.6
#10 0x00000000004735ec in getpwuid_r ()
#11 0x0000000000515a3a in main (argc=2, argv=0x7fffffffd958) at libnsl2crash2.c:79
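
The attached libnsl2crash2.c is likewise not shown here. A call that reaches `_nss_nis_getpwuid_r` could look roughly like the sketch below. The wrapper name `lookup_user` and the fixed 16 KiB scratch buffer are assumptions made for illustration, not details from the attachment.

```c
#define _POSIX_C_SOURCE 200809L

#include <pwd.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical wrapper around getpwuid_r().
 * Returns 0 and copies the login name into `name` if an entry exists,
 * 1 if no entry was found, and -1 if the lookup itself failed. */
int lookup_user(uid_t uid, char *name, size_t namelen)
{
    struct passwd pwd;
    struct passwd *result = NULL;
    char buf[16384]; /* assumed large enough; real code should retry on ERANGE */

    int rc = getpwuid_r(uid, &pwd, buf, sizeof buf, &result);
    if (rc != 0)
        return -1;      /* lookup error; rc holds an errno value */
    if (result == NULL)
        return 1;       /* no matching passwd entry in any source */

    snprintf(name, namelen, "%s", pwd.pw_name);
    return 0;
}
```

With a UID that exists only in NIS (such as the 1076 used later in this report), the lookup has to fall through to the NIS backend, which is where the crash is observed.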

Comment 2 Jochen 2022-11-25 10:38:33 UTC
Created attachment 1927381
small test program calling getpwuid_r

Comment 3 Alexander Bokovoy 2022-11-28 12:42:43 UTC
I think this needs an upstream report. Fedora is driving NIS(+) removal in Fedora 38, but this does not look specific to Fedora.

A few more questions before that.

1. Is this reproducible with clang only, or does a gcc-built example fail as well?

2. The crash happens in the xdrstdio_create() implementation. Fedora uses tirpc, and libnsl2 links against tirpc. Maybe it is actually a bug in tirpc?

The code in question in libnsl2 is this:

  FILE *in = fopen (path, "rce");
  if (in != NULL)
    {
....
      XDR xdrs;
      xdrstdio_create (&xdrs, in, XDR_DECODE);
....

I.e., it passes a FILE object to initialize an XDR stream, and the code crashes there, in TIRPC code.
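
One hypothesis that would fit the backtraces in this report (the null program counter sits directly above `__interceptor_xdrstdio_create`) is that the sanitizer runtime could not find the real `xdrstdio_create` when it set up its interceptor, and then called through a null pointer. This has not been confirmed here. As a hedged diagnostic sketch, the function below checks whether a symbol is visible in the global scope of the running process; the function name is hypothetical, and `dlopen(NULL, ...)` is used only as an approximation of the lookup the sanitizer performs.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Return 1 if `name` resolves in the global symbol scope of the running
 * process (main program, its load-time dependencies, and libraries opened
 * with RTLD_GLOBAL), 0 if it does not, and -1 if no handle is available. */
int symbol_in_global_scope(const char *name)
{
    void *self = dlopen(NULL, RTLD_LAZY); /* handle for the main program */
    if (self == NULL)
        return -1;

    int found = dlsym(self, name) != NULL;
    dlclose(self);
    return found;
}
```

If `symbol_in_global_scope("xdrstdio_create")` returned 0 in a test program that does not link libtirpc itself, that would be consistent with the hypothesis, since libtirpc would then only be pulled in later through the NSS module. On older glibc versions the program may need to be linked with `-ldl`.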

Comment 4 Jochen 2022-11-28 13:06:25 UTC
Thanks, Alexander, for looking into this!

(In reply to Alexander Bokovoy from comment #3)
> I think this needs an upstream report. Fedora is driving NIS(+) removal in
> Fedora 38, but this does not look specific to Fedora.
> 
> A few more questions before that.
> 
> 1. Is this reproducible with clang only, or does a gcc-built example fail as well?

It also crashes with gcc (I had to install the libasan package first to try it out):

# sudo dnf info libasan
Installed Packages
Name         : libasan
Version      : 12.2.1
Release      : 4.fc37
Architecture : x86_64
Size         : 1.3 M
Source       : gcc-12.2.1-4.fc37.src.rpm
[..]

This is the ASan stack trace from running the gcc-built, sanitized example "libnsl2crash2.c" (the second example):

# gcc --version
gcc (GCC) 12.2.1 20221121 (Red Hat 12.2.1-4)
[..]

# gcc -g -fno-optimize-sibling-calls -fsanitize=address -fno-omit-frame-pointer libnsl2crash2.c 
# ./a.out 1076
getpwuid_r(1076) ... AddressSanitizer:DEADLYSIGNAL
=================================================================
==1413854==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x000000000000 bp 0x7ffd597ebe70 sp 0x7ffd597eb618 T0)
==1413854==Hint: pc points to the zero page.
==1413854==The signal is caused by a READ memory access.
==1413854==Hint: address points to the zero page.
    #0 0x0  (<unknown module>)
    #1 0x7fc2eb8cfe42 in yp_bind_file /usr/src/debug/libnsl2-2.0.0-4.fc37.x86_64/src/do_ypcall.c:109
    #2 0x7fc2eb8cfe42 in __yp_bind /usr/src/debug/libnsl2-2.0.0-4.fc37.x86_64/src/do_ypcall.c:275
    #3 0x7fc2eb8d024d in __yp_bind /usr/src/debug/libnsl2-2.0.0-4.fc37.x86_64/src/do_ypcall.c:254
    #4 0x7fc2eb8d024d in do_ypcall /usr/src/debug/libnsl2-2.0.0-4.fc37.x86_64/src/do_ypcall.c:442

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (<unknown module>) 
==1413854==ABORTING

It seems like the stack trace is identical. The last frame is suspicious, though. Perhaps the stack is smashed/garbled.


> 2. The crash happens in the xdrstdio_create() implementation. Fedora uses tirpc,
> and libnsl2 links against tirpc. Maybe it is actually a bug in tirpc?

That could very well be, because the last stack frame is unknown.
The libtirpc version in use is:
# sudo dnf info libtirpc
Installed Packages
Name         : libtirpc
Version      : 1.3.3
Release      : 0.fc37
Architecture : i686
Size         : 218 k
Source       : libtirpc-1.3.3-0.fc37.src.rpm
[...]


It would be great if someone other than me could reproduce any of these issues.

Comment 5 Fedora Admin user for bugzilla script actions 2023-06-27 12:10:30 UTC
This package has changed maintainer in Fedora. Reassigning to the new maintainer of this component.

