Created attachment 1811292 [details] test program Description of problem: realpath(3) fails with EINVAL if the resolved_path argument is null and the program is compiled with -fsanitize=thread Version-Release number of selected component (if applicable): gcc-8.4.1-1.el8.x86_64 libtsan-8.4.1-1.el8.x86_64 glibc-2.28-151.el8.x86_64 How reproducible: Always Steps to Reproduce: 1. Compile the attached "test.cpp" program $ g++ -pthread -fsanitize=thread -o sanitize-ok test.cpp $ g++ -DFORCE_FAILURE -pthread -fsanitize=thread -o sanitize-fail test.cpp 2. Run both programs: $ ./sanitize-ok Main thread (before async and fork): /work/03001716 Main thread (child thread is running): /work/03001716 Child thread (before fork): /work/03001716 Child thread (after fork): /work/03001716 Forked subprocess: /work/03001716 Subprocess finished, exit status: 0 Main thread (after joining the child thread): /work/03001716 $ ./sanitize-fail Main thread (before async and fork): /work/03001716 Main thread (child thread is running): /work/03001716 Child thread (before fork): /work/03001716 Child thread (after fork): /work/03001716 Forked subprocess: <error Invalid argument> Subprocess finished, exit status: 0 Main thread (after joining the child thread): /work/03001716 Actual results: realpath fails with EINVAL if the resolved_path argument is null Expected results: realpath should succeed Additional info: -------------------------------------------------------------------------------- /* Return the canonical absolute name of file NAME. A canonical name does not contain any `.', `..' components nor any repeated path separators ('/') or symlinks. All path components must exist. If RESOLVED is null, the result is malloc'd; otherwise, if the canonical name is PATH_MAX chars or more, returns null with `errno' set to ENAMETOOLONG; if the name fits in fewer than PATH_MAX chars, returns the name in RESOLVED. If the name cannot be resolved and RESOLVED is non-NULL, it contains the path of the first component that cannot be resolved. If the path can be resolved, RESOLVED holds the same value as the value returned. */ char * __realpath (const char *name, char *resolved) { char *rpath, *dest, *extra_buf = NULL; const char *start, *end, *rpath_limit; long int path_max; int num_links = 0; if (name == NULL) { /* As per Single Unix Specification V2 we must return an error if either parameter is a null pointer. We extend this to allow the RESOLVED parameter to be NULL in case the we are expected to allocate the room for the return value. */ __set_errno (EINVAL); return NULL; } [...] } libc_hidden_def (__realpath) versioned_symbol (libc, __realpath, realpath, GLIBC_2_3); #if SHLIB_COMPAT(libc, GLIBC_2_0, GLIBC_2_3) char * attribute_compat_text_section __old_realpath (const char *name, char *resolved) { if (resolved == NULL) { __set_errno (EINVAL); return NULL; } return __realpath (name, resolved); } compat_symbol (libc, __old_realpath, realpath, GLIBC_2_0); #endif -------------------------------------------------------------------------------- So the actual function call depends on the version references: if GLIBC_2_3 is not referenced, __old_realpath is called, leading a null resolved argument. I confirmed that glibc is built with SHLIB_COMPAT(libc, GLIBC_2_0, GLIBC_2_3) true: $ objdump -x /usr/lib64/libc.so.6 | fgrep __old_realpath 0000000000138480 l .text 0000000000000000 .hidden .annobin___old_realpath.start 00000000001384a1 l .text 0000000000000000 .hidden .annobin___old_realpath.end 0000000000138480 l F .text 0000000000000021 __old_realpath Now let's compile the test code with several options: g++ -O3 -Wall -Wextra -pthread -o no-sanitize-ok test.cpp g++ -DFORCE_FAILURE -O3 -Wall -Wextra -pthread -o no-sanitize-fail test.cpp g++ -O3 -Wall -Wextra -pthread -fsanitize=thread -o sanitize-ok test.cpp g++ -DFORCE_FAILURE -O3 -Wall -Wextra -pthread -fsanitize=thread -o sanitize-fail test.cpp And compare the relevant ojdump outputs: $ objdump -x no-sanitize-fail > no-sanitize-fail.txt $ objdump -x sanitize-fail > sanitize-fail.txt $ sed -i -e 's/[0-9a-fA-F]{16}\>/00000000/g;s/[0-9a-fA-F]{6}\>/0000000000000000/g' no-sanitize-fail.txt sanitize-fail.txt $ diff -U 5 no-sanitize-fail.txt sanitize-fail.txt $ diff -U 5 no-sanitize-fail.txt sanitize-fail.txt| fgrep -B 5 -A 4 -w GLIBC_2.3 Version References: required from libgcc_s.so.1: 0x0000000000000000 GCC_3.0 required from libc.so.6: - 0x0000000000000000 GLIBC_2.14 - 0x0000000000000000 GLIBC_2.3 0x0000000000000000 GLIBC_2.2.5 required from libpthread.so.0: 0x0000000000000000 GLIBC_2.2.5 required from libstdc++.so.6: -- +0000000000405870 w F .text 0000000000000117 _ZNSt14__basic_futureIvEC1ERKSt10shared_ptrINSt13__future_base13_State_baseV2EE +0000000000000000 F *UND* 0000000000000000 __tsan_write4 0000000000000000 F *UND* 0000000000000000 _ZNSt15__exception_ptr13exception_ptr4swapERS0_@@CXXABI_1.3.3 -00000000006061f8 w .data 0000000000000000 data_start -0000000000403ac0 g O .rodata 0000000000000004 _IO_stdin_used -0000000000000000 F *UND* 0000000000000000 realpath@@GLIBC_2.3 +0000000000608288 w .data 0000000000000000 data_start +0000000000405a20 g O .rodata 0000000000000004 _IO_stdin_used 0000000000000000 F *UND* 0000000000000000 _ZNSt15__exception_ptr13exception_ptrC1ERKS0_@@CXXABI_1.3.3 +0000000000402320 F *UND* 0000000000000000 pthread_create $ readelf --dyn-syms -W no-sanitize-fail | grep realpath 8: 0000000000000000 0 FUNC GLOBAL DEFAULT UND realpath (9) $ readelf --dyn-syms -W sanitize-fail | grep realpath 3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND realpath So the no-TSAN code calls __realpath while the TSAN code calls __old_realpath. We found that the problem is caused by a mechanism in libtsan that attempts to circumvent a bug in old glibc versions. This is the code in sanitizer_common/sanitizer_common_interceptors.inc: -------------------------------------------------------------------------------- #if SANITIZER_INTERCEPT_REALPATH INTERCEPTOR(char *, realpath, const char *path, char *resolved_path) { void *ctx; COMMON_INTERCEPTOR_ENTER(ctx, realpath, path, resolved_path); if (path) COMMON_INTERCEPTOR_READ_RANGE(ctx, path, REAL(strlen)(path) + 1); // Workaround a bug in glibc where dlsym(RTLD_NEXT, ...) returns the oldest // version of a versioned symbol. For realpath(), this gives us something // (called __old_realpath) that does not handle NULL in the second argument. // Handle it as part of the interceptor. char *allocated_path = nullptr; if (!resolved_path) allocated_path = resolved_path = (char *)WRAP(malloc)(path_max + 1); char *res = REAL(realpath)(path, resolved_path); if (allocated_path && !res) WRAP(free)(allocated_path); if (res) COMMON_INTERCEPTOR_WRITE_RANGE(ctx, res, REAL(strlen)(res) + 1); return res; } #define INIT_REALPATH COMMON_INTERCEPT_FUNCTION(realpath); #else #define INIT_REALPATH #endif -------------------------------------------------------------------------------- Unfortunately this leads to always calling __old_realpath, as we can see while running the test program under gdb: (gdb) br main Breakpoint 1 at 0x402c20: file test.cpp, line 35. (gdb) run Starting program: /home/alanm/C03001716/TSAN_MT_fork_realpath/sanitize-fail [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". [New Thread 0x7ffff45ff700 (LWP 19006)] Thread 1 "sanitize-fail" hit Breakpoint 1, main () at test.cpp:35 35 int main() { (gdb) rbreak old_realpath Breakpoint 2 at 0x7ffff5f2a480: file canonicalize.c, line 229. char *__old_realpath(const char *, char *); Breakpoint 3 at 0x7ffff5f2a4a1 <function, no debug info> .annobin___old_realpath.end; (gdb) cont Continuing. Thread 1 "sanitize-fail" hit Breakpoint 2, __old_realpath (name=0x405a44 ".", resolved=0x7b8400000000 "") at canonicalize.c:229 229 { (gdb) bt #0 __old_realpath (name=0x405a44 ".", resolved=0x7b8400000000 "") at canonicalize.c:229 #1 0x00007ffff6d45228 in __interceptor_realpath (path=path@entry=0x405a44 ".", resolved_path=0x7b8400000000 "", resolved_path@entry=0x0) at ../../../../libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:3282 #2 0x00000000004043f6 in test_realpath (where="Main thread (before async and fork)") at test.cpp:16 #3 0x0000000000402c54 in main () at /usr/include/c++/8/ext/new_allocator.h:79 (gdb) f 1 #1 0x00007ffff6d45228 in __interceptor_realpath (path=path@entry=0x405a44 ".", resolved_path=0x7b8400000000 "", resolved_path@entry=0x0) at ../../../../libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:3282 3282 char *res = REAL(realpath)(path, resolved_path);
Yes, it always calls __old_realpath, but why should it matter when it does what the new realpath does itself (malloc the buffer when NULL). The bug is in the testcase, just read the man page of fork: A process shall be created with a single thread. If a multi-threaded process calls fork(), the new process shall contain a replica of the calling thread and its entire address space, possibly including the states of mutexes and other resources. Consequently, to avoid errors, the child process may only execute async-signal-safe operations until such time as one of the exec functions is called. Fork handlers may be established by means of the pthread_atfork() function in order to maintain application invariants across fork() calls. Neither realpath, nor std::out << methods, nor the locking is async-signal-safe. So, don't do that after fork from a multi-threaded program.
That said, in GCC 11 and later one can do (untested): --- libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc.jj 2021-07-21 09:38:12.309460350 +0200 +++ libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc 2021-08-10 13:59:04.953296453 +0200 @@ -3664,21 +3664,11 @@ INTERCEPTOR(char *, realpath, const char void *ctx; COMMON_INTERCEPTOR_ENTER(ctx, realpath, path, resolved_path); if (path) COMMON_INTERCEPTOR_READ_RANGE(ctx, path, REAL(strlen)(path) + 1); - - // Workaround a bug in glibc where dlsym(RTLD_NEXT, ...) returns the oldest - // version of a versioned symbol. For realpath(), this gives us something - // (called __old_realpath) that does not handle NULL in the second argument. - // Handle it as part of the interceptor. - char *allocated_path = nullptr; - if (!resolved_path) - allocated_path = resolved_path = (char *)WRAP(malloc)(path_max + 1); - char *res = REAL(realpath)(path, resolved_path); - if (allocated_path && !res) WRAP(free)(allocated_path); if (res) COMMON_INTERCEPTOR_WRITE_RANGE(ctx, res, REAL(strlen)(res) + 1); return res; } -#define INIT_REALPATH COMMON_INTERCEPT_FUNCTION(realpath); +#define INIT_REALPATH COMMON_INTERCEPT_FUNCTION_GLIBC_VER_MIN(realpath, "GLIBC_2.3"); #else #define INIT_REALPATH #endif and then realpath could work with NULL second arg even when the interceptors are ignored. Backporting it to older GCC releases would require more work. And it wouldn't make your testcase valid...
I have created https://reviews.llvm.org/D107819 for this.
(In reply to Jakub Jelinek from comment #2) > Backporting it to older GCC releases would require more work. > And it wouldn't make your testcase valid... I'd be willing to put this patch into RHEL 9 gcc, but not RHEL 8 gcc, sorry. Would that be helpful, Carlos?
(In reply to Marek Polacek from comment #4) > (In reply to Jakub Jelinek from comment #2) > > Backporting it to older GCC releases would require more work. > > And it wouldn't make your testcase valid... > > I'd be willing to put this patch into RHEL 9 gcc, but not RHEL 8 gcc, sorry. > Would that be helpful, Carlos? The customer is using RHEL-8 but there is no pressure, since he already found a workaround: "We already have a workaround: boost::filesystem::canonical() doesn't call realpath(). It resolves relative paths, ".", and ".." internally, and calls readlink() for symlinks, but readlink() requires a buffer even in the latest POSIX, so there is no such issue with the function from Boost."
Great, thanks for the update. Jakub's patch has already been applied upstream: https://github.com/llvm/llvm-project/commit/faef0d042f523357fe5590e7cb6a8391cf0351a8 which means that it will eventually get into gcc, too. I'm closing this BZ then, but if there's any need to have this fixed in RHEL 9 too, please reopen.