Bug 1718844

Summary: glibc: Test suite failure: Malloc tests (e.g. malloc/tst-malloc-usable-tunables) fail sporadically, particularly on ppc64le (kernel bug 1749633)
Product: Red Hat Enterprise Linux 8 Reporter: Sergey Kolosov <skolosov>
Component: glibc Assignee: glibc team <glibc-bugzilla>
Status: CLOSED NOTABUG QA Contact: qe-baseos-tools-bugs
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 8.1 CC: ashankar, codonell, dj, fweimer, mnewsome, pfrankli
Target Milestone: rc Keywords: Triaged
Target Release: 8.1 Flags: pm-rhel: mirror+
Hardware: ppc64le   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1749633 (view as bug list) Environment:
Last Closed: 2020-07-20 13:48:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Sergey Kolosov 2019-06-10 11:45:45 UTC
Description of problem:
malloc/tst-malloc-usable-tunables test sometimes fails on ppc64le

Version-Release number of selected component (if applicable):
glibc-2.28-59.el8
kernel-4.18.0-100.el8.ppc64le

How reproducible:
run glibc testsuite and check if malloc/tst-malloc-usable-tunables fails.

Steps to Reproduce:
1. rpmbuild -bp ~/rpmbuild/SPECS/glibc.spec
2. make check
3. cd rpmbuild/BUILD/glibc-2.28/build-ppc64le-redhat-linux
4. ./testrun.sh ./malloc/tst-malloc-usable-tunables

Actual results:
malloc/tst-malloc-usable-tunables fails with:
malloc_usable_size: expected 7 but got 24

Expected results:
The test should pass

Additional info:

Comment 2 DJ Delorie 2019-09-06 04:56:52 UTC
tl;dr it's address space randomization, but glibc should handle it better...

A normal run of my instrumented libc.so looks like this:

i 36 in 'GLIBC_TUNABLES=glibc.malloc.check=3' out 7fffc2120000 0
sbrk(0) 7fffc2120024
7fff8b0f0000-7fff8b110000 r-xp 00000000 00:00 0                          [vdso]
7fff8b110000-7fff8b140000 r-xp 00000000 fd:00 102121442                  /root/rpmbuild/BUILD/glibc-2.28/build-ppc64le-redhat-linux/elf/ld.so
7fff8b140000-7fff8b160000 rw-p 00020000 fd:00 102121442                  /root/rpmbuild/BUILD/glibc-2.28/build-ppc64le-redhat-linux/elf/ld.so
7fffc2120000-7fffc2130000 rw-p 00000000 00:00 0                          [heap]
7fffe7de0000-7fffe7e10000 rw-p 00000000 00:00 0                          [stack]
ne1
ne1
dj: test malloc
dj: ptmalloc_init
dj: ptmalloc GLIBC_TUNABLES = 'glibc.malloc.check=3'
dj: tune mallopt 3

The first part is in dl-tunables.c:tunables_strdup().  __sbrk(36)
returns something reasonable in the heap, "ne1" means new_env is set
in __tunables_init, and the dj: lines show that we initialize the
tunable correctly.
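
For readers unfamiliar with this early-startup path, here is a minimal sketch of an sbrk-backed string copy in the spirit of tunables_strdup(). It is not the glibc source; the name sbrk_strdup is hypothetical. The 36 in the trace matches the 35 characters of 'GLIBC_TUNABLES=glibc.malloc.check=3' plus the terminating NUL.

```c
#define _DEFAULT_SOURCE   /* for sbrk() on glibc */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for dl-tunables.c:tunables_strdup(): copy S into
   memory obtained from sbrk(), since malloc() is not usable this early
   in startup.  Returns NULL if the program break cannot be moved.  */
static char *
sbrk_strdup (const char *s)
{
  assert (s != NULL);
  size_t len = strlen (s) + 1;      /* include the terminating NUL */
  void *p = sbrk ((intptr_t) len);  /* returns the old break on success */
  if (p == (void *) -1)             /* sbrk reports failure as (void *) -1 */
    return NULL;
  return memcpy (p, s, len);
}
```

A caller has to treat a NULL result as "no copy was made" rather than as a valid string.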


i 36 in 'GLIBC_TUNABLES=glibc.malloc.check=3' out ffffffffffffffff 12
sbrk(0) 7fffc18d0000
7fffaf0d0000-7fffaf0f0000 r-xp 00000000 00:00 0                          [vdso]
7fffaf0f0000-7fffaf120000 r-xp 00000000 fd:00 102121442                  /root/rpmbuild/BUILD/glibc-2.28/build-ppc64le-redhat-linux/elf/ld.so
7fffaf120000-7fffaf140000 rw-p 00020000 fd:00 102121442                  /root/rpmbuild/BUILD/glibc-2.28/build-ppc64le-redhat-linux/elf/ld.so
7fffc2540000-7fffc2570000 rw-p 00000000 00:00 0                          [stack]
strdup NULL
ne0
ne0
dj: test malloc
dj: ptmalloc_init
dj: ptmalloc GLIBC_TUNABLES = '(null)'
malloc_usable_size: expected 7 but got 24

In this case, __sbrk(36) returns -1.  I dumped the process map at that
point.  Note there is no [heap] map.  I don't know if it's the kernel
or libc that has to set that, but it *only* happens if
/proc/sys/kernel/randomize_va_space is set to 2.

randomize_va_space:
0: disabled
1: Conservative: Shared libraries and PIE binaries are randomized.
2: Conservative and start of brk area is randomized, too

Yay, start of brk area is randomized - and sometimes that's fatal.
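
To check both conditions on a given machine, something like the following could be used. It is only a sketch; the helper names are hypothetical. It reads the randomization setting and looks for a [heap] line in the process map. Note that stdio itself calls malloc, so in a healthy process [heap] will normally already exist by the time the map is read.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns the value of
   /proc/sys/kernel/randomize_va_space (0, 1 or 2), or -1 if the file
   cannot be read or parsed.  */
static int
read_randomize_va_space (void)
{
  FILE *f = fopen ("/proc/sys/kernel/randomize_va_space", "r");
  if (f == NULL)
    return -1;
  int value = -1;
  if (fscanf (f, "%d", &value) != 1)
    value = -1;
  fclose (f);
  return value;
}

/* Hypothetical helper: returns 1 if /proc/self/maps contains a [heap]
   line, 0 if it does not, -1 if the map cannot be read.  */
static int
has_heap_mapping (void)
{
  FILE *f = fopen ("/proc/self/maps", "r");
  if (f == NULL)
    return -1;
  char line[512];
  int found = 0;
  while (fgets (line, sizeof line, f) != NULL)
    if (strstr (line, "[heap]") != NULL)
      {
        found = 1;
        break;
      }
  fclose (f);
  return found;
}
```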


Moving on, though... the sbrk() fails, NULL is returned, and that
"string" is injected into the environ[] list, effectively removing
that variable - and all variables following it - from the environment.

	  if (new_env != NULL)
	    parse_tunables (new_env + len + 1, envval);
	  /* Put in the updated envval.  */
	  *prev_envp = new_env;

We shouldn't put new_env into *prev_envp if new_env is NULL.
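
One way to express that guard, as a sketch only and not the actual glibc change (install_env_copy is a hypothetical helper name):

```c
#include <stddef.h>

/* Hypothetical helper, not a glibc function: install the private copy of
   an environment string into its environ[] slot only if the copy was
   actually made.  Returns 1 if the slot was updated, 0 if it was left
   untouched.  */
static int
install_env_copy (char **slot, char *copy)
{
  if (copy == NULL)
    return 0;       /* allocation failed: keep the original entry, so the
                       list is not terminated early at this slot */
  *slot = copy;
  return 1;
}
```

With this shape a failed allocation leaves the rest of the environment visible, although the tunables themselves are still not parsed.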

But that won't fix the randomization problem; sbrk still fails and
tunables are still not parsed.

More:  I was able to strace one of the failures:

execve("./elf/ld64.so.2", ["./elf/ld64.so.2", "--library-path", ".:./math:./elf:./dlfcn:./nss:./n"..., "malloc/tst-malloc-usable-tunable"...], 0x10003ae52c0 /* 44 vars */) = 0
brk(NULL)                               = 0x7fffcaa00000
brk(0x7fffcaa00024)                     = 0x7fffcaa00000
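
Reading the trace: the raw Linux brk system call returns the new break on success and the unchanged current break on failure. The second call asked for 0x24 (36) more bytes and got the old value back, so it failed. The same probe can be sketched in C; raw_brk_can_grow is a hypothetical name, and the code assumes a single-threaded caller because it moves the break behind the allocator's back and then restores it.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical probe: returns 1 if the kernel agrees to move the program
   break up by EXTRA bytes, 0 if it refuses.  The break is restored before
   returning.  Assumes no other thread allocates concurrently.  */
static int
raw_brk_can_grow (uintptr_t extra)
{
  /* brk(0) never succeeds in moving the break, so it reports the
     current one.  */
  uintptr_t cur = (uintptr_t) syscall (SYS_brk, 0);
  uintptr_t want = cur + extra;
  uintptr_t got = (uintptr_t) syscall (SYS_brk, want);
  if (got != want)
    return 0;                 /* kernel left the break where it was */
  syscall (SYS_brk, cur);     /* undo the change */
  return 1;
}
```

On an affected kernel, raw_brk_can_grow (0x24) would be expected to return 0 in the failing runs.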


Full reproduction steps:
* Get the glibc SRPM and install it; rpmbuild -ba glibc.spec
* cd rpmbuild/BUILD/glibc*/build*
* while GLIBC_TUNABLES=glibc.malloc.check=3 strace -f /bin/bash ./testrun.sh malloc/tst-malloc-usable-tunables; do echo; done

Comment 3 Florian Weimer 2019-11-08 16:22:32 UTC
This is a kernel bug, see bug 1749633, and it's already fixed upstream. I think we should still keep this bug open for discoverability based on the test name.

Comment 6 Carlos O'Donell 2020-07-20 13:48:53 UTC
Marking this CLOSED/NOTABUG. The bug should still show up when developers file new bugs if they see this failure in the builds. This bug was created as a historical record of the root cause analysis for the testsuite failure.