Bug 2011438
| Summary: | gating run with rpminspect and annocheck: pie test malloc says PASS but rpminspect says VERIFY | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Brian Lane <bcl> |
| Component: | annobin | Assignee: | Nick Clifton <nickc> |
| Status: | CLOSED ERRATA | QA Contact: | Václav Kadlčík <vkadlcik> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 9.0 | CC: | dcantrell, fweimer, jdenemar, mcermak, msrb, nickc, rjones, tfujiwar, vkadlcik |
| Target Milestone: | rc | Keywords: | Bugfix, Triaged |
| Target Release: | --- | Flags: | pm-rhel: mirror+ |
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | annobin-10.13-1.el9 | Doc Type: | No Doc Update |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-05-17 12:33:08 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Brian Lane
2021-10-06 15:29:57 UTC
(In reply to Brian Lane from comment #0)
> Hardened: /usr/libexec/tests/composer-cli/composer-cli-tests: PASS: pie test
> malloc(): corrupted top size
Is the "malloc(): corrupted top size" part of the message displayed by annocheck, or has it come from elsewhere? If it has come from annocheck then it appears that there is a memory corruption somewhere. Can you point me at an rpm containing this composer-cli-tests binary so that I can run some checks of my own?
> http://artifacts.osci.redhat.com/testing-farm/dd4d3c10-b0be-4e64-a5c0-d65a16f7233a/work-rpminspect0N8FR3/rpminspect/execute/data/annocheck/output.txt
Unfortunately any attempt to access artifacts.osci.redhat.com just hangs for me, so I cannot see this log. :-(
In addition to the malloc issue, I can see several suspicious messages in the annocheck output for libvirt (comment #5):
- threads test free(): invalid size
- threads test munmap_chunk(): invalid pointer
- threads test double free or corruption (out)
Hmm, well I think that it is clear that there is some kind of memory corruption in annocheck 10.06. I still cannot reproduce the problem locally, but I have found one bug:
Hardened: /usr/lib64/libvirt-admin.so.0.7008.0: (component: xdr_admin_connect_lookup_server_ret) unable to parse tool attribute: plugin name: annobin.
The control character in this output string is due to printf() being called on a buffer one byte before the start of the printable portion. Annoying, but I do not think that this would cause a memory corruption. I have a fix for this problem, but I am going to carry on investigating.
This is so frustrating. I have tried running annocheck under valgrind, running it with MALLOC_CHECK_=1, building it with -fsanitize=address and building it with -fsanitize=undefined, and none of these attempts have shown any kind of memory problem (or any problems at all). Maybe the problem is host specific. Does anyone know what host machine is used to run these rpminspect tests?
Here's another one in a different package. Also cannot reproduce it locally.
http://artifacts.osci.redhat.com/testing-farm/f86fc45d-7a5b-4574-8e5b-6b14c1ad1e0c/work-rpminspectEQpYYd/rpminspect/execute/data/annocheck/output.txt
I uncovered a memory leak in librpminspect and fixed that, but still get the malloc() top size error. That's from annocheck and annocheck SIGABRTs.
Here's what I'm looking at in gdb right now:
Core was generated by `annocheck --ignore-unknown --verbose --debug-dir=/var/tmp/rpminspect/local.OUS0'.
Program terminated with signal SIGABRT, Aborted.
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
49 return ret;
(gdb) bt
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:49
#1 0x00007f848112a8a4 in __GI_abort () at abort.c:79
#2 0x00007f8481183a97 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f84812947fc "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3 0x00007f848118b70c in malloc_printerr (str=str@entry=0x7f8481292af3 "malloc(): corrupted top size") at malloc.c:5628
#4 0x00007f848118f114 in _int_malloc (av=av@entry=0x7f84812c7a00 <main_arena>, bytes=bytes@entry=448108) at malloc.c:4332
#5 0x00007f8481190177 in __GI___libc_malloc (bytes=bytes@entry=448108) at malloc.c:3229
#6 0x00007f84814368da in __libelf_set_rawdata_wrlock (scn=scn@entry=0x55e846b5d6a8) at elf_getdata.c:331
#7 0x00007f8481436e0a in __elf_getdata_rdlock (scn=0x55e846b5d6a8, data=<optimized out>) at elf_getdata.c:541
#8 0x000055e845d28262 in run_checkers (elf=0x55e846b5d360, fd=<optimized out>,
filename=0x7ffcd76dbffe "/var/tmp/rpminspect/local.OUS05q/after/aarch64/weldr-client-tests-35.3-2.el9.aarch64//usr/libexec/tests/composer-cli/composer-cli-tests") at /usr/src/debug/annobin-10.12-1.fc34.x86_64/annocheck/annocheck.c:692
#9 process_elf (
filename=0x7ffcd76dbffe "/var/tmp/rpminspect/local.OUS05q/after/aarch64/weldr-client-tests-35.3-2.el9.aarch64//usr/libexec/tests/composer-cli/composer-cli-tests", fd=<optimized out>, elf=0x55e846b5d360) at /usr/src/debug/annobin-10.12-1.fc34.x86_64/annocheck/annocheck.c:1610
#10 0x000055e845d288ca in process_file (
filename=0x7ffcd76dbffe "/var/tmp/rpminspect/local.OUS05q/after/aarch64/weldr-client-tests-35.3-2.el9.aarch64//usr/libexec/tests/composer-cli/composer-cli-tests") at /usr/src/debug/annobin-10.12-1.fc34.x86_64/annocheck/annocheck.c:1825
#11 0x000055e845d2666c in process_files () at /usr/src/debug/annobin-10.12-1.fc34.x86_64/annocheck/annocheck.c:1995
#12 main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/annobin-10.12-1.fc34.x86_64/annocheck/annocheck.c:2087
(gdb)
Because rpminspect already extracts all of the RPMs for all of its other work, rpminspect invokes annocheck with a --debug-dir option and the specific file to examine. Here's the invocation where I am seeing "malloc(): corrupted top size":
$ annocheck --ignore-unknown --verbose --debug-dir=/var/tmp/rpminspect/local.BzLX5p/after/aarch64/weldr-client-debuginfo-35.3-2.el9.aarch64 /usr/libexec/tests/composer-cli/composer-cli-tests
The debug dir location is just where rpminspect temporarily extracted the debuginfo RPM. The last argument is the name of the executable it is asking annocheck to examine, but I have sanitized the output here and trimmed the leading working directory. It would actually be "/var/tmp/rpminspect/local.BzLX5p/after/aarch64/weldr-client-tests-35.3-2.el9.aarch64/usr/share/licenses/weldr-client-tests"
Hi David,
Ah OK this is starting to make more sense. Running a command similar to yours but under valgrind reveals:
[...]
==2352072== Conditional jump or move depends on uninitialised value(s)
==2352072== at 0x11B22E: UnknownInlinedFun (hardened.c:3562)
==2352072== by 0x11B22E: check_seg.lto_priv.0 (hardened.c:3478)
==2352072== by 0x11106D: UnknownInlinedFun (annocheck.c:750)
==2352072== by 0x11106D: process_elf (annocheck.c:1610)
==2352072== by 0x1118C9: process_file (annocheck.c:1825)
==2352072== by 0x10F66B: UnknownInlinedFun (annocheck.c:1995)
==2352072== by 0x10F66B: main (annocheck.c:2087)
==2352072==
[...]
Not quite a memory leak, but definitely worth investigating...
So I think one thing I'm doing in rpminspect is not matching the right debuginfo tree to the subpackage in question. I'm fixing that now.
(In reply to Nick Clifton from comment #19)
> ==2352072== Conditional jump or move depends on uninitialised value(s)
> ==2352072== at 0x11B22E: UnknownInlinedFun (hardened.c:3562)
This was due to annocheck not testing the return value from libelf's gelf_getnote() function. It just assumed that the call always succeeded. I have a local fix for this.
Carrying on to investigate what annocheck does when it is given a bogus --debug-dir...
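The missing return-value check described above can be illustrated with a small sketch. This is not annocheck's code: `struct note_hdr`, `read_note()` and `note_has_type()` are hypothetical stand-ins, written so that the example builds without libelf. The only thing borrowed from gelf_getnote() is its convention of returning 0 on failure, in which case the result structure must not be read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an ELF note header (not libelf's GElf_Nhdr). */
struct note_hdr {
    uint32_t namesz;
    uint32_t descsz;
    uint32_t type;
};

/* Hypothetical reader using the same convention as gelf_getnote():
   returns the offset just past the header on success, 0 on failure.
   On failure *out is left untouched, so it may still be uninitialised. */
size_t read_note(const unsigned char *buf, size_t len, size_t offset,
                 struct note_hdr *out)
{
    assert(buf != NULL && out != NULL);

    if (offset > len || len - offset < sizeof *out)
        return 0;                       /* truncated or bogus data */

    memcpy(out, buf + offset, sizeof *out);
    return offset + sizeof *out;
}

/* Returns 1 if the note at offset has the wanted type, 0 if it does not,
   and -1 if the note could not be read at all. */
int note_has_type(const unsigned char *buf, size_t len, size_t offset,
                  uint32_t wanted)
{
    struct note_hdr hdr;

    /* Without this check, hdr.type below would be read uninitialised,
       which is the kind of access valgrind reports as a conditional
       jump depending on uninitialised values. */
    if (read_note(buf, len, offset, &hdr) == 0)
        return -1;

    return hdr.type == wanted;
}
```

A caller would treat the -1 result as "could not parse this note" rather than as a failed hardening test.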
OK - I have found it. When --debug-dir points to a long directory name it overflows an internal buffer used by annocheck. I am working on a fix now...
And to follow up from the rpminspect side... I have fixed the debuginfo subdir matching code so it picks the right directory for --debug-dir. That caused the malloc error to go away and annocheck stopped SIGABRTing. It now fails, but for the expected reason, and the FAIL results in the annocheck code match the exit code of the process. Thanks, Nick, for looking into this on the annocheck side.
BTW, the test output from libvirt contains
FAIL: property-note test because no .note.gnu.property section found
could it be in any way caused by the same bug or should I file a separate BZ for it?
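For readers unfamiliar with this class of bug, here is a minimal sketch of how a long debug directory name can be prevented from overflowing a fixed-size path buffer. The thread does not show the actual annocheck fix; `join_debug_path()` and its parameters are hypothetical, and the sketch only demonstrates the general technique of checking the return value of snprintf() for truncation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: joins dir and file into out, which holds out_size
   bytes. Returns 0 on success, or -1 if the joined path does not fit. */
int join_debug_path(char *out, size_t out_size,
                    const char *dir, const char *file)
{
    int needed;

    assert(dir != NULL && file != NULL);

    if (out == NULL || out_size == 0)
        return -1;

    /* snprintf() never writes more than out_size bytes and returns the
       length the full string would have had, so a result >= out_size
       means the path was truncated. */
    needed = snprintf(out, out_size, "%s/%s", dir, file);
    if (needed < 0 || (size_t) needed >= out_size) {
        out[0] = '\0';                  /* do not leave a partial path */
        return -1;
    }

    return 0;
}
```

An unchecked sprintf() or strcpy() into the same buffer would instead write past its end, which can corrupt heap metadata and only surface later as an abort inside malloc().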
(In reply to Jiri Denemark from comment #24)
> BTW, the test output from libvirt contains
>
> FAIL: property-note test because no .note.gnu.property section found
>
> could it be in any way caused by the same bug or should I file a separate BZ
> for it?
Please file a separate BZ. This is a separate issue and may even indicate a real problem with libvirt...
*** Bug 2012159 has been marked as a duplicate of this bug. ***
Right - the memory corruption should be fixed in annobin-10.13-1.el9. (It is currently in gating, but I hope it will make it through by Monday).
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (new packages: annobin), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2022:2342