Bug 1981410
Summary: | annocheck reports that /sbin/ldconfig stack protection deliberately disabled | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 9 | Reporter: | Jan Pazdziora <jpazdziora> |
Component: | annobin | Assignee: | Nick Clifton <nickc> |
Status: | CLOSED ERRATA | QA Contact: | Václav Kadlčík <vkadlcik> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 9.0 | CC: | ashankar, codonell, dj, fweimer, jjaburek, jpazdziora, mcermak, mnewsome, nickc, pfrankli, rlemosor, sipoyare, tschelle |
Target Milestone: | beta | Keywords: | Bugfix, Reopened, Triaged |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | annobin-10.10-1.el9 | Doc Type: | No Doc Update |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2022-05-17 12:33:04 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 2044387 |
Description
Jan Pazdziora
2021-07-12 14:00:31 UTC
When I upgrade glibc to the (currently gated) glibc-2.33.9000-39.el9 packages, the output changes to

# annocheck --verbose --ignore-gaps --skip-all --test-pic --test-pie --test-stack-prot /sbin/ldconfig
annocheck: Version 9.79.
Hardened: /sbin/ldconfig: PASS: pie test
Hardened: /sbin/ldconfig: PASS: stack-prot test
Hardened: /sbin/ldconfig: PASS: pic test
Hardened: /sbin/ldconfig: info: set binary producer to <unknown>.
Hardened: /sbin/ldconfig: info: set binary producer to <unknown>.
Hardened: /sbin/ldconfig: info: set binary producer to Gas version 2.
Hardened: /sbin/ldconfig: info: notes produced by assembler plugin version 1
Hardened: /sbin/ldconfig: info: set binary producer to Gimple version 9.
Hardened: /sbin/ldconfig: info: notes produced by lto plugin version 9.79
Hardened: /sbin/ldconfig: FAIL: stack-prot test because stack protection deliberately disabled (addr range: 0x996d..0xc5c9c)
Hardened: /sbin/ldconfig: FAIL: stack-prot test because stack protection deliberately disabled (addr range: 0x12e40..0xc5e44)

So the gcc version discrepancy message is gone but there are still two FAILs reported.

But with this new glibc version, there is now a new FAIL for ld-linux-x86-64.so.2:

Hardened: /lib64/ld-linux-x86-64.so.2: FAIL: stack-prot test because stack protection deliberately disabled (function: notify_audit_modules_of_loaded_object) (source: annobin notes)

ldconfig is statically linked, so the early userspace process initialization code is included in the binary. Some of that early initialization code runs before the stack protector has been set up, and those parts have to be excluded from instrumentation. Otherwise process startup code would just crash. The same issue applies to the dynamic linker, where this early initialization code is located for dynamically linked binaries.

I thought that based on the discussion in bug 1923439, it was possible to reach the goal of making annocheck aware of these early initialization codes, to make annocheck run clean. That way the other test tooling would not need to have a separate exception for glibc.

So for the dynamic linker, it seems adding notify_audit_modules_of_loaded_object to startup_funcs in http://sourceware.org/git/?p=annobin.git;a=blob;f=annocheck/hardened.c;h=82339e86d3237413c004132fa35f81f11a06ac09;hb=HEAD#l1351 should do the trick? Nick, is that the case? Is the /sbin/ldconfig solvable?

(In reply to Jan Pazdziora from comment #5)
> I thought that based on the discussion in bug 1923439, it was possible to
> reach the goal of making annocheck aware of these early initialization codes,
> to make annocheck run clean. That way the other test tooling would not need
> to have separate exception for glibc.

It is possible, and it has been done, but ... it depends upon adding known startup function names to annocheck. As the glibc source code changes, and as gcc changes, the function names change. (Mainly because annocheck looks for the function nearest the start of the address range that appears to be failing a test. So if the compiler rearranges the code, the function name changes).

I can, and will, add "notify_audit_modules_of_loaded_object" to the list of known startup functions. But really there needs to be a better way to detect startup code. The problem is, I have not been able to think of one.

Let me reopen this bugzilla with a goal of figuring out some mechanism which would be reasonably stable across updates. Ideally, if a new function gets added to the startup section, communication with annocheck maintainer will be needed to keep things in sync. From a practical point of view, the goal is also to prevent every glibc build getting stopped at gating and requiring a manual waiver.

(In reply to Nick Clifton from comment #6)
> I can, and will, add "notify_audit_modules_of_loaded_object" to the list of
> known startup functions. But really there needs to be a better way to detect
> startup code. The problem is, I have not been able to think of one.
Nick, I'm going to assign this bug to you since we need to add this function to the exceptions list. We can and should continue to think of a better way to do this. (In reply to Carlos O'Donell from comment #9) > Nick, I'm going to assign this bug to you since we need to add this function > to the exceptions list. We can and should continue to think of a better way > to do this. Fair enough. Is there any chance that glibc code that is compiled without these security features could have some kind of identifying property ? Eg a prefix to the name, or a special ELF symbol type (STT_GNU_EXEMPT_FUNC anyone ?) or contained in a special section - one that isn't merged with .text at final link. (In reply to Nick Clifton from comment #10) > Fair enough. Is there any chance that glibc code that is compiled without > these > security features could have some kind of identifying property ? Eg a prefix > to the name, or a special ELF symbol type (STT_GNU_EXEMPT_FUNC anyone ?) or > contained in a special section - one that isn't merged with .text at final > link. Carlos and I briefly discussed this last week and it may well make sense to think of an idea to do this given the repeated need to fix up exceptions. I've filed bug 1983526 to track that. Hello, I've noticed that at least /usr/lib64/ld-linux-x86-64.so.2 (on x86_64) and /usr/lib/ld64.so.1 (on s390x) are no longer being reported by annocheck as having FAILed the stack-prot test (there is now a skip note), however /usr/sbin/ldconfig still does fail. (Due to the huge queues for ppc64le, I don't have results for that yet.) $ annocheck --skip-all --test-stack-prot --verbose /usr/sbin/ldconfig /usr/lib64/ld-linux-x86-64.so.2 annocheck: Version 9.83. Hardened: /usr/lib64/ld-linux-x86-64.so.2: info: set binary producer to GCC version 11. Hardened: /usr/lib64/ld-linux-x86-64.so.2: PASS: stack-prot test Hardened: /usr/lib64/ld-linux-x86-64.so.2: info: ALSO written in Assembler (source: DW_AT_language string). 
Hardened: /usr/lib64/ld-linux-x86-64.so.2: info: set binary producer to Gas version 2. Hardened: /usr/lib64/ld-linux-x86-64.so.2: WARN: Command line options not recorded by -grecord-gcc-switches Hardened: /usr/lib64/ld-linux-x86-64.so.2: info: set binary producer to Gimple version 9. Hardened: /usr/lib64/ld-linux-x86-64.so.2: info: notes produced by lto plugin version 9.83 Hardened: /usr/lib64/ld-linux-x86-64.so.2: info: set binary producer to Gas version 2. Hardened: /usr/lib64/ld-linux-x86-64.so.2: info: notes produced by assembler plugin version 1 Hardened: /usr/lib64/ld-linux-x86-64.so.2: skip: stack-prot test because function notify_audit_modules_of_loaded_object is part of the C library's startup code, which executes before stack protection is established Hardened: /usr/sbin/ldconfig: PASS: stack-prot test Hardened: /usr/sbin/ldconfig: info: set binary producer to <unknown>. Hardened: /usr/sbin/ldconfig: info: set binary producer to <unknown>. Hardened: /usr/sbin/ldconfig: info: set binary producer to Gas version 2. Hardened: /usr/sbin/ldconfig: info: notes produced by assembler plugin version 1 Hardened: /usr/sbin/ldconfig: info: set binary producer to Gimple version 9. Hardened: /usr/sbin/ldconfig: info: notes produced by lto plugin version 9.83 Hardened: /usr/sbin/ldconfig: FAIL: stack-prot test because stack protection deliberately disabled (addr range: 0x98e4..0xd43dc) Hardened: /usr/sbin/ldconfig: FAIL: stack-prot test because stack protection deliberately disabled (addr range: 0x12fa0..0xd4584) Is this expected? .. I don't quite know what's going on there - it first reports PASS and then goes on to FAIL. annobin-annocheck-9.83-2.el9 glibc-2.34-1.el9 Thanks, Jiri Hi Jiri, Is it OK if I CLOSE this BZ now ? 
Cheers Nick ----------------------------------------------------------------------------------- [For the record - and future readers - this was my emailed response to comment #12: > (Due to the huge queues for ppc64le, I don't have results for that yet.) Did you know that annocheck supports cross-checking ? Ie you can run annocheck on any host to check any rpm, not just those native to the host... > Hardened: /usr/sbin/ldconfig: PASS: stack-prot test > Hardened: /usr/sbin/ldconfig: info: set binary producer to <unknown>. Hmmm, an "<unknown>" producer is worrying... (I have a local fix for this now however. It will be in the next release of annocheck). > Hardened: /usr/sbin/ldconfig: FAIL: stack-prot test because stack protection > deliberately disabled (addr range: 0x98e4..0xd43dc) > Is this expected? Yes. See below. > . I don't quite know what's going on there - it first > reports PASS and then goes on to FAIL. I know - weird. Annocheck actually contains two checks for the stack protection feature. It examines the compilation command line (if it is recorded in the binary) and the annobin notes (if they exist). So in this case the option is present in the recorded command line, but also recorded as being deliberately disabled in the annobin notes. Anyway I think that I have tracked down the problem. Take a look at these two annocheck runs: % annocheck --skip-all --test-stack-prot --verbose sbin/ldconfig annocheck: Version 9.84. Hardened: sbin/ldconfig: PASS: stack-prot test Hardened: sbin/ldconfig: info: set binary producer to Gas version 2. Hardened: sbin/ldconfig: info: notes produced by assembler plugin version 1 Hardened: sbin/ldconfig: info: set binary producer to Gimple version 9. 
Hardened: sbin/ldconfig: info: notes produced by lto plugin version 9.83
Hardened: sbin/ldconfig: FAIL: stack-prot test because stack protection deliberately disabled (addr range: 0x98e4..0xd43dc)
Hardened: sbin/ldconfig: FAIL: stack-prot test because stack protection deliberately disabled (addr range: 0x12fa0..0xd4584)

% annocheck --skip-all --test-stack-prot --verbose ./sbin/ldconfig --debug-file usr/lib/debug/sbin/ldconfig-2.34-2.el9.x86_64.debug
annocheck: Version 9.84.
Hardened: ./sbin/ldconfig: info: set binary producer to Gas version 2.
Hardened: ./sbin/ldconfig: WARN: Command line options not recorded by -grecord-gcc-switches
Hardened: ./sbin/ldconfig: info: ALSO written in C (source: DW_AT_language string).
Hardened: ./sbin/ldconfig: info: set binary producer to GCC version 11.
Hardened: ./sbin/ldconfig: PASS: stack-prot test
Hardened: ./sbin/ldconfig: info: set binary producer to Gas version 2.
Hardened: ./sbin/ldconfig: info: set binary producer to Gas version 2.
Hardened: ./sbin/ldconfig: info: notes produced by assembler plugin version 1
Hardened: ./sbin/ldconfig: info: set binary producer to Gimple version 9.
Hardened: ./sbin/ldconfig: info: notes produced by lto plugin version 9.83
Hardened: ./sbin/ldconfig: skip: stack-prot test because function _dl_start is part of the C library's startup code, which executes before stack protection is established

In the second run annocheck was able to detect that the disabled stack protection was part of glibc's startup code because it had access to the full symbol table for sbin/ldconfig. In the first run it only had access to the very basic symbol table that is part of stripped executables, so it could not make this determination. (Note - the ld-linux-x86-64.so.2 executable is not stripped, which is why annocheck worked without any additional command line options).
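To make the difference between the two runs concrete, here is a minimal sketch of a name-based exclusion. It is not annocheck's code: the function names `nearest_function` and `classify_unprotected_range`, the "last symbol at or below the address" lookup, and all addresses used with it are assumptions made for illustration. Only the two startup function names come from this thread.

```python
import bisect

# Startup functions named in this thread. The real list is maintained in
# annocheck's hardened.c; this two-entry set is only for illustration.
KNOWN_STARTUP_FUNCS = {"_dl_start", "notify_audit_modules_of_loaded_object"}


def nearest_function(symbols, addr):
    """Return the name of the last function symbol starting at or below addr.

    `symbols` is a list of (start_address, name) tuples sorted by address.
    Returns None when no symbol covers the address, which models a stripped
    binary scanned without its debuginfo file.
    """
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, addr) - 1
    if index < 0:
        return None
    return symbols[index][1]


def classify_unprotected_range(symbols, range_start):
    """Return "skip" for known startup code, "FAIL" for everything else."""
    name = nearest_function(symbols, range_start)
    if name in KNOWN_STARTUP_FUNCS:
        return "skip"
    return "FAIL"
```

With a hypothetical full symbol table such as `[(0x1000, "main"), (0x2000, "_dl_start")]`, a range starting at 0x2000 is skipped. With an empty table, the same range cannot be attributed to any function and is reported as a FAIL, which is the behaviour seen in the first run.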
-------------------------------------------------------------------------
And this was Jiri's response:
-------------------------------------------------------------------------

> Did you know that annocheck supports cross-checking ? Ie you can run
> annocheck on any host to check any rpm, not just those native to the host...

That's good to know, however we're running it as part of a bigger test suite, which needs to be run on a given architecture. We're also scanning files directly due to multiple reasons:

1) We need to catch and scan any additional ELF binaries in our Common Criteria OS configuration, whether these come from sources outside RPMs or are generated via RPM triggers/hooks
2) It exercises a different algorithm than the RPM-based scanning in gating/OSCI, which is how we found failures that were undetected before due to a bug in the RPM-specific logic

I was also considering writing a custom scanner in python using pyelftools (similar to how we had a custom scanner in RHEL-6/7 for GNU_RELRO, TEXTREL, etc.), reading .gnu.build.attributes for additional information, but seeing the glibc and other exceptions in annocheck code, that's not worth it at this point.

> In the second run annocheck was able to detect that the disabled stack
> protection was part of glibc's startup code because it had access to the
> full symbol table for sbin/ldconfig. In the first run it only had access
> to the very basic symbol table that is part of stripped executables, so
> it could not make this determination. (Note - the ld-linux-x86-64.so.2
> executable is not stripped, which is why annocheck worked without any
> additional command line options).

Right, that makes sense. It's not for the actual stack-prot check, but for the exclusion rule, which is function name based.

I can also see that annocheck uses the debuginfo automatically (if available), so I basically have to ensure all the RPMs on the system have debuginfos installed (where available) prior to running the scan.
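Since the exclusion rule only works when debuginfo is present, a scan could first check which binaries are missing it. The sketch below assumes the `/usr/lib/debug/.build-id/xx/rest.debug` layout that is visible in the strace excerpts elsewhere in this thread. The helper names are hypothetical, and obtaining a binary's build-id (for example from `readelf -n`) is left out.

```python
import os
import posixpath


def build_id_debug_path(build_id_hex, debug_root="/usr/lib/debug"):
    """Map a hex build-id to the conventional separate debuginfo path.

    The first two hex digits name the directory and the remaining digits,
    plus a ".debug" suffix, name the file.
    """
    build_id_hex = build_id_hex.lower()
    if len(build_id_hex) < 3 or any(c not in "0123456789abcdef" for c in build_id_hex):
        raise ValueError("build-id must be a hex string of at least 3 digits")
    return posixpath.join(
        debug_root, ".build-id", build_id_hex[:2], build_id_hex[2:] + ".debug"
    )


def missing_debuginfo(build_ids, debug_root="/usr/lib/debug"):
    """Return the build-ids whose debuginfo file is not installed."""
    return [
        build_id
        for build_id in build_ids
        if not os.path.exists(build_id_debug_path(build_id, debug_root))
    ]
```

For the /bin/bash build-id that appears in the strace output, `build_id_debug_path("5d9f2a19a933da650ba5cd7b3a303da1301ca2e1")` yields the same `.debug` path that annocheck is seen opening there.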
-------------------------------------------------------------------------------- (In reply to Nick Clifton from comment #13) > Hi Jiri, > > Is it OK if I CLOSE this BZ now ? Well, that doesn't really depend on me (QA Contact might have a bigger say here). As far as the testing we do, I'll need to figure out how to install/enable debuginfo repositories for our configuration + install all the hundreds of debuginfo packages before running our test suite. I plan to do that this/next week, so if you want to keep this open for a few more days until I give you the green light, feel free to. Jiri So I've installed all debuginfos via rpm -qa | xargs dnf -y debuginfo-install --skip-broken (lacking a better option) and this indeed fixed the stack-prot error for ldconfig: # annocheck --verbose --fixed-format-messages --skip-all --test-stack-prot /usr/sbin/ldconfig annocheck: Version 9.83. Hardened: /usr/sbin/ldconfig: info: ALSO written in C (source: DW_AT_language string). Hardened: PASS: test: stack-prot file: /usr/sbin/ldconfig. ... However I seem to be hitting another bug (aside from the cmdline order being reversed, which was fixed in upstream), possibly some hard limit on debuginfo lookups: # annocheck --verbose --fixed-format-messages --skip-all --test-stack-prot /usr/sbin/ldconfig $(for i in {1..168}; do echo /bin/bash; done) annocheck: Version 9.83. Hardened: PASS: test: stack-prot file: /bin/bash. Hardened: PASS: test: stack-prot file: /bin/bash. <snip many lines> Hardened: PASS: test: stack-prot file: /bin/bash. Hardened: /usr/sbin/ldconfig: info: ALSO written in C (source: DW_AT_language string). Hardened: PASS: test: stack-prot file: /usr/sbin/ldconfig. Hardened: FAIL: test: stack-prot file: /usr/sbin/ldconfig. This is inserting 168 instances of /bin/bash after /usr/sbin/ldconfig, which somehow causes the ldconfig check to fail. Changing it to 167 makes it PASS. 
annobin-annocheck-9.83-2.el9.x86_64 (In reply to Jiri Jaburek from comment #16) Hi Jiri, > # annocheck --verbose --fixed-format-messages --skip-all --test-stack-prot > /usr/sbin/ldconfig $(for i in {1..168}; do echo /bin/bash; done) > annocheck: Version 9.83. > Hardened: FAIL: test: stack-prot file: /usr/sbin/ldconfig. > > This is inserting 168 instances of /bin/bash after /usr/sbin/ldconfig, which > somehow causes the ldconfig check to fail. Changing it to 167 makes it PASS. Weird! I cannot reproduce this failure locally, but I am using version 9.86 of annocheck. Does the problem still happen if there are more than 168 instances of bash on the command line ? I am not sure what might be causing it. Is there anything special about the test machine that you are using ? Only a small amount of memory maybe ? Are you able to repeat the test using the latest version of rawhide's annobin: https://koji.fedoraproject.org/koji/buildinfo?buildID=1817311 Cheers Nick (In reply to Nick Clifton from comment #17) > Weird! I cannot reproduce this failure locally, but I am using version 9.86 > of > annocheck. Does the problem still happen if there are more than 168 > instances > of bash on the command line ? Yes. It's how I discovered it - passing many more via xargs because of annocheck's funky filesystem traversal logic. I do, essentially, find -H "$@" -type f -size +3c -ignore_readdir_race -print0 | xargs -0 -- annocheck ... > > I am not sure what might be causing it. Is there anything special about the > test machine that you are using ? Only a small amount of memory maybe ? Not likely, 4 GiB of RAM, most of which is free. 
> > Are you able to repeat the test using the latest version of rawhide's > annobin: > > https://koji.fedoraproject.org/koji/buildinfo?buildID=1817311 Yes, once I flip the arguments around, accounting for https://sourceware.org/git/?p=annobin.git;a=commit;h=be7478564bf2c7925db1ab323372ff2e49e77696 # annocheck --verbose --fixed-format-messages --skip-all --test-stack-prot $(for i in {1..168}; do echo /bin/bash; done) /usr/sbin/ldconfig annocheck: Version 9.86. Hardened: PASS: test: stack-prot file: /bin/bash. Hardened: PASS: test: stack-prot file: /bin/bash. ... Hardened: PASS: test: stack-prot file: /bin/bash. Hardened: PASS: test: stack-prot file: /bin/bash. Hardened: /usr/sbin/ldconfig: info: ALSO written in C (source: DW_AT_language string). Hardened: PASS: test: stack-prot file: /usr/sbin/ldconfig. Hardened: FAIL: test: stack-prot file: /usr/sbin/ldconfig. Here, 168 might just be a random number, an offset where memory corruption causes the check to fail. Running under valgrind, I can see annocheck having some issues: Hardened: PASS: test: stack-prot file: /bin/bash. Hardened: PASS: test: stack-prot file: /bin/bash. Hardened: PASS: test: stack-prot file: /bin/bash. Hardened: PASS: test: stack-prot file: /bin/bash. Hardened: /usr/sbin/ldconfig: info: ALSO written in C (source: DW_AT_language string). Hardened: PASS: test: stack-prot file: /usr/sbin/ldconfig. 
==26708== Warning: invalid file descriptor 1031 in syscall openat() ==26708== Warning: invalid file descriptor 1031 in syscall openat() ==26708== Warning: invalid file descriptor 1031 in syscall openat() ==26708== Warning: invalid file descriptor 1031 in syscall openat() ==26708== Warning: invalid file descriptor 1031 in syscall openat() ==26708== Warning: invalid file descriptor 1031 in syscall openat() ==26708== Warning: invalid file descriptor 1031 in syscall openat() ==26708== Warning: invalid file descriptor 1031 in syscall openat() ==26708== Conditional jump or move depends on uninitialised value(s) ==26708== at 0x1120E9: skip_test_for_current_func (hardened.c:1435) ==26708== by 0x119685: build_note_checker (hardened.c:2015) ==26708== by 0x10F83B: annocheck_walk_notes (annocheck.c:524) ==26708== by 0x11543A: check_note_section (hardened.c:2779) ==26708== by 0x1102DA: UnknownInlinedFun (annocheck.c:633) ==26708== by 0x1102DA: process_elf (annocheck.c:1493) ==26708== by 0x110D29: process_file (annocheck.c:1708) ==26708== by 0x10E31B: UnknownInlinedFun (annocheck.c:1866) ==26708== by 0x10E31B: main (annocheck.c:1958) ==26708== Uninitialised value was created by a stack allocation ==26708== at 0x118376: annocheck_get_symbol_name_and_type (annocheck.c:1363) ==26708== Hardened: FAIL: test: stack-prot file: /usr/sbin/ldconfig. 
==26708== Warning: invalid file descriptor 1031 in syscall openat() ==26708== Warning: invalid file descriptor 1031 in syscall openat() ==26708== Warning: invalid file descriptor 1031 in syscall openat() ==26708== Warning: invalid file descriptor 1031 in syscall openat() ==26708== ==26708== HEAP SUMMARY: ==26708== in use at exit: 714,488,777 bytes in 450,569 blocks ==26708== total heap usage: 703,774 allocs, 253,205 frees, 996,296,333 bytes allocated ==26708== ==26708== LEAK SUMMARY: ==26708== definitely lost: 198,974 bytes in 2,805 blocks ==26708== indirectly lost: 119,798,006 bytes in 73,152 blocks ==26708== possibly lost: 594,376,792 bytes in 372,496 blocks ==26708== still reachable: 115,005 bytes in 2,116 blocks ==26708== suppressed: 0 bytes in 0 blocks ==26708== Rerun with --leak-check=full to see details of leaked memory ==26708== ==26708== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0) ==26708== ==26708== 1 errors in context 1 of 1: ==26708== Conditional jump or move depends on uninitialised value(s) ==26708== at 0x1120E9: skip_test_for_current_func (hardened.c:1435) ==26708== by 0x119685: build_note_checker (hardened.c:2015) ==26708== by 0x10F83B: annocheck_walk_notes (annocheck.c:524) ==26708== by 0x11543A: check_note_section (hardened.c:2779) ==26708== by 0x1102DA: UnknownInlinedFun (annocheck.c:633) ==26708== by 0x1102DA: process_elf (annocheck.c:1493) ==26708== by 0x110D29: process_file (annocheck.c:1708) ==26708== by 0x10E31B: UnknownInlinedFun (annocheck.c:1866) ==26708== by 0x10E31B: main (annocheck.c:1958) ==26708== Uninitialised value was created by a stack allocation ==26708== at 0x118376: annocheck_get_symbol_name_and_type (annocheck.c:1363) Hi Jiri, (In reply to Jiri Jaburek from comment #18) > Running under valgrind, I can see annocheck having some issues: > ==26708== Warning: invalid file descriptor 1031 in syscall openat() I have no idea what this means, so I will ignore it... 
> ==26708== Conditional jump or move depends on uninitialised value(s) > ==26708== at 0x1120E9: skip_test_for_current_func (hardened.c:1435) > ==26708== by 0x119685: build_note_checker (hardened.c:2015) > ==26708== Uninitialised value was created by a stack allocation > ==26708== at 0x118376: annocheck_get_symbol_name_and_type > (annocheck.c:1363) I am still puzzled. I have looked over that code but not found any problems. Plus when I run your 168+bash command line under valgrind there are no errors and everything works. :-( What type of host machine are you using ? If I generated a scratch rpm build with a potential patch in it, would you be able to download it and try it out ? Cheers Nick (In reply to Nick Clifton from comment #19) > Hi Jiri, > > (In reply to Jiri Jaburek from comment #18) > > > Running under valgrind, I can see annocheck having some issues: > > > ==26708== Warning: invalid file descriptor 1031 in syscall openat() > > I have no idea what this means, so I will ignore it... It means that something is passing an int with the number 1031 to openat(), but that file descriptor is not actually open (EBADF). > > > ==26708== Conditional jump or move depends on uninitialised value(s) > > ==26708== at 0x1120E9: skip_test_for_current_func (hardened.c:1435) > > ==26708== by 0x119685: build_note_checker (hardened.c:2015) > > > ==26708== Uninitialised value was created by a stack allocation > > ==26708== at 0x118376: annocheck_get_symbol_name_and_type > > (annocheck.c:1363) > > I am still puzzled. I have looked over that code but not found any > problems. Plus when I run your 168+bash command line under valgrind > there are no errors and everything works. :-( So I managed to figure out the reproducer more specifically - you need a LOT of debuginfos to be installed, preferably for all packages on the system. Ie. 
reserve a system in Beaker with latest RHEL-9 compose and do

rpm -qa | xargs dnf debuginfo-install -y --skip-broken --nogpgcheck

I'm not sure what the magical number of packages or files in /usr/lib/debug is, but my initial reproducer was with ~600 RPMs installed, and it also works with ~900 RPMs in the default Beaker reservation.

...

There are two probably unrelated issues here, one kind of discovering the other.

The primary issue is, as valgrind hints, in annocheck not closing some file descriptor and this exceeding the default 1024 limit. You can see this from strace:

openat(AT_FDCWD, "/bin/bash", O_RDONLY) = 3
openat(AT_FDCWD, "/bin/bash", O_RDONLY) = 3
openat(AT_FDCWD, "/usr/lib/debug/.build-id/5d/9f2a19a933da650ba5cd7b3a303da1301ca2e1.debug", O_RDONLY) = 4
annocheck: Version 9.86.
openat(AT_FDCWD, "/usr/lib/debug/.build-id/5d/9f2a19a933da650ba5cd7b3a303da1301ca2e1.debug", O_RDONLY) = 5
openat(AT_FDCWD, "/usr/lib/debug/.build-id/5d/9f2a19a933da650ba5cd7b3a303da1301ca2e1.debug", O_RDONLY) = 6
openat(AT_FDCWD, "/usr/lib/debug/.build-id/5d/9f2a19a933da650ba5cd7b3a303da1301ca2e1.debug", O_RDONLY) = 7
openat(AT_FDCWD, "/usr/lib/debug/.build-id/5d/9f2a19a933da650ba5cd7b3a303da1301ca2e1.debug", O_RDONLY) = 8
openat(AT_FDCWD, "/usr/lib/debug/.build-id/5d/9f2a19a933da650ba5cd7b3a303da1301ca2e1.debug", O_RDONLY) = 9
openat(AT_FDCWD, "/bin/bash", O_RDONLY) = 3
openat(AT_FDCWD, "/bin/bash", O_RDONLY) = 3
openat(AT_FDCWD, "/usr/lib/debug/.build-id/5d/9f2a19a933da650ba5cd7b3a303da1301ca2e1.debug", O_RDONLY) = 10
Hardened: PASS: test: stack-prot file: /bin/bash.
openat(AT_FDCWD, "/usr/lib/debug/.build-id/5d/9f2a19a933da650ba5cd7b3a303da1301ca2e1.debug", O_RDONLY) = 11 openat(AT_FDCWD, "/usr/lib/debug/.build-id/5d/9f2a19a933da650ba5cd7b3a303da1301ca2e1.debug", O_RDONLY) = 12 openat(AT_FDCWD, "/usr/lib/debug/.build-id/5d/9f2a19a933da650ba5cd7b3a303da1301ca2e1.debug", O_RDONLY) = 13 openat(AT_FDCWD, "/usr/lib/debug/.build-id/5d/9f2a19a933da650ba5cd7b3a303da1301ca2e1.debug", O_RDONLY) = 14 openat(AT_FDCWD, "/usr/lib/debug/.build-id/5d/9f2a19a933da650ba5cd7b3a303da1301ca2e1.debug", O_RDONLY) = 15 openat(AT_FDCWD, "/bin/bash", O_RDONLY) = 3 openat(AT_FDCWD, "/bin/bash", O_RDONLY) = 3 openat(AT_FDCWD, "/usr/lib/debug/.build-id/5d/9f2a19a933da650ba5cd7b3a303da1301ca2e1.debug", O_RDONLY) = 16 Hardened: PASS: test: stack-prot file: /bin/bash. ... The second issue that valgrind found was that if (ELF64_ST_TYPE (per_file.component_type) == STT_GNU_IFUNC) in skip_test_for_current_func uses an undefined value, likely per_file.component_type. Now, per_file is a global variable, and its .component_type seems to be set earlier in build_note_checker via /* If the new range is valid, get a component name for it. */ if (start != end) get_component_name (data, sec, note_data, prefer_func_name); but there probably is some execution flow where this doesn't get called before a call to skip_test_for_current_func is made via case 0: /* NONE */ /* See BZ 1923439: Parts of glibc are deliberately compiled without stack protection, because they execute before the framework is established. This is currently handled by tests in skip_check (). */ if (! skip_test_for_current_func (data, TEST_STACK_PROT)) fail (data, TEST_STACK_PROT, SOURCE_ANNOBIN_NOTES, "stack protection deliberately disabled"); break; This doesn't seem to be triggered when there's enough file descriptors, though. > > What type of host machine are you using ? > > If I generated a scratch rpm build with a potential patch in it, would > you be able to download it and try it out ? 
> > Cheers > Nick (In reply to Jiri Jaburek from comment #20) Hi Jiri, > The primary issue is, as valgrind hints, in annocheck not closing some file > descriptor and this exceeding the default 1024 limit. Hmm, yes that would be a problem. I will investigate this. > The second issue that valgrind found was that > > if (ELF64_ST_TYPE (per_file.component_type) == STT_GNU_IFUNC) > > in skip_test_for_current_func uses an undefined value, likely > per_file.component_type. Possibly, but the per_file structure is initialised in start() by: memset (& per_file, 0, sizeof per_file); I think that it might actually be the per_file.component_name field which is causing the problem. This is set to a string which is obtained via a call to a libelf library function, and annocheck was assuming that the string would remain valid for the duration of the program's execution. (See BZ #1988715, which I think is related). Are you able to try out either annobin-9.87-1.fc35 or annobin-9.87-1.el9 which contain a fix for this assumption ? > This doesn't seem to be triggered when there's enough file descriptors, though. Which is strange. Why would the two issues be related ? Cheers Nick (In reply to Nick Clifton from comment #21) > I think that it might actually be the per_file.component_name field which is > causing the problem. This is set to a string which is obtained via a call > to a libelf library function, and annocheck was assuming that the string > would remain valid for the duration of the program's execution. (See BZ > #1988715, which I think is related). > > Are you able to try out either annobin-9.87-1.fc35 or annobin-9.87-1.el9 > which contain a fix for this assumption ? I can check later, but you should be able to reproduce the issue via Beaker yourself and see if that helps. 
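As a rough plausibility check on the descriptor problem, the strace excerpt quoted earlier shows six debuginfo descriptors (10 through 15) staying open for one /bin/bash scan, with the first leaked descriptor being 4. The sketch below turns that into an estimate of how many files fit under the limit. It assumes the leak rate is constant per file, which the thread does not establish, and the function name is made up for this example.

```python
def files_until_fd_exhaustion(fd_limit=1024, first_free_fd=4, leaked_per_file=6):
    """Estimate how many files can be scanned before descriptors run out.

    Defaults are read off the strace excerpt in this thread: a 1024
    descriptor limit, descriptor 4 as the first one leaked, and six
    descriptors left open per scanned file.
    """
    if leaked_per_file <= 0:
        raise ValueError("leaked_per_file must be positive")
    return max(0, (fd_limit - first_free_fd) // leaked_per_file)


print(files_until_fd_exhaustion())  # (1024 - 4) // 6 = 170
```

The estimate of 170 is in the same region as the reported threshold of 168 instances of /bin/bash. It ignores whatever descriptors the final ldconfig scan itself needs, so it should be read as an order-of-magnitude check only.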
In the meantime, there seems to be an issue more relevant to this BZ as ldconfig is FAILing on s390x / ppc64le even when alone on the cmdline: [root@s390x ~]# annocheck --verbose --skip-all --test-stack-prot /usr/sbin/ldconfig annocheck: Version 9.87. Hardened: /usr/sbin/ldconfig: info: set binary producer to Gas version 2. Hardened: /usr/sbin/ldconfig: WARN: Command line options not recorded by -grecord-gcc-switches Hardened: /usr/sbin/ldconfig: info: ALSO written in C (source: DW_AT_language string). Hardened: /usr/sbin/ldconfig: info: set binary producer to GCC version 11. Hardened: /usr/sbin/ldconfig: PASS: stack-prot test Hardened: /usr/sbin/ldconfig: info: set binary producer to Gas version 2. Hardened: /usr/sbin/ldconfig: info: set binary producer to Gas version 2. Hardened: /usr/sbin/ldconfig: info: notes produced by assembler plugin version 1 Hardened: /usr/sbin/ldconfig: info: set binary producer to Gimple version 9. Hardened: /usr/sbin/ldconfig: info: notes produced by lto plugin version 9.83 Hardened: /usr/sbin/ldconfig: FAIL: stack-prot test because stack protection deliberately disabled (function: __tls_get_offset) Hardened: /usr/sbin/ldconfig: skip: stack-prot test because function check_one_fd is part of the C library's startup code, which executes before stack protection is established [root@s390x ~]# ls -l /usr/lib/debug/usr/sbin/ldconfig-2.34-1.el9.s390x.debug -rw-r--r--. 1 root root 2737672 Aug 3 12:07 /usr/lib/debug/usr/sbin/ldconfig-2.34-1.el9.s390x.debug (I don't have access to ppc64le yet to see what the exact error is there, but I'd imagine it will be similar.) > > > > This doesn't seem to be triggered when there's enough file descriptors, though. > > Which is strange. Why would the two issues be related ? > > Cheers > Nick (In reply to Jiri Jaburek from comment #22) Hi Jiri, FYI: annocheck in rawhide now has a fix for the file descriptor problem. Cheers Nick PS. 
I will try getting a machine from Beaker to test out the /usr/sbin/ldconfig problem, but I have not had much success in obtaining machines from there... Right - I was finally able to reproduce the ldconfig problem on an s390x box, and so I have added a special case exception for the __tls_get_offset function. The fixed annobin is annobin-9.89-1.el9 (In reply to Nick Clifton from comment #24) > Right - I was finally able to reproduce the ldconfig problem on an s390x > box, and so I have added a special case exception for the __tls_get_offset > function. > > The fixed annobin is annobin-9.89-1.el9 I can confirm that 9.89-1.el9 fixes the issue for me on s390x. However (as I mentioned before) ppc64le still FAILs. I was able to acquire a system and get the function name, though: [root@ibm-p9z ~]# annocheck --verbose --skip-all --test-stack-prot /usr/sbin/ldconfig annocheck: Version 9.89. Hardened: /usr/sbin/ldconfig: info: set binary producer to Gas version 2. Hardened: /usr/sbin/ldconfig: WARN: Command line options not recorded by -grecord-gcc-switches Hardened: /usr/sbin/ldconfig: info: ALSO written in C (source: DW_AT_language string). Hardened: /usr/sbin/ldconfig: info: set binary producer to GCC version 11. Hardened: /usr/sbin/ldconfig: PASS: stack-prot test Hardened: /usr/sbin/ldconfig: info: set binary producer to Gas version 2. Hardened: /usr/sbin/ldconfig: info: set binary producer to Gas version 2. Hardened: /usr/sbin/ldconfig: info: notes produced by assembler plugin version 1 Hardened: /usr/sbin/ldconfig: info: set binary producer to Gimple version 9. 
Hardened: /usr/sbin/ldconfig: info: notes produced by lto plugin version 9.83
Hardened: /usr/sbin/ldconfig: skip: stack-prot test because function _dl_start is part of the C library's startup code, which executes before stack protection is established
Hardened: /usr/sbin/ldconfig: FAIL: stack-prot test because stack protection deliberately disabled (function: __libc_setup_tls)

OK - resetting to ASSIGNED so that I can add the code to handle the ppc64le function name.

Right - I have added the ppc64le exception to annobin-9.90-1.el9.

I can confirm ldconfig PASSes on ppc64le with 9.90-1.el9.

Alright, I've identified why it still fails in our test suite.

The simple case with "many bash, one ldconfig" no longer fails, but passing many different binaries (which don't share the same debugid) still does. Try (with enough binaries in sbin):

# ls -1 /usr/sbin | wc -l
445
# find /usr/sbin -type f | xargs annocheck --ignore-unknown --verbose --skip-all --test-stack-prot | grep ldconfig
Hardened: /usr/sbin/ldconfig: PASS: stack-prot test
Hardened: /usr/sbin/ldconfig: info: set binary producer to <unknown>.
Hardened: /usr/sbin/ldconfig: info: set binary producer to <unknown>.
Hardened: /usr/sbin/ldconfig: info: set binary producer to Gas version 2.
Hardened: /usr/sbin/ldconfig: info: notes produced by assembler plugin version 1
Hardened: /usr/sbin/ldconfig: info: set binary producer to Gimple version 9.
Hardened: /usr/sbin/ldconfig: info: notes produced by lto plugin version 9.83
Hardened: /usr/sbin/ldconfig: FAIL: stack-prot test because stack protection deliberately disabled (addr range: 0x98e4..0xd43dc)
Hardened: /usr/sbin/ldconfig: FAIL: stack-prot test because stack protection deliberately disabled (addr range: 0x12fa0..0xd4584)
...

This is because annocheck still keeps all previous debuginfos open, despite being done with a given binary:

annocheck: Version 9.83.
Hardened: /usr/sbin/request-key: info: set binary producer to Gimple version 11.
Hardened: /usr/sbin/request-key: PASS: stack-prot test
openat(AT_FDCWD, "/usr/lib/debug/.build-id/5c/3a725d1c79862f5e154e68f25c093351f5412a.debug", O_RDONLY) = 6
Hardened: /usr/sbin/request-key: info: set binary producer to GCC version 11.
Hardened: /usr/sbin/request-key: info: set binary producer to Gas version 2.
Hardened: /usr/sbin/request-key: info: notes produced by assembler plugin openat(AT_FDCWD, "/usr/lib/debug/.build-id/5c/3a725d1c79862f5e154e68f25c093351f5412a.debug", O_RDONLY) = 7
version 1
Hardened: /usr/sbin/request-key: info: set binary producer to Gimple version 9.
Hardened: /usr/sbin/request-key: info: notes produced by lto plugin openat(AT_FDCWD, "/usr/lib/debug/.build-id/5c/3a725d1c79862f5e154e68f25c093351f5412a.debug", O_RDONLY) = 8
openat(AT_FDCWD, "/usr/lib/debug/.build-id/5c/3a725d1c79862f5e154e68f25c093351f5412a.debug", O_RDONLY) = 9
openat(AT_FDCWD, "/usr/sbin/key.dns_resolver", O_RDONLY) = 3
openat(AT_FDCWD, "/usr/sbin/key.dns_resolver", O_RDONLY) = 3
openat(AT_FDCWD, "/usr/lib/debug/.build-id/8d/50e947bfcbf1d172c741aa32a8f5abbf283af6.debug", O_RDONLY) = 10
openat(AT_FDCWD, "/usr/lib/debug/.build-id/0d/591fbd834c9c3fefd6aa4e66ed25073c4274a1.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/debug/usr/sbin/../../.dwz/keyutils-1.6.1-4.el9.x86_64", O_RDONLY) = 11
version 9.83
Hardened: /usr/sbin/key.dns_resolver: info: set binary producer to Gimple version 11.
Hardened: /usr/sbin/key.dns_resolver: PASS: stack-prot test
openat(AT_FDCWD, "/usr/lib/debug/.build-id/8d/50e947bfcbf1d172c741aa32a8f5abbf283af6.debug", O_RDONLY) = 12
Hardened: /usr/sbin/key.dns_resolver: info: set binary producer to GCC version 11.
Hardened: /usr/sbin/key.dns_resolver: info: set binary producer to Gas version 2.
Hardened: /usr/sbin/key.dns_resolver: info: notes produced by assembler plugin openat(AT_FDCWD, "/usr/lib/debug/.build-id/8d/50e947bfcbf1d172c741aa32a8f5abbf283af6.debug", O_RDONLY) = 13
version 1
Hardened: /usr/sbin/key.dns_resolver: info: set binary producer to Gimple version 9.
Hardened: /usr/sbin/key.dns_resolver: info: notes produced by lto plugin openat(AT_FDCWD, "/usr/lib/debug/.build-id/8d/50e947bfcbf1d172c741aa32a8f5abbf283af6.debug", O_RDONLY) = 14
openat(AT_FDCWD, "/usr/lib/debug/.build-id/8d/50e947bfcbf1d172c741aa32a8f5abbf283af6.debug", O_RDONLY) = 15
openat(AT_FDCWD, "/usr/sbin/efibootmgr", O_RDONLY) = 3
openat(AT_FDCWD, "/usr/sbin/efibootmgr", O_RDONLY) = 3
openat(AT_FDCWD, "/usr/lib/debug/.build-id/bd/ab63a34cdbe0e0fbd236a7dd7ef7101ab9282f.debug", O_RDONLY) = 16
openat(AT_FDCWD, "/usr/lib/debug/.build-id/49/29d60c452f3f2117e88c6146cedcc2019033f9.debug", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/usr/lib/debug/usr/sbin/../../.dwz/efibootmgr-16-12.el9.x86_64", O_RDONLY) = 17
version 9.83

and then it starts running out:

version 9.83
Hardened: /usr/sbin/ldattach: info: set binary producer to Gimple version 11.
Hardened: /usr/sbin/ldattach: PASS: stack-prot test
openat(AT_FDCWD, "/usr/lib/debug/.build-id/92/45c30c64650adb2393f4c0a84ae5c74cc727aa.debug", O_RDONLY) = 1020
Hardened: /usr/sbin/ldattach: info: set binary producer to GCC version 11.
Hardened: /usr/sbin/ldattach: info: set binary producer to Gas version 2.
Hardened: /usr/sbin/ldattach: info: notes produced by assembler plugin openat(AT_FDCWD, "/usr/lib/debug/.build-id/92/45c30c64650adb2393f4c0a84ae5c74cc727aa.debug", O_RDONLY) = 1021
version 1
Hardened: /usr/sbin/ldattach: info: set binary producer to Gimple version 9.
Hardened: /usr/sbin/ldattach: info: notes produced by lto plugin openat(AT_FDCWD, "/usr/lib/debug/.build-id/92/45c30c64650adb2393f4c0a84ae5c74cc727aa.debug", O_RDONLY) = 1022
openat(AT_FDCWD, "/usr/lib/debug/.build-id/92/45c30c64650adb2393f4c0a84ae5c74cc727aa.debug", O_RDONLY) = 1023
openat(AT_FDCWD, "/usr/sbin/hwclock", O_RDONLY) = 3
openat(AT_FDCWD, "/usr/sbin/hwclock", O_RDONLY) = 3
openat(AT_FDCWD, "/usr/lib/debug/.build-id/a2/b7797dbe75d273990d185311905d03e97d4436.debug", O_RDONLY) = -1 EMFILE (Too many open files)
openat(AT_FDCWD, "./hwclock-2.37.1-2.el9.x86_64.debug", O_RDONLY) = -1 EMFILE (Too many open files)
openat(AT_FDCWD, "./.debug/hwclock-2.37.1-2.el9.x86_64.debug", O_RDONLY) = -1 EMFILE (Too many open files)
openat(AT_FDCWD, "/usr/sbin/hwclock-2.37.1-2.el9.x86_64.debug", O_RDONLY) = -1 EMFILE (Too many open files)
openat(AT_FDCWD, "/usr/sbin/.debug/hwclock-2.37.1-2.el9.x86_64.debug", O_RDONLY) = -1 EMFILE (Too many open files)
openat(AT_FDCWD, "/usr/lib/debug/hwclock-2.37.1-2.el9.x86_64.debug", O_RDONLY) = -1 EMFILE (Too many open files)

(In reply to Jiri Jaburek from comment #39)
> This is because annocheck still keeps all previous debuginfos open, despite
> being done with a given binary:
>
> annocheck: Version 9.83.

Ah - but the fix for closing debuginfo files went into annocheck 9.88, so using version 9.83 will not work...

Cheers
Nick

Should be fixed in annobin-10.06-1.el9

Right, please try the latest build: annobin-10.09-1.el9

pre-verified: annobin-10.10-1.el9

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (new packages: annobin), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2022:2342
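
Note on the descriptor exhaustion reported in this thread: the strace output shows descriptor numbers climbing to 1023 before openat() starts returning EMFILE, which is consistent with a per-process soft limit of 1024 open files. The following is a minimal Python sketch of that failure mode only. It is not annocheck's code; the helper names (open_until_exhausted, open_and_close, demo), the reduced limit of 256, and the use of os.devnull as the file to open are all assumptions made for illustration.

```python
import errno
import os
import resource


def open_until_exhausted(path, attempts):
    """Open `path` repeatedly WITHOUT closing (hypothetical leaky loop).

    Returns (held_descriptors, errno_of_first_failure_or_None).
    """
    held = []
    for _ in range(attempts):
        try:
            held.append(os.open(path, os.O_RDONLY))
        except OSError as exc:
            return held, exc.errno
    return held, None


def open_and_close(path, attempts):
    """Open `path` repeatedly, closing each descriptor before the next open."""
    completed = 0
    for _ in range(attempts):
        fd = os.open(path, os.O_RDONLY)
        os.close(fd)
        completed += 1
    return completed


def demo(limit=256, attempts=400, path=os.devnull):
    """Compare the leaky and the closing loop under a lowered soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Lower only the soft limit (never raise it) so the effect shows quickly.
    new_soft = limit if soft == resource.RLIM_INFINITY else min(limit, soft)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    try:
        held, err = open_until_exhausted(path, attempts)
        leaked = len(held)
        for fd in held:
            os.close(fd)  # release descriptors before the second experiment
        completed = open_and_close(path, attempts)
    finally:
        # Restore the original limits whatever happened above.
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
    return leaked, err, completed


if __name__ == "__main__":
    leaked, err, completed = demo()
    print("leaky loop stopped after", leaked, "opens:", errno.errorcode.get(err))
    print("closing loop completed", completed, "opens")
```

Under these assumptions the leaky loop should stop with EMFILE before reaching the requested number of opens, while the closing loop completes all of them. Raising the limit only postpones the failure, which fits the earlier observation that the problem was not triggered when enough file descriptors were available.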