Bug 1041336
Summary: | attribute((nonnull)) is worthless, but shouldn't be | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Eric Blake <eblake> |
Component: | gcc | Assignee: | Jakub Jelinek <jakub> |
Status: | CLOSED UPSTREAM | QA Contact: | qe-baseos-tools-bugs |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 7.0 | CC: | armbru, eblake, law, mpolacek, rjones |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2013-12-13 17:36:01 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Eric Blake
2013-12-12 15:12:38 UTC
??? __attribute__((nonnull)) int * foo (int *p) { return p; } int * bar (void) { return foo ((int *) 0); } gcc -S -O2 -Wall rh1041336.c rh1041336.c: In function ‘bar’: rh1041336.c:10:3: warning: null argument where non-null required (argument 1) [-Wnonnull] return foo ((int *) 0); ^ (In reply to Jakub Jelinek from comment #2) > int * > bar (void) > { > return foo ((int *) 0); Nobody codes bugs that blatantly. Here's the real problem: $ cat foo.c __attribute__((nonnull)) int * foo (int *p) { return p; } int * bar (void) { int *oops = 0; return foo (oops); } $ gcc -S -O2 -Wall foo.c $ echo $? 0 It's trivial for a human to see that 'oops' is a NULL pointer, but gcc refuses to warn. Coverity and clang catch it, gcc should too. Where the warning would REALLY be helpful is in diagnosing cases where it is not as easy for the human to see a potential NULL deref, such as: __attribute__((nonnull)) int * foo (int *p) { return p; } int * bar (int *blah) { int *other = foo (blah); if (!blah) abort() return other; } Coverity also catches this one under its FORWARD_NULL rule - the fact that we have code later in the function bar() that is conditional on whether 'blah' is NULL or not means that all earlier uses of 'blah' must be assumed to potentially have the value NULL, but passing NULL to foo() is a violation of the nonnull attribute, so either the 'if (!blah)' block is dead code or the 'foo(blah)' block is a bug; either way, a warning is warranted. Where Coverity shines is when the provable potential for NULL is not immediately adjacent to the use of NULL; but the point remains that anywhere the compiler can prove an argument might be NULL, it should warn that the program may be violating the attribute(nonnull) on the function it is calling. I'm okay if Coverity can prove more instances of NULL than gcc, but what I'm not okay with is how easy it is to write blatant NULL derefs that gcc won't flag. Slightly tangential, but it bugs me too much to keep it to myself: gcc is able and eager to recognize and *exploit* null-ness, but at the same time fails to tell me about it. Recently covered again in LWN[*]: The second example is a null pointer dereference in the Linux kernel: struct tun_struct *tun = ...; struct sock *sk = tun->sk; if (!tun) return POLLERR; /* write to address based on tun */ Normally that code would cause a kernel oops if tun is null, but if page zero is mapped for some reason, the code is basically harmless --- as long as the test remains. Because the compiler sees the dereference operation, it can conclude that the pointer is always non-null and remove the test entirely, which turns a fairly innocuous bug into a potential kernel exploit. [*] http://lwn.net/Articles/575563/ (In reply to Markus Armbruster from comment #5) > Slightly tangential, but it bugs me too much to keep it to myself: gcc > is able and eager to recognize and *exploit* null-ness, but at the > same time fails to tell me about it. Recently covered again in > LWN[*]: > > The second example is a null pointer dereference in the Linux > kernel: > > struct tun_struct *tun = ...; > struct sock *sk = tun->sk; > if (!tun) > return POLLERR; > /* write to address based on tun */ > > Normally that code would cause a kernel oops if tun is null, but > if page zero is mapped for some reason, the code is basically > harmless --- as long as the test remains. Because the compiler > sees the dereference operation, it can conclude that the pointer > is always non-null and remove the test entirely, which turns a > fairly innocuous bug into a potential kernel exploit. > > [*] http://lwn.net/Articles/575563/ This is very hard to warn about, the way GCC optimizes this is that the VRP pass, after seeing the pointer dereference, notes in the range info for the SSA_NAME that the range doesn't include 0, and much later the test is actually optimized, but there is no easy way to store why the range was the way it was, ranges even for pointers can be influenced by tests (assertion on the test branches), arithmetics on the pointers etc., ranges can be copied between pointers and merged (union) on PHIs, etc. Warning at the point where we add the assertion that the pointer is non-NULL after pointer dereference would warn pretty much on all code, you'd want to warn only about when you actually optimize something based on it, but as I said, it is close to impossible to track the reason for why the range has been computed that way. Not to mention that in GCC 4.9, the ranges are preserved through optimization passes, so while in <= 4.8 only the VRP pass could optimize based on the range info, in 4.9+ any following pass can optimize based on it too. Jakub's responses are dead-on correct. Also note that more modern versions of GCC (4.9 and beyond) will use information gleamed from the non-null and returns-nonnull attribute to do even more optimizations. Specifically paths where the constraints implied by those attributes are violated will be isolated and traps inserted. For that specific transformation (isolation and trapping) I expect there will be optional warnings. See upstream #16351 (link in external trackers). Note that we might implement runtime checks for this, that would be a part of -fsanitize=undefined. But this has first to be done in the LLVM repo. Supposedly it be done for 4.9. Right now the warning is in the frontends. Warning later on for this has tons of issues, sometimes compiler threads jumps or performs similar optimizations where it will copy and simplify some portions of code that you could end up with foo (0) calls in the code, the compiler couldn't prove those paths are unreachable and if the warning would be late, it would warn, when there is nothing wrong with the original code, there would be no way to avoid that. Also, with some programming techniques, say using macros or templates heavily, one could end up with tons of false positives. What would actually IMHO make sense instead is for -fsanitize=undefined add code to report misuses of the nonnull attribute (and returns_nonnull attribute) at runtime. This is new thing in GCC 4.9 though, and right now the libubsan library we are using doesn't have warning code for this and the library unfortunately has quite problematic upstream in llvm, so it might be difficult to get it accepted there, but we can at least try. -fsanitize=undefined already has warnings about __builtin_unreachable () being reached and similar other problems (signed integer overflow, etc.). I'm all for giving optimizers license (I used to work on one), but when they outrun the toolchain's ability to catch programming errors, it can only end in tears. Static checks are best, of course. This is what Eric asked for. But run-time checks would be so much better than nothing. In particular when they change "optimizer detected your program violates some arcane stipulation of the standard, and turned it into security hole for you (but a really fast one, and besides, it's your own fault!)" into "optimizer detected... and turned it into something that safely crashes". |