Bug 1967482

Summary: DTRACE_PROBE4() compilation failure with Clang on ppc64le
Product: Red Hat Enterprise Linux 8
Reporter: Thomas Huth <thuth>
Component: llvm
Assignee: Timm Bäder <tbaeder>
Status: CLOSED ERRATA
QA Contact: Miloš Prchlík <mprchlik>
Severity: high
Docs Contact:
Priority: medium
Version: 8.5
CC: fche, lberk, mcermak, mjw, mnewsome, mprchlik, sguelton, tschelle, tstellar
Target Milestone: beta
Keywords: Bugfix, Triaged
Target Release: ---
Flags: pm-rhel: mirror+
Hardware: ppc64le
OS: Linux
Whiteboard:
Fixed In Version: llvm-12.0.1-1.module+el8.5.0+11871+08d0eab5
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
: 1977503
Environment:
Last Closed: 2021-11-09 18:34:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1940132, 1977503    

Description Thomas Huth 2021-06-03 08:13:47 UTC
Description of problem:
Compiling a program that uses DTRACE_PROBE4() with Clang fails on a ppc64le host. The same program compiles fine with gcc on ppc64le and with Clang on an x86 host.

Version-Release number of selected component (if applicable):
systemtap-sdt-devel-4.4-10.el8.ppc64le
clang-11.0.0-1.module+el8.4.0+8598+a071fcd5.ppc64le

How reproducible:
100%

Steps to Reproduce:
1. cat > /tmp/probetest.c << EOF

#include <sys/sdt.h>

void func(int a, int b, int c, int d)
{
    DTRACE_PROBE4(my_provider, my_probe, a, b, c, d);
}

int main(int argc, char **argv)
{
    func(0, 1, 2, 3);
    return 0;
}
EOF

2. clang -c /tmp/probetest.c

Actual results:

/tmp/probetest.c:5:5: error: invalid operand in inline asm: '990: nop
.pushsection .note.stapsdt,"?","note"
.balign 4
.4byte 992f-991f,994f-993f,3
991: .asciz "stapsdt"
992: .balign 4
993: .8byte 990b
.8byte _.stapsdt.base
.8byte 0
.asciz "my_provider"
.asciz "my_probe"
.asciz "${0:n}@${1:I}$1 ${2:n}@${3:I}$3 ${4:n}@${5:I}$5 ${6:n}@${7:I}$7"
994: .balign 4
.popsection'
    DTRACE_PROBE4(my_provider, my_probe, a, b, c, d);
    ^
/usr/include/sys/sdt.h:437:3: note: expanded from macro 'DTRACE_PROBE4'
  STAP_PROBE4(provider,probe,parm1,parm2,parm3,parm4)
  ^
/usr/include/sys/sdt.h:329:3: note: expanded from macro 'STAP_PROBE4'
  _SDT_PROBE(provider, name, 4, (arg1, arg2, arg3, arg4))
  ^
/usr/include/sys/sdt.h:77:27: note: expanded from macro '_SDT_PROBE'
    __asm__ __volatile__ (_SDT_ASM_BODY(provider, name, _SDT_ASM_ARGS, (n)) \
                          ^
note: (skipping 1 expansions in backtrace; use -fmacro-backtrace-limit=0 to see all)
/usr/include/sys/sdt.h:82:26: note: expanded from macro '_SDT_ASM_1'
# define _SDT_ASM_1(x)                  _SDT_S(x) "\n"
                                        ^
/usr/include/sys/sdt.h:81:22: note: expanded from macro '_SDT_S'
# define _SDT_S(x)                      #x
                                        ^
<scratch space>:2:1: note: expanded from here
"990: nop"
^

Expected results:
The example program should compile successfully.

Additional info:
The same problem exists in the current version for RHEL 9, so please clone this BZ to RHEL 9 if necessary.

Comment 1 Frank Ch. Eigler 2021-06-03 16:22:55 UTC
"invalid operand in inline asm" errors tend to be consequences of compiler optimizations fighting with the STAP_SDT_ARG_CONSTRAINT parameter, which governs the inline-assembler operands.  These tend to be bugs neither in llvm, nor in the instrumented application, nor in systemtap, but just an unlucky integration difficulty.

See /usr/include/sys/sdt.h for a brief blurb on the phenomenon.  Consider #define'ing a different STAP_SDT_ARG_CONSTRAINT.
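
As a rough sketch of that suggestion, an application could set the constraint before including the header. The value nr is the one reported to work in comment 2 of this bug. The unquoted spelling assumes the header stringifies the macro itself, and the no-op fallback and the traced_sum() name are hypothetical additions so that the fragment still builds where systemtap-sdt-devel is not installed.

```c
/* Must come before <sys/sdt.h> is included. Written without quotes on the
 * assumption that the header turns the macro into a string itself. */
#define STAP_SDT_ARG_CONSTRAINT nr

#if defined(__has_include)
# if __has_include(<sys/sdt.h>)
#  include <sys/sdt.h>
# endif
#endif

#ifndef DTRACE_PROBE4
/* Hypothetical fallback: no probe is emitted when the header is missing. */
# define DTRACE_PROBE4(provider, probe, a, b, c, d) ((void)0)
#endif

/* Hypothetical helper: fires the probe, then returns the sum of its inputs. */
int traced_sum(int a, int b, int c, int d)
{
    DTRACE_PROBE4(my_provider, my_probe, a, b, c, d);
    return a + b + c + d;
}
```

Keeping the #define in one translation unit limits the override to the probes that need it, so the rest of the program keeps the header's default.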

Comment 2 Thomas Huth 2021-06-04 08:20:34 UTC
It does indeed seem to work if I set STAP_SDT_ARG_CONSTRAINT to either "nr" or "r" instead of "nor", so it seems that Clang does not like the "o" constraint on ppc64.

Now I don't think it makes sense to clutter userspace programs with STAP_SDT_ARG_CONSTRAINT settings that depend on the architecture and compiler in use.
So how could we proceed here in a more generic way? Should this BZ get reassigned to the clang component, so they could fix the problem with the "o" constraint? Or would it make more sense to include a "#if defined(__clang__) && defined(__ppc64__)" section in sys/sdt.h to define a different default value for STAP_SDT_ARG_CONSTRAINT in that case?
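
The second option could look roughly like the fragment below. This is only a sketch, not the actual contents of sys/sdt.h: it tests __powerpc__ (the macro mentioned in comment 4) rather than __ppc64__, uses the nZr and nor defaults named in this thread, and the SKETCH_STR macros and selected_constraint() helper are hypothetical names added purely to make the result observable.

```c
/* Hypothetical selection logic: a user-supplied value always wins; otherwise
 * Clang on POWER gets "nr", other POWER compilers "nZr", everything else "nor". */
#ifndef STAP_SDT_ARG_CONSTRAINT
# if defined(__clang__) && defined(__powerpc__)
#  define STAP_SDT_ARG_CONSTRAINT nr
# elif defined(__powerpc__)
#  define STAP_SDT_ARG_CONSTRAINT nZr
# else
#  define STAP_SDT_ARG_CONSTRAINT nor
# endif
#endif

/* Two-level stringification so the macro is expanded before being quoted. */
#define SKETCH_STR_(x) #x
#define SKETCH_STR(x)  SKETCH_STR_(x)

/* Returns the constraint string chosen by the preprocessor logic above. */
const char *selected_constraint(void)
{
    return SKETCH_STR(STAP_SDT_ARG_CONSTRAINT);
}
```

Because the whole block sits inside #ifndef, a program that already defines STAP_SDT_ARG_CONSTRAINT is unaffected by the compiler-specific default.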

Comment 4 Frank Ch. Eigler 2021-06-04 11:09:03 UTC
(In reply to Thomas Huth from comment #2)
> It seems to work indeed if I set STAP_SDT_ARG_CONSTRAINT to either "nr" or
> "r" instead of "nor", so seems like Clang does not like the "o" constraint
> on ppc64.

Does "doesn't like it" mean that Clang doesn't support the constraint at all, or that this particular point in the code with this particular constraint generates an error?  Note that on __powerpc__, the sys/sdt.h file uses nZr as the default (around line 100).

> Now I don't think it makes sense to clutter userspace programs with
> STAP_SDT_ARG_CONSTRAINT depending on the architecture and compiler that is
> getting used.

The problem is that the constraints represent a tradeoff that systemtap is not well positioned to impose.  An "r" constraint forces the compiler to load all the parameters into registers, even if they didn't otherwise need to be there.  That increases register pressure, which could hurt performance in a tight loop.  For a given region of code, there might simply not be enough free registers, and we get an error anyway.

This parameter gives developers some power to choose, which is occasionally a necessity.
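
The tradeoff can be illustrated with two empty asm statements that differ only in their input constraint. This is a generic sketch rather than code from sys/sdt.h: the function names are hypothetical, and the plain "m" alternative stands in for the memory-style alternatives ("o", "Z") discussed in this bug.

```c
/* "r": the value must sit in a general-purpose register at this point,
 * even if the compiler would otherwise have left it in memory. */
int operand_in_register(int value)
{
    __asm__ __volatile__("" : : "r" (value));
    return value;
}

/* "rm": the compiler is free to use a register or a memory location,
 * whichever is cheaper for the surrounding code. */
int operand_register_or_memory(int value)
{
    __asm__ __volatile__("" : : "rm" (value));
    return value;
}
```

Both functions return their argument unchanged; only the freedom given to the register allocator differs, which is the choice that STAP_SDT_ARG_CONSTRAINT leaves to the developer.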

Comment 5 Thomas Huth 2021-06-04 11:17:44 UTC
(In reply to Frank Ch. Eigler from comment #4)
> Doesn't like it as in doesn't support it at all, or that particular point of
> code with that particular constraint generates an error?  Note that on
> __powerpc__, the sys/sdt.h file uses nZr as the default (line 100ish).
[...]
> This parameter gives developers some power to choose, and the occasional
> necessity.

The code works perfectly fine when using gcc on that ppc64le machine. So I think the constraint should normally not be an issue here. It's just that Clang does not like it at all...

Comment 6 Frank Ch. Eigler 2021-06-04 16:15:48 UTC
If "nZr" is not acceptable to powerpc clang, that would seem to be an outright gcc incompatibility that maybe they should fix.

Comment 7 Tom Stellard 2021-06-04 18:20:52 UTC
Does someone have a reduced test case with the failing inline asm statement?

Comment 8 Thomas Huth 2021-06-05 06:28:26 UTC
(In reply to Tom Stellard from comment #7)
> Does someone have a reduced test case with the failing inline asm statement?

If I preprocess the above example and strip down the result to the bare minimum, I end up with:

void func(int myarg)
{
    asm volatile(" .asciz \"%n[S1]@%I[A1]%[A1]\" "
                 :: [S1] "n" (1), [A1] "nZr" ((myarg)));
}

int main(int argc, char **argv)
{
    func(0);
    return 0;
}

... which still compiles fine with gcc on ppc64le and still produces the "invalid operand in inline asm" error with clang.

Comment 11 Miloš Prchlík 2021-08-16 08:02:40 UTC
Bumping to ITM25 to gain more space for dealing with some issues that appeared in llvm-toolset gating.

Comment 14 Miloš Prchlík 2021-08-23 06:52:58 UTC
Verified with llvm-toolset-rhel8-8050020210806062652.b4937e53, llvm-toolset-12.0.1-1.module+el8.5.0+11871+08d0eab5.

Comment 17 errata-xmlrpc 2021-11-09 18:34:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (llvm-toolset:rhel8 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:4233