Bug 1807176 - Regression in code complation between clang 9 and clang 10
Summary: Regression in code complation between clang 9 and clang 10
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: clang
Version: rawhide
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: ---
Assignee: serge_sans_paille
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2020-02-25 18:14 UTC by Omair Majid
Modified: 2020-04-21 14:22 UTC (History)
6 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-20 11:13:48 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
Pre-processed code with the bug (403.39 KB, text/plain)
2020-02-25 18:20 UTC, Omair Majid
no flags Details
Dockerfile to reproduce the issue (601 bytes, text/plain)
2020-02-26 00:18 UTC, Omair Majid
no flags Details

Description Omair Majid 2020-02-25 18:14:44 UTC
Description of problem:

I am not sure if this is even a bug. It could just be an undocumented behaviour change that brings clang closer to gcc.

CoreCLR, a part of .NET Core, uses clang as a compiler. On Fedora 33 (rawhide) it fails to build. The error message is:

In file included from /home/omajid/coreclr/src/pal/src/misc/tracepointprovider.cpp:19:
In file included from /home/omajid/coreclr/src/pal/src/include/pal/palinternal.h:620:
In file included from /usr/bin/../lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/stdlib.h:30:
/usr/include/stdlib.h:112:36: error: exception specification in declaration does not match previous declaration
__extension__ extern long long int atoll (const char *__nptr)
                                   ^
/home/omajid/coreclr/src/pal/inc/pal.h:4227:33: note: previous declaration is here
PALIMPORT long long int __cdecl atoll(const char *) THROW_DECL;
                                ^
In file included from /home/omajid/coreclr/src/pal/src/misc/tracepointprovider.cpp:19:
In file included from /home/omajid/coreclr/src/pal/src/include/pal/palinternal.h:621:
/usr/include/string.h:43:14: error: exception specification in declaration does not match previous declaration
extern void *memcpy (void *__restrict __dest, const void *__restrict __src,
             ^
/home/omajid/coreclr/src/pal/inc/pal.h:4214:26: note: previous declaration is here
PALIMPORT void * __cdecl memcpy(void *, const void *, size_t) THROW_DECL;
                         ^

For some background, CoreCLR re-defines built-in functions, like atoll and memcpy. Their declarations look like this:

#ifndef THROW_DECL
#if defined(_MSC_VER) || defined(__llvm__) || !defined(__cplusplus)
#define THROW_DECL
#else
#define THROW_DECL throw()
#endif // !_MSC_VER
#endif // !THROW_DECL

PALIMPORT void * __cdecl memcpy(void *, const void *, size_t) THROW_DECL;
PALIMPORT long long int __cdecl atoll(const char *) THROW_DECL;

The complete code is here: https://github.com/dotnet/coreclr/blob/c5d3d752260383fbed72ba2b4d86d82fea673c76/src/pal/inc/pal.h#L4224

This code compiles fine on Fedora 31, but fails on Fedora 33. Changing the THROW_DECL definition to be something like this makes things work:

#ifndef THROW_DECL
#if defined(_MSC_VER) || !defined(__cplusplus)
#define THROW_DECL
#else
#define THROW_DECL throw()
#endif // !_MSC_VER
#endif // !THROW_DECL

It seems like clang 10 now understands that builtin-memcpy and builtin-atoll both have a `throw()` exception specification and expect to see our redefined versions to have the same specification. gcc has apparently always seen them with a `throw()` specification, as seen by the change to this same file needed for CoreCLR to support GCC here: https://github.com/dotnet/coreclr/pull/22129/files#diff-b836af8f59cbb2bd02de8224c57d0002


1. Is this a bug or intentional behaviour change in clang?
   - If this is a bug, any ETA on a fix? Or a workaround I can use?
   - If this is an intentional change, can this be documented somehwere so I can point upstream CoreCLR folks at this?
   - If this is an intentional change, what would be a good way to feature-test whether the current version of clang has this ability so we can define THROW_DECL correctly?


Version-Release number of selected component (if applicable):

clang-10.0.0-0.3.rc2.fc33.x86_64
llvm-10.0.0-0.1.rc2.fc33.x86_64

How reproducible:
100%

Steps to Reproduce:
1. git clone https://github.com/dotnet/coreclr
2. cd coreclr
3. git checout release/3.1
4. VERBOSE=1 ./build.sh

Actual results:
Build fails

Expected results:
Build works? Or a fix for the build?

Additional info:

Comment 1 Omair Majid 2020-02-25 18:20:22 UTC
Created attachment 1665714 [details]
Pre-processed code with the bug

This is the pre-processed cpp file that causes clang to complain about the error above.

Comment 2 serge_sans_paille 2020-02-25 21:36:06 UTC
I've just tested on an f28 with clang 6.0.1: I also have the error

Tested on an f33 with clang 10.0.0 : same error

Tested on a rhel7.5 with clang 6.0.1 : same error again.

Until now, I fail to reproduce the original « it works » setting.

I'd say that `defined(__llvm__)` should be pruned, But I'd be happier if I could reproduce the original « it works » setting.

Comment 3 serge_sans_paille 2020-02-25 21:53:59 UTC
You can point them at the following snippet to advocate for throw() for clang/llvm : https://godbolt.org/z/iKW3Vh

Comment 4 Omair Majid 2020-02-26 00:18:21 UTC
Created attachment 1665777 [details]
Dockerfile to reproduce the issue

I am attaching a Dockerfile that I can run locally to see the "working" vs "failing" builds.

You just need to change the first line of the Dockerfile to see the difference.

It works (at least the tracepointprovider.cpp file is compiled successfully) with:

FROM fedora:31

But errors out with:

FROM fedora:33

I built/run it using:

$ podman build -t deleteme .
$ podman run -it deleteme

Comment 5 serge_sans_paille 2020-03-02 14:33:23 UTC
Some more information:

with clang++-9, the following code needs `-fms-extensions` to compile without warning:

```
long long int atoll(const char *) ;
__extension__ extern long long int atoll (const char *__nptr)
     throw () __attribute__ ((__pure__)) __attribute__ ((__nonnull__ (1))) ;
```

with clang++10, it now needs (due to commit 39aa8954a4846b317d3da2f0addfce8224b438de upstream) `-fms-compatibility`.

So basically there has been a change in handling of exception spec mismatch between clang9 and clang 10, which explains your issue.
The right fix is of course to remove `defined(__llvm__)` from the pal header, because tehre's been a mismatch for a long time, it was just ignored due to the compiler flag combination used by the project.

Comment 6 serge_sans_paille 2020-04-20 11:13:48 UTC
Patch is being handled upstream on dotnet side by Omair, not a clang/llvm issue anymore, closing it.

Comment 7 Omair Majid 2020-04-21 14:22:36 UTC
For the record, this was the fix merged into dotnet: https://github.com/dotnet/runtime/pull/32837

The fix merged is exactly what was suggested above:

> The right fix is of course to remove `defined(__llvm__)` from the pal header, because there's been a mismatch for a long time, it was just ignored due to the compiler flag combination used by the project.


Note You need to log in before you can comment on or make changes to this bug.