Bug 2273618 - Optimizing with -O2 causes wrong results on s390x
Summary: Optimizing with -O2 causes wrong results on s390x
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: gcc
Version: 40
Hardware: s390x
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Jakub Jelinek
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ZedoraTracker
Reported: 2024-04-05 11:00 UTC by Jonas Ådahl
Modified: 2024-04-12 13:45 UTC

Fixed In Version: gcc-14.0.1-0.14.fc41
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-04-12 13:45:04 UTC
Type: ---
Embargoed:


Attachments
Reproducer (1.66 KB, text/x-csrc), 2024-04-05 11:00 UTC, Jonas Ådahl


Links
System: GNU Compiler Collection
ID: 114605
Private: 0
Priority: P1
Status: UNCONFIRMED
Summary: [14 Regression] wrong code with -march=z13 -O0 since r14-5831
Last Updated: 2024-04-05 13:23:55 UTC

Description Jonas Ådahl 2024-04-05 11:00:05 UTC
While investigating faulty rendering in GNOME Shell on s390x, I discovered that compiling mutter with -O0 made the issue go away.

I eventually narrowed it down to a function that does a memcpy from a local float array to a stack-allocated float array in a callee.

I could also work around it in three ways:

* Wrap the affected function in #pragma GCC optimize ("O0").
* Mark the float array being copied from as volatile.
* Replace the memcpy with a for loop.
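
For readers without access to the attachment, here is a minimal sketch of what the first and third workarounds might look like. It is not the mutter code: `Target`, `fill_unoptimized`, `fill_with_loop` and `holds_expected` are hypothetical names made up for illustration. The volatile variant is left out because memcpy takes a non-volatile pointer and the report does not show how that call was written.

```c
#include <string.h>

/* Hypothetical stand-in for the affected code; not taken from mutter. */
typedef struct { float *dst; } Target;

/* Workaround 1: build just this function without optimization.
   push_options/pop_options limit the pragma to this one function. */
#pragma GCC push_options
#pragma GCC optimize ("O0")
void
fill_unoptimized (Target *t)
{
  float src[4] = { 0.0f, 0.0f, 1.0f, 1.0f };
  memcpy (t->dst, src, sizeof src);
}
#pragma GCC pop_options

/* Workaround 3: replace the memcpy with an explicit element-wise loop. */
void
fill_with_loop (Target *t)
{
  float src[4] = { 0.0f, 0.0f, 1.0f, 1.0f };
  for (int i = 0; i < 4; i++)
    t->dst[i] = src[i];
}

/* Returns 1 when buf holds the expected { 0, 0, 1, 1 } pattern. */
int
holds_expected (const float *buf)
{
  return buf[0] == 0.0f && buf[1] == 0.0f
         && buf[2] == 1.0f && buf[3] == 1.0f;
}
```

Both variants only sidestep the miscompilation; they say nothing about its cause.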

With that in mind, I took the relevant code and removed as much as I could while still reproducing the issue. The memcpy alone is not enough; it needs a bit of surrounding noise to reproduce.

I am attaching a C file that reproduces the issue. When run, it exits cleanly if the issue does not reproduce. If it does reproduce, it prints:

1.000000 == 0.000000 failed
Aborted (core dumped)
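
The attachment is not quoted in this report, so the following is only a guess at how a check could produce a message of that shape. `format_failure` and `check_float_eq` are hypothetical names, and this sketch returns 0 where the reproducer evidently aborts.

```c
#include <stdio.h>

/* Hypothetical helper: formats a failed float comparison into buf and
   returns the message length, as snprintf does. */
int
format_failure (char *buf, size_t size, float lhs, float rhs)
{
  return snprintf (buf, size, "%f == %f failed", (double) lhs, (double) rhs);
}

/* Hypothetical check: returns 1 when the values match; otherwise prints
   the message to stderr and returns 0. */
int
check_float_eq (float lhs, float rhs)
{
  char msg[64];

  if (lhs == rhs)
    return 1;
  format_failure (msg, sizeof msg, lhs, rhs);
  fprintf (stderr, "%s\n", msg);
  return 0;
}
```

Calling `check_float_eq (1.0f, 0.0f)` writes a line of the same form as the failure output quoted above.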

The three discovered workarounds are included in the C file, hidden behind `#if 0`.

Reproducible: Always

Comment 1 Jonas Ådahl 2024-04-05 11:00:46 UTC
Created attachment 2025354
Reproducer

Comment 2 Dan Horák 2024-04-05 11:45:42 UTC
Jonas, could you also make the attachment public? Thanks.

Comment 3 Jonas Ådahl 2024-04-05 11:55:08 UTC
(In reply to Dan Horák from comment #2)
> Jonas, could you also make the attachment public? Thanks.

Done; sorry about that.

Comment 4 Dan Horák 2024-04-05 12:05:02 UTC
Thanks. For the record, it reproduces on z14 with gcc-14.0.1-0.13.fc41.s390x, but not with gcc-13.2.1-4.fc38.s390x.

Comment 5 Jakub Jelinek 2024-04-05 12:11:44 UTC
Simplified for -march=z13 -O0:

typedef struct { const float *a; int b, c; float *d; } S;

__attribute__((noipa)) void
bar (void)
{
}

__attribute__((noinline, optimize (2))) static void
foo (S *e)
{
  const float *f;
  float *g;
  float h[4] = { 0.0, 0.0, 1.0, 1.0 };
  if (!e->b)
    f = h;
  else
    f = e->a;
  g = &e->d[0];
  __builtin_memcpy (g, f, sizeof (float) * 4);
  bar ();
  if (!e->b)
    if (g[0] != 0.0 || g[1] != 0.0 || g[2] != 1.0 || g[3] != 1.0)
      __builtin_abort ();
}

int
main ()
{
  float d[4];
  S e = { .d = d };
  foo (&e);
  return 0;
}

Bisecting now.

Comment 6 Jakub Jelinek 2024-04-05 13:10:07 UTC
Bisected to https://gcc.gnu.org/r14-5831

