Created attachment 920180
test case exposing problem
Description of problem:
GCC generates code that reuses the same stack slot for two variables that live in different scopes. Unfortunately, a store to the first, already out-of-scope variable is emitted after the second variable has been initialized, which corrupts the second variable's memory.
I attach a .cpp file reproducing the issue; the idea is as follows:
{
create corruptingVar
}
{
create corruptedVar // by default initialized with 0s; tmp62 in the assembly
check correct initialization;
}
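For readers without access to the attachment, the outline above might look roughly like the following C++. This is only a sketch of the pattern, not the attached reproducer: TripleArray is taken from the symbol name in the assembly, while Array's layout, the helper all_zero and the function second_scope_sees_zeros are assumptions made for illustration. It is not expected to trigger the bug by itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for the polymorphic type named in the assembly.
// Constructing it stores a vtable pointer at offset 0 of the object.
struct TripleArray {
    long data[3];
    TripleArray() { std::memset(data, 0, sizeof data); }
    virtual ~TripleArray() {}
};

// Hypothetical plain type whose constructor zero-fills its storage.
struct Array {
    long data[8];
    Array() { std::memset(data, 0, sizeof data); }
};

// Returns true when all n elements are zero.
static bool all_zero(const long* p, std::size_t n) {
    assert(p != 0);
    for (std::size_t i = 0; i < n; ++i) {
        if (p[i] != 0) return false;
    }
    return true;
}

// Two sibling scopes whose locals may legally share one stack slot.
bool second_scope_sees_zeros() {
    bool ok = false;
    {
        TripleArray corruptingVar;  // first scope: object with a vtable pointer
        (void)corruptingVar;
    }
    {
        Array corruptedVar;         // second scope: expected to be all zeros
        ok = all_zero(corruptedVar.data, 8);
    }
    return ok;
}
```

With a correct compiler second_scope_sees_zeros() returns true; the report describes the case where the equivalent check fails and the process aborts.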
Let's look at the generated assembly:
pxor %xmm0, %xmm0 # tmp62
movdqa %xmm0, (%rsp) # tmp62,
movq $_ZTV11TripleArray+16, (%rsp) #, corruptingVar.D.3281._vptr.TripleArray
movdqa %xmm0, 16(%rsp) # tmp62,
movdqa %xmm0, 32(%rsp) # tmp62,
movdqa %xmm0, 48(%rsp) # tmp62,
cmpq $0, (%rsp) #, corruptedVar.data
jne .L28 #,
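Apparently, in the middle of the zero-initialization from tmp62 (at the top of the stack), the vtable pointer of the already out-of-scope variable is stored to (%rsp). This overwrites the zeros just written there, so the following cmpq on (%rsp) sees a non-zero value.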
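The effect of that store order can be modelled in plain C++ by replaying the stores on a byte buffer. This is a simplified model, not compiler output: StackModel, the helper names and the placeholder vptr value are assumptions, and the "expected" variant simply shows the order that the source-level scopes would imply.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Model of the 64 bytes of stack starting at (%rsp).
struct StackModel {
    unsigned char bytes[64];
};

// Models "movdqa %xmm0, off(%rsp)" with %xmm0 == 0: writes 16 zero bytes.
static void store_zero16(StackModel& s, std::size_t off) {
    assert(off + 16 <= sizeof s.bytes);
    std::memset(s.bytes + off, 0, 16);
}

// Models "movq $imm, off(%rsp)": writes an 8-byte value.
static void store_q(StackModel& s, std::size_t off, std::uint64_t v) {
    assert(off + sizeof v <= sizeof s.bytes);
    std::memcpy(s.bytes + off, &v, sizeof v);
}

// Models the 8-byte load performed by "cmpq $0, (%rsp)".
static std::uint64_t load_q(const StackModel& s, std::size_t off) {
    assert(off + sizeof(std::uint64_t) <= sizeof s.bytes);
    std::uint64_t v;
    std::memcpy(&v, s.bytes + off, sizeof v);
    return v;
}

// Replays the stores in the order shown in the listing.
std::uint64_t replay_emitted_order(std::uint64_t vptr) {
    StackModel s;
    std::memset(s.bytes, 0xAA, sizeof s.bytes);  // arbitrary earlier contents
    store_zero16(s, 0);    // first 16 bytes of corruptedVar
    store_q(s, 0, vptr);   // late vtable store for the out-of-scope variable
    store_zero16(s, 16);
    store_zero16(s, 32);
    store_zero16(s, 48);
    return load_q(s, 0);   // value that is compared against 0
}

// Replays the same stores with the vtable store first, as the scopes imply.
std::uint64_t replay_expected_order(std::uint64_t vptr) {
    StackModel s;
    std::memset(s.bytes, 0xAA, sizeof s.bytes);
    store_q(s, 0, vptr);
    store_zero16(s, 0);
    store_zero16(s, 16);
    store_zero16(s, 32);
    store_zero16(s, 48);
    return load_q(s, 0);
}
```

In this model the emitted order leaves the placeholder vptr value at offset 0, while the expected order leaves zero there, which matches the failing comparison described in the report.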
Version-Release number of selected component (if applicable):
g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-4)
How reproducible:
I attach cpp code + makefile reproducing the issue. It is probably not minimal, but I believe it is short enough.
I also attach assembly as generated on my machine.
Steps to Reproduce:
1. make
2. ./test
Actual results:
process aborts
Expected results:
no error
Additional info:
The compiler memory barrier after ccBeingCorrupted was added to make the reproduction test case smaller. The same goes for the noinline attribute.
I'm compiling with -fno-strict-aliasing because the original code requires it. The reproduction test doesn't do anything fishy, but once -fno-strict-aliasing is removed the problem no longer reproduces with the provided test case.
As far as I understand, -O3 -fno-strict-aliasing is a legitimate, fully supported combination. Please correct me if I'm wrong.
The problem reproduces on g++-4.1 (GCC) 4.1.2 20080704 (Red Hat 4.1.2-52)
The test case doesn't reproduce the problem on g++-4.9.0, though I haven't tried hard to reproduce it there.
On the 4.4 branch this started with http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=148601 - part of the PR40389 fix (4.5 and newer are fine). Unfortunately, there doesn't seem to be anything that we could backport.
What happens here is that the code sinking pass moves the statement that sets up the vtable pointer:
+Sinking # t_8 = VDEF <t_7>
+t._vptr.T = &_ZTV1T[2];
+ from bb 2 to bb 4
int main() ()
{
int pretmp.74;
@@ -444,7 +98,6 @@
<bb 2>:
t = empty (); [return slot optimization]
- t._vptr.T = &_ZTV1T[2];
<bb 3>:
# i_15 = PHI <i_4(7), 0(2)>
@@ -459,6 +112,7 @@
goto <bb 3>;
<bb 4>:
+ t._vptr.T = &_ZTV1T[2];
__asm__ __volatile__("" : "memory");
D.2196_1 = a.data[0];
if (D.2196_1 != 0)
The workaround is either to compile with -fno-tree-sink, or to add a compiler memory barrier
asm volatile ("" ::: "memory"); immediately before the Array aa; line in the original reproducer.
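As a rough illustration of the second workaround, the barrier placement might look like the sketch below. Only Array aa and the barrier itself come from this report; the type definitions, the surrounding scopes and the function names are assumptions, since the original reproducer is only available as an attachment.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical polymorphic type; its constructor stores a vtable pointer.
struct TripleArray {
    long data[3];
    TripleArray() { std::memset(data, 0, sizeof data); }
    virtual ~TripleArray() {}
};

// Hypothetical plain type whose constructor zero-fills its storage.
struct Array {
    long data[8];
    Array() { std::memset(data, 0, sizeof data); }
};

bool check_with_barrier() {
    bool ok = false;
    {
        TripleArray corruptingVar;  // first scope
        (void)corruptingVar;
    }
    {
        // Compiler-level memory barrier (GCC extended asm, same effect as
        // asm volatile ("" ::: "memory")). It emits no instructions; it is
        // placed here so the compiler does not move earlier memory stores
        // past the initialization of aa.
        __asm__ __volatile__("" ::: "memory");
        Array aa;
        ok = (aa.data[0] == 0);
    }
    return ok;
}

// Usage: aborts if the zero check fails.
void run_workaround_check() {
    assert(check_with_barrier());
}
```

The first workaround needs no source change: adding -fno-tree-sink to the existing -O3 -fno-strict-aliasing flags in the makefile disables the sinking pass for the whole translation unit, at the cost of losing that optimization everywhere.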
(In reply to Jakub Zytka from comment #9)
> Thank you very much. I suppose -fno-tree-sink will do.
> We'll check it for performance, but I guess the impact will be negligible.
Thanks for your understanding, Jakub. We do strive to provide fixes in Red Hat Enterprise Linux gcc wherever possible, but sometimes, as in this case where a patch would be relatively invasive to the code base, we need to favour overall stability.
I'm closing this bug out for now, but if you have a follow-up query, please feel free to re-open this bug or open a new one and we'll take a look.