Bug 529299

Summary: wrong code generated for memcpy on sparc64
Product: [Fedora] Fedora Reporter: Tom Lane <tgl>
Component: gccAssignee: Jakub Jelinek <jakub>
Status: CLOSED NOTABUG QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: low    
Version: rawhideCC: hhorak, jakub
Target Milestone: ---   
Target Release: ---   
Hardware: sparc64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-10-16 20:30:08 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
DbaccMain.ii.gz none

Description Tom Lane 2009-10-16 03:21:04 UTC
Created attachment 365013 [details]
DbaccMain.ii.gz

Description of problem:
In the attached .ii file, the last function (Dbacc::execREAD_PSEUDO_REQ) ends with a memcpy that is trying to copy a uint64 local variable to the first two elements of a uint32[] array within a passed struct. This fails with SIGBUS for no apparent reason.  The passed struct address is not wrong, since the function has previously fetched the same two uint32's it's now trying to overwrite.  Furthermore the crash goes away if I replace the memcpy with the following ugly equivalent:

  Uint32 * src = (Uint32*)&tmp;
  signal->theData[0] = src[0];
  signal->theData[1] = src[1];

Original compile command is

g++ -DHAVE_CONFIG_H -DNDEBUG   -I. -I../../../../../include -I../../../../../storage/ndb/src/kernel/blocks/dblqh -I. -I../../../../../include -I../../../../../storage/ndb/include -I../../../../../include -I../../../../../storage/ndb/include -I../../../../../storage/ndb/src/kernel/vm -I../../../../../storage/ndb/src/kernel/error -I../../../../../storage/ndb/src/kernel -I../../../../../storage/ndb/include/kernel -I../../../../../storage/ndb/include/transporter -I../../../../../storage/ndb/include/debugger -I../../../../../storage/ndb/include/mgmapi -I../../../../../storage/ndb/include/mgmcommon -I../../../../../storage/ndb/include/ndbapi -I../../../../../storage/ndb/include/util -I../../../../../storage/ndb/include/portlib -I../../../../../storage/ndb/include/logger      -O1 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mcpu=ultrasparc -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -fno-strict-aliasing -fwrapv -fPIC -felide-constructors -fno-rtti -fno-exceptions   -fno-implicit-templates -fno-exceptions -fno-rtti -c -o DbaccMain.o dbacc/DbaccMain.cpp

(it fails at either -O1 or -O2, have not tried -O0)

I've attached the -E output.

Version-Release number of selected component (if applicable):
gcc-4.4.1-21.fc12.sparc64

How reproducible:
100%

Steps to Reproduce:
1. .. try to build current mysql on sparc ...
2.
3.
  
Actual results:


Expected results:


Additional info:
I see the same bug using F11, so it's not new.

Comment 1 Jakub Jelinek 2009-10-16 09:23:35 UTC
I don't see any bug on the gcc side, at least not in this routine.
        ldx     [%fp+2039], %g1                   
        stx     %g1, [%i1+80]
for the memcpy is just fine, given that %fp+2039 is certainly 8 byte aligned (stack bias is 2047), and %i1 must be 8 byte aligned, due to:
  union {                                               
    Uint32 theData[8192];
    Uint64 dummyAlign;                                            
  };
in Signal.  So, if you get a SIGBUS, I'm fairly sure it is on stx insn rather than ldx, and it is because whatever created the Signal object (which isn't obvious from the preprocessed source, there is nothing that calls this function in that TU) didn't ensure it was 8 byte aligned.  So, please, first on the SIGBUS print in gdb $fp and $i1, and if $i1 is misaligned, debug where it has been allocated and why it is misaligned.

Comment 2 Tom Lane 2009-10-16 19:05:31 UTC
Hmm ... you guessed correctly, $i1 is 0x7feffb769f4 == 4 mod 8, so if stx requires 8-byte alignment then that explains the SIGBUS.  But I don't entirely understand why memcpy is generating code that assumes any alignment at all.  Is gcc optimizing the memcpy into an 8-byte aligned move instruction because of what it thinks it knows about the source and destination?

Comment 3 Jakub Jelinek 2009-10-16 19:56:12 UTC
Yes, it copies 8 bytes, the source is known to be 8 byte aligned, the destination is 80 bytes into a struct with __alignof__(Signal) == 8, so also supposed to be 8 byte aligned.  So the fastest memcpy in that case is ldx/stx.

Comment 4 Tom Lane 2009-10-16 20:30:08 UTC
Fair enough, I'll go looking into their allocation code then.

Comment 5 Tom Lane 2009-10-17 18:15:08 UTC
Just for the sake of closure ...

  Uint32 tmp[sizeof(SignalHeader)+25];
  Signal * signal = (Signal*)&tmp;

Egad :-(

Comment 6 Jakub Jelinek 2009-10-17 20:21:52 UTC
You can of course use union { char tmp[offsetof(Signal, theData) + 25]; Uint64 dummyAlign; }; or something similar, depending on how big you exactly want it.  But, allocating 4 * sizeof(SignalHeader)+100 bytes when you don't know exactly how big 3 SegmentedSectionPtr's are and wastefully allocate 3 times SignalHeader size doesn't make much sense.

Comment 7 Tom Lane 2009-10-17 20:33:54 UTC
Yeah, the calculation of the array size seems way off.  It turns out they actually *have* a clean solution, which is a parameterized type named SignalT.  The right way to write this was SignalT<25> tmp.
But some junior programmer didn't get the word, evidently.

Comment 8 Tom Lane 2009-10-17 21:11:06 UTC
Filed upstream at http://bugs.mysql.com/bug.php?id=48132