Created attachment 1593178 (reproducer)

Description of problem:
GCC emits a movups store to a statically allocated external symbol.
related: https://bugzilla.redhat.com/show_bug.cgi?id=947197

Version-Release number of selected component (if applicable):
gcc-toolset-9-gcc-9.1.1-1.el8.x86_64

How reproducible:
g++ x.cpp preprocessed.cpp -O3 -S -std=c++11
grep xmm.*_ZZN5boost16exception_detail27get_static_exception_objectINS0_14bad_exception_EEENS_13except preprocessed.s

result:
movups %xmm0, _ZZN5boost16exception_detail27get_static_exception_objectINS0_14bad_exception_EEENS_13exception_ptrEvE2ep(%rip)

When compiled with the system gcc (), the same code contains
movq %rbx, _ZZN5boost16exception_detail27get_static_exception_objectINS0_14bad_exception_EEENS_13exception_ptrEvE2ep+8(%rip)
The problem in the original PR (which I can't reproduce) was that we were generating movdqa, which requires its memory operand to be aligned to a 16-byte boundary, on a symbol that need not be 16-byte aligned. The x86-64 ABI says that *arrays* larger than 15B shall be 16-byte aligned, but gcc aligned other aggregates too (see ix86_data_alignment).

That has been fixed and GCC 8 generates
movq %r13, _ZZN5boost16exception_detail27get_static_exception_objectINS0_14bad_exception_EEENS_13exception_ptrEvE2ep(%rip)
movq %rbx, _ZZN5boost16exception_detail27get_static_exception_objectINS0_14bad_exception_EEENS_13exception_ptrEvE2ep+8(%rip)
i.e. it uses movq to move two quadwords. This doesn't have the alignment problem above.

However, GCC 9 generates
movups %xmm0, _ZZN5boost16exception_detail27get_static_exception_objectINS0_14bad_exception_EEENS_13exception_ptrEvE2ep(%rip)
and the QE test broke. We have generated movups since <https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01499.html>, which improved BB vectorization.

But this should still be fine wrt the original problem: movups stores the full 128-bit value, but the address it stores to does *not* have to be 16-byte aligned, unlike movaps, which has the alignment requirement.
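To make the movups vs. movaps distinction concrete, here is a minimal sketch that is not taken from the reproducer. The names two_words, is_aligned, store16_unaligned and unaligned_store_roundtrip are hypothetical and exist only for illustration. It assumes an x86 target with SSE2, where _mm_storeu_si128 is the unaligned 128-bit store intrinsic; on other targets it falls back to memcpy. The store goes to an address that is 8-byte aligned but deliberately not 16-byte aligned, which is the case an aligned store (movaps/movdqa) would fault on.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Hypothetical stand-in for a 16-byte static object made of two quadwords.
struct two_words {
    std::uint64_t lo;
    std::uint64_t hi;
};

// The aggregate itself does not require 16-byte alignment.
static_assert(alignof(two_words) < 16, "two quadwords need less than 16-byte alignment");

// True when p is a multiple of `alignment` bytes.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Store 16 bytes to dst without assuming anything about its alignment.
inline void store16_unaligned(void* dst, std::uint64_t lo, std::uint64_t hi) {
#if defined(__SSE2__)
    // _mm_set_epi64x takes the high quadword first, then the low one.
    __m128i v = _mm_set_epi64x(static_cast<long long>(hi),
                               static_cast<long long>(lo));
    // Unaligned store: no 16-byte alignment requirement on dst.
    _mm_storeu_si128(static_cast<__m128i*>(dst), v);
#else
    // Portable fallback for targets without SSE2.
    std::uint64_t words[2] = {lo, hi};
    std::memcpy(dst, words, sizeof words);
#endif
}

// Writes two quadwords at an 8-byte-aligned, non-16-byte-aligned address
// and reads them back.
inline bool unaligned_store_roundtrip() {
    alignas(16) unsigned char buf[32] = {};
    unsigned char* p = buf + 8;  // 8-byte aligned, not 16-byte aligned
    store16_unaligned(p, 0x1122334455667788ULL, 0x0102030405060708ULL);

    two_words out;
    std::memcpy(&out, p, sizeof out);
    return !is_aligned(p, 16)
        && out.lo == 0x1122334455667788ULL
        && out.hi == 0x0102030405060708ULL;
}
```

Calling unaligned_store_roundtrip() from a test driver should return true. Swapping _mm_storeu_si128 for the aligned _mm_store_si128 at the same address would be the analogue of the movdqa problem from the original PR.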
So I think this should be CLOSED|NOTABUG, but I'd like to ask Jakub to double check what I wrote.
Closing as per the conclusion above. The QE test is just too strict.