Bug 1421155 - Update dynamic loader trampoline for Intel SSE, AVX, and AVX512 usage.
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: glibc
Version: 7.4
Hardware: x86_64 Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Carlos O'Donell
QA Contact: qe-baseos-tools
Docs Contact: Vladimír Slávik
Depends On:
Blocks: 1413146
Reported: 2017-02-10 09:03 EST by Carlos O'Donell
Modified: 2017-11-20 17:02 EST

See Also:
Fixed In Version: glibc-2.17-190.el7
Doc Type: Enhancement
Doc Text:
Improved performance for dynamically loaded libraries using the Intel SSE, AVX, and AVX512 features

Dynamic library loading has been updated for libraries using the Intel SSE, AVX, and AVX512 features. As a result, performance while loading these libraries has improved. Additionally, support for LD_AUDIT-style auditing has been added.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-08-01 14:09:25 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker ID Priority Status Summary Last Updated
Sourceware 20495 None None None 2017-03-14 10:08 EDT
Sourceware 21236 None None None 2017-03-14 09:59 EDT
Sourceware 21265 None None None 2017-03-20 09:43 EDT
Red Hat Product Errata RHSA-2017:1916 normal SHIPPED_LIVE Moderate: glibc security, bug fix, and enhancement update 2017-08-01 14:05:43 EDT

Description Carlos O'Donell 2017-02-10 09:03:20 EST
We need to update the dynamic loader trampoline to optimize for SSE, AVX, and AVX512 usage:

f3dcae82d54e5097e18e1d6ef4ff55c2ea4e621e
fb0f7a6755c1bfaec38f490fbfcaa39a66ee3604

The second commit is required to avoid the state transition penalties introduced by the first commit, as described here:
https://sourceware.org/bugzilla/show_bug.cgi?id=20495

We need to do this for rhel-7.4 to avoid performance issues with newer DTS (Developer Toolset) compilers, which can generate the instructions that cause these problems.
Comment 1 Florian Weimer 2017-02-10 09:09:26 EST
It may make sense to backport this change as well, for additional test coverage:

commit 3403a17fea8ccef7dc5f99553a13231acf838744
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Thu Feb 9 12:19:44 2017 -0800

    x86-64: Verify that _dl_runtime_resolve preserves vector registers
    
    On x86-64, _dl_runtime_resolve must preserve the first 8 vector
    registers.  Add 3 _dl_runtime_resolve tests to verify that SSE,
    AVX and AVX512 registers are preserved.

However, we would have to replace the intrinsics with inline assembly because our GCC is too old to support this test.
Comment 2 Florian Weimer 2017-03-14 09:59:16 EDT
We have a report that upstream commit fb0f7a6755c1bfaec38f490fbfcaa39a66ee3604 introduces a regression:

  https://sourceware.org/bugzilla/show_bug.cgi?id=21236

My feeling is that it is too risky to include this patch until we know what is going on.
Comment 3 Carlos O'Donell 2017-03-15 00:37:10 EDT
The upstream conclusion is that ICC violates the x86_64 psABI and that there is no fault in glibc.
Comment 4 Carlos O'Donell 2017-03-15 21:07:49 EDT
Used DTS 6 on a KNL box with AVX512 to rebuild and run tst-avx512 as a final verification that all the register saves/restores are in place as expected.
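
Whether a given machine exposes AVX512 at all can be checked from the CPU flags before attempting a run like this. The following is a minimal sketch, not part of the original verification: the function name has_cpu_flag is hypothetical, and the optional second argument exists only so the function can be tried against a sample file instead of /proc/cpuinfo.

```shell
#!/bin/bash
# Return success if the given CPU flag appears in a cpuinfo-style file.
# has_cpu_flag is a hypothetical helper; the file argument defaults to
# /proc/cpuinfo and is a parameter only to make the function testable.
has_cpu_flag() {
    local flag=$1 cpuinfo=${2:-/proc/cpuinfo}
    # Take the first "flags" line, put one flag per line, match exactly.
    grep -m1 '^flags' "$cpuinfo" | tr ' \t' '\n\n' | grep -qx "$flag"
}

# Report the flag matching the -mavx512f option used in this report.
if has_cpu_flag avx512f; then
    echo "avx512f: yes"
else
    echo "avx512f: no"
fi
```

The exact-match grep avoids treating avx512f as present merely because a longer flag name contains it.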

For a record of what I did:
(1) Installed rhel-7.4.
(2) Installed DTS 6.
(3) Compiled glibc.
(4) Recompiled tst-avx512 with AVX512 support (missing from the system compiler) and ran the test, using the following script:
#!/bin/bash
set -e
set -x
# CC=gcc
CC=/opt/rh/devtoolset-6/root/bin/gcc
# GCC_INCLUDE=/usr/lib/gcc/x86_64-redhat-linux/4.8.5/include
GCC_INCLUDE=/opt/rh/devtoolset-6/root/usr/lib/gcc/x86_64-redhat-linux/6.2.1/include
AVX512_CFLAGS=-mavx512f
# glibc build tree used by every command below.
BUILD=/root/rpmbuild/BUILD/glibc-2.17-c758a686/build-x86_64-redhat-linux
# Library directories shared by the link steps and the test run.
LIBPATH=$BUILD:$BUILD/math:$BUILD/elf:$BUILD/dlfcn:$BUILD/nss:$BUILD/nis:$BUILD/rt:$BUILD/resolv:$BUILD/crypt:$BUILD/support:$BUILD/nptl
# Compiler flags shared by all three compile steps.
CFLAGS=(-std=gnu99 -fgnu89-inline -DNDEBUG $AVX512_CFLAGS -O3 -Wall -Winline
    -Wwrite-strings -fasynchronous-unwind-tables -fmerge-all-constants
    -fno-asynchronous-unwind-tables -frounding-math -g -mtune=generic
    -Wstrict-prototypes -Werror=implicit-function-declaration)
# Include paths shared by all three compile steps.
INCLUDES=(-I../include -I$BUILD/elf -I$BUILD
    -I../sysdeps/unix/sysv/linux/x86_64/64/nptl -I../sysdeps/unix/sysv/linux/x86_64/64
    -I../nptl/sysdeps/unix/sysv/linux/x86_64 -I../nptl/sysdeps/unix/sysv/linux/x86
    -I../sysdeps/unix/sysv/linux/x86 -I../rtkaio/sysdeps/unix/sysv/linux/x86_64
    -I../sysdeps/unix/sysv/linux/x86_64 -I../sysdeps/unix/sysv/linux/wordsize-64
    -I../ports/sysdeps/unix/sysv/linux -I../nptl/sysdeps/unix/sysv/linux
    -I../nptl/sysdeps/pthread -I../rtkaio/sysdeps/pthread -I../sysdeps/pthread
    -I../rtkaio/sysdeps/unix/sysv/linux -I../sysdeps/unix/sysv/linux -I../sysdeps/gnu
    -I../sysdeps/unix/inet -I../ports/sysdeps/unix/sysv -I../nptl/sysdeps/unix/sysv
    -I../rtkaio/sysdeps/unix/sysv -I../sysdeps/unix/sysv -I../sysdeps/unix/x86_64
    -I../ports/sysdeps/unix -I../nptl/sysdeps/unix -I../rtkaio/sysdeps/unix
    -I../sysdeps/unix -I../sysdeps/posix -I../nptl/sysdeps/x86_64/64
    -I../sysdeps/x86_64/64 -I../sysdeps/x86_64/fpu/multiarch -I../sysdeps/x86_64/fpu
    -I../sysdeps/x86/fpu -I../sysdeps/x86_64/multiarch -I../nptl/sysdeps/x86_64
    -I../sysdeps/x86_64 -I../sysdeps/x86 -I../sysdeps/ieee754/ldbl-96
    -I../sysdeps/ieee754/dbl-64/wordsize-64 -I../sysdeps/ieee754/dbl-64
    -I../sysdeps/ieee754/flt-32 -I../sysdeps/wordsize-64 -I../sysdeps/ieee754
    -I../sysdeps/generic -I../ports -I../nptl -I../rtkaio -I.. -I../libio -I.
    -nostdinc -isystem $GCC_INCLUDE -isystem /usr/include)
# Preprocessor options shared by all three compile steps.
DEFINES=(-D_LIBC_REENTRANT -include $BUILD/libc-modules.h -DMODULE_NAME=nonlib
    -include ../include/libc-symbols.h)
# Compile the shared object code with AVX512 support.
$CC ../sysdeps/x86_64/tst-avx512mod.c -c "${CFLAGS[@]}" -fPIC -fno-tree-loop-distribute-patterns \
    "${INCLUDES[@]}" "${DEFINES[@]}" -DPIC -DSHARED \
    -o $BUILD/elf/tst-avx512mod.os -MD -MP \
    -MF $BUILD/elf/tst-avx512mod.os.dt -MT $BUILD/elf/tst-avx512mod.os
# Link the shared object used in the test.
$CC -shared -static-libgcc -Wl,-dynamic-linker=/lib64/ld-linux-x86-64.so.2 -Wl,-z,defs \
    -B$BUILD/csu/ -Wl,-z,combreloc -Wl,-z,relro -Wl,--hash-style=both \
    -L$BUILD -L$BUILD/math -L$BUILD/elf -L$BUILD/dlfcn -L$BUILD/nss -L$BUILD/nis \
    -L$BUILD/rt -L$BUILD/resolv -L$BUILD/crypt -L$BUILD/support -L$BUILD/nptl \
    -Wl,-rpath-link=$LIBPATH \
    -o $BUILD/elf/tst-avx512mod.so -T $BUILD/shlib.lds \
    $BUILD/csu/abi-note.o $BUILD/elf/tst-avx512mod.os \
    -Wl,--start-group $BUILD/libc.so $BUILD/libc_nonshared.a \
    -Wl,--as-needed $BUILD/elf/ld.so -Wl,--no-as-needed -Wl,--end-group
# Compile the test object with AVX512 support.
$CC ../sysdeps/x86_64/tst-avx512.c -c "${CFLAGS[@]}" -fno-tree-loop-distribute-patterns \
    "${INCLUDES[@]}" "${DEFINES[@]}" \
    -o $BUILD/elf/tst-avx512.o -MD -MP \
    -MF $BUILD/elf/tst-avx512.o.dt -MT $BUILD/elf/tst-avx512.o
# Build the auxiliary object with AVX512 support.
$CC ../sysdeps/x86_64/tst-avx512-aux.c -c "${CFLAGS[@]}" -fno-tree-loop-distribute-patterns \
    "${INCLUDES[@]}" "${DEFINES[@]}" \
    -o $BUILD/elf/tst-avx512-aux.o -MD -MP \
    -MF $BUILD/elf/tst-avx512-aux.o.dt -MT $BUILD/elf/tst-avx512-aux.o
# Link the final test binary.
$CC -nostdlib -nostartfiles -o $BUILD/elf/tst-avx512 \
    -Wl,-dynamic-linker=/lib64/ld-linux-x86-64.so.2 \
    -Wl,-z,combreloc -Wl,-z,relro -Wl,--hash-style=both \
    $BUILD/csu/crt1.o $BUILD/csu/crti.o `$CC --print-file-name=crtbegin.o` \
    $BUILD/elf/tst-avx512.o $BUILD/support/libsupport_nonshared.a \
    $BUILD/elf/tst-avx512-aux.o $BUILD/elf/tst-avx512mod.so \
    -Wl,-rpath-link=$LIBPATH \
    $BUILD/libc.so.6 $BUILD/libc_nonshared.a \
    -Wl,--as-needed $BUILD/elf/ld.so -Wl,--no-as-needed \
    -lgcc -Wl,--as-needed -lgcc_s -Wl,--no-as-needed \
    `$CC --print-file-name=crtend.o` $BUILD/csu/crtn.o
# Re-run the test with AVX512 support.
env GCONV_PATH=$BUILD/iconvdata LOCPATH=$BUILD/localedata LC_ALL=C \
    $BUILD/elf/ld-linux-x86-64.so.2 --library-path $LIBPATH \
    $BUILD/elf/tst-avx512 > $BUILD/elf/tst-avx512.out

QE should IMO take tst-avx512 out of the glibc test framework (easy to extract) and compile it stand-alone with DTS6 and run it under the newly installed glibc to verify the same as I have done above. Reach out to me if you need any help doing that.
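
For a stand-alone run of that kind, the result could be classified from the exit status of the test binary. This is only a sketch: the helper name classify_test is hypothetical, and it assumes the test follows the glibc test-driver convention that exit status 0 means pass and 77 means unsupported; check the test as shipped before relying on that mapping.

```shell
#!/bin/bash
# Run a test command and classify its exit status.
# Assumption: 0 = pass, 77 = unsupported (glibc test-driver convention),
# anything else = failure. classify_test is a hypothetical helper name.
classify_test() {
    local status=0
    # Discard the test's own output; only the exit status is inspected.
    "$@" >/dev/null 2>&1 || status=$?
    case $status in
        0)  echo "PASS" ;;
        77) echo "UNSUPPORTED" ;;
        *)  echo "FAIL (exit $status)" ;;
    esac
}
```

A call such as `classify_test ./tst-avx512` (with the path adjusted to wherever the stand-alone binary was built) then prints a single result line.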
Comment 7 Carlos O'Donell 2017-03-22 21:37:04 EDT
Intel just added:

commit c15f8eb50cea7ad1a4ccece6e0982bf426d52c00
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Mar 21 10:59:31 2017 -0700

    x86-64: Improve branch predication in _dl_runtime_resolve_avx512_opt [BZ #21258]
    
    On Skylake server, _dl_runtime_resolve_avx512_opt is used to preserve
    the first 8 vector registers.  The code layout is
    
      if only %xmm0 - %xmm7 registers are used
         preserve %xmm0 - %xmm7 registers
      if only %ymm0 - %ymm7 registers are used
         preserve %ymm0 - %ymm7 registers
      preserve %zmm0 - %zmm7 registers
    
    Branch predication always executes the fallthrough code path to preserve
    %zmm0 - %zmm7 registers speculatively, even though only %xmm0 - %xmm7
    registers are used.  This leads to lower CPU frequency on Skylake
    server.  This patch changes the fallthrough code path to preserve
    %xmm0 - %xmm7 registers instead:
    
      if whole %zmm0 - %zmm7 registers are used
        preserve %zmm0 - %zmm7 registers
      if only %ymm0 - %ymm7 registers are used
         preserve %ymm0 - %ymm7 registers
      preserve %xmm0 - %xmm7 registers

    Tested on Skylake server.
    
            [BZ #21258]
            * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve_opt):
            Define only if _dl_runtime_resolve is defined to
            _dl_runtime_resolve_sse_vex.
            * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_opt):
            Fallthrough to _dl_runtime_resolve_sse_vex.
---

To upstream. This is a relatively minor change that would mean rhel-7.4 would be optimally placed for Skylake.

Given that I'm going to respin for lock elision we should consider this bug too. I'm going to flip this back to ASSIGNED to make sure we don't miss this.
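
The reordering described in the commit message can be modelled as two dispatch functions that differ only in which case is the final, unconditional path. The sketch below illustrates the control flow only; it is not the trampoline assembly, and the function names and the "state" argument (one of xmm, ymm, zmm) are hypothetical stand-ins for what the real code inspects.

```shell
#!/bin/bash
# Illustration only: models the ORDER of checks in the two layouts.
# The "state" argument is a hypothetical stand-in: xmm, ymm, or zmm.

# Layout before the patch: the final path preserves %zmm0 - %zmm7.
save_set_before() {
    case $1 in
        xmm) echo "xmm0-xmm7" ;;
        ymm) echo "ymm0-ymm7" ;;
        *)   echo "zmm0-zmm7" ;;   # fallthrough path
    esac
}

# Layout after the patch: the final path preserves %xmm0 - %xmm7.
save_set_after() {
    case $1 in
        zmm) echo "zmm0-zmm7" ;;
        ymm) echo "ymm0-ymm7" ;;
        *)   echo "xmm0-xmm7" ;;   # fallthrough path
    esac
}

# Both layouts pick the same register set for each state.
for state in xmm ymm zmm; do
    echo "$state: before=$(save_set_before "$state") after=$(save_set_after "$state")"
done
```

Both functions select the same registers for each state; what changes is the path reached when no earlier check matches, which per the commit message is the path executed speculatively.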
Comment 21 errata-xmlrpc 2017-08-01 14:09:25 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:1916
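
To confirm that a system carries at least the fixed build (glibc-2.17-190.el7, per "Fixed In Version"), the installed version-release string can be compared against it. This is a rough sketch: version_ge is a hypothetical helper, and it relies on the version ordering of GNU `sort -V`, which approximates but is not identical to RPM's own comparison.

```shell
#!/bin/bash
# Succeed if version string $1 sorts at or after $2 under GNU sort -V.
# version_ge is a hypothetical helper; sort -V only approximates RPM's
# version comparison, so treat the result as a quick check.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Example usage on an RPM-based system (not executed here):
#   installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' glibc | head -n1)
#   version_ge "$installed" 2.17-190.el7 && echo "fix present"
```

The `head -n1` in the usage comment is there because a multilib system may report more than one glibc package.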
