Bug 1421155 - Update dynamic loader trampoline for Intel SSE, AVX, and AVX512 usage.
Summary: Update dynamic loader trampoline for Intel SSE, AVX, and AVX512 usage.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: glibc
Version: 7.4
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Carlos O'Donell
QA Contact: qe-baseos-tools-bugs
Docs Contact: Vladimír Slávik
URL:
Whiteboard:
Depends On:
Blocks: 1413146
Reported: 2017-02-10 14:03 UTC by Carlos O'Donell
Modified: 2017-11-20 22:02 UTC

Fixed In Version: glibc-2.17-190.el7
Doc Type: Enhancement
Doc Text:
Improved performance for dynamically loaded libraries using the Intel SSE, AVX, and AVX512 features
Dynamic library loading has been updated for libraries using the Intel SSE, AVX, and AVX512 features. As a result, performance while loading these libraries has improved. Additionally, support for LD_AUDIT-style auditing has been added.
Clone Of:
Environment:
Last Closed: 2017-08-01 18:09:25 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1499012 | 0 | unspecified | CLOSED | glibc: Restore compatibility with compilers which use vector registers for argument passing against x86-64 ABI | 2021-03-11 15:55:45 UTC
Red Hat Bugzilla | 1504969 | 1 | None | None | None | 2023-09-15 00:04:40 UTC
Red Hat Product Errata | RHSA-2017:1916 | 0 | normal | SHIPPED_LIVE | Moderate: glibc security, bug fix, and enhancement update | 2017-08-01 18:05:43 UTC
Sourceware | 20495 | 0 | P2 | RESOLVED | x86_64 performance degradation due to AVX/SSE transition penalty | 2020-11-19 21:51:53 UTC
Sourceware | 21236 | 0 | P2 | RESOLVED | NaN generation by optimized math functions | 2020-11-19 21:51:30 UTC
Sourceware | 21265 | 0 | P2 | RESOLVED | _dl_runtime_resolve isn't compatible with Intel C++ __regcall calling convention | 2020-11-19 21:51:30 UTC

Internal Links: 1499012 1504969

Description Carlos O'Donell 2017-02-10 14:03:20 UTC
We need to update the dynamic loader trampoline to optimize for SSE, AVX, and AVX512 usage:

f3dcae82d54e5097e18e1d6ef4ff55c2ea4e621e
fb0f7a6755c1bfaec38f490fbfcaa39a66ee3604

The second commit is required to avoid the AVX/SSE state transition penalties introduced by the first commit, as described here:
https://sourceware.org/bugzilla/show_bug.cgi?id=20495

We need to do this for rhel-7.4 to avoid performance issues with newer DTS (Developer Toolset) releases, which can generate the instructions that trigger these penalties.

Comment 1 Florian Weimer 2017-02-10 14:09:26 UTC
It may make sense to backport this change as well, for additional test coverage:

commit 3403a17fea8ccef7dc5f99553a13231acf838744
Author: H.J. Lu <hjl.tools>
Date:   Thu Feb 9 12:19:44 2017 -0800

    x86-64: Verify that _dl_runtime_resolve preserves vector registers
    
    On x86-64, _dl_runtime_resolve must preserve the first 8 vector
    registers.  Add 3 _dl_runtime_resolve tests to verify that SSE,
    AVX and AVX512 registers are preserved.

However, we would have to replace the intrinsics with inline assembly because our GCC is too old to support this test.
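
Whether a given compiler is new enough for such a test can be probed before building. The sketch below is only an illustration: the helper name cc_accepts_flag is hypothetical, and treating acceptance of -mavx512f as a stand-in for usable AVX512 intrinsics is an assumption, not something the upstream test does.

```shell
# cc_accepts_flag CC FLAG
# Succeeds if compiler CC can preprocess an empty C input with FLAG.
# Hypothetical helper for illustration; not part of glibc or its test suite.
cc_accepts_flag() {
    "$1" "$2" -E -x c /dev/null -o /dev/null >/dev/null 2>&1
}

# Example use (commented out so the snippet has no side effects):
# if cc_accepts_flag gcc -mavx512f; then
#     echo "compiler accepts -mavx512f"
# else
#     echo "compiler too old for -mavx512f"
# fi
```

A compiler that rejects the option exits with a non-zero status, so the helper fails and the caller can fall back to another toolchain or skip the test.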

Comment 2 Florian Weimer 2017-03-14 13:59:16 UTC
We have a report that upstream commit fb0f7a6755c1bfaec38f490fbfcaa39a66ee3604 introduces a regression:

  https://sourceware.org/bugzilla/show_bug.cgi?id=21236

My feeling is that it is too risky to include this patch until we know what is going on.

Comment 3 Carlos O'Donell 2017-03-15 04:37:10 UTC
The upstream conclusion is that ICC violates the x86-64 psABI and that glibc is not at fault.

Comment 4 Carlos O'Donell 2017-03-16 01:07:49 UTC
Used DTS 6 on a KNL box with AVX512 to rebuild and run tst-avx512 as a final verification that all the register saves/restores are in place as expected.

For the record, here is what I did:
(1) Installed rhel-7.4.
(2) Installed DTS 6.
(3) Compiled glibc.
(4) Recompiled tst-avx512 with AVX512 support (missing from the system compiler) and ran the test:
#!/bin/bash
set -e
set -x
# CC=gcc
CC=/opt/rh/devtoolset-6/root/bin/gcc
# GCC_INCLUDE=/usr/lib/gcc/x86_64-redhat-linux/4.8.5/include
GCC_INCLUDE=/opt/rh/devtoolset-6/root/usr/lib/gcc/x86_64-redhat-linux/6.2.1/include
AVX512_CFLAGS=-mavx512f
# glibc build directory; every absolute path below lives under it.
B=/root/rpmbuild/BUILD/glibc-2.17-c758a686/build-x86_64-redhat-linux
# Build subdirectories searched at link time and at run time.
LIBPATH=$B:$B/math:$B/elf:$B/dlfcn:$B/nss:$B/nis:$B/rt:$B/resolv:$B/crypt:$B/support:$B/nptl
# Flags shared by all three compile steps.
CFLAGS=(-c -std=gnu99 -fgnu89-inline -DNDEBUG $AVX512_CFLAGS -O3 -Wall -Winline -Wwrite-strings
  -fasynchronous-unwind-tables -fmerge-all-constants -fno-asynchronous-unwind-tables -frounding-math
  -g -mtune=generic -Wstrict-prototypes -Werror=implicit-function-declaration
  -fno-tree-loop-distribute-patterns)
# Include search path, in the order used by the glibc build.
INCLUDES=(-I../include -I$B/elf -I$B
  -I../sysdeps/unix/sysv/linux/x86_64/64/nptl -I../sysdeps/unix/sysv/linux/x86_64/64
  -I../nptl/sysdeps/unix/sysv/linux/x86_64 -I../nptl/sysdeps/unix/sysv/linux/x86
  -I../sysdeps/unix/sysv/linux/x86 -I../rtkaio/sysdeps/unix/sysv/linux/x86_64
  -I../sysdeps/unix/sysv/linux/x86_64 -I../sysdeps/unix/sysv/linux/wordsize-64
  -I../ports/sysdeps/unix/sysv/linux -I../nptl/sysdeps/unix/sysv/linux
  -I../nptl/sysdeps/pthread -I../rtkaio/sysdeps/pthread -I../sysdeps/pthread
  -I../rtkaio/sysdeps/unix/sysv/linux -I../sysdeps/unix/sysv/linux -I../sysdeps/gnu
  -I../sysdeps/unix/inet -I../ports/sysdeps/unix/sysv -I../nptl/sysdeps/unix/sysv
  -I../rtkaio/sysdeps/unix/sysv -I../sysdeps/unix/sysv -I../sysdeps/unix/x86_64
  -I../ports/sysdeps/unix -I../nptl/sysdeps/unix -I../rtkaio/sysdeps/unix -I../sysdeps/unix
  -I../sysdeps/posix -I../nptl/sysdeps/x86_64/64 -I../sysdeps/x86_64/64
  -I../sysdeps/x86_64/fpu/multiarch -I../sysdeps/x86_64/fpu -I../sysdeps/x86/fpu
  -I../sysdeps/x86_64/multiarch -I../nptl/sysdeps/x86_64 -I../sysdeps/x86_64 -I../sysdeps/x86
  -I../sysdeps/ieee754/ldbl-96 -I../sysdeps/ieee754/dbl-64/wordsize-64 -I../sysdeps/ieee754/dbl-64
  -I../sysdeps/ieee754/flt-32 -I../sysdeps/wordsize-64 -I../sysdeps/ieee754 -I../sysdeps/generic
  -I../ports -I../nptl -I../rtkaio -I.. -I../libio -I.)
# System headers and the glibc-internal forced includes.
SYSFLAGS=(-nostdinc -isystem $GCC_INCLUDE -isystem /usr/include -D_LIBC_REENTRANT
  -include $B/libc-modules.h -DMODULE_NAME=nonlib -include ../include/libc-symbols.h)
# Compile the shared object code with AVX512 support.
$CC ../sysdeps/x86_64/tst-avx512mod.c "${CFLAGS[@]}" -fPIC "${INCLUDES[@]}" "${SYSFLAGS[@]}" -DPIC -DSHARED \
  -o $B/elf/tst-avx512mod.os -MD -MP -MF $B/elf/tst-avx512mod.os.dt -MT $B/elf/tst-avx512mod.os
# Link the shared object used in the test.
$CC -shared -static-libgcc -Wl,-dynamic-linker=/lib64/ld-linux-x86-64.so.2 -Wl,-z,defs -B$B/csu/ \
  -Wl,-z,combreloc -Wl,-z,relro -Wl,--hash-style=both \
  -L$B -L$B/math -L$B/elf -L$B/dlfcn -L$B/nss -L$B/nis -L$B/rt -L$B/resolv -L$B/crypt -L$B/support -L$B/nptl \
  -Wl,-rpath-link=$LIBPATH -o $B/elf/tst-avx512mod.so -T $B/shlib.lds $B/csu/abi-note.o $B/elf/tst-avx512mod.os \
  -Wl,--start-group $B/libc.so $B/libc_nonshared.a -Wl,--as-needed $B/elf/ld.so -Wl,--no-as-needed -Wl,--end-group
# Compile the test object with AVX512 support.
$CC ../sysdeps/x86_64/tst-avx512.c "${CFLAGS[@]}" "${INCLUDES[@]}" "${SYSFLAGS[@]}" \
  -o $B/elf/tst-avx512.o -MD -MP -MF $B/elf/tst-avx512.o.dt -MT $B/elf/tst-avx512.o
# Build the auxiliary object with AVX512 support.
$CC ../sysdeps/x86_64/tst-avx512-aux.c "${CFLAGS[@]}" "${INCLUDES[@]}" "${SYSFLAGS[@]}" \
  -o $B/elf/tst-avx512-aux.o -MD -MP -MF $B/elf/tst-avx512-aux.o.dt -MT $B/elf/tst-avx512-aux.o
# Link the final test binary.
$CC -nostdlib -nostartfiles -o $B/elf/tst-avx512 -Wl,-dynamic-linker=/lib64/ld-linux-x86-64.so.2 \
  -Wl,-z,combreloc -Wl,-z,relro -Wl,--hash-style=both $B/csu/crt1.o $B/csu/crti.o $($CC --print-file-name=crtbegin.o) \
  $B/elf/tst-avx512.o $B/support/libsupport_nonshared.a $B/elf/tst-avx512-aux.o $B/elf/tst-avx512mod.so \
  -Wl,-rpath-link=$LIBPATH $B/libc.so.6 $B/libc_nonshared.a -Wl,--as-needed $B/elf/ld.so -Wl,--no-as-needed \
  -lgcc -Wl,--as-needed -lgcc_s -Wl,--no-as-needed $($CC --print-file-name=crtend.o) $B/csu/crtn.o
# Re-run the test with AVX512 support, using the newly built loader and libraries.
env GCONV_PATH=$B/iconvdata LOCPATH=$B/localedata LC_ALL=C \
  $B/elf/ld-linux-x86-64.so.2 --library-path $LIBPATH $B/elf/tst-avx512 > $B/elf/tst-avx512.out

QE should IMO take tst-avx512 out of the glibc test framework (easy to extract) and compile it stand-alone with DTS6 and run it under the newly installed glibc to verify the same as I have done above. Reach out to me if you need any help doing that.
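
For a stand-alone run like this, it may help to confirm first that the machine actually advertises AVX512, since the test is only meaningful on such hardware. The following is a minimal sketch under assumptions: the helper name cpu_has_flag is hypothetical, and it relies on the Linux /proc/cpuinfo "flags" line listing avx512f on capable CPUs.

```shell
# cpu_has_flag FLAG [CPUINFO_FILE]
# Succeeds if FLAG appears as a whole word on the first "flags" line of
# CPUINFO_FILE (default: /proc/cpuinfo). Hypothetical helper for illustration.
cpu_has_flag() {
    grep -m1 '^flags' "${2:-/proc/cpuinfo}" 2>/dev/null | grep -qw -- "$1"
}

# Example use (commented out so the snippet has no side effects):
# if cpu_has_flag avx512f; then
#     ./tst-avx512
# else
#     echo "no AVX512F on this machine; skipping tst-avx512"
# fi
```

The optional second argument exists only so the check can be exercised against a saved copy of /proc/cpuinfo from another machine.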

Comment 7 Carlos O'Donell 2017-03-23 01:37:04 UTC
Intel just added:

commit c15f8eb50cea7ad1a4ccece6e0982bf426d52c00
Author: H.J. Lu <hjl.tools>
Date:   Tue Mar 21 10:59:31 2017 -0700

    x86-64: Improve branch predication in _dl_runtime_resolve_avx512_opt [BZ #21258]
    
    On Skylake server, _dl_runtime_resolve_avx512_opt is used to preserve
    the first 8 vector registers.  The code layout is
    
      if only %xmm0 - %xmm7 registers are used
         preserve %xmm0 - %xmm7 registers
      if only %ymm0 - %ymm7 registers are used
         preserve %ymm0 - %ymm7 registers
      preserve %zmm0 - %zmm7 registers
    
    Branch predication always executes the fallthrough code path to preserve
    %zmm0 - %zmm7 registers speculatively, even though only %xmm0 - %xmm7
    registers are used.  This leads to lower CPU frequency on Skylake
    server.  This patch changes the fallthrough code path to preserve
    %xmm0 - %xmm7 registers instead:
    
      if whole %zmm0 - %zmm7 registers are used
        preserve %zmm0 - %zmm7 registers
      if only %ymm0 - %ymm7 registers are used
         preserve %ymm0 - %ymm7 registers
      preserve %xmm0 - %xmm7 registers

    Tested on Skylake server.
    
            [BZ #21258]
            * sysdeps/x86_64/dl-trampoline.S (_dl_runtime_resolve_opt):
            Define only if _dl_runtime_resolve is defined to
            _dl_runtime_resolve_sse_vex.
            * sysdeps/x86_64/dl-trampoline.h (_dl_runtime_resolve_opt):
            Fallthrough to _dl_runtime_resolve_sse_vex.
---
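
The reordered layout quoted above can be modelled as a simple decision function. This is an illustrative sketch only, not the glibc implementation (which is x86-64 assembly in sysdeps/x86_64/dl-trampoline.S). The function name select_save_path is hypothetical, and the bit positions follow Intel's XSAVE state-component numbering (bit 2 for the upper halves of the YMM registers, bit 6 for the upper halves of %zmm0 - %zmm15); that the resolver tests exactly these bits is an assumption.

```shell
# Assumed in-use state bits (Intel XSAVE state-component numbering).
YMM_STATE=$((1 << 2))   # upper 128 bits of the YMM registers
ZMM_STATE=$((1 << 6))   # upper 256 bits of %zmm0 - %zmm15

# select_save_path MASK
# Prints which register width would be preserved for the given in-use
# state mask, mirroring the new layout: ZMM check first, then YMM,
# with the cheap XMM save as the fallthrough path.
select_save_path() {
    mask=$(( $1 ))
    if [ $(( mask & ZMM_STATE )) -ne 0 ]; then
        echo zmm
    elif [ $(( mask & YMM_STATE )) -ne 0 ]; then
        echo ymm
    else
        echo xmm
    fi
}
```

The point of the ordering is that the path reached when neither branch is taken is the XMM save, so the fallthrough path no longer touches the ZMM registers.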

to upstream. This is a relatively minor change that would leave rhel-7.4 optimally placed for Skylake.

Given that I'm going to respin for lock elision we should consider this bug too. I'm going to flip this back to ASSIGNED to make sure we don't miss this.

Comment 21 errata-xmlrpc 2017-08-01 18:09:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:1916

