Bug 1249426 - curl crashes with "Illegal instruction (core dumped)" at intel_aes_gcmINIT()
Status: CLOSED DUPLICATE of bug 1335280
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: nss-softokn
Hardware: Unspecified Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: nss-nspr-maint
QA Contact: BaseOS QE Security Team
Depends On:
Blocks: 1269194
Reported: 2015-08-02 17:50 EDT by Yoshifumi Kinoshita
Modified: 2016-07-08 16:13 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Last Closed: 2016-06-28 10:55:16 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---

Attachments
CPU flag test (807 bytes, text/plain)
2016-02-02 06:14 EST, David Zambonini
Suggested Fix (1.17 KB, patch)
2016-02-02 06:15 EST, David Zambonini

Comment 2 Kamil Dudka 2015-08-03 04:34:36 EDT
The crypto algorithm in question is implemented by nss-softokn, so I am changing the component accordingly.  Please attach the contents of /proc/cpuinfo from the machine where it crashes.
Comment 4 Kamil Dudka 2015-08-03 12:27:58 EDT
The code in aes_InitContext() mistakenly detected the 'avx' flag on a CPU that does not implement it.  The detection needs to be improved such that it does not enable accelerated GCM in this case.

Please try to use the following command as a workaround:

Comment 5 Yoshifumi Kinoshita 2015-08-03 12:48:56 EDT
The user verified the workaround works on their system.
Comment 16 Joe Wright 2016-01-26 10:01:15 EST
Is there any way at all to fix this without breaking FIPS?
Comment 17 David Zambonini 2016-02-02 06:13:31 EST
I've experienced this issue in a paravirtualised environment (Virtuozzo) running on a Haswell processor, which most certainly does have AVX2 support. Although this is a non-standard kernel/environment, you may want to test against this to determine whether this is also the problem facing the stock OS.

# curl https://google.com
Illegal instruction

# NSS_DISABLE_HW_GCM=1 curl https://google.com
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">

The program terminates with:

SIGILL {si_signo=SIGILL, si_code=ILL_ILLOPN, si_addr=0x2ae0fd11fd60}

So (in my case) the problem was not in the instruction issued, but with the operand.

A trace reveals:

#0  intel_aes_gcmINIT () at intel-gcm.s:71
71          vmovdqu      16*0(KS), T

Where KS is %rsi and T is %xmm0
%rsi was aligned (0x15966600)

Therefore my guess would be that the problem is not with issuing AVX instructions as such, but with failing to detect whether the OS is saving the extended processor state.

A quick rough and ready test appears to verify this (in my case at least, cpuid.c):

Failed environment (EL6, but on older kernel):
# ./cpuid
%eax    = 0x000306f2
%ebx    = 0x02100800
%ecx    = 0x77fefbff
%edx    = 0xbfebfbff
aes     = 1
clmul   = 1
avx     = 1
xsave   = 1
osxsave = 0

Successful environment (EL6, identical processor):
# ./cpuid
%eax    = 0x000306f2
%ebx    = 0x13100800
%ecx    = 0x7ffefbff
%edx    = 0xbfebfbff
aes     = 1
clmul   = 1
avx     = 1
xsave   = 1
osxsave = 1

I'd suggest testing against both xsave and osxsave in nss-softokn (patch supplied) and seeing whether this solves the problem.
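
For readers who want to reproduce the flag decoding above, here is a minimal sketch in C. It is not the attached cpuid.c; the names decode_leaf1_ecx and avx_usable are hypothetical. It only decodes a raw %ecx value from CPUID leaf 1, using the architectural bit positions (PCLMULQDQ = 1, AES = 25, XSAVE = 26, OSXSAVE = 27, AVX = 28), and applies the xsave/osxsave condition suggested in this comment.

```c
#include <stdint.h>

/* Feature bits in ECX of CPUID leaf 1 (architectural bit positions). */
#define ECX_PCLMULQDQ (1u << 1)
#define ECX_AES       (1u << 25)
#define ECX_XSAVE     (1u << 26)
#define ECX_OSXSAVE   (1u << 27)
#define ECX_AVX       (1u << 28)

struct cpu_flags {
    int aes, clmul, avx, xsave, osxsave;
};

/* Hypothetical helper: decode a raw leaf-1 ECX value into the five flags
 * printed by the test above. */
struct cpu_flags decode_leaf1_ecx(uint32_t ecx)
{
    struct cpu_flags f;
    f.aes     = (ecx & ECX_AES) != 0;
    f.clmul   = (ecx & ECX_PCLMULQDQ) != 0;
    f.avx     = (ecx & ECX_AVX) != 0;
    f.xsave   = (ecx & ECX_XSAVE) != 0;
    f.osxsave = (ecx & ECX_OSXSAVE) != 0;
    return f;
}

/* Hypothetical helper: treat AVX as usable only when the CPU reports AVX
 * and XSAVE, and the OS has enabled XSAVE (OSXSAVE set). */
int avx_usable(uint32_t ecx)
{
    struct cpu_flags f = decode_leaf1_ecx(ecx);
    return f.avx && f.xsave && f.osxsave;
}
```

With the values reported above, avx_usable(0x77fefbff) is 0 for the failing environment and avx_usable(0x7ffefbff) is 1 for the working one; the two %ecx values differ only in bit 27. A complete check would normally also read XCR0 with XGETBV to confirm that the OS saves the YMM state, which this sketch leaves out.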
Comment 18 David Zambonini 2016-02-02 06:14 EST
Created attachment 1120375 [details]
CPU flag test

Simple rough and ready test based on freebl tests.
Comment 19 David Zambonini 2016-02-02 06:15 EST
Created attachment 1120376 [details]
Suggested Fix

"Works for me" fix by adding XSAVE and OS XSAVE tests before enabling GCM.
Comment 20 Elio Maldonado Batiz 2016-02-02 18:27:22 EST
I'm afraid the fix would break FIPS because it patches nss-softokn-3.14.3/mozilla/security/nss/lib/freebl/rijndael.c which is inside the crypto boundary.
Comment 21 David Zambonini 2016-02-02 20:06:12 EST
Sorry, I blithely walked into that and stated the obvious without considering FIPS. While it isn't a consideration for me, I understand why this presents a problem for you. 

If /security/nss/lib/freebl/mpi/mpcpucache* is outside of the FIPS boundary, then altering the result of freebl_cpuid by effectively ANDing the AVX bit in %ecx/%rcx against the XSAVE and OSXSAVE bits before returning (sorry, my assembly is a little rusty), something like:

        bt  $26, %ecx   # test for xsave support
        jnc failavx
        bt  $27, %ecx   # test for osxsave support
        jc  out
failavx:
        btr $28, %ecx   # clear avx support
out:

would appear to work while not giving any side-effects elsewhere in the code. If it's inside the boundary, then I can't see any way forward, at least for correct detection while maintaining FIPS validation.
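
The masking idea in this comment can also be written in C, which may be easier to review than the assembly. This is a minimal sketch under stated assumptions: mask_avx_without_os_support is a hypothetical name, and the sketch says nothing about where such a change would sit relative to the FIPS boundary. It takes the %ecx value returned by CPUID leaf 1 and clears the AVX bit unless both the XSAVE and OSXSAVE bits are set.

```c
#include <stdint.h>

/* Bit positions in ECX of CPUID leaf 1. */
#define CPUID1_ECX_XSAVE   (1u << 26)
#define CPUID1_ECX_OSXSAVE (1u << 27)
#define CPUID1_ECX_AVX     (1u << 28)

/* Hypothetical helper: report AVX only when XSAVE is supported by the CPU
 * and enabled by the OS; otherwise clear the AVX bit so that callers
 * testing that bit see AVX as unavailable. */
uint32_t mask_avx_without_os_support(uint32_t ecx)
{
    const uint32_t need = CPUID1_ECX_XSAVE | CPUID1_ECX_OSXSAVE;

    if ((ecx & need) != need)
        ecx &= ~CPUID1_ECX_AVX;   /* clear avx support */
    return ecx;
}
```

Applied to the %ecx values from comment 17, the failing environment's 0x77fefbff becomes 0x67fefbff (AVX cleared), while the working environment's 0x7ffefbff is returned unchanged.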
Comment 22 Elio Maldonado Batiz 2016-02-03 10:10:57 EST
(In reply to David Zambonini from comment #21)
We may be able to use the originally proposed patch as long as the customer is not concerned with preserving FIPS validation and can upgrade beyond nss-softokn-3.14.3-22.el6_6, which is the version to be validated. I'll have more to say later.
Comment 27 Kevin Stange 2016-06-09 13:27:15 EDT
We're starting to see this issue now that 6.8 has been released with NSS 3.21, which includes a GCM cipher suite.  Previously, with NSS 3.19, the issue did not occur for us.

It appears that NSS fixed this issue in 3.15.4 (which is after the 3.14 version shipped for softokn-freebl), using a patch similar to David's.


I'm guessing that 7.2 doesn't have this issue because softokn-freebl is 3.16.

What's the "correct" way to fix this?  The environment variable workaround is not ideal because it's hard to make sure it's set for every process.  Maintaining our own softokn-freebl package is not really ideal either.
Comment 28 Kai Engert (:kaie) 2016-06-28 10:55:16 EDT
According to Bob Relyea, this is a duplicate of bug 1335280; a fix should become available with nss-softokn-3.14.3-23.3.el6_8.

*** This bug has been marked as a duplicate of bug 1335280 ***
