Bug 1249426 - curl crashes with "Illegal instruction (core dumped)" at intel_aes_gcmINIT()
Summary: curl crashes with "Illegal instruction (core dumped)" at intel_aes_gcmINIT()
Keywords:
Status: CLOSED DUPLICATE of bug 1335280
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: nss-softokn
Version: 6.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: nss-nspr-maint
QA Contact: BaseOS QE Security Team
URL:
Whiteboard:
Depends On:
Blocks: 1269194
 
Reported: 2015-08-02 21:50 UTC by Yoshifumi Kinoshita
Modified: 2019-12-16 04:51 UTC (History)
CC List: 21 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-06-28 14:55:16 UTC
Target Upstream Version:
Embargoed:


Attachments
CPU flag test (807 bytes, text/plain), 2016-02-02 11:14 UTC, David Zambonini
Suggested Fix (1.17 KB, patch), 2016-02-02 11:15 UTC, David Zambonini


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1335280 1 None None None 2021-01-20 06:05:38 UTC

Internal Links: 1335280

Comment 2 Kamil Dudka 2015-08-03 08:34:36 UTC
The crypto algorithm in question is implemented by nss-softokn, so I am changing the component accordingly.  Please attach the contents of /proc/cpuinfo from the machine where it crashes.

Comment 4 Kamil Dudka 2015-08-03 16:27:58 UTC
The code in aes_InitContext() mistakenly detected the 'avx' flag on a CPU that does not implement it.  The detection needs to be improved such that it does not enable accelerated GCM in this case.

Please try to use the following command as a workaround:

export NSS_DISABLE_HW_GCM=1

Comment 5 Yoshifumi Kinoshita 2015-08-03 16:48:56 UTC
The user verified the workaround works on their system.

Comment 16 Joe Wright 2016-01-26 15:01:15 UTC
Is there any way at all to fix this without breaking FIPS?

Comment 17 David Zambonini 2016-02-02 11:13:31 UTC
I've experienced this issue in a paravirtualised environment (Virtuozzo) running on a Haswell processor, which most certainly does have AVX2 support. Although this is a non-standard kernel/environment, you may want to test against this to determine whether this is also the problem facing the stock OS.

# curl https://google.com
Illegal instruction

# NSS_DISABLE_HW_GCM=1 curl https://google.com
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
etc.

The program terminates with:

SIGILL {si_signo=SIGILL, si_code=ILL_ILLOPN, si_addr=0x2ae0fd11fd60}

So (in my case) the problem was not in the instruction issued, but with the operand.

A trace reveals:

#0  intel_aes_gcmINIT () at intel-gcm.s:71
71          vmovdqu      16*0(KS), T

Where KS is %rsi and T is %xmm0
%rsi was aligned (0x15966600)

Therefore my guess would be that the problem is not with issuing AVX instructions, but with failing to detect whether the OS is saving extended processor state.

A quick rough and ready test appears to verify this (in my case at least, cpuid.c):

Failed environment (EL6, but on older kernel):
# ./cpuid
%eax    = 0x000306f2
%ebx    = 0x02100800
%ecx    = 0x77fefbff
%edx    = 0xbfebfbff
aes     = 1
clmul   = 1
avx     = 1
xsave   = 1
osxsave = 0

Successful environment (EL6, identical processor):
# ./cpuid
%eax    = 0x000306f2
%ebx    = 0x13100800
%ecx    = 0x7ffefbff
%edx    = 0xbfebfbff
aes     = 1
clmul   = 1
avx     = 1
xsave   = 1
osxsave = 1

I'd suggest testing against both xsave and osxsave in nss-softokn (patch supplied) and see if this solves the problem.
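
The two cpuid outputs above differ only in the osxsave bit, which can be checked mechanically. The following is a minimal sketch in C, not the attached patch: the function names are hypothetical, the bit positions are those of CPUID leaf 1 %ecx, and it assumes that hardware GCM depends on the aes, clmul and avx flags shown in the test output. A complete check would also read XCR0 with XGETBV to confirm that the OS has enabled XMM and YMM state; that step is left out here.

```c
#include <stdint.h>

/* CPUID leaf 1, %ecx feature bits. */
#define ECX_CLMUL   (1u << 1)
#define ECX_AES     (1u << 25)
#define ECX_XSAVE   (1u << 26)
#define ECX_OSXSAVE (1u << 27)
#define ECX_AVX     (1u << 28)

/* Assumed pre-fix behaviour: trust the CPU feature bits alone. */
static int hw_gcm_naive(uint32_t ecx)
{
    const uint32_t required = ECX_AES | ECX_CLMUL | ECX_AVX;
    return (ecx & required) == required;
}

/* Suggested behaviour: also require that XSAVE is supported by the CPU
 * and enabled by the OS before AVX instructions are used. */
static int hw_gcm_usable(uint32_t ecx)
{
    const uint32_t required = ECX_AES | ECX_CLMUL | ECX_AVX |
                              ECX_XSAVE | ECX_OSXSAVE;
    return (ecx & required) == required;
}
```

With the %ecx values reported above, hw_gcm_naive() accepts both 0x77fefbff (failed environment) and 0x7ffefbff (successful environment), while hw_gcm_usable() rejects the first and accepts the second.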

Comment 18 David Zambonini 2016-02-02 11:14:46 UTC
Created attachment 1120375
CPU flag test

Simple rough and ready test based on freebl tests.

Comment 19 David Zambonini 2016-02-02 11:15:37 UTC
Created attachment 1120376
Suggested Fix

"Works for me" fix by adding XSAVE and OS XSAVE tests before enabling GCM.

Comment 20 Elio Maldonado Batiz 2016-02-02 23:27:22 UTC
I'm afraid the fix would break FIPS because it patches nss-softokn-3.14.3/mozilla/security/nss/lib/freebl/rijndael.c which is inside the crypto boundary.

Comment 21 David Zambonini 2016-02-03 01:06:12 UTC
Sorry, I blithely walked into that and stated the obvious without considering FIPS. While it isn't a consideration for me, I understand why this presents a problem for you. 

If /security/nss/lib/freebl/mpi/mpcpucache* is outside of the FIPS boundary, then altering the result of freebl_cpuid by effectively ANDing the AVX bit in %ecx/%rcx against the XSAVE and OSXSAVE bits before returning (sorry, my assembly is a little rusty), something like:

bt $26, %ecx     # test for xsave support
jnc failavx
bt $27, %ecx     # test for osxsave support
jc out
failavx:
btr $28, %ecx    # clear avx support
out:

would appear to work while not giving any side-effects elsewhere in the code. If it's inside the boundary, then I can't see any way forward, at least for correct detection while maintaining FIPS validation.
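
For readers who prefer C to assembly, the same masking idea can be sketched as below. This is an illustration only, under the assumption that the caller receives the raw %ecx value of CPUID leaf 1; mask_unusable_avx is a hypothetical name and does not exist in nss-softokn.

```c
#include <stdint.h>

/* CPUID leaf 1, %ecx feature bits. */
#define ECX_XSAVE   (1u << 26)
#define ECX_OSXSAVE (1u << 27)
#define ECX_AVX     (1u << 28)

/* Clear the AVX bit unless both XSAVE and OSXSAVE are set, so that
 * callers testing only the AVX bit see it as unavailable. */
static uint32_t mask_unusable_avx(uint32_t ecx)
{
    const uint32_t os_state = ECX_XSAVE | ECX_OSXSAVE;

    if ((ecx & os_state) != os_state)
        ecx &= ~ECX_AVX;
    return ecx;
}
```

For the failed environment's value 0x77fefbff this returns 0x67fefbff (AVX cleared); the successful environment's 0x7ffefbff is returned unchanged.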

Comment 22 Elio Maldonado Batiz 2016-02-03 15:10:57 UTC
(In reply to David Zambonini from comment #21)
We may be able to use the originally proposed patch as long as the customer is not concerned with preserving FIPS validation and can upgrade beyond nss-softokn-3.14.3-22.el6_6, which is the version to be validated. I'll have more to say later.

Comment 27 Kevin Stange 2016-06-09 17:27:15 UTC
We're starting to see this issue now that 6.8 has been released with NSS 3.21, which includes a GCM cipher suite.  Previously, with NSS 3.19, the issue did not occur for us.

It appears that NSS fixed this issue in 3.15.4 (which is after the 3.14 version shipped for softokn-freebl), using a patch similar to David's.

https://bugzilla.mozilla.org/show_bug.cgi?id=940794
https://hg.mozilla.org/projects/nss/rev/edda2ba82d22

I'm guessing that 7.2 doesn't have this issue because softokn-freebl is 3.16.

What's the "correct" way to fix this?  The environment variable workaround is not ideal because it's hard to make sure it's set for every process.  Maintaining our own softokn-freebl package is not really ideal either.

Comment 28 Kai Engert (:kaie) (inactive account) 2016-06-28 14:55:16 UTC
According to Bob Relyea, this is a duplicate of bug 1335280; a fix should become available with nss-softokn-3.14.3-23.3.el6_8.

*** This bug has been marked as a duplicate of bug 1335280 ***

