Bug 693863

Summary: Backport OpenSSL CHIL Engine fixes
Product: Red Hat Enterprise Linux 6
Reporter: Tomas Mraz <tmraz>
Component: openssl
Assignee: Tomas Mraz <tmraz>
Status: CLOSED ERRATA
QA Contact: BaseOS QE Security Team <qe-baseos-security>
Severity: medium
Priority: medium
Version: 6.2
CC: mjc, mvadkert, pvrabec, sander
Target Milestone: rc
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openssl-1.0.0-15.el6
Doc Type: Bug Fix
Clone Of: 671484
Last Closed: 2011-12-06 18:08:27 UTC

Description Tomas Mraz 2011-04-05 18:57:56 UTC
+++ This bug was initially created as a clone of Bug #671484 +++

Description of problem:

The CHIL Engine, used to access Thales/nCipher hardware, requires that thread locking upcalls be set regardless of whether the calling program is multithreaded.  This is discussed in the following OpenSSL ticket:

http://rt.openssl.org/Ticket/Display.html?id=1736

This issue was resolved by the following OpenSSL commit:

http://cvs.openssl.org/filediff?f=openssl/engines/e_chil.c&v1=1.1.2.5&v2=1.1.2.6

which was part of 0.9.8j.  

Version-Release number of selected component (if applicable):

Any OpenSSL before 0.9.8j is affected. Any Red Hat Enterprise Linux 5 release is affected.  Tests conducted with OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008, and openssl-0.9.8e-12.el5_5.7 on RHEL 5.5. 

How reproducible:

Attempt to load the CHIL engine into the Red Hat supplied copy of OpenSSL.

Steps to Reproduce:
1. Install nCipher hardware.
2. Install nCipher software.
3. Set and export LD_LIBRARY_PATH=/opt/nfast/toolkits/hwcrhk
4. Call /usr/bin/openssl engine -vvvv -tt -c chil
  
Actual results:

(chil) CHIL hardware engine support
[RSA, DH, RAND]
     [ unavailable ]
13205:error:80067072:CHIL engine:HWCRHK_INIT:locking missing:e_chil.c:594:You HAVE to add dynamic locking callbacks via CRYPTO_set_dynlock_{create,lock,destroy}_callback()
     SO_PATH: Specifies the path to the 'hwcrhk' shared library
          (input flags): STRING
     FORK_CHECK: Turns fork() checking on or off (boolean)
          (input flags): NUMERIC
     THREAD_LOCKING: Turns thread-safe locking on or off (boolean)
         (input flags): NUMERIC

Expected results:

(chil) CHIL hardware engine support
[RSA, DH, RAND]
     [ available ]
     SO_PATH: Specifies the path to the 'hwcrhk' shared library
          (input flags): STRING
     FORK_CHECK: Turns fork() checking on or off (boolean)
          (input flags): NUMERIC
     THREAD_LOCKING: Turns thread-safe locking on or off (boolean)
         (input flags): NUMERIC
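The difference between the actual and expected results comes down to the "[ available ]" / "[ unavailable ]" marker printed by -tt. As a minimal sketch (the function name engine_status and the sample strings are illustrative assumptions, not part of OpenSSL), a test script could check captured output like this:

```python
import re


def engine_status(output: str) -> dict:
    """Map engine id -> True/False from `openssl engine -tt` style output."""
    status = {}
    current = None
    for line in output.splitlines():
        # Engine header lines start in column 0, e.g. "(chil) CHIL hardware ..."
        header = re.match(r"^\((\w+)\)\s", line)
        if header:
            current = header.group(1)
            continue
        # The availability marker is printed on a line of its own.
        marker = re.match(r"^\s*\[ (available|unavailable) \]\s*$", line)
        if marker and current is not None:
            status[current] = marker.group(1) == "available"
    return status


# Abbreviated copies of the output shown in this report.
BROKEN = """(chil) CHIL hardware engine support
[RSA, DH, RAND]
     [ unavailable ]
     SO_PATH: Specifies the path to the 'hwcrhk' shared library
          (input flags): STRING
"""
WORKING = """(chil) CHIL hardware engine support
[RSA, DH, RAND]
     [ available ]
     SO_PATH: Specifies the path to the 'hwcrhk' shared library
          (input flags): STRING
"""

print(engine_status(BROKEN), engine_status(WORKING))
```

The parser only looks at the header and marker lines, so the error message and the control-command listing are ignored.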

Additional info:

If a multithreaded program calls OpenSSL and loads the CHIL Engine without setting those callbacks, unexpected behavior may occur.  It is my opinion that in this case, you get what you deserve.  

Apache 2.2 was updated to set the right upcalls around the time the OpenSSL RT issue was discussed, and this fix was already backported by Red Hat.

Description of problem:

The CHIL Engine (used to access Thales/nCipher cryptographic hardware) sets an
ex_data table entry in OpenSSL, with a function pointer to the cleanup function
located in the CHIL engine binary.

When a calling program unloads the Engine, the cleanup function pointer is not
cleared.  When the calling program loads the Engine again, a second function
pointer is added to the ex_data cleanup stack, leaving the first one in place
but likely pointing to invalid memory.  This crashes the calling program as
soon as it attempts to clean up the ex_data  entry.  
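
The load/unload/reload sequence described above can be modelled in a few lines. This is only a toy simulation under stated assumptions: EngineImage, run_cycle and the plain list standing in for the ex_data cleanup stack are hypothetical names, and the "deregister on unload" branch is one plausible remedy, not a description of what the upstream patch actually does.

```python
class EngineImage:
    """One mapping of the engine library; its functions vanish on unload."""

    def __init__(self):
        self.mapped = True

    def cleanup(self):
        if not self.mapped:
            # In C this would be a jump through a pointer into unmapped memory.
            raise RuntimeError("call through stale cleanup pointer")
        return "cleaned"


def run_cycle(deregister_on_unload: bool) -> list:
    """Load, unload and reload the engine, then run every registered cleanup."""
    cleanup_stack = []  # hypothetical stand-in for the ex_data cleanup stack

    first = EngineImage()
    cleanup_stack.append(first.cleanup)       # first load registers a cleanup
    if deregister_on_unload:
        cleanup_stack.remove(first.cleanup)   # assumed remedy: clear on unload
    first.mapped = False                      # unload: code is no longer mapped

    second = EngineImage()
    cleanup_stack.append(second.cleanup)      # reload registers a second entry

    return [fn() for fn in cleanup_stack]     # cleanup walks the whole stack


print(run_cycle(deregister_on_unload=True))
try:
    run_cycle(deregister_on_unload=False)
except RuntimeError as exc:
    print("crash analogue:", exc)
```

Without deregistration the stack holds two entries and the first one refers to the unloaded image, which is the analogue of the segfault seen in httpd.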


Version-Release number of selected component (if applicable):

Any OpenSSL 0.9.8 RPM.
Any OpenSSL version from openssl.org

How reproducible:

Apache 2.2.x, as supplied by Red Hat or downloaded from apache.org, loads the
library twice and triggers this issue whenever the Engine library text does not
land in the same memory location the second time around, which is most of the
time.

Steps to Reproduce:
1. Install Thales nCipher HSM
2. Install nCSS Software
3. Add /opt/nfast/toolkits/hwcrhk to /etc/ld.so.conf.d/nfast, run ldconfig -l
(if I recall)
4. Edit /etc/httpd/conf/httpd.conf (if I recall) to add SSLCryptoDevice chil
anywhere
5. /etc/init.d/httpd start.  

Actual results:

httpd server segfaults.

Expected results:

httpd server does not segfault.

Additional info:

This patch to the openssl upstream fixes the issue:

http://cvs.openssl.org/filediff?f=openssl/engines/e_chil.c&v1=1.1.2.9&v2=1.1.2.10

Comment 4 Miroslav Vadkerti 2011-08-10 08:39:32 UTC
Great, I have a NetHSM connected to my RHEL6 machine and am able to verify this issue.

Comment 5 Miroslav Vadkerti 2011-08-10 08:59:51 UTC
But I cannot reproduce the issue on the latest openssl (which should NOT contain the fix AFAIK). Tomas, is it possible that this bug is NOTABUG on RHEL6? It works even on openssl-1.0.0-1.el6.

# rpm -q openssl
openssl-1.0.0-10.el6_1.3.x86_64

# ldconfig -p | grep nfast
	libnfhwcrhk.so (libc6,x86-64) => /opt/nfast/toolkits/hwcrhk/libnfhwcrhk.so

# /usr/bin/openssl engine -vvvv -tt -c chil
(chil) CHIL hardware engine support
 [RSA, DH, RAND]
     [ available ]
     SO_PATH: Specifies the path to the 'hwcrhk' shared library
          (input flags): STRING
     FORK_CHECK: Turns fork() checking on (non-zero) or off (zero)
          (input flags): NUMERIC
     THREAD_LOCKING: Turns thread-safe locking on (zero) or off (non-zero)
          (input flags): NUMERIC
     SET_USER_INTERFACE: Set the global user interface (internal)
          (input flags): [Internal] 
     SET_CALLBACK_DATA: Set the global user interface extra data (internal)
          (input flags): [Internal]

Comment 6 Sander Temme 2011-08-11 06:35:04 UTC
(In reply to comment #5)
> But I cannot reproduce the issue on the latest openssl (which should NOT
> contain the fix AFAIK). Tomas is it possible that this bug is NOTABUG on RHEL6?
> It works even on openssl-1.0.0-1.el6.
> 
> # rpm -q openssl
> openssl-1.0.0-10.el6_1.3.x86_64

The thread locking upcall fix was applied to OpenSSL before 1.0.0 was forked, so all versions have it.  

Could you please see if the second portion of the report, causing Apache to crash, is still an issue with this OpenSSL version?  

I don't have EL6 so I can't verify this.

Comment 7 Miroslav Vadkerti 2011-08-11 07:12:45 UTC
Sure, it works for me without issue on RHEL6.

To be precise, after adding "SSLCryptoDevice chil" to /etc/httpd/conf.d/ssl.conf, httpd restarts OK.

Sander, do you know how to test the performance of the chil engine? I tried openssl speed with -engine chil (with and without -evp) and I cannot see that the engine brings any performance gain.

Comment 8 Sander Temme 2011-08-11 14:47:05 UTC
Miroslav, please ensure that ssl.conf is in fact included in the main config by making an HTTPS connection to the server. 

Please note that the CHIL engine only registers for RSA exponentiation.  Any other algorithms requested by the calling program will be handled by the OpenSSL software.  When running openssl speed, add -elapsed to the options so that it uses the wall clock time instead of CPU cycles to calculate performance: since the host CPU is barely active when running against the module, the output is strongly skewed.  Also, to get maximum benefit from the module, use -multi X to spawn multiple processes.  Choosing 2 < X < 20 should max out the module.  
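
Why the clock choice matters can be shown with a small sketch. It is not how openssl speed is implemented; measure and offloaded_op are hypothetical names, and time.sleep stands in for a process waiting on the module while the host CPU is idle.

```python
import time


def measure(op, count):
    """Run op() count times; return (wall_seconds, cpu_seconds) consumed."""
    wall_start = time.perf_counter()   # wall clock
    cpu_start = time.process_time()    # CPU time of this process only
    for _ in range(count):
        op()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu


def offloaded_op():
    # Stand-in for an operation done by external hardware: the calling
    # process just waits, so almost no host CPU time is charged to it.
    time.sleep(0.01)


wall, cpu = measure(offloaded_op, 20)
print(f"wall: {wall:.3f}s  cpu: {cpu:.3f}s")
print(f"ops per wall-clock second: {20 / wall:.1f}")
```

Dividing the operation count by the CPU time instead would give a wildly inflated rate, because the waiting time is not counted at all.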

You can check the load on the module by periodically running /opt/nfast/bin/stattree PerModule 1 ModuleJobStats.  Look at the CPULoadPercent to see whether the module is operating at capacity (I get mine up to 94% with 16 processes).  

Even with multiple processes, you'll find the module slower than your host computer.  This type of device used to be known as a crypto-accelerator, but computers are so much faster now that they will beat most of our modules in a head-to-head speed test.  The discrepancy will be smaller for larger key sizes (2048, 4096).  You'll also get more performance benefit when the host computer is supposed to be doing something else like, say, running a web app.

Comment 9 Miroslav Vadkerti 2011-08-12 12:01:29 UTC
(In reply to comment #8)
> Miroslav, please ensure that ssl.conf is in fact included in the main config by
> making an HTTPS connection to the server.

ssl.conf is included by default when the mod_ssl package is installed on RHEL systems. The HTTPS connection is successful; I tested it just now.
 
> 
> Please note that the CHIL engine only registers for RSA exponentiation.  Any
> other algorithms requested by the calling program will be handled by the
> OpenSSL software.  When running openssl speed, add -elapsed to the options so
> that it uses the wall clock time instead of CPU cycles to calculate
> performance: since the host CPU is barely active when running against the
> module, the output is strongly skewed.  Also, to get maximum benefit from the
> module, use -multi X to spawn multiple processes.  Choosing 2 < X < 20 should
> max out the module.  

I'm using a netHSM 6000 connected via the network. I do not have a PCI card available, which may explain the almost constant sign/verify times below.

With chil engine:
# openssl speed -engine chil -elapsed -multi 18 rsa
                  sign    verify    sign/s verify/s
rsa  512 bits 0.012221s 0.011017s     81.8     90.8
rsa 1024 bits 0.012137s 0.011310s     82.4     88.4
rsa 2048 bits 0.011060s 0.010911s     90.4     91.7
rsa 4096 bits 0.011680s 0.036412s     85.6     27.5

W/o chil engine:
# openssl speed -elapsed -multi 18 rsa
                  sign    verify    sign/s verify/s
rsa  512 bits 0.000091s 0.000008s  10973.7 129813.0
rsa 1024 bits 0.000450s 0.000023s   2222.6  43124.6
rsa 2048 bits 0.002741s 0.000080s    364.8  12423.1
rsa 4096 bits 0.019434s 0.000303s     51.5   3301.6
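
To compare the two runs directly, the sign/s columns can be divided row by row. The sketch below does only that arithmetic on the numbers pasted above; parse_speed and the variable names are illustrative.

```python
def parse_speed(table: str) -> dict:
    """Return {key_bits: (sign_per_s, verify_per_s)} from `openssl speed rsa` rows."""
    rates = {}
    for line in table.splitlines():
        fields = line.split()
        # Data rows look like: rsa 512 bits 0.012221s 0.011017s 81.8 90.8
        if len(fields) == 7 and fields[0] == "rsa" and fields[2] == "bits":
            rates[int(fields[1])] = (float(fields[5]), float(fields[6]))
    return rates


WITH_CHIL = """
                  sign    verify    sign/s verify/s
rsa  512 bits 0.012221s 0.011017s     81.8     90.8
rsa 1024 bits 0.012137s 0.011310s     82.4     88.4
rsa 2048 bits 0.011060s 0.010911s     90.4     91.7
rsa 4096 bits 0.011680s 0.036412s     85.6     27.5
"""
WITHOUT_CHIL = """
                  sign    verify    sign/s verify/s
rsa  512 bits 0.000091s 0.000008s  10973.7 129813.0
rsa 1024 bits 0.000450s 0.000023s   2222.6  43124.6
rsa 2048 bits 0.002741s 0.000080s    364.8  12423.1
rsa 4096 bits 0.019434s 0.000303s     51.5   3301.6
"""

chil = parse_speed(WITH_CHIL)
soft = parse_speed(WITHOUT_CHIL)
# Ratio > 1 means the engine run signed faster than the software run.
sign_ratio = {bits: chil[bits][0] / soft[bits][0] for bits in chil}
for bits, ratio in sorted(sign_ratio.items()):
    print(f"rsa {bits:4d}: engine/software sign rate = {ratio:.3f}")
```

In these numbers the ratio exceeds 1 only for 4096-bit signing (about 1.66), and it rises steadily with key size.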

> 
> You can check the load on the module by periodically running
> /opt/nfast/bin/stattree PerModule 1 ModuleJobStats.  Look at the CPULoadPercent
> to see whether the module is operating at capacity (I get mine up to 94% with
> 16 processes).  

With 18 processes I see a CPULoadPercent of only 3-4%, though this may be expected as I'm using a netHSM:

# /opt/nfast/bin/stattree PerModule 1 ModuleJobStats
+#PerModule:
   +#1:
      +#ModuleJobStats:
         -CmdCount             11980925
         -ReplyCount           11980923
         -CmdBytes             3275495060
         -ReplyBytes           2122423056
         -HostWriteCount       11336366
         -HostWriteErrors      0
         -HostReadCount        22358238
         -HostReadErrors       0
         -HostReadEmpty        0
         -HostReadDeferred     10760961
         -HostReadTerminated   0
         -PFNIssued            267636
         -PFNRejected          0
         -PFNCompleted         267635
         -ANIssued             11
         -CPULoadPercent       3
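
For periodic checks, the counters can be pulled out of the captured text with a few lines of parsing. This is a rough sketch that assumes the output keeps the "-Name value" layout shown above; parse_stattree and the abbreviated SAMPLE are illustrative, not part of the nCipher tooling.

```python
def parse_stattree(text: str) -> dict:
    """Collect '-Name value' leaf lines of stattree output into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("-"):           # leaf entries; '+#...' lines are nodes
            parts = line[1:].split()
            if len(parts) == 2:
                stats[parts[0]] = int(parts[1])
    return stats


# Abbreviated copy of the output shown above.
SAMPLE = """
+#PerModule:
   +#1:
      +#ModuleJobStats:
         -CmdCount             11980925
         -ReplyCount           11980923
         -HostWriteErrors      0
         -CPULoadPercent       3
"""

stats = parse_stattree(SAMPLE)
print("module CPU load:", stats["CPULoadPercent"], "%")
```

A wrapper could run the command at intervals and log CPULoadPercent while openssl speed is running.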

> 
> Even with multiple processes, you'll find the module slower than your host
> computer.  This type of device used to be known as a crypto-accelerator, but
> computers are so much faster now that they will beat most of our modules in a
> head-to-head speed test.  The discrepancy will be smaller for larger key sizes
> (2048, 4096).  You'll also get more performance benefit when the host computer
> is supposed to be doing something else like, say, run a web app.

Thanks for your kind answers and explanation!

Comment 10 Tomas Mraz 2011-08-15 08:23:36 UTC
Please disregard the first part of the bug description up to the second "Description of problem".

Comment 13 errata-xmlrpc 2011-12-06 18:08:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1730.html