Bug 677700

Summary: Lack of interlock between nss processes to pcache causes httpd failure
Product: Red Hat Enterprise Linux 6 Reporter: Martin Poole <mpoole>
Component: mod_nssAssignee: Rob Crittenden <rcritten>
Status: CLOSED ERRATA QA Contact: Chandrasekar Kannan <ckannan>
Severity: high Docs Contact:
Priority: high    
Version: 6.0CC: benl, dpal, jgalipea, kchamart, shaines
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: mod_nss-1.0.8-11.el6 Doc Type: Bug Fix
Doc Text:
During the Apache HTTP Server startup, a race condition could prevent one or more child processes from receiving the token PIN, rendering such processes unable to use SSL. With this update, the race condition no longer occurs, and all child processes of the Apache HTTP Server can enable SSL as expected.
Story Points: ---
Clone Of: 677698
: 677701 (view as bug list) Environment:
Last Closed: 2011-05-19 14:15:08 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 677698    
Bug Blocks: 677701    

Description Martin Poole 2011-02-15 16:00:46 UTC
+++ This bug was initially created as a clone of Bug #677698 +++

Description of problem:

with the release of 1.0.8 there is no lock mechanism to serialise httpd process access to the nss_pcache process. This can result in multiple requests being read as a single request by nss_pcache, a single result returned and callers either getting FAIL result or erroring out.


Version-Release number of selected component (if applicable):

1.0.8-3

Additional info:

filtered strace indicating failure mode

9932  write(14, "RETR\tNSS FIPS 140-2 Certificate DB", 34 <unfinished ...>
9928  <... munmap resumed> )            = 0
9932  <... write resumed> )             = 34
9928  write(14, "RETR\tNSS FIPS 140-2 Certificate DB", 34 <unfinished ...>
9922  <... poll resumed> )              = 1 ([{fd=0, revents=POLLIN}])
9932  read(15,  <unfinished ...>
9928  <... write resumed> )             = 34
9932  <... read resumed> 0x7fff97e88940, 1024) = -1 EAGAIN (Resource temporarily unavailable)
9928  read(15,  <unfinished ...>
9922  read(0,  <unfinished ...>
9932  poll([{fd=15, events=POLLIN}], 1, 30000 <unfinished ...>
9928  <... read resumed> 0x7fff97e88940, 1024) = -1 EAGAIN (Resource temporarily unavailable)
9928  poll([{fd=15, events=POLLIN}], 1, 30000 <unfinished ...>
9922  <... read resumed> "RETR\tNSS FIPS 140-2 Certificate DBRETR\tNSS FIPS 140-2 Certificat"..., 1024) = 68
9922  write(1, "\0", 1 <unfinished ...>
9932  <... poll resumed> )              = 1 ([{fd=15, revents=POLLIN}])
9928  <... poll resumed> )              = 1 ([{fd=15, revents=POLLIN}])
9922  <... write resumed> )             = 1
9932  read(15,  <unfinished ...>
9928  read(15,  <unfinished ...>
9932  <... read resumed> "\0", 1024)    = 1
9928  <... read resumed> 0x7fff97e88940, 1024) = -1 EAGAIN (Resource temporarily unavailable)
9922  poll([{fd=0, events=POLLIN|POLLPRI}], 1, -1 <unfinished ...>
9932  gettimeofday( <unfinished ...>
9928  gettimeofday( <unfinished ...>
9932  <... gettimeofday resumed> {1295949534, 353375}, NULL) = 0
9928  <... gettimeofday resumed> {1295949534, 353465}, NULL) = 0
9932  gettimeofday( <unfinished ...>
9928  write(2, "[Tue Jan 25 15:28:54 2011] [error] Unable to read from pin store"..., 117 <unfinished ...>

Comment 2 Rob Crittenden 2011-02-21 21:03:36 UTC
*** Bug 678681 has been marked as a duplicate of this bug. ***

Comment 3 Rob Crittenden 2011-03-02 20:35:36 UTC
I had some problems reliably reproducing the error. I ended up doing this:

# while [ true ]; do service httpd restart; sleep 5; done

And in another window: tail -f /var/log/httpd/error_log | grep 70007

With the patch in place I let this run an hour and never saw an occurrence.

I also ran this to ensure all the children initialized properly:

# /usr/lib64/nss/unsupported-tools/strsclnt -p 8443 -d /etc/httpd/alias -P 90 test.example.com -c 10000 -t 100

There were no errors about SSL being disabled in the error log.

Comment 6 Jaromir Hradilek 2011-04-20 14:52:03 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
During the Apache HTTP Server startup, a race condition could prevent one or more child processes from receiving the token PIN, rendering such processes unable to use SSL. With this update, the race condition no longer occurs, and all child processes of the Apache HTTP Server can enable SSL as expected.

Comment 7 Kashyap Chamarthy 2011-04-25 10:53:34 UTC
when I run the never ending while loop for httpd restart for a while, 
I notice these in the httpd/error_log .. can these be ignored?
----------------------------------------------
[Mon Apr 25 00:36:48 2011] [warn] mod_wsgi: Runtime using Python/2.6.6.
[Mon Apr 25 00:36:48 2011] [notice] Apache/2.2.15 (Unix) DAV/2 mod_nss/2.2.15 NSS/3.12.9.0 mod_ssl/2.2.15 OpenSSL/1.0.0-fips mod_wsgi/3.2 Python/2.6.6 mod_perl/2.0.4 Perl/v5.10.1 configured -- resuming normal operations
[Mon Apr 25 00:36:53 2011] [error] Exception KeyError: KeyError(139642094028768,) in <module 'threading' from '/usr/lib64/python2.6/threading.pyc'> ignored
[Mon Apr 25 00:36:53 2011] [error] Exception KeyError: KeyError(139642094028768,) in <module 'threading' from '/usr/lib64/python2.6/threading.pyc'> ignored
[Mon Apr 25 00:36:53 2011] [error] Exception KeyError: KeyError(139642094028768,) in <module 'threading' from '/usr/lib64/python2.6/threading.pyc'> ignored
[Mon Apr 25 00:36:53 2011] [error] Exception KeyError: KeyError(139642094028768,) in <module 'threading' from '/usr/lib64/python2.6/threading.pyc'> ignored
[Mon Apr 25 00:36:53 2011] [error] Exception KeyError: KeyError(139642094028768,) in <module 'threading' from '/usr/lib64/python2.6/threading.pyc'> ignored
----------------------------------------------

Comment 8 Rob Crittenden 2011-04-25 13:34:59 UTC
Probably, those aren't being generated by mod_nss directly in any case. Do you have a mod_wsgi application defined?

Comment 9 Kashyap Chamarthy 2011-04-25 14:43:22 UTC
This is a fairly stock RHEL 6.1. No other apps are defined.

Comment 10 Rob Crittenden 2011-04-25 15:00:37 UTC
Ok, in any case these are python errors so probably unrelated to mod_nss.

Comment 11 Kashyap Chamarthy 2011-04-29 11:22:28 UTC
Ok, I ignored the above errors. Other than that, I noticed no errors in the httpd error_log.

As an aside, I tried this:

1. Placed this directive -- 'NSSFIPS on'  in /etc/httpd/conf.d/nss.conf
2. Set a password for mod_nss dbs in -- /etc/httpd/alias/
   # certutil -W /etc/httpd/alias/
3. service httpd restart

Result:
==================================================================
[root@shark alias]# service httpd restart
Stopping httpd:                                            [  OK  ]
Starting httpd: Please enter password for "NSS FIPS 140-2 Certificate DB" token:
                                                           [  OK  ]
==================================================================
[root@shark alias]# tail /var/log/httpd/error_log
[Fri Apr 29 01:33:22 2011] [warn] Cipher rsa_rc4_128_md5 is enabled but this is not a FIPS cipher, disabling.
[Fri Apr 29 01:33:22 2011] [warn] Cipher rsa_rc4_128_sha is enabled but this is not a FIPS cipher, disabling.
[Fri Apr 29 01:33:22 2011] [warn] Cipher rsa_rc4_128_md5 is enabled but this is not a FIPS cipher, disabling.
[Fri Apr 29 01:33:22 2011] [warn] Cipher rsa_rc4_128_sha is enabled but this is not a FIPS cipher, disabling.
[Fri Apr 29 01:33:22 2011] [warn] Cipher rsa_rc4_128_md5 is enabled but this is not a FIPS cipher, disabling.
[Fri Apr 29 01:33:22 2011] [warn] Cipher rsa_rc4_128_sha is enabled but this is not a FIPS cipher, disabling.
[Fri Apr 29 01:33:22 2011] [warn] Cipher rsa_rc4_128_md5 is enabled but this is not a FIPS cipher, disabling.
[Fri Apr 29 01:33:22 2011] [warn] Cipher rsa_rc4_128_sha is enabled but this is not a FIPS cipher, disabling.
[Fri Apr 29 01:33:22 2011] [warn] Cipher rsa_rc4_128_md5 is enabled but this is not a FIPS cipher, disabling.
[Fri Apr 29 01:33:22 2011] [warn] Cipher rsa_rc4_128_sha is enabled but this is not a FIPS cipher, disabling.
[root@shark alias]# 
==================================================================

 
Also ran 'strsclnt' tool as mentioned by Rob.
##################################################
[root@shark alias]# /usr/lib64/nss/unsupported-tools/strsclnt -p 8443 -d /etc/httpd/alias -P 90 shark.lab.eng.pnq.redhat.com -c 10000 -t 100
strsclnt: -- SSL: Server Certificate Validated.
.
.
.
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: 1000 cache hits; 9000 cache misses, 0 cache not reusable
          0 stateless resumes
##################################################

Rob, does this makes sense?

Comment 12 Kashyap Chamarthy 2011-04-29 18:00:21 UTC
Rob confirmed the above is correct on IRC (#secinterop)

Turning it to VERIFIED.

Version info:
===================================
[root@shark ~]# rpm -q mod_nss
mod_nss-1.0.8-12.el6.x86_64
[root@shark ~]# 
===================================

Comment 13 errata-xmlrpc 2011-05-19 14:15:08 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0735.html