Bug 976572

Summary: Pick up various upstream GCM code fixes applied since nss-3.14.3 was released
Product: Red Hat Enterprise Linux 6 Reporter: Elio Maldonado Batiz <emaldona>
Component: nss-softoknAssignee: Elio Maldonado Batiz <emaldona>
Status: CLOSED ERRATA QA Contact: Hubert Kario <hkario>
Severity: high Docs Contact:
Priority: high    
Version: 6.5CC: amarecek, emaldona, eparis, hkario, kengert, ksrot, omoris, rrelyea
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: nss-softokn-3.14.3-5.el6 Doc Type: Bug Fix
Doc Text:
Cause: During upstream development post release of nss-3.14.3 various defects in the GCM implementation of GCM were identoifies and fixed for the nss-3.15.1 release. Consequence: None, customers won't be affected because GCM is a new feature. Fix: Backport the GCM fixed applied upstream for nss-3.15.1. Result: Customers sgould be able to use GCM and won't encounter any of the problems already documented and fixed upstream.
Story Points: ---
Clone Of:
: 987131 988072 (view as bug list) Environment:
Last Closed: 2013-11-21 06:15:37 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 988072, 988083    
Bug Blocks: 968473, 983766, 987131    
Attachments:
Description Flags
Patch 1 of 5
rrelyea: review+
Patch 2 of 5
rrelyea: review+
Patch 3 of 5
rrelyea: review+
Patch 4 of 5
rrelyea: review+
Patch 5 of 5 - switch to using PLArenaPool on blapitest.c
rrelyea: review+
gcm fixes - freebl part
rrelyea: review+
gcm fixes - softoken part
rrelyea: review+
Print CPU info just before the tests start none

Description Elio Maldonado Batiz 2013-06-20 20:19:14 UTC
Description of problem: GCM support was added in softoken from nss-3.14.3 but since that release several fixed have been applied upstream and we need to pick them up for the RHEL-6.5.


Version-Release number of selected component (if applicable): 3.14.3-4
How reproducible:   N/A
Steps to Reproduce: N/A
Actual results:     N/A
Expected results:   N/A

Additional info:

Some were picked up as part of the rebase but the changes need carefull review. This is new code code slated for submission for a FIPS-140 revalidation of nss-softokn.

Comment 2 Elio Maldonado Batiz 2013-06-20 21:06:13 UTC
Created attachment 763596 [details]
Patch 1 of 5

Comment 3 Elio Maldonado Batiz 2013-06-20 21:07:04 UTC
Created attachment 763597 [details]
Patch 2 of 5

Comment 4 Elio Maldonado Batiz 2013-06-20 21:08:00 UTC
Created attachment 763598 [details]
Patch 3 of 5

Comment 5 Elio Maldonado Batiz 2013-06-20 21:09:01 UTC
Created attachment 763599 [details]
Patch 4 of 5

Comment 6 Bob Relyea 2013-06-20 23:26:24 UTC
Comment on attachment 763596 [details]
Patch 1 of 5

r+ rrelyea

Comment 7 Bob Relyea 2013-06-20 23:27:53 UTC
Comment on attachment 763597 [details]
Patch 2 of 5

r+ rrelyea

Comment 8 Bob Relyea 2013-06-20 23:29:19 UTC
Comment on attachment 763598 [details]
Patch 3 of 5

r+ good, matches gcm.nits in RHEL-5

Comment 9 Bob Relyea 2013-06-20 23:30:52 UTC
Comment on attachment 763599 [details]
Patch 4 of 5

r+ rrelyea

good, matches freebl-gcm.patch in RHEL-5.

Comment 10 Bob Relyea 2013-06-20 23:31:38 UTC
yes, these are needed --- for FIPS if for no other reason: dev_ack+

Comment 11 Elio Maldonado Batiz 2013-06-20 23:53:36 UTC
Thank you for the prompt review. 

Unfortunately, further testing makes me think there is still "something wrotten in Denmark."

Just did a scratch using the source rpm downloaded from the last brew build - https://brewweb.devel.redhat.com/buildinfo?buildID=273865
wget http://download.devel.redhat.com/brewroot/packages/nss-softokn/3.14.3/4.el6/src/nss-softokn-3.14.3-4.el6.src.rpm

The scratch build failed! 
https://brewweb.devel.redhat.com/taskinfo?taskID=5932642
This is an x86_64 one but a previously one I did failed s390x,  both with a core and in the same test.

cipher.sh: #15: AES GCM (Failed in/out offset pairs:  [0:0][0:1][0:2][0:3][0:4][0:5] ----- skipped .... [7:7]) - Core file is detected - FAILED

By the way, a scratch build with those four patches disabled had no problems,
https://brewweb.devel.redhat.com/taskinfo?taskID=5932673

There is something wrong in one of them.

Comment 12 Elio Maldonado Batiz 2013-06-21 00:24:53 UTC
There is nothing wrong with any of them one ones views them one by one. The problem may be when they are applied one after the other. This is a different conext that what we had upstream in 3.15 whre this patches are being picked up. I consolidated them into one patch and the final patched code is actually different. Then I did a scratch build
https://brewweb.devel.redhat.com/taskinfo?taskID=5932776
It succeeded!

Comment 13 Elio Maldonado Batiz 2013-06-21 01:44:28 UTC
Tried scratch builds again, no changes, and it failed twice.

Comment 14 Elio Maldonado Batiz 2013-06-21 02:53:06 UTC
Exactly the same failure but this time with a RHEL-5.10 scratch build.

Comment 15 Elio Maldonado Batiz 2013-06-21 06:28:38 UTC
A bit of good news for a change. 

Since we never have any failures in fedora that have me clue that the backported patches need to be more faithfull to the original ones on rhel-3.5. So I did another experiment. Created an additional patch that makes the blapitest.c use PLArenaPool instead of PRArenaPool and I am pleased to inform that I am getting good builds mutiple times. 

It's doing the job so far but it may not be the final solution yet.

I found the following comments in nss/util/secitem.h.

/*
** This is a legacy function containing bugs. It doesn't update item->len,
** and it has other issues as described in bug 298649 and bug 298938.
** However, the function is  kept unchanged for consumers that might depend 
** on the broken behaviour. New code should call SECITEM_ReallocItemV2.
**
** Reallocate the data for the specified "item".  If "arena" is not NULL,
** then reallocate from there, otherwise reallocate from the heap.
** In the case where oldlen is 0, the data is allocated (not reallocated).
** In any case, "item" is expected to be a valid SECItem pointer;
** SECFailure is returned if it is not.  If the allocation succeeds,
** SECSuccess is returned.
*/
extern SECStatus SECITEM_ReallocItem( /* deprecated function */
				     PLArenaPool *arena, SECItem *item,
				     unsigned int oldlen, unsigned int newlen);

/*
** Reallocate the data for the specified "item".  If "arena" is not NULL,
** then reallocate from there, otherwise reallocate from the heap.
** If item->data is NULL, the data is allocated (not reallocated).
** In any case, "item" is expected to be a valid SECItem pointer;
** SECFailure is returned if it is not, and the item will remain unchanged.
** If the allocation succeeds, the item is updated and SECSuccess is returned.
 */
extern SECStatus SECITEM_ReallocItemV2(PLArenaPool *arena, SECItem *item,
				       unsigned int newlen);

wtc's comment "New code should call SECITEM_ReallocItemV2." is telling.
This is new code after all. To use it I would need to patch nss-util, a spearte package. That would need a separate bug that would make a blocker a blocker to this one. I seem to be getting closer to a solution but there is a bit more to do analysis-wise.

Comment 16 Elio Maldonado Batiz 2013-06-21 15:22:30 UTC
(In reply to Elio Maldonado Batiz from comment #14)
> Exactly the same failure but this time with a RHEL-5.10 scratch build.

After applying that extra patch, which I haven't showed show you yet, the RHEL-5.10 scratch builds are now working.

Comment 17 Elio Maldonado Batiz 2013-06-21 15:38:41 UTC
Created attachment 763882 [details]
Patch 5 of 5 - switch to using PLArenaPool on blapitest.c

That extra patch that fixes those intermittent core dumps on cipher (blapi) tests.

Comment 18 Bob Relyea 2013-06-21 16:54:02 UTC
Comment on attachment 763882 [details]
Patch 5 of 5 - switch to using PLArenaPool on blapitest.c

Are this patches from upstream?

bob

Comment 19 Elio Maldonado Batiz 2013-06-21 18:23:38 UTC
(In reply to Bob Relyea from comment #18)
> Comment on attachment 763882 [details]
> Patch 1 of 5 - switch to using PLArenaPool on blapitest.c
> 
> Are this patches from upstream?
> 
> bob

Yes, Patch 5 is a subset of the last patch from this upstream bug https://bugzilla.mozilla.org/show_bug.cgi?id=802430
See https://bug802430.bugzilla.mozilla.org/attachment.cgi?id=744975

With meld, a convenient way to compare is with 
meld NO_NSPR_10_SUPPORT.patch usePL.patch. 
Click on the "keep Highlighting button" on the top right to make it easier to view.

Comment 23 Elio Maldonado Batiz 2013-07-19 19:17:12 UTC
Today I had again a build failure on a regular brew build of nss for RHEL-5.10. The cause was the same sporadic core dump I have reported earlier, previously seen on scratch builds, and for which I have the fifth patch under consideration. It happens on 64-bit platforms.

Comment 24 Elio Maldonado Batiz 2013-07-19 19:25:52 UTC
From http://download.devel.redhat.com/brewroot/work/tasks/2670/6062670/build.log
...
bltest -T -m aes_gcm -d /builddir/build/BUILD/nss-3.14.3/mozilla/security/nss/tests/../cmd/bltest -1 7 -2 6
./cipher.sh: line 57: 10272 Illegal instruction     (core dumped) ${PROFTOOL} ${BINDIR}/bltest -T -m $PARAM -d $CIPHERTESTDIR -1 $inOff -2 $outOff
bltest -T -m aes_gcm -d /builddir/build/BUILD/nss-3.14.3/mozilla/security/nss/tests/../cmd/bltest -1 7 -2 7
./cipher.sh: line 57: 10275 Illegal instruction     (core dumped) ${PROFTOOL} ${BINDIR}/bltest -T -m $PARAM -d $CIPHERTESTDIR -1 $inOff -2 $outOff
cipher.sh: #15: AES GCM (Failed in/out offset pairs:  [0:0][0:1][0:2][0:3][0:4][0:5][0:6][0:7][1:0][1:1][1:2][1:3][1:4][1:5][1:6][1:7][2:0][2:1][2:2][2:3][2:4][2:5][2:6][2:7][3:0][3:1][3:2][3:3][3:4][3:5][3:6][3:7][4:0][4:1][4:2][4:3][4:4][4:5][4:6][4:7][5:0][5:1][5:2][5:3][5:4][5:5][5:6][5:7][6:0][6:1][6:2][6:3][6:4][6:5][6:6][6:7][7:0][7:1][7:2][7:3][7:4][7:5][7:6][7:7]) - Core file is detected - FAILED
....

Comment 25 Bob Relyea 2013-07-22 22:17:43 UTC
Comment on attachment 763882 [details]
Patch 5 of 5 - switch to using PLArenaPool on blapitest.c

r+ probably worth pushing upstream as well.

bob

Comment 26 Elio Maldonado Batiz 2013-07-24 17:43:19 UTC
Created attachment 777876 [details]
gcm fixes - freebl part

Comment 27 Elio Maldonado Batiz 2013-07-24 17:44:27 UTC
Created attachment 777886 [details]
gcm fixes - softoken part

Comment 28 Elio Maldonado Batiz 2013-07-26 23:12:54 UTC
Created attachment 778906 [details]
Print CPU info just before the tests start

Same thing we are doing on rhel-5.10. Good pratcice in case we run run into platform-specific problems.

Comment 29 Bob Relyea 2013-07-27 00:29:19 UTC
Comment on attachment 777876 [details]
gcm fixes - freebl part

r+ rrelyea

Comment 30 Bob Relyea 2013-07-27 00:30:01 UTC
Comment on attachment 777886 [details]
gcm fixes - softoken part

r+ rrelyea

Comment 33 errata-xmlrpc 2013-11-21 06:15:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1558.html