Bug 568912

Summary: Occasional data corruption using openssl.i686 AES ciphers on HP dc5850 systems
Product: Red Hat Enterprise Linux 5 Reporter: Jeff Bastian <jbastian>
Component: openssl Assignee: Tomas Mraz <tmraz>
Status: CLOSED NOTABUG QA Contact: BaseOS QE Security Team <qe-baseos-security>
Severity: medium Docs Contact:
Priority: medium    
Version: 5.4 CC: akostadi, Jan.van.Eldik, rdassen, tao
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-03-09 15:40:29 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description                                                        Flags
openssl aes stress test                                            none
openssl all ciphers stress test                                    none
Log of openssl i686 rpmbuild failure on affected customer system   none
Revised openssl all ciphers stress test                            none

Description Jeff Bastian 2010-02-26 22:12:14 UTC
Description of problem:
  Environment:
    RHEL 5.4 i386
    HP dc5850 workstation with AMD Phenom 8600B Triple-Core Processor
    openssl i686

The problem was first noticed with scp failing:
    # dd if=/dev/urandom of=/tmp/bigfile bs=1M count=2048
    # scp localhost:/tmp/bigfile /tmp/bigfile-2
    bigfile                                    1%   29MB  14.1MB/s   02:23 ETA
    Corrupted MAC on input.
    Finished discarding for 127.0.0.1
    lost connection

Then I tried simply encrypting and decrypting a file with openssl and comparing the decrypted file to the original:
    openssl enc -aes-128-cbc -a -salt -in /usr/share/dict/linux.words \
            -out /tmp/words.enc -pass pass:myPassword
    openssl enc -d -aes-128-cbc -a -in /tmp/words.enc \
            -out /tmp/words.txt -pass pass:myPassword
    cmp /usr/share/dict/linux.words /tmp/words.txt

Repeat this a few times and eventually the decrypted /tmp/words.txt will be corrupt.  We've seen a failure rate of about 4% to 6% of runs.
    # diff /usr/share/dict/words words.txt 
    25140,25142c25140
    < arranged
    < arrangement
    < arrangements
    ---
    > arrang~��G�s��Y)H��BG�rrangements
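
The repeat-and-compare procedure above could be automated roughly as follows. This is a minimal sketch, not the attached enc_test_aes.sh (whose contents are not reproduced in this report); the function name aes_roundtrip_failures, its arguments, and the "failures/rounds" output format are assumptions made for illustration.

```shell
# Sketch only: this is NOT the attached enc_test_aes.sh.
# Usage: aes_roundtrip_failures ROUNDS [INPUT_FILE]
# Prints "failures/rounds"; exit status is non-zero if any round was corrupt.
# The body runs in a subshell "( ... )" so its variables stay local.
aes_roundtrip_failures() (
    rounds=$1
    input=${2:-/usr/share/dict/linux.words}
    workdir=$(mktemp -d) || exit 2
    failures=0
    i=0
    while [ "$i" -lt "$rounds" ]; do
        i=$((i + 1))
        # Start every round from a clean slate so stale output cannot mask a failure.
        rm -f "$workdir/words.enc" "$workdir/words.txt"
        # Same commands as in the description, writing to per-run temporary files.
        openssl enc -aes-128-cbc -a -salt -in "$input" \
                -out "$workdir/words.enc" -pass pass:myPassword
        openssl enc -d -aes-128-cbc -a -in "$workdir/words.enc" \
                -out "$workdir/words.txt" -pass pass:myPassword
        # cmp -s is silent; a non-zero status means the round trip differed.
        if ! cmp -s "$input" "$workdir/words.txt"; then
            failures=$((failures + 1))
        fi
    done
    rm -rf "$workdir"
    echo "$failures/$rounds"
    [ "$failures" -eq 0 ]
)
```

For example, `aes_roundtrip_failures 100` runs 100 rounds against the default word list; on an unaffected system it would be expected to print 0/100.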

It only fails with the AES family of ciphers.

Switching over to the openssl.i386 packages fixes both problems.


Version-Release number of selected component (if applicable):
openssl-0.9.8e-12.el5_4.1.i686

How reproducible:
about 5% of the time?

Steps to Reproduce:
1. install RHEL 5.4 i386 onto an HP dc5850 system
2. dd if=/dev/urandom of=/tmp/bigfile bs=1M count=2048
3. scp localhost:/tmp/bigfile /tmp/bigfile-2
4. run the attached enc_test_aes.sh script
  
Actual results:
scp fails with "Corrupted MAC on input."
files encrypted and decrypted with openssl are corrupt

Expected results:
no scp failures
no corruption on local encrypted files

Additional info:

Comment 1 Jeff Bastian 2010-02-26 22:13:04 UTC
Created attachment 396686 [details]
openssl aes stress test

Comment 2 Jeff Bastian 2010-02-26 22:13:44 UTC
Created attachment 396687 [details]
openssl all ciphers stress test

Comment 4 Issue Tracker 2010-02-28 07:35:56 UTC
Event posted on 2010-02-28 08:35 CET by rdassen

The customer who originally reported this issue to us has confirmed the
workaround of using openssl i386. In their case, the affected system is
also an HP Compaq dc5850 Business PC, but with an AMD Phenom(tm) 9600B
Quad-Core Processor rather than a triple-core one.

They have also noted that while they can rebuild openssl i386 just fine on
this system, rebuilding openssl i686 fails.


This event sent from IssueTracker by rdassen 
 issue 563983

Comment 5 J.H.M. Dassen (Ray) 2010-02-28 07:39:07 UTC
Created attachment 396837 [details]
Log of openssl i686 rpmbuild failure on affected customer system

Comment 6 Tomas Mraz 2010-03-01 10:14:40 UTC
Can we get this reproduced on any other AMD machine with i386 RHEL-5 installed? This seriously looks like a bug on the AMD side, which we can hardly do much about.

Comment 7 Issue Tracker 2010-03-01 10:23:07 UTC
Event posted on 2010-03-01 11:23 CET by rdassen

I'm currently trying to reproduce this issue with a different AMD CPU.
I'm testing using bl35p-1.gsslab.rdu.redhat.com, which has
cpu family      : 15
model           : 5
model name      : AMD Opteron(tm) Processor 250

No errors yet in 22 rounds.



This event sent from IssueTracker by rdassen 
 issue 563983

Comment 8 J.H.M. Dassen (Ray) 2010-03-01 10:26:36 UTC
Created attachment 397032 [details]
Revised openssl all ciphers stress test

Revised version of the openssl all ciphers stress test. This one will continue running until the issue is reproduced.
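
A run-until-failure loop of this kind might look like the sketch below. It is not the attached script (its contents are not reproduced in this report): the function name run_until_corrupt, the optional round limit, and the short default cipher list are all assumptions made for illustration, and the list is only a small subset rather than "all ciphers".

```shell
# Sketch only: this is NOT the attached revised stress test.
# Usage: run_until_corrupt INPUT_FILE [MAX_ROUNDS] [CIPHERS]
# MAX_ROUNDS=0 (the default) means "no limit", i.e. run until a failure occurs.
# Exit status: 1 on the first corrupt round trip, 0 if the round limit is reached.
run_until_corrupt() (
    input=$1
    max_rounds=${2:-0}
    # Assumed subset of ciphers; pass a different space-separated list as $3.
    ciphers=${3:-"aes-128-cbc aes-192-cbc aes-256-cbc des-ede3-cbc"}
    workdir=$(mktemp -d) || exit 2
    round=0
    while [ "$max_rounds" -eq 0 ] || [ "$round" -lt "$max_rounds" ]; do
        round=$((round + 1))
        for cipher in $ciphers; do
            rm -f "$workdir/enc" "$workdir/dec"
            openssl enc "-$cipher" -salt -in "$input" \
                    -out "$workdir/enc" -pass pass:myPassword
            openssl enc -d "-$cipher" -in "$workdir/enc" \
                    -out "$workdir/dec" -pass pass:myPassword
            if ! cmp -s "$input" "$workdir/dec"; then
                # Keep the files so the corrupt output can be inspected afterwards.
                echo "round $round: $cipher round trip differs; files kept in $workdir"
                exit 1
            fi
        done
    done
    rm -rf "$workdir"
    echo "no corruption in $round rounds"
)
```

Calling `run_until_corrupt /usr/share/dict/linux.words` loops indefinitely on a healthy system, so a round limit such as `run_until_corrupt /usr/share/dict/linux.words 500` is useful when the goal is to show that a machine is not affected.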

Comment 10 J.H.M. Dassen (Ray) 2010-03-01 13:55:30 UTC
bl35p-1.gsslab.rdu.redhat.com (the system from comment #7) has now successfully completed 490 rounds of testing (test from comment #8). This system/processor is not affected by this issue.
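
For a sense of scale, the arithmetic behind "not affected" can be sketched as below. It assumes that each round on an affected system fails independently with the roughly 5% probability quoted in the description; that is an assumption, since the per-round failure rate of the comment #8 test is not stated in this report. The helper name p_all_clean is hypothetical.

```shell
# Sketch only: independence and the 5% per-round rate are assumptions.
# Usage: p_all_clean FAILURE_RATE ROUNDS
# Prints (1 - FAILURE_RATE)^ROUNDS, the chance that ROUNDS independent
# rounds all pass on a system that fails each round with FAILURE_RATE.
p_all_clean() {
    awk -v p="$1" -v n="$2" 'BEGIN { printf "%.3g\n", exp(n * log(1 - p)) }'
}

p_all_clean 0.05 490
```

Under those assumptions the result is on the order of 1e-11, so 490 clean rounds would be extremely unlikely on a machine that shared the defect.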

Comment 13 Issue Tracker 2010-03-09 14:21:19 UTC
Event posted on 2010-03-09 15:21 CET by rdassen

I've tried to reproduce this issue on another type of system with an AMD
Phenom CPU.

Testing was done using phenom-02.lab.bos.redhat.com which has
cpu family      : 16
model           : 2
model name      : AMD Phenom(tm) 9500 Quad-Core Processor

No errors were found during 666 rounds of testing, so this issue is not
reproducible on all AMD Phenom systems.



This event sent from IssueTracker by rdassen 
 issue 563983
it_file 463693

Comment 14 J.H.M. Dassen (Ray) 2010-03-09 15:33:14 UTC
(In reply to comment #6)
> Can we get this reproduced on any other AMD machine with i386 RHEL-5 installed?

Testing has shown this issue not to be a generic RHEL5 i386 on AMD issue (comment #10), nor a generic RHEL5 i386 on AMD Phenom issue (comment #13). To date, it has only been reproduced with RHEL5 i386 on HP dc5850 systems.