Bug 1295571 - Openssl errors cause glusterfs disconnects
Summary: Openssl errors cause glusterfs disconnects
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: core
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Kaushal
QA Contact: Anoop
URL:
Whiteboard: OnCustomer
Depends On:
Blocks:
Reported: 2016-01-04 21:58 UTC by Oonkwee Lim
Modified: 2020-03-11 15:01 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-02-23 07:29:39 UTC
Embargoed:



Comment 2 Oonkwee Lim 2016-01-06 19:24:51 UTC
Hello,

Any idea what is going on and how to fix this?

Below is the SSL set-up from the customer:

SSL setup for the glusterfs:

[fkogan@ny1cs8smgprx02 ssl]$ pwd
/etc/ssl
[fkogan@ny1cs8smgprx02 ssl]$ ls -la
total 20
drwxr-xr-x   2 root root 4096 Dec  8 10:42 .
drwxr-xr-x. 92 root root 4096 Dec  8 11:12 ..
lrwxrwxrwx   1 root root   16 Apr  1  2015 certs -> ../pki/tls/certs
-rw-------   1 root root  733 Dec  8 10:42 glusterfs.ca
-rw-------   1 root root  887 Dec  8 10:42 glusterfs.key
-rw-------   1 root root  733 Dec  8 10:42 glusterfs.pem

[fkogan@ny1cs8smgprx02 ssl]$ openssl x509 -in glusterfs.pem -noout -text
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 9626495786608863886 (0x85982eea209daa8e)
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: CN=Anyone
        Validity
            Not Before: Nov  5 21:51:26 2015 GMT
            Not After : Nov  2 21:51:26 2025 GMT
        Subject: CN=Anyone
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (1024 bit)
                Modulus:
                    00:bc:de:3a:b4:57:8f:e9:09:5f:a0:2f:f0:8c:43:
                    3e:eb:ea:a0:64:86:c8:ee:55:99:a8:83:3d:3a:2c:
                    a9:e9:d5:6e:cc:96:57:e6:b3:93:22:91:98:ce:c5:
                    95:b3:29:eb:e1:de:f3:ff:81:49:bd:af:97:c9:22:
                    2e:4c:9c:9c:be:50:97:8a:ad:3f:f3:ca:9d:e6:a6:
                    b1:0c:46:05:da:cc:45:83:a6:ca:e8:bf:99:16:a4:
                    fb:f2:d2:ba:d5:94:b6:eb:ec:03:26:dc:8e:c6:97:
                    41:1e:ab:2a:39:d3:fc:43:7c:6f:a1:a7:cd:bd:34:
                    0f:5a:02:ca:cb:59:13:07:97
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Subject Key Identifier: 
                67:04:E7:54:24:AF:1D:D4:C9:7A:CC:1B:9B:64:27:99:8D:51:5C:75
            X509v3 Authority Key Identifier: 
                keyid:67:04:E7:54:24:AF:1D:D4:C9:7A:CC:1B:9B:64:27:99:8D:51:5C:75

            X509v3 Basic Constraints: 
                CA:TRUE
    Signature Algorithm: sha256WithRSAEncryption
         6d:15:e6:7b:a7:c0:ea:a7:4e:1c:1f:8c:63:36:e5:96:9f:e7:
         40:a7:3d:72:1a:42:a8:98:ab:13:de:e1:e7:2e:ad:15:9c:45:
         0e:b6:aa:fc:3c:21:24:bd:a5:04:b3:7e:3f:1a:65:89:7f:1c:
         ed:ad:9e:16:ac:7f:e9:e5:4e:3a:59:11:1c:aa:68:02:66:6d:
         61:89:8b:8c:9b:b1:f8:b7:c2:89:87:fd:65:bc:4d:b0:fc:65:
         c3:92:66:71:5e:d3:62:f8:ce:e6:3d:a3:37:44:f2:65:08:21:
         11:8a:c5:ce:f8:e4:26:67:b4:69:70:b1:65:78:ab:62:52:89:
         f3:75

glusterfs.ca is just a copy of the pem file.
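
A minimal sketch for confirming that the CA file really is a byte-for-byte copy of the certificate, as stated above. The function name `ca_matches_cert` is hypothetical; the default paths are the ones from the listing, and since both files are mode 0600 and owned by root, it would need to run as root on the node.

```shell
#!/bin/sh
# Hypothetical helper: report whether the CA file is byte-identical
# to the node certificate. Defaults are the paths shown in the listing.
ca_matches_cert() {
    cert=${1:-/etc/ssl/glusterfs.pem}
    ca=${2:-/etc/ssl/glusterfs.ca}
    # cmp -s prints nothing and only sets the exit status
    if cmp -s "$cert" "$ca"; then
        echo "identical"
    else
        echo "different"
    fi
}

# For a self-signed certificate used as its own CA, this should also succeed:
#   openssl verify -CAfile /etc/ssl/glusterfs.ca /etc/ssl/glusterfs.pem
```

Calling `ca_matches_cert` with no arguments checks the default paths; two explicit paths can be passed to compare files elsewhere.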

Comment 4 Kaushal 2016-01-07 06:55:06 UTC
Hey Oonkwee,

This seems to be a bug in OpenSSL. Googling around turns up a lot of similar issues, with three major causes:

1. AES-NI being used.

2. Multi-threaded environments with SSL_MODE_RELEASE_BUFFERS set. But we shouldn't be affected by this because we use a single thread per connection and don't use SSL_MODE_RELEASE_BUFFERS.

3. Large reads happening.

1 and 2 are quite old issues, seen around 2013-2014, and are supposed to have been fixed in 1.0.1e and 1.0.1h respectively.
3 is more recent; I saw reports of it in 2015. I haven't found out whether it was fixed.

Since the customer hit the issue while running rsync, it's possibly issue 3 here.

Also, the customer seems to be using upstream binaries (glusterfs-3.7.4-2.el7.x86_64 and glusterfs-3.7.0-2.el7.x86_64 are not RHGS builds). AFAIK these shouldn't be supported.

Comment 5 Oonkwee Lim 2016-01-08 00:18:08 UTC
Hello,

I have engaged the OpenSSL folks.

CU has a question for glusterfs:

"I agree about the openssl part. Apparently, this bug does manifest itself in other circumstances, unrelated to glusterfs. The question is: are there fixes or workarounds for that bug that can be used with glusterfs? E.g. if there is an environment variable that helps, how can I inject it into the glusterfs environment?"

Thanks & Regards

Oonkwee
Emerging Technologies
Red Hat Global Support

Comment 8 Kaushal 2016-01-12 12:38:32 UTC
Hi Oonkwee,

There are no fixes or workarounds available right now AFAIK. There is also no way to modify advanced OpenSSL parameters, via environment variables or otherwise, without changing the source and rebuilding GlusterFS.

Comment 9 Nagaprasad Sathyanarayana 2016-01-21 12:53:54 UTC
Oon, as there is no immediate action for the Gluster team on this, we will wait for the fix from OpenSSL. Can you please provide the corresponding BZ?

Comment 10 Oonkwee Lim 2016-01-21 19:48:41 UTC
Hello,

The customer used RHUI repositories that point to another provider. 

It appears that the provider has outdated packages for one reason or another.

The default RHEL repo has newer packages:

[root@totty-rhel7-system1 ~]# rpm -qa|grep rsync
rsync-3.0.9-17.el7.x86_64

[root@totty-rhel7-system1 ~]# rpm -qa|grep openssl
openssl-1.0.1e-51.el7_2.2.x86_64
openssl-libs-1.0.1e-51.el7_2.2.x86_64

The customer has applied the above and so far has not seen the errors.

The customer will be monitoring for a few more days.

Until then there is no BZ for OpenSSL.
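
A minimal sketch for checking whether a node is still on an older openssl build than the one listed above. The helper names `version_at_least` and `check_openssl` are hypothetical, and the comparison uses GNU `sort -V`, which only approximates RPM's own version ordering, so treat the result as a hint rather than an authoritative answer.

```shell
#!/bin/sh
# Minimum build, taken from the rpm -qa output above.
WANT="1.0.1e-51.el7_2.2"

# Hypothetical helper: succeeds if $1 sorts at or after $2 under GNU sort -V.
# Note: sort -V is an approximation of RPM version comparison.
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Hypothetical helper: report whether a version-release string is recent enough.
check_openssl() {
    if version_at_least "$1" "$WANT"; then
        echo "ok: $1"
    else
        echo "outdated: $1 (want >= $WANT)"
    fi
}

# On a real node the installed value could be obtained with:
#   rpm -q --qf '%{VERSION}-%{RELEASE}\n' openssl-libs
```

For example, `check_openssl "$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' openssl-libs)"` would run the check against the locally installed package.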

Comment 12 Oonkwee Lim 2016-02-22 17:57:00 UTC
From the CU:

(2/5/2016 6:36 AM)

No, looks like it helped with this issue. Thanks for your help!

Case is closed.

Comment 13 Atin Mukherjee 2016-02-23 07:29:39 UTC
Closing the bug based on comment 12.

