Description of problem: After update to recent recent nss (v3.26.0), strange problem appears: firefox (or curl) drop connections to sites having outdated security settings and probably not FIPS-compatible. See https://api.telecom13.ru/v2/user_auth as example: ---- Secure Connection Failed An error occurred during a connection to api.telecom13.ru. A PKCS #11 module returned CKR_DEVICE_ERROR, indicating that a problem has occurred with the token or slot. Error code: SEC_ERROR_PKCS11_DEVICE_ERROR ---- I can reproduce this problem on recent Fedora, but not on Debian/Ubunty, Google Chrome and Firefox from mozilla.org site. Quick code lookup brings an idea that this could due enabled FIPS mode (because SEC_ERROR_PKCS11_DEVICE_ERROR is triggered by NSSLOW_Init call). But at the same time I don't have FIPS mode enabled (checked on two computers where /proc/sys/crypto/fips_enabled is missing or contains 0). Version-Release number of selected component (if applicable): nss.x86_64 3.26.0-1.0.fc23 nss-softokn.x86_64 3.26.0-1.0.fc23 nss-softokn-freebl.x86_64 3.26.0-1.0.fc23 nss-sysinit.x86_64 3.26.0-1.0.fc23 nss-tools.x86_64 3.26.0-1.0.fc23 nss-util.x86_64 3.26.0-1.0.fc23 How reproducible: The problem is reproducible constantly Actual results: Error code: SEC_ERROR_PKCS11_DEVICE_ERROR Expected results: Answer with status 405 and content: {"data":{"message":"method not allowed"},"error":"405","message":"invalid_method","status":"error","request_id":"c6697051847aaf62b05bcd9fe0f34dcf","auth_token":""}
CKR_DEVICE_ERROR is misleading. The error seems to come from RSA_CheckSign(), which is annotated by "XXX Doesn't set error code": 1310│ /* 1311│ * make sure we get the same results 1312│ */ 1313│ if (PORT_Memcmp(buffer + modulusLen - dataLen, data, dataLen) != 0) 1314├> goto loser; 1315│ 1316│ PORT_Free(buffer); 1317│ return SECSuccess; 1318│ 1319│ loser: 1320│ PORT_Free(buffer); 1321│ failure: 1322│ return SECFailure; 1323│ }
I cannot reproduce the bug on Fedora 24. I'm getting the expected result (the text result, no error) with NSS 3.26, with both firefox and curl. nss-util-3.26.0-1.0.fc24.x86_64 nss-3.26.0-1.0.fc24.x86_64 nss-softokn-freebl-3.26.0-1.0.fc24.x86_64 nss-pem-1.0.2-2.fc24.x86_64 firefox-48.0.1-1.fc24.x86_64 nss-tools-3.26.0-1.0.fc24.x86_64 nss-sysinit-3.26.0-1.0.fc24.x86_64 nss-softokn-3.26.0-1.0.fc24.x86_64 /proc/sys/crypto/fips_enabled is 0 Is this bug specific to Fedora 23 ? Do you have any global pkcs#11 modules registered on your systems?
This is a strange coincidence, because a few minutes after I looked at this bug for the first time, I saw the error myself for the very first time, on Fedora 24. I got the same error message with a different web page. When I hit reload, the error went away, and the page loaded fine. I wonder if this could be a side effect of a network connectivity issue occuring at an unepected time. It doesn't seem FIPS related.
On your systems, do you see the error every time, or just sometimes?
(In reply to Kai Engert (:kaie) from comment #4) > On your systems, do you see the error every time, or just sometimes? Hi Kai, From originating bug at ALTLinux I see that they're replaced Apache by nginx and problem disappeared (obviously due more strict cipher suite there). But old interface where the problem still exists available for testing on different port: https://api.telecom13.ru:8443/v2/user_auth Cheers
(In reply to Kai Engert (:kaie) from comment #2) > Is this bug specific to Fedora 23 ? Unlikely. Both f23 and f24 run the same version of nss if they are up2date. > Do you have any global pkcs#11 modules registered on your systems? I did not have any special setup when debugging this. (In reply to Kai Engert (:kaie) from comment #4) > On your systems, do you see the error every time, or just sometimes? Every time but the original URL cannot be used any more as the server changed. (In reply to Konstantin A. Lepikhov from comment #6) > https://api.telecom13.ru:8443/v2/user_auth Thanks! The above URL works reliably for me: $ curl -v https://api.telecom13.ru:8443/v2/user_auth * Trying 194.150.104.153... * TCP_NODELAY set * Connected to api.telecom13.ru (194.150.104.153) port 8443 (#0) * Initializing NSS with certpath: sql:/etc/pki/nssdb * CAfile: /etc/pki/tls/certs/ca-bundle.crt CApath: none * NSS error -8023 (SEC_ERROR_PKCS11_DEVICE_ERROR) * A PKCS #11 module returned CKR_DEVICE_ERROR, indicating that a problem has occurred with the token or slot. * Curl_http_done: called premature == 1 * stopped the pause stream! * Closing connection 0 curl: (35) A PKCS #11 module returned CKR_DEVICE_ERROR, indicating that a problem has occurred with the token or slot.
(In reply to Kai Engert (:kaie) from comment #2) > I cannot reproduce the bug on Fedora 24. They change URL. Try this: https://api.telecom13.ru:8443/v2/user_auth > Do you have any global pkcs#11 modules registered on your systems? No. (In reply to Kai Engert (:kaie) from comment #4) > On your systems, do you see the error every time, or just sometimes? Reproduced constantly.
Bob, any ideas what could produce this error?
Could you use ssltap to find out which ciphers were being used here? What computer hardware is involved here?
the server at api.telecom13.ru:8443 is intolerant to additional signature algorithms (RSA-PSS from TLSv1.3 draft)
Hubert, are you still able to reproduce the SEC_ERROR_PKCS11_DEVICE_ERROR failure? I am currently getting just PR_END_OF_FILE_ERROR, which differs from the behavior captured on September 29 (comment #7). I suspect that something on the server has changed since then...
I can reproduce: $ curl -v https://api.telecom13.ru:8443/v2/user_auth * Trying 194.150.104.153... * Connected to api.telecom13.ru (194.150.104.153) port 8443 (#0) * Initializing NSS with certpath: sql:/etc/pki/nssdb * CAfile: /etc/pki/tls/certs/ca-bundle.crt CApath: none * NSS error -8023 (SEC_ERROR_PKCS11_DEVICE_ERROR) * A PKCS #11 module returned CKR_DEVICE_ERROR, indicating that a problem has occurred with the token or slot. * Closing connection 0 curl: (35) A PKCS #11 module returned CKR_DEVICE_ERROR, indicating that a problem has occurred with the token or slot.
Then NSS needs to be fixed such that it does not fail with CKR_DEVICE_ERROR while connecting to TLS intolerant servers. I believe that the debugger snapshot presented in comment #1 is a good hint for writing the fix.
(In reply to Kamil Dudka from comment #12) > Hubert, are you still able to reproduce the SEC_ERROR_PKCS11_DEVICE_ERROR > failure? no, I never saw it > I am currently getting just PR_END_OF_FILE_ERROR, which differs from the > behavior captured on September 29 (comment #7). I suspect that something on > the server has changed since then... that's what I always saw, and I think it's because of what I said in Comment #11 (In reply to Kamil Dudka from comment #14) > Then NSS needs to be fixed such that it does not fail with CKR_DEVICE_ERROR > while connecting to TLS intolerant servers. I believe that the debugger > snapshot presented in comment #1 is a good hint for writing the fix. I don't think the intolerance is causing that, though I don't have any better explanation.
(In reply to Alexey Gladkov from comment #13) > I can reproduce: > > $ curl -v https://api.telecom13.ru:8443/v2/user_auth > * Trying 194.150.104.153... > * Connected to api.telecom13.ru (194.150.104.153) port 8443 (#0) > * Initializing NSS with certpath: sql:/etc/pki/nssdb > * CAfile: /etc/pki/tls/certs/ca-bundle.crt > CApath: none > * NSS error -8023 (SEC_ERROR_PKCS11_DEVICE_ERROR) > * A PKCS #11 module returned CKR_DEVICE_ERROR, indicating that a problem has > occurred with the token or slot. > * Closing connection 0 > curl: (35) A PKCS #11 module returned CKR_DEVICE_ERROR, indicating that a > problem has occurred with the token or slot. can you please make a packet capture of a connection that results in this error? tcpdump format preferably
(In reply to Hubert Kario from comment #16) > (In reply to Alexey Gladkov from comment #13) > > I can reproduce: > > > > $ curl -v https://api.telecom13.ru:8443/v2/user_auth > > * Trying 194.150.104.153... > > * Connected to api.telecom13.ru (194.150.104.153) port 8443 (#0) > > * Initializing NSS with certpath: sql:/etc/pki/nssdb > > * CAfile: /etc/pki/tls/certs/ca-bundle.crt > > CApath: none > > * NSS error -8023 (SEC_ERROR_PKCS11_DEVICE_ERROR) > > * A PKCS #11 module returned CKR_DEVICE_ERROR, indicating that a problem has > > occurred with the token or slot. > > * Closing connection 0 > > curl: (35) A PKCS #11 module returned CKR_DEVICE_ERROR, indicating that a > > problem has occurred with the token or slot. > > can you please make a packet capture of a connection that results in this > error? tcpdump format preferably Attached. Captured by tcpdump using this commmand: $ sudo tcpdump -s 65535 -w bug1377646.dump -i eth0 host 194.150.104.153
Created attachment 1212238 [details] tcpdump of failed session
I've checked the first 2 connections in the packet dump, in both cases the signature created by server is invalid - the hashes don't match the data exchanged. So it's the server that is buggy, not the client. While the error returned by NSS is misleading (as Kamil already said in Comment #1), the resulting connection abort (including the alert it sends back to the server) is entirely correct and expected.
Hubert, thank you very much for your analysis. Marking as not-a-bug. To get the error code improved, it would be best to file an upstream bug.
(In reply to Kai Engert (:kaie) from comment #20) > To get the error code improved, it would be best to file an upstream bug. Even with patch attached, it would take months or years to get a reply. I have no local reproducer for this and, by the time someone actually replies, I will no longer remember what this bug was about.
Kamil, I can create such misbehaving server in like 5 minutes of work :)
(In reply to Hubert Kario from comment #22) > Kamil, I can create such misbehaving server in like 5 minutes of work :) If you are able to create it in a way that NSS upstream can use, it would be really appreciated! Based on comment #5 I thought that a local reproducer was close to impossible because the error I debugged was supposed to only happen as a result of some internal error?
(In reply to Kamil Dudka from comment #23) > (In reply to Hubert Kario from comment #22) > > Kamil, I can create such misbehaving server in like 5 minutes of work :) > > If you are able to create it in a way that NSS upstream can use, it would be > really appreciated! > > Based on comment #5 I thought that a local reproducer was close to > impossible because the error I debugged was supposed to only happen as a > result of some internal error? well, if it *was* internal error then bug in bignum library would be correct, but your comment #1 looks to be correct - the error is just mislabeled To be precise, the server sends a signature that can be decrypted using public key from certificate, but the decrypted value does not include correct hashes, but *does* include correct padding. The solution I was thinking about was monkey-patching tlslite-ng - modifying ServerKeyExchange.hash() to ignore serverRandom for hash creation would be enough.