JSS produces incorrect error messages for socket accept due to bad pointer arithmetic. There are 3 places with code like this: JSSL_throwSSLSocketException(env, "Accept operation timed out with error code " + err); The variable err is an integer with a value typically around -5000. The above code isn't correct C code (just barely valid syntax) although the compiler will accept it while issuing a warning about bad subscripting. SSLServerSocket.c: In function 'Java_org_mozilla_jss_ssl_SSLServerSocket_socketAccept': SSLServerSocket.c:127: warning: array subscript is below array bounds SSLServerSocket.c:133: warning: array subscript is below array bounds SSLServerSocket.c:136: warning: array subscript is below array bounds I think the original author forget they weren't writing Java code. As far as I can tell what occurs is the pointer to the error string in the static data segment is offeset by the value of the err code to produce a totally bogus pointer. I think we've been lucky it hasn't segfaulted. Here is an example of what happens at run time: java.net.SocketException: SSL handshake error java.net.SocketTimeoutException: login to token: (-5990) I/O operation timed out. The error string in the call to JSSL_throwSSLSocketException() is replaced by another string entirely. In this instance causing a developer to investigate token login problems when that wasn't the issue at all. This is a great example of why we shouldn't ignore compiler warnings. I'm going to attach 2 patches. The first one is a minimal patch that just corrects the above incorrect behaviour. I also went through and fixed most of the warnings, I don't think the other warnings were as serious. The second patch just leaves a couple of warnings concerning unreferenced values that do need to be present. I didn't have the time to figure out how to silence those. Applying the first patch is an absolute must. Applying the second patch is probably a good idea, should probably be reviewed to assure the initial values I gave uninitialized variables are correct and to figure out the right way to address the few remaining warnings.
Created attachment 461316 [details] Minimal patch to correct bad string pointer Note, there is no need to specify the err code, JSSL_throwSSLSocketException() will add that information.
Created attachment 461317 [details] Clean up most compiler warnings Includes the error string pointer fix, but also addresses most of the other compiler warnings.
Comment on attachment 461316 [details] Minimal patch to correct bad string pointer approval for this attachment only. Please consult mharmsen on details of how to do JSS patches.
patch was checked into Fedora for rawhide, F14 & F13 on 12/6 and packages were built and pushed to updates-testing. However, I noticed the RHEL-6 is does not have the last 4 patches which are in Fedora, will follow up with Matt and Christina to see why not and if there is any reason they should be applied to the RHEL-6 version as well.
jss-4_2_6-10 has now been committed to rawhide, F14, F13 and RHEL-6. All packages have been built for rawhide, F14, F13 and RHEL-6. Final changes included fixing the attribution/blame in the changelog entry and relocating the jar to /usr/lib/jss and/or /usr/lib64/jss as per Fedora packaging guidelines. Reviews performed by mharmsen and cfu.
John, Kashyap was asking how to test this. If not too much trouble, could you give him a test case that could allow him to observe what would cause JSSL_throwSSLSocketException() now to spit out the correct message then that would be good. Thanks!
It's been so long since I discovered this I forget how I discovered it or what provoked it. We fixed a JSS socket timeout parameter problem at about the same time which is what I think provoked this error, perhaps you could revert that patch. Or perhaps you could set up a firewall to prevent communication on the specified port which would cause the timeout to occur. Or perhaps you could have faith in static source code analysis and accept the previous code was indeed wrong and was properly corrected :-) I realize it's better to test, sometimes the cost/benefit is not proportional, especially considering you only hit this code in aberrant circumstances, which is probably why the bogus code was in production for 8 years (if I recall correctly).