I believe this is a race condition in NSS. See this reproducer here:

https://github.com/cipherboy/jss/blob/reproduce-1881999/org/mozilla/jss/tests/LWCAFailure.java

Change count to 10 on line 68, build JSS via https://github.com/dogtagpki/jss/blob/master/docs/building.md#in-source-build and then run via `cd build && ./run_test.sh org.mozilla.jss.tests.LWCAFailure`.

(I've added 1000 to see if it is reproducible via serial keygen. So far it hasn't been, after a couple hundred keygen attempts, so I'm inclined to believe it is a race condition.)

This successfully reproduces the failure after sufficient (random) time:

org.mozilla.jss.crypto.TokenException: Keypair Generation failed on token with error: -8025 :
Exception in thread "Thread-8" java.lang.RuntimeException: Keypair Generation failed on token with error: -8025 :
    at org.mozilla.jss.tests.LWCAFailure$Smasher.run(LWCAFailure.java:63)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.mozilla.jss.crypto.TokenException: Keypair Generation failed on token with error: -8025 :
    at org.mozilla.jss.pkcs11.PK11KeyPairGenerator.generateRSAKeyPairWithOpFlags(Native Method)
    at org.mozilla.jss.pkcs11.PK11KeyPairGenerator.generateKeyPair(PK11KeyPairGenerator.java:502)
    at org.mozilla.jss.crypto.KeyPairGenerator.genKeyPair(KeyPairGenerator.java:50)
    at org.mozilla.jss.tests.LWCAFailure.createSubCA(LWCAFailure.java:49)
    at org.mozilla.jss.tests.LWCAFailure$Smasher.run(LWCAFailure.java:59)
    ... 1 more

It is (subjectively) faster to reproduce on NSS @ 3.53 (in RHEL 8.3) than it is on NSS @ 3.56 (in Fedora 32 currently).

I do not know yet how to trigger this in a reliable fashion other than brute-forcing parallel keygen and waiting for it to fail. My suggestion is that we turn this over to the crypto/NSS team to see if they have any thoughts.
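For anyone who wants the shape of the "brute-force parallel keygen" approach without opening the linked test, here is a minimal sketch. It is not the LWCAFailure code: the class name `ParallelKeygenStress` and its parameters are made up for illustration, and it uses the JDK's own `java.security.KeyPairGenerator` as a stand-in for the JSS generator, so by itself it will not hit the NSS failure. To exercise NSS, the marked call would have to be swapped for the JSS keygen path shown in the stack trace.

```java
import java.security.KeyPairGenerator;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical harness, not part of JSS: hammers RSA keygen from several threads
// and counts how many generations threw.
class ParallelKeygenStress {

    // Returns the total number of failed key generations across all workers.
    static int run(int threads, int perThread, int keyBits) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Integer>> results = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            results.add(pool.submit(() -> {
                int failures = 0;
                for (int i = 0; i < perThread; i++) {
                    try {
                        // Stand-in: JDK provider, NOT the JSS/NSS token keygen.
                        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
                        kpg.initialize(keyBits);
                        kpg.generateKeyPair();
                    } catch (Exception e) {
                        // Keep going so one failure does not hide later ones.
                        failures++;
                    }
                }
                return failures;
            }));
        }
        pool.shutdown();

        int total = 0;
        for (Future<Integer> f : results) {
            try {
                total += f.get();
            } catch (ExecutionException e) {
                total++; // the worker itself died; count it as a failure
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("failed keygens: " + run(10, 5, 2048));
    }
}
```

Counting failures instead of aborting on the first exception makes it easier to compare failure rates between NSS versions once the stand-in call is replaced.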
For context, the relevant JSS function (generateRSAKeyPairWithOpFlags) is fairly trivial:

    JNIEXPORT jobject JNICALL
    Java_org_mozilla_jss_pkcs11_PK11KeyPairGenerator_generateRSAKeyPairWithOpFlags
      (JNIEnv *env, jobject this, jobject token, jint keySize, jlong publicExponent,
       jboolean temporary, jint sensitive, jint extractable,
       jint op_flags, jint op_flags_mask)
    {
        PK11RSAGenParams params;

        PR_ASSERT(env!=NULL && this!=NULL && token!=NULL);

        /**************************************************
         * setup parameters
         *************************************************/
        params.keySizeInBits = keySize;
        params.pe = publicExponent;

        return PK11KeyPairGeneratorWithOpFlags(env, this, token,
                CKM_RSA_PKCS_KEY_PAIR_GEN, &params, temporary, sensitive,
                extractable, op_flags, op_flags_mask);
    }

By the time this is called and executes, NSS was initialized a while ago (we've already successfully generated keys $count times). So the race condition must happen in the PK11 code somewhere.

The question is why is this failure in NSS... now :-)
Sorry, PK11KeyPairGeneratorWithOpFlags is actually a JSS function. It calls JSS's JSS_PK11_generateKeyPairWithOpFlags and keysToKeyPair. So, we call the following two PK11 functions:

- PK11_Authenticate (should be a no-op if the token is already logged in, which it is since we authed earlier).
- PK11_GenerateKeyPairWithOpFlags

No JSS method returns SEC_ERROR_PKCS11_GENERAL_ERROR (the -8025 in the trace), so it must come from one of these two functions (or the interaction thereof).

I guess the next step is a C reproducer without JSS.
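As a quick sanity check on how -8025 maps to SEC_ERROR_PKCS11_GENERAL_ERROR: NSS's secerr.h defines SEC_ERROR_BASE as -0x2000 and this error as SEC_ERROR_BASE + 167. The small sketch below just redoes that arithmetic; the class and method names are made up for illustration and the constants are copied by hand, not read from NSS.

```java
// Hypothetical helper, not part of JSS or NSS: re-derives the numeric error code.
class NssErrorCode {
    // Copied by hand from NSS secerr.h: #define SEC_ERROR_BASE (-0x2000)
    static final int SEC_ERROR_BASE = -0x2000;
    // secerr.h defines SEC_ERROR_PKCS11_GENERAL_ERROR as SEC_ERROR_BASE + 167.
    static final int SEC_ERROR_PKCS11_GENERAL_ERROR = SEC_ERROR_BASE + 167;

    // Offset of an NSS SEC_ERROR_* code from SEC_ERROR_BASE.
    static int offsetFromBase(int code) {
        return code - SEC_ERROR_BASE;
    }

    public static void main(String[] args) {
        int observed = -8025; // value from the TokenException message
        System.out.println("offset from SEC_ERROR_BASE: " + offsetFromBase(observed));
        System.out.println("is PKCS11_GENERAL_ERROR: "
                + (observed == SEC_ERROR_PKCS11_GENERAL_ERROR));
    }
}
```

The same subtraction works for any other SEC_ERROR_* value that shows up while narrowing this down.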