Red Hat Bugzilla – Bug 1006079
Serialization failing due to non-sorted OTCs
Last modified: 2014-08-06 16:19:20 EDT
Description of problem:
Serialization fails intermittently on different platforms because OTCs are not sorted before serialization.
This problem is very hard to isolate in a test because it happens randomly. It depends on the hashcode of objects calculated by the JVM. Given that, I am not adding a test case.
6.0.x : http://github.com/droolsjbpm/drools/commit/43434acb7
master : http://github.com/droolsjbpm/drools/commit/4c05531ac
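For readers unfamiliar with the failure mode, here is a minimal sketch of why hash-based ordering is unstable and how sorting before writing removes the dependency. It is not the code from the commits above: the `Otc` class, its `typeName` field, and `sortedTypeNames` are hypothetical stand-ins, and the assumption is that the unsorted entries lived in a hash-based collection keyed by objects that use the JVM identity hash code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SortedOtcExample {
    // Hypothetical stand-in for an OTC. It does not override hashCode(),
    // so a HashMap keyed by it uses the JVM-assigned identity hash code
    // and its iteration order can change from run to run.
    static final class Otc {
        final String typeName;
        Otc(String typeName) { this.typeName = typeName; }
    }

    // Collects the type names and sorts them lexicographically, so the
    // result no longer depends on the map's iteration order.
    static List<String> sortedTypeNames(Map<Otc, ?> otcs) {
        List<String> names = new ArrayList<>();
        for (Otc otc : otcs.keySet()) {
            names.add(otc.typeName);
        }
        names.sort(Comparator.naturalOrder());
        return names;
    }

    public static void main(String[] args) {
        Map<Otc, Integer> otcs = new HashMap<>();
        otcs.put(new Otc("org.example.Person"), 1);
        otcs.put(new Otc("org.example.Address"), 2);
        otcs.put(new Otc("org.example.Order"), 3);
        // Iterating otcs.keySet() directly may yield a different order on
        // each run; the sorted list is the same every time.
        System.out.println(sortedTypeNames(otcs));
    }
}
```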
Edson, could you share the environment you encountered the problem in? I am trying to verify the issue, but I can't get the error even with older BRMS versions.
Tomas, I was unable to create a test case that would reproduce the problem because it depends on how the JVM calculates the hashcode of the OTCs. This is JVM/OS dependent.
The only thing you can do to verify is write a positive test, meaning one that checks the lexicographic order of the OTCs as serialized to the byte array or disk. But even this is very low level and would test the code only for successes, not for failures. This is why I did not add such a test.
Having said that, we have hundreds of serialization tests that indirectly test this code by doing a double round trip. That is, each serialization test we have does the following:
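A positive check of that kind could look roughly like the sketch below. It assumes the type names have already been read back from the serialized form into a list; how to extract them from the actual Drools output is not shown, and `OtcOrderCheck` and `isLexicographicallySorted` are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

class OtcOrderCheck {
    // Returns true when each name is less than or equal to its successor
    // under String.compareTo, i.e. the list is in lexicographic order.
    static boolean isLexicographicallySorted(List<String> names) {
        for (int i = 1; i < names.size(); i++) {
            if (names.get(i - 1).compareTo(names.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Hypothetical names, as if read back from a serialized session.
        List<String> names = Arrays.asList(
                "org.example.Address", "org.example.Order", "org.example.Person");
        System.out.println(isLexicographicallySorted(names));
    }
}
```

As noted, a passing result only shows that one particular run happened to produce sorted output.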
1. create a session "a"
2. serialize "a" into byte array "x"
3. deserialize "x" into a new session "b"
4. serialize "b" into byte array "y"
5. compare "x" and "y" for equality
If the problem still exists, eventually some of these tests will fail (that is how we caught the problem). It is not ideal, but as I said, since it is dependent on JVM/OS, it is quite hard to enforce reproducibility.
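The double round trip described above can be sketched as follows. This is only an illustration: it uses plain `java.io` object serialization and an `ArrayList` as a stand-in for a session, not the Drools marshalling code the real tests exercise, and `DoubleRoundTrip` and `roundTripIsStable` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;

class DoubleRoundTrip {
    // Writes the object to an in-memory byte array.
    static byte[] serialize(Serializable obj) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        return bytes.toByteArray();
    }

    // Reads an object back from a byte array.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    // Steps 2 to 5: a -> x -> b -> y, then compare x and y byte for byte.
    static boolean roundTripIsStable(Serializable a) throws IOException, ClassNotFoundException {
        byte[] x = serialize(a);
        Serializable b = (Serializable) deserialize(x);
        byte[] y = serialize(b);
        return Arrays.equals(x, y);
    }

    public static void main(String[] args) throws Exception {
        // Step 1: an ArrayList stands in for session "a".
        ArrayList<String> a = new ArrayList<>(Arrays.asList("Address", "Order", "Person"));
        System.out.println(roundTripIsStable(a));
    }
}
```

If the serialized form depends on an unstable iteration order, "x" and "y" can differ even though "a" and "b" are logically equal, which is how a check of this shape can surface the problem.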
Could you please tell me the configuration (OS and JVM) where the original problem was discovered? I'd like to at least try to reproduce it.
Sent an e-mail to the team copying you.
The issue is very hard to reproduce. I have looked over the code of the fix and am marking this issue as verified in BRMS.6.0.0.ER4.