Bug 1420518 - Intermittent java.io.InvalidClassException: Not a proxy with Java 1.8
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: java-1.8.0-openjdk
Version: 6.8
Hardware: Unspecified OS: Unspecified
Priority: high Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Andrew Dinn
QA Contact: BaseOS QE - Apps
Depends On:
Blocks: 1374441 1461138
Reported: 2017-02-08 16:12 EST by Robert Bost
Modified: 2017-10-02 16:14 EDT

See Also:
Fixed In Version: java-1.8.0-openjdk-1.8.0.121-3.b13.el6
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-10-02 16:14:10 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
reproducer project (8.28 KB, application/x-gzip)
2017-02-08 16:12 EST, Robert Bost


External Trackers
Tracker ID Priority Status Summary Last Updated
Icedtea Bugzilla 3336 None None None 2017-03-03 03:04 EST
Red Hat Knowledge Base (Solution) 2916111 None None None 2017-02-08 16:14 EST
openjdk bug system JDK-8174729 None None None 2017-02-17 02:08 EST

Description Robert Bost 2017-02-08 16:12:20 EST
Created attachment 1248665
reproducer project

Description of problem: After updating to java-1.8.0-openjdk-1.8.0.111, I've started seeing java.io.InvalidClassException: Not a proxy occasionally.


Version-Release number of selected component (if applicable): java-1.8.0-openjdk-1.8.0.111 and java-1.8.0-openjdk-1.8.0.121


How reproducible: Sometimes, just a matter of time with reproducer.


Steps to Reproduce: 
1. Attached reproducer with steps in README.md

Actual results: InvalidClassException

Additional info: This issue has already been reported upstream and with oracle:

https://bugs.openjdk.java.net/browse/JDK-8087168
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8087168
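
For context, the exception above is raised while deserializing a dynamic proxy instance. The following is a minimal single-threaded sketch of that operation, not the attached reproducer: it serializes a proxy, reads it back, and checks the result with Proxy.isProxyClass(). To the best of my knowledge that is the check whose failure leads ObjectInputStream to report "Not a proxy" in JDK 8, but treat that as an assumption. The names ProxyRoundTrip and ConstantHandler are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.concurrent.Callable;

public class ProxyRoundTrip {

    /** Hypothetical handler; it must be Serializable for the proxy to be serializable. */
    static final class ConstantHandler implements InvocationHandler, Serializable {
        private static final long serialVersionUID = 1L;

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) {
            // Only call() is supported in this sketch.
            return "call".equals(method.getName()) ? "ok" : null;
        }
    }

    /** Creates a dynamic proxy implementing Callable. */
    static Object newProxy() {
        return Proxy.newProxyInstance(
                ProxyRoundTrip.class.getClassLoader(),
                new Class<?>[] { Callable.class },
                new ConstantHandler());
    }

    /** Serializes an object to a byte array and deserializes it again. */
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                // Deserialization resolves the proxy class from its interface list.
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            // An InvalidClassException would surface here, wrapped.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(newProxy());
        System.out.println("is proxy class: " + Proxy.isProxyClass(copy.getClass()));
    }
}
```

Run on its own this should simply succeed. The intermittent failure reported here needs many threads doing the same thing concurrently, as the attached reproducer does.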
Comment 1 Andrew Dinn 2017-02-09 05:26:35 EST
I believe that the test program is manifesting a race condition in the WeakCache code. The problem arises when using both the jdk8u release mentioned by the customer and the latest jdk8u code.

I have a simple (one-line) fix for this issue which I believe to be correct. I have applied the patch to the latest jdk8u tree, rebuilt and rerun the test. This stops the 'Not a proxy' InvalidClassException from manifesting. However, removing that problem causes the test program to fail with other unexpected and unaccountable errors.

I am seeing a NullPointerException coming out of various method calls in the proxy and deserialization code. The stack backtraces for these exceptions put them at odd or, in some cases, impossible locations in the Java source code. I am still investigating this problem just to be sure that it does not relate to my fix (I think it almost certainly cannot be related but I am not yet certain of that).
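
The patch itself is not shown in this comment, so the following is not the WeakCache code or the actual fix. It is a hypothetical two-map cache that illustrates one general way a race of this kind can make a membership check fail for a value that has already been handed out: the value becomes visible through the forward map before it is recorded in the reverse index. All class and method names are invented, and the concurrent reader is simulated in-line so that the outcome is deterministic.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class TwoMapCacheSketch {
    // Forward map: key -> value, used to hand values out to callers.
    private final ConcurrentMap<String, Object> values = new ConcurrentHashMap<>();
    // Reverse index: answers "did this cache produce that value?".
    private final ConcurrentMap<Object, Boolean> reverse = new ConcurrentHashMap<>();

    Object get(String key) {
        return values.get(key);
    }

    boolean containsValue(Object value) {
        return reverse.containsKey(value);
    }

    void publishForward(String key, Object value) {
        values.put(key, value);
    }

    void publishReverse(Object value) {
        reverse.put(value, Boolean.TRUE);
    }

    /**
     * Publishes a value in two steps and, between the steps, plays the part
     * of a concurrent reader. Returns true if that reader could obtain a
     * value that containsValue() does not yet recognise.
     */
    static boolean windowExists(boolean reverseFirst) {
        TwoMapCacheSketch cache = new TwoMapCacheSketch();
        Object value = new Object();

        // First half of the publication.
        if (reverseFirst) {
            cache.publishReverse(value);
        } else {
            cache.publishForward("k", value);
        }

        // Simulated reader running exactly between the two steps.
        Object seen = cache.get("k");
        boolean inconsistent = seen != null && !cache.containsValue(seen);

        // Second half of the publication.
        if (reverseFirst) {
            cache.publishForward("k", value);
        } else {
            cache.publishReverse(value);
        }
        return inconsistent;
    }

    public static void main(String[] args) {
        System.out.println("forward-first window: " + windowExists(false)); // true
        System.out.println("reverse-first window: " + windowExists(true));  // false
    }
}
```

In this sketch, reordering the two publication steps is enough to close the window, which is at least consistent with a one-line fix. Whether the real patch has that shape should be checked against the upstream change.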
Comment 3 Deepak Bhole 2017-02-09 09:36:05 EST
Assigning to Andrew. Andrew, once there is a patch, please ping Jiri to get it into the rpms.
Comment 4 Andrew Dinn 2017-02-10 11:20:09 EST
I have checked this with Peter Levart, who wrote the WeakCache code, and:

i) he agrees that there is a race condition
ii) he agrees that my patch fixes it
iii) he produced a simpler and more reliable reproducer which manifests the problem before the patch and works ok after the patch
iv) his reproducer does not encounter the NullPointerException problems that I saw with the original reproducer

I raised https://bugs.openjdk.java.net/browse/JDK-8174729 to cover this problem and the fix. Peter has offered to post the patch to jdk8u. I am expecting him to include his reproducer as a regression test. Once that is patched upstream we can pull the fix into our release.

So far as the client is concerned this means that the race condition problem they have identified will be resolved but they will probably still continue to see problems in their serialization code along the lines of the ones I have been observing.

I have experimented to see whether those problems relate to the diagnosis provided in the original JIRA and bugs database issue, and I am not convinced they do. I modified the client's reproducer to ensure that references to proxy instances and classes were retained, yet I still saw the NPEs. Also, even if the problem were to do with the client code not maintaining reachability for proxy instances or classes, I believe it has to be incorrect for the JVM to generate NPE traces originating at lines where there is no object dereference and, hence, no potential for a NullPointerException to arise. I believe there is something else going wrong in the JDK/JVM here and that it may well relate to reference processing and GC. I will pursue this further to try to understand what is happening.
Comment 6 Andrew Dinn 2017-02-15 08:29:52 EST
The reproducer provided by Peter Levart has been attached to the OpenJDK issue:

  https://bugs.openjdk.java.net/browse/JDK-8174729
Comment 7 Andrew Dinn 2017-02-24 10:49:03 EST
A fix for JDK-8174729 has been submitted to upstream jdk8u and approved for inclusion.

  http://mail.openjdk.java.net/pipermail/jdk8u-dev/2017-February/006451.html
Comment 8 Andrew Dinn 2017-02-28 12:15:20 EST
The fix for JDK-8174729 has now been committed to upstream jdk8u-dev

  http://hg.openjdk.java.net/jdk8u/jdk8u-dev/jdk/rev/9bc2f86c5e88

n.b. the fix version listed for the bug is 8u152
Comment 11 Andrew John Hughes 2017-04-27 11:12:54 EDT
This was fixed in the recent 8u131 security update.
See java-1.8.0-openjdk-1.8.0.131-0.b11.el6_9 for RHEL 6.9.
Comment 12 Deepak Bhole 2017-10-02 16:14:10 EDT
Closing based on comment #11