Bug 1269146 - java.lang.NullPointerException at org.infinispan.container.InternalEntryFactoryImpl.create(InternalEntryFactoryImpl.java:62)
Status: CLOSED WORKSFORME
Product: JBoss Data Grid 6
Classification: JBoss
Component: Infinispan
Version: 6.3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Tristan Tarrant
QA Contact: Martin Gencur
Depends On:
Blocks:
 
Reported: 2015-10-06 09:02 EDT by Shay Matasaro
Modified: 2016-01-05 11:28 EST (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-01-05 11:28:31 EST
Type: Bug


Description Shay Matasaro 2015-10-06 09:02:37 EDT
Customer is getting a NullPointerException from JDG during startup of a
new node.

      Caused by: java.lang.NullPointerException
          at org.infinispan.container.InternalEntryFactoryImpl.create(InternalEntryFactoryImpl.java:62)


See my comment on 8/24/2015 9:10 PM EDT for more details.

It's only happening rarely in production, and extremely rarely in QA.

=================================================================

When a node joins and data is rebalanced, transactions are sent to the
new owner, but the transfer does not include all of the transaction data.
org.infinispan.interceptors.TxInterceptor#visitCommitCommand checks for
this case and replays the transaction prepare if needed.
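
A minimal sketch of the replay path described above, and of how a missed
replay would surface as an NPE. This is a hypothetical model, not the
Infinispan source: the names CommitReplaySketch, RemoteTx, lookedUpEntries
and steps, and the exact form of the topology ID comparison, are assumptions
made for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a node that receives commits for transactions
// handed over by state transfer. Not Infinispan source code.
final class CommitReplaySketch {

    // Stand-in for a transaction as the new owner sees it.
    static final class RemoteTx {
        final int prepareTopologyId;              // topology in which the prepare was sent
        final Map<String, String> modifications;  // writes carried by the transaction
        Map<String, String> lookedUpEntries;      // stays null until a prepare runs locally

        RemoteTx(int prepareTopologyId, Map<String, String> modifications) {
            this.prepareTopologyId = prepareTopologyId;
            this.modifications = modifications;
        }
    }

    final Map<String, String> dataContainer = new HashMap<>();
    final List<String> steps = new ArrayList<>();

    void prepare(RemoteTx tx) {
        tx.lookedUpEntries = new HashMap<>(tx.modifications);
    }

    void commit(RemoteTx tx, int commitTopologyId) {
        // Assumed form of the check: replay only if the commit arrives in a
        // newer topology than the one the prepare was sent in.
        if (commitTopologyId > tx.prepareTopologyId) {
            steps.add("replay-prepare");
            prepare(tx);
        }
        steps.add("commit");
        // Throws NullPointerException if no prepare ever ran on this node.
        dataContainer.putAll(tx.lookedUpEntries);
    }

    public static void main(String[] args) {
        Map<String, String> writes = new HashMap<>();
        writes.put("k", "v");

        CommitReplaySketch node = new CommitReplaySketch();
        node.commit(new RemoteTx(4, writes), 5);     // IDs differ: prepare is replayed
        System.out.println(node.steps + " " + node.dataContainer);

        try {
            node.commit(new RemoteTx(5, writes), 5); // IDs match: no replay
        } catch (NullPointerException e) {
            System.out.println("NPE: commit ran without a local prepare");
        }
    }
}
```

In this model the second commit fails because the check sees matching
topology IDs and skips the replay even though no prepare was ever applied
on the node. Whether the real check can be fooled in a comparable way is
exactly what remains to be confirmed.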

The only scenario I've found in which this NPE could occur is when the
transaction needs to be replayed but, for some reason, is not.
The latest logs show that it is *not* replaying any prepare before the
exception occurs.

I suspect something is wrong with the check that decides whether the
prepare needs to be replayed.

The following warning is occurring in close proximity to the exception,
and may be related:

      ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=dataMap, type=REBALANCE_START
      ...
      java.lang.IllegalArgumentException: Received a rebalance start topology ... while there already was a rebalance in progress: ...

I believe we'll need to trace the topology IDs for the prepare and commit
commands, and for the state transfer, to see whether a discrepancy is
causing the topology ID check in TxInterceptor#visitCommitCommand to fail.
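
Once TRACE-level data is available, the comparison could be automated
roughly as follows. This is only a sketch: it assumes the prepare and commit
events seen by the joining node have already been extracted from its logs
into (transaction, command type, topology ID) tuples. The names
TopologyTraceSketch, Event, Kind and suspectCommits are hypothetical, and no
particular Infinispan log format is assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for correlating prepare and commit events per transaction.
final class TopologyTraceSketch {

    enum Kind { PREPARE, COMMIT }

    // One command observed on the node, in arrival order.
    static final class Event {
        final String txId;
        final Kind kind;
        final int topologyId;

        Event(String txId, Kind kind, int topologyId) {
            this.txId = txId;
            this.kind = kind;
            this.topologyId = topologyId;
        }
    }

    // Returns transactions whose commit had no prepare on this node, or
    // whose commit carried a different topology ID than its prepare.
    static List<String> suspectCommits(List<Event> events) {
        Map<String, Integer> prepareTopology = new HashMap<>();
        List<String> suspects = new ArrayList<>();
        for (Event e : events) {
            if (e.kind == Kind.PREPARE) {
                prepareTopology.put(e.txId, e.topologyId);
            } else {
                Integer prepared = prepareTopology.get(e.txId);
                if (prepared == null || prepared != e.topologyId) {
                    suspects.add(e.txId);
                }
            }
        }
        return suspects;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event("tx1", Kind.PREPARE, 4),
                new Event("tx1", Kind.COMMIT, 4),
                new Event("tx2", Kind.COMMIT, 5),   // no prepare seen locally
                new Event("tx3", Kind.PREPARE, 4),
                new Event("tx3", Kind.COMMIT, 5));  // topology changed in between
        System.out.println(suspectCommits(events)); // prints [tx2, tx3]
    }
}
```

Any transaction flagged this way that was not followed by a replayed
prepare would be a candidate for the failure described in this report.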
Comment 2 Tristan Tarrant 2015-10-07 08:12:26 EDT
That part of the code has changed since 6.3.2, so we'd need the customer and GSS to validate this with a current release (6.5.x).
Comment 5 wfink 2016-01-05 11:28:31 EST
The issue is not reproducible (JDG 6.5.1), and there is no further information or detail on how to reproduce it.
There are also no log files with details (TRACE/DEBUG) from the time of the failure.
