1269146 – java.lang.NullPointerException at org.infinispan.container.InternalEntryFactoryImpl.create(InternalEntryFactoryImpl.java:62)

Keywords:
Status:	CLOSED WORKSFORME
Alias:	None
Product:	JBoss Data Grid 6
Classification:	JBoss
Component:	Infinispan
Sub Component:
Version:	6.3.2
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Target Release:	---
Assignee:	Tristan Tarrant
QA Contact:	Martin Gencur
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2015-10-06 13:02 UTC by Shay Matasaro
Modified:	2019-10-10 10:18 UTC
CC List:	2 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2016-01-05 16:28:31 UTC
Type:	Bug
Embargoed:

Description Shay Matasaro 2015-10-06 13:02:37 UTC

Customer is getting a NullPointerException from JDG during startup of a
new node.

      Caused by: java.lang.NullPointerException
          at org.infinispan.container.InternalEntryFactoryImpl.create(InternalEntryFactoryImpl.java:62)


See my comment on 8/24/2015 9:10 PM EDT for more details.

It's only happening rarely in production, and extremely rarely in QA.

=================================================================

When a node joins and data is rebalanced, transactions are sent to the
new owner, but not with all of their transaction data.
org.infinispan.interceptors.TxInterceptor#visitCommitCommand checks for
this case and replays the transaction prepare if needed.

The only way I've found for this NPE to occur is if that happens and
the transaction needs to be replayed, but for some reason the replay
does not take place.
The latest logs show that it is *not* replaying any prepare before the
exception occurs.

I suspect there's something wrong with this check to see if it needs to
replay the prepare.
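
As a rough illustration of the commit-time check described above, here
is a minimal sketch. It is not the Infinispan code: TransferredTx,
needsPrepareReplay, replayPrepare and commit are hypothetical names, and
a null map is only an assumed way of modelling "the transaction data was
not transferred".

```java
import java.util.HashMap;
import java.util.Map;

class PrepareReplaySketch {

    /** Hypothetical stand-in for a transaction handed to a joining node. */
    static final class TransferredTx {
        // Null models "state transfer did not carry the transaction data".
        Map<String, String> modifications;
        boolean prepareReplayed = false;
    }

    /** Assumed check: the prepare must be replayed when the data is missing. */
    static boolean needsPrepareReplay(TransferredTx tx) {
        return tx.modifications == null;
    }

    /** Models the replay: the prepare is applied again so the data is populated. */
    static void replayPrepare(TransferredTx tx, Map<String, String> original) {
        tx.modifications = new HashMap<>(original);
        tx.prepareReplayed = true;
    }

    /** Commit path: replay first if needed, then use the modifications. */
    static Map<String, String> commit(TransferredTx tx, Map<String, String> original) {
        if (needsPrepareReplay(tx)) {
            replayPrepare(tx, original);
        }
        // If the check above wrongly returned false, tx.modifications would
        // still be null here and copying it would throw a NullPointerException.
        return new HashMap<>(tx.modifications);
    }

    public static void main(String[] args) {
        Map<String, String> original = new HashMap<>();
        original.put("key", "value");
        TransferredTx tx = new TransferredTx(); // arrives without its data
        System.out.println(commit(tx, original));
    }
}
```

If the check is skipped or evaluates incorrectly, the commit path in
this sketch dereferences missing data, which would be consistent with
the suspicion above.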

The following warning is occurring in close proximity to the exception,
and may be related:

      ISPN000071: Caught exception when handling command CacheTopologyControlCommand{cache=dataMap, type=REBALANCE_START
      ...
      java.lang.IllegalArgumentException: Received a rebalance start topology ... while there already was a rebalance in progress: ...

I believe we'll need to trace the topology IDs for the prepare and
commit commands and for the state transfer, to see whether a
discrepancy is causing the topology ID check in
TxInterceptor#visitCommitCommand to fail.
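
Once topology IDs have been extracted from TRACE logs, the comparison
could look something like the sketch below. CommandRecord and
findTopologyMismatches are hypothetical helpers, not Infinispan types,
and the record layout is an assumption; parsing the actual log lines is
left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TopologyTraceSketch {

    /** Hypothetical record of one traced command; not an Infinispan type. */
    static final class CommandRecord {
        final String txId;
        final String type;      // "PREPARE" or "COMMIT"
        final int topologyId;

        CommandRecord(String txId, String type, int topologyId) {
            this.txId = txId;
            this.type = type;
            this.topologyId = topologyId;
        }
    }

    /**
     * Returns the IDs of transactions whose commit was traced with a
     * different topology ID than their prepare. Commits with no traced
     * prepare are skipped here and would need a separate look.
     */
    static List<String> findTopologyMismatches(List<CommandRecord> records) {
        // First pass: remember the topology ID seen for each prepare.
        Map<String, Integer> prepareTopology = new HashMap<>();
        for (CommandRecord r : records) {
            if ("PREPARE".equals(r.type)) {
                prepareTopology.put(r.txId, r.topologyId);
            }
        }
        // Second pass: compare each commit against its prepare.
        List<String> mismatches = new ArrayList<>();
        for (CommandRecord r : records) {
            if (!"COMMIT".equals(r.type)) {
                continue;
            }
            Integer prepared = prepareTopology.get(r.txId);
            if (prepared != null && prepared != r.topologyId) {
                mismatches.add(r.txId);
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        List<CommandRecord> records = Arrays.asList(
                new CommandRecord("tx1", "PREPARE", 5),
                new CommandRecord("tx1", "COMMIT", 5),
                new CommandRecord("tx2", "PREPARE", 5),
                new CommandRecord("tx2", "COMMIT", 6));
        System.out.println(findTopologyMismatches(records));
    }
}
```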

Comment 2 Tristan Tarrant 2015-10-07 12:12:26 UTC

That part of the code changed after 6.3.2, so we'd need the customer and GSS to validate this against a current release (6.5.x).

Comment 5 wfink 2016-01-05 16:28:31 UTC

The issue is not reproducible (JDG 6.5.1), and there is no further information or detail on how to reproduce it.
There are also no log files with details (TRACE/DEBUG) from the time of the failure.
