Bug 892565

Summary: StateConsumerImpl can build InvalidateL1Command with replicated cache
Product: [JBoss] JBoss Data Grid 6 Reporter: Radim Vansa <rvansa>
Component: InfinispanAssignee: Tristan Tarrant <ttarrant>
Status: VERIFIED --- QA Contact: Martin Gencur <mgencur>
Severity: high Docs Contact:
Priority: medium    
Version: 6.1.0CC: jdg-bugs
Target Milestone: DR2   
Target Release: 6.2.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
JBoss Data Grid throws a NullPointerException on cache shutdown, which is a harmless warning. This exception is never propagated to user code. This situation occurs when on cache shutdown, the node sometimes receives the new consistent hash, where all its data segments are removed. It will then attempt to discard that data or move it to L1 if enabled. This fails due to NullPointerException as some cache components are already shut down (DistributionManager). This affects both distributed and replicated caches. </para> <para> This last consistent hash update is irrelevant and is not always received, thus not always reproducible. Failing to process it is logged as a warning. This issue can only be avoided by ensuring LocalTopologyManager does not deliver consistent hash updates after leave.
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Radim Vansa 2013-01-07 10:55:21 UTC
StateConsumerImpl.invalidateSegments(...) may build up InvalidateL1Command even for replicated cache which cannot feature L1.

This may later result in

ISPN000136: Execution error: java.lang.NullPointerException
at org.infinispan.commands.write.InvalidateL1Command.perform(InvalidateL1Command.java:109) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.interceptors.CallInterceptor.handleDefault(CallInterceptor.java:110) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.AbstractVisitor.visitInvalidateCommand(AbstractVisitor.java:141) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.AbstractVisitor.visitInvalidateL1Command(AbstractVisitor.java:146) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.write.InvalidateL1Command.acceptVisitor(InvalidateL1Command.java:192) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:132) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.AbstractVisitor.visitInvalidateCommand(AbstractVisitor.java:141) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.AbstractVisitor.visitInvalidateL1Command(AbstractVisitor.java:146) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.write.InvalidateL1Command.acceptVisitor(InvalidateL1Command.java:192) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.interceptors.EntryWrappingInterceptor.invokeNextAndApplyChanges(EntryWrappingInterceptor.java:274) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.interceptors.EntryWrappingInterceptor.visitInvalidateL1Command(EntryWrappingInterceptor.java:157) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.write.InvalidateL1Command.acceptVisitor(InvalidateL1Command.java:192) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:132) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.AbstractVisitor.visitInvalidateCommand(AbstractVisitor.java:141) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.AbstractVisitor.visitInvalidateL1Command(AbstractVisitor.java:146) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.write.InvalidateL1Command.acceptVisitor(InvalidateL1Command.java:192) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.interceptors.locking.AbstractLockingInterceptor.visitInvalidateL1Command(AbstractLockingInterceptor.java:98) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.write.InvalidateL1Command.acceptVisitor(InvalidateL1Command.java:192) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:132) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.AbstractVisitor.visitInvalidateCommand(AbstractVisitor.java:141) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.AbstractVisitor.visitInvalidateL1Command(AbstractVisitor.java:146) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.write.InvalidateL1Command.acceptVisitor(InvalidateL1Command.java:192) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.statetransfer.StateTransferInterceptor.visitInvalidateL1Command(StateTransferInterceptor.java:173) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.write.InvalidateL1Command.acceptVisitor(InvalidateL1Command.java:192) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:118) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:128) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.interceptors.InvocationContextInterceptor.handleDefault(InvocationContextInterceptor.java:92) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.AbstractVisitor.visitInvalidateCommand(AbstractVisitor.java:141) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.AbstractVisitor.visitInvalidateL1Command(AbstractVisitor.java:146) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.commands.write.InvalidateL1Command.acceptVisitor(InvalidateL1Command.java:192) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.interceptors.InterceptorChain.invoke(InterceptorChain.java:343) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.statetransfer.StateConsumerImpl.invalidateSegments(StateConsumerImpl.java:666) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.statetransfer.StateConsumerImpl.onTopologyUpdate(StateConsumerImpl.java:267) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.statetransfer.StateTransferManagerImpl.doTopologyUpdate(StateTransferManagerImpl.java:194) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.statetransfer.StateTransferManagerImpl.access$000(StateTransferManagerImpl.java:60) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.statetransfer.StateTransferManagerImpl$1.updateConsistentHash(StateTransferManagerImpl.java:120) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.topology.LocalTopologyManagerImpl.handleConsistentHashUpdate(LocalTopologyManagerImpl.java:194) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.topology.CacheTopologyControlCommand.doPerform(CacheTopologyControlCommand.java:165) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.topology.CacheTopologyControlCommand.perform(CacheTopologyControlCommand.java:137) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.executeCommandFromLocalCluster(CommandAwareRpcDispatcher.java:251) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher.handle(CommandAwareRpcDispatcher.java:218) [infinispan-core-5.2.0.Beta6-redhat-1.jar:5.2.0.Beta6-redhat-1]
at org.jgroups.blocks.RequestCorrelator.handleRequest(RequestCorrelator.java:483) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.blocks.RequestCorrelator.receiveMessage(RequestCorrelator.java:390) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.blocks.RequestCorrelator.receive(RequestCorrelator.java:248) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.blocks.MessageDispatcher$ProtocolAdapter.up(MessageDispatcher.java:598) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.blocks.mux.MuxUpHandler.up(MuxUpHandler.java:130) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.JChannel.up(JChannel.java:703) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.stack.ProtocolStack.up(ProtocolStack.java:1020) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.protocols.RSVP.up(RSVP.java:172) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.protocols.FRAG2.up(FRAG2.java:181) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.protocols.FlowControl.up(FlowControl.java:400) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.protocols.FlowControl.up(FlowControl.java:418) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.protocols.pbcast.GMS.up(GMS.java:896) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:244) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.protocols.UNICAST2.up(UNICAST2.java:432) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.protocols.pbcast.NAKACK2.handleMessage(NAKACK2.java:721) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:574) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:187) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:288) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.protocols.MERGE2.up(MERGE2.java:205) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.protocols.Discovery.up(Discovery.java:359) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.protocols.TP$ProtocolAdapter.up(TP.java:2640) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.protocols.TP.passMessageUp(TP.java:1287) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1850) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1823) [jgroups-3.2.4.Final-redhat-1.jar:3.2.4.Final-redhat-1]
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) [rt.jar:1.6.0_33]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) [rt.jar:1.6.0_33]
at java.lang.Thread.run(Thread.java:662) [rt.jar:1.6.0_33]

Comment 1 JBoss JIRA Server 2013-01-07 11:00:52 UTC
Radim Vansa <rvansa> updated the status of jira ISPN-2692 to Closed

Comment 2 JBoss JIRA Server 2013-01-07 11:00:52 UTC
Radim Vansa <rvansa> made a comment on jira ISPN-2692

This is basically a duplicate of ISPN-2589 (sorry, noticed that too late) closely related to ISPN-2691.

Comment 3 JBoss JIRA Server 2013-01-10 10:26:37 UTC
Mircea Markus <mmarkus> made a comment on jira ISPN-2589

this is a replicated cache though, L1 logic should not be present at all.

Comment 4 JBoss JIRA Server 2013-01-10 10:38:13 UTC
Adrian Nistor <anistor> made a comment on jira ISPN-2589

Indeed, this being a repl cache there should be no L1 logic. But even if L1 invalidation logic exist (and it's there because same code is used for dist mode) it should be harmless because that code path should never be reached since in repl mode no segments are ever removed. So I'm thinking the whole issue is an anomaly that disappears after we solve ISPN-2691.

Comment 5 JBoss JIRA Server 2013-01-23 15:04:39 UTC
Michal Linhard <mlinhard> made a comment on jira ISPN-2589

Occurence in JDG 6.1.0.ER9 (infinispan 5.2.0.CR2):
http://www.qa.jboss.com/~mlinhard/hyperion3/run0037-startup-ER9/report/loganalysis/server/categories/cat8_entry0.txt

Comment 6 JBoss JIRA Server 2013-01-23 15:12:43 UTC
Michal Linhard <mlinhard> made a comment on jira ISPN-2589

sorry this is not REPL_SYNC case, should this be a new JIRA ?

Comment 7 JBoss JIRA Server 2013-01-23 15:15:17 UTC
Michal Linhard <mlinhard> made a comment on jira ISPN-2589

This is my config: http://www.qa.jboss.com/~mlinhard/hyperion3/run0037-startup-ER9/report/configs.zip
added standalone_node0001.xml

Comment 8 JBoss JIRA Server 2013-01-23 15:17:49 UTC
Adrian Nistor <anistor> made a comment on jira ISPN-2589

Can't access www.qa.jboss.com even with the vpn. So what cache mode is it if not REPL_SYNC?

Comment 9 JBoss JIRA Server 2013-01-23 15:32:01 UTC
Michal Linhard <mlinhard> made a comment on jira ISPN-2589

from attached file standalone_node0001.xml:

{code}
      <subsystem default-cache-container="default" xmlns="urn:jboss:datagrid:infinispan:6.1">
         <cache-container default-cache="testCache" name="default">
            <transport executor="infinispan-transport" lock-timeout="600000"/>

            <distributed-cache batching="false" indexing="NONE" l1-lifespan="0" mode="SYNC" name="testCache" owners="2" remote-timeout="60000" segments="40" start="EAGER">
               <locking acquire-timeout="3000" concurrency-level="1000" isolation="REPEATABLE_READ" striping="false"/>
               <transaction mode="NONE"/>
               <state-transfer enabled="true" timeout="600000"/>
               <eviction max-entries="-1" strategy="NONE"/>
            </distributed-cache>
         </cache-container>
      </subsystem>

{code}

Comment 10 JBoss JIRA Server 2013-01-23 16:20:53 UTC
Adrian Nistor <anistor> made a comment on jira ISPN-2589

And the NPE is thrown from exactly the same line from InvalidateL1Command? (sorry to ask so many details - cannot access the qa machine for logs)

Comment 11 JBoss JIRA Server 2013-01-23 16:49:34 UTC
Michal Linhard <mlinhard> made a comment on jira ISPN-2589

No probs. Does the link http://dev39.mw.lab.eng.bos.redhat.com/~mlinhard/hyperion3/run0037-startup-ER9/report/loganalysis/server/categories/cat8_entry0.txt work for you ? if not I'll paste here

Comment 12 JBoss JIRA Server 2013-01-23 16:58:42 UTC
Adrian Nistor <anistor> made a comment on jira ISPN-2589

Works. Thanks!

Comment 13 JBoss JIRA Server 2013-01-23 17:08:27 UTC
Michal Linhard <mlinhard> made a comment on jira ISPN-2589

And one more occurence in 5.2.0.CR2 with DIST_SYNC mode, now originating in JGroups stack:
https://jenkins.mw.lab.eng.bos.redhat.com/hudson/view/EDG6/view/EDG-REPORTS-PERF/job/jdg-cs-perf-client-stress-test-hotrod/3/artifact/report/size8/loganalysis/server/categories/cat8_entry0.txt

Comment 14 JBoss JIRA Server 2013-02-05 13:48:19 UTC
Adrian Nistor <anistor> made a comment on jira ISPN-2589

It seems this exception is actually a warning logged during cache shutdown. The exception is benign and is never propagated to user code. All we can do about it is add a simple check to prevent it happening.

Comment 15 JBoss JIRA Server 2013-02-05 13:48:44 UTC
Adrian Nistor <anistor> updated the status of jira ISPN-2589 to Coding In Progress

Comment 17 JBoss JIRA Server 2013-02-07 14:23:16 UTC
Adrian Nistor <anistor> made a comment on jira ISPN-2589

On cache leave (shutdown) the node _sometimes_ receives the new CH in which all its data segments are removed and it will (needlessly) attempt to discard that data (or move it to L1 if enabled) but fails due to NPE because some cache components are already shut down (DistributionManager). This affects both DIST and REPL caches because the state transfer code is common but it was never expected that segments are ever removed in a REPL cache (can happen only during shutdown). This last CH update is irrelevant and is not always received, thus not always reproducible and failing to process it is just logged as a warning. The best way to fix this is to ensure LocalTopologyManager does not deliver CH updates after leave.