Bug 1012999 - [ISPN-3321] NPE in MapReduceTask reduce phase
Summary: [ISPN-3321] NPE in MapReduceTask reduce phase
Keywords:
Status: VERIFIED
Alias: None
Product: JBoss Data Grid 6
Classification: JBoss
Component: Embedded
Version: 6.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ER2
Target Release: 6.2.0
Assignee: Tristan Tarrant
QA Contact: Martin Gencur
URL:
Whiteboard:
Depends On:
Blocks: 1010419
Reported: 2013-09-27 14:48 UTC by Alan Field
Modified: 2014-04-28 15:39 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | ISPN-3321 | 0 | Critical | Resolved | NPE in MapReduceTask reduce phase | 2013-11-26 20:29:51 UTC

Description Alan Field 2013-09-27 14:48:37 UTC
Description of problem:

During the execution of a MapReduce word count job with 6 nodes, the following NPE is thrown:
11:19:37,870 ERROR [org.infinispan.remoting.InboundInvocationHandlerImpl] (remote-thread-2) Exception executing command
java.lang.NullPointerException
    at org.infinispan.distexec.mapreduce.MapReduceManagerImpl.reduce(MapReduceManagerImpl.java:153)
    at org.infinispan.commands.read.ReduceCommand.perform(ReduceCommand.java:88)
    at org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:122)
    at org.infinispan.remoting.InboundInvocationHandlerImpl.access$000(InboundInvocationHandlerImpl.java:68)
    at org.infinispan.remoting.InboundInvocationHandlerImpl$2.run(InboundInvocationHandlerImpl.java:194)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
The full log is here: https://jenkins.mw.lab.eng.bos.redhat.com/hudson/user/afield@REDHAT.COM/my-views/view/afield's%20jobs/job/jdg-radargun-mapreduce-test/181/console-edg-perf06/
I am looking at the code to see if I can figure out what happened.

Comment 2 JBoss JIRA Server 2013-09-30 14:16:12 UTC
Dan Berindei <dberinde> made a comment on jira ISPN-3321

[~afield] The stack traces are a bit outdated. Could we get another run with the latest master?

Comment 3 JBoss JIRA Server 2013-09-30 14:29:00 UTC
Alan Field <afield> made a comment on jira ISPN-3321

[~dan.berindei] I'll try, but this one isn't 100% reproducible.

Comment 4 JBoss JIRA Server 2013-10-01 17:39:25 UTC
Alan Field <afield> made a comment on jira ISPN-3321

[~dan.berindei] These tests were run with Infinispan 5.3.0 Final, if that helps any. Still trying to see if I can reproduce it.

Comment 5 JBoss JIRA Server 2013-10-02 14:05:15 UTC
Dan Berindei <dberinde> made a comment on jira ISPN-3321

[~afield] I think the problem here is that the default intermediary cache configuration has a {{lifespan}} and {{maxIdle}} of 120 seconds. The error appeared almost 3 minutes after the map/reduce task started, so it's very likely that the entry inserted by the mapper in the intermediary cache expired just as the reducer was trying to read it. 

I don't think there is any reason to have expiry in the intermediary cache, so I'll change the default configuration in CreateCacheCommand. 

If a cache {{__tmpMapReduce}} is defined, its configuration will be used instead of the default configuration in CreateCacheCommand. So defining a cache {{__tmpMapReduce}} without any special settings is a good workaround in the meantime.
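
The suspected failure mode in this comment (an intermediate entry expiring between the map and reduce phases) can be illustrated with a small, self-contained simulation. This is only a sketch and does not use Infinispan: `ExpiringCache`, `reduce`, and `reduceFails` are hypothetical names invented for the illustration, the clock is passed in explicitly, and the 170-second read time is an assumed stand-in for "almost 3 minutes". Only the 120-second lifespan comes from the comment above.

```java
import java.util.HashMap;
import java.util.Map;

public class ExpiringIntermediateCacheDemo {

    /** Hypothetical stand-in for a cache with a fixed entry lifespan; not an Infinispan class. */
    static final class ExpiringCache<K, V> {
        private final Map<K, V> values = new HashMap<>();
        private final Map<K, Long> deadlines = new HashMap<>();
        private final long lifespanMillis; // negative value: entries never expire

        ExpiringCache(long lifespanMillis) {
            this.lifespanMillis = lifespanMillis;
        }

        void put(K key, V value, long nowMillis) {
            values.put(key, value);
            deadlines.put(key, lifespanMillis < 0 ? Long.MAX_VALUE : nowMillis + lifespanMillis);
        }

        /** Returns null once the entry's lifespan has elapsed, as an expiring cache would. */
        V get(K key, long nowMillis) {
            Long deadline = deadlines.get(key);
            if (deadline == null || nowMillis >= deadline) {
                values.remove(key);
                deadlines.remove(key);
                return null;
            }
            return values.get(key);
        }
    }

    /** Sums the intermediate counts for a word with no null check, so an expired entry causes an NPE. */
    static int reduce(ExpiringCache<String, int[]> cache, String word, long nowMillis) {
        int[] counts = cache.get(word, nowMillis);
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    /** Writes one intermediate entry at time 0, then reports whether reducing at readAtMillis throws an NPE. */
    static boolean reduceFails(long lifespanMillis, long readAtMillis) {
        ExpiringCache<String, int[]> intermediate = new ExpiringCache<>(lifespanMillis);
        intermediate.put("infinispan", new int[] {1, 1, 1}, 0L);
        try {
            reduce(intermediate, "infinispan", readAtMillis);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // 120 s lifespan, reducer reads after an assumed 170 s: the entry is gone.
        System.out.println(reduceFails(120_000L, 170_000L));
        // No expiry: the same late read succeeds.
        System.out.println(reduceFails(-1L, 170_000L));
    }
}
```

In this sketch the first call reports a failure and the second does not, which mirrors why removing expiry from the intermediate cache would be expected to avoid the NPE.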

Comment 6 JBoss JIRA Server 2013-10-04 08:10:25 UTC
Adrian Nistor <anistor> made a comment on jira ISPN-3321

Integrated in master. Thanks!

Comment 7 Alan Field 2013-10-04 17:57:25 UTC
I'll try to verify this.

Comment 8 Alan Field 2013-10-14 21:03:07 UTC
I have not been able to reproduce this NPE anymore. If I see this again, I will reopen this bug.

