Bug 1012999 - [ISPN-3321] NPE in MapReduceTask reduce phase
Status: VERIFIED
Product: JBoss Data Grid 6
Classification: JBoss
Component: Embedded
Version: 6.2.0
Hardware: Unspecified Unspecified
Priority: unspecified
Severity: high
Target Milestone: ER2
Target Release: 6.2.0
Assigned To: Tristan Tarrant
QA Contact: Martin Gencur
Depends On:
Blocks: 1010419
Reported: 2013-09-27 10:48 EDT by Alan Field
Modified: 2014-04-28 11:39 EDT
Doc Type: Bug Fix
Type: Bug
External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
JBoss Issue Tracker | ISPN-3321 | Critical | Resolved | NPE in MapReduceTask reduce phase | 2013-11-26 15:29:51 EST

Description Alan Field 2013-09-27 10:48:37 EDT
Description of problem:

During the execution of a MapReduce word count job with 6 nodes, the following NPE is thrown:
11:19:37,870 ERROR [org.infinispan.remoting.InboundInvocationHandlerImpl] (remote-thread-2) Exception executing command
java.lang.NullPointerException
	at org.infinispan.distexec.mapreduce.MapReduceManagerImpl.reduce(MapReduceManagerImpl.java:153)
	at org.infinispan.commands.read.ReduceCommand.perform(ReduceCommand.java:88)
	at org.infinispan.remoting.InboundInvocationHandlerImpl.handleInternal(InboundInvocationHandlerImpl.java:122)
	at org.infinispan.remoting.InboundInvocationHandlerImpl.access$000(InboundInvocationHandlerImpl.java:68)
	at org.infinispan.remoting.InboundInvocationHandlerImpl$2.run(InboundInvocationHandlerImpl.java:194)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:724)
The full log is here - https://jenkins.mw.lab.eng.bos.redhat.com/hudson/user/afield@REDHAT.COM/my-views/view/afield's%20jobs/job/jdg-radargun-mapreduce-test/181/console-edg-perf06/
I am looking at the code to see if I can figure out what happened.
Comment 2 JBoss JIRA Server 2013-09-30 10:16:12 EDT
Dan Berindei <dberinde@redhat.com> made a comment on jira ISPN-3321

[~afield] The stack traces are a bit outdated. Could we get another run with the latest master?
Comment 3 JBoss JIRA Server 2013-09-30 10:29:00 EDT
Alan Field <afield@redhat.com> made a comment on jira ISPN-3321

[~dan.berindei] I'll try, but this one isn't 100% reproducible.
Comment 4 JBoss JIRA Server 2013-10-01 13:39:25 EDT
Alan Field <afield@redhat.com> made a comment on jira ISPN-3321

[~dan.berindei] These tests were run with Infinispan 5.3.0 Final, if that helps any. Still trying to see if I can reproduce it.
Comment 5 JBoss JIRA Server 2013-10-02 10:05:15 EDT
Dan Berindei <dberinde@redhat.com> made a comment on jira ISPN-3321

[~afield] I think the problem here is that the default intermediary cache configuration has a {{lifespan}} and {{maxIdle}} of 120 seconds. The error appeared almost 3 minutes after the map/reduce task started, so it's very likely that the entry inserted by the mapper in the intermediary cache expired just as the reducer was trying to read it. 
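
The suspected race can be illustrated without Infinispan. The following is only a minimal sketch under stated assumptions: `ExpiryRaceSketch`, `ExpiringStore`, and the naive `reduce` method are hypothetical stand-ins, not the actual `MapReduceManagerImpl` code, and a fake clock replaces real waiting. It shows how a reducer that assumes the mapper's intermediate values are still present hits a NullPointerException once the entry outlives a 120-second lifespan, while a store without expiry does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

public class ExpiryRaceSketch {

    /** Hypothetical stand-in for the intermediate cache: entries vanish after lifespanMillis. */
    static final class ExpiringStore {
        private static final class Entry {
            final List<Integer> values;
            final long createdAt;

            Entry(List<Integer> values, long createdAt) {
                this.values = values;
                this.createdAt = createdAt;
            }
        }

        private final Map<String, Entry> entries = new HashMap<>();
        private final long lifespanMillis; // a negative value means "never expire"
        private final LongSupplier clock;  // injected so the sketch needs no sleeping

        ExpiringStore(long lifespanMillis, LongSupplier clock) {
            this.lifespanMillis = lifespanMillis;
            this.clock = clock;
        }

        void put(String key, List<Integer> values) {
            entries.put(key, new Entry(new ArrayList<>(values), clock.getAsLong()));
        }

        /** Returns null once the entry is at least lifespanMillis old, like an expired cache entry. */
        List<Integer> get(String key) {
            Entry e = entries.get(key);
            if (e == null) {
                return null;
            }
            if (lifespanMillis >= 0 && clock.getAsLong() - e.createdAt >= lifespanMillis) {
                entries.remove(key);
                return null;
            }
            return e.values;
        }
    }

    /** Naive reduce step: assumes the intermediate values written by the mapper still exist. */
    static int reduce(ExpiringStore store, String key) {
        int sum = 0;
        for (int v : store.get(key)) { // throws NullPointerException if the entry has expired
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        AtomicLong now = new AtomicLong(0); // fake clock, in milliseconds
        ExpiringStore withExpiry = new ExpiringStore(120_000, now::get);
        ExpiringStore noExpiry = new ExpiringStore(-1, now::get);

        // "Map phase": both stores receive the same intermediate values.
        withExpiry.put("word", Arrays.asList(1, 1, 1));
        noExpiry.put("word", Arrays.asList(1, 1, 1));

        now.set(170_000); // the reducer runs almost 3 minutes later

        System.out.println("no expiry: " + reduce(noExpiry, "word"));
        try {
            reduce(withExpiry, "word");
        } catch (NullPointerException e) {
            System.out.println("with expiry: NPE, intermediate entry expired before reduce");
        }
    }
}
```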

I don't think there is any reason to have expiry in the intermediary cache, so I'll change the default configuration in CreateCacheCommand. 

If a cache {{__tmpMapReduce}} is defined, its configuration will be used instead of the default configuration in CreateCacheCommand. So defining a cache {{__tmpMapReduce}} without any special settings is a good workaround in the meantime.
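
The workaround described above might look roughly like the following programmatic configuration. This is a sketch, not a verified fix: the class and method names are hypothetical, the choice of `CacheMode.DIST_SYNC` is an assumption (the comment only asks for a cache "without any special settings"), and the explicit `lifespan(-1)` and `maxIdle(-1)` calls simply spell out "no expiry". It presumably needs to run on every node before the MapReduce task starts.

```java
import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.manager.EmbeddedCacheManager;

// Hypothetical helper class; only the cache name comes from the comment above.
public final class TmpMapReduceCacheWorkaround {

    static final String INTERMEDIATE_CACHE = "__tmpMapReduce";

    static void defineIntermediateCache(EmbeddedCacheManager cacheManager) {
        Configuration cfg = new ConfigurationBuilder()
                // Assumption: a distributed synchronous cache suits the intermediate data.
                .clustering().cacheMode(CacheMode.DIST_SYNC)
                // -1 disables lifespan and max-idle expiry for entries in this cache.
                .expiration().lifespan(-1).maxIdle(-1)
                .build();
        // Registers the configuration under the name the MapReduce task looks up.
        cacheManager.defineConfiguration(INTERMEDIATE_CACHE, cfg);
    }
}
```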
Comment 6 JBoss JIRA Server 2013-10-04 04:10:25 EDT
Adrian Nistor <anistor@redhat.com> made a comment on jira ISPN-3321

Integrated in master. Thanks!
Comment 7 Alan Field 2013-10-04 13:57:25 EDT
I'll try to verify this.
Comment 8 Alan Field 2013-10-14 17:03:07 EDT
I have not been able to reproduce this NPE anymore. If I see this again, I will reopen this bug.
