Bug 822815

Summary: NPE during JGroups Channel Service startup
Product: [JBoss] JBoss Data Grid 6 Reporter: Michal Linhard <mlinhard>
Component: EAP Assignee: Tristan Tarrant <ttarrant>
Status: CLOSED WORKSFORME QA Contact: Michal Linhard <mlinhard>
Severity: high Docs Contact:
Priority: high    
Version: 6.1.0 CC: gsheldon, jdg-bugs, mhusnain, myarboro, nobody, rvansa
Target Milestone: ER3   
Target Release: 6.1.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Occasionally, when starting a JBoss Data Grid server, the JGroups subsystem would not start because of a NullPointerException during service installation, leaving the server in an unusable state. This situation does not affect data integrity within the cluster, and simply killing the server and restarting it solves the problem.
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-11-28 14:00:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Log from the failing node none

Description Michal Linhard 2012-05-18 09:08:28 UTC
Created attachment 585369
Log from the failing node

See the attached log

Comment 1 Tristan Tarrant 2012-05-18 12:18:18 UTC
How often does it happen?

Comment 2 Michal Linhard 2012-05-18 14:12:37 UTC
I've seen it only once. It happened while I was starting a 32-node test, as the 20th node was starting. I then restarted the test and it worked all right.

Comment 3 Tristan Tarrant 2012-05-18 15:55:31 UTC
The code where this happens is (lines 49-51):

for (Address address : this.channel.getView()) {
    String name = this.channel.getName(address);
    if (name.equals(localName) && !address.equals(localAddress)) {

So an NPE can only occur if name is null or address is null, and name can be null only if address is null. Very odd.
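
For illustration only, a null-tolerant form of that comparison could look like the sketch below. It does not use the JGroups API and is not the actual service code: the view is modelled as a list of strings, the name lookup as a map that may have no entry for an address, and the class and method names (DuplicateNameCheck, findDuplicate) are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class DuplicateNameCheck {

    /**
     * Returns the first address in the view that carries the local node's name
     * but is not the local address, or null if there is none.
     * `names` is a hypothetical stand-in for the channel's name lookup and may
     * have no entry for an address.
     */
    static String findDuplicate(List<String> view, Map<String, String> names,
                                String localAddress, String localName) {
        for (String address : view) {
            String name = names.get(address); // null when no name is known
            // Objects.equals tolerates null on either side, so neither a null
            // name nor a null address can raise a NullPointerException here.
            if (Objects.equals(name, localName) && !Objects.equals(address, localAddress)) {
                return address;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> names = new HashMap<>();
        names.put("addr-1", "node01");
        names.put("addr-2", "node01"); // same logical name, different address
        List<String> view = Arrays.asList("addr-1", null, "addr-3", "addr-2");
        System.out.println(findDuplicate(view, names, "addr-1", "node01"));
    }
}
```

With this form a view entry whose name is not yet known is skipped instead of aborting the loop. Whether skipping is the right behaviour for the real service is a separate question that the sketch does not answer.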

Comment 4 mark yarborough 2012-05-23 14:21:25 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Tristan to supply CCFR

Comment 5 Misha H. Ali 2012-06-06 03:29:35 UTC
    Technical note updated.
    
    Diffed Contents:
@@ -1 +1 @@
-Tristan to supply CCFR
+<remark>Tristan to supply CCFR</remark>

Comment 6 Misha H. Ali 2012-06-07 03:22:04 UTC
Flagging Tristan for information about this bug.

Comment 7 Tristan Tarrant 2012-06-07 09:39:11 UTC
    Technical note updated.
    
    Diffed Contents:
@@ -1 +1 @@
-<remark>Tristan to supply CCFR</remark>
+Occasionally, when starting a JDG server, the JGroups subsystem would not start because of a NullPointerException during service installation, leaving the server in an unusable state. This situation does not affect data integrity within the cluster, and simply killing the server and restarting it solves the problem.

Comment 8 Michal Linhard 2012-06-08 12:46:58 UTC
Happened again in CR1
http://www.qa.jboss.com/~mlinhard/hyperion/run176-elas-dist-32-CR1/logs/analysis/server/categories/cat6_entry0.txt
This time the test went fine up to node 24, where it failed.