Bug 1267599

Summary: SASL failure on jgroups merge event
Product: [JBoss] JBoss Data Grid 6
Reporter: Shay Matasaro <smatasar>
Component: JGroups
Assignee: Tristan Tarrant <ttarrant>
Status: VERIFIED
QA Contact: Martin Gencur <mgencur>
Severity: urgent
Priority: urgent
Version: 6.5.0
CC: afield, bban, chuffman, pslavice, vjuranek, wfink
Target Milestone: DR4   
Target Release: 6.6.0   
Hardware: Unspecified   
OS: Unspecified   
Doc Type: Bug Fix
Doc Text:
In Red Hat JBoss Data Grid, when node authentication was enabled and a JDG node left the cluster, that node was unable to rejoin. This issue is resolved as of Red Hat JBoss Data Grid 6.6.0: with node authentication enabled, a JDG node that leaves the cluster can now rejoin it.
Clones: 1271662, 1275292
Type: Bug
Bug Blocks: 1271662, 1275292    
Attachments:
jgroups unit test reproducer (flags: none)

Description Shay Matasaro 2015-09-30 14:00:35 UTC
When SASL is used for a JDG cluster, a node that drops out of the cluster is unable to rejoin.


18:33:52,203 WARNING [org.jgroups.protocols.pbcast.Merger] (MergeTask,xxxxxxxxxxavs20-41239) xxxxxxxxxxavs20-41239: merge is cancelled: did not get any merge responses from partition coordinators


18:33:52,207 WARNING [org.jgroups.protocols.SASL] (OOB-326,xxxxxxxxxxavs20-41239) failed to validate CHALLENGE from xxxxxxxxxxavs20-41239, token: javax.security.sasl.SaslException: DIGEST-MD5: digest response format violation. Missing username.
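
For reference, the "Missing username" text in the warning above is the kind of error a DIGEST-MD5 server raises when the bytes it is asked to validate carry no username directive. The sketch below provokes that condition with the JDK's own javax.security.sasl API rather than with JGroups, so it illustrates the error class only and says nothing about why the merge path triggers it. The class name MissingUsernameDemo, the protocol name "jgroups", the server name "localhost", and the hand-written reply are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import javax.security.auth.callback.CallbackHandler;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslException;
import javax.security.sasl.SaslServer;

// Hypothetical demo class; not part of JGroups or the attached reproducer.
final class MissingUsernameDemo {

    /**
     * Feeds the given bytes to a fresh DIGEST-MD5 server as the client's reply.
     * Returns the SaslException message, or null if the reply was accepted.
     */
    static String validate(byte[] reply) {
        // Assumption: no callback is needed before the username check,
        // so a no-op handler is sufficient for this sketch.
        CallbackHandler noOp = callbacks -> { };
        try {
            SaslServer server = Sasl.createSaslServer(
                    "DIGEST-MD5", "jgroups", "localhost",
                    new HashMap<String, Object>(), noOp);
            if (server == null) {
                throw new IllegalStateException("no DIGEST-MD5 provider available");
            }
            server.evaluateResponse(new byte[0]); // step 1: server issues its challenge
            server.evaluateResponse(reply);       // step 2: server validates the reply
            return null;
        } catch (SaslException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A syntactically valid directive list that deliberately omits "username".
        byte[] noUsername = "realm=\"localhost\"".getBytes(StandardCharsets.UTF_8);
        System.out.println(validate(noUsername));
    }
}
```

On a stock JDK this should print a DIGEST-MD5 format-violation message; the exact wording may differ between SASL providers.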

A test harness is attached.

Comment 1 Shay Matasaro 2015-09-30 14:09:53 UTC
Created attachment 1078667: jgroups unit test reproducer

You should be able to reproduce the issue by running the testMerging2Members() unit test in MergeTest.java, which is adapted from MergeTest in the JGroups unit tests.

The test can also be run from the command line, from the bin folder of AvsSaslError, with the following command:

java -cp .;..\lib\JDGSASLPropCallbackHandlers-module.jar;..\lib\jgroups-3.4.4.Final-redhat-5.jar;..\lib\junit-4.8.1.jar;..\lib\log4j-1.2.15.jar org.junit.runner.JUnitCore org.jgroups.protocols.MergeTest

Comment 3 Shay Matasaro 2015-09-30 14:14:40 UTC
Another stack trace:

12:19:55,566 WARN  [org.jgroups.protocols.pbcast.GMS] (Incoming-2,shared=tcp) jdg2/clustered: not member of view [jdg1/clustered|2]; discarding it
12:20:43,618 WARN  [org.jgroups.protocols.SASL] (OOB-19,shared=tcp) failed to validate CHALLENGE from jdg2/clustered, token: javax.security.sasl.SaslException: DIGEST-MD5: digest response format violation. Missing username.
    at org.jboss.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:441)
    at org.jboss.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java:270)
    at org.jgroups.auth.sasl.SaslServerContext.nextMessage(SaslServerContext.java:73) [jgroups-3.6.3.Final-redhat-3.jar:3.6.3.Final-redhat-3]
    at org.jgroups.protocols.SASL.up(SASL.java:234) [jgroups-3.6.3.Final-redhat-3.jar:3.6.3.Final-redhat-3]
    at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:234) [jgroups-3.6.3.Final-redhat-3.jar:3.6.3.Final-redhat-3]
    at org.jgroups.protocols.UNICAST3.deliverMessage(UNICAST3.java:1064) [jgroups-3.6.3.Final-redhat-3.jar:3.6.3.Final-redhat-3]
    at org.jgroups.protocols.UNICAST3.handleDataReceivedFromSelf(UNICAST3.java:810) [jgroups-3.6.3.Final-redhat-3.jar:3.6.3.Final-redhat-3]
    at org.jgroups.protocols.UNICAST3.up(UNICAST3.java:424) [jgroups-3.6.3.Final-redhat-3.jar:3.6.3.Final-redhat-3]
    at org.jgroups.protocols.pbcast.NAKACK2.up(NAKACK2.java:652) [jgroups-3.6.3.Final-redhat-3.jar:3.6.3.Final-redhat-3]
    at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:155) [jgroups-3.6.3.Final-redhat-3.jar:3.6.3.Final-redhat-3]
    at org.jgroups.protocols.FD_ALL.up(FD_ALL.java:200) [jgroups-3.6.3.Final-redhat-3.jar:3.6.3.Final-redhat-3]
    at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:299) [jgroups-3.6.3.Final-redhat-3.jar:3.6.3.Final-redhat-3]
    at org.jgroups.protocols.MERGE3.up(MERGE3.java:286) [jgroups-3.6.3.Final-redhat-3.jar:3.6.3.Final-redhat-3]
    at org.jgroups.protocols.Discovery.up(Discovery.java:291) [jgroups-3.6.3.Final-redhat-3.jar:3.6.3.Final-redhat-3]
    at org.jgroups.protocols.TP$ProtocolAdapter.up(TP.java:2842) [jgroups-3.6.3.Final-redhat-3.jar:3.6.3.Final-redhat-3]
    at org.jgroups.protocols.TP.passMessageUp(TP.java:1577) [jgroups-3.6.3.Final-redhat-3.jar:3.6.3.Final-redhat-3]
    at org.jgroups.protocols.TP$3.run(TP.java:1511) [jgroups-3.6.3.Final-redhat-3.jar:3.6.3.Final-redhat-3]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_40]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_40]
    at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_40]

12:20:48,615 WARN  [org.jgroups.protocols.pbcast.GMS] (MergeTask,jdg2/clustered) jdg2/clustered: merge is cancelled: merge leader rejected merge request
12:20:48,618 WARN  [org.jgroups.protocols.SASL] (INT-5,shared=tcp) failed to validate SaslHeader from jdg2/clustered, header: payload=[B@28c0b63c

Comment 8 Tristan Tarrant 2015-10-06 09:08:09 UTC
I have issued an upstream PR: https://github.com/belaban/JGroups/pull/240

Comment 10 Vojtech Juranek 2015-10-16 15:38:40 UTC
Verified with the provided reproducer that this is fixed in JGroups 3.6.3.Final-redhat-4.