Bug 1198662

Summary: BPM cluster fails to initialize properly due to ConcurrentRefUpdateException thrown by jgit
Product: [Retired] JBoss BPMS Platform 6
Reporter: Radovan Synek <rsynek>
Component: Business Central
Assignee: Alexandre Porcelli <porcelli>
Status: CLOSED EOL
QA Contact: Radovan Synek <rsynek>
Severity: high
Docs Contact:
Priority: high
Version: 6.1.0
CC: kverlaen, rrajasek, rsynek
Target Milestone: ER5
Target Release: 6.2.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-03-27 19:09:46 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1153674, 1159278
Attachments:
Description               Flags
server log node one       none
server log node two       none
server log excerpt ER6    none
server log excerpt CR1    none

Description Radovan Synek 2015-03-04 15:24:04 UTC
Created attachment 997924 [details]
server log node one

Description of problem:
In a BPM cluster with two EAP 6.4 nodes in domain mode, the second cluster node sometimes fails to deploy Business Central properly due to org.eclipse.jgit.api.errors.ConcurrentRefUpdateException: Could not lock HEAD. RefUpdate return code was: LOCK_FAILURE. Please see the attached server logs.

In the majority of attempts the issue does not occur, so it is likely a synchronization problem. The first node seems to be working and responsive; the second node is completely lost when the issue happens.
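
To illustrate how this kind of failure can arise, here is a minimal sketch of two writers competing for the same ref. It is not jgit's code. It assumes a git-style lock-file convention (an exclusively created "HEAD.lock" file next to the ref), and the names try_lock, unlock and update_ref are hypothetical. The report does not say which two writers actually collide in the cluster.

```python
import os


def try_lock(ref_path: str) -> bool:
    """Try to take an exclusive lock by atomically creating '<ref>.lock'."""
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so only one caller wins.
        fd = os.open(ref_path + ".lock", os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def unlock(ref_path: str) -> None:
    """Release the lock by removing the lock file."""
    os.remove(ref_path + ".lock")


def update_ref(ref_path: str, new_value: str) -> str:
    """Hypothetical ref update: returns 'LOCK_FAILURE' if the lock is already held."""
    if not try_lock(ref_path):
        return "LOCK_FAILURE"
    try:
        with open(ref_path, "w", encoding="utf-8") as handle:
            handle.write(new_value + "\n")
        return "OK"
    finally:
        unlock(ref_path)
```

In this sketch a writer that loses the race gets an immediate failure instead of waiting, so the outcome depends on timing. That would be consistent with an issue that only shows up in a fraction of attempts.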

Version-Release number of selected component (if applicable):
6.1.0.ER5

How reproducible:
10% - 20%

Steps to Reproduce:
1. Configure two EAP 6.4 nodes in domain mode together with Helix and ZooKeeper
2. Deploy Business Central
3. Watch for errors in the server logs
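
Step 3 can be partly automated. The sketch below scans log lines for the exception text quoted in this report. The function name find_lock_failures is hypothetical, and the pattern assumes the message appears in the log exactly as quoted above.

```python
import re
from typing import Iterable, List, Tuple

# Pattern built from the exception text quoted in this report.
LOCK_FAILURE = re.compile(
    r"ConcurrentRefUpdateException|RefUpdate return code was: LOCK_FAILURE"
)


def find_lock_failures(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for each line that mentions the jgit lock failure."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if LOCK_FAILURE.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

To use it, open each node's server log as a text file and pass the file object to find_lock_failures. An empty result for a given startup means the message was not logged on that node.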

Comment 1 Radovan Synek 2015-03-04 15:24:55 UTC
Created attachment 997925 [details]
server log node two

Comment 4 Radovan Synek 2015-03-10 13:24:59 UTC
Alex,

unfortunately the issue shows up with ER6 as well; attaching a new server log excerpt.

Comment 5 Radovan Synek 2015-03-10 13:26:29 UTC
Created attachment 999911 [details]
server log excerpt ER6

Comment 6 Alexandre Porcelli 2015-03-10 13:36:19 UTC
In what exact setup does this happen? I mean, is this the first startup (no existing repo)? Or does this happen during a restart of an existing setup (some repos already exist)?

Comment 7 Radovan Synek 2015-03-10 13:51:01 UTC
Last time it happened after a failover simulation, so the repository should have been there. To be more specific, the exception showed up on node two, but it was node one that was stopped and started again.

Comment 8 Alexandre Porcelli 2015-03-10 20:48:32 UTC
Additional improvements:

(master) http://github.com/uberfire/uberfire/commit/f656d837f

Here is a change to remove a missing sync method:

(master) http://github.com/droolsjbpm/kie-wb-common/commit/0ab922f2d

Comment 9 Kris Verlaenen 2015-03-10 20:58:18 UTC
Requesting blocker flag for backport to 6.2.x

Comment 11 Radovan Synek 2015-03-27 14:49:39 UTC
It may come from a different place, but the "could not lock HEAD" error is still being thrown; take a look at the new log excerpt (CR1).

Comment 12 Radovan Synek 2015-03-27 14:50:37 UTC
Created attachment 1007330 [details]
server log excerpt CR1

Comment 16 Radovan Synek 2015-11-13 14:57:18 UTC
Verified with BPMS-6.2.0.ER5 that this problem no longer occurs.