908261 – Attribute check-for-live-server must be set on live server to fail back

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	JBoss Enterprise Application Platform 6
Classification:	JBoss
Component:	Documentation
Sub Component:
Version:	6.1.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	urgent
Target Milestone:	ER7
Target Release:	EAP 6.3.0
Assignee:	David Michael
QA Contact:
Docs Contact:	David Michael
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2013-02-06 09:48 UTC by Miroslav Novak
Modified:	2014-08-14 15:19 UTC
CC List:	7 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2014-06-28 15:41:15 UTC
Type:	Bug
Embargoed:

Attachments
configuration for live server (27.83 KB, text/xml), 2013-02-06 09:49 UTC, Miroslav Novak
configuration for backup server (27.99 KB, text/xml), 2013-02-06 09:50 UTC, Miroslav Novak

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Issue Tracker	AS7-6460	0	Major	Closed	Attribute check-for-live-server must be set on live server to fail back	2017-08-22 09:38:48 UTC

Description Miroslav Novak 2013-02-06 09:48:04 UTC

With a live/backup pair that uses a replicated journal, it should be sufficient to set the "check-for-live-server" attribute in the messaging subsystem only on the backup in order to force the backup server to shut down when the live server comes back up. However, this does not happen.
Failback succeeds (the backup shuts itself down) only when "check-for-live-server" is set on the live server.
The HornetQ project documentation does not make clear where "check-for-live-server" should be set:
http://docs.jboss.org/hornetq/2.3.0.CR1/docs/user-manual/html_single/index.html#ha.allow-fail-back

This issue was hit with EAP 6.1.0.DR2 (HQ 2.3.0.CR1).
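
For illustration, the setting that made failback work in this report might look like the fragment below on the live server. This is a minimal sketch, not a copy of the attached configuration: it assumes the element names of the EAP 6 messaging subsystem and omits every other element that a working server needs.

```xml
<!-- Live server: fragment of the hornetq-server element in the messaging subsystem.
     Sketch only; connectors, acceptors and all other elements are omitted. -->
<hornetq-server>
    <!-- Use a replicated journal instead of a shared store -->
    <shared-store>false</shared-store>
    <!-- On startup, check whether a backup has taken over as live,
         so that the restarted live server can fail back -->
    <check-for-live-server>true</check-for-live-server>
</hornetq-server>
```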

Comment 1 Miroslav Novak 2013-02-06 09:49:29 UTC

Created attachment 693833
configuration for live server

Comment 2 Miroslav Novak 2013-02-06 09:50:03 UTC

Created attachment 693834
configuration for backup server

Comment 3 JBoss JIRA Server 2013-02-08 12:46:56 UTC

Francisco Borges <francisco.borges> made a comment on jira AS7-6460

Hi,

I just sent a PR that improves the documentation of this option. The commit is:
https://github.com/FranciscoBorges/hornetq/commit/60381397aeacba97e95f10df4647a494785468fa

Comment 4 JBoss JIRA Server 2013-02-08 13:19:18 UTC

Miroslav Novak <mnovak> made a comment on jira AS7-6460

There is one more thing we could mention, although I'm not sure whether it is really a problem now. Imagine this scenario:

1. There is a live/backup pair in a dedicated topology with a replicated journal.
2. The live server is killed and the backup takes over its role (so everything fails over to the backup).
3. Now, for whatever reason, the backup is shut down or killed too.
4. An administrator starts the live server first and then starts the backup server.

The live server is not up to date, and the backup will corrupt its journal when it synchronizes with the "old" live server.

I guess we could add a warning to the documentation not to do this, if it really is an issue.

Comment 5 JBoss JIRA Server 2013-02-18 13:07:01 UTC

Francisco Borges <francisco.borges> made a comment on jira AS7-6460

My assumption is that the backup having newer data in a case like this is a "given". @AndyTaylor and @Clebert, do you have any opinions?

Notice that when the backup "restarts" as a backup, it moves its data to a "side" directory, so its "exclusive" data won't get deleted that easily.

Comment 6 Clebert Suconic 2013-02-18 20:41:02 UTC

@Francisco +1

Comment 7 Miroslav Novak 2013-02-21 15:11:51 UTC

PR for HornetQ doc:
https://github.com/FranciscoBorges/hornetq/commit/8d828124365a5f4de236d6228a114dffa13d2d08

Moving bz to ON_QA status for later verification.

Comment 8 Miroslav Novak 2013-02-21 15:20:39 UTC

After discussion with the developers, it was agreed that this is a feature. Can it be documented in the EAP 6 documentation?

PR:
https://github.com/FranciscoBorges/hornetq/commit/8d828124365a5f4de236d6228a114dffa13d2d08

The feature is documented in the pull request above.

Assigning this bz to the doc team.

Comment 9 Tom WELLS 2013-02-21 23:05:06 UTC

This BZ will be added to the list for review once the EAP 6.1 PRD commitments are complete.

Comment 12 Miroslav Novak 2014-05-22 10:13:24 UTC

Hi David,

thanks a lot for adding all those attributes. I did a review in another bz [1] that is related to this one (it's in comment #4).

Can you coordinate with Nichola to pick up the updates, please?

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1099809#c4

Comment 15 Miroslav Novak 2014-06-04 07:23:56 UTC

Hi David,

reading this BZ, there is one more thing to be documented here. Can you add a new paragraph to chapter 18.10.4. HornetQ Message Replication? It is meant to help administrators avoid a dangerous situation:

"To return to the original state after failover, it is necessary to start the live server again and wait until it is fully synchronized with the backup. Only then can you shut down the backup so that the original live server activates again (this happens automatically when the allow-failback attribute is set to true)."

Thanks,
Mirek
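
As an illustration of the allow-failback setting mentioned in the proposed paragraph, the backup server's configuration might contain a fragment like the one below. This is a sketch that assumes the element names of the EAP 6 messaging subsystem; it is not taken from the attached configuration, and all other elements are omitted.

```xml
<!-- Backup server: fragment of the hornetq-server element in the messaging subsystem.
     Sketch only; connectors, acceptors and all other elements are omitted. -->
<hornetq-server>
    <!-- Start this server as a backup -->
    <backup>true</backup>
    <!-- Use a replicated journal instead of a shared store -->
    <shared-store>false</shared-store>
    <!-- Hand control back to the original live server automatically
         once it has restarted and synchronized -->
    <allow-failback>true</allow-failback>
</hornetq-server>
```

The live server still needs check-for-live-server set to true, as reported in the description of this bug.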

Comment 17 Nikoleta Hlavickova 2014-06-06 11:25:51 UTC

The book has not been rebuilt since this BZ was moved to MODIFIED so it should not be in ON_QA state.

Comment 18 sgilda 2014-06-09 19:55:12 UTC

This can be verified on DocStage here:

http://documentation-devel.engineering.redhat.com/site/documentation/en-US/JBoss_Enterprise_Application_Platform/6.3/html-single/Administration_and_Configuration_Guide/index.html#HornetQ_Message_Replication

Comment 20 Miroslav Novak 2014-06-16 09:59:24 UTC

Thanks David! Setting as verified.
