1159290 – [GSS](6.4.z) JBAS011603: Failed to destroy queue: DLQ: java.lang.IllegalStateException: Cannot access JMS Server, core server is not yet active...

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	JBoss Enterprise Application Platform 6
Classification:	JBoss
Component:	JMS
Sub Component:
Version:	6.4.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	CR1
Target Release:	EAP 6.4.13
Assignee:	Petr Jurak
QA Contact:	Peter Mackay
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	eap6413-payload 1387698 1390788

Reported:	2014-10-31 11:57 UTC by Miroslav Novak
Modified:	2020-04-15 14:10 UTC
CC List:	9 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2017-02-03 16:43:07 UTC
Type:	Bug
Embargoed:

Attachments
server.log (backup) (86.74 KB, text/plain) 2014-10-31 11:57 UTC, Miroslav Novak	no flags	Details

Links
System	ID	Priority	Status	Summary	Last Updated
Red Hat Issue Tracker	HORNETQ-1535	Major	Open	HornetQ message broker should be able to cope with network failure	2018-01-31 07:18:20 UTC
Red Hat Issue Tracker	JBEAP-447	Critical	Closed	After failback backup prints warnings to log	2018-01-31 07:18:20 UTC
Red Hat Issue Tracker	PRODMGT-1607	Blocker	Pending Engineering Triage	HornetQ message broker should be able to cope with network failure	2018-01-31 07:18:20 UTC
Red Hat Issue Tracker	WFLY-4957	Critical	Closed	After failback backup prints warnings to log	2018-01-31 07:18:20 UTC

Description Miroslav Novak 2014-10-31 11:57:28 UTC

Description of problem:

An IllegalStateException is sometimes thrown after failback from backup to live (in dedicated topology with replicated journal). It appears that the backup server is stopped before its destinations are unbound from JNDI, which causes this error:
...
12:31:18,500 WARN  [org.hornetq.core.server] (Thread-103) HQ222015: LIVE IS STOPPING?!? message=STOP_CALLED enabled=true
12:31:18,500 WARN  [org.hornetq.core.server] (Thread-103) HQ222015: LIVE IS STOPPING?!? message=STOP_CALLED true
...
12:31:18,571 WARN  [org.jboss.messaging] (ServerService Thread Pool -- 69) JBAS011603: Failed to destroy queue: DLQ: java.lang.IllegalStateException: Cannot access JMS Server, core server is not yet active
	at org.hornetq.jms.server.impl.JMSServerManagerImpl.checkInitialised(JMSServerManagerImpl.java:1657) [hornetq-jms-server-2.3.21.Final-redhat-1.jar:2.3.21.Final-redhat-1]
	at org.hornetq.jms.server.impl.JMSServerManagerImpl.access$1100(JMSServerManagerImpl.java:108) [hornetq-jms-server-2.3.21.Final-redhat-1.jar:2.3.21.Final-redhat-1]
	at org.hornetq.jms.server.impl.JMSServerManagerImpl$3.runException(JMSServerManagerImpl.java:820) [hornetq-jms-server-2.3.21.Final-redhat-1.jar:2.3.21.Final-redhat-1]
	at org.hornetq.jms.server.impl.JMSServerManagerImpl.runAfterActive(JMSServerManagerImpl.java:1869) [hornetq-jms-server-2.3.21.Final-redhat-1.jar:2.3.21.Final-redhat-1]
	at org.hornetq.jms.server.impl.JMSServerManagerImpl.removeQueueFromJNDI(JMSServerManagerImpl.java:809) [hornetq-jms-server-2.3.21.Final-redhat-1.jar:2.3.21.Final-redhat-1]
	at org.jboss.as.messaging.jms.JMSQueueService$2.run(JMSQueueService.java:89) [jboss-as-messaging-7.5.0.Final-redhat-9.jar:7.5.0.Final-redhat-9]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_20]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_20]
	at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_20]
	at org.jboss.threads.JBossThread.run(JBossThread.java:122) [jboss-threads-2.1.1.Final-redhat-1.jar:2.1.1.Final-redhat-1]
...

Steps to Reproduce:
1. Start two EAP 6.4.0.DR7 servers in dedicated topology with replicated journal
2. Start a producer and a consumer on a queue
3. Kill the "live" server
4. Wait for the clients to fail over, then start the "live" server again
5. The clients fail back to live and the backup stops itself

Actual results:
The log of the backup server sometimes contains IllegalStateExceptions with "Failed to destroy queue/topic".

Expected results:
No exceptions should be thrown.

Additional info:
Adding server.log from backup server.

Comment 1 Miroslav Novak 2014-10-31 11:57:52 UTC

Created attachment 952466
server.log (backup)

Comment 2 Miroslav Novak 2014-10-31 12:07:24 UTC

To reproduce the problem, follow these steps:
Clone our testsuite from git:
git://git.app.eng.bos.redhat.com/jbossqe/eap-tests-hornetq.git

Go to eap-tests-hornetq/scripts and run the Groovy script PrepareServers.groovy with the -DEAP_VERSION=6.4.0.DR7 parameter:
groovy -DEAP_VERSION=6.4.0.DR7 PrepareServers.groovy

(The script will prepare 4 servers, server1..4, in your current working directory.)

Export the paths to the server directories, plus the directory for the shared journal and the multicast address:
export JBOSS_HOME_1=$PWD/server1/jboss-eap
export JBOSS_HOME_2=$PWD/server2/jboss-eap
export JBOSS_HOME_3=$PWD/server3/jboss-eap
export JBOSS_HOME_4=$PWD/server4/jboss-eap
export MCAST_ADDR=235.3.4.5

Finally, go to jboss-hornetq-testsuite/ in our testsuite and run:
mvn clean test  -Darquillian.xml=arquillian-4-nodes.xml -Peap6x -Dtest=ReplicatedDedicatedFailoverTestCase#testFailbackTransAckQueue

The test does not fail! The only way to detect the problem is to check the server.log of server2, which is the replicated backup.
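
Because the test passes either way, the check has to be made against the log itself. A minimal sketch of such a check in Java, assuming the path to server2's server.log is passed as the first argument. The class name BackupLogCheck and the method findDestroyFailures are hypothetical names used for illustration only and are not part of the testsuite:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of eap-tests-hornetq.
final class BackupLogCheck {

    // Message code taken from the warning quoted in this report.
    static final String DESTROY_FAILURE_CODE = "JBAS011603";

    /** Returns every log line that carries the "Failed to destroy queue/topic" code. */
    public static List<String> findDestroyFailures(Stream<String> lines) {
        return lines
                .filter(line -> line.contains(DESTROY_FAILURE_CODE))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BackupLogCheck <path to server.log of the backup>");
            System.exit(2);
        }
        // Files.lines reads lazily, so the stream must be closed after use.
        try (Stream<String> lines = Files.lines(Paths.get(args[0]))) {
            List<String> hits = findDestroyFailures(lines);
            hits.forEach(System.out::println);
            // A non-zero exit code signals that the problem was reproduced.
            System.exit(hits.isEmpty() ? 0 : 1);
        }
    }
}
```

The sketch only matches the message code, so it reports both the queue and the topic variant of the warning.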

Comment 4 Justin Bertram 2016-06-17 18:12:38 UTC

This looks like a classic race condition.  When org.jboss.as.messaging.jms.JMSQueueService invokes org.hornetq.jms.server.impl.JMSServerManagerImpl.removeQueueFromJNDI the method checks to see if the broker is active (which it is). However, by the time it reaches the next check the broker isn't active anymore and so the exception is thrown.  It looks like the JMSQueueService is working in its own thread while another thread has stopped the broker.  I'm no expert on the messaging subsystem, but it seems to me these threads should coordinate with each other somehow to avoid this race.
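
One way such coordination could look is sketched below. This is an illustration of the general check-then-act pattern only, not the HornetQ or EAP code and not necessarily what the eventual fix does. GuardedBroker, its fields, and its methods are hypothetical names; the sketch assumes that a read/write lock around the lifecycle state is acceptable and that skipping the unbind on an already stopped broker is the desired behaviour:

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a broker; not HornetQ code.
final class GuardedBroker {

    // Unbind operations share the read lock; stop() takes the write lock,
    // so the active flag cannot change between the check and the unbind.
    private final ReentrantReadWriteLock lifecycle = new ReentrantReadWriteLock();
    private boolean active = true;

    /**
     * Attempts to unbind a destination.
     * Returns true if the broker was active for the whole operation,
     * false if it had already been stopped and the unbind was skipped.
     */
    public boolean removeQueueFromJndi(String queueName) {
        lifecycle.readLock().lock();
        try {
            if (!active) {
                // Broker already stopped: skip instead of throwing IllegalStateException.
                return false;
            }
            // The real unbind work would go here; stop() is blocked until we release the lock.
            return true;
        } finally {
            lifecycle.readLock().unlock();
        }
    }

    /** Stops the broker once all in-flight unbind operations have finished. */
    public void stop() {
        lifecycle.writeLock().lock();
        try {
            active = false;
        } finally {
            lifecycle.writeLock().unlock();
        }
    }
}
```

With this arrangement, the thread that stops the broker waits for any unbind that has already passed the active check, and an unbind that arrives after the stop returns false rather than failing with an exception.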

Comment 6 Petr Jurak 2016-11-15 08:20:20 UTC

PR: https://github.com/jbossas/jboss-eap/pull/2880

Comment 7 Peter Mackay 2017-01-17 14:32:28 UTC

I am not seeing the exception anymore with EAP 6.4.13.CP.CR2. Verified.

Comment 8 Petr Penicka 2017-02-03 16:43:07 UTC

Released with EAP 6.4.13 on Feb 02 2017.
