Bug 1159290 - [GSS](6.4.z) JBAS011603: Failed to destroy queue: DLQ: java.lang.IllegalStateException: Cannot access JMS Server, core server is not yet active...
Summary: [GSS](6.4.z) JBAS011603: Failed to destroy queue: DLQ: java.lang.IllegalState...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: JMS
Version: 6.4.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: CR1
Target Release: EAP 6.4.13
Assignee: Petr Jurak
QA Contact: Peter Mackay
URL:
Whiteboard:
Depends On:
Blocks: eap6413-payload 1387698 1390788
Reported: 2014-10-31 11:57 UTC by Miroslav Novak
Modified: 2020-04-15 14:10 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-02-03 16:43:07 UTC
Type: Bug
Embargoed:


Attachments
server.log (backup) (86.74 KB, text/plain)
2014-10-31 11:57 UTC, Miroslav Novak


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker HORNETQ-1535 0 Major Open HornetQ message broker should be able to cope with network failure 2018-01-31 07:18:20 UTC
Red Hat Issue Tracker JBEAP-447 0 Critical Closed After failback backup prints warnings to log 2018-01-31 07:18:20 UTC
Red Hat Issue Tracker PRODMGT-1607 0 Blocker Pending Engineering Triage HornetQ message broker should be able to cope with network failure 2018-01-31 07:18:20 UTC
Red Hat Issue Tracker WFLY-4957 0 Critical Closed After failback backup prints warnings to log 2018-01-31 07:18:20 UTC

Description Miroslav Novak 2014-10-31 11:57:28 UTC
Description of problem:

An IllegalStateException is sometimes logged after failback from backup to live (in a dedicated topology with replicated journal). It appears that the backup server is stopped before its destinations are unbound from JNDI, which causes this error:
...
12:31:18,500 WARN  [org.hornetq.core.server] (Thread-103) HQ222015: LIVE IS STOPPING?!? message=STOP_CALLED enabled=true
12:31:18,500 WARN  [org.hornetq.core.server] (Thread-103) HQ222015: LIVE IS STOPPING?!? message=STOP_CALLED true
...
12:31:18,571 WARN  [org.jboss.messaging] (ServerService Thread Pool -- 69) JBAS011603: Failed to destroy queue: DLQ: java.lang.IllegalStateException: Cannot access JMS Server, core server is not yet active
	at org.hornetq.jms.server.impl.JMSServerManagerImpl.checkInitialised(JMSServerManagerImpl.java:1657) [hornetq-jms-server-2.3.21.Final-redhat-1.jar:2.3.21.Final-redhat-1]
	at org.hornetq.jms.server.impl.JMSServerManagerImpl.access$1100(JMSServerManagerImpl.java:108) [hornetq-jms-server-2.3.21.Final-redhat-1.jar:2.3.21.Final-redhat-1]
	at org.hornetq.jms.server.impl.JMSServerManagerImpl$3.runException(JMSServerManagerImpl.java:820) [hornetq-jms-server-2.3.21.Final-redhat-1.jar:2.3.21.Final-redhat-1]
	at org.hornetq.jms.server.impl.JMSServerManagerImpl.runAfterActive(JMSServerManagerImpl.java:1869) [hornetq-jms-server-2.3.21.Final-redhat-1.jar:2.3.21.Final-redhat-1]
	at org.hornetq.jms.server.impl.JMSServerManagerImpl.removeQueueFromJNDI(JMSServerManagerImpl.java:809) [hornetq-jms-server-2.3.21.Final-redhat-1.jar:2.3.21.Final-redhat-1]
	at org.jboss.as.messaging.jms.JMSQueueService$2.run(JMSQueueService.java:89) [jboss-as-messaging-7.5.0.Final-redhat-9.jar:7.5.0.Final-redhat-9]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_20]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_20]
	at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_20]
	at org.jboss.threads.JBossThread.run(JBossThread.java:122) [jboss-threads-2.1.1.Final-redhat-1.jar:2.1.1.Final-redhat-1]
...

Steps to Reproduce:
1. Start 2 EAP 6.4.0.DR7 servers in a dedicated topology with replicated journal
2. Start a producer and a consumer on a queue
3. Kill the "live" server
4. Wait for the clients to fail over, then start the "live" server again
5. The clients fail back to live and the backup stops itself

Actual results:
The log of the backup server sometimes contains IllegalStateExceptions with "Failed to destroy queue/topic" messages.

Expected results:
No exceptions should be thrown.

Additional info:
Adding server.log from backup server.

Comment 1 Miroslav Novak 2014-10-31 11:57:52 UTC
Created attachment 952466
server.log (backup)

Comment 2 Miroslav Novak 2014-10-31 12:07:24 UTC
To reproduce the problem, follow these steps.
Clone our testsuite from git:
git://git.app.eng.bos.redhat.com/jbossqe/eap-tests-hornetq.git

Go to eap-tests-hornetq/scripts and run the Groovy script PrepareServers.groovy with the -DEAP_VERSION=6.4.0.DR7 parameter:
groovy -DEAP_VERSION=6.4.0.DR7 PrepareServers.groovy

(The script prepares 4 servers, server1..4, in the current working directory.)

Export the paths to the server directories, plus the directory for the shared journal and the mcast address:
export JBOSS_HOME_1=$PWD/server1/jboss-eap
export JBOSS_HOME_2=$PWD/server2/jboss-eap
export JBOSS_HOME_3=$PWD/server3/jboss-eap
export JBOSS_HOME_4=$PWD/server4/jboss-eap
export MCAST_ADDR=235.3.4.5

Finally, go to jboss-hornetq-testsuite/ in our testsuite and run:
mvn clean test -Darquillian.xml=arquillian-4-nodes.xml -Peap6x -Dtest=ReplicatedDedicatedFailoverTestCase#testFailbackTransAckQueue

The test does not fail! The only way to see the problem is to check the server.log of server2, which is the replicated backup.
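
Since the test itself passes, the log has to be checked separately. A minimal sketch of such a check, assuming the failure can be recognized by the message ID JBAS011603 from the log excerpt in this report. The class name BackupLogCheck and its methods are hypothetical and are not part of the testsuite. The path to server2's server.log has to be passed as an argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of eap-tests-hornetq.
final class BackupLogCheck {
    // Message ID taken from the log excerpt in this report.
    static final String MARKER = "JBAS011603";

    // Returns every log line that contains the "Failed to destroy queue/topic" message ID.
    static List<String> findFailures(List<String> lines) {
        return lines.stream()
                .filter(line -> line.contains(MARKER))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BackupLogCheck <path to server.log of the backup>");
            System.exit(2);
        }
        List<String> failures = findFailures(Files.readAllLines(Paths.get(args[0])));
        failures.forEach(System.out::println);
        // A non-zero exit code lets a wrapper script fail the run when the warning is present.
        System.exit(failures.isEmpty() ? 0 : 1);
    }
}
```

The filter only matches the message ID, so it reports destroy failures for topics as well as queues.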

Comment 4 Justin Bertram 2016-06-17 18:12:38 UTC
This looks like a classic race condition.  When org.jboss.as.messaging.jms.JMSQueueService invokes org.hornetq.jms.server.impl.JMSServerManagerImpl.removeQueueFromJNDI the method checks to see if the broker is active (which it is). However, by the time it reaches the next check the broker isn't active anymore and so the exception is thrown.  It looks like the JMSQueueService is working in its own thread while another thread has stopped the broker.  I'm no expert on the messaging subsystem, but it seems to me these threads should coordinate with each other somehow to avoid this race.
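
The check-then-act race described above, and one possible way for the two threads to coordinate, can be sketched as follows. This is only an illustration under assumptions: it is not the HornetQ or messaging subsystem code, it is not necessarily what the eventual fix does, and the class BrokerLifecycleSketch and its methods are hypothetical. The idea is that the unbinding thread holds a read lock across both the "is active" check and the removal, while the stopping thread needs the write lock, so the broker cannot become inactive between the two steps.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a broker with JNDI-bound destinations.
final class BrokerLifecycleSketch {
    private final ReentrantReadWriteLock lifecycle = new ReentrantReadWriteLock();
    private final Set<String> bindings = ConcurrentHashMap.newKeySet();
    private boolean active = true; // guarded by the lifecycle lock

    BrokerLifecycleSketch(String... destinationNames) {
        bindings.addAll(Arrays.asList(destinationNames));
    }

    // Unbinds a destination. The read lock keeps stop() from running
    // between the activity check and the removal.
    boolean unbind(String name) {
        lifecycle.readLock().lock();
        try {
            if (!active) {
                return false; // broker already stopped: skip quietly instead of throwing
            }
            return bindings.remove(name);
        } finally {
            lifecycle.readLock().unlock();
        }
    }

    // Stops the broker. The write lock waits for in-flight unbind() calls to finish.
    void stop() {
        lifecycle.writeLock().lock();
        try {
            active = false;
            bindings.clear();
        } finally {
            lifecycle.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        BrokerLifecycleSketch broker = new BrokerLifecycleSketch("DLQ", "ExpiryQueue");
        System.out.println("unbind before stop: " + broker.unbind("DLQ"));
        broker.stop();
        System.out.println("unbind after stop: " + broker.unbind("ExpiryQueue"));
    }
}
```

A read-write lock is used here so that several destinations can still be unbound concurrently; only the stop has to wait. Whether an unbind after stop should return quietly or log at a lower level is a design choice this sketch does not settle.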

Comment 6 Petr Jurak 2016-11-15 08:20:20 UTC
PR: https://github.com/jbossas/jboss-eap/pull/2880

Comment 7 Peter Mackay 2017-01-17 14:32:28 UTC
I am not seeing the exception anymore with EAP 6.4.13.CP.CR2. Verified.

Comment 8 Petr Penicka 2017-02-03 16:43:07 UTC
Released with EAP 6.4.13 on Feb 02 2017.

