Bug 901137 (JBPAPP6-1273)
Summary: | Server cannot be shutdowned gracefully when reconnect-attempts is set to -1 |
---|---|
Product: | [JBoss] JBoss Enterprise Application Platform 6 |
Component: | HornetQ |
Reporter: | Miroslav Novak <mnovak> |
Assignee: | Francisco Borges <francisco.borges> |
Status: | CLOSED CURRENTRELEASE |
Severity: | high |
Priority: | high |
Version: | 6.1.0 |
CC: | anmiller, atangrin, cdewolf, csuconic, dandread, francisco.borges, jawilson, mnovak, myarboro, nziakova, pslavice, sappleto |
Target Milestone: | ER6 |
Target Release: | EAP 6.1.0 |
Keywords: | TestBlocker |
Hardware: | Unspecified |
OS: | Unspecified |
URL: | http://jira.jboss.org/jira/browse/JBPAPP6-1273 |
Doc Type: | Bug Fix |
Type: | Bug |
Bug Depends On: | 928911 |
Description
Miroslav Novak
2012-10-31 12:59:02 UTC
Attachment: Added: reproducer-shutdown.zip
Attachment: Added: mdb-server-console.log

This issue is listed as Major or below and as such is not targeted for the EAP 6.0.1 release, now that we are in Blocker or Critical issue only mode. Should this be reconsidered, please contact the EAP PM team.

Docs QE Status: Removed: NEW
Link: Added: This issue Cloned to JBPAPP6-1654

This issue is still valid with a slightly different scenario (step 5 is new).

Steps to reproduce:
1. Download and unzip reproducer.zip from the attachment. Execute the next steps in the unzipped "reproducer" directory.
2. Run "sh prepare.sh", which:
   - downloads EAP 6.1.0.DR4
   - creates two directories, server1 and server2
   - copies directory jboss-eap-6.1 to server1 and server2
   - copies configuration standalone-full-ha-jms.xml to server1
   - copies configuration standalone-full-ha-mdb.xml to server2
   - copies mdb1.jar to server2's deployments directory
3. Start the first (jms) server with "sh start-server1.sh localhost".
4. Start the second (mdb) server with "sh start-server2.sh <some_other_ip>".
5. Start the jms producer with "sh start-producer.sh localhost 1000".
6. Shut down the first (jms) server with ctrl-c.
7. Try to shut down the second (mdb) server -> the server hangs (threadump.txt attached).

Created attachment 699585: reproducer.zip
Created attachment 699586: thread dump from mdb server (EAP 6.1.0.DR4)

Can you try replacing the Jars from trunk? I believe this is fixed.

Server still hangs with trunk/master. Check what I did, please:
- switched to master branch in git in HornetQ project: "git checkout master; git pull"
- built hornetq jars by: "mvn -Prelease package"
- copied built ./hornetq-ra/target/hornetq-ra-2.3.0.CR1.jar to ./server2/jboss-eap-6.1/modules/system/layers/base/org/hornetq/ra/main/hornetq-ra-2.3.0.CR1.jar
- tried last test scenario (from comment 2013-02-19 13:34:49 EST)

Created attachment 699886: threaddump-master.txt

Can you try with the latest CR2?

PR for the hornetq CR2 upgrade: https://github.com/jbossas/jboss-eap/pull/79

I can still hit this problem with HornetQ 2.3.0.CR2. Thread dump from mdb server attached (threaddump_hq230cr2.txt).

Created attachment 731118: threaddump_hq230cr2.txt

@Miroslav: I'm not sure we should fix this... First, the use case is really something of an edge case: you first shut down one server, then the remote server. It's not even a developer's case. Second, fixing it would break other cases that are more important than this edge case. So I would say this is a won't fix. You could even document the case if you wanted, but this is also somewhat obvious. I think we should just close this as won't fix. The risk of breaking other cases is too great. The proper fix here would be to change session.close() to be ignored while a failover is in progress, and this could break other scenarios that are not considered as edgy as this one. I'm also afraid of regressions.

The problem is that this is not such an edge case as it appears to be. We're testing this scenario because there were support tickets for it from our customers. Check the comments in the related jira from Jimmy Wilson and Shaun Appleton [1]. There is a high probability that we'll have to fix it anyway.
[1] https://issues.jboss.org/browse/JBPAPP-10450?focusedCommentId=12737770&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12737770

This issue is related to: https://bugzilla.redhat.com/show_bug.cgi?id=877277

Fixed on https://github.com/FranciscoBorges/hornetq/tree/shutdownOnReconnect but I still need to verify it (although the fix is so simple that I am calling it "fixed").

@Francisco: I looked at your fix... Do we really need to still close those sessions? AFAIK a connection.close() will close any session. (Maybe I am missing something on the Resource Adapter?) The fix looks good BTW: simple change, which is great! Thanks man!

No, we do not need those session closes. I left them there for safety's sake until I figured out how to reproduce and verify this case.

Fwiw, I just tried to verify and we are now hanging somewhere else, assuming I did everything correctly.

Ok, Miroslav Novak confirmed that the change got us further, but the server still did not exit. I made a second change and pushed; with it the server exits after a while. At least it did here for me. On Monday we will try to do some more thorough verification.

A fix was merged. The commit is this one: https://github.com/hornetq/hornetq/commit/6eb89a7288fc1f9a569641ee9058df004db24257

Cannot hit the problem with EAP 6.1.0.ER6 (HQ 2.3.0.Final). Great work, Francisco!
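
For readers skimming this thread, the general shape of the problem (a retry loop with unlimited attempts that a close request must be able to break out of) can be pictured with a toy model. The sketch below is an illustration only, not HornetQ code and not the merged fix: the class `ReconnectingClient`, its methods, and the latch-based signalling are all hypothetical names chosen for this example. It assumes only that -1 means "retry forever", as in the bug summary.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy model only: this is NOT HornetQ code and none of these names come from it.
class ReconnectingClient {
    private final int reconnectAttempts;      // -1 means "retry forever"
    private final long retryIntervalMillis;
    private final AtomicInteger attempts = new AtomicInteger();
    private final CountDownLatch closeRequested = new CountDownLatch(1);
    private final CountDownLatch loopExited = new CountDownLatch(1);

    ReconnectingClient(int reconnectAttempts, long retryIntervalMillis) {
        this.reconnectAttempts = reconnectAttempts;
        this.retryIntervalMillis = retryIntervalMillis;
    }

    // Starts retrying on a background thread; the simulated server never comes back.
    void startReconnecting() {
        Thread t = new Thread(this::reconnectLoop, "reconnect-loop");
        t.setDaemon(true);
        t.start();
    }

    private void reconnectLoop() {
        try {
            while (reconnectAttempts == -1 || attempts.get() < reconnectAttempts) {
                attempts.incrementAndGet(); // simulated failed connection attempt
                // Wait for the retry interval, but wake early if close() was called.
                // Without this check, a loop with -1 attempts would never end.
                if (closeRequested.await(retryIntervalMillis, TimeUnit.MILLISECONDS)) {
                    return;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            loopExited.countDown();
        }
    }

    // Requests shutdown and waits up to timeoutMillis for the retry loop to stop.
    // Returns true if the loop exited in time.
    boolean close(long timeoutMillis) {
        closeRequested.countDown();
        try {
            return loopExited.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    int getAttempts() {
        return attempts.get();
    }

    public static void main(String[] args) {
        ReconnectingClient client = new ReconnectingClient(-1, 100);
        client.startReconnecting();
        System.out.println("exited after close: " + client.close(2000));
    }
}
```

In this model the retry loop checks the close signal on every iteration, so shutdown completes even when the attempt count is unlimited. Whether the actual commit works this way is not stated in this thread; see the linked commit for the real change.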