Affects: Release Notes project_key: JBPAPP6

This JIRA captures the fact that failover, even with a shutdown (not a kill), is quite slow. What do you think about this:

{noformat}
10.16.89.39 - - [14/Mar/2012:16:12:24 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 200 2
10.16.89.39 - - [14/Mar/2012:16:12:24 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 200 2
10.16.88.188 - - [14/Mar/2012:16:12:27 -0400] "DISABLE-APP / HTTP/1.1" 200 -
10.16.88.188 - - [14/Mar/2012:16:12:27 -0400] "DISABLE-APP / HTTP/1.1" 200 -
10.16.88.188 - - [14/Mar/2012:16:12:27 -0400] "STOP-APP / HTTP/1.1" 200 74
10.16.88.188 - - [14/Mar/2012:16:12:27 -0400] "STOP-APP / HTTP/1.1" 200 81
10.16.88.188 - - [14/Mar/2012:16:12:27 -0400] "REMOVE-APP / HTTP/1.1" 200 -
10.16.88.188 - - [14/Mar/2012:16:12:27 -0400] "REMOVE-APP /* HTTP/1.1" 200 -
10.16.89.39 - - [14/Mar/2012:16:12:28 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 503 323
10.16.89.39 - - [14/Mar/2012:16:12:29 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 503 323
10.16.89.39 - - [14/Mar/2012:16:12:30 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 503 323
10.16.89.39 - - [14/Mar/2012:16:12:31 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 503 323
10.16.89.39 - - [14/Mar/2012:16:12:33 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 503 323
10.16.89.39 - - [14/Mar/2012:16:12:35 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 503 323
10.16.89.39 - - [14/Mar/2012:16:12:36 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 503 323
10.16.89.39 - - [14/Mar/2012:16:12:39 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 200 2
10.16.89.39 - - [14/Mar/2012:16:12:40 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 200 2
10.16.89.39 - - [14/Mar/2012:16:12:41 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 200 2
10.16.89.39 - - [14/Mar/2012:16:12:41 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 200 2
{noformat}

There were 7 "503" HTTP errors within a 15-second time span, despite the fact that the balancer had received the *REMOVE-APP /\** message... [Error_log on pastebin|http://pastebin.com/aF7P2iSn].

Is it OK that there was no DISABLE-APP and STOP-APP for context */\**?

Mod_cluster 1.1.3 with EAP5 did not exhibit this behaviour :-(

(i) Note: This is just manual testing with a Windows balancer and 2 RHEL workers: Ctrl+F5 in Firefox and Ctrl+C in the terminal. No hundreds of thousands of requests, and no killing the JVM with -9.
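For anyone re-running this manual test, the error count can be read off the balancer access log with a small script instead of by eye. The following is only a sketch: the names `ACCESS_LOG`, `LINE_RE` and `summarize_outage` are made up for illustration, and the regular expression assumes the access log format shown in the excerpt above.

```python
import re
from datetime import datetime

# Copy of the balancer access log excerpt from the description.
ACCESS_LOG = """\
10.16.89.39 - - [14/Mar/2012:16:12:24 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 200 2
10.16.89.39 - - [14/Mar/2012:16:12:24 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 200 2
10.16.88.188 - - [14/Mar/2012:16:12:27 -0400] "DISABLE-APP / HTTP/1.1" 200 -
10.16.88.188 - - [14/Mar/2012:16:12:27 -0400] "DISABLE-APP / HTTP/1.1" 200 -
10.16.88.188 - - [14/Mar/2012:16:12:27 -0400] "STOP-APP / HTTP/1.1" 200 74
10.16.88.188 - - [14/Mar/2012:16:12:27 -0400] "STOP-APP / HTTP/1.1" 200 81
10.16.88.188 - - [14/Mar/2012:16:12:27 -0400] "REMOVE-APP / HTTP/1.1" 200 -
10.16.88.188 - - [14/Mar/2012:16:12:27 -0400] "REMOVE-APP /* HTTP/1.1" 200 -
10.16.89.39 - - [14/Mar/2012:16:12:28 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 503 323
10.16.89.39 - - [14/Mar/2012:16:12:29 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 503 323
10.16.89.39 - - [14/Mar/2012:16:12:30 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 503 323
10.16.89.39 - - [14/Mar/2012:16:12:31 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 503 323
10.16.89.39 - - [14/Mar/2012:16:12:33 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 503 323
10.16.89.39 - - [14/Mar/2012:16:12:35 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 503 323
10.16.89.39 - - [14/Mar/2012:16:12:36 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 503 323
10.16.89.39 - - [14/Mar/2012:16:12:39 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 200 2
10.16.89.39 - - [14/Mar/2012:16:12:40 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 200 2
10.16.89.39 - - [14/Mar/2012:16:12:41 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 200 2
10.16.89.39 - - [14/Mar/2012:16:12:41 -0400] "GET /SessionTest/SessionTestServlet HTTP/1.1" 200 2
"""

# Assumed line shape: ... [timestamp] "METHOD path protocol" status size
LINE_RE = re.compile(
    r'\[(?P<ts>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def summarize_outage(log_text):
    """Return (number of 503s, time of first 503, time of first 200 after the last 503).

    Only client GET requests are considered; MCMP messages such as
    DISABLE-APP, STOP-APP and REMOVE-APP are skipped.
    """
    entries = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match and match.group("method") == "GET":
            ts = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
            entries.append((ts, int(match.group("status"))))

    errors = [ts for ts, status in entries if status == 503]
    if not errors:
        return 0, None, None

    recovered = next(
        (ts for ts, status in entries if status == 200 and ts > errors[-1]), None
    )
    return len(errors), errors[0], recovered


if __name__ == "__main__":
    count, first_error, recovered = summarize_outage(ACCESS_LOG)
    print(count, first_error, recovered)
```

On the excerpt above this reports 7 errors, with the first 503 at 16:12:28 and the first successful GET after the errors at 16:12:39.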
Labels: Added: eap6_need_triage
Link: Added: This issue relates to JBPAPP-8502
This sounds like a race condition between the Connector stopping and the deployment stopping. Connectors don't actually have any dependent services, so nothing prevents a Service<Connector> from stopping before an application undeploys during server shutdown. A simple fix would be to add a dependency from Service<ModCluster> on the requisite Service<Connector>. This would trigger mod_cluster's shutdown hook before the web connectors shut down. It would require a change to the mod_cluster subsystem schema, to identify the dependent connector, and ideally a change to mod_cluster upstream, to allow AS7 to indicate which connector mod_cluster should use, in lieu of the current logic, which tries to work out which connector is the most suitable.
The reason this isn't an issue in EAP5/AS5/AS6 is that mod_cluster listens for JMX notifications emitted by the server before it shuts down.
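To make the ordering argument concrete, here is a toy model of dependency-ordered shutdown. It is a sketch only and is not the JBoss MSC API: the `Service` class and `stop_order` function are hypothetical, and the model simply assumes that services stop in the reverse of a dependency-respecting start order.

```python
class Service:
    """Toy service with a name and a list of services it depends on."""

    def __init__(self, name, depends_on=()):
        self.name = name
        self.depends_on = list(depends_on)


def stop_order(services):
    """Return service names in an order where dependents stop before their dependencies.

    A start order is built with a depth-first walk (dependencies first),
    and the stop order is that start order reversed.
    """
    started = []
    done = set()
    visiting = set()

    def visit(service):
        if service.name in done:
            return
        if service.name in visiting:
            raise ValueError("dependency cycle at " + service.name)
        visiting.add(service.name)
        for dependency in service.depends_on:
            visit(dependency)
        visiting.discard(service.name)
        done.add(service.name)
        started.append(service.name)

    for service in services:
        visit(service)
    return list(reversed(started))


if __name__ == "__main__":
    # No declared dependency: the order depends only on registration order,
    # so the connector may stop while mod_cluster is still running.
    connector = Service("connector")
    mod_cluster = Service("mod_cluster")
    print(stop_order([mod_cluster, connector]))

    # Declared dependency: mod_cluster always stops before the connector.
    mod_cluster = Service("mod_cluster", depends_on=[connector])
    print(stop_order([mod_cluster, connector]))
```

Without the dependency, the stop order is an accident of registration order. With it, mod_cluster stops first however the services are registered, which is the property the proposed fix relies on.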
Link: Added: This issue depends AS7-4448
Michal, looks like the upstream issue has been fixed (ER6 build). Do you still see this issue?
Link: Added: This issue relates to JBPAPP-9195
Link: Added: This issue relates to MODCLUSTER-314
@[~rrajesh] -> [JBPAPP-9195 comments|https://issues.jboss.org/browse/JBPAPP-9195?focusedCommentId=12697377&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12697377]
Link: Added: This issue relates to MODCLUSTER-316
Fixed by MODCLUSTER-302 and AS7-4448.
Setting the fix version to CR1 even though this issue was resolved today, as the linked fixes might have been available a few builds back. Michal, you also indicated an affects version of CR1 on JBPAPP-9195, which is basically the same issue. Can you check with dev on how this issue was resolved and cross-check your test setup?
Release Notes Docs Status: Added: Documented as Resolved Issue Release Notes Text: Added: During server shutdown, the connector used for mod_cluster communication was shutdown before the mod_cluster service itself was stopped. This resulted in many failed mod_cluster requests in the intervening time period. A service dependency has now been added on the web connector service and the mod_cluster subsystem must declare which connector it is using. This means that the mod_cluster service will be automatically be stopped when the connector is shutdown.
Docs QE Status: Removed: NEW Added: ASSIGNED
Writer: Added: Darrin Release Notes Text: Removed: During server shutdown, the connector used for mod_cluster communication was shutdown before the mod_cluster service itself was stopped. This resulted in many failed mod_cluster requests in the intervening time period. A service dependency has now been added on the web connector service and the mod_cluster subsystem must declare which connector it is using. This means that the mod_cluster service will be automatically be stopped when the connector is shutdown. Added: During server shutdown, the connector used for mod_cluster communication was shutdown before the mod_cluster service was stopped. This resulted in many failed mod_cluster requests in the intervening time period. A service dependency has now been added on the web connector service and the mod_cluster subsystem must declare which connector it is using. This means that the mod_cluster service will be automatically be stopped when the connector is shutdown.
Release Notes Text: Removed: During server shutdown, the connector used for mod_cluster communication was shutdown before the mod_cluster service was stopped. This resulted in many failed mod_cluster requests in the intervening time period. A service dependency has now been added on the web connector service and the mod_cluster subsystem must declare which connector it is using. This means that the mod_cluster service will be automatically be stopped when the connector is shutdown. Added: During server shutdown, the connector used for mod_cluster communication was shutdown before the mod_cluster service was stopped. This resulted in many failed mod_cluster requests in the intervening time period. A service dependency has now been added on the web connector service and the mod_cluster subsystem must now declare which connector it is using. This means that the mod_cluster service will be automatically be stopped when the connector is shutdown.
@Rajesh: I am keeping an eye on this issue and am going to verify it as soon as possible (in the scope of the related JIRAs as well).
RemoteIssueLink: Added: This issue links to "Failover on worker (tomcat) causes non 200 HTTP codes for few seconds (Web Link)"
RemoteIssueLink: Added: This issue links to "Bug 850769 - Failover on worker (tomcat) causes non 200 HTTP codes for few seconds (Web Link)"
RemoteIssueLink: Removed: This issue links to "Failover on worker (tomcat) causes non 200 HTTP codes for few seconds (Web Link)"
Affects: Added: Release Notes
Release Notes Docs Status: Removed: Documented as Resolved Issue Writer: Removed: Darrin Release Notes Text: Removed: During server shutdown, the connector used for mod_cluster communication was shutdown before the mod_cluster service was stopped. This resulted in many failed mod_cluster requests in the intervening time period. A service dependency has now been added on the web connector service and the mod_cluster subsystem must now declare which connector it is using. This means that the mod_cluster service will be automatically be stopped when the connector is shutdown. Docs QE Status: Removed: ASSIGNED
Closing. Can't reproduce with the current code base.