Bug 899385 (JBEWS-256) - mod_cluster doesn't failover with Tomcat backend
Summary: mod_cluster doesn't failover with Tomcat backend
Keywords:
Status: CLOSED NEXTRELEASE
Alias: JBEWS-256
Product: JBoss Enterprise Web Server 1
Classification: JBoss
Component: unspecified
Version: EWS 1.0.2
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
: EWS 1.0.2
Assignee: Permaine Cheung
QA Contact:
URL: http://jira.jboss.org/jira/browse/JBE...
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-04-06 11:47 UTC by Radoslav Husar
Modified: 2012-11-13 16:26 UTC (History)
4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Tomcat6, mod_cluster 1.0.8, Linux 2.6.18-194.3.1.el5 #1 SMP x86_64 x86_64 x86_64 GNU/Linux
Last Closed: 2011-06-20 09:13:57 UTC
Type: Bug
Embargoed:


Attachments
ews-10-mod_cluster-rhel5-x86_64-failover-9.zip (836.65 KB, application/x-download)
2011-04-06 11:48 UTC, Radoslav Husar
no flags Details
mc.zip (292.07 KB, application/x-download)
2011-04-06 11:48 UTC, Radoslav Husar
no flags Details
RHEL6-x86_64_baseline_setup.tar.gz (100.78 KB, application/x-gzip)
2011-04-06 13:48 UTC, Michal Karm Babacek
no flags Details
RHEL6-x86_64_shutdown-tomcat2.tar.gz (104.37 KB, application/x-gzip)
2011-04-06 13:50 UTC, Michal Karm Babacek
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 899382 0 high CLOSED Frequent 503 when using mod_cluster with Tomcat6 2021-02-22 00:41:40 UTC
Red Hat Issue Tracker JBEWS-256 0 Critical Closed mod_cluster doesn't failover with Tomcat backend 2020-07-17 18:30:12 UTC

Internal Links: 899382

Description Radoslav Husar 2011-04-06 11:47:16 UTC
Affects: Documentation (Ref Guide, User Guide, etc.)
project_key: JBEWS

Four Tomcat nodes are registered with the mod_cluster frontend; when a node fails, no retry happens. It looks like the 'All workers are in error state for route (perf01)' error occurs because mod_cluster does not want to fail over to other routes, probably thinking they are different deployments.

{code}[Wed Apr 06 07:10:07 2011] [notice] SELinux policy enabled; httpd running as context user_u:system_r:unconfined_t
[Wed Apr 06 07:10:07 2011] [notice] suEXEC mechanism enabled (wrapper: /usr/sbin/suexec)
[Wed Apr 06 07:10:07 2011] [warn] module proxy_ajp_module is already loaded, skipping
[Wed Apr 06 07:10:07 2011] [notice] Digest: generating secret for digest authentication ...
[Wed Apr 06 07:10:07 2011] [notice] Digest: done
[Wed Apr 06 07:10:07 2011] [notice] Advertise initialized for process 7907
[Wed Apr 06 07:10:07 2011] [notice] Apache/2.2.17 (Unix) DAV/2 configured -- resuming normal operations
[Wed Apr 06 07:12:26 2011] [error] (104)Connection reset by peer: ajp_ilink_receive() can't receive header
[Wed Apr 06 07:12:26 2011] [error] ajp_handle_cping_cpong: ajp_ilink_receive failed
[Wed Apr 06 07:12:26 2011] [error] (120006)APR does not understand this error code: proxy: AJP: cping/cpong failed to 10.16.88.179:8009 (perf01)
[Wed Apr 06 07:12:26 2011] [error] (104)Connection reset by peer: ajp_ilink_receive() can't receive header
[Wed Apr 06 07:12:26 2011] [error] ajp_handle_cping_cpong: ajp_ilink_receive failed
[Wed Apr 06 07:12:26 2011] [error] (120006)APR does not understand this error code: proxy: AJP: cping/cpong failed to 10.16.88.179:8009 (perf01)
[Wed Apr 06 07:12:26 2011] [error] (111)Connection refused: proxy: AJP: attempt to connect to 10.16.88.179:8009 (perf01) failed
[Wed Apr 06 07:12:26 2011] [error] ap_proxy_connect_backend disabling worker for (perf01)
[Wed Apr 06 07:12:26 2011] [error] proxy: AJP: failed to make connection to backend: perf01
[Wed Apr 06 07:12:26 2011] [error] proxy: CLUSTER: (balancer://qacluster). All workers are in error state for route (perf01)
[Wed Apr 06 07:12:26 2011] [error] (111)Connection refused: proxy: AJP: attempt to connect to 10.16.88.179:8009 (perf01) failed
[Wed Apr 06 07:12:26 2011] [error] proxy: AJP: failed to make connection to backend: perf01
[Wed Apr 06 07:12:26 2011] [error] proxy: CLUSTER: (balancer://qacluster). All workers are in error state for route (perf01)
[Wed Apr 06 07:12:26 2011] [error] proxy: CLUSTER: (balancer://qacluster). All workers are in error state for route (perf01)
...repeats for every request
{code}

Comment 1 Radoslav Husar 2011-04-06 11:48:24 UTC
Attaching logs, but they don't say much; the issue is quite clear from the Jira description itself.

Comment 2 Radoslav Husar 2011-04-06 11:48:24 UTC
Attachment: Added: ews-10-mod_cluster-rhel5-x86_64-failover-9.zip
Attachment: Added: mc.zip


Comment 3 Radoslav Husar 2011-04-06 11:53:41 UTC
Link: Added: This issue relates to JBPAPP-6221


Comment 5 Michal Karm Babacek 2011-04-06 13:48:51 UTC
Attachment: Added: RHEL6-x86_64_baseline_setup.tar.gz


Comment 7 Michal Karm Babacek 2011-04-06 13:50:35 UTC
Attachment: Added: RHEL6-x86_64_shutdown-tomcat2.tar.gz


Comment 10 Michal Karm Babacek 2011-04-06 21:13:14 UTC
mod_cluster does fail over correctly with a Tomcat backend on Solaris 10 SPARC_64, except for this issue:
https://issues.jboss.org/browse/JBPAPP-6262

Note: The jars jboss-logging-jdk.jar, jboss-logging-spi.jar and mod-cluster.jar were taken from jboss-ews-1.0.2-RHEL6-x86_64.zip, as there were none in the Solaris distribution: https://issues.jboss.org/browse/JBPAPP-6224

Comment 11 Jean-Frederic Clere 2011-04-07 09:02:01 UTC
"Mod_cluster does failover with Tomcat backend on Solaris 10 SPARC_64 correctly"
and it doesn't work on RHEL6-x86_64? ... Well, redo the test; you must be doing something wrong.

Comment 12 Michal Karm Babacek 2011-04-07 12:05:52 UTC
@Jean: Glad to hear that... can you point out what might be wrong? I've done the testing on RHEL several times, and a few times on Solaris yesterday personally, and the setup (httpd & Tomcat) is definitely the same, except that on RHEL I've been using two httpd instances (two balancers). Even the web app is the same; I actually just replaced the /conf and /conf.d folders with those from the RHEL testing...

Comment 13 Jean-Frederic Clere 2011-04-07 12:47:54 UTC
Weird... stickySession="false" on both Tomcats?

Comment 14 Jean-Frederic Clere 2011-04-07 12:59:36 UTC
manager_handler CONFIG (/) processing: "JVMRoute=perf04&Port=8009&Host=perf04&Type=ajp"

Well, you asked for sticky sessions, so you get a 503; that is normal.

Comment 15 Michal Karm Babacek 2011-04-07 13:00:18 UTC
Don't know how it is in Rado's files; however, as you can see in my RHEL6-x86_64_baseline_setup.tar.gz and RHEL6-x86_64_shutdown-tomcat2.tar.gz, both Tomcats have the following setting in server.xml:

<Listener className="org.jboss.modcluster.ModClusterListener" advertise="true" advertiseGroupAddress="224.0.1.105" advertisePort="23364" stickySession="false" />




Comment 16 Michal Karm Babacek 2011-04-07 17:27:48 UTC
I've re-run the test suite against the undeploy app/deploy app/stop app/start app/shutdown Tomcat/startup Tomcat scenarios, with the result that mod_cluster DOES fail over on RHEL6 x86_64 as long as you have stickySession="false" (stickySession="true" gives you a 503).

The necessary setting to make it work was:
Comment out all the lines in httpd/conf.d/proxy_ajp.conf, but leave the module itself loaded, as: LoadModule proxy_ajp_module modules/mod_proxy_ajp.so

As far as I'm aware, this step is not mentioned anywhere in the documentation. Furthermore, it is noteworthy that it is not necessary to perform this step on Solaris.
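For illustration, the workaround described in the previous comment might look roughly like this in httpd/conf.d/proxy_ajp.conf (a sketch only; the exact ProxyPass lines shipped in EWS may differ and are shown here as hypothetical examples):

```apache
# Keep the module loaded so mod_cluster can still use the AJP scheme
LoadModule proxy_ajp_module modules/mod_proxy_ajp.so

# Comment out any static AJP proxying so it does not conflict with the
# dynamically managed mod_cluster balancer (example mappings, commented out):
# ProxyPass /app ajp://localhost:8009/app
# ProxyPassReverse /app ajp://localhost:8009/app
```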

Comment 17 Radoslav Husar 2011-04-11 14:11:32 UTC
This is a documentation issue.

The problem is that, with the same defaults, failover does not happen with Tomcat but works just fine on EAP. Is it because mod_cluster talks to the underlying Tomcat clustering layer, understands that clustering is disabled in EWS, and for that reason does not fail over by default?

Here is my perception of how it should be set up in each scenario with the current implementation:

{code}stickySession="false"; stickySessionRemove/stickySessionForce are ignored{code}
Use this if you have a stateless webapp, i.e. one not using sessions.

{code}stickySession="true" stickySessionRemove="false" (default) stickySessionForce="true" (default){code}
Use this if you have a stateful webapp using sessions; if a server crashes or fails to reply, the client will get a 503 error until that exact server node is started again with the same jvmRoute, or until the client's session cookie expires.

{code}stickySession="true" stickySessionRemove="true" stickySessionForce="false"{code}
or
{code}domain="some_string" (same on all nodes){code} 
Use this if you have a stateful webapp using sessions and, since after a server crash the session is very likely lost anyway, you want to fail over to another instance, losing the current session and starting a new one.
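As a sketch, the failover-friendly scenario above would be configured on each Tomcat node in server.xml roughly as follows (the advertise attributes are taken from the listener shown in comment 15; treat the exact values as illustrative):

```xml
<!-- mod_cluster listener that allows failover: sticky sessions are
     preferred but not forced, and the session id's route information
     is removed on failover so another node can take over -->
<Listener className="org.jboss.modcluster.ModClusterListener"
          advertise="true"
          advertiseGroupAddress="224.0.1.105"
          advertisePort="23364"
          stickySession="true"
          stickySessionRemove="true"
          stickySessionForce="false"/>
```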

@Jean, does that sound correct?

Comment 18 Jean-Frederic Clere 2011-04-11 14:54:56 UTC
Yep. I am not sure we want to support TC6 clusters, in fact :D

Note that domain="some_string" won't remove the session id, but will fail over without it.


Comment 19 Rajesh Rajasekaran 2011-04-11 19:32:44 UTC
Rebecca, can you document the use of sticky sessions and how to achieve failover when using mod_cluster with Tomcat?

Comment 20 Rajesh Rajasekaran 2011-04-11 19:32:44 UTC
Affects: Added: [Documentation (Ref Guide, User Guide, etc.)]


Comment 21 Rebecca Newton 2011-04-12 06:58:52 UTC
Is a release note enough to cover this issue?

If not, there is information on sticky sessions with mod_jk in the HTTP Connectors Guide sections 3.2 and 4.2: http://documentation-stage.bne.redhat.com/docs/en-US/JBoss_Enterprise_Application_Platform/5/html-single/HTTP_Connectors_Load_Balancing_Guide/index.html . Is it possible to fit this information in there?



Comment 22 Jean-Frederic Clere 2011-04-12 07:08:21 UTC
A release notice should be enough.

Comment 23 Rebecca Newton 2011-04-13 04:31:07 UTC
I've created a draft of a release note in the Release Notes Text box, could someone please look at it and let me know how to make it accurate? Thanks.

Comment 24 Rebecca Newton 2011-04-13 04:31:07 UTC
Release Notes Docs Status: Added: Not Yet Documented
Release Notes Text: Added: Nodes do not failover when four or more are registered with the mod_cluster front end. This only occurs on Red Hat Enterprise Linux6 x86_64, and the workaround is to disable stickySessions. You can do this by enabling all the lines in the httpd/conf.d/proxy_ajp.conf.

This is the configuration for using a stateless webapp:

stickySession="true" stickySessionRemove="false" (default) stickySessionForce="true" (default)

This is the configuration for using a stateful webapp using sessions and when a server crashes or fails to reply. The client will get a 503 error until a server with the same jvmRoute is started, or the client's cookie expires.

stickySession="true" stickySessionRemove="true" stickySessionForce="false"



Comment 25 Jean-Frederic Clere 2011-04-13 07:31:18 UTC
"You can do this by enabling all the lines in the httpd/conf.d/proxy_ajp.conf."
NO!!! :D

It has to be configured in the server.xml on the Tomcat side, in the listener something like:
<Listener className="org.jboss.modcluster.ModClusterListener" stickySession="true" stickySessionRemove="true" stickySessionForce="false" />

Comment 26 Rebecca Newton 2011-04-15 03:14:35 UTC
Release Notes Text: Removed: Nodes do not failover when four or more are registered with the mod_cluster front end. This only occurs on Red Hat Enterprise Linux6 x86_64, and the workaround is to disable stickySessions. You can do this by enabling all the lines in the httpd/conf.d/proxy_ajp.conf.

This is the configuration for using a stateless webapp:

stickySession="true" stickySessionRemove="false" (default) stickySessionForce="true" (default)

This is the configuration for using a stateful webapp using sessions and when a server crashes or fails to reply. The client will get a 503 error until a server with the same jvmRoute is started, or the client's cookie expires.

stickySession="true" stickySessionRemove="true" stickySessionForce="false"
 Added: Nodes do not failover when four or more are registered with the mod_cluster front end. This only occurs on Red Hat Enterprise Linux6 x86_64, and the workaround is to disable stickySessions. You can do this by configuring the listener in the server.xml on the Tomcat side like this:

<Listener className="org.jboss.modcluster.ModClusterListener" stickySession="true" stickySessionRemove="true" stickySessionForce="false" />

This is the configuration for using a stateless webapp:

stickySession="true" stickySessionRemove="false" (default) stickySessionForce="true" (default)

This is the configuration for using a stateful webapp using sessions and when a server crashes or fails to reply. The client will get a 503 error until a server with the same jvmRoute is started, or the client's cookie expires.

stickySession="true" stickySessionRemove="true" stickySessionForce="false"



Comment 27 Jean-Frederic Clere 2011-04-15 06:44:13 UTC
The wording is wrong... For me this is not a bug but a test error (or maybe a documentation error):

"Nodes do not failover when four or more are registered with the mod_cluster front end. This only occurs on Red Hat Enterprise Linux6 x86_64, and the workaround is to disable stickySessions."

replace it by something like:
"By default with Tomcat the sessions are sticky, so the nodes do not fail over. To activate failover, disable stickySessions."

The background is that we don't support Tomcat clustering (someone should correct me if I am wrong).

Comment 29 Rebecca Newton 2011-04-21 01:18:53 UTC
is this release notes text any better, Jean-Frederic? Does the additional configuration information need to be included? (using a stateless webapp, etc)

Comment 30 Rebecca Newton 2011-04-21 01:18:53 UTC
Release Notes Text: Removed: Nodes do not failover when four or more are registered with the mod_cluster front end. This only occurs on Red Hat Enterprise Linux6 x86_64, and the workaround is to disable stickySessions. You can do this by configuring the listener in the server.xml on the Tomcat side like this:

<Listener className="org.jboss.modcluster.ModClusterListener" stickySession="true" stickySessionRemove="true" stickySessionForce="false" />

This is the configuration for using a stateless webapp:

stickySession="true" stickySessionRemove="false" (default) stickySessionForce="true" (default)

This is the configuration for using a stateful webapp using sessions and when a server crashes or fails to reply. The client will get a 503 error until a server with the same jvmRoute is started, or the client's cookie expires.

stickySession="true" stickySessionRemove="true" stickySessionForce="false"
 Added: Sessions are sticky by default with Tomcat, so nodes do not failover. Disable stickySessions to activate failover. 

This is the configuration for using a stateless webapp:

stickySession="true" stickySessionRemove="false" (default) stickySessionForce="true" (default)

This is the configuration for using a stateful webapp using sessions and when a server crashes or fails to reply. The client will get a 503 error until a server with the same jvmRoute is started, or the client's cookie expires.

stickySession="true" stickySessionRemove="true" stickySessionForce="false"



Comment 31 Jean-Frederic Clere 2011-04-21 06:36:04 UTC
The:
stickySession="true" stickySessionRemove="false" (default) stickySessionForce="true" (default) 
and
stickySession="true" stickySessionRemove="true" stickySessionForce="false"
are inverted.

That is why the JIRA was created.... The defaults were causing 503 :D

Comment 32 Rebecca Newton 2011-04-21 06:52:42 UTC
... How about I just say: "Sessions are sticky by default with Tomcat, so nodes do not failover. Disable stickySessions to activate failover." And give no examples? Work for you?

Comment 33 Jean-Frederic Clere 2011-04-21 07:18:15 UTC
sure ;-)

Comment 34 Rebecca Newton 2011-05-03 03:40:27 UTC
Release Notes Docs Status: Removed: Not Yet Documented Added: Documented as Known Issue
Release Notes Text: Removed: Sessions are sticky by default with Tomcat, so nodes do not failover. Disable stickySessions to activate failover. 

This is the configuration for using a stateless webapp:

stickySession="true" stickySessionRemove="false" (default) stickySessionForce="true" (default)

This is the configuration for using a stateful webapp using sessions and when a server crashes or fails to reply. The client will get a 503 error until a server with the same jvmRoute is started, or the client's cookie expires.

stickySession="true" stickySessionRemove="true" stickySessionForce="false"
 Added: Sessions are sticky by default with Tomcat, so nodes do not failover. Disable stickySessions to activate failover. 




Comment 35 Rebecca Newton 2011-05-19 06:23:38 UTC
Release Notes Text: Removed: Sessions are sticky by default with Tomcat, so nodes do not failover. Disable stickySessions to activate failover. 

 Added: Sessions are sticky by default with Tomcat, so nodes do not failover. Disable stickySessions to activate failover. You can do this by changing the default:

stickySession="true" stickySessionForce="true"

to:

stickySession="true" stickySessionForce="false"





Comment 36 Michal Karm Babacek 2011-05-19 08:40:03 UTC
@Rebecca: Just a note: The expected default stickySession configuration is finally in the [docs|http://documentation-stage.bne.redhat.com/docs/en-US/JBoss_Enterprise_Application_Platform/5/html/HTTP_Connectors_Load_Balancing_Guide/worker_install-eap.html], by [~jaredmorgs]...

Comment 37 Rebecca Newton 2011-05-19 23:10:09 UTC
Oh hey, look at that! Might just say, '... See the HTTP Connector Guide for information on disabling stickysessions' then. Thanks!

Comment 38 Rebecca Newton 2011-05-19 23:11:07 UTC
Release Notes Text: Removed: Sessions are sticky by default with Tomcat, so nodes do not failover. Disable stickySessions to activate failover. You can do this by changing the default:

stickySession="true" stickySessionForce="true"

to:

stickySession="true" stickySessionForce="false"


 Added: Sessions are sticky by default with Tomcat, so nodes do not failover. See the HTTP Connectors Load Balancing Guide for information on disabling StickySessions.




Comment 39 Rajesh Rajasekaran 2011-05-24 19:30:38 UTC
Michal, I noticed this issue has been documented. Can this be closed now, or is there some item to work on for the next release?

Comment 40 Michal Karm Babacek 2011-06-20 09:08:43 UTC
@[~rrajesh]: As I [commented|https://issues.jboss.org/browse/JBPAPP-6257?focusedCommentId=12603146&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-12603146] earlier, this issue was fixed by documenting it. The point is that the default setting does not fail-over, so it is necessary to [tweak it a bit|http://documentation-stage.bne.redhat.com/docs/en-US/JBoss_Enterprise_Application_Platform/5/html/HTTP_Connectors_Load_Balancing_Guide/worker_install-eap.html].

{code:title=Former wrong default|borderStyle=solid}
<Listener className="org.jboss.modcluster.ModClusterListener" advertise="true"/>
{code}
{code:title=Current correct default|borderStyle=solid}
<Listener className="org.jboss.modcluster.ModClusterListener" advertise="true" stickySession="true" stickySessionForce="false" stickySessionRemove="true"/>
{code}

Comment 41 Jiri Skrabal 2012-11-13 16:26:58 UTC
Release Notes Docs Status: Removed: Documented as Known Issue 
Release Notes Text: Removed: Sessions are sticky by default with Tomcat, so nodes do not failover. See the HTTP Connectors Load Balancing Guide for information on disabling StickySessions.

 


