project_key: JBPAPP6 [~jkudrnac] reported she had experienced HTTP 503 errors with mod_cluster & pure IPv6 setup on node shutdown. I have done some investigation and I can reproduce the problem. h3. Http client I have a curl client issuing requests to [2620:52:0:105f::ffff:c]:8888/SessionTest/hostname periodically, delay being 1 s. Note that there is always a new session for each request (no JSESSIONID stuff anywhere). There are two nodes I switch off and on randomly, always giving enough time so as the starting one may take off safely. After several dozens of this start-ups and shut-downs a 503 error occurs. {noformat} Wed May 30 14:07:15 EDT 2012 [Balancer]:8888 0 Wed May 30 14:07:16 EDT 2012 [Balancer]:8888 0 Wed May 30 14:07:17 EDT 2012 HTTP Status 503 - This application is not currently available Wed May 30 14:07:18 EDT 2012 [Balancer]:8888 0 Wed May 30 14:07:19 EDT 2012 [Balancer]:8888 0 {noformat} please, note the time *14:07:17* as it is very well synchronized with the other node and it will show us what happened. h3. Worker nodes h4. Node vmg36 [2620:52:0:105f::ffff:3] (/) running normally, there were requests served by it before and after the incident. {noformat} 14:07:11,920 INFO [org.jboss.test.cluster.servlet.SessionHostnameServlet] (ajp-/2620:52:0:105f:0:0:ffff:0%2:8009-3) Setting up new session - QpLxi52+8NmJNFMW2YDZm2MT.dc7bd552-a020-3d08-acee-4ae3e0f178a8 14:07:12,985 INFO [org.jboss.test.cluster.servlet.SessionHostnameServlet] (ajp-/2620:52:0:105f:0:0:ffff:0%2:8009-1) Setting up new session - 16FUOxeUglbTql5+PzEJSLno.dc7bd552-a020-3d08-acee-4ae3e0f178a8 14:07:18,187 INFO [org.jboss.as.clustering] (Incoming-20,null) JBAS010225: New cluster view for partition web (id: 8, delta: -1, merge: false) : [vmg36/web] 14:07:18,192 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-20,null) ISPN000094: Received new cluster view: [vmg36/web|8] [vmg36/web] 14:07:18,342 INFO [org.jboss.test.cluster.servlet.SessionHostnameServlet] (ajp-/2620:52:0:105f:0:0:ffff:0%2:8009-2) Setting up new session - gdiFjJDwfFUSL2vlkqweUGzi.dc7bd552-a020-3d08-acee-4ae3e0f178a8 {noformat} h4. Node vmg35 [2620:52:0:105f::ffff:c] (x) Please, note the moment when *^C* was pressed at *^C14:07:17,110*: {noformat} 14:07:15,113 INFO [org.jboss.test.cluster.servlet.SessionHostnameServlet] (ajp-/2620:52:0:105f:0:0:ffff:c%2:8009-8) Setting up new session - XGBdEy3g22SQZI0nyOQT7kRF.f49689d6-cdbb-3015-a642-f8200ea456ff 14:07:16,184 INFO [org.jboss.test.cluster.servlet.SessionHostnameServlet] (ajp-/2620:52:0:105f:0:0:ffff:c%2:8009-7) Setting up new session - 9l8RklnPOpvG5D2BKRHNjKV2.f49689d6-cdbb-3015-a642-f8200ea456ff ^C14:07:17,110 INFO [org.apache.coyote.http11.Http11Protocol] (MSC service thread 1-3) Pausing Coyote HTTP/1.1 on http-/2620:52:0:105f:0:0:ffff:c%2:8080 14:07:17,122 INFO [org.apache.coyote.http11.Http11Protocol] (MSC service thread 1-3) Stopping Coyote HTTP/1.1 on http-/2620:52:0:105f:0:0:ffff:c%2:8080 14:07:17,291 INFO [org.jboss.as.logging] JBAS011503: Restored bootstrap log handlers 14:07:17,272 INFO [org.jboss.modcluster.ModClusterService] All active sessions drained from context [/SessionTest] in 0 seconds 14:07:17,408 INFO [org.apache.coyote.ajp.AjpProtocol] Pausing Coyote AJP/1.3 on ajp-/2620:52:0:105f:0:0:ffff:c%2:8009 14:07:17,421 INFO [org.apache.coyote.ajp.AjpProtocol] Stopping Coyote AJP/1.3 on ajp-/2620:52:0:105f:0:0:ffff:c%2:8009 +++ 14:07:17,810 INFO [org.infinispan.eviction.PassivationManagerImpl] ISPN000029: Passivating all entries to disk 14:07:17,811 INFO [org.infinispan.eviction.PassivationManagerImpl] ISPN000030: Passivated 0 entries in 0 milliseconds 14:07:17,848 INFO [org.infinispan.eviction.PassivationManagerImpl] ISPN000029: Passivating all entries to disk 14:07:17,869 INFO [com.arjuna.ats.jbossatx] ARJUNA032018: Destroying TransactionManagerService 14:07:17,873 INFO [org.infinispan.eviction.PassivationManagerImpl] ISPN000030: Passivated 1 entries in 24 milliseconds 14:07:17,895 INFO [com.arjuna.ats.jbossatx] ARJUNA032014: Stopping transaction recovery manager 14:07:17,911 INFO [org.jboss.as.clustering.infinispan] JBAS010282: Stopped default-host/SessionTest cache from web container 14:07:17,925 INFO [org.jboss.as.clustering.infinispan] JBAS010282: Stopped repl cache from web container 14:07:18,005 INFO [org.jboss.as.server.deployment] JBAS015877: Stopped deployment SessionTest.war in 960ms 14:07:18,152 INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] ISPN000082: Stopping the RpcDispatcher 14:07:18,506 INFO [org.jboss.as] JBAS015950: JBoss EAP 6.0.0.GA (AS 7.1.2.Final-redhat-1) stopped in 1424ms {noformat} h3. HTTPD In *Access log*, there is the 503 (in the middle) as the aforementioned http client has shown already: {noformat} 2620:52:0:105f::ffff:3 - - [30/May/2012:14:07:13 -0400] "STATUS / HTTP/1.1" 200 85 "-" "ClusterListener/1.0" 2620:52:0:105f:5054:ff:fe52:e008 - - [30/May/2012:14:07:15 -0400] "GET /SessionTest/hostname HTTP/1.1" 200 31 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.12.9.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2" 2620:52:0:105f:5054:ff:fe52:e008 - - [30/May/2012:14:07:16 -0400] "GET /SessionTest/hostname HTTP/1.1" 200 31 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.12.9.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2" 2620:52:0:105f::ffff:c - - [30/May/2012:14:07:17 -0400] "DISABLE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" 2620:52:0:105f::ffff:c - - [30/May/2012:14:07:17 -0400] "STOP-APP / HTTP/1.1" 200 107 "-" "ClusterListener/1.0" 2620:52:0:105f:5054:ff:fe52:e008 - - [30/May/2012:14:07:17 -0400] "GET /SessionTest/hostname HTTP/1.1" 503 1114 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.12.9.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2" 2620:52:0:105f::ffff:c - - [30/May/2012:14:07:17 -0400] "REMOVE-APP / HTTP/1.1" 200 - "-" "ClusterListener/1.0" 2620:52:0:105f::ffff:c - - [30/May/2012:14:07:17 -0400] "REMOVE-APP /* HTTP/1.1" 200 - "-" "ClusterListener/1.0" 2620:52:0:105f:5054:ff:fe52:e008 - - [30/May/2012:14:07:18 -0400] "GET /SessionTest/hostname HTTP/1.1" 200 31 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.12.9.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2" 2620:52:0:105f:5054:ff:fe52:e008 - - [30/May/2012:14:07:19 -0400] "GET /SessionTest/hostname HTTP/1.1" 200 31 "-" "curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.12.9.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2" {noformat} finally, we may observe the exact part of the *Error log* that differs from the healthy behaviour: {noformat} [Wed May 30 14:07:17 2012] [debug] ajp_header.c(224): Into ajp_marshal_into_msgb [Wed May 30 14:07:17 2012] [debug] ajp_header.c(290): ajp_marshal_into_msgb: Header[0] [User-Agent] = [curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.12.9.0 zlib/1.2.3 libidn/1.18 libssh2/1.2.2] [Wed May 30 14:07:17 2012] [debug] ajp_header.c(290): ajp_marshal_into_msgb: Header[1] [Host] = [[2620:52:0:105f::ffff:c]:8888] [Wed May 30 14:07:17 2012] [debug] ajp_header.c(290): ajp_marshal_into_msgb: Header[2] [Accept] = [*/*] [Wed May 30 14:07:17 2012] [debug] ajp_header.c(290): ajp_marshal_into_msgb: Header[3] [Via] = [1.1 proxy-01.mw.lab.eng.bos.redhat.com (squid/3.1.10)] [Wed May 30 14:07:17 2012] [debug] ajp_header.c(290): ajp_marshal_into_msgb: Header[4] [X-Forwarded-For] = [2620:52:0:105f:5054:ff:fe9e:282e] [Wed May 30 14:07:17 2012] [debug] ajp_header.c(290): ajp_marshal_into_msgb: Header[5] [Cache-Control] = [max-age=259200] [Wed May 30 14:07:17 2012] [debug] ajp_header.c(290): ajp_marshal_into_msgb: Header[6] [Connection] = [keep-alive] [Wed May 30 14:07:17 2012] [debug] ajp_header.c(450): ajp_marshal_into_msgb: Done [Wed May 30 14:07:17 2012] [debug] mod_proxy_ajp.c(265): proxy: APR_BUCKET_IS_EOS [Wed May 30 14:07:17 2012] [debug] mod_proxy_ajp.c(270): proxy: data to read (max 8186 at 4) [Wed May 30 14:07:17 2012] [debug] mod_proxy_ajp.c(285): proxy: got 0 bytes of data [Wed May 30 14:07:17 2012] [debug] mod_manager.c(1817): manager_trans DISABLE-APP (/) [Wed May 30 14:07:17 2012] [debug] mod_manager.c(2473): manager_handler DISABLE-APP (/) processing: "JVMRoute=f49689d6-cdbb-3015-a642-f8200ea456ff&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2FSessionTest" [Wed May 30 14:07:17 2012] [debug] mod_manager.c(2522): manager_handler DISABLE-APP OK [Wed May 30 14:07:17 2012] [debug] mod_manager.c(1817): manager_trans STOP-APP (/) [Wed May 30 14:07:17 2012] [debug] mod_manager.c(2473): manager_handler STOP-APP (/) processing: "JVMRoute=f49689d6-cdbb-3015-a642-f8200ea456ff&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2FSessionTest" [Wed May 30 14:07:17 2012] [debug] mod_manager.c(2522): manager_handler STOP-APP OK [Wed May 30 14:07:17 2012] [debug] ajp_header.c(687): ajp_read_header: ajp_ilink_received 04 [Wed May 30 14:07:17 2012] [debug] ajp_header.c(697): ajp_parse_type: got 04 [Wed May 30 14:07:17 2012] [debug] ajp_header.c(516): ajp_unmarshal_response: status = 503 [Wed May 30 14:07:17 2012] [debug] ajp_header.c(537): ajp_unmarshal_response: Number of headers is = 2 [Wed May 30 14:07:17 2012] [debug] ajp_header.c(599): ajp_unmarshal_response: Header[0] [Content-Type] = [text/html;charset=utf-8] [Wed May 30 14:07:17 2012] [debug] ajp_header.c(609): ajp_unmarshal_response: ap_set_content_type done [Wed May 30 14:07:17 2012] [debug] ajp_header.c(599): ajp_unmarshal_response: Header[1] [Content-Length] = [1114] [Wed May 30 14:07:17 2012] [debug] ajp_header.c(687): ajp_read_header: ajp_ilink_received 03 [Wed May 30 14:07:17 2012] [debug] ajp_header.c(697): ajp_parse_type: got 03 [Wed May 30 14:07:17 2012] [debug] ajp_header.c(687): ajp_read_header: ajp_ilink_received 05 [Wed May 30 14:07:17 2012] [debug] ajp_header.c(697): ajp_parse_type: got 05 [Wed May 30 14:07:17 2012] [debug] mod_proxy_ajp.c(577): proxy: got response from [2620:52:0:105f::ffff:c]:8009 (2620:52:0:105f:0:0:ffff:c%2) [Wed May 30 14:07:17 2012] [debug] proxy_util.c(2029): proxy: AJP: has released connection for (2620:52:0:105f:0:0:ffff:c%2) [Wed May 30 14:07:17 2012] [debug] mod_manager.c(1817): manager_trans REMOVE-APP (/) [Wed May 30 14:07:17 2012] [debug] mod_manager.c(2473): manager_handler REMOVE-APP (/) processing: "JVMRoute=f49689d6-cdbb-3015-a642-f8200ea456ff&Alias=default-host%2Clocalhost%2Cexample.com&Context=%2FSessionTest" [Wed May 30 14:07:17 2012] [debug] mod_manager.c(2522): manager_handler REMOVE-APP OK [Wed May 30 14:07:17 2012] [debug] mod_manager.c(1817): manager_trans REMOVE-APP (/*) [Wed May 30 14:07:17 2012] [debug] mod_manager.c(2473): manager_handler REMOVE-APP (/NODE_COMMAND) processing: "JVMRoute=f49689d6-cdbb-3015-a642-f8200ea456ff" [Wed May 30 14:07:17 2012] [debug] mod_manager.c(1169): process_node_cmd 4 processing node: 2 [Wed May 30 14:07:17 2012] [debug] mod_manager.c(2522): manager_handler REMOVE-APP OK [Wed May 30 14:07:17 2012] [debug] mod_proxy_cluster.c(678): update_workers_node starting {noformat} It looks like the *manager_handler* did not make it in time so as to avoid the choice of wrong node. (i) *Note:* Does the fact that nodes are talking via *proxy-01.mw.lab.eng.bos.redhat.com (squid/3.1.10)* anything to do with the problem on hand? h3. Configuration Configs are pretty simple, no silly tricks here: h4. HTTPD {noformat} <IfModule manager_module> Listen [2620:52:0:105f::ffff:c]:8888 <VirtualHost [2620:52:0:105f::ffff:c]:8888> ServerName vmg35 ServerAlias *.vmg35 <Directory /> Order deny,allow Deny from all Allow from all </Directory> AdvertiseGroup [ff01::1]:23364 ServerAdvertise off KeepAliveTimeout 60 MaxKeepAliveRequests 0 ManagerBalancerName mbabaceksCluster AdvertiseFrequency 5 EnableMCPMReceive <Location /mcm> SetHandler mod_cluster-manager Order deny,allow Deny from all Allow from all </Location> </VirtualHost> </IfModule> {noformat} h4. AS7 {noformat} <subsystem xmlns="urn:jboss:domain:modcluster:1.1"> <mod-cluster-config advertise="false" proxy-list="[2620:52:0:105f::ffff:c]:8888" sticky-session="true" sticky-session-force="false" sticky-session-remove="false" connector="ajp" > <dynamic-load-provider> <load-metric type="busyness"/> </dynamic-load-provider> </mod-cluster-config> </subsystem> {noformat} h3. IPv4 I tried to reproduce this problem on IPv4 with the exactly same instances and configurations (only IP addresses being different) and I was not able to get any 503 errors. There are some error-free IPv4 logs attached for comparison: * access_log-IPv4-OK.zip * error_log-IPv4-OK.zip * error_log-IPv6-NOK.zip * perf07-Ctrl+C-log-IPv4.zip * perf08-Ctrl+C-log-IPv4.zip
Attachment: Added: access_log-IPv4-OK.zip Attachment: Added: error_log-IPv4-OK.zip Attachment: Added: error_log-IPv6-NOK.zip Attachment: Added: perf07-Ctrl+C-log-IPv4.zip Attachment: Added: perf08-Ctrl+C-log-IPv4.zip
* Is it possible it has anything to do with [JBPAPP-9087], provided *ServerAdvertise off* ? * How about the good old [JBPAPP-8466] ?
Link: Added: This issue is related to JBPAPP-8466
Link: Added: This issue is related to JBPAPP-9087
Link: Added: This issue relates to MODCLUSTER-314
that is a kind of race condition the request is been processed when the application is stopped. The 503 is coming from AS7.
@Michal: I don't think it is related to JBPAPP-9087 nor JBPAPP-8466 as the error comes for AS7.
that is JBPAPP-8466
Link: Added: This issue relates to MODCLUSTER-316
Docs QE Status: Removed: NEW
Link: Added: This issue is related to JBEWS-496