Bug 900606 (JBPAPP6-1343) - CLONE - mod_cluster: HTTP 404 on node shutdown
Summary: CLONE - mod_cluster: HTTP 404 on node shutdown
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: JBPAPP6-1343
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: Apache Server (httpd) and Connectors
Version: 6.0.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ER1
Target Release: EAP 6.4.0
Assignee: Paul Ferraro
QA Contact: Michal Karm Babacek
URL: http://jira.jboss.org/jira/browse/JBP...
Whiteboard: as7 eap6 ipv6 mod_cluster
Depends On:
Blocks: JBPAPP6-1159
Reported: 2012-06-06 13:50 UTC by Michal Karm Babacek
Modified: 2015-01-20 14:36 UTC

Fixed In Version:
Clone Of:
Environment:
RHEL 6 x86_64, pure IPv6, *Apache/2.2.21* (Unix) *mod_cluster/1.2.1.Final*
Last Closed: 2015-01-20 14:36:02 UTC
Type: Bug
Embargoed:


Attachments
access_log_report.zip (2.06 KB, application/zip), 2012-06-06 13:50 UTC, Michal Karm Babacek
error_log_report.zip (61.05 KB, application/zip), 2012-06-06 13:50 UTC, Michal Karm Babacek
httpd.conf.zip (6.11 KB, application/zip), 2012-06-06 13:50 UTC, Michal Karm Babacek
node-vmg35-Ctrl+C-log.zip (15.57 KB, application/zip), 2012-06-06 13:50 UTC, Michal Karm Babacek
node-vmg36-Ctrl+C-log.zip (13.35 KB, application/zip), 2012-06-06 13:50 UTC, Michal Karm Babacek


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 900052 0 high CLOSED mod_cluster: Failover on worker shutdown takes too much time 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 900559 0 urgent CLOSED mod_cluster: HTTP 503 on node shutdown with pure IPv6 setup 2021-02-22 00:41:40 UTC
Red Hat Issue Tracker JBPAPP6-1343 0 Major Closed CLONE - mod_cluster: HTTP 404 on node shutdown 2015-05-21 10:54:33 UTC

Internal Links: 900052 900559

Description Michal Karm Babacek 2012-06-06 13:50:19 UTC
project_key: JBPAPP6

As a follow-up to
 * [JBPAPP-9195] mod_cluster: HTTP 503 on node shutdown with pure IPv6 setup

I have tried this mod_cluster + httpd bundle featuring *Apache/2.2.21* (Unix) *mod_cluster/1.2.1.Final* (unlike in [JBPAPP-9195], where we used Apache/2.2.17 (Unix) DAV/2 mod_cluster/1.2.1.Beta2):

 * [mod_cluster-1.2.1.Final-linux2-x64.tar.gz|http://hudson.qa.jboss.com/hudson/view/Mod_cluster/job/mod_cluster-linux-x86_64-rhel6/47/artifact/jbossnative/build/unix/output/mod_cluster-1.2.1.Final-linux2-x64.tar.gz]

The result is surprising: very frequent HTTP 404 errors on node shutdown.

h3. HTTP client
I have a curl client issuing requests to [2620:52:0:105f::ffff:c]:8888/SessionTest/hostname periodically, with a delay of 1 s. Note that each request starts a new session (no JSESSIONID anywhere). There are two nodes that I switch off and on at random, always leaving enough time for the starting node to come up safely.
{noformat}
Wed May 30 17:00:13 EDT 2012 [2620:52:0:105f::ffff:c]:8888 0
+++ No errors in meanwhile +++
Wed May 30 17:05:24 EDT 2012 [2620:52:0:105f::ffff:c]:8888 0
Wed May 30 17:05:25 EDT 2012 404 Not Found The requested URL /SessionTest/hostname was not found on this server.
+++ HTTP 404 errors keep showing up every second +++
Wed May 30 17:05:58 EDT 2012 404 Not Found The requested URL /SessionTest/hostname was not found on this server.
Wed May 30 17:05:59 EDT 2012 [2620:52:0:105f::ffff:c]:8888 0
+++ No errors in meanwhile +++
Wed May 30 17:06:03 EDT 2012 [2620:52:0:105f::ffff:c]:8888 0
Wed May 30 17:06:04 EDT 2012 404 Not Found The requested URL /SessionTest/hostname was not found on this server.
+++ HTTP 404 errors keep showing up every second +++
Wed May 30 17:06:08 EDT 2012 404 Not Found The requested URL /SessionTest/hostname was not found on this server.
Wed May 30 17:06:09 EDT 2012 [2620:52:0:105f::ffff:c]:8888 0
+++ No errors in meanwhile +++
Wed May 30 17:06:25 EDT 2012 [2620:52:0:105f::ffff:c]:8888 0
{noformat}
Please note the timestamps of the HTTP 404 errors; we will match them against the attached debug logs.
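
To make that matching easier, the client output can be reduced to its 404 windows first. The following is a minimal sketch, not the tooling actually used for this report: the function names and the SAMPLE excerpt are assumptions for illustration, the line layout is taken from the output above, and the timezone token is deliberately ignored.

```python
from datetime import datetime

def parse_client_line(line):
    """Return (timestamp, is_404) for a client output line, or None for '+++' annotations."""
    parts = line.split()
    if line.startswith("+++") or len(parts) < 7:
        return None
    # Tokens: weekday, month, day, time, timezone, year, then the response.
    month, day, clock, year = parts[1], parts[2], parts[3], parts[5]
    stamp = datetime.strptime(f"{month} {day} {year} {clock}", "%b %d %Y %H:%M:%S")
    return stamp, parts[6] == "404"

def error_windows(lines):
    """Group 404 lines into (first_404, last_404) pairs; a successful response closes a window."""
    windows, start, last = [], None, None
    for line in lines:
        parsed = parse_client_line(line)
        if parsed is None:
            continue
        stamp, is_404 = parsed
        if is_404:
            if start is None:
                start = stamp
            last = stamp
        elif start is not None:
            windows.append((start, last))
            start = None
    if start is not None:
        windows.append((start, last))
    return windows

# Excerpt of the client output shown above (elided lines stay elided).
SAMPLE = """\
Wed May 30 17:05:24 EDT 2012 [2620:52:0:105f::ffff:c]:8888 0
Wed May 30 17:05:25 EDT 2012 404 Not Found The requested URL /SessionTest/hostname was not found on this server.
+++ HTTP 404 errors keep showing up every second +++
Wed May 30 17:05:58 EDT 2012 404 Not Found The requested URL /SessionTest/hostname was not found on this server.
Wed May 30 17:05:59 EDT 2012 [2620:52:0:105f::ffff:c]:8888 0
Wed May 30 17:06:03 EDT 2012 [2620:52:0:105f::ffff:c]:8888 0
Wed May 30 17:06:04 EDT 2012 404 Not Found The requested URL /SessionTest/hostname was not found on this server.
Wed May 30 17:06:08 EDT 2012 404 Not Found The requested URL /SessionTest/hostname was not found on this server.
Wed May 30 17:06:09 EDT 2012 [2620:52:0:105f::ffff:c]:8888 0
"""

for first, last in error_windows(SAMPLE.splitlines()):
    print(first.time(), "to", last.time(), (last - first).total_seconds(), "s")
```

Each window can then be compared with the node and httpd logs for the same interval, provided the clocks of the machines involved are reasonably in sync.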

h4. IO error
(i) *Note:* At *17:05:24* node vmg36 was switched off, and vmg35 (up and running by that time) was supposed to take over. What actually happened on *vmg35* was the *IO error sending command CONFIG to proxy* exception listed below, at *17:05:29*, which is 5 seconds after vmg36's shutdown. Was httpd somehow too busy to accept the command?
(i) *Note:* Does the fact that the nodes are talking via proxy-01.mw.lab.eng.bos.redhat.com (squid/3.1.10) have anything to do with the problem at hand?
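
The timing in the first note can be checked with a few lines of Python. This is only a sketch under stated assumptions: the helper names are made up for illustration, the shutdown time is the one read off the client output, and the comparison is meaningful only if the client and vmg35 clocks agree.

```python
from datetime import datetime

def parse_clock(text):
    """Parse 'HH:MM:SS' (client output) or 'HH:MM:SS,mmm' (node log) into a datetime."""
    fmt = "%H:%M:%S,%f" if "," in text else "%H:%M:%S"
    return datetime.strptime(text, fmt)

def gap_seconds(earlier, later):
    """Seconds elapsed between two same-day clock readings."""
    return (parse_clock(later) - parse_clock(earlier)).total_seconds()

def falls_in_window(event, first, last):
    """True if the event lies inside the closed interval [first, last]."""
    return parse_clock(first) <= parse_clock(event) <= parse_clock(last)

SHUTDOWN_VMG36 = "17:05:24"        # vmg36 switched off, per the note above
CONFIG_IO_ERROR = "17:05:29,133"   # 'IO error sending command CONFIG' in the vmg35 log

print(gap_seconds(SHUTDOWN_VMG36, CONFIG_IO_ERROR))
# First 404 window seen by the client: 17:05:25 to 17:05:58.
print(falls_in_window(CONFIG_IO_ERROR, "17:05:25", "17:05:58"))
```

The CONFIG failure lands inside the first run of 404 responses, which fits a timing problem but does not by itself say whether httpd, the network, or the squid proxy is at fault.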

h3. Worker nodes
The configuration is exactly the same as in [JBPAPP-9195]; I just swapped the balancer. If you take a look at the attached
 * node-vmg35-Ctrl+C-log.zip
 * node-vmg36-Ctrl+C-log.zip

you may observe the shutdown time stamps ( *^C* ) as well as several exceptions:
*vmg35, IP:2620:52:0:105f:0:0:ffff:c, JvmRoute:f49689d6-cdbb-3015-a642-f8200ea456ff*
 * 17:04:26,550 WARN  [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher] Problems unmarshalling remote command from byte buffer: java.lang.NullPointerException
 * 17:05:29,133 INFO  [org.jboss.modcluster.mcmp.impl.DefaultMCMPHandler] (ContainerBackgroundProcessor[StandardEngine[jboss.web]]) IO error sending command CONFIG to proxy 2620:52:0:105f:0:0:ffff:c/2620:52:0:105f:0:0:ffff:c:8888: java.net.SocketTimeoutException: Read timed out

*vmg36, IP:2620:52:0:105f::ffff:0, JvmRoute:dc7bd552-a020-3d08-acee-4ae3e0f178a8*
 * 17:03:36,275 WARN  [org.infinispan.remoting.transport.jgroups.CommandAwareRpcDispatcher] Problems unmarshalling remote command from byte buffer: java.lang.NullPointerException
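
The IPv6 addresses in this report appear both in compressed form (the curl target, 2620:52:0:105f::ffff:c) and in expanded form (the proxy address in the DefaultMCMPHandler message, 2620:52:0:105f:0:0:ffff:c), so a plain string comparison across logs can miss matches. A minimal sketch using Python's standard ipaddress module; the constant and function names are assumptions for illustration:

```python
import ipaddress

CLIENT_TARGET = "2620:52:0:105f::ffff:c"     # address the curl client talks to
PROXY_IN_LOG = "2620:52:0:105f:0:0:ffff:c"   # proxy address in the vmg35 log message
VMG36 = "2620:52:0:105f::ffff:0"             # vmg36 address as listed above

def same_address(a, b):
    """Compare two IPv6 literals by value rather than by spelling."""
    return ipaddress.IPv6Address(a) == ipaddress.IPv6Address(b)

def canonical(addr):
    """Return the compressed form, suitable for grepping across logs."""
    return ipaddress.IPv6Address(addr).compressed

print(same_address(CLIENT_TARGET, PROXY_IN_LOG))
print(canonical(PROXY_IN_LOG))
print(same_address(CLIENT_TARGET, VMG36))
```

Normalizing every address to one spelling before searching the attached logs avoids treating the two notations as different hosts.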

h3. Httpd
I am about to investigate the attached *error_log_report.zip*. I have not managed to see what was wrong yet.
The interesting entries probably lie between *17:05:24* and *17:05:29*, and on through to the transition at *17:05:58* to *17:05:59*.

(i) *Note:* I have not yet carried out the IPv4/IPv6 comparison; that this issue is IPv6 / network related is just a suspicion.

To be continued...

Comment 1 Michal Karm Babacek 2012-06-06 13:50:19 UTC
Link: Added: This issue Cloned from MODCLUSTER-314


Comment 2 Michal Karm Babacek 2012-06-06 13:50:20 UTC
Link: Added: This issue is related to JBPAPP-8466


Comment 3 Michal Karm Babacek 2012-06-06 13:50:20 UTC
Link: Added: This issue is related to JBPAPP-9195


Comment 4 Michal Karm Babacek 2012-06-06 13:51:10 UTC
Security: Added: Public
Docs QE Status: Added: NEW


Comment 5 Michal Karm Babacek 2012-06-06 13:52:31 UTC
I have cloned the Issue so as to have it in JBPAPP space as well.

Comment 6 Jean-Frederic Clere 2012-06-06 14:38:46 UTC
As it is JBPAPP-8466.

Comment 7 Rajesh Rajasekaran 2012-06-06 16:40:26 UTC
Changing the JIRA title as per Jean-Frederic's comment on the upstream JIRA
"BTW: I don't think it is related to IPv6 I managed to have it on IPv4 but it is really seldom, it is probably related to some timing issues."

Comment 8 Rostislav Svoboda 2012-06-07 09:27:42 UTC
Link: Added: This issue is a dependency of JBPAPP-9188


Comment 9 Jean-Frederic Clere 2012-06-07 16:46:14 UTC
According to the trace, there seems to have been a network issue during the test, which is why we see the 404s.
From 17:04:44 to ~17:05:59, AS7 was not able to send the CONFIG + ENABLE-APP, which explains the 404s.

I tried something similar on IPv4 yesterday and could not reproduce it.

Comment 10 Rajesh Rajasekaran 2012-06-07 23:58:37 UTC
I might have misunderstood Jean-Frederic's comment when I changed the JIRA title to remove IPv6.
The bug was filed for IPv6, but was it verified as not reproducible on IPv4?

Michal, can you comment if this issue is specific to ipv6 or applies to ipv4 as well? 
Could you also look at the potential network issue in the above comment? 

Comment 11 Jean-Frederic Clere 2012-06-08 14:12:11 UTC
I can't reproduce the issue here using 3 directly connected boxes.
I tried both IPv6 and IPv4.

Comment 12 Anne-Louise Tangring 2012-11-13 20:57:46 UTC
Docs QE Status: Removed: NEW 


Comment 14 Michal Karm Babacek 2015-01-20 14:36:02 UTC
It's been impossible to reproduce it since EAP 6.3.

