Bug 987259 - hot-standby for mod_cluster
Status: CLOSED CURRENTRELEASE
Product: JBoss Enterprise Application Platform 6
Classification: JBoss
Component: mod_cluster
Version: 6.1.0
Hardware: Unspecified OS: Unspecified
Priority: high Severity: high
Target Milestone: ER9
Target Release: EAP 6.3.0
Assigned To: Vaclav Tunka
QA Contact: Michal Karm Babacek
Depends On: 1107551 1101681
Blocks:
Reported: 2013-07-23 01:58 EDT by Hisanobu Okuda
Modified: 2017-10-09 20:22 EDT

Doc Type: Enhancement
Doc Text:
This release of JBoss EAP 6 introduces a 'hot-standby' feature to mod_cluster.
Last Closed: 2014-08-06 10:36:26 EDT
Type: Feature Request

External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
JBoss Issue Tracker | EAP6-172 | Major | Closed | hot-standby for mod_cluster | 2017-09-20 00:38 EDT
JBoss Issue Tracker | MODCLUSTER-235 | Major | Closed | Support hot-standby worker nodes | 2017-09-20 00:38 EDT
JBoss Issue Tracker | MODCLUSTER-391 | Major | Open | mod_cluster and mod_proxy integration | 2017-09-20 00:38 EDT
JBoss Issue Tracker | PRODMGT-494 | Major | Resolved | Need fail_on_status and hot-standby of mod_jk for mod_cluster | 2017-09-20 00:38 EDT

Description Hisanobu Okuda 2013-07-23 01:58:35 EDT
Description of problem:
Need hot-standby feature for mod_cluster

Version-Release number of selected component (if applicable):
EAP 6.x

Comment 1 JBoss JIRA Server 2013-07-26 15:04:38 EDT
John Doyle <jdoyle@jboss.org> made a comment on jira PRODMGT-494

We can look at this RFE, but we don't have a release for EAP that could deliver the capability by the October 2013 date specified by the user.
Comment 2 JBoss JIRA Server 2013-07-26 15:08:18 EDT
John Doyle <jdoyle@jboss.org> made a comment on jira PRODMGT-494

I see you have this in mod_cluster already and have been moving the target forward.  Do you have a release in mind?
Comment 3 JBoss JIRA Server 2013-07-29 03:47:28 EDT
Hisanobu Okuda <hokuda@redhat.com> made a comment on jira PRODMGT-494

If you meant https://issues.jboss.org/browse/MODCLUSTER-235 , it is not implemented yet.
Comment 7 Jean-frederic Clere 2014-02-07 10:44:37 EST
I think the feature is already there; just configure the node as:
<simple-load-provider factor="0"/>
Comment 8 John Doyle 2014-02-07 11:13:22 EST
Do you know when that came into code so we can mark a fixed version?  Or has it always been there?
Comment 9 Jean-frederic Clere 2014-02-07 11:16:20 EST
In fact it is crippled by the description of the modcluster subsystem.
"JBAS014708: 0 is an invalid value for parameter factor. A minimum value of 1 is required"
That needs to be changed and the feature should be tested.
Comment 11 JBoss JIRA Server 2014-02-11 03:28:35 EST
Jean-Frederic Clere <jfclere@jboss.org> updated the status of jira MODCLUSTER-235 to Resolved
Comment 14 Hisanobu Okuda 2014-02-11 19:15:45 EST
https://issues.jboss.org/browse/EAP6-172 was filed for this request.
Comment 16 Jean-frederic Clere 2014-02-12 03:55:00 EST
mod_cluster won't send any requests to a hot-standby node unless:
1 - The node is changed to a normal node (factor > 0)
2 - All the other nodes are in error or have been removed
Note that as soon as another node starts, requests will be directed to the new node.
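
The two conditions above can be sketched as a small selection function. This is only an illustration of the described behaviour, not the mod_cluster implementation; the function name and the dictionary of node name to load factor are hypothetical, and a negative factor is assumed to mean "in error".

```python
from typing import Dict, List


def eligible_nodes(load_factors: Dict[str, int]) -> List[str]:
    """Return the nodes that may receive requests.

    Assumed convention (illustrative only): factor > 0 is a normal node,
    factor == 0 is a hot-standby node, a negative factor is a node in error.
    Removed nodes are simply absent from the mapping.
    """
    # Normal nodes always take precedence over hot-standby nodes.
    active = sorted(name for name, factor in load_factors.items() if factor > 0)
    if active:
        return active
    # Only when no normal node is usable do the hot-standby nodes get traffic.
    return sorted(name for name, factor in load_factors.items() if factor == 0)
```

With one normal node and one standby node the standby stays idle; once the normal node is in error or removed, the standby becomes the only eligible target, and it goes idle again as soon as a normal node reappears.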
Comment 18 Kabir Khan 2014-02-16 04:59:56 EST
It is not clear to me if https://github.com/jbossas/jboss-eap/pull/916 completely solves the issue. If that is not the case, please move this issue back to the ASSIGNED state.
Comment 20 Jean-frederic Clere 2014-02-17 02:00:24 EST
The issue also requires a fix in the C part (mod_cluster-1.2.8.Final).
Comment 23 Rostislav Svoboda 2014-02-20 13:10:30 EST
https://issues.jboss.org/browse/EAP6-172 not yet acked, removing ack
Comment 26 Michal Karm Babacek 2014-03-10 09:32:50 EDT
QA_ACK, thanks Hisanobu for the cooperation.
Comment 28 JBoss JIRA Server 2014-04-18 13:02:50 EDT
Michal Babacek <mbabacek@redhat.com> updated the status of jira MODCLUSTER-235 to Closed
Comment 29 Michal Karm Babacek 2014-06-18 10:56:56 EDT
Bad news: It's broken.

Good news: Here is a patch that fixes it: https://github.com/modcluster/mod_cluster/pull/95

What's wrong:

The hot-standby node appears as a "Load: -1" node. That's wrong; it must be Load: 0 so that requests can be forwarded to it when no other nodes are available.

i.e.:

 * load > 0  : a load factor.
 * load = 0  : standby worker.
 * load = -1 : errored worker.
 * load = -2 : just do a cping/cpong.
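
For reference, the value table above could be written down as a lookup like the following. It is a sketch that restates the list, not code from mod_cluster; the function name is made up for illustration.

```python
def describe_load(load: int) -> str:
    """Map a reported Load value to its meaning, per the list above."""
    if load > 0:
        return "load factor"
    if load == 0:
        return "standby worker"
    if load == -1:
        return "errored worker"
    if load == -2:
        return "just do a cping/cpong"
    # Any other value is not covered by the list, so it is rejected here.
    raise ValueError(f"unexpected load value: {load}")
```

A hot-standby node reporting -1 is therefore indistinguishable from an errored worker, which is the problem described in this comment.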
Comment 30 Jean-frederic Clere 2014-06-18 11:33:07 EDT
The requests should be forwarded even if the load is -1.
Comment 31 Michal Karm Babacek 2014-06-18 11:37:33 EDT
I get 

Balancer: qacluster,LBGroup: ,Flushpackets: Off,Flushwait: 10000,Ping: 10000000,Smax: 1,Ttl: 60000000,Status: OK,Elected: 0,Read: 0,Transferred: 0,Connected: 0,Load: -1 

<title>503 Service Temporarily Unavailable</title>

with Load: -1
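
To check the Load value in a script rather than by eye, a status line like the one above can be split into fields. This is a minimal sketch that assumes the comma-separated "Key: value" layout shown in this comment; the function name is hypothetical and this is not a mod_cluster API.

```python
from typing import Dict


def parse_status_line(line: str) -> Dict[str, str]:
    """Split a mod_cluster-manager status line into a field dictionary.

    Assumes the 'Key: value,Key: value,...' layout seen above; values are
    kept as strings because some fields (e.g. LBGroup) can be empty.
    """
    fields = {}
    for part in line.strip().split(","):
        key, _, value = part.partition(":")
        fields[key.strip()] = value.strip()
    return fields


status = parse_status_line(
    "Balancer: qacluster,LBGroup: ,Flushpackets: Off,Flushwait: 10000,"
    "Ping: 10000000,Smax: 1,Ttl: 60000000,Status: OK,Elected: 0,Read: 0,"
    "Transferred: 0,Connected: 0,Load: -1 "
)
print(status["Load"])  # the field to compare against the expected value 0
```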
Comment 32 Radoslav Husar 2014-06-18 11:40:19 EDT
Hi Michal, could you elaborate on how you tested? (I couldn't reproduce with master, which should have very similar code in this area.)
Comment 33 Jean-frederic Clere 2014-06-18 11:51:03 EDT
See https://bugzilla.redhat.com/show_bug.cgi?id=1074550#c2
Comment 34 Michal Karm Babacek 2014-06-18 12:12:26 EDT
Sure, let's have a look:

- grab the closest RHEL7 x86_64 box (haven't tried to reproduce elsewhere yet)
- download these (sha1sum):

  29b1578ade041492cd18db9f225f4de1bf025a7f  jboss-eap-6.3.0.ER7.zip
  2ba91fed7bf2ea830e1469f66baeabb0dd05701e  httpd/jboss-ews-httpd-2.1.0-RHEL7-x86_64.zip
  9807dcabcb91f1f102e278d33f503c399f985264  jboss-eap-native-webserver-connectors-6.3.0.ER7-RHEL7-x86_64.zip

- start httpd, my config: 

MemManagerFile "/dev/shm/mod_cluster-eapx/jboss-ews-2.1/httpd/cache/mod_cluster"
ServerName 192.168.122.78:2181
<IfModule manager_module>
  Listen 192.168.122.78:8847
  LogLevel debug
  <VirtualHost 192.168.122.78:8847>
    ServerName 192.168.122.78:8847
    <Directory />
      Order deny,allow
      Deny from all
      Allow from all
    </Directory>
    KeepAliveTimeout 60
    MaxKeepAliveRequests 0
    ServerAdvertise on
    AdvertiseFrequency 5
    ManagerBalancerName qacluster
    AdvertiseGroup 224.0.5.12:65409
    EnableMCPMReceive
    <Location /mcm>
      SetHandler mod_cluster-manager
      Order deny,allow
      Deny from all
      Allow from all
    </Location>
  </VirtualHost>
</IfModule>

- configure two ordinary standalone-ha.xml EAP instances and set their jvmRoutes
  <system-properties>
    <property name="jboss.mod_cluster.jvmRoute" value="jboss-eap-6.3"/>
    <property name="jboss.node.name" value="jboss-eap-6.3"/>
  </system-properties>
  <socket-binding name="modcluster" port="0" multicast-address="224.0.5.12" multicast-port="65409"/>

- configure one hot-standby instance (jvmRoute set via property as above...)
  <subsystem xmlns="urn:jboss:domain:modcluster:1.2">
      <mod-cluster-config advertise-socket="modcluster" connector="ajp">
          <simple-load-provider factor="0"/>
       </mod-cluster-config>
  </subsystem>

- start it

- you have, e.g.:

Node jboss-eap-6.3 (ajp://192.168.122.78:8009): 
+++ SNAP +++ Status: OK,Elected: 0,Read: 0,Transferred: 0,Connected: 0,Load: 100 

Node jboss-eap-6.3-2 (ajp://192.168.122.78:8110): 
+++ SNAP +++ Status: OK,Elected: 0,Read: 0,Transferred: 0,Connected: 0,Load: 100 

Node jboss-eap-6.3-3 (ajp://192.168.122.78:8215): 
+++ SNAP +++ Status: OK,Elected: 0,Read: 0,Transferred: 0,Connected: 0,Load: -1

- curling:

  repeatedly accessing: curl 'http://rhel7x86-64:8847/clusterbench/requestinfo;jsessionid=awyGRY5SLkVYqhQuhSoHqwGp.jboss-eap-6.3-2';


JVM route: jboss-eap-6.3-2
Session ID: awyGRY5SLkVYqhQuhSoHqwGp.jboss-eap-6.3-2
Session isNew: false

JVM route: jboss-eap-6.3-2
Session ID: awyGRY5SLkVYqhQuhSoHqwGp.jboss-eap-6.3-2
Session isNew: false

JVM route: jboss-eap-6.3-2
Session ID: awyGRY5SLkVYqhQuhSoHqwGp.jboss-eap-6.3-2
Session isNew: false

-- Node jboss-eap-6.3-2 stopped --

JVM route: jboss-eap-6.3
Session ID: awyGRY5SLkVYqhQuhSoHqwGp.jboss-eap-6.3
Session isNew: false

JVM route: jboss-eap-6.3
Session ID: awyGRY5SLkVYqhQuhSoHqwGp.jboss-eap-6.3
Session isNew: false

-- Node jboss-eap-6.3 stopped --


<title>503 Service Temporarily Unavailable</title>
+++ SNAP +++
<address>Apache/2.2.26 (Red Hat Enterprise Web Server) Server at rhel7x86-64 Port 8847</address>

<title>503 Service Temporarily Unavailable</title>
+++ SNAP +++
<address>Apache/2.2.26 (Red Hat Enterprise Web Server) Server at rhel7x86-64 Port 8847</address>





The problem in the code is the lbfactor, as you might observe. Take a look at these macros and maybe run the preprocessor to see what C code you are actually going to compile...
Comment 35 Michal Karm Babacek 2014-06-18 12:18:54 EDT
@All: Load: -1 for a hot-standby node is wrong, it shows that lbfactor is -1. -1 is an error code for lbfactor -- i.e. node won't be used at all.
Comment 36 Jean-frederic Clere 2014-06-19 13:34:29 EDT
+++
 Node 4e6189af-0502-3305-8ff3-fad7fee8b516 (ajp://127.0.0.1:8009):
Enable Contexts Disable Contexts Stop Contexts
Balancer: mycluster,LBGroup: ,Flushpackets: Off,Flushwait: 10000,Ping: 10000000,Smax: 1,Ttl: 60000000,Status: OK,Elected: 0,Read: 0,Transferred: 0,Connected: 0,Load: -1
Virtual Host 1:
Contexts:

/compileFailure, Status: ENABLED Request: 0 Disable Stop
+++

+++
[jfclere@jfcpc APACHE-2.2.21]$ curl -v http://localhost:8080/compileFailure/
.....
HTTP/1.1 200 OK
....
+++

It works for me; I don't understand why...
Comment 37 Jean-frederic Clere 2014-06-19 13:59:40 EDT
With 2.2.21 it works.
With 2.2.26 it fails :-(
Something is wrong :-(
Comment 39 Jean-frederic Clere 2014-06-20 03:27:04 EDT
I have merged https://github.com/modcluster/mod_cluster/pull/95

I can't explain why it worked for me in 2.2.21... I was probably compiling with another httpd than 2.2.21 :-(
Comment 40 Kabir Khan 2014-06-23 05:03:13 EDT
Speaking to Jean-Frederic, this will be a native upgrade
Comment 42 Michal Karm Babacek 2014-07-02 06:59:45 EDT
This issue will be verified in ER9 as a part of mod_cluster 1.2.9.Final + comment 29 patch. There won't be any component upgrade to 1.2.10.Final.

Regarding documentation and release notes:

Please,
See BZ 1074550
See https://bugzilla.redhat.com/show_bug.cgi?id=1115083#c0, Paragraph 4.

This new feature should be mentioned in release notes as per BZ 1115083.
Comment 43 Michal Karm Babacek 2014-07-11 14:42:51 EDT
Uff...it's been a long journey :-)
Comment 44 Michal Karm Babacek 2014-07-11 14:43:41 EDT
EAP 6.3.0.ER9
Comment 45 Michal Karm Babacek 2014-07-11 15:29:04 EDT
I was too quick in my judgement in comment 43.

While RHEL, HP-UX and Solaris builds work, the Windows binaries still present the "Load: -1" error.
It looks like the patch wasn't applied on mod_cluster 1.2.9.Final on Windows :-(
Comment 46 Jean-frederic Clere 2014-07-15 04:01:54 EDT
According to my investigation, the patch is in fact applied, but something went wrong during production.
Could you please test with https://brewweb.devel.redhat.com/buildinfo?buildID=368750
Comment 47 Vaclav Tunka 2014-07-15 06:45:04 EDT
This is an RCM issue, I created RT for them:
#306213 Investigate: extreme repo-regen delays, bad NVRs picked by Brew/Win builds

It seems Brew incorrectly picked the old mod_cluster-native-1.2.9-4.Final.win6 version, even though mod_cluster-native-1.2.9-5.Final.win6 had been built more than an hour before the compose was run.

2014-07-08 09:45:44              mod_cluster-native-1.2.9-5.Final.win6 
Tue, 08 Jul 2014 10:58:10 EDT    jboss-eap-native-webserver-connectors-6.3.0-7.win6

Log for connectors:
http://download.devel.redhat.com/brewroot/packages/jboss-eap-native-webserver-connectors/6.3.0/7.win6/data/logs/win/build.log

Here you can see the version that was picked up (the wrong one):
2014-07-08 10:55:00,265 [INFO] koji.vm: Retrieved /tmp/build/buildreqs/mod_cluster-native/win/mod_cluster-native-1.2.9-4.Final.win6.x86_64.zip (70762 bytes, md5: 7c7280627c5021d9abdbe176c9f0d4a7)
2014-07-08 10:55:00,296 [INFO] koji.vm: Retrieved /tmp/build/buildreqs/mod_cluster-native/win/mod_cluster-native-1.2.9-4.Final.win6.i686.zip (66322 bytes, md5: baa6fd0347cb6cfe1fd3b9c8c74ecbc1)

In Brew/Win there is no way to specify an exact version; the latest version is always used for dependencies.

I will respin the jboss-eap-native-webserver-connectors and run the compose again today.
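
The manual check of the build log described above could be scripted roughly as follows. This is a sketch under assumptions: it relies on the "Retrieved <path>.zip" wording of the two log lines quoted in this comment and on a name-version-release.arch file naming; the function name is hypothetical and it is not part of Brew or koji.

```python
import posixpath
import re
from typing import List, Tuple


def retrieved_builds(log_text: str) -> List[Tuple[str, str, str, str]]:
    """Return (name, version, release, arch) for each retrieved zip in a log.

    Assumes file names of the form name-version-release.arch.zip, where the
    arch contains no dot and neither version nor release contains a hyphen.
    """
    builds = []
    for path in re.findall(r"Retrieved (\S+\.zip)", log_text):
        stem = posixpath.basename(path)[: -len(".zip")]
        nvr, arch = stem.rsplit(".", 1)          # split off the architecture
        name, version, release = nvr.rsplit("-", 2)  # name may contain hyphens
        builds.append((name, version, release, arch))
    return builds
```

Comparing the returned release (here `4.Final.win6`) with the one that was expected (`5.Final.win6`) flags the stale dependency without reading the whole log.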
Comment 48 Vaclav Tunka 2014-07-15 06:46:26 EDT
The same is true for the src compose:
http://download.devel.redhat.com/brewroot/packages/jboss-eap/6.3.0/9.win6/data/logs/win/build.log

I will run that one as well.
Comment 49 Vaclav Tunka 2014-07-15 09:10:36 EDT
Windows containers (now verified to contain mod_cluster-native-1.2.9-5.Final.win6)

jboss-eap-6.3.0-10.win6
https://brewweb.devel.redhat.com/buildinfo?buildID=369683

jboss-eap-native-webserver-connectors-6.3.0-8.win6
https://brewweb.devel.redhat.com/buildinfo?buildID=369682

top level compose - httpd (also contains mod_cluster-native)
jboss-eap6-httpd-natives-6.3.0-8.ep6.el6
https://brewweb.devel.redhat.com/buildinfo?buildID=369695

top level compose for EAP 6.x handoff (running)
https://brewweb.devel.redhat.com/taskinfo?taskID=7701185
Comment 50 Michal Karm Babacek 2014-07-18 06:54:58 EDT
It works on Solaris and Windows now.
Verified with EAP 6.3.0.ER10.
Comment 51 JBoss JIRA Server 2014-08-01 07:05:45 EDT
Michal Babacek <mbabacek@redhat.com> updated the status of jira EAP6-172 to Resolved
Comment 52 JBoss JIRA Server 2015-04-28 11:09:42 EDT
John Doyle <jdoyle@jboss.org> updated the status of jira EAP6-172 to Closed
