Bug 880729 - WSRP clustering - Incorrect WSDL handling for producer in cluster
Summary: WSRP clustering - Incorrect WSDL handling for producer in cluster
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Enterprise Portal Platform 6
Classification: JBoss
Component: Portal
Version: 6.0.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: CR01
Target Release: 6.0.0
Assignee: hfnukal@redhat.com
QA Contact: Michal Vanco
URL:
Whiteboard:
Depends On:
Blocks: 905410
Reported: 2012-11-27 16:47 UTC by Michal Vanco
Modified: 2013-04-30 23:36 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
JBoss WS would output individual nodes' URLs for ports in the WSDL it generated, even when the WSDL was requested through a loadbalancer-fronted cluster. As a result, the WSRP consumer obtained a WS client linked to an individual node's URL instead of the loadbalancer's URL. Failover did not work properly because the consumer could hold on to a node's URL even after that node was brought down. To fix the issue, a patch for the JBoss WS behavior was applied at the Apache CXF level. The WSDL now correctly uses the loadbalancer's URL, which allows proper failover.
Clone Of:
: 905410 (view as bug list)
Environment:
Last Closed: 2013-04-16 08:55:33 UTC
Type: Bug




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker GTNWSRP-340 0 Major Closed WS ports link to individual nodes' URL instead of loadbalancer 2014-08-16 19:47:10 UTC
Red Hat Issue Tracker JBWS-3569 0 Critical Closed WSDL produced for a multi-port service contains invalid port addresses in clustered configuration 2014-08-16 19:47:09 UTC

Description Michal Vanco 2012-11-27 16:47:05 UTC
Description of problem:
The producer behind the loadbalancer is not handled properly: the consumer always keeps the URL of an individual node (the active node) instead of the loadbalancer's URL.

Version-Release number of selected component (if applicable):
JPP 6.0.0 ER03

How reproducible:
always

Steps to Reproduce:
1. Set up the producer as a 2-node cluster behind a loadbalancer, and use another instance as the consumer against this loadbalancer
2. Register the producer (with the loadbalancer URL)
3. Access a remote portlet, then stop the active producer node
4. A connection error occurs for the stopped node: the portlet is not displayed and the consumer has to be refreshed
  
Here's what the WSDL looks like on the loadbalancer:
<?xml version='1.0' encoding='UTF-8'?><definitions targetNamespace="urn:oasis:names:tc:wsrp:v2:wsdl" xmlns="http://schemas.xmlsoap.org/wsdl/" xmlns:bind="urn:oasis:names:tc:wsrp:v2:bind" xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/">
  <import location="http://perf13.mw.lab.eng.bos.redhat.com:8080/wsrp-producer/v2/MarkupService?wsdl=wsrp-2.0-bindings.wsdl" namespace="urn:oasis:names:tc:wsrp:v2:bind">
    </import>
  <service name="WSRPService">
    <port binding="bind:WSRP_v2_ServiceDescription_Binding_SOAP" name="WSRPServiceDescriptionService">
      <soap:address location="http://perf11.mw.lab.eng.bos.redhat.com:8080/wsrp-producer/v2/ServiceDescriptionService"/>
    </port>
    <port binding="bind:WSRP_v2_PortletManagement_Binding_SOAP" name="WSRPPortletManagementService">
      <soap:address location="http://perf11.mw.lab.eng.bos.redhat.com:8080/wsrp-producer/v2/PortletManagementService"/>
    </port>
    <port binding="bind:WSRP_v2_Markup_Binding_SOAP" name="WSRPMarkupService">
      <soap:address location="http://perf11.mw.lab.eng.bos.redhat.com:8080/wsrp-producer/v2/MarkupService"/>
    </port>
    <port binding="bind:WSRP_v2_Registration_Binding_SOAP" name="WSRPRegistrationService">
      <soap:address location="http://perf11.mw.lab.eng.bos.redhat.com:8080/wsrp-producer/v2/RegistrationService"/>
    </port>
  </service>
</definitions>
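
A check like the following can make the mismatch in a WSDL such as the one above easy to spot. This is only a minimal sketch, not part of the original report: the class name `WsdlAddressCheck`, the method `foreignAddresses`, and the `example.com` host names are hypothetical, and it uses only the JDK's built-in XML parser. It lists every `soap:address` location whose host is not the expected loadbalancer host.

```java
import java.io.ByteArrayInputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, for illustration only.
class WsdlAddressCheck {
    // Namespace of the WSDL 1.1 SOAP binding, as declared in the WSDL (xmlns:soap).
    static final String SOAP_NS = "http://schemas.xmlsoap.org/wsdl/soap/";

    /** Returns every soap:address location whose host differs from expectedHost. */
    static List<String> foreignAddresses(String wsdl, String expectedHost) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // required for getElementsByTagNameNS
        Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(wsdl.getBytes(StandardCharsets.UTF_8)));

        NodeList addresses = doc.getElementsByTagNameNS(SOAP_NS, "address");
        List<String> mismatches = new ArrayList<>();
        for (int i = 0; i < addresses.getLength(); i++) {
            String location = ((Element) addresses.item(i)).getAttribute("location");
            // A location without a parsable host is also reported as a mismatch.
            if (!expectedHost.equalsIgnoreCase(URI.create(location).getHost())) {
                mismatches.add(location);
            }
        }
        return mismatches;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical two-port WSDL: one port points at a node, one at the loadbalancer.
        String wsdl = "<definitions xmlns=\"http://schemas.xmlsoap.org/wsdl/\""
                + " xmlns:soap=\"http://schemas.xmlsoap.org/wsdl/soap/\">"
                + "<service name=\"WSRPService\">"
                + "<port name=\"A\"><soap:address location=\"http://node1.example.com:8080/a\"/></port>"
                + "<port name=\"B\"><soap:address location=\"http://lb.example.com:8080/b\"/></port>"
                + "</service></definitions>";
        for (String location : foreignAddresses(wsdl, "lb.example.com")) {
            System.out.println("Not behind the loadbalancer: " + location);
        }
    }
}
```

Run against the WSDL fetched through the loadbalancer, an empty result would indicate that all port addresses were rewritten.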

Comment 1 Michal Vanco 2012-11-29 10:51:41 UTC
I've just reproduced this with gatein master.

My scenario:
 - 2 nodes as producer + loadbalancer, one non-clustered instance as consumer
 - register consumer against loadbalancer
 - (add remote portlet)
 - stop active producer node
 - refresh consumer -> 
Caused by: java.net.ConnectException: ConnectException invoking http://perf07.mw.lab.eng.bos.redhat.com:8080/wsrp-producer/v2/ServiceDescriptionService: Connection refused
(perf07 was active producer node which was stopped)

Comment 2 Michal Vanco 2012-11-29 10:54:44 UTC
And it's also not possible to delete/deregister consumer if the node is down.

Comment 3 JBoss JIRA Server 2012-11-29 16:29:37 UTC
Chris Laprun <chris.laprun@jboss.com> made a comment on jira GTNWSRP-340

I have a consumer-side fix, which is sub-optimal since it will only fix the issue in the GateIn to GateIn case… :(

Comment 4 claprun@redhat.com 2012-11-30 17:59:31 UTC
Proper fix will require a JBoss WS patch, unfortunately.

Comment 6 JBoss JIRA Server 2012-12-05 15:05:10 UTC
Alessio Soldano <asoldano@redhat.com> updated the status of jira JBWS-3569 to Coding In Progress

Comment 7 JBoss JIRA Server 2012-12-05 15:17:11 UTC
Alessio Soldano <asoldano@redhat.com> made a comment on jira JBWS-3569

This is basically fixed by the changes I just applied for https://issues.apache.org/jira/browse/CXF-4677 . The 'autoRewriteSoapAddressForAllServices' option introduced there has to be activated in RequestHandlerImpl, as per the patch snippet below:

{noformat}

Index: modules/server/src/main/java/org/jboss/wsf/stack/cxf/RequestHandlerImpl.java
===================================================================
--- modules/server/src/main/java/org/jboss/wsf/stack/cxf/RequestHandlerImpl.java	(revision 17049)
+++ modules/server/src/main/java/org/jboss/wsf/stack/cxf/RequestHandlerImpl.java	(working copy)
@@ -197,8 +197,10 @@
          String ctxUri = req.getRequestURI();
          String baseUri = req.getRequestURL().toString() + "?" + req.getQueryString();
          EndpointInfo endpointInfo = dest.getEndpointInfo();
-         endpointInfo.setProperty(WSDLGetUtils.AUTO_REWRITE_ADDRESS,
-               ServerConfig.UNDEFINED_HOSTNAME.equals(serverConfig.getWebServiceHost()));
+         if (serverConfig.isModifySOAPAddress()) {
+            endpointInfo.setProperty(WSDLGetUtils.AUTO_REWRITE_ADDRESS_ALL,
+                  ServerConfig.UNDEFINED_HOSTNAME.equals(serverConfig.getWebServiceHost()));
+         }
 
          for (QueryHandler queryHandler : bus.getExtension(QueryHandlerRegistry.class).getHandlers())
          {
{noformat}

Comment 8 JBoss JIRA Server 2012-12-05 19:06:05 UTC
Alessio Soldano <asoldano@redhat.com> updated the status of jira JBWS-3569 to Open

Comment 9 JBoss JIRA Server 2012-12-10 10:45:35 UTC
Alessio Soldano <asoldano@redhat.com> updated the status of jira JBWS-3569 to Resolved

Comment 11 Boleslaw Dawidowicz 2012-12-17 09:26:57 UTC
I don't think we can do anything on our side: a patch needs to be provided, and providing a proper EAP patch is beyond the JPP Dev team.

I'm reassigning to Honza as there is nothing more that Chris can help with.

Honza, let us know if you need any more assistance from us. I'm not sure exactly who should be in charge of arranging the patch.

Comment 16 JBoss JIRA Server 2012-12-21 13:37:02 UTC
Chris Laprun <chris.laprun@jboss.com> updated the status of jira GTNWSRP-340 to Closed

Comment 17 JBoss JIRA Server 2012-12-21 13:37:02 UTC
Chris Laprun <chris.laprun@jboss.com> made a comment on jira GTNWSRP-340

This is addressed at the JBoss WS level, see linked issue.

Comment 19 Michal Vanco 2013-01-24 11:08:34 UTC
Hi Chris, 

I was just checking the same scenario already described here with JPP 6.0.0 ER05 (which also includes the patch JBPAPP-10497), and it seems the issue is still present.
Again I have 2 producer nodes + loadbalancer + another JPP instance as consumer.
I registered the consumer against the loadbalancer WSDL, added a remote portlet and failed over the active producer node. Unfortunately the portlet isn't displayed and it's not possible to refresh the consumer.

I'm getting:
java.net.ConnectException: ConnectException invoking http://perf15.mw.lab.eng.bos.redhat.com:8080/wsrp-producer/v2/ServiceDescriptionService: Connection refused
(perf15 was active producer node which was stopped).

There is a change at the loadbalancer for host:port/wsrp-producer/v2/MarkupService?wsdl: before, it showed a different node on each refresh; now it points to the active node only.
But... the content is:

<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" xmlns:bind="urn:oasis:names:tc:wsrp:v2:bind" xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/" xmlns:wsdl="http://schemas.xmlsoap.org/wsdl/" targetNamespace="urn:oasis:names:tc:wsrp:v2:wsdl">
<import location="http://perf13.mw.lab.eng.bos.redhat.com:8080/wsrp-producer/v2/MarkupService?wsdl=wsrp-2.0-bindings.wsdl" namespace="urn:oasis:names:tc:wsrp:v2:bind"></import>
<service name="WSRPService">
<port binding="bind:WSRP_v2_ServiceDescription_Binding_SOAP" name="WSRPServiceDescriptionService">
<soap:address location="http://perf11.mw.lab.eng.bos.redhat.com:8080/wsrp-producer/v2/ServiceDescriptionService"/>
</port>
<port binding="bind:WSRP_v2_PortletManagement_Binding_SOAP" name="WSRPPortletManagementService">
<soap:address location="http://perf11.mw.lab.eng.bos.redhat.com:8080/wsrp-producer/v2/PortletManagementService"/>
</port>
<port binding="bind:WSRP_v2_Markup_Binding_SOAP" name="WSRPMarkupService">
<soap:address location="http://perf11.mw.lab.eng.bos.redhat.com:8080/wsrp-producer/v2/MarkupService"/>
</port>
<port binding="bind:WSRP_v2_Registration_Binding_SOAP" name="WSRPRegistrationService">
<soap:address location="http://perf11.mw.lab.eng.bos.redhat.com:8080/wsrp-producer/v2/RegistrationService"/>
</port>
</service>
</definitions>

I expected perf13 (the loadbalancer) to be used in all WSDL links (that's what the patch was about), but it isn't.
This also means that the current doc text on this BZ doesn't reflect current portal behavior.

Thanks for any updates!!!

Comment 20 claprun@redhat.com 2013-01-24 17:27:11 UTC
It is possible to fix this issue by changing the configuration of JBoss WS in standalone.xml (and standalone-ha.xml, of course) by changing:

<wsdl-host>${jboss.bind.address:127.0.0.1}</wsdl-host>

to 

<wsdl-host>jbossws.undefined.host</wsdl-host>

as explained in https://docs.jboss.org/author/display/JBWS/Advanced+User+Guide#AdvancedUserGuide-Dynamicrewrite.

There doesn't seem to be any need to use <modify-wsdl-address>true</modify-wsdl-address>, though. Alessio said that using jbossws.undefined.host probably forces the dynamic rewrite and that he needs to update the docs.
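
To illustrate the idea of a dynamic rewrite, here is a conceptual sketch only. It is not the JBoss WS or Apache CXF implementation, and the class `SoapAddressRewrite`, the method `rewrite`, and the `lb.example.com` host are hypothetical. It assumes the rewrite amounts to replacing the scheme, host and port of a published port address with those of the URL the WSDL was requested through, while keeping the path.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical illustration of address rewriting; not the actual JBoss WS / CXF code.
class SoapAddressRewrite {

    /**
     * Returns publishedAddress with its scheme, host and port replaced by those
     * of requestUrl (the URL the client used to fetch the WSDL).
     */
    static String rewrite(String publishedAddress, String requestUrl) throws URISyntaxException {
        URI published = new URI(publishedAddress);
        URI request = new URI(requestUrl);
        // Keep the published path and query; take scheme/host/port from the request.
        return new URI(request.getScheme(), null, request.getHost(), request.getPort(),
                published.getPath(), published.getQuery(), null).toString();
    }

    public static void main(String[] args) throws URISyntaxException {
        String published = "http://jbossws.undefined.host:8080/wsrp-producer/v2/MarkupService";
        String request = "http://lb.example.com:8080/wsrp-producer/v2/MarkupService?wsdl";
        System.out.println(rewrite(published, request));
    }
}
```

Under this assumption, a consumer that fetches the WSDL through the loadbalancer only ever sees loadbalancer addresses, whichever node served the request.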

Comment 21 JBoss JIRA Server 2013-01-25 10:54:34 UTC
Alessio Soldano <asoldano@redhat.com> updated the status of jira JBWS-3569 to Closed

Comment 22 Thomas Heute 2013-01-28 09:36:39 UTC
Is it safe to put as default, or does it work only with a loadbalancer ?

Comment 23 claprun@redhat.com 2013-01-28 10:20:55 UTC
I think it's safe to use in non-clustered environment as well, yes. However, this should be confirmed by QA.

Comment 24 Michal Vanco 2013-01-28 16:16:36 UTC
Hi,
I just verified the clustering scenario with <wsdl-host>jbossws.undefined.host</wsdl-host> on the producer nodes and it worked perfectly. It is now possible to refresh the consumer or directly display the remote portlet after failover, which were the two main scenarios expected to work! The WSDL was properly rewritten and included the loadbalancer host instead of a specific producer node!

I also tried <wsdl-host>jbossws.undefined.host</wsdl-host> in standalone.xml for single-node testing of WSRP and it worked as expected, so it seems safe to use the above wsdl-host as the default.

This has to be done for the CR01 build (who is going to do it? Honza/Chris?).

Comment 25 claprun@redhat.com 2013-01-29 11:00:27 UTC
Note that applying the change to wsdl-host only seems to work *when* the JBoss WS patch has been applied. It seems that using a non-patched JBoss WS version results in WSRP not working at all in clustered or single-node mode since the port URLs use jbossws.undefined.host instead of localhost in that case and are not being rewritten.

Comment 26 Thomas Heute 2013-01-30 09:42:15 UTC
Needs WS patch + https://github.com/gatein-prod/gatein-portal/pull/7

Assigning to Honza and setting the status to "POST" (patch submitted, not included)

Comment 27 Michal Vanco 2013-02-15 13:56:46 UTC
Verified with CR01.1 and works as expected

