https://issues.jboss.org/browse/MODCLUSTER-339
Michal Babacek <mbabacek> made a comment on jira MODCLUSTER-339 As a comparison, here is a healthy debug log from a mod_cluster IPv6 test on RHEL [^error_log-mod_cluster-RHEL].
Jean-Frederic Clere <jfclere> made a comment on jira MODCLUSTER-339 %5B2620%3A52%3A0%3A105f%3A0%3A0%3Affff%3A50%252%5D decodes to [2620:52:0:105f:0:0:ffff:50%2], which is not a valid address. What is configured on the AS7 side?
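To see where the stray %2 comes from, one can percent-decode the Host value by hand; the key point is that %25 decodes to a literal %, so the doubly-encoded %252 becomes %2, a zone id inside the bracketed literal. A minimal decoder sketch in C (illustration only, not mod_cluster code):
{code}
#include <stdio.h>
#include <string.h>

static int hexval(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode %XX escapes in place of a URL-encoded string. */
static void url_decode(const char *in, char *out) {
    while (*in) {
        if (*in == '%' && hexval(in[1]) >= 0 && hexval(in[2]) >= 0) {
            *out++ = (char)(hexval(in[1]) * 16 + hexval(in[2]));
            in += 3;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
}

int main(void) {
    char out[128];
    url_decode("%5B2620%3A52%3A0%3A105f%3A0%3A0%3Affff%3A50%252%5D", out);
    printf("%s\n", out); /* prints [2620:52:0:105f:0:0:ffff:50%2] */
    return 0;
}
{code}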
Michal Babacek <mbabacek> made a comment on jira MODCLUSTER-339 {code}
<interfaces>
    <interface name="management">
        <inet-address value="2620:52:0:105f::ffff:50"/>
    </interface>
    <interface name="public">
        <inet-address value="2620:52:0:105f::ffff:50"/>
    </interface>
    <interface name="unsecure">
        <inet-address value="${jboss.bind.address.unsecure:127.0.0.1}"/>
    </interface>
</interfaces>
<socket-binding-group name="standard-sockets" default-interface="public" port-offset="${jboss.socket.binding.port-offset:0}">
    <socket-binding name="management-native" interface="management" port="${jboss.management.native.port:9999}"/>
    <socket-binding name="management-http" interface="management" port="${jboss.management.http.port:9990}"/>
    <socket-binding name="management-https" interface="management" port="${jboss.management.https.port:9443}"/>
    <socket-binding name="ajp" port="8009"/>
    <socket-binding name="http" port="8080"/>
    <socket-binding name="https" port="8443"/>
    <socket-binding name="jgroups-mping" port="0" multicast-address="ff01::3" multicast-port="45700"/>
    <socket-binding name="jgroups-tcp" port="7600"/>
    <socket-binding name="jgroups-tcp-fd" port="57600"/>
    <socket-binding name="jgroups-udp" port="55200" multicast-address="ff01::3" multicast-port="45688"/>
    <socket-binding name="jgroups-udp-fd" port="54200"/>
    <socket-binding name="modcluster" port="0" multicast-address="ff01::7" multicast-port="23964"/>
    <socket-binding name="remoting" port="4447"/>
    <socket-binding name="txn-recovery-environment" port="4712"/>
    <socket-binding name="txn-status-manager" port="4713"/>
    <outbound-socket-binding name="mail-smtp">
        <remote-destination host="localhost" port="25"/>
    </outbound-socket-binding>
</socket-binding-group>
{code}
Jean-Frederic Clere <jfclere> made a comment on jira MODCLUSTER-339 OK, it seems EAP/AS adds the %2, which causes a problem in the URL on Solaris. That needs to be fixed.
Jean-Frederic Clere <jfclere> made a comment on jira MODCLUSTER-339 It looks like apr behaves differently on Solaris and Linux:
{code}
rv = apr_sockaddr_info_get(&sa, "2001:db8:0:f101::1%2", APR_UNSPEC, 80, 0, p);
{code}
works on Linux but not on Solaris. It seems Solaris doesn't like the %.
Michal Babacek <mbabacek> made a comment on jira MODCLUSTER-339
h3. Thinking aloud
I do not understand why we should put the zone there at all. What should httpd, as a server, do with it? I tried to look up some httpd tests with IPv6, and I found only this one, not using a zone id: [httpd-2.2.23/srclib/apr/test/testsock.c:314|https://gist.github.com/Karm/5642351#file-testsock-c-L314] Furthermore, I examined the functions in {{httpd-2.2.23/srclib/apr/network_io/unix/sockaddr.c}} leading to {{getaddrinfo(hostname, servname, &hints, &ai_list);}}. The Solaris POSIX mambo-jambo reveals a nice doc for [getaddrinfo()|http://docs.oracle.com/cd/E23823_01/html/816-5170/getaddrinfo-3socket.html#scrolltoc]:
{quote}
The {{nodename}} can also be an IPv6 zone-id in the form:
{code}
<address>%<zone-id>
{code}
The address is the literal IPv6 link-local address or host name of the destination. The zone-id is the interface ID of the IPv6 link used to send the packet. The zone-id can either be a numeric value, indicating a literal zone value, or an interface name such as hme0.
{quote}
OK, so we should be able to put %num there; still, why should httpd be interested in a worker's interface zone id? It is not going to be binding to it... I guess there is even room for a nasty error where, given that the zone id has priority over the actual address, httpd tries to use a specific interface just because it was given an unnecessary zone id... Dunno :-(
h3. Toss % out
How about stripping the %num from the CONFIG message on the native side? As I stated above, it's IMHO useless there anyhow.
{code:title=RHEL with zone %666|borderStyle=solid|borderColor=#ccc|titleBGColor=#F7D6C1}
[Fri May 24 06:44:25 2013] [debug] mod_proxy_cluster.c(655): add_balancer_node: Create balancer balancer://qacluster
[Fri May 24 06:44:25 2013] [debug] mod_proxy_cluster.c(426): Created: worker for ajp://[2620:52:0:102f:221:5eff:fe96:8180%666]:8009
[Fri May 24 06:44:25 2013] [debug] mod_proxy_cluster.c(549): proxy: initialized single connection worker 1 in child 10070 for (2620:52:0:102f:221:5eff:fe96:8180%666)
[Fri May 24 06:44:25 2013] [debug] mod_proxy_cluster.c(601): Created: worker for ajp://[2620:52:0:102f:221:5eff:fe96:8180%666]:8009 1 (status): 129
[Fri May 24 06:44:25 2013] [debug] mod_proxy_cluster.c(1025): update_workers_node done
[Fri May 24 06:44:25 2013] [debug] mod_proxy_cluster.c(1010): update_workers_node starting
[Fri May 24 06:44:25 2013] [debug] mod_proxy_cluster.c(1025): update_workers_node done
{code}
OK, RHEL can handle it, SOLARIS can't. On the other hand:
{code:title=RHEL without any zone in the message|borderStyle=solid|borderColor=#ccc|titleBGColor=#F7D6C1}
[Fri May 24 06:37:47 2013] [debug] mod_proxy_cluster.c(426): Created: worker for ajp://[2620:52:0:102f:221:5eff:fe96:8180]:8009
[Fri May 24 06:37:47 2013] [debug] mod_proxy_cluster.c(549): proxy: initialized single connection worker 1 in child 9967 for (2620:52:0:102f:221:5eff:fe96:8180)
[Fri May 24 06:37:47 2013] [debug] mod_proxy_cluster.c(601): Created: worker for ajp://[2620:52:0:102f:221:5eff:fe96:8180]:8009 1 (status): 129
{code}
Omitting the zone from the CONFIG message seems to be doing no harm.
Solaris up and running :-)
{code:title=SOLARIS without any zone in the message|borderStyle=solid|borderColor=#ccc|titleBGColor=#F7D6C1}
[Fri May 24 08:25:15 2013] [debug] mod_manager.c(1923): manager_trans CONFIG (/)
[Fri May 24 08:25:15 2013] [debug] mod_manager.c(2598): manager_handler CONFIG (/) processing: "JVMRoute=FakeNode&Host=%5B2620%3A52%3A0%3A105f%3A%3Affff%3A60%5D&Maxattempts=1&Port=8009&Type=ajp&ping=100\r\n"
[Fri May 24 08:25:15 2013] [debug] mod_manager.c(2647): manager_handler CONFIG OK
[Fri May 24 08:25:15 2013] [debug] mod_proxy_cluster.c(1010): update_workers_node starting
[Fri May 24 08:25:15 2013] [debug] mod_proxy_cluster.c(655): add_balancer_node: Create balancer balancer://qacluster
[Fri May 24 08:25:15 2013] [debug] mod_proxy_cluster.c(1010): update_workers_node starting
[Fri May 24 08:25:15 2013] [debug] mod_proxy_cluster.c(655): add_balancer_node: Create balancer balancer://qacluster
[Fri May 24 08:25:15 2013] [debug] mod_proxy_cluster.c(426): Created: worker for ajp://[2620:52:0:105f::ffff:60]:8009
[Fri May 24 08:25:15 2013] [debug] mod_proxy_cluster.c(532): proxy: initialized worker 1 in child 19207 for (2620:52:0:105f::ffff:60) min=0 max=25 smax=25
[Fri May 24 08:25:15 2013] [debug] mod_proxy_cluster.c(601): Created: worker for ajp://[2620:52:0:105f::ffff:60]:8009 1 (status): 1
[Fri May 24 08:25:15 2013] [debug] mod_proxy_cluster.c(1025): update_workers_node done
[Fri May 24 08:25:15 2013] [debug] proxy_util.c(2011): proxy: ajp: has acquired connection for (2620:52:0:105f::ffff:60)
[Fri May 24 08:25:15 2013] [debug] proxy_util.c(2067): proxy: connecting ajp://[2620:52:0:105f::ffff:60]:8009/ to 2620:52:0:105f::ffff:60:8009
[Fri May 24 08:25:15 2013] [debug] mod_proxy_cluster.c(426): Created: worker for ajp://[2620:52:0:105f::ffff:60]:8009
[Fri May 24 08:25:15 2013] [debug] mod_proxy_cluster.c(532): proxy: initialized worker 1 in child 19208 for (2620:52:0:105f::ffff:60) min=0 max=25 smax=25
[Fri May 24 08:25:15 2013] [debug] mod_proxy_cluster.c(601): Created: worker for ajp://[2620:52:0:105f::ffff:60]:8009 1 (status): 1
[Fri May 24 08:25:15 2013] [debug] mod_proxy_cluster.c(1025): update_workers_node done
[Fri May 24 08:25:15 2013] [debug] mod_proxy_cluster.c(1010): update_workers_node starting
[Fri May 24 08:25:15 2013] [debug] mod_proxy_cluster.c(1025): update_workers_node done
[Fri May 24 08:25:15 2013] [debug] proxy_util.c(2193): proxy: connected / to 2620:52:0:105f::ffff:60:8009
[Fri May 24 08:25:15 2013] [debug] proxy_util.c(2444): proxy: ajp: fam 26 socket created to connect to 2620:52:0:105f::ffff:60
{code}
Without *%something* in the Host attribute of the CONFIG message, there is no nasty *DNS lookup failure* and everything seems to be cool (not yet thoroughly tested, though). The aforementioned log was produced with this fake message:
{code}
{ echo "CONFIG / HTTP/1.0"; echo "Content-length: 108"; echo ""; echo "JVMRoute=FakeNode&Host=%5B2620%3A52%3A0%3A105f%3A%3Affff%3A60%5D&Maxattempts=1&Port=8009&Type=ajp&ping=100"; sleep 1; } | telnet 2620:52:0:105f::ffff:60 6666
{code}
What do you think about it?
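As a side note on the fake message above: the advertised Content-length of 108 is the 106-byte parameter string plus the trailing CRLF (the processing log shows the body arriving with "\r\n" included). A quick sanity check:
{code}
#include <stdio.h>
#include <string.h>

int main(void) {
    /* Body of the fake CONFIG message, without the trailing CRLF. */
    const char *body =
        "JVMRoute=FakeNode&Host=%5B2620%3A52%3A0%3A105f%3A%3Affff%3A60%5D"
        "&Maxattempts=1&Port=8009&Type=ajp&ping=100";
    /* 106 parameter bytes + 2 bytes for "\r\n" = 108. */
    printf("%zu\n", strlen(body) + 2); /* prints 108 */
    return 0;
}
{code}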
Michal Babacek <mbabacek> made a comment on jira MODCLUSTER-339 Regarding the idea of removing the zone id, how about this: [https://github.com/modcluster/mod_cluster/pull/20/] ?
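The pull request itself is not reproduced here, but the idea of dropping the zone id on the native side amounts to truncating the address at the first %. A minimal sketch (hypothetical strip_zone_id helper, not the actual pull/20 code):
{code}
#include <stdio.h>
#include <string.h>

/* Truncate an IPv6 literal at the first '%', removing any zone id,
 * e.g. "2620:52:0:102f:221:5eff:fe96:8180%666"
 *   -> "2620:52:0:102f:221:5eff:fe96:8180". Modifies addr in place. */
static void strip_zone_id(char *addr) {
    char *pct = strchr(addr, '%');
    if (pct)
        *pct = '\0';
}

int main(void) {
    char addr[] = "2620:52:0:102f:221:5eff:fe96:8180%666";
    strip_zone_id(addr);
    printf("%s\n", addr); /* prints 2620:52:0:102f:221:5eff:fe96:8180 */
    return 0;
}
{code}
An address without a zone id passes through unchanged, since strchr() finds no '%'.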
Not in EAP6.1 = not EWS 2.0.1 = unknown bug.
(In reply to Jean-frederic Clere from comment #8) > Not in EAP6.1 = not EWS 2.0.1 = unknown bug. ??? Um...the bug is present both in EAP6.1 and EWS 2.0.1 :-)
Yes, the bug is in both EWS 2.0.1 and EAP 6.1, as we share the same sources.
(In reply to Jean-frederic Clere from comment #10) > Yes the bug is both in EWS 2.0.1 and EAP 6.1 has we share the same sources. I am lost. Why did you send this comment then? Jean-frederic Clere 2013-05-28 11:41:30 EDT Not in EAP6.1 = not EWS 2.0.1 = unknown bug. Flags: devel_ack- ?
Not fixed in EAP6.1 = not fixed EWS 2.0.1 = known bug.
I set requires_doc_text to ?
It needs to be in the release notes.
Cause: Java returns IPv6 addresses with a zone, like "2001:db8:0:f101::1%2", and when returning a node address the modcluster subsystem sends the IPv6 address as it sees it in Java. Solaris apr_sockaddr_info_get() doesn't support that format and tries (and fails) to resolve the IP as a hostname. Consequence: httpd mod_cluster won't work with nodes on IPv6 addresses. Fix: Use an IPv4 address for nodes when httpd runs on Solaris. Result:
The workaround is to use address="hostname" in the connector in the web subsystem.
Michal Babacek <mbabacek> made a comment on jira MODCLUSTER-339 [~jfclere] I have been investigating further and you might find these notes useful:
h4. IPv6 works if we remove % and zone id
The "fix", or rather a workaround, in [/pull/20/|https://github.com/modcluster/mod_cluster/pull/20/] really made IPv6 work on Solaris 11 SPARC64. I tested with the attached [^mod_manager.so] (built from [/pull/20/|https://github.com/modcluster/mod_cluster/pull/20/] sources for sparc64, *apxs* from httpd-2.2.23). Here is the debug log from the successful test: [^error_log_pull20].
h4. Actual apr_sockaddr_info_get source code
I wondered what the actual difference between Solaris's and Fedora's {{apr_sockaddr_info_get}} is, but I am bewildered by all these macros. What I did was run the preprocessor, so that I could compare the actual C code that is to be compiled on Fedora and Solaris:
{noformat}
/tmp/native/httpd/httpd-2.2.23/srclib/apr
gcc -E -P -g -Wall -Wmissing-prototypes -Wstrict-prototypes -Wmissing-declarations -m64 -DSSL_EXPERIMENTAL -DSSL_ENGINE -DHAVE_CONFIG_H -DSOLARIS2=11 -D_POSIX_PTHREAD_SEMANTICS -D_REENTRANT -I./include -I/tmp/native/httpd/httpd-2.2.23/srclib/apr/include/arch/unix -I./include/arch/unix -I/tmp/native/httpd/httpd-2.2.23/srclib/apr/include/arch/unix -I/tmp/native/httpd/httpd-2.2.23/srclib/apr/include -o network_io/unix/sockaddr.lo -c network_io/unix/sockaddr.c
{noformat}
One may find the resulting files attached as [^sockaddr.lo_fedora18_x86_64] and [^sockaddr.lo_solaris11_sparc64].
I took a look at the differences in
* {{static apr_status_t find_addresses(apr_sockaddr_t **sa, const char *hostname, apr_int32_t family, apr_port_t port, apr_int32_t flags, apr_pool_t *p)}}
* {{call_resolver(apr_sockaddr_t **sa, const char *hostname, apr_int32_t family, apr_port_t port, apr_int32_t flags, apr_pool_t *p)}}
but it all boils down to the system's {{getaddrinfo(hostname, servname, &hints, &ai_list);}} which, as far as I was able to look up, [supports the %zoneid syntax|http://docs.oracle.com/cd/E23823_01/html/816-5170/getaddrinfo-3socket.html#scrolltoc]... So I can't really see how {{apr_sockaddr_info_get}} could fail us; there is not much code in it. Solaris 11 SPARC64:
{code}
apr_status_t apr_sockaddr_info_get(apr_sockaddr_t **sa, const char *hostname,
                                   apr_int32_t family, apr_port_t port,
                                   apr_int32_t flags, apr_pool_t *p)
{
    apr_int32_t masked;
    *sa = 0L;
    if ((masked = flags & (0x01 | 0x02))) {
        if (!hostname || family != 0 || masked == (0x01 | 0x02)) {
            return 22;
        }
    }
    return find_addresses(sa, hostname, family, port, flags, p);
}
{code}
the only difference from the Fedora build being on line 7, {{*sa = ((void *)0);}}. uh...
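For readability, the magic numbers in the preprocessed output can be mapped back to names. Assuming 0x01 and 0x02 are APR_IPV4_ADDR_OK and APR_IPV6_ADDR_OK from apr_network_io.h and 22 is EINVAL (assumptions about this build, not verified against its headers), the argument check at the top of the function amounts to this self-contained sketch:
{code}
#include <assert.h>
#include <stddef.h>

/* Assumed mappings for the preprocessed constants. */
#define APR_IPV4_ADDR_OK 0x01
#define APR_IPV6_ADDR_OK 0x02
#define ERR_EINVAL       22

/* Mimics the check at the top of apr_sockaddr_info_get; returns 0
 * when the call would fall through to find_addresses(). */
static int check_flags(const char *hostname, int family, int flags) {
    int masked = flags & (APR_IPV4_ADDR_OK | APR_IPV6_ADDR_OK);
    if (masked) {
        /* hostname required, family must be APR_UNSPEC (0),
         * and the two *_ADDR_OK flags are mutually exclusive. */
        if (!hostname || family != 0 ||
            masked == (APR_IPV4_ADDR_OK | APR_IPV6_ADDR_OK))
            return ERR_EINVAL;
    }
    return 0;
}

int main(void) {
    assert(check_flags("2001:db8::1", 0, APR_IPV6_ADDR_OK) == 0);
    assert(check_flags(NULL, 0, APR_IPV6_ADDR_OK) == ERR_EINVAL);
    assert(check_flags("h", 0, APR_IPV4_ADDR_OK | APR_IPV6_ADDR_OK) == ERR_EINVAL);
    return 0;
}
{code}
Note that nothing in this function touches the %zone at all, which supports the conclusion that any Solaris/Linux difference must be inside the platform getaddrinfo().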
Thank you, Jean-Frederic. Documenting this as a known issue for 2.0.1.
Jean-Frederic Clere <jfclere> updated the status of jira MODCLUSTER-339 to Resolved
Michal Babacek <mbabacek> updated the status of jira MODCLUSTER-339 to Closed
Michal Babacek <mbabacek> made a comment on jira MODCLUSTER-339 Verified with mod_cluster 1.2.6 :-)
Let's switch it to ON_QA...
...and let's flip it to VERIFIED :-) Solaris 10 sparc
[Wed Jun 11 06:14:34 2014] [debug] mod_manager.c(2623): manager_handler CONFIG (/) processing: "JVMRoute=jboss-eap-6.3&Host=%5B2620%3A52%3A0%3A105f%3A0%3A0%3Affff%3Af8%252%5D&Maxattempts=1&Port=8009&StickySessionForce=No&Type=ajp&ping=10"
[Wed Jun 11 06:14:34 2014] [debug] mod_manager.c(2672): manager_handler CONFIG OK
EWS 2.1.0.ER2
Changed Doc Type to Bug Fix. Modified Doc Text: Updated EWS version to 2.1.0. Added sentence: This issue is fixed in EWS 2.1.0.