Bug 900828 (JBEWS-92)
| Summary: | Tomcat with mod_cluster refuses to shut down |
|---|---|
| Product: | [JBoss] JBoss Enterprise Web Server 2 |
| Component: | unspecified |
| Version: | 2.0.0 |
| Status: | CLOSED NEXTRELEASE |
| Severity: | urgent |
| Priority: | urgent |
| Reporter: | Michal Karm Babacek <mbabacek> |
| Assignee: | Permaine Cheung <pcheung> |
| CC: | fgoldefu, jfclere, lfuka, lthon, mbabacek, pcheung, rhatlapa, weli |
| Target Release: | TBD EWS |
| Hardware: | Unspecified |
| OS: | Unspecified |
| URL: | http://jira.jboss.org/jira/browse/JBEWS-92 |
| Whiteboard: | tomcat |
| Doc Type: | Bug Fix |
| Environment: | Tomcat6 confirmed |
| Last Closed: | 2012-11-05 12:50:04 UTC |
| Type: | Bug |
|
Description
Michal Karm Babacek
2012-08-30 07:37:19 UTC
Link: Added: This issue relates to JBPAPP-9551
I can't reproduce it. Please retest.
I can't reproduce the issue.
I have a live box for you with a hanging Tomcat.
It might be just some silly error related to the way it was started/configured:
{noformat}
/root/workspace/jboss-ews-2.0/tomcat6/bin/startup.sh 870367228
{noformat}
(i) Note: The weird number is actually only a label with which I can track and possibly kill this particular instance (grepping ps aux...).
(i) Also note it is being started with root privileges.
Now I have:
{noformat}
root 2094 4.4 8.0 3431968 400412 pts/0 Sl 14:58 0:27 /root/jdk1.7.0_last//bin/java -d64 -Djava.util.logging.config.file=/root/workspace/jboss-ews-2.0/tomcat6/conf/logging.properties -Djava.library.path=/root/workspace/jboss-ews-2.0/tomcat6/lib -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.endorsed.dirs=/root/workspace/jboss-ews-2.0/tomcat6/endorsed -classpath /root/workspace/jboss-ews-2.0/tomcat6/bin/bootstrap.jar -Dcatalina.base=/root/workspace/jboss-ews-2.0/tomcat6 -Dcatalina.home=/root/workspace/jboss-ews-2.0/tomcat6 -Djava.io.tmpdir=/root/workspace/jboss-ews-2.0/tomcat6/temp org.apache.catalina.startup.Bootstrap 870367228 start
{noformat}
Attempts like {{/root/workspace/jboss-ews-2.0/tomcat6/bin/shutdown.sh}} are fruitless:
{noformat}
Using CATALINA_BASE: /root/workspace/jboss-ews-2.0/tomcat6
Using CATALINA_HOME: /root/workspace/jboss-ews-2.0/tomcat6
Using CATALINA_TMPDIR: /root/workspace/jboss-ews-2.0/tomcat6/temp
Using JRE_HOME: /root/jdk1.7.0_last/
Using CLASSPATH: /root/workspace/jboss-ews-2.0/tomcat6/bin/bootstrap.jar
Sep 19, 2012 3:09:37 PM org.apache.catalina.startup.Catalina stopServer
SEVERE: Catalina.stop:
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
at java.net.Socket.connect(Socket.java:579)
at java.net.Socket.connect(Socket.java:528)
at java.net.Socket.<init>(Socket.java:425)
at java.net.Socket.<init>(Socket.java:208)
at org.apache.catalina.startup.Catalina.stopServer(Catalina.java:422)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.catalina.startup.Bootstrap.stopServer(Bootstrap.java:338)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:416)
{noformat}
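The `Connection refused` above means nothing is accepting connections on the shutdown port, even though the `ps` listing shows the JVM is still alive. A minimal sketch for telling these states apart follows. It assumes bash (for `/dev/tcp`), the default shutdown port 8005, and a caller allowed to signal the process; the function names are hypothetical and not part of Tomcat.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative, not part of Tomcat.

# Returns 0 if something accepts TCP connections on host:port.
port_open() {
    local host=$1 port=$2
    # Bash-specific /dev/tcp redirection, run in a subshell so the
    # descriptor is closed again as soon as the subshell exits.
    (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# Returns 0 if the process exists (needs the same user or root).
pid_alive() {
    kill -0 "$1" 2>/dev/null
}

# Prints "stopped", "running" or "hung" for a Tomcat PID.
# The port defaults to 8005, which is an assumption about server.xml.
shutdown_state() {
    local pid=$1 port=${2:-8005}
    if ! pid_alive "$pid"; then
        echo "stopped"
    elif port_open localhost "$port"; then
        echo "running"
    else
        echo "hung"   # JVM alive, shutdown port no longer listening
    fi
}
```

For the instance above, `shutdown_state 2094` would be expected to print `hung`, since the process is listed by `ps` while the shutdown port refuses connections.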
I will give you access to that box via email.
Attachment: Added: full-thread-dump
Lowering to Major as the reproducibility is still questionable.
I have reproduced it with a tomcat6 installed in $HOME/tc6.0.x/output/build with the following script:
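+++
# Assumption: the bare path in the original was meant as a cd into the build directory.
cd "$HOME/tc6.0.x/output/build" || exit 1
while true
do
    bin/startup.sh
    # Poll until the examples webapp answers, then ask Tomcat to shut down.
    while true
    do
        # Only the response body is discarded; curl's verbose output stays on the terminal.
        curl -v --cookie JSESSIONID=data.jvm1 http://localhost:8080/examples/jsp/num/numguess.jsp 2>&1 >/dev/null
        if [ $? -eq 0 ]; then
            bin/shutdown.sh
            break
        fi
    done
    sleep 2
    # A Tomcat process that is still listed means the hang was reproduced: stop looping.
    ps -ef | grep tc6.0.x | grep -v grep
    if [ $? -eq 0 ]; then
        break
    fi
done
+++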
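The fixed `sleep 2` followed by a `ps` check can also be written as a bounded wait on the Tomcat PID. The following is only a sketch: the function name and the default timeout are hypothetical, and it assumes the caller is allowed to signal the process.

```shell
# Hypothetical helper: waits up to $2 seconds (default 10) for PID $1 to exit.
# Returns 0 if the process is gone, 1 if it is still alive after the timeout.
wait_for_exit() {
    local pid=$1 timeout=${2:-10} waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A call such as `wait_for_exit "$pid" 30 || echo "still running"` after `bin/shutdown.sh` would flag the hang without depending on a path match in the `ps` output.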
The whole problem is due to the fact that tomcat6 doesn't send:
lifecycleEvent: StandardServer[8005] before_destroy
lifecycleEvent: StandardServer[8005] after_destroy
I think a fix is to use:
lifecycleEvent: StandardServer[8005] after_stop
to destroy the listener doing the advertise.
Attachment: Added: mod_cluster-container-tomcat6.jar
Hmm, I have just tried the attached jar out with the following result:
- having tomcat open in terminal and hitting ^C shuts it down all right
- using {{./shutdown.sh}} still leads to the old
{noformat}
Using CATALINA_BASE: /root/workspace/jboss-ews-2.0/tomcat-6-19
Using CATALINA_HOME: /root/workspace/jboss-ews-2.0/tomcat-6-19
Using CATALINA_TMPDIR: /root/workspace/jboss-ews-2.0/tomcat-6-19/temp
Using JRE_HOME: /root/jdk1.7.0_last/
Using CLASSPATH: /root/workspace/jboss-ews-2.0/tomcat-6-19/bin/bootstrap.jar
Sep 20, 2012 10:21:24 AM org.apache.catalina.startup.Catalina stopServer
SEVERE: Catalina.stop:
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:391)
at java.net.Socket.connect(Socket.java:579)
at java.net.Socket.connect(Socket.java:528)
at java.net.Socket.<init>(Socket.java:425)
at java.net.Socket.<init>(Socket.java:208)
at org.apache.catalina.startup.Catalina.stopServer(Catalina.java:422)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.catalina.startup.Bootstrap.stopServer(Bootstrap.java:338)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:416)
{noformat}
I will come back to it later...
Patch for the problem.
Attachment: Added: patch.JBPAPP-9788
It's still a bug in ER11.
Sure, but could someone test the stuff with the patch, please?
I tested your patch on MS Windows 2008 R2 with all mod_cluster tests and it works! We also need to verify on RHEL and Solaris, but I don't expect problems. The patch is for Tomcat6; how about Tomcat7? Can you check the Tomcat7 sources for the same problem?
Tomcat6 and Tomcat7 sources are completely different. It works for Tomcat7, I have checked it. See the comment I made on the 20th.
OK, so we will see how our testing on RHEL and Solaris goes.
If I apply your patch, I am unable to even start Tomcat (I have tested it on RHEL 5 and 6, and it also happens on Solaris machines).
I am getting this exception in catalina.out (without the patch it starts correctly):
{noformat}
SEVERE: Begin event threw error
java.util.ServiceConfigurationError: No org.jboss.modcluster.container.catalina.LifecycleListenerFactory service provider found.
at org.jboss.modcluster.container.catalina.standalone.ModClusterListener$1.run(ModClusterListener.java:105)
at org.jboss.modcluster.container.catalina.standalone.ModClusterListener$1.run(ModClusterListener.java:99)
at java.security.AccessController.doPrivileged(Native Method)
at org.jboss.modcluster.container.catalina.standalone.ModClusterListener.loadFactory(ModClusterListener.java:108)
at org.jboss.modcluster.container.catalina.standalone.ModClusterListener.<init>(ModClusterListener.java:95)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at java.lang.Class.newInstance0(Class.java:355)
at java.lang.Class.newInstance(Class.java:308)
at org.apache.tomcat.util.digester.ObjectCreateRule.begin(ObjectCreateRule.java:206)
at org.apache.tomcat.util.digester.Rule.begin(Rule.java:153)
at org.apache.tomcat.util.digester.Digester.startElement(Digester.java:1356)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:501)
at com.sun.org.apache.xerces.internal.parsers.AbstractXMLDocumentParser.emptyElement(AbstractXMLDocumentParser.java:179)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:1343)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2756)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:648)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:511)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:808)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:737)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:119)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1205)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
at org.apache.tomcat.util.digester.Digester.parse(Digester.java:1642)
at org.apache.catalina.startup.Catalina.load(Catalina.java:524)
at org.apache.catalina.startup.Catalina.start(Catalina.java:582)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414)
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414)
Caused by: java.util.ServiceConfigurationError: No org.jboss.modcluster.container.catalina.LifecycleListenerFactory service provider found.
at org.jboss.modcluster.container.catalina.standalone.ModClusterListener$1.run(ModClusterListener.java:105)
at org.jboss.modcluster.container.catalina.standalone.ModClusterListener$1.run(ModClusterListener.java:99)
at java.security.AccessController.doPrivileged(Native Method)
at org.jboss.modcluster.container.catalina.standalone.ModClusterListener.loadFactory(ModClusterListener.java:108)
at org.jboss.modcluster.container.catalina.standalone.ModClusterListener.<init>(ModClusterListener.java:95)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
{noformat}
You must be doing something wrong, as the previous comment said it is working on windoze (and for me on Fedora 17).
Please add the patch to 1.2.2.Final.
[~jfclere] Even after applying the patch there are still some running Tomcats remaining.
I need a thread dump or stack trace of one of those, or an "easy" way to reproduce.
Attaching stack traces of Tomcat with the patch applied, as obtained using {{jstack $PID}}:
1. before running {{shutdown.sh}}
2. after running {{shutdown.sh}}
I'm sending you login information for that particular machine via email, in case you are interested.
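When comparing two such dumps, the threads of interest are the non-daemon ones, because any of them can keep the JVM alive after the connectors have stopped. The sketch below filters a `jstack` dump for thread header lines that lack the `daemon` marker. The function name is hypothetical, and the filter assumes the usual HotSpot header layout in which `daemon` directly precedes `prio=`.

```shell
# Hypothetical helper: prints the header lines of threads in a jstack dump
# that are not marked as daemon. Thread headers start with a double quote.
non_daemon_threads() {
    grep '^"' "$1" | grep -v ' daemon prio='
}
```

For example, `non_daemon_threads after_shutdown.txt` lists the candidates in the second attachment. JVM-internal threads (GC workers, the VM thread) are also printed without the marker and can usually be ignored.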
Attachment: Added: before_shutdown.txt
Attachment: Added: after_shutdown.txt
The patch is not in /opt/workspace/jboss-ews-2.0/tomcat6/lib/mod_cluster-container-tomcat6.jar (the classes are from September 12). So it is not fixed in ER11 :-(
-That's weird, the {{/opt/workspace/jboss-ews-2.0/tomcat6/lib/mod_cluster-container-tomcat6.jar}} file is the same file as the {{mod_cluster-container-tomcat6.jar}} attachment in this issue (at least according to {{md5sum}} and {{sha1sum}})...-
Wrong, I was looking at a bad file. You're right, it isn't fixed in ER11.
It should be in CR2.
Mladen, I've built
mod_cluster-1.2.2-4.Final_redhat_2.ep6.el5
mod_cluster-1.2.2-4.Final_redhat_2.ep6.el6
Can you please update that for the other platforms? Thanks! (I suspect you don't need the commit id in git, but in case you do, it is the jb-eap-6-rhel-6 branch of mod_cluster, commit id abe36072f0595ab6bf5e05822135bcac12e770f6)
The problem with shutting down Tomcat remains in certain cases. If I deploy too many contexts (based on the mod_cluster listener property Maxcontext), the shutdown port is released without the whole Tomcat actually stopping. Because this is not so much a problem of shutting down as a problem of releasing the shutdown port before really shutting down the Tomcat server, I am creating a new JIRA for it: [https://issues.jboss.org/browse/JBPAPP-10147]
Looking at JBPAPP-10147, it is MODCLUSTER-325, which is a minor bug. The clean-up time is a parameter; if you want a fast clean-up you may have to adjust the corresponding parameter. Additionally, if you look at catalina.out you will see the clean-up timeout log messages in it.
Note that the patch was missing in CR1, so please make sure it is in CR2.
Fixed in CR2.
The problem remains on Solaris machines; it looks like the patch wasn't applied. Even after applying the patch from this JIRA the problem remains, but in CR1 it worked when this patch was applied.
That is a weird exception:
java.io.InterruptedIOException: operation interrupted
at java.net.PlainDatagramSocketImpl.receive0(Native Method)
at java.net.PlainDatagramSocketImpl.receive(PlainDatagramSocketImpl.java:145)
at java.net.DatagramSocket.receive(DatagramSocket.java:725)
at org.jboss.modcluster.advertise.impl.AdvertiseListenerImpl$AdvertiseListenerWorker.run(AdvertiseListenerImpl.java:354)
at java.lang.Thread.run(Thread.java:662)
I don't see why you don't get it with CR1 + patch.
Release Notes Docs Status: Added: Documented as Known Issue
Writer: Added: mhusnain
Release Notes Text: Added: Using one tomcat6 node with one http balancer (mod_cluster) and one hundred simple JSP applications that are deployed as one hundred different contexts. Each context is accessed by a client once and the response code is HTTP 200. When tomcat is subsequently shut down using the shutdown.sh script, the connectors are stopped but the tomcat process does not shut down for an indefinite period of time. Workaround: ???
Documented as a Known Issue for 2.0. Is there a known workaround to add?
I have a fix that MUST go in CR3.
[~mhusnain] This issue shouldn't be in the release notes; this should be fixed in CR3.
Please the tag 1.2.3.Final.
mod_cluster-1.2.3-1.Final_redhat_1.ep6.el6
mod_cluster-native-1.2.3-2.Final.ep6.el6
mod_cluster-1.2.3-1.Final_redhat_1.ep6.el5
mod_cluster-native-1.2.3-2.Final.ep6.el5
are built.
Resolved in CR3.
Release Notes Docs Status: Removed: Documented as Known Issue
Writer: Removed: mhusnain
Release Notes Text: Removed: Using one tomcat6 node with one http balancer (mod_cluster) and one hundred simple JSP applications that are deployed as one hundred different contexts. Each context is accessed by a client once and the response code is HTTP 200. When tomcat is subsequently shut down using the shutdown.sh script, the connectors are stopped but the tomcat process does not shut down for an indefinite period of time. Workaround: ???
Docs QE Status: Removed: NEW |