+++ This bug was initially created as a clone of Bug #1347860 +++

Description of problem:
The systemd unit terminates the tomcat process prematurely, therefore not allowing it to shutdown gracefully. The direct effect of this is that sessions in the session manager are not persisted to disk and restored after a restart, resulting in a loss of session data (unless you are replicating data within a cluster).

Version-Release number of selected component (if applicable):
tomcat-8.0.32-5.fc23.noarch

How reproducible:
Always

Steps to Reproduce:
1. yum install tomcat
2. cp reproducer.war /usr/share/tomcat/webapps/
3. service tomcat start
4. curl http://localhost:8080/reproducer/getSession.jsp
5. service tomcat stop
6. ls /usr/share/tomcat/work/Catalina/localhost/reproducer/SESSIONS.ser

Actual results:
The session created by the curl request is not persisted into SESSIONS.ser.

Expected results:
The session created by the curl request is persisted to SESSIONS.ser

Additional info:
When the process successfully completes (with FINE logging on the StandardManager (org.apache.catalina.session.StandardManager.level = FINE)), you see the following along with other shutdown messages:

~~~
Jun 17, 2016 4:45:08 PM org.apache.catalina.session.StandardManager doUnload
FINE: Unloading persisted sessions
Jun 17, 2016 4:45:08 PM org.apache.catalina.session.StandardManager doUnload
FINE: Saving persisted sessions to SESSIONS.ser
Jun 17, 2016 4:45:08 PM org.apache.catalina.session.StandardManager doUnload
FINE: Unloading 4 sessions
Jun 17, 2016 4:45:08 PM org.apache.catalina.session.StandardManager doUnload
FINE: Expiring 4 persisted sessions
Jun 17, 2016 4:45:08 PM org.apache.catalina.session.StandardManager doUnload
FINE: Unloading complete
~~~

When you call the service stop, it terminates before the shutdown can complete (almost immediately after calling stop):

~~~
INFO: Server startup in 3501 ms
Jun 17, 2016 4:58:20 PM org.apache.catalina.core.StandardServer await
INFO: A valid shutdown command was received via the shutdown port. Stopping the Server instance.
Jun 17, 2016 4:58:20 PM org.apache.coyote.AbstractProtocol pause
INFO: Pausing ProtocolHandler ["http-bio-8080"]
~~~

--- Additional comment from Coty Sutherland on 2016-06-17 17:07:02 EDT ---

I'm able to get a graceful shutdown if I switch the unit type to Forking instead of simple, but then the start hangs :(
I don't know much about systemd service units, but I poked around at the service on my machine until I got something working. What I ended up with is here, please take a look and give me some feedback :) https://github.com/csutherl/fedora-tomcat/commit/cbb79ee
I guess another option for this would be KillMode=none, but Type=forking at least provides feedback when things fail and tells you to look at status instead of silently failing.
Forking is what we actually want to avoid. I'm not sure how forking can help in this situation. I'll try to debug the systemd behavior: whether it is sending TERM or a kill signal after ExecStop, and whether it is waiting for TimeoutStopSec or not. We definitely should have systemd being able to kill tomcat in the end, after some timeout.
> Forking is what we actually want to avoid.

For what reason? Simple just fires the script and doesn't care about a return. Forking is closer to what SysV was doing and actually provides feedback when the process exits abnormally. The only benefit I see to using simple is that it kills the process when it doesn't stop in time (forking may also do that, but I haven't tested).

> If it is waiting for TimeoutStopSec or not.

I've already done that. I think the problem is that the tomcat stop call forks off and does its thing, so it returns immediately, and systemd then SIGTERMs the process before it can complete a shutdown. I've tried TimeoutStopSec, TimeoutSec, and I even tried putting a sleep in the server script after the stop to allow it time to finish; none of that worked for me :( Hopefully you will have better luck because I'm stuck, but I'll keep poking around at it also.
OK. I've done some pretty extensive tests on this to see how all sorts of different systemd settings respond (nothing really did anything because the stop command returns immediately). I've already stated what I think the problem is above (c#4). I tested that theory a bit more by running the script outside of systemd (it worked as expected), along with a few other tests. My conclusion is that we need to fire off the stop process in the background and provide ample time for it to complete (one second is enough for a vanilla install with minimal deployments). Here is my proposal (I agree it's a bit hacky, but it's what I came up with and works): https://github.com/csutherl/fedora-tomcat/commit/89eb646 The change will `run stop` in the background and then sleep for two seconds by default (or SHUTDOWN_WAIT if that is defined). After that time passes the ExecStop call returns and systemd SIGTERMs the remaining processes if they haven't stopped already. I've verified that this allows graceful stopping and restarting and that the processes are terminated from a hanging shutdown call. Bonus: This also restores the functionality of SHUTDOWN_WAIT which has a TODO in the tomcat.conf comments and I kept the unit type as simple to satisfy your assertion that forking will not suffice. Thoughts?
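For readers who don't want to open the commit, the idea described above can be sketched roughly as follows. This is not the code from the linked commit: the function name `stop_in_background` and the stand-in `fake_stop` command are made up for illustration, and only the `SHUTDOWN_WAIT` variable and its two-second default come from the comment above.

```shell
# Hypothetical helper: run a stop command in the background, then hold the
# caller for SHUTDOWN_WAIT seconds (default 2) so the shutdown can finish
# before ExecStop returns and systemd signals whatever is left.
stop_in_background() {
    "$@" &                          # fire the stop command without blocking
    sleep "${SHUTDOWN_WAIT:-2}"     # give the shutdown time to complete
}

# Stand-in for the real stop call (an assumption for this sketch).
fake_stop() {
    echo "stop requested"
}

SHUTDOWN_WAIT=1
stop_in_background fake_stop
```

Note that the sleep always runs for the full `SHUTDOWN_WAIT`, whether or not the process has already exited.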
OK, this looks rather good. The only issue I see is that it will sleep for the full time (e.g. 30 seconds) even if tomcat has already stopped within the first few seconds. I prefer having such timeout feature implemented on systemd side. As an option I'll try to remove ExecStop command - maybe TERM signal will do the same as receiving SHUTDOWN word via shutdown socket (current stop is just sending it).
> I prefer having such timeout feature implemented on systemd side.

Me too :)

> As an option I'll try to remove ExecStop command - maybe TERM signal will do the same as receiving SHUTDOWN word via shutdown socket

If removing ExecStop and letting tomcat shut down via SIGTERM is an option, then I think that is the way to go. I've tested and validated that tomcat shuts down gracefully when it gets a SIGTERM (I enabled org.apache.level = FINE and compared messages in the log to a vanilla tomcat tarball bin/shutdown.sh call). After the TimeoutStopSec time passes, it receives a SIGKILL and all remaining processes immediately die.
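To try the same sequence outside of systemd, something like the sketch below may help. It only imitates the behavior described above (SIGTERM first, SIGKILL once a timeout has passed); it is not how systemd is implemented. The function name `term_then_kill` is hypothetical, and the 90-second fallback is borrowed from the TimeoutStopSec default mentioned in this bug.

```shell
# Hypothetical helper that mimics the stop sequence described above:
# SIGTERM first, SIGKILL only if the process outlives the timeout.
term_then_kill() {
    pid=$1
    timeout=${2:-90}                            # seconds to wait after SIGTERM
    kill -TERM "$pid" 2>/dev/null || return 0   # nothing to do, already gone
    waited=0
    while kill -0 "$pid" 2>/dev/null; do        # still running?
        if [ "$waited" -ge "$timeout" ]; then
            kill -KILL "$pid" 2>/dev/null       # give up on a graceful exit
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0                                    # exited after SIGTERM
}

# Demo against a throwaway process rather than a real tomcat.
sleep 300 &
term_then_kill "$!" 5 && echo "exited after SIGTERM"
```

Unlike a fixed sleep, the loop returns as soon as the process is gone, so a fast shutdown is not held up for the whole timeout.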
I also confirmed with the tomcat community that using SIGTERM to gracefully shutdown tomcat is fine (it's not functionally different than the Bootstrap.stop() call) so I think that's the way we should go, if you agree.
Yes, great work! Thanks!
Here is a commit if you'd like :) https://github.com/csutherl/fedora-tomcat/commit/20c470d I tried to find a way to implement the SHUTDOWN_WAIT functionality, but it looks like you can't use a variable in the TimeoutStopSec, so I guess we'll have to do without that. We could reduce the setting of TimeoutStopSec from the default 90 seconds down to thirty (or something) just so that there is an example of how to set it in the service script, if you think that is required. Otherwise, just remove ExecStop and we're good to go!
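For anyone who wants to try this on an installed system before the package update lands, the same effect could probably be had with a drop-in like the one below. This is a sketch and not the content of the linked commit: the drop-in path and file name are assumptions, and `SuccessExitStatus=143` is included on the assumption that a JVM stopped by SIGTERM exits with 143 (128 + 15) and that this should count as a clean stop.

```ini
# Hypothetical drop-in, e.g. /etc/systemd/system/tomcat.service.d/stop.conf
[Service]
# An empty assignment clears any ExecStop= inherited from the packaged unit,
# so stopping the service falls back to systemd sending SIGTERM.
ExecStop=
# Wait up to 30 seconds for a graceful exit before SIGKILL (the default is 90).
TimeoutStopSec=30
# Assumption: treat the JVM's exit code after SIGTERM as a successful stop.
SuccessExitStatus=143
```

After adding a drop-in, `systemctl daemon-reload` is needed for systemd to pick it up.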
https://pkgs.fedoraproject.org/cgit/rpms/tomcat.git/commit/?id=5d682aa
tomcat-8.0.36-2.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-0a4dccdd23
tomcat-8.0.36-2.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-2b0c16fd82
tomcat-8.0.36-2.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-f4a443888b
tomcat-8.0.36-2.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-f4a443888b
tomcat-8.0.36-2.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-0a4dccdd23
tomcat-8.0.36-2.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-2b0c16fd82
tomcat-8.0.36-2.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.
tomcat-8.0.36-2.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.
tomcat-8.0.36-2.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.