+++ This bug was initially created as a clone of Bug #678715 +++

In the start_instance() function in file pki/base/common/scripts/functions

A loop was recently added to wait for the instance to fully initialize by using netstat to check socket availability. There are a couple of minor problems which need fixing.

1) No check for previous status

The instance is started like this:

$PKI_INSTANCE_INITSCRIPT start
rv=$?

then the loop is entered. But if the initscript failed ($rv -ne 0) there is no point in looping for 30 seconds waiting for it to come up. The function should immediately return the failed error code.

2) If the loop is exhausted and no sockets were detected the function should return a failure error code.

--- Additional comment from jdennis on 2011-02-18 18:54:53 EST ---

Created attachment 479632 [details] better netstat loop behaviour
Created attachment 480025 [details] better netstat loop behaviour
IPA_v2_RHEL_6_1_ERRATA_BRANCH:

# cd pki
# svn status | grep -v ^$ | grep -v ^P | grep -v ^X | grep -v ^?
M base/common/scripts/functions
# svn commit
Sending base/common/scripts/functions
Transmitting file data .
Committed revision 1862.

Resolves #679174 - netstat loop fixes needed
Can you please add steps to verify this issue? thanks
You need to get the CA into a state where it can't initialize and can't come up, such that it reports an initialization failure. Previously the initscript would wait a very long time before it decided the instance was not going to come up; with the fix in place the initscript reports the failure immediately. Sorry, I can't help you with how to break the CA so badly that it won't initialize, but somewhere along the way we did have such a situation.
The tomcat6 initscript returns immediately after launching the JVM. As long as the JVM starts it reports success, but that does not mean the tomcat instance hosted by the JVM has fully initialized. To discover whether the instance had fully initialized and was ready to serve, we used the existence of a listening port as evidence. Therefore we added a loop which used netstat to check for the port. It looped, pausing one second between checks, until it saw the port. It did this for 30 seconds; if no port appeared after 30 seconds it reported failure.

But the tomcat6 initscript can immediately report failure in some circumstances. In that case there is no point in looping for 30 seconds waiting for a port you know for a fact will never appear, because the JVM launch failed. We should skip the loop in this case and report the failure immediately instead of waiting 30 seconds.

One way you might cause an immediate failure of the JVM launch is by removing the pkiuser from the system. The tomcat6 initscript is instructed to run the JVM as that user, so if the user does not exist it should fail immediately.
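For anyone reading along, the behaviour described above can be sketched roughly as follows. This is not the committed patch, only a minimal illustration under stated assumptions: the helper port_is_listening, the port and max_wait arguments, and the netstat/grep pattern are hypothetical; only start_instance, $PKI_INSTANCE_INITSCRIPT and the 30 second limit come from this report.

```shell
#!/bin/sh
# Sketch only: helper name, arguments and grep pattern are assumptions,
# not the code in pki/base/common/scripts/functions.

# Succeeds if netstat lists a TCP listener whose local address ends in :PORT.
port_is_listening() {
    netstat -ltn 2>/dev/null | grep -q ":$1[[:space:]]"
}

# start_instance PORT [MAX_WAIT_SECONDS]
start_instance() {
    port=$1
    max_wait=${2:-30}

    $PKI_INSTANCE_INITSCRIPT start
    rv=$?

    # Fix 1: the initscript already failed, so the port will never appear.
    # Return its error code immediately instead of looping.
    if [ "$rv" -ne 0 ]; then
        return "$rv"
    fi

    # Poll once per second until the port shows up or the time runs out.
    waited=0
    while [ "$waited" -lt "$max_wait" ]; do
        if port_is_listening "$port"; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done

    # Fix 2: loop exhausted without seeing the port, so report failure.
    return 1
}
```

The early return preserves the initscript's own exit code, so a caller can still tell a launch failure apart from a timeout (which this sketch reports as 1).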
After discussion with John, an easy way to verify this issue is to stop the ipa services, remove the pkiuser, and start the services again. Starting the CA should fail immediately, rather than keep trying for 30 seconds.

Starting the CA returned immediately:

Starting CA Service
Starting pki-ca: chown: invalid user: `pkiuser:pkiuser'
chown: invalid user: `pkiuser:pkiuser'
Error code 4 [FAILED]
Failed to start CA Service

version:
pki-ca-9.0.3-10.el6.noarch
ipa-server-2.0.0-21.el6.x86_64
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2011-0627.html