From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; it-IT; rv:1.7.5) Gecko/20041110 Firefox/1.0

Description of problem:
Sometimes on boot the postgresql service is reported as failed even though it has started properly. This happens only when the services started earlier put the server under load, especially the hard disks, so that the postgresql startup becomes quite slow.

Version-Release number of selected component (if applicable):
postgresql-8.0.3-1

How reproducible:
Sometimes

Steps to Reproduce:
Not easy. You need a series of heavy services starting before postgresql.

Actual Results:
Service is reported as failed, although it started correctly.

Expected Results:
Service started and reported as ok.

Additional info:
This is not the same as the similar bugs already present in the bug database, even though it looks similar. I checked /etc/init.d/postgresql and found that the cause of the problem is around these lines:

    $SU -l postgres -c "$PGENGINE/postmaster -p '$PGPORT' -D '$PGDATA' ${PGOPTS} &" >> "$PGLOG" 2>&1 < /dev/null
    sleep 2
    pid=`pidof -s "$PGENGINE/postmaster"`

The fixed sleep isn't long enough if the server is under load. So I added the following code snippet just after the lines above (original indentation lost, sorry):

    while [ "$pid" == "" -a $try -le 10 ]; do
    sleep 2
    pid=`pidof -s "$PGENGINE/postmaster"`
    done

If the first try fails, this makes the script wait up to another 20 seconds, checking every 2 seconds. The 2-second check avoids waiting longer than necessary, and thus slowing down the boot process, when the postgresql server starts quickly. Of course it's still just a hack, and polling isn't a good solution, but it's better than an unnecessarily long fixed wait. Hope it helps.
That script looks a bit shy of a load ... don't you need some more code to maintain the $try variable? But I follow your idea, and it seems reasonable. I'll throw something like this into the next update.
Oops, sorry, clearly I forgot to copy the line that increments the $try variable. I was wondering whether there is a better solution, for example also checking the exit status of the postgresql background process to speed up the check on failure, but I don't know if that works with processes executed by su, and unfortunately I had no time to check.
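For reference, here is a minimal sketch of the polling loop with the missing $try increment put back. It is not the code from the actual init script: the function name wait_for_pid and its arguments (maximum number of tries, interval in seconds, then the command that prints the pid) are hypothetical and were introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of /etc/init.d/postgresql):
# poll until the given command prints a pid, or until the tries run out.
#   $1    maximum number of tries (assumed parameter)
#   $2    seconds to sleep before each try (assumed parameter)
#   $3... command that prints the pid, e.g. pidof -s "$PGENGINE/postmaster"
wait_for_pid() {
    max_tries=$1
    interval=$2
    shift 2
    try=0
    pid=""
    while [ -z "$pid" ] && [ "$try" -lt "$max_tries" ]; do
        sleep "$interval"
        # An empty result means the process is not visible yet.
        pid=$("$@" 2>/dev/null) || pid=""
        # This is the increment that was missing from the snippet above.
        try=$((try + 1))
    done
    # Print the pid and return 0 on success; return non-zero on timeout.
    [ -n "$pid" ] && echo "$pid"
}
```

In the init script this might be called as pid=$(wait_for_pid 10 2 pidof -s "$PGENGINE/postmaster"), which waits at most about 20 seconds and returns as soon as the postmaster shows up. It does not cover the exit-status idea mentioned above.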
Checking through my bug list, this is a duplicate of another report; since there is more discussion in the other entry, I'm going to close this one as a duplicate. Please add to bug #166117 if you have any additional comments. *** This bug has been marked as a duplicate of 166117 ***