Bug 165824

Summary: Postgresql service reported as failed during boot
Product: [Fedora] Fedora
Reporter: 260795 <d.sbragion>
Component: postgresql
Assignee: Tom Lane <tgl>
Status: CLOSED DUPLICATE
QA Contact: David Lawrence <dkl>
Severity: low
Docs Contact:
Priority: medium
Version: 4
CC: hhorak
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2005-09-30 18:53:48 UTC
Type: ---

Description 260795 2005-08-12 16:12:59 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; it-IT; rv:1.7.5) Gecko/20041110 Firefox/1.0

Description of problem:
Sometimes on boot the postgresql service is reported as failed even though it has started properly. This happens only when the startup of earlier services puts the server, especially the hard disks, under load, so that postgresql startup becomes quite slow.

Version-Release number of selected component (if applicable):
postgresql-8.0.3-1

How reproducible:
Sometimes

Steps to Reproduce:
Not easy. You need a series of heavy services starting before postgresql.

Actual Results: Service reported as failed, even though it started correctly.

Expected Results:  Service started and reported as ok.

Additional info:

This is not the same as the similar bugs already present in the bug database, even though it looks similar. I've checked /etc/init.d/postgresql and found that the cause of the problem is around these lines:

$SU -l postgres -c "$PGENGINE/postmaster -p '$PGPORT' -D '$PGDATA' ${PGOPTS} &" >> "$PGLOG" 2>&1 < /dev/null
sleep 2
pid=`pidof -s "$PGENGINE/postmaster"`

The fixed sleep isn't enough if the server gets loaded. So I added the following code snippet just after the lines above (original indentation lost, sorry):

while [ "$pid" == "" -a $try -le 10 ]; do
 sleep 2
 pid=`pidof -s "$PGENGINE/postmaster"`
done

This forces the script to wait up to another 20 seconds, checking every 2 seconds, if the first try fails. Checking every 2 seconds avoids waiting longer than necessary, and thus slowing down the boot process, when the postgresql server starts quickly. Of course it's still just a hack, and busy waiting isn't a good solution, but it's better than an unnecessarily long fixed wait.

Hope it helps.

Comment 1 Tom Lane 2005-08-12 18:42:34 UTC
That script looks a bit shy of a load ... don't you need some more code to
maintain the $try variable?  But I follow your idea, and it seems reasonable.
I'll throw something like this into the next update.

Comment 2 260795 2005-08-13 15:48:13 UTC
Oops, sorry, clearly I forgot to copy the line with the $try variable increment.
I was wondering if there was a better solution, for example also checking
the exit status of the postgresql background process to further speed up the
check on failure, but I don't know if that works with processes executed by su,
and unfortunately I had no time to check.
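
For reference, a minimal sketch of the retry loop with the $try initialisation and increment restored. The function names find_pid and wait_for_postmaster and the tries/interval arguments are illustrative assumptions, not code from the shipped init script; the pidof call is the one quoted in the description.

```shell
# Sketch only: poll for the postmaster pid instead of relying on one
# fixed sleep. find_pid and wait_for_postmaster are hypothetical names.

# Print the postmaster pid, or nothing if it is not running yet.
find_pid() {
    pidof -s "$PGENGINE/postmaster"
}

# Usage: wait_for_postmaster [MAX_TRIES] [INTERVAL_SECONDS]
# Leaves the pid (possibly empty) in $pid and returns 0 only if one was found.
wait_for_postmaster() {
    max_tries=${1:-10}
    interval=${2:-2}
    try=1
    pid=$(find_pid)
    while [ -z "$pid" ] && [ "$try" -le "$max_tries" ]; do
        sleep "$interval"
        pid=$(find_pid)
        try=$((try + 1))    # the increment missing from the pasted snippet
    done
    [ -n "$pid" ]
}
```

With the defaults this waits at most 10 x 2 = 20 seconds beyond the first check, and it returns as soon as a pid shows up.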

Comment 3 Tom Lane 2005-09-30 18:53:48 UTC
Checking through my bug list, this is a duplicate of another report; since there
is more discussion in the other entry, I'm going to close this one as a
duplicate.  Please add to bug #166117 if you have any additional comments.

*** This bug has been marked as a duplicate of 166117 ***