Bug 560634 - the initscript does not wait for server startup, plus problem reporting status
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: postgresql
Version: 15
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Tom Lane
QA Contact: Fedora Extras Quality Assurance
Duplicates: 560954
Depends On: 696427
Reported: 2010-02-01 13:16 UTC by Karel Volný
Modified: 2012-08-07 20:24 UTC (History)

Doc Type: Bug Fix
Last Closed: 2012-08-07 20:24:37 UTC
Type: ---

Description Karel Volný 2010-02-01 13:16:12 UTC
Description of problem:
If for some reason (e.g. bug #548403) the startup takes a long time, the initscript finishes and reports success even though the server is not ready.
Querying the service status then also reports success, so there is no (easy) way to determine whether the service is ready to use.

Version-Release number of selected component (if applicable):


How reproducible:
always, in environments where the startup takes longer than about 2 seconds

Steps to Reproduce:
1. run the following script (warning, it deletes the database!)

#!/bin/bash

service postgresql stop
rm -rf /var/lib/pgsql
service postgresql initdb
sed -i -e 's/ident/trust/' /var/lib/pgsql/data/pg_hba.conf
service postgresql start
service postgresql status
echo $?
ls -l /tmp/.s*
su -c 'createdb CVE20093230' - postgres

  
Actual results:
Stopping postgresql service:                               [FAILED]
Initializing database:                                     [  OK  ]
Starting postgresql service:                               [  OK  ]
postmaster (pid 20950) is running...
0
ls: /tmp/.s*: No such file or directory
createdb: could not connect to database postgres: could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket "/tmp/.s.PGSQL.5432"?


Expected results:
Stopping postgresql service:                               [FAILED]
Initializing database:                                     [  OK  ]
Starting postgresql service:                               [  OK  ]
postmaster (pid 21008 21007 21006 21005 21003 20950) is running...
0
srwxrwxrwx 1 postgres postgres  0 Feb  1 08:10 /tmp/.s.PGSQL.5432
-rw------- 1 postgres postgres 26 Feb  1 08:10 /tmp/.s.PGSQL.5432.lock


(of course, if the server were not ready, 'service postgresql status' shouldn't report success, but in this example I assume that 'service postgresql start' returns only after the server is ready)
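
Until the initscript waits by itself, a caller could work around this by polling for the socket before connecting. This is only a minimal sketch: wait_for_socket is a hypothetical helper, the default path is the one from the output above, and the 30-second default timeout is an arbitrary assumption. The socket file existing is a better hint than the pid, but it is still not a hard guarantee that connections are accepted.

```shell
#!/bin/bash
# Hypothetical helper, not part of the postgresql package.
# Waits until a Unix-domain socket exists at the given path,
# or gives up after the given number of seconds.
wait_for_socket() {
    local sock=${1:-/tmp/.s.PGSQL.5432}   # default path taken from the report
    local timeout=${2:-30}                # assumed default, in seconds
    local waited=0

    # -S is true only if the path exists and is a socket.
    until [ -S "$sock" ]; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

In the reproduction script this would go between 'service postgresql start' and the createdb call, e.g. `wait_for_socket /tmp/.s.PGSQL.5432 60 || exit 1`.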

Additional info:
spin-off of bug #557749, see comment #5 on that bug (I haven't actually tested on Rawhide ...)

Comment 2 Tom Lane 2010-02-02 15:34:33 UTC
*** Bug 560954 has been marked as a duplicate of this bug. ***

Comment 3 Bug Zapper 2010-03-15 14:19:44 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 13 development cycle.
Changing version to '13'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 4 Bug Zapper 2011-06-02 16:42:36 UTC
This message is a reminder that Fedora 13 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 13.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '13'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 13's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 13 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 5 Karel Volný 2011-06-15 18:00:29 UTC
I cannot reproduce the problem any more with Fedora 15; however, I think it is still not completely fixed

the initscript code in question is:

        echo -n "$PSQL_START"
        test x"$PG_OOM_ADJ" != x && echo "$PG_OOM_ADJ" > /proc/self/oom_adj
        $SU -l postgres -c "$PGENGINE/postmaster -p '$PGPORT' -D '$PGDATA' ${PGOPTS} &" >> "$PGLOG" 2>&1 < /dev/null
        sleep 2
        pid=`head -n 1 "$PGDATA/postmaster.pid" 2>/dev/null`
        if [ "x$pid" != x ]
        then
                success "$PSQL_START"
                touch "$lockfile"
                echo $pid > "$pidfile"
                echo
        else
                failure "$PSQL_START"
                echo
                script_result=1
        fi

- so there is a 2-second delay to allow the process to come up, but if startup takes longer, the script may report failure even though postmaster does actually start

if there is some problem writing the pid file, it may leave a "dead" process behind - I think it'd be nice to wait conditionally, not for a fixed two seconds, and to try to force-kill the process if it is not ready after (some longer) timeout
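
The conditional wait could look roughly like the following. It is a sketch only, not the actual initscript code: wait_for_pidfile is a hypothetical name, the 60-second default is an assumption, and it leaves out the force-kill part. Note that a pid file alone does not prove the server accepts connections (see the original description).

```shell
#!/bin/bash
# Hypothetical replacement for the fixed "sleep 2": poll until the pid
# file has a non-empty first line, or give up after a timeout.
wait_for_pidfile() {
    local file=$1
    local timeout=${2:-60}   # assumed default, in seconds
    local waited=0
    local pid

    while [ "$waited" -lt "$timeout" ]; do
        # Same read as in the initscript: first line of postmaster.pid.
        pid=$(head -n 1 "$file" 2>/dev/null)
        if [ -n "$pid" ]; then
            echo "$pid"      # hand the pid back to the caller
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}
```

The initscript would then use something like `pid=$(wait_for_pidfile "$PGDATA/postmaster.pid" 60)` in place of the sleep and the head call, and branch on its exit status instead of on the empty string.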

Comment 6 Tom Lane 2011-06-15 18:23:52 UTC
Given that Fedora is moving to systemd, the level of interest in fixing the initscript's behavior is probably going to drop fast.  I am not sure what expectations systemd has for startup to wait or not wait for the daemon to be ready, but maybe it will provide a better environment for this.  In SysV-land we're kind of between a rock and a hard place, since neither waiting indefinitely nor reporting failure is particularly desirable (and killing a postmaster that may be about to come up is right out).

Comment 7 Karel Volný 2011-06-15 18:35:54 UTC
I thought systemd still uses those initscripts ... if this is going to be reworked then let's close this bz when it is ready

Comment 8 Tom Lane 2011-06-15 18:56:54 UTC
(In reply to comment #7)
> I thought systemd still uses those initscripts ...

Yeah, it does as of F-15, but see bug #696427.

Comment 9 Karel Volný 2011-06-16 12:12:01 UTC
(In reply to comment #8)
> Yeah, it does as of F-15, but see bug #696427.

cool, so once you're done with bug #696427 I'll re-test this - adding it to the dependencies to keep track of changes

Comment 10 Tom Lane 2011-07-28 22:26:07 UTC
This should be dealt with as of postgresql-9.0.4-8.fc16.

Comment 11 Fedora End Of Life 2012-08-07 20:24:40 UTC
This message is a notice that Fedora 15 is now at end of life. Fedora
has stopped maintaining and issuing updates for Fedora 15. It is
Fedora's policy to close all bug reports from releases that are no
longer maintained. At this time, all open bugs with a Fedora 'version'
of '15' have been closed as WONTFIX.

(Please note: Our normal process is to give advanced warning of this
occurring, but we forgot to do that. A thousand apologies.)

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, feel free to reopen
this bug and simply change the 'version' to a later Fedora version.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we were unable to fix it before Fedora 15 reached end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to click on
"Clone This Bug" (top right of this page) and open it against that
version of Fedora.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

The process we are following is described here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
