Bug 146745

Summary: Dead postgres can't be killed
Product: [Fedora] Fedora
Reporter: Zenon Panoussis <redhatbugs>
Component: postgresql
Assignee: Tom Lane <tgl>
Status: CLOSED NOTABUG
QA Contact: David Lawrence <dkl>
Severity: medium
Priority: medium
Version: 3
CC: hhorak
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2005-02-01 22:22:06 UTC

Description Zenon Panoussis 2005-02-01 06:12:34 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.3) Gecko/20041020

Description of problem:
The init script for postgres expects it to be running, and blocks
reboot if postgres never started in the first place.

How reproducible:
Always

Steps to Reproduce:
1. Make a mess of your filesystem so that you get the "Press ^D to
continue or enter root password for a shell" prompt.
2. Enter root password and get a shell. Type 'reboot<enter>'.
3. The reboot process stalls at "Stopping postgres". Stopping any
other service that never got started in the first place just produces
a FAILED, but postgres returns no exit code and the reboot process
stays stuck.
4. Press the reset button.

Comment 1 Tom Lane 2005-02-01 07:29:56 UTC
"Make a mess of your filesystem" isn't a reproducible step IMHO.  In any case, if you've 
managed to hose your filesystem, why would you expect Postgres or any other userland 
application to be able to deal with that?

The init script demonstrably does work properly when PG wasn't running, so I think you 
are blaming the messenger here ...

Comment 2 Zenon Panoussis 2005-02-01 15:13:15 UTC
Oh, it doesn't take hosing, just a reset that goes wrong. 
I just reproduced it like this:

1. Check that postgres is running and completely idle, i.e. 
   nothing has actually accessed a database for a long time:
   # ls /var/run/postma*
2. Make sure your system won't be able to boot next time:
   # echo "/dev/hdz1  /blah  ext3  defaults  1 2" >>/etc/fstab
3. Avoid damage:
   # sync
4. Press the reset button. At the next boot, give the root password 
   for a shell. Fix fstab and try to reboot:
   # reboot

You'll get stuck at "Stopping postgres" and the only way to get on
from there is to press the reset button again. 

I subsequently repeated the whole procedure, but now with 

4. # rm -f /var/run/postmaster*
   # reboot

This got stuck in the exact same way, so it's not a simple matter of a
rogue pidfile. 


Comment 3 Zenon Panoussis 2005-02-01 15:23:57 UTC
BTW, messing up fstab and rebooting isn't even a necessary step. The
thing is to get the system in a state where (a) postgres has died
abruptly and (b) K15postgresql gets run. You might be able to get
there with a simple 'killall -9 postmaster; reboot', but I didn't try
that.
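
For illustration, state (a) above (postmaster gone, pidfile possibly left
behind) can be detected with a check like the one below. This is a minimal
sketch, not code from the postgresql init script or from pg_ctl. The
function name pidfile_is_live is hypothetical, and it assumes the first
line of the pidfile holds the PID.

```shell
#!/bin/sh
# Hypothetical helper (not part of any PostgreSQL script).
# Returns 0 if the PID file names a process we can signal,
# 1 if the file is missing, malformed, or stale.
pidfile_is_live() {
    pidfile=$1
    [ -f "$pidfile" ] || return 1

    # Assumption: the PID is on the first line of the file.
    pid=$(head -n 1 "$pidfile")

    # Reject empty or non-numeric content.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac

    # Signal 0 delivers nothing; it only checks that the process exists
    # and that we may signal it. Run this as root or as the owning user,
    # otherwise a live process can be misreported as dead.
    kill -0 "$pid" 2>/dev/null
}
```

Usage would be along the lines of
`pidfile_is_live /path/to/pidfile || echo "stale or missing"`, with the
path supplied by the caller.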


Comment 4 Tom Lane 2005-02-01 16:34:00 UTC
How long did you wait exactly before deciding it was hung forever? 
There are some paths through the pg_ctl script that have sixty-second
timeouts ...
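
To illustrate why a bounded wait can look like a hang from the console,
here is a minimal sketch of a wait-with-timeout loop. It is not taken
from pg_ctl; the function name and the idea of polling for a pidfile to
disappear are assumptions made for the example, and the 60-second
default is chosen only to match the timeout mentioned above.

```shell
#!/bin/sh
# Hypothetical helper (not the pg_ctl implementation).
# Polls once per second until the given file disappears.
# Returns 0 once the file is gone, 1 if it still exists after the timeout.
wait_for_pidfile_removal() {
    pidfile=$1
    timeout=${2:-60}    # seconds; assumed default for illustration
    elapsed=0

    while [ -f "$pidfile" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1    # gave up; the file is still there
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}
```

If nothing ever removes the file, a caller of this function prints
nothing and appears frozen for the full timeout before it returns.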

Comment 5 Zenon Panoussis 2005-02-01 22:22:06 UTC
Not long enough. So I tried it again with 'killall -9 postmaster;
reboot'. It did indeed get stuck for about one minute, then returned "OK".
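
A slow stop can be told apart from a real hang by timing the command
instead of watching the console. The wrapper below is only a sketch; the
name timed_run is hypothetical, and it assumes a `date` that supports
`+%s` (seconds since the epoch), as GNU date does.

```shell
#!/bin/sh
# Hypothetical helper: run a command, then print its exit status and
# the wall-clock seconds it took. Returns the command's exit status.
timed_run() {
    start=$(date +%s)    # assumption: date supports +%s
    status=0
    "$@" || status=$?
    end=$(date +%s)
    echo "exit=$status elapsed=$((end - start))s"
    return "$status"
}
```

Wrapping whatever stop command the init script runs in `timed_run` would
show a delay of about a minute as a number of seconds together with the
final exit status.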