Bug 504239

Summary: Hangs on service stop if NetworkManager has been stopped first
Product: [Fedora] Fedora
Reporter: Michael Cronenworth <mike>
Component: postgresql
Assignee: Tom Lane <tgl>
Status: CLOSED NOTABUG
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 10
CC: hhorak, tgl
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2009-06-08 14:59:18 UTC

Description Michael Cronenworth 2009-06-04 22:27:46 UTC
Description of problem: My postgresql service will hang on a "service postgresql stop" command if NetworkManager has been stopped ahead of time. This causes a shutdown to hang as the very first thing that happens is NetworkManager is stopped -- I really wish this behaviour was changed as it makes zero sense to stop your networking connections with other daemons still running.


Version-Release number of selected component (if applicable): postgresql-8.3.7-1.fc10.i386


How reproducible: Always


Steps to Reproduce:
1. Have a NetworkManager managed network connection.
2. service postgresql start
3. service NetworkManager stop
4. service postgresql stop
  
Actual results: postgresql service hangs on stop.


Expected results: postgresql service stops.


Additional info: If I break (CTRL+C) out of the command and start NetworkManager again, then postgresql will stop. This has been happening to me for a while, but I just got around to investigating the cause.

Comment 1 Tom Lane 2009-06-05 15:50:31 UTC
Hmm, I suspect the problem is this bit of the init script:

# Check that networking is up.
# Pretty much need it for postmaster.
[ "${NETWORKING}" = "no" ] && exit 1

There doesn't seem to be any good reason to enforce that for all initscript actions, only for "start".
Would you check whether things behave the way you want if that code is moved into the start() function, a few lines further down?  Personally I hate NetworkManager and don't use it on any of my machines, so I'm not in a good position to test this for myself.
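
For illustration, the suggested change might look roughly like the sketch below. This is a hypothetical skeleton, not the actual Fedora postgresql init script: the function bodies are placeholders, and reading NETWORKING from /etc/sysconfig/network is an assumption about how the variable gets set.

```shell
#!/bin/sh
# Hypothetical init script skeleton (not the real Fedora script).

# Assumption: NETWORKING=yes|no is defined in this file.
if [ -r /etc/sysconfig/network ]; then
    . /etc/sysconfig/network
fi

start() {
    # The networking check now applies only to "start".
    if [ "${NETWORKING}" = "no" ]; then
        echo "networking is disabled; not starting postmaster" >&2
        return 1
    fi
    echo "starting postmaster (placeholder)"
}

stop() {
    # No networking check here, so "stop" is never skipped
    # just because NETWORKING=no.
    echo "stopping postmaster (placeholder)"
}

case "${1:-}" in
    start) start ;;
    stop)  stop ;;
esac
```

With this layout, only the start action depends on NETWORKING; every other action runs regardless of its value.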

Comment 2 Michael Cronenworth 2009-06-05 20:02:38 UTC
That particular line is not the problem. Moving it into start() still hung the stop command.

I left the "service postgresql stop" command in its hung state, and ran "ps -efw" to see what it was hung on.

runuser -l postgres -c "/usr/bin/pg_ctl stop -D '/var/lib/pgsql/data' -s -m fast"

I then attempted to "su - postgres" and this hung. I forgot that my system is set up for LDAP user/group lookups... Changing my system to not be LDAP-bound fixed the hang. The hang was an attempt to connect to my LDAP server to look up the postgres user. Doh. I'm not sure how to work around this at the moment without giving it some thought.
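
A quick way to check whether an account lookup is what blocks is to run it under a time limit. The sketch below is only an illustration: lookup_user is a hypothetical helper, and it assumes the coreutils timeout command and glibc's getent are both installed.

```shell
# Hypothetical helper: resolve an account through NSS with a time limit.
# Returns 0 if the account resolves, 124 if the lookup timed out
# (the exit status timeout(1) uses), or another non-zero status if the
# account was not found.
lookup_user() {
    user=$1
    limit=${2:-5}    # seconds to wait before giving up
    timeout "$limit" getent passwd "$user" >/dev/null
}

# Example use:
#   lookup_user postgres 5; echo "exit status: $?"
```

If the lookup times out only while the network is down, the passwd line in /etc/nsswitch.conf shows which sources (for example ldap) are being consulted.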

Comment 3 Tom Lane 2009-06-05 22:27:30 UTC
Hmm ... there really isn't much that I can do about something like that from the Postgres end of things.  I'm thinking the real problem stems from your original comment about NetworkManager getting stopped first during shutdown.  AFAICS that shouldn't be true --- NetworkManager's shutdown should occur relatively late in the system shutdown cycle, certainly well after postgres is killed.  Can you confirm the contents of your /etc/rc5.d/ directory?  What I see is K36postgresql and K84NetworkManager, so postgres should shut down first ...

Comment 4 Michael Cronenworth 2009-06-08 14:59:18 UTC
On the trouble system I have K36postgresql and S99NetworkManager. This system was first installed with FC4 so it's had its share of upgrades.

I looked at a fresh F10 machine and it has S27NetworkManager... Seems that my trouble system never got the right update applied.

I looked in the NetworkManager scripts and saw /sbin/chkconfig NetworkManager resetpriorities in the postinstall list. I ran this manually and my trouble system now has S27NetworkManager as well. Rebooting with postgresql started and LDAP auth no longer results in a hang.
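
The ordering of the links can be checked with a small script along these lines. It is a sketch only: kill_order and stops_before are hypothetical helpers, and the directory to inspect (for example /etc/rc0.d or /etc/rc6.d for shutdown and reboot) is an assumption to adjust for the system at hand.

```shell
# Hypothetical helpers for inspecting SysV rc link ordering.

# Print the kill (K) links in a runlevel directory, lowest priority
# number first, which is the order in which they are run.
kill_order() {
    dir=$1
    ls "$dir" | grep '^K[0-9][0-9]' | sort
}

# Succeed if service $2 has a lower K number than service $3 in
# directory $1, i.e. it is stopped first. Fails if either link is missing.
stops_before() {
    dir=$1
    first=$2
    second=$3
    a=$(ls "$dir" | sed -n "s/^K\([0-9][0-9]\)$first\$/\1/p")
    b=$(ls "$dir" | sed -n "s/^K\([0-9][0-9]\)$second\$/\1/p")
    [ -n "$a" ] && [ -n "$b" ] && [ "$a" -lt "$b" ]
}

# Example use:
#   kill_order /etc/rc6.d
#   stops_before /etc/rc6.d postgresql NetworkManager && echo "order OK"
#   /sbin/chkconfig NetworkManager resetpriorities   # as run above
```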

Sorry for the noise.

Comment 5 Tom Lane 2009-06-08 15:07:34 UTC
OK, thanks for the followup.