Bug 813934

Summary: User on forums reporting that Postgres gets into a bad state after a deploy
Product: OKD Reporter: Nam Duong <nduong>
Component: Pod Assignee: Ram Ranganathan <ramr>
Status: CLOSED CURRENTRELEASE QA Contact: libra bugs <libra-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 2.x CC: bmeng, dmcphers, jialiu, jofernan, mfisher, mpatel, rmillner, xtian
Target Milestone: --- Keywords: Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: cartridge-postgresql-8.4 > 0.8.1-1 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-05-14 17:22:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Nam Duong 2012-04-18 19:41:48 UTC
Description of problem:
See forum post: https://www.redhat.com/openshift/community/forums/openshift/postgresql-gets-stuck-in-the-database-system-is-shutting-down
User:  sontek
Apptype: python
DB: postgresql

Problem:  After a git push, app cannot connect to the database.  Getting "OperationalError: (OperationalError) FATAL: the database system is shutting down"

Reproducible:  Seen 2 times so far. Most of the time, git push doesn't exhibit this behavior.

Workaround: rhc-app force-stop -a {appName} resolves the problem.

Env:  action scripts not being used.  The user is not manually stopping the database.


NOTE:
Since this is difficult to reproduce (I tried to git push my app many times, as well as tried rhc app restart -a {appName} ), I reproduced this by manually ssh'ing into the gear and running:
$OPENSHIFT_DB_CTL_SCRIPT stop
This puts the db in that 'stopping' state.  After 5+ minutes I didn't have the patience to keep waiting to see whether it actually stops, so I took the same route as the user (force-stop).
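
Rather than watching the 'stopping' state by hand, the wait can be bounded with a small polling loop. This is only a sketch: the function name wait_for_postgres_exit and the 300-second default are assumptions for illustration, not part of the cartridge or of $OPENSHIFT_DB_CTL_SCRIPT.

```shell
# Poll until no postgres process owned by the current user remains,
# giving up after a timeout (in seconds). Hypothetical helper.
wait_for_postgres_exit() {
    local timeout="${1:-300}"
    local waited=0

    # pgrep -x matches the exact process name; -u limits to our own uid.
    while pgrep -x postgres -u "$(id -u)" > /dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "postgres still running after ${waited}s"
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done

    echo "postgres exited after ${waited}s"
}
```

Running `$OPENSHIFT_DB_CTL_SCRIPT stop` followed by `wait_for_postgres_exit 300` in the gear would then report whether the server went away within five minutes.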

Comment 1 Ram Ranganathan 2012-05-01 00:20:17 UTC
Can't reproduce this bug, but I put in a resiliency fix to do an immediate shutdown if a normal shutdown doesn't succeed. Fixed with git commit: e35247690ff2ec9b50f60144d9894a23441cdec3
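
The fallback described in this comment can be sketched roughly as follows. This is not the code from the commit: the function name stop_postgres, the use of "fast" mode for the first attempt, and the 30-second default timeout are all assumptions. Only the pgrep check and the pg_ctl "immediate" mode mirror what the fix is reported to do.

```shell
# Try a normal shutdown first; escalate to an immediate shutdown if the
# server is still running afterwards. Hypothetical helper.
stop_postgres() {
    local data_dir="$1"
    local timeout="${2:-30}"   # assumed default, in seconds

    # First attempt (assumption: "fast" mode). -w waits for the shutdown
    # to complete and -t bounds that wait.
    if pg_ctl stop -D "$data_dir" -m fast -w -t "$timeout"; then
        return 0
    fi

    # If a postgres process owned by this user is still around, abort it.
    # "immediate" skips the clean shutdown, so the server performs crash
    # recovery on its next start.
    if pgrep -x postgres -u "$(id -u)" > /dev/null 2>&1; then
        pg_ctl stop -D "$data_dir" -m immediate -w -t "$timeout"
    fi
}
```

A call such as `stop_postgres "$CART_INSTANCE_DIR/data" 30` would return the exit status of the immediate shutdown when escalation is needed, and 0 when the server is already gone.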

Comment 2 Xiaoli Tian 2012-05-02 10:42:15 UTC
We have not found a better way to reproduce this. While checking the code, I found a typo in the above commit: it should be pg_ctl -m immediate.


+  if `pgrep -x postgres -u $(id -u) > /dev/null 2>&1`; then
+     pg_ctl stop -D "$CART_INSTANCE_DIR/data" -m immedate -w >> $pglogfile 2>&1
                                                  ^^^^^^^^
+  fi

Comment 3 Ram Ranganathan 2012-05-02 17:28:07 UTC
Good catch - thanks. Fixed typo with git commit 9670ff3421c4ded7841b19869a18cb927bdebb2b

Comment 4 Johnny Liu 2012-05-03 06:21:51 UTC
Verified this bug with devenv_1752: PASS.

Typo is fixed.
Because this bug is difficult to reproduce, I did a git push and stopped the db manually (via ssh into the app) several times.
It always works fine.