Bug 1494620 - Postgresql container goes to timeout error while deploying using CNS
Summary: Postgresql container goes to timeout error while deploying using CNS
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Software Collections
Classification: Red Hat
Component: rh-postgresql95-container
Version: rh-postgresql95
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.2
Assignee: Petr Kubat
QA Contact: qe-baseos-daemons
URL:
Whiteboard:
Depends On:
Blocks: 1724792
Reported: 2017-09-22 16:27 UTC by Thom Carlin
Modified: 2019-06-28 16:04 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-08-03 13:08:31 UTC
Target Upstream Version:



Description Thom Carlin 2017-09-22 16:27:13 UTC
Description of problem:

The PostgreSQL container times out during deployment for no obvious reason.


Version-Release number of selected component (if applicable):

Image:		registry.access.redhat.com/rhscl/postgresql-95-rhel7@sha256:fcf2069f146179e98bf6f204e9fb4453663580389986ed002e9f8b75d11173d9


How reproducible:

100% when:
* Using Container Native Storage (CNS)
* The PostgreSQL user already exists
* The PostgreSQL database does not

Steps to Reproduce:
1. Set up an OCP environment with CNS
2. Deploy postgresql using glusterfs (CNS)
3. Tear down postgresql
4. Redeploy postgresql

Actual results:



Expected results:


Additional info:

Some other symptoms:
oc get pods <postgres-deploy-podname>
--> Scaling <postgres-deploy-podname> to 1
--> Waiting up to 10m0s for pods in rc <postgres-rc-name> to become ready
error: update acceptor rejected <postgres-rc-name>: pods for rc "<postgres-rc-name>" took longer than 600 seconds to become ready

oc get rc <postgres-rc-name>
NAME      DESIRED   CURRENT   READY     AGE
<postgres-rc-name>    0         0         0         <age>

oc logs rc/<postgres-rc-name>
error: timed out waiting for the condition

Also seen:
createuser: creation of new role failed: ERROR:  role "gogs" already exists
FATAL:  the database system is starting up
LOG:  database system was not properly shut down; automatic recovery in progress
FATAL:  the database system is starting up
LOG:  invalid record length at 0/17077A0
LOG:  redo is not required
FATAL:  the database system is starting up
LOG:  MultiXact member wraparound protections are now enabled
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
LOG:  received fast shutdown request
LOG:  aborting any active transactions
LOG:  autovacuum launcher shutting down
LOG:  shutting down
LOG:  database system is shut down
LOG:  database system was shut down at 2017-09-21 14:38:36 UTC
LOG:  MultiXact member wraparound protections are now enabled
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
FATAL:  database "gogs" does not exist

============================================================================
If POSTGRESQL_USER exists but POSTGRESQL_DATABASE does not, the container script
/usr/share/container-scripts/postgresql/common.sh appears not to perform all the tasks in postinitdb_actions for simple_db.

In create_users(), createuser "$POSTGRESQL_USER" fails but 
createdb --owner="$POSTGRESQL_USER" "$POSTGRESQL_DATABASE"
doesn't run
because /usr/bin/run-postgresql has a "set -eu", which makes the script exit on the first error.
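
The exit-on-first-error behaviour described above can be illustrated without PostgreSQL. This is only a minimal sketch: fake_createuser and fake_createdb are hypothetical stand-ins, not part of the container scripts, and the child shell merely borrows the "set -eu" options mentioned for run-postgresql.

```shell
# Sketch only: fake_createuser and fake_createdb are hypothetical stand-ins
# for the real createuser/createdb calls made in create_users().
run_create_users() {
  sh -c '
    set -eu
    fake_createuser() {
      echo "createuser: creation of new role failed: ERROR:  role \"gogs\" already exists" >&2
      return 1
    }
    fake_createdb() {
      echo "createdb ran"
    }
    fake_createuser   # fails, so set -e ends the child shell here
    fake_createdb     # never reached
  '
}

# The "||" keeps the calling shell alive so the exit status can be reported.
run_create_users || echo "child shell exited with status $?"
```

In this sketch "createdb ran" is never printed and the child shell reports a non-zero status, which matches the symptom of a role that exists without its database.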

============================================================================
The workaround appears to be:
During the deploy phase, determine the pod being deployed, then:
oc rsh <postgres-deploy-pod>
sh-4.2$ createuser gogs # Is not necessary
createuser: creation of new role failed: ERROR:  role "gogs" already exists
sh-4.2$ createdb --owner="gogs" "gogs"
sh-4.2$ exit
exit

============================================================================
We have a consistent reproducer: deploying https://github.com/jbossdemocentral/coolstore-microservice using OCP and CNS.

Comment 2 Pavel Raiskup 2017-09-25 07:19:46 UTC
Thanks for the report, Thom.

> In create_users(), createuser "$POSTGRESQL_USER" fails but 
> createdb --owner="$POSTGRESQL_USER" "$POSTGRESQL_DATABASE"
> doesn't run

Why does this fail?  What's the error output?

Comment 3 Thom Carlin 2017-09-25 11:43:42 UTC
Errors:
createuser: creation of new role failed: ERROR:  role "gogs" already exists
followed by many, many:
FATAL:  database "gogs" does not exist

Comment 4 Pavel Raiskup 2017-09-25 14:13:13 UTC
Thom, how come the "gogs" role already exists?  Who created it?

Comment 5 Thom Carlin 2017-09-25 15:25:27 UTC
Pavel, I'm not sure, I'm running the "provision_demo.sh" script.  Will provide more information shortly after I handle a couple of other fires...

Comment 6 Thom Carlin 2017-09-25 17:23:44 UTC
Here's the latest postgresql log:
more postgresql-Mon.log 
LOG:  database system was shut down at 2017-09-22 19:46:33 UTC
LOG:  MultiXact member wraparound protections are now enabled
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
LOG:  received fast shutdown request
LOG:  aborting any active transactions
LOG:  autovacuum launcher shutting down
LOG:  shutting down
LOG:  database system is shut down
LOG:  database system was shut down at 2017-09-25 17:19:53 UTC
LOG:  MultiXact member wraparound protections are now enabled
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
FATAL:  database "gogs" does not exist
FATAL:  database "gogs" does not exist
FATAL:  database "gogs" does not exist
LOG:  incomplete startup packet
FATAL:  database "gogs" does not exist
LOG:  incomplete startup packet
FATAL:  database "gogs" does not exist
LOG:  incomplete startup packet
FATAL:  database "gogs" does not exist
LOG:  incomplete startup packet
FATAL:  database "gogs" does not exist
LOG:  incomplete startup packet
FATAL:  database "gogs" does not exist
LOG:  incomplete startup packet
FATAL:  database "gogs" does not exist
LOG:  incomplete startup packet
FATAL:  database "gogs" does not exist
LOG:  incomplete startup packet
FATAL:  database "gogs" does not exist
LOG:  incomplete startup packet
FATAL:  database "gogs" does not exist
LOG:  incomplete startup packet
FATAL:  database "gogs" does not exist

Comment 7 Pavel Raiskup 2017-09-26 04:46:14 UTC
Have a look at the code in /bin/run-postgresql:

| if [ ! -f "$PGDATA/postgresql.conf" ]; then
|   initialize_database
|   NEED_TO_CREATE_USERS=yes
| fi
|
| pg_ctl -w start -o "-h ''"
| if [ "${NEED_TO_CREATE_USERS:-}" == "yes" ]; then
|   create_users
| fi

The `create_users` function is run if and only if the `initialize_database`
function has been called, which happens only if "$PGDATA/postgresql.conf"
does not exist.

The question is not why the re-deployment failed, but why the first deployment
failed to initialize the DB together with "$PGDATA/postgresql.conf".
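
The control flow quoted above can be tried out against a scratch directory. This is a rough simulation under stated assumptions, not the real script: initialize_database and create_users are stubs, the pg_ctl step is left out, simulate_run is a hypothetical wrapper name, and a temporary directory stands in for the persistent volume.

```shell
# Simulation only: both functions are stubs and no database is touched.
initialize_database() {
  touch "$PGDATA/postgresql.conf"   # the real function creates the whole cluster
  echo "initialize_database"
}

create_users() {
  echo "create_users"
}

# Hypothetical wrapper that mirrors the quoted if/fi blocks (pg_ctl omitted).
simulate_run() {
  NEED_TO_CREATE_USERS=""
  if [ ! -f "$PGDATA/postgresql.conf" ]; then
    initialize_database
    NEED_TO_CREATE_USERS=yes
  fi
  if [ "${NEED_TO_CREATE_USERS:-}" = "yes" ]; then
    create_users
  fi
}

PGDATA=$(mktemp -d)   # stands in for the persistent volume
echo "first start:"
simulate_run
echo "second start:"
simulate_run          # postgresql.conf is still there, so both steps are skipped
```

On the second call neither stub runs, so any user or database that was not created during the very first start is never created by a later one.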

