Bug 1276326
| Summary: | [openshift3/postgresql-92-rhel7] Postgresql pod is CrashLoopBackOff if using persistent storage | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | wewang <wewang> | |
| Component: | ImageStreams | Assignee: | Ben Parees <bparees> | |
| Status: | CLOSED ERRATA | QA Contact: | DeShuai Ma <dma> | |
| Severity: | low | Docs Contact: | ||
| Priority: | medium | |||
| Version: | 3.1.0 | CC: | aos-bugs, bleanhar, bparees, eparis, hhorak, jokerman, mmccomas, mnagy, praiskup, pruan, wewang, wzheng, xtian | |
| Target Milestone: | --- | |||
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | Bug Fix | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1281665 1281671 | Environment: | |
| Last Closed: | 2015-12-08 17:03:02 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1281665, 1281671, 1282733 | |||
|
Description
wewang
2015-10-29 13:10:21 UTC
Need more info to troubleshoot this. Are there failed docker containers that indicate why the pod is crashing? Anything in oc get events for the pod?
The problem can be reproduced intermittently. I have now tested on AEP and hit a similar problem.
version:
AEP 3.1 - FCC2 - RPM installation
openshift v3.0.2.905
kubernetes v1.2.0-alpha.1-1107-g4c8e6f4
etcd 2.1.2
[root@dhcp-128-91 today6]# oc process -f postgresql-replica-rhel7.json| oc create -f -
persistentvolumeclaim "postgresql-data-claim" created
service "postgresql-master" created
service "postgresql-slave" created
deploymentconfig "postgresql-master" created
deploymentconfig "postgresql-slave" created
[root@dhcp-128-91 today6]# oc get pods
NAME READY STATUS RESTARTS AGE
postgresql-master-1-1qu8i 1/1 Running 0 36s
postgresql-slave-1-0pp2f 1/1 Running 0 31s
postgresql-slave-1-deploy 1/1 Running 0 39s
[root@dhcp-128-91 today6]# oc get pods
NAME READY STATUS RESTARTS AGE
postgresql-master-1-1qu8i 1/1 Running 0 57s
postgresql-slave-1-0pp2f 1/1 Running 2 52s
[root@dhcp-128-91 today6]# oc get pods
NAME READY STATUS RESTARTS AGE
postgresql-master-1-1qu8i 1/1 Running 0 1m
postgresql-slave-1-0pp2f 0/1 CrashLoopBackOff 3 1m
[root@dhcp-128-91 today6]# oc get pods
NAME READY STATUS RESTARTS AGE
postgresql-master-1-1qu8i 1/1 Running 0 1m
postgresql-slave-1-0pp2f 1/1 Running 3 1m
[root@dhcp-128-91 today6]# oc get pods
NAME READY STATUS RESTARTS AGE
postgresql-master-1-1qu8i 1/1 Running 0 1m
postgresql-slave-1-0pp2f 1/1 Running 3 1m
[root@dhcp-128-91 today6]# oc get event
FIRSTSEEN LASTSEEN COUNT NAME KIND SUBOBJECT REASON SOURCE MESSAGE
2m 2m 1 postgresql-master-1-1qu8i Pod Scheduled {scheduler } Successfully assigned postgresql-master-1-1qu8i to openshift-145.lab.eng.nay.redhat.com
2m 2m 1 postgresql-master-1-1qu8i Pod implicitly required container POD Pulled {kubelet openshift-145.lab.eng.nay.redhat.com} Container image "registry.access.redhat.com/aep3_beta/aep-pod:v3.0.2.905" already present on machine
2m 2m 1 postgresql-master-1-1qu8i Pod implicitly required container POD Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id 7f8d3cc1c5ae
2m 2m 1 postgresql-master-1-1qu8i Pod implicitly required container POD Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id 7f8d3cc1c5ae
2m 2m 1 postgresql-master-1-1qu8i Pod spec.containers{postgresql-master} Pulled {kubelet openshift-145.lab.eng.nay.redhat.com} Container image "registry.access.redhat.com/rhscl/postgresql-94-rhel7" already present on machine
2m 2m 1 postgresql-master-1-1qu8i Pod spec.containers{postgresql-master} Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id 3f26ef152238
2m 2m 1 postgresql-master-1-1qu8i Pod spec.containers{postgresql-master} Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id 3f26ef152238
2m 2m 1 postgresql-master-1-deploy Pod Scheduled {scheduler } Successfully assigned postgresql-master-1-deploy to openshift-146.lab.eng.nay.redhat.com
2m 2m 1 postgresql-master-1-deploy Pod implicitly required container POD Pulled {kubelet openshift-146.lab.eng.nay.redhat.com} Container image "registry.access.redhat.com/aep3_beta/aep-pod:v3.0.2.905" already present on machine
2m 2m 1 postgresql-master-1-deploy Pod implicitly required container POD Created {kubelet openshift-146.lab.eng.nay.redhat.com} Created with docker id ef18b0585d61
2m 2m 1 postgresql-master-1-deploy Pod implicitly required container POD Started {kubelet openshift-146.lab.eng.nay.redhat.com} Started with docker id ef18b0585d61
2m 2m 1 postgresql-master-1-deploy Pod spec.containers{deployment} Pulled {kubelet openshift-146.lab.eng.nay.redhat.com} Container image "registry.access.redhat.com/aep3_beta/aep-deployer:v3.0.2.905" already present on machine
2m 2m 1 postgresql-master-1-deploy Pod spec.containers{deployment} Created {kubelet openshift-146.lab.eng.nay.redhat.com} Created with docker id c1d8075cf4a7
2m 2m 1 postgresql-master-1-deploy Pod spec.containers{deployment} Started {kubelet openshift-146.lab.eng.nay.redhat.com} Started with docker id c1d8075cf4a7
1m 1m 1 postgresql-master-1-deploy Pod implicitly required container POD Killing {kubelet openshift-146.lab.eng.nay.redhat.com} Killing with docker id ef18b0585d61
1m 1m 1 postgresql-master-1-deploy Pod FailedSync {kubelet openshift-146.lab.eng.nay.redhat.com} Error syncing pod, skipping: failed to delete containers ([exit status 1])
2m 2m 1 postgresql-master-1 ReplicationController failedUpdate {deployer } Error updating deployment wewangp3/postgresql-master-1 status to Pending
2m 2m 1 postgresql-master-1 ReplicationController SuccessfulCreate {replication-controller } Created pod: postgresql-master-1-1qu8i
1m 1m 1 postgresql-slave-1-0pp2f Pod Scheduled {scheduler } Successfully assigned postgresql-slave-1-0pp2f to openshift-145.lab.eng.nay.redhat.com
1m 1m 1 postgresql-slave-1-0pp2f Pod implicitly required container POD Pulled {kubelet openshift-145.lab.eng.nay.redhat.com} Container image "registry.access.redhat.com/aep3_beta/aep-pod:v3.0.2.905" already present on machine
1m 1m 1 postgresql-slave-1-0pp2f Pod implicitly required container POD Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id 29d7e948d0e7
1m 1m 1 postgresql-slave-1-0pp2f Pod implicitly required container POD Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id 29d7e948d0e7
1m 1m 4 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Pulled {kubelet openshift-145.lab.eng.nay.redhat.com} Container image "registry.access.redhat.com/rhscl/postgresql-94-rhel7" already present on machine
1m 1m 1 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id acd90fe1c3df
1m 1m 1 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id acd90fe1c3df
1m 1m 1 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id 0e0fbea392e1
1m 1m 1 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id 0e0fbea392e1
1m 1m 1 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id afe099c9932d
1m 1m 1 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id afe099c9932d
1m 1m 2 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Backoff {kubelet openshift-145.lab.eng.nay.redhat.com} Back-off restarting failed docker container
1m 1m 1 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id 6e40137716f4
1m 1m 1 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id 6e40137716f4
2m 2m 1 postgresql-slave-1-deploy Pod Scheduled {scheduler } Successfully assigned postgresql-slave-1-deploy to openshift-145.lab.eng.nay.redhat.com
2m 2m 1 postgresql-slave-1-deploy Pod implicitly required container POD Pulled {kubelet openshift-145.lab.eng.nay.redhat.com} Container image "registry.access.redhat.com/aep3_beta/aep-pod:v3.0.2.905" already present on machine
2m 2m 1 postgresql-slave-1-deploy Pod implicitly required container POD Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id 371cd050eca9
2m 2m 1 postgresql-slave-1-deploy Pod implicitly required container POD Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id 371cd050eca9
2m 2m 1 postgresql-slave-1-deploy Pod spec.containers{deployment} Pulled {kubelet openshift-145.lab.eng.nay.redhat.com} Container image "registry.access.redhat.com/aep3_beta/aep-deployer:v3.0.2.905" already present on machine
2m 2m 1 postgresql-slave-1-deploy Pod spec.containers{deployment} Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id 926a6d2d9683
2m 2m 1 postgresql-slave-1-deploy Pod spec.containers{deployment} Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id 926a6d2d9683
1m 1m 1 postgresql-slave-1-deploy Pod implicitly required container POD Killing {kubelet openshift-145.lab.eng.nay.redhat.com} Killing with docker id 371cd050eca9
1m 1m 1 postgresql-slave-1-deploy Pod FailedSync {kubelet openshift-145.lab.eng.nay.redhat.com} Error syncing pod, skipping: failed to delete containers ([exit status 1])
2m 2m 1 postgresql-slave-1 ReplicationController failedUpdate {deployer } Error updating deployment wewangp3/postgresql-slave-1 status to Pending
1m 1m 1 postgresql-slave-1 ReplicationController SuccessfulCreate {replication-controller } Created pod: postgresql-slave-1-0pp2f
[root@dhcp-128-91 today6]# oc get pods
NAME READY STATUS RESTARTS AGE
postgresql-master-1-1qu8i 1/1 Running 0 2m
postgresql-slave-1-0pp2f 1/1 Running 3 2m
[root@dhcp-128-91 today6]# oc env pod postgresql-master-1-1qu8i --list
# pods postgresql-master-1-1qu8i, container postgresql-master
POSTGRESQL_MASTER_USER=master
POSTGRESQL_MASTER_PASSWORD=lRXDHiSFbtal
POSTGRESQL_USER=user
POSTGRESQL_PASSWORD=FvTFTrUdsTgI
POSTGRESQL_DATABASE=userdb
POSTGRESQL_ADMIN_PASSWORD=8fLbnnHY6dNR
[root@dhcp-128-91 today6]# oc rsh postgresql-master-1-1qu8i
bash-4.2$ psql -h postgresql-master-1-1qu8i -d userdb -U user
Password for user user:
psql (9.4.5)
Type "help" for help.
userdb=> CREATE TABLE tbl (col1 VARCHAR(20), col2 VARCHAR(20));
CREATE TABLE
userdb=> INSERT INTO tbl VALUES ('foo1', 'bar1');
INSERT 0 1
userdb=> SELECT * FROM tbl;
col1 | col2
------+------
foo1 | bar1
(1 row)
userdb=> \q
bash-4.2$ exit
exit
[root@dhcp-128-91 today6]# oc rsh postgresql-slave-1-0pp2f
bash-4.2$ psql -h postgresql-master-1-1qu8i -d userdb -U user
psql: could not translate host name "postgresql-master-1-1qu8i" to address: Name or service not known
bash-4.2$
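The lookup failure above is probably expected: a pod name such as postgresql-master-1-1qu8i resolves inside that pod itself, while other pods normally reach it through the service name published by cluster DNS. A minimal sketch, assuming the slave should connect through the service named in POSTGRESQL_MASTER_SERVICE_NAME (set on the slave pod in this report) and falling back to the postgresql-master service created by the template; the helper name master_host is hypothetical:

```shell
# Hypothetical helper: print the host name a client should use for the master.
# Prefers the POSTGRESQL_MASTER_SERVICE_NAME variable set on the slave pod and
# falls back to the service name created by the template in this report.
master_host() {
    printf '%s\n' "${POSTGRESQL_MASTER_SERVICE_NAME:-postgresql-master}"
}

# Usage inside the slave pod (not executed here):
#   psql -h "$(master_host)" -d userdb -U user
```

Reading the name from the environment keeps the command valid if the template is instantiated with a different service name.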
rhscl/postgresql-94-rhel7 image_id: 2425b18dbc45
Here are some logs that I hope will help track down the problem:
[root@dhcp-128-91 today6]# oc describe pod postgresql-slave-1-0pp2f
Name: postgresql-slave-1-0pp2f
Namespace: wewangp3
Image(s): registry.access.redhat.com/rhscl/postgresql-94-rhel7
Node: openshift-145.lab.eng.nay.redhat.com/10.66.79.145
Start Time: Mon, 02 Nov 2015 15:01:56 +0800
Labels: deployment=postgresql-slave-1,deploymentconfig=postgresql-slave,name=postgresql-slave
Status: Running
Reason:
Message:
IP: 10.1.2.90
Replication Controllers: postgresql-slave-1 (1/1 replicas created)
Containers:
postgresql-slave:
Container ID: docker://6e40137716f483fb022624d407c6f617ab199fff43f10c0dd48cb459a49b179a
Image: registry.access.redhat.com/rhscl/postgresql-94-rhel7
Image ID: docker://2425b18dbc4503bcc273eb43f3cbe72866da20fe0b0b6517a7bc2cf9e53ba569
QoS Tier:
memory: BestEffort
cpu: BestEffort
State: Running
Started: Mon, 02 Nov 2015 15:02:49 +0800
Last Termination State: Terminated
Reason: Error
Exit Code: 1
Started: Mon, 02 Nov 2015 15:02:19 +0800
Finished: Mon, 02 Nov 2015 15:02:19 +0800
Ready: True
Restart Count: 3
Environment Variables:
POSTGRESQL_MASTER_SERVICE_NAME: postgresql-master
POSTGRESQL_MASTER_USER: master
POSTGRESQL_MASTER_PASSWORD: lRXDHiSFbtal
POSTGRESQL_USER: user
POSTGRESQL_PASSWORD: FvTFTrUdsTgI
POSTGRESQL_DATABASE: userdb
Conditions:
Type Status
Ready True
Volumes:
postgresql-data:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-token-39vmg:
Type: Secret (a secret that should populate this volume)
SecretName: default-token-39vmg
Events:
FirstSeen LastSeen Count From SubobjectPath Reason Message
───────── ──────── ───── ──── ───────────── ────── ───────
38m 38m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} implicitly required container POD Pulled Container image "registry.access.redhat.com/aep3_beta/aep-pod:v3.0.2.905" already present on machine
38m 38m 1 {scheduler } Scheduled Successfully assigned postgresql-slave-1-0pp2f to openshift-145.lab.eng.nay.redhat.com
38m 38m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} implicitly required container POD Created Created with docker id 29d7e948d0e7
38m 38m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} implicitly required container POD Started Started with docker id 29d7e948d0e7
38m 38m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Started Started with docker id acd90fe1c3df
38m 38m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Created Created with docker id acd90fe1c3df
37m 37m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Created Created with docker id 0e0fbea392e1
37m 37m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Started Started with docker id 0e0fbea392e1
37m 37m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Created Created with docker id afe099c9932d
37m 37m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Started Started with docker id afe099c9932d
37m 37m 2 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Backoff Back-off restarting failed docker container
37m 37m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Started Started with docker id 6e40137716f4
37m 37m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Created Created with docker id 6e40137716f4
38m 37m 4 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Pulled Container image "registry.access.redhat.com/rhscl/postgresql-94-rhel7" already present on machine
[root@dhcp-128-91 today6]# oc get pods
NAME READY STATUS RESTARTS AGE
postgresql-master-1-1qu8i 1/1 Running 0 39m
postgresql-slave-1-0pp2f 1/1 Running 3 38m
[root@dhcp-128-91 today6]# oc logs postgresql-master-1-1qu8i
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.utf8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
fixing permissions on existing directory /var/lib/pgsql/data/userdata ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
creating template1 database in /var/lib/pgsql/data/userdata/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating collations ... ok
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
loading PL/pgSQL server-side language ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
copying template1 to postgres ... ok
syncing data to disk ... ok
Success. You can now start the database server using:
postgres -D /var/lib/pgsql/data/userdata
or
pg_ctl -D /var/lib/pgsql/data/userdata -l logfile start
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
waiting for server to start....LOG: redirecting log output to logging collector process
HINT: Future log output will appear in directory "pg_log".
done
server started
waiting for server to shut down.... done
server stopped
waiting for server to start....LOG: redirecting log output to logging collector process
HINT: Future log output will appear in directory "pg_log".
done
server started
ALTER ROLE
ALTER ROLE
ALTER ROLE
ALTER ROLE
waiting for server to shut down.... done
server stopped
LOG: redirecting log output to logging collector process
HINT: Future log output will appear in directory "pg_log".
[root@dhcp-128-91 today6]# oc logs postgresql-slave-1-0pp2f
Initializing PostgreSQL slave ...
LOG: redirecting log output to logging collector process
HINT: Future log output will appear in directory "pg_log".
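Both pods redirect server output to the logging collector, so oc logs shows little once the server has started. A sketch of locating the newest file under pg_log, assuming the data directory /var/lib/pgsql/data/userdata printed by initdb above; the function name latest_pg_log is hypothetical:

```shell
# Hypothetical helper: print the most recently modified file in DATADIR/pg_log.
# DATADIR defaults to the data directory reported by initdb in this bug.
latest_pg_log() {
    datadir="${1:-/var/lib/pgsql/data/userdata}"
    # ls -t sorts by modification time, newest first.
    ls -t "$datadir"/pg_log/* 2>/dev/null | head -n 1
}

# Usage inside a running pod (not executed here):
#   tail -n 50 "$(latest_pg_log)"
```

The function prints nothing when pg_log is empty or missing, so the caller can tell that the collector has not written any file yet.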
For postgresql-92-rhel7, pods also stay in CrashLoopBackOff, with the log below:
[root@openshift-124 ~]# docker logs 7944a6a2b897
waiting for server to start....FATAL: data directory "/var/lib/pgsql/data/userdata" has wrong ownership
HINT: The server must be started by the user that owns the data directory.
pg_ctl: could not start server
Examine the log output.
.... stopped waiting
Image version: openshift3/postgresql-92-rhel7 c10e6b2e643e
Below are my steps:
1. Create pv on master
2. Create project locally
3. oc process -f https://raw.githubusercontent.com/openshift/origin/master/examples/db-templates/postgresql-persistent-template.json | oc create -f -
4. Check the pods status
$ oc get pods
NAME READY STATUS RESTARTS AGE
postgresql-master-1-qf6kg 0/1 CrashLoopBackOff 5 4m
postgresql-slave-1-yssxp 0/1 CrashLoopBackOff 6 4m
When I test rhscl/postgresql-94-rhel7 with nfs server, I still have the "CrashLoopBackOff" problem:
rhscl/postgresql-94-rhel7 image_id: 2425b18dbc45
[root@dhcp-128-91 today5]# oc new-app --image-stream=postgresql -e POSTGRESQL_USER=user -e POSTGRESQL_PASSWORD=pass -e POSTGRESQL_DATABASE=db
--> Found image 2425b18 (10 days old) in image stream "postgresql" under tag :latest for "postgresql"
* This image will be deployed in deployment config "postgresql"
* Port 5432/tcp will be load balanced by service "postgresql"
--> Creating resources with label app=postgresql ...
DeploymentConfig "postgresql" created
Service "postgresql" created
--> Success
Run 'oc status' to view your app.
[root@dhcp-128-91 today5]# oc get pods
NAME READY STATUS RESTARTS AGE
postgresql-1-deploy 1/1 Running 0 30s
postgresql-1-iujiv 0/1 Pending 0 26s
[root@dhcp-128-91 today5]# oc get pods
NAME READY STATUS RESTARTS AGE
postgresql-1-iujiv 1/1 Running 0 33s
[root@dhcp-128-91 today5]# oc env pod postgresql-1-iujiv --list
# pods postgresql-1-iujiv, container postgresql
POSTGRESQL_DATABASE=db
POSTGRESQL_PASSWORD=pass
POSTGRESQL_USER=user
[root@dhcp-128-91 today5]# oc rsh postgresql-1-iujiv
bash-4.2$ error: error executing remote command: Error executing command in container: API error (500): Container 04d40d223ef5e7944122ba15cb65ef55691a07340ad9cea2ef9004527d3453a4 is not running
[root@dhcp-128-91 today5]# oc get pods
NAME READY STATUS RESTARTS AGE
postgresql-1-iujiv 0/1 CrashLoopBackOff 3 1m
[root@dhcp-128-91 today5]# oc get pods
NAME READY STATUS RESTARTS AGE
postgresql-1-iujiv 0/1 CrashLoopBackOff 5 3m
[root@dhcp-128-91 today5]# oc get pods
NAME READY STATUS RESTARTS AGE
postgresql-1-iujiv 0/1 CrashLoopBackOff 6 5m
[root@dhcp-128-91 today5]# oc logs postgresql-1-iujiv
Error from server: Internal error occurred: Pod "postgresql-1-iujiv" in namespace "wewangproject1": container "postgresql" is in waiting state.
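oc logs fails here because the current container is in a waiting state. A sketch that first asks for the output of the previous (terminated) container, assuming the installed oc client supports the --previous flag that kubectl logs provides; pod_logs is a hypothetical wrapper, not part of oc:

```shell
# Hypothetical wrapper: show logs from the previous (crashed) container of a
# pod if available, otherwise fall back to the current container.
# Assumes `oc logs --previous` is supported by the installed client.
pod_logs() {
    pod="$1"
    oc logs --previous "$pod" 2>/dev/null || oc logs "$pod"
}

# Usage (not executed here):
#   pod_logs postgresql-1-iujiv
```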
[root@dhcp-128-91 today5]# oc describe pod postgresql-1-iujiv
Name: postgresql-1-iujiv
Namespace: wewangproject1
Image(s): rcm-img-docker01.build.eng.bos.redhat.com:5001/rhscl/postgresql-94-rhel7:latest
Node: openshift-148.lab.sjc.redhat.com/10.14.6.148
Start Time: Mon, 02 Nov 2015 17:05:30 +0800
Labels: app=postgresql,deployment=postgresql-1,deploymentconfig=postgresql
Status: Running
Reason:
Message:
IP: 10.1.0.94
Replication Controllers: postgresql-1 (1/1 replicas created)
Containers:
postgresql:
Container ID: docker://b67a5b14bf79c0ebaee1ede78cc93711d560e5603833377b2d1a5b696b1a6b34
Image: rcm-img-docker01.build.eng.bos.redhat.com:5001/rhscl/postgresql-94-rhel7:latest
Image ID: docker://2425b18dbc4503bcc273eb43f3cbe72866da20fe0b0b6517a7bc2cf9e53ba569
QoS Tier:
memory: BestEffort
cpu: BestEffort
State: Waiting
Reason: CrashLoopBackOff
Last Termination State: Terminated
Reason: Error
Exit Code: 1
Started: Mon, 02 Nov 2015 17:09:18 +0800
Finished: Mon, 02 Nov 2015 17:09:23 +0800
Ready: False
Restart Count: 6
Environment Variables:
POSTGRESQL_DATABASE: db
POSTGRESQL_PASSWORD: pass
POSTGRESQL_USER: user
Conditions:
Type Status
Ready False
Volumes:
postgresql-volume-1:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-token-v59j3:
Type: Secret (a secret that should populate this volume)
SecretName: default-token-v59j3
Events:
FirstSeen LastSeen Count From SubobjectPath Reason Message
───────── ──────── ───── ──── ───────────── ────── ───────
6m 6m 1 {scheduler } Scheduled Successfully assigned postgresql-1-iujiv to openshift-148.lab.sjc.redhat.com
6m 6m 1 {kubelet openshift-148.lab.sjc.redhat.com} implicitly required container POD Pulled Container image "registry.access.redhat.com/aep3_beta/aep-pod:v3.0.2.905" already present on machine
6m 6m 1 {kubelet openshift-148.lab.sjc.redhat.com} implicitly required container POD Created Created with docker id e58b1c62bcca
6m 6m 1 {kubelet openshift-148.lab.sjc.redhat.com} implicitly required container POD Started Started with docker id e58b1c62bcca
5m 5m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Created Created with docker id 9ebdd63a45b6
5m 5m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Started Started with docker id 9ebdd63a45b6
5m 5m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Started Started with docker id e7d1dd561bfd
5m 5m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Created Created with docker id e7d1dd561bfd
5m 5m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Created Created with docker id 04d40d223ef5
5m 5m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Started Started with docker id 04d40d223ef5
4m 4m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Created Created with docker id 342c4c821e5c
4m 4m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Started Started with docker id 342c4c821e5c
3m 3m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Created Created with docker id 33c3d6085f78
3m 3m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Started Started with docker id 33c3d6085f78
5m 2m 6 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Pulling pulling image "rcm-img-docker01.build.eng.bos.redhat.com:5001/rhscl/postgresql-94-rhel7:latest"
2m 2m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Started Started with docker id b67a5b14bf79
5m 2m 6 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Pulled Successfully pulled image "rcm-img-docker01.build.eng.bos.redhat.com:5001/rhscl/postgresql-94-rhel7:latest"
2m 2m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Created Created with docker id b67a5b14bf79
5m 26s 23 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Backoff Back-off restarting failed docker container
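The PostgreSQL 9.x servers in these images refuse to start unless the data directory is owned by the user running the server and has mode 0700, which matches the "wrong ownership" failure reported for postgresql-92-rhel7 earlier in this bug. A sketch of checking both conditions on a mounted volume, assuming GNU stat and the data directory path from the logs; check_pgdata is a hypothetical name:

```shell
# Hypothetical check: verify that a PostgreSQL data directory is owned by the
# current user and has mode 700, the two conditions the server enforces at
# start-up. Requires GNU stat (-c format option).
check_pgdata() {
    dir="${1:-/var/lib/pgsql/data/userdata}"
    mode=$(stat -c '%a' "$dir") || return 2
    owner=$(stat -c '%u' "$dir") || return 2
    if [ "$owner" != "$(id -u)" ]; then
        echo "wrong ownership: uid $owner, expected $(id -u)"
        return 1
    fi
    if [ "$mode" != "700" ]; then
        echo "wrong permissions: mode $mode, expected 700"
        return 1
    fi
    echo "ok"
}
```

Run as the same user the container uses for the server, the check reports which of the two conditions fails before pg_ctl is attempted.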
[root@openshift-148 ~]# docker logs 25ada60d0c08
waiting for server to start....FATAL: data directory "/var/lib/pgsql/data/userdata" has group or world access
DETAIL: Permissions should be u=rwx (0700).
.... stopped waiting
pg_ctl: could not start server
Examine the log output.
postgresql-92-rhel7 also has the CrashLoopBackOff problem:
openshift3/postgresql-92-rhel7 image_id: c10e6b2e643e
(In reply to wewang from comment #11)
> When I test rhscl/postgresql-94-rhel7 without nfs server, still have the
> "CrashLoopBackOff" problem:
>
> rhscl/postgresql-94-rhel7 image_id: 2425b18dbc45
(In reply to wewang from comment #11)
> When I test rhscl/postgresql-94-rhel7 with nfs server, still have the
> "CrashLoopBackOff" problem:
>
> rhscl/postgresql-94-rhel7 image_id: 2425b18dbc45
{kubelet openshift-148.lab.sjc.redhat.com} > spec.containers{postgresql} Created Created with docker id b67a5b14bf79 > 5m 26s 23 {kubelet openshift-148.lab.sjc.redhat.com} > spec.containers{postgresql} Backoff Back-off restarting failed docker > container

Sorry, the comment should read: when I test rhscl/postgresql-94-rhel7 without an NFS server, I still have the "CrashLoopBackOff" problem.

# docker logs 25ada60d0c08
waiting for server to start....
FATAL: data directory "/var/lib/pgsql/data/userdata" has group or world access
DETAIL: Permissions should be u=rwx (0700).
.... stopped waiting
pg_ctl: could not start server
Examine the log output.

This says that the NFS server has incorrect permissions. Not an AE/OS problem. The NFS export must be owned by postgres and must have 700 permissions...

https://github.com/openshift/openshift-docs/pull/1123/files#diff-2ae4205bccc596fefd833f7d63d81bb5R193 however says 777, which doesn't seem as though it will work here.

777 should be ok for the mount dir; postgres should be creating its own files under that dir with 700 permissions.

@ben fair enough, I see that the mount point in the rc is /var/lib/pgsql/data and postgres isn't complaining about that one. However, clearly /var/lib/pgsql/data/userdata is not being created 700. I believe that leaves 2 possibilities:
1) postgres has a bug and is creating it incorrectly (seems unlikely)
2) /var/lib/pgsql/data/userdata was created (incorrectly) in the PV before you tried to use that PV here.
To resolve #2 I would suggest either:
a) delete everything in the PV, chown the root dir to postgres, retry
b) chown -R postgres; chmod -R 700; retry.

# cat /etc/exports
/mnt/nfs-export/postgres *(rw)
# systemctl status nfs-server
nfs-server.service - NFS server and services
[snip]
Main PID: 2157 (code=exited, status=0/SUCCESS)
[snip]
# showmount -e localhost
Export list for localhost:
/mnt/nfs-export/postgres *
# ls -lad /mnt/nfs-export/postgres/
drwxrwxrwx. 2 root root 6 Nov 2 16:16 /mnt/nfs-export/postgres/
# ls -la /mnt/nfs-export/postgres/
total 0
drwxrwxrwx. 2 root root 6 Nov 2 16:16 .
drwxr-xr-x. 3 root root 21 Nov 2 14:32 ..
# cat pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-data-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    path: /mnt/nfs-export/postgres/
    server: 127.0.0.1
# oc create -f pv.yaml
# wget https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/image/db-templates/postgresql-replica-rhel7.json
# sed -i -e 's|/rhscl/|/rhscl_beta/|' postgresql-replica-rhel7.json
# oc new-app postgresql-replica-rhel7.json
# oc get pod
NAME READY STATUS RESTARTS AGE
postgresql-master-1-fjz17 1/1 Running 0 2m
postgresql-slave-1-xqj6k 1/1 Running 2 2m
Note that my NFS export is world writable. It must either be world writable or writable by the user assigned to this project. Note also that my NFS export is EMPTY. You cannot have old files lying around in there. All this to say, I believe this is working properly. Make sure your NFS mount is clean and writable by the eventual ssc user and it works.

Our steps for creating the shared directory are as follows:
mkdir /nfs
chown -R nfsnobody:nfsnobody /nfs
chmod 777 /nfs
echo '/nfs *(rw)' >> /etc/exports
exportfs -a
setsebool -P virt_use_nfs 1

The slave image in that template does not even use persistent storage; it uses emptydir storage. Also, the image referenced in that template (registry.access.redhat.com/rhscl/postgresql-94-rhel7) is not actually available. So I'm having trouble following the recreate steps on this. However, like Eric, I was able to set up an openshift cluster with an NFS persistent volume and, after building the latest postgresql-94-rhel7 image locally and editing the template to use the local image, was able to start up the master and slave pods without issue. I think we need to simplify this. Are you able to run the postgresql image standalone (not using the replica template)?

Lowering severity as:
1) it's associated with replication scenarios which are samples and not part of the product
2) it appears to be working correctly in 2 other environments.

@ben, you'll see I had to run:
sed -i -e 's|/rhscl/|/rhscl_beta/|' postgresql-replica-rhel7.json
instead of rebuilding the image myself. But yeah....

@ben, when we test on the OSE env, we pull the image from rcm and tag it into the registry, so the images are correct. I tested this today and it works fine. See the comments here: https://bugzilla.redhat.com/show_bug.cgi?id=1260571#c26 . You can move this bug to ON_QA. Thanks!

Based on:
[root@openshift-152 ~]# docker logs 141b952577b4
/usr/bin/container-entrypoint: line 3: exec: postgres-master: not found

Seems like the rhscl postgres 9.4 image is missing the postgres-master script.

SCL team owns this.

(In reply to Ben Parees from comment #33)
> Based on:
> [root@openshift-152 ~]# docker logs 141b952577b4
> /usr/bin/container-entrypoint: line 3: exec: postgres-master: not found
>
> Seems like the rhscl postgres 9.4 image is missing the postgres-master
> script.
>
> SCL team owns this.

This sounds like a different issue. What code exactly depends on the 'postgres-master' command? This has been changed as part of https://github.com/openshift/postgresql/pull/77 while the documented (and default) command should be 'run-postgresql-master'.

(In reply to Pavel Raiskup from comment #34)
> This sounds like a different issue.

I meant -- different from the original purpose of this BZ. It would probably be worth having a new bug for it?

> What code exactly depends on the 'postgres-master' command? This has been
> changed as part of https://github.com/openshift/postgresql/pull/77 while the
> documented (and default) command should be 'run-postgresql-master'.

Ping? Do we need to restore the postgres-master executable (as a symlink to run-postgresql-master)?

Adding needinfo for Wen -- can you provide the info requested by Pavel in comment #34?

It seems to me like something is running with the old replication configuration. From https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/image/db-templates/postgresql-replica-rhel7.json
| "containers": [
| {
| "name": "postgresql-master",
| "image": "registry.access.redhat.com/rhscl/postgresql-94-rhel7",
| "args": [
| "postgres-master"
| ],
You should now use "args": ["run-postgresql-master"] or: "args": ["run-postgresql-slave"]

I used:
"image": "ci.dev.openshift.redhat.com:5000/rhscl/postgresql-94-rhel7",
"args": [
"run-postgresql-master"
],
The default slave content is:
"image": "ci.dev.openshift.redhat.com:5000/rhscl/postgresql-94-rhel7",
"args": [
"postgres-slave"
],
Whether or not I update it like this, it still does not work:
"args": [
"run-postgresql-slave"
],
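
For reference, a minimal sketch of the args rename discussed above, assuming the template still carries the legacy "postgres-master" / "postgres-slave" values. The function name rewrite_replica_args and the output file name are hypothetical; only the old and new command names come from this thread.

```shell
# Hypothetical helper (not part of any image or template): reads a template
# on stdin and rewrites the legacy replica commands to the run-postgresql-*
# names. The leading quote in each pattern keeps names such as
# "postgresql-master" (service and container names) untouched.
rewrite_replica_args() {
  sed -e 's/"postgres-master"/"run-postgresql-master"/' \
      -e 's/"postgres-slave"/"run-postgresql-slave"/'
}

# Example use (output file name is an assumption):
#   rewrite_replica_args < postgresql-replica-rhel7.json > postgresql-replica-fixed.json
```

Writing to a new file instead of editing in place keeps the original template around for comparison.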
(In reply to wewang from comment #43)
> Whether or not I update it like this, it still does not work:
> "args": [
> "run-postgresql-slave"
> ],

What is the result of having it set like this ^^? Anyway, the file https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/image/db-templates/postgresql-replica-rhel7.json still does not have this fix, right?

Sorry, wrong bug.

About "run-postgresql-slave": see comment #40. The errors are as follows:
# docker ps -a |grep postgresql-slave-1-722lm
ffa90272efab ci.dev.openshift.redhat.com:5000/rhscl/postgresql-94-rhel7 "container-entrypoint" 33 seconds ago Exited (1) 32 seconds ago k8s_postgresql-slave.b2787d91_postgresql-slave-1-722lm_wewang3_8e88577e-8840-11e5-88a6-0eeb114f7c11_45532283
d210f3b3b950 ci.dev.openshift.redhat.com:5000/rhscl/postgresql-94-rhel7 "container-entrypoint" About a minute ago Exited (1) About a minute ago k8s_postgresql-slave.b2787d91_postgresql-slave-1-722lm_wewang3_8e88577e-8840-11e5-88a6-0eeb114f7c11_feafd32f
c1ec106d139e ci.dev.openshift.redhat.com:5000/rhscl/postgresql-94-rhel7 "container-entrypoint" About a minute ago Exited (1) About a minute ago k8s_postgresql-slave.b2787d91_postgresql-slave-1-722lm_wewang3_8e88577e-8840-11e5-88a6-0eeb114f7c11_02538bd3
b2fa2e6fc745 openshift/origin-pod:latest "/pod" 2 minutes ago Up 2 minutes k8s_POD.da11fee8_postgresql-slave-1-722lm_wewang3_8e88577e-8840-11e5-88a6-0eeb114f7c11_aeaab028
# docker logs ffa90272efab
Initializing PostgreSQL slave ...
pg_basebackup: directory "/var/lib/pgsql/data/userdata" exists but is not empty

Today I tested in AEP again. I used "args": ["run-postgresql-master"] and "args": ["run-postgresql-slave"].
Version of psql: openshift3/postgresql-92-rhel7 59a40b28bfc5
[root@dhcp-128-91 aep]# oc get pods
NAME READY STATUS RESTARTS AGE
postgresql-master-1-hkx1x 1/1 Running 0 47s
postgresql-slave-1-na8ng 0/1 CrashLoopBackOff 2 47s
# oc logs postgresql-master-1-hkx1x
waiting for server to start....FATAL: data directory "/var/lib/pgsql/data/userdata" has wrong ownership
HINT: The server must be started by the user that owns the data directory.
pg_ctl: could not start server
Examine the log output.
.... stopped waiting

I'm working on a fix already, taking this (hope you don't mind).

This should work after https://github.com/openshift/postgresql/pull/82 is merged and with the latest OpenShift master.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2015:2580
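
The two FATAL messages quoted in this thread ("has group or world access" and "has wrong ownership") both come from PostgreSQL's startup check on the data directory. A rough pre-flight check can be sketched as follows. This is only an illustration: check_pgdata is a hypothetical helper, it assumes GNU stat (the -c option), and it has to be run as the same user that will start the server.

```shell
# Hypothetical helper: succeeds only if the directory has mode 700 and is
# owned by the user running this check. Assumes GNU coreutils stat.
check_pgdata() {
  dir=$1
  mode=$(stat -c '%a' "$dir") || return 2    # octal permission bits, e.g. 700
  owner=$(stat -c '%u' "$dir") || return 2   # numeric uid of the owner
  if [ "$mode" != "700" ]; then
    echo "$dir: mode is $mode, expected 700" >&2
    return 1
  fi
  if [ "$owner" != "$(id -u)" ]; then
    echo "$dir: owner uid is $owner, current uid is $(id -u)" >&2
    return 1
  fi
  return 0
}

# Example use, with the path taken from the logs in this bug:
#   check_pgdata /var/lib/pgsql/data/userdata
```

The check is deliberately narrower than what the server does; it only mirrors the two conditions reported in the logs above.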