Version-Release number of selected component (if applicable):
openshift v3.0.2.903-114-g2849767
kubernetes v1.2.0-alpha.1-1107-g4c8e6f4
rhscl/postgresql-94-rhel7 2425b18dbc45

How reproducible:
Always

Steps to Reproduce:
1. Create an NFS server
2. Create a PV on the master
3. Create a project
4. Create the resources:
   oc new-app https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/image/db-templates/postgresql-replica-rhel7.json
5. Check the pods:
[root@dhcp-128-91 test]# oc get pods
NAME                        READY     STATUS             RESTARTS   AGE
postgresql-master-1-tmr7g   0/1       CrashLoopBackOff   7          12m
postgresql-slave-1-pbiwc    0/1       CrashLoopBackOff   8          12m

Actual results:
The pods are in CrashLoopBackOff.

Expected results:
The pods are Running.
Need more info to troubleshoot this. Are there failed docker containers that indicate why the pod is crashing? Anything in oc get events for the pod?
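For reference, something like the following usually narrows a CrashLoopBackOff down. This is a dry-run sketch that only prints the commands to run on the master/node; the pod name is taken from the report above, and the exact subcommands/flags are assumed from ordinary oc/docker usage, not from this environment:

```shell
# Print the triage commands for a crash-looping pod.
# (Dry run so it can be pasted anywhere; pod name from the report above.)
triage_cmds() {
  pod=$1
  echo "oc describe pod $pod"          # events + last termination state
  echo "oc get events | grep $pod"     # scheduler/kubelet history for the pod
  echo "oc logs $pod --previous"       # stdout/stderr of the crashed container
  echo "docker ps -a | grep $pod"      # exited container ids, for docker logs <id>
}
triage_cmds postgresql-master-1-tmr7g
```

The `--previous` log is the important one: the current container may have been restarted and look healthy while the crash reason is only in the previous instance's output.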
The problem can only be reproduced sometimes, and I have now hit a similar problem when testing on AEP.

Version:
AEP 3.1 - FCC2 - RPM installation
openshift v3.0.2.905
kubernetes v1.2.0-alpha.1-1107-g4c8e6f4
etcd 2.1.2

[root@dhcp-128-91 today6]# oc process -f postgresql-replica-rhel7.json | oc create -f -
persistentvolumeclaim "postgresql-data-claim" created
service "postgresql-master" created
service "postgresql-slave" created
deploymentconfig "postgresql-master" created
deploymentconfig "postgresql-slave" created
[root@dhcp-128-91 today6]# oc get pods
NAME                        READY     STATUS    RESTARTS   AGE
postgresql-master-1-1qu8i   1/1       Running   0          36s
postgresql-slave-1-0pp2f    1/1       Running   0          31s
postgresql-slave-1-deploy   1/1       Running   0          39s
[root@dhcp-128-91 today6]# oc get pods
NAME                        READY     STATUS    RESTARTS   AGE
postgresql-master-1-1qu8i   1/1       Running   0          57s
postgresql-slave-1-0pp2f    1/1       Running   2          52s
[root@dhcp-128-91 today6]# oc get pods
NAME                        READY     STATUS             RESTARTS   AGE
postgresql-master-1-1qu8i   1/1       Running            0          1m
postgresql-slave-1-0pp2f    0/1       CrashLoopBackOff   3          1m
[root@dhcp-128-91 today6]# oc get pods
NAME                        READY     STATUS    RESTARTS   AGE
postgresql-master-1-1qu8i   1/1       Running   0          1m
postgresql-slave-1-0pp2f    1/1       Running   3          1m
[root@dhcp-128-91 today6]# oc get pods
NAME                        READY     STATUS    RESTARTS   AGE
postgresql-master-1-1qu8i   1/1       Running   0          1m
postgresql-slave-1-0pp2f    1/1       Running   3          1m
[root@dhcp-128-91 today6]# oc get event
FIRSTSEEN  LASTSEEN  COUNT  NAME  KIND  SUBOBJECT  REASON  SOURCE  MESSAGE
2m 2m 1 postgresql-master-1-1qu8i Pod Scheduled {scheduler } Successfully assigned postgresql-master-1-1qu8i to openshift-145.lab.eng.nay.redhat.com
2m 2m 1 postgresql-master-1-1qu8i Pod implicitly required container POD Pulled {kubelet openshift-145.lab.eng.nay.redhat.com} Container image "registry.access.redhat.com/aep3_beta/aep-pod:v3.0.2.905" already present on machine
2m 2m 1 postgresql-master-1-1qu8i Pod implicitly required container POD Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id 7f8d3cc1c5ae
2m 2m 1 postgresql-master-1-1qu8i Pod implicitly required container POD Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id 7f8d3cc1c5ae
2m 2m 1 postgresql-master-1-1qu8i Pod spec.containers{postgresql-master} Pulled {kubelet openshift-145.lab.eng.nay.redhat.com} Container image "registry.access.redhat.com/rhscl/postgresql-94-rhel7" already present on machine
2m 2m 1 postgresql-master-1-1qu8i Pod spec.containers{postgresql-master} Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id 3f26ef152238
2m 2m 1 postgresql-master-1-1qu8i Pod spec.containers{postgresql-master} Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id 3f26ef152238
2m 2m 1 postgresql-master-1-deploy Pod Scheduled {scheduler } Successfully assigned postgresql-master-1-deploy to openshift-146.lab.eng.nay.redhat.com
2m 2m 1 postgresql-master-1-deploy Pod implicitly required container POD Pulled {kubelet openshift-146.lab.eng.nay.redhat.com} Container image "registry.access.redhat.com/aep3_beta/aep-pod:v3.0.2.905" already present on machine
2m 2m 1 postgresql-master-1-deploy Pod implicitly required container POD Created {kubelet openshift-146.lab.eng.nay.redhat.com} Created with docker id ef18b0585d61
2m 2m 1 postgresql-master-1-deploy Pod implicitly required container POD Started {kubelet openshift-146.lab.eng.nay.redhat.com} Started with docker id ef18b0585d61
2m 2m 1 postgresql-master-1-deploy Pod spec.containers{deployment} Pulled {kubelet openshift-146.lab.eng.nay.redhat.com} Container image "registry.access.redhat.com/aep3_beta/aep-deployer:v3.0.2.905" already present on machine
2m 2m 1 postgresql-master-1-deploy Pod spec.containers{deployment} Created {kubelet openshift-146.lab.eng.nay.redhat.com} Created with docker id c1d8075cf4a7
2m 2m 1 postgresql-master-1-deploy Pod spec.containers{deployment} Started {kubelet openshift-146.lab.eng.nay.redhat.com} Started with docker id c1d8075cf4a7
1m 1m 1 postgresql-master-1-deploy Pod implicitly required container POD Killing {kubelet openshift-146.lab.eng.nay.redhat.com} Killing with docker id ef18b0585d61
1m 1m 1 postgresql-master-1-deploy Pod FailedSync {kubelet openshift-146.lab.eng.nay.redhat.com} Error syncing pod, skipping: failed to delete containers ([exit status 1])
2m 2m 1 postgresql-master-1 ReplicationController failedUpdate {deployer } Error updating deployment wewangp3/postgresql-master-1 status to Pending
2m 2m 1 postgresql-master-1 ReplicationController SuccessfulCreate {replication-controller } Created pod: postgresql-master-1-1qu8i
1m 1m 1 postgresql-slave-1-0pp2f Pod Scheduled {scheduler } Successfully assigned postgresql-slave-1-0pp2f to openshift-145.lab.eng.nay.redhat.com
1m 1m 1 postgresql-slave-1-0pp2f Pod implicitly required container POD Pulled {kubelet openshift-145.lab.eng.nay.redhat.com} Container image "registry.access.redhat.com/aep3_beta/aep-pod:v3.0.2.905" already present on machine
1m 1m 1 postgresql-slave-1-0pp2f Pod implicitly required container POD Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id 29d7e948d0e7
1m 1m 1 postgresql-slave-1-0pp2f Pod implicitly required container POD Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id 29d7e948d0e7
1m 1m 4 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Pulled {kubelet openshift-145.lab.eng.nay.redhat.com} Container image "registry.access.redhat.com/rhscl/postgresql-94-rhel7" already present on machine
1m 1m 1 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id acd90fe1c3df
1m 1m 1 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id acd90fe1c3df
1m 1m 1 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id 0e0fbea392e1
1m 1m 1 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id 0e0fbea392e1
1m 1m 1 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id afe099c9932d
1m 1m 1 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id afe099c9932d
1m 1m 2 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Backoff {kubelet openshift-145.lab.eng.nay.redhat.com} Back-off restarting failed docker container
1m 1m 1 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id 6e40137716f4
1m 1m 1 postgresql-slave-1-0pp2f Pod spec.containers{postgresql-slave} Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id 6e40137716f4
2m 2m 1 postgresql-slave-1-deploy Pod Scheduled {scheduler } Successfully assigned postgresql-slave-1-deploy to openshift-145.lab.eng.nay.redhat.com
2m 2m 1 postgresql-slave-1-deploy Pod implicitly required container POD Pulled {kubelet openshift-145.lab.eng.nay.redhat.com} Container image "registry.access.redhat.com/aep3_beta/aep-pod:v3.0.2.905" already present on machine
2m 2m 1 postgresql-slave-1-deploy Pod implicitly required container POD Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id 371cd050eca9
2m 2m 1 postgresql-slave-1-deploy Pod implicitly required container POD Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id 371cd050eca9
2m 2m 1 postgresql-slave-1-deploy Pod spec.containers{deployment} Pulled {kubelet openshift-145.lab.eng.nay.redhat.com} Container image "registry.access.redhat.com/aep3_beta/aep-deployer:v3.0.2.905" already present on machine
2m 2m 1 postgresql-slave-1-deploy Pod spec.containers{deployment} Created {kubelet openshift-145.lab.eng.nay.redhat.com} Created with docker id 926a6d2d9683
2m 2m 1 postgresql-slave-1-deploy Pod spec.containers{deployment} Started {kubelet openshift-145.lab.eng.nay.redhat.com} Started with docker id 926a6d2d9683
1m 1m 1 postgresql-slave-1-deploy Pod implicitly required container POD Killing {kubelet openshift-145.lab.eng.nay.redhat.com} Killing with docker id 371cd050eca9
1m 1m 1 postgresql-slave-1-deploy Pod FailedSync {kubelet openshift-145.lab.eng.nay.redhat.com} Error syncing pod, skipping: failed to delete containers ([exit status 1])
2m 2m 1 postgresql-slave-1 ReplicationController failedUpdate {deployer } Error updating deployment wewangp3/postgresql-slave-1 status to Pending
1m 1m 1 postgresql-slave-1 ReplicationController SuccessfulCreate {replication-controller } Created pod: postgresql-slave-1-0pp2f
[root@dhcp-128-91 today6]# oc get pods
NAME                        READY     STATUS    RESTARTS   AGE
postgresql-master-1-1qu8i   1/1       Running   0          2m
postgresql-slave-1-0pp2f    1/1       Running   3          2m
[root@dhcp-128-91 today6]# oc env pod postgresql-master-1-1qu8i --list
# pods postgresql-master-1-1qu8i, container postgresql-master
POSTGRESQL_MASTER_USER=master
POSTGRESQL_MASTER_PASSWORD=lRXDHiSFbtal
POSTGRESQL_USER=user
POSTGRESQL_PASSWORD=FvTFTrUdsTgI
POSTGRESQL_DATABASE=userdb
POSTGRESQL_ADMIN_PASSWORD=8fLbnnHY6dNR
[root@dhcp-128-91 today6]# oc rsh postgresql-master-1-1qu8i
bash-4.2$ psql -h postgresql-master-1-1qu8i -d userdb -U user
Password for user user:
psql (9.4.5)
Type "help" for help.

userdb=> CREATE TABLE tbl (col1 VARCHAR(20), col2 VARCHAR(20));
CREATE TABLE
userdb=> INSERT INTO tbl VALUES ('foo1', 'bar1');
INSERT 0 1
userdb=> SELECT * FROM tbl;
 col1 | col2
------+------
 foo1 | bar1
(1 row)

userdb=> \q
bash-4.2$ exit
exit
[root@dhcp-128-91 today6]# oc rsh postgresql-slave-1-0pp2f
bash-4.2$ psql -h postgresql-master-1-1qu8i -d userdb -U user
psql: could not translate host name "postgresql-master-1-1qu8i" to address: Name or service not known
bash-4.2$
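A note on the last rsh session: pod names such as postgresql-master-1-1qu8i are not registered in DNS, so psql cannot resolve them from another pod; only service names (here postgresql-master, per the pod's POSTGRESQL_MASTER_SERVICE_NAME variable) are resolvable. A small sketch of the distinction, using getent to ask the same libc resolver that psql's -h lookup goes through (the helper name is mine, not from the image):

```shell
# resolves: report whether a host name can be looked up by the libc resolver,
# which is the same lookup psql performs for its -h argument.
resolves() {
  if getent hosts "$1" >/dev/null 2>&1; then
    echo "$1: resolvable"
  else
    echo "$1: not resolvable"
  fi
}

# Inside a pod, the service name "postgresql-master" would resolve via the
# cluster DNS; a bare pod name does not:
resolves "postgresql-master-1-1qu8i"
```

So the slave's failure to reach the master by pod name is expected behavior; whether the slave container itself should be using the service name is part of what the crash log needs to show.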
rhscl/postgresql-94-rhel7 image_id: 2425b18dbc45
Here are some logs; I hope they help you track down the problem:

[root@dhcp-128-91 today6]# oc describe pod postgresql-slave-1-0pp2f
Name:                     postgresql-slave-1-0pp2f
Namespace:                wewangp3
Image(s):                 registry.access.redhat.com/rhscl/postgresql-94-rhel7
Node:                     openshift-145.lab.eng.nay.redhat.com/10.66.79.145
Start Time:               Mon, 02 Nov 2015 15:01:56 +0800
Labels:                   deployment=postgresql-slave-1,deploymentconfig=postgresql-slave,name=postgresql-slave
Status:                   Running
Reason:
Message:
IP:                       10.1.2.90
Replication Controllers:  postgresql-slave-1 (1/1 replicas created)
Containers:
  postgresql-slave:
    Container ID:  docker://6e40137716f483fb022624d407c6f617ab199fff43f10c0dd48cb459a49b179a
    Image:         registry.access.redhat.com/rhscl/postgresql-94-rhel7
    Image ID:      docker://2425b18dbc4503bcc273eb43f3cbe72866da20fe0b0b6517a7bc2cf9e53ba569
    QoS Tier:
      memory:  BestEffort
      cpu:     BestEffort
    State:     Running
      Started:   Mon, 02 Nov 2015 15:02:49 +0800
    Last Termination State:  Terminated
      Reason:     Error
      Exit Code:  1
      Started:    Mon, 02 Nov 2015 15:02:19 +0800
      Finished:   Mon, 02 Nov 2015 15:02:19 +0800
    Ready:          True
    Restart Count:  3
    Environment Variables:
      POSTGRESQL_MASTER_SERVICE_NAME:  postgresql-master
      POSTGRESQL_MASTER_USER:          master
      POSTGRESQL_MASTER_PASSWORD:      lRXDHiSFbtal
      POSTGRESQL_USER:                 user
      POSTGRESQL_PASSWORD:             FvTFTrUdsTgI
      POSTGRESQL_DATABASE:             userdb
Conditions:
  Type    Status
  Ready   True
Volumes:
  postgresql-data:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  default-token-39vmg:
    Type:        Secret (a secret that should populate this volume)
    SecretName:  default-token-39vmg
Events:
  FirstSeen  LastSeen  Count  From  SubobjectPath  Reason  Message
  ─────────  ────────  ─────  ────  ─────────────  ──────  ───────
  38m 38m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} implicitly required container POD Pulled Container image "registry.access.redhat.com/aep3_beta/aep-pod:v3.0.2.905" already present on machine
  38m 38m 1 {scheduler } Scheduled Successfully assigned postgresql-slave-1-0pp2f to openshift-145.lab.eng.nay.redhat.com
  38m 38m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} implicitly required container POD Created Created with docker id 29d7e948d0e7
  38m 38m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} implicitly required container POD Started Started with docker id 29d7e948d0e7
  38m 38m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Started Started with docker id acd90fe1c3df
  38m 38m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Created Created with docker id acd90fe1c3df
  37m 37m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Created Created with docker id 0e0fbea392e1
  37m 37m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Started Started with docker id 0e0fbea392e1
  37m 37m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Created Created with docker id afe099c9932d
  37m 37m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Started Started with docker id afe099c9932d
  37m 37m 2 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Backoff Back-off restarting failed docker container
  37m 37m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Started Started with docker id 6e40137716f4
  37m 37m 1 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Created Created with docker id 6e40137716f4
  38m 37m 4 {kubelet openshift-145.lab.eng.nay.redhat.com} spec.containers{postgresql-slave} Pulled Container image "registry.access.redhat.com/rhscl/postgresql-94-rhel7" already present on machine
[root@dhcp-128-91 today6]# oc get pods
NAME                        READY     STATUS    RESTARTS   AGE
postgresql-master-1-1qu8i   1/1       Running   0          39m
postgresql-slave-1-0pp2f    1/1       Running   3          38m
[root@dhcp-128-91 today6]# oc logs postgresql-master-1-1qu8i
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.utf8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/lib/pgsql/data/userdata ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
creating template1 database in /var/lib/pgsql/data/userdata/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating collations ... ok
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
loading PL/pgSQL server-side language ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
copying template1 to postgres ... ok
syncing data to disk ... ok

Success. You can now start the database server using:

    postgres -D /var/lib/pgsql/data/userdata
or
    pg_ctl -D /var/lib/pgsql/data/userdata -l logfile start

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
waiting for server to start....LOG:  redirecting log output to logging collector process
HINT:  Future log output will appear in directory "pg_log".
 done
server started
waiting for server to shut down.... done
server stopped
waiting for server to start....LOG:  redirecting log output to logging collector process
HINT:  Future log output will appear in directory "pg_log".
 done
server started
ALTER ROLE
ALTER ROLE
ALTER ROLE
ALTER ROLE
waiting for server to shut down.... done
server stopped
LOG:  redirecting log output to logging collector process
HINT:  Future log output will appear in directory "pg_log".
[root@dhcp-128-91 today6]# oc logs postgresql-slave-1-0pp2f
Initializing PostgreSQL slave ...
LOG:  redirecting log output to logging collector process
HINT:  Future log output will appear in directory "pg_log".
With postgresql-92-rhel7, the pods also stay in CrashLoopBackOff, with the log below:

[root@openshift-124 ~]# docker logs 7944a6a2b897
waiting for server to start....FATAL:  data directory "/var/lib/pgsql/data/userdata" has wrong ownership
HINT:  The server must be started by the user that owns the data directory.
pg_ctl: could not start server
Examine the log output.
.... stopped waiting

Image version:
openshift3/postgresql-92-rhel7 c10e6b2e643e

Below are my steps:
1. Create a PV on the master
2. Create a project locally
3. oc process -f https://raw.githubusercontent.com/openshift/origin/master/examples/db-templates/postgresql-persistent-template.json | oc create -f -
4. Check the pod status:
$ oc get pods
NAME                        READY     STATUS             RESTARTS   AGE
postgresql-master-1-qf6kg   0/1       CrashLoopBackOff   5          4m
postgresql-slave-1-yssxp    0/1       CrashLoopBackOff   6          4m
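That "wrong ownership" FATAL comes from PostgreSQL's own startup check: the data directory must be owned by the user that starts the server, and on NFS-backed PVs root_squash commonly remaps the owner so the check fails. A minimal stand-alone sketch of the same check (plain shell with GNU stat; the helper name is mine, not the image's actual start script):

```shell
# check_ownership: mimic PostgreSQL's data-directory ownership check.
# The directory must be owned by the user that starts the server.
check_ownership() {
  owner=$(stat -c '%U' "$1")   # GNU stat: print the owning user name
  me=$(id -un)
  if [ "$owner" = "$me" ]; then
    echo "ownership ok"
  else
    echo "wrong ownership: $owner (server runs as $me)"
  fi
}

d=$(mktemp -d)        # stand-in for /var/lib/pgsql/data/userdata
check_ownership "$d"  # a freshly created dir is owned by us, so: ownership ok
rm -rf "$d"
```

On an NFS export, running the same check against the mounted data directory from inside the container would show whether the squashed ownership is what trips the server.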
When I test rhscl/postgresql-94-rhel7 without an NFS server, I still hit the "CrashLoopBackOff" problem:

rhscl/postgresql-94-rhel7 image_id: 2425b18dbc45

[root@dhcp-128-91 today5]# oc new-app --image-stream=postgresql -e POSTGRESQL_USER=user -e POSTGRESQL_PASSWORD=pass -e POSTGRESQL_DATABASE=db
--> Found image 2425b18 (10 days old) in image stream "postgresql" under tag :latest for "postgresql"
    * This image will be deployed in deployment config "postgresql"
    * Port 5432/tcp will be load balanced by service "postgresql"
--> Creating resources with label app=postgresql ...
    DeploymentConfig "postgresql" created
    Service "postgresql" created
--> Success
    Run 'oc status' to view your app.
[root@dhcp-128-91 today5]# oc get pods
NAME                  READY     STATUS    RESTARTS   AGE
postgresql-1-deploy   1/1       Running   0          30s
postgresql-1-iujiv    0/1       Pending   0          26s
[root@dhcp-128-91 today5]# oc get pods
NAME                 READY     STATUS    RESTARTS   AGE
postgresql-1-iujiv   1/1       Running   0          33s
[root@dhcp-128-91 today5]# oc env pod postgresql-1-iujiv --list
# pods postgresql-1-iujiv, container postgresql
POSTGRESQL_DATABASE=db
POSTGRESQL_PASSWORD=pass
POSTGRESQL_USER=user
[root@dhcp-128-91 today5]# oc rsh postgresql-1-iujiv
bash-4.2$ error: error executing remote command: Error executing command in container: API error (500): Container 04d40d223ef5e7944122ba15cb65ef55691a07340ad9cea2ef9004527d3453a4 is not running
[root@dhcp-128-91 today5]# oc get pods
NAME                 READY     STATUS             RESTARTS   AGE
postgresql-1-iujiv   0/1       CrashLoopBackOff   3          1m
[root@dhcp-128-91 today5]# oc get pods
NAME                 READY     STATUS             RESTARTS   AGE
postgresql-1-iujiv   0/1       CrashLoopBackOff   5          3m
[root@dhcp-128-91 today5]# oc get pods
NAME                 READY     STATUS             RESTARTS   AGE
postgresql-1-iujiv   0/1       CrashLoopBackOff   6          5m
[root@dhcp-128-91 today5]# oc logs postgresql-1-iujiv
Error from server: Internal error occurred: Pod "postgresql-1-iujiv" in namespace "wewangproject1": container "postgresql" is in waiting state.
[root@dhcp-128-91 today5]# oc describe pod postgresql-1-iujiv
Name:                     postgresql-1-iujiv
Namespace:                wewangproject1
Image(s):                 rcm-img-docker01.build.eng.bos.redhat.com:5001/rhscl/postgresql-94-rhel7:latest
Node:                     openshift-148.lab.sjc.redhat.com/10.14.6.148
Start Time:               Mon, 02 Nov 2015 17:05:30 +0800
Labels:                   app=postgresql,deployment=postgresql-1,deploymentconfig=postgresql
Status:                   Running
Reason:
Message:
IP:                       10.1.0.94
Replication Controllers:  postgresql-1 (1/1 replicas created)
Containers:
  postgresql:
    Container ID:  docker://b67a5b14bf79c0ebaee1ede78cc93711d560e5603833377b2d1a5b696b1a6b34
    Image:         rcm-img-docker01.build.eng.bos.redhat.com:5001/rhscl/postgresql-94-rhel7:latest
    Image ID:      docker://2425b18dbc4503bcc273eb43f3cbe72866da20fe0b0b6517a7bc2cf9e53ba569
    QoS Tier:
      memory:  BestEffort
      cpu:     BestEffort
    State:     Waiting
      Reason:    CrashLoopBackOff
    Last Termination State:  Terminated
      Reason:     Error
      Exit Code:  1
      Started:    Mon, 02 Nov 2015 17:09:18 +0800
      Finished:   Mon, 02 Nov 2015 17:09:23 +0800
    Ready:          False
    Restart Count:  6
    Environment Variables:
      POSTGRESQL_DATABASE:  db
      POSTGRESQL_PASSWORD:  pass
      POSTGRESQL_USER:      user
Conditions:
  Type    Status
  Ready   False
Volumes:
  postgresql-volume-1:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  default-token-v59j3:
    Type:        Secret (a secret that should populate this volume)
    SecretName:  default-token-v59j3
Events:
  FirstSeen  LastSeen  Count  From  SubobjectPath  Reason  Message
  ─────────  ────────  ─────  ────  ─────────────  ──────  ───────
  6m 6m 1 {scheduler } Scheduled Successfully assigned postgresql-1-iujiv to openshift-148.lab.sjc.redhat.com
  6m 6m 1 {kubelet openshift-148.lab.sjc.redhat.com} implicitly required container POD Pulled Container image "registry.access.redhat.com/aep3_beta/aep-pod:v3.0.2.905" already present on machine
  6m 6m 1 {kubelet openshift-148.lab.sjc.redhat.com} implicitly required container POD Created Created with docker id e58b1c62bcca
  6m 6m 1 {kubelet openshift-148.lab.sjc.redhat.com} implicitly required container POD Started Started with docker id e58b1c62bcca
  5m 5m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Created Created with docker id 9ebdd63a45b6
  5m 5m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Started Started with docker id 9ebdd63a45b6
  5m 5m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Started Started with docker id e7d1dd561bfd
  5m 5m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Created Created with docker id e7d1dd561bfd
  5m 5m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Created Created with docker id 04d40d223ef5
  5m 5m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Started Started with docker id 04d40d223ef5
  4m 4m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Created Created with docker id 342c4c821e5c
  4m 4m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Started Started with docker id 342c4c821e5c
  3m 3m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Created Created with docker id 33c3d6085f78
  3m 3m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Started Started with docker id 33c3d6085f78
  5m 2m 6 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Pulling pulling image "rcm-img-docker01.build.eng.bos.redhat.com:5001/rhscl/postgresql-94-rhel7:latest"
  2m 2m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Started Started with docker id b67a5b14bf79
  5m 2m 6 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Pulled Successfully pulled image "rcm-img-docker01.build.eng.bos.redhat.com:5001/rhscl/postgresql-94-rhel7:latest"
  2m 2m 1 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Created Created with docker id b67a5b14bf79
  5m 26s 23 {kubelet openshift-148.lab.sjc.redhat.com} spec.containers{postgresql} Backoff Back-off restarting failed docker container
[root@openshift-148 ~]# docker logs 25ada60d0c08
waiting for server to start....FATAL:  data directory "/var/lib/pgsql/data/userdata" has group or world access
DETAIL:  Permissions should be u=rwx (0700).
.... stopped waiting
pg_ctl: could not start server
Examine the log output.
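This FATAL is the companion PostgreSQL startup check: the data directory's mode must be exactly u=rwx (0700), with no group or world bits set, which a pre-created or NFS-exported volume easily violates. A stand-alone sketch of the check and of the fix the DETAIL line asks for (plain shell with GNU stat; the helper name is mine, not the image's start script):

```shell
# check_mode: mimic PostgreSQL's data-directory permission check.
# Group and world bits must be off, i.e. the octal mode must be 700.
check_mode() {
  mode=$(stat -c '%a' "$1")   # GNU stat: print the octal mode
  if [ "$mode" = "700" ]; then
    echo "mode ok"
  else
    echo "rejected: mode $mode (must be 0700)"
  fi
}

d=$(mktemp -d)     # stand-in for /var/lib/pgsql/data/userdata
chmod 0770 "$d"    # e.g. what a pre-created volume might give you
check_mode "$d"    # → rejected: mode 770 (must be 0700)
chmod 0700 "$d"    # the fix PostgreSQL's DETAIL asks for
check_mode "$d"    # → mode ok
rm -rf "$d"
```

For the bug itself, the question is why the image's entrypoint did not (or could not) apply that chmod to the mounted data directory before starting the server.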
postgresql-92-rhel7 also has the CrashLoopBackOff problem:
openshift3/postgresql-92-rhel7 image_id: c10e6b2e643e
(In reply to wewang from comment #11)
> When I test rhscl/postgresql-94-rhel7 without an NFS server, I still have the
> "CrashLoopBackOff" problem:
>
> rhscl/postgresql-94-rhel7 image_id: 2425b18dbc45
> (full transcript snipped; quoted verbatim from comment #11 above)
"rcm-img-docker01.build.eng.bos.redhat.com:5001/rhscl/postgresql-94-rhel7: > latest" > 2m 2m 1 {kubelet openshift-148.lab.sjc.redhat.com} > spec.containers{postgresql} Created Created with docker id b67a5b14bf79 > 5m 26s 23 {kubelet openshift-148.lab.sjc.redhat.com} > spec.containers{postgresql} Backoff Back-off restarting failed docker > container sorry, comments should be:When I test rhscl/postgresql-94-rhel7 without nfs server ,still have the "CrashLoopBackOff" problem:
# docker logs 25ada60d0c08
waiting for server to start....
FATAL: data directory "/var/lib/pgsql/data/userdata" has group or world access
DETAIL: Permissions should be u=rwx (0700).
.... stopped waiting
pg_ctl: could not start server
Examine the log output.

The log output says that the NFS export has incorrect permissions; this is not an AE/OS problem. The NFS export must be owned by postgres and must have 700 permissions...
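As a minimal sketch of the cleanup suggested above (using a local placeholder path in place of the real NFS export; `EXPORT_DIR` and `/tmp/nfs-export-demo` are illustrative, not from the report):

```shell
#!/bin/sh
# Sketch: prepare a clean export directory so the postgres container
# can create its data directory itself with the required 0700 mode.
# EXPORT_DIR is a placeholder; substitute your real export path.
EXPORT_DIR=${EXPORT_DIR:-/tmp/nfs-export-demo}

rm -rf "$EXPORT_DIR"        # start from an empty export
mkdir -p "$EXPORT_DIR"
chmod 777 "$EXPORT_DIR"     # writable by the (arbitrary) container UID

# Show the resulting mode; postgres creates userdata/ as 0700 itself.
stat -c '%a' "$EXPORT_DIR"
```

The point is that the export itself only needs to be writable and empty; the 0700 restriction applies to the `userdata` directory postgres creates inside it.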
https://github.com/openshift/openshift-docs/pull/1123/files#diff-2ae4205bccc596fefd833f7d63d81bb5R193 however says 777, which doesn't seem as though it will work here.
777 should be ok for the mount dir, postgres should be creating its own files under that dir with 700 ownership.
@ben fair enough, I see that the mount point in the rc is /var/lib/pgsql/data and postgres isn't complaining about that one. However, clearly /var/lib/pgsql/data/userdata is not being created with 700 permissions. I believe that leaves 2 possibilities: 1) postgres has a bug and is creating it incorrectly (seems unlikely), or 2) /var/lib/pgsql/data/userdata was created (incorrectly) in the PV before you tried to use that PV here. To resolve #2 I would suggest either: a) delete everything in the PV, chown the root dir to postgres, retry; or b) chown -R postgres; chmod -R 700; retry.
# cat /etc/exports
/mnt/nfs-export/postgres *(rw)

# systemctl status nfs-server
nfs-server.service - NFS server and services
[snip]
 Main PID: 2157 (code=exited, status=0/SUCCESS)
[snip]

# showmount -e localhost
Export list for localhost:
/mnt/nfs-export/postgres *

# ls -lad /mnt/nfs-export/postgres/
drwxrwxrwx. 2 root root 6 Nov 2 16:16 /mnt/nfs-export/postgres/

# ls -la /mnt/nfs-export/postgres/
total 0
drwxrwxrwx. 2 root root  6 Nov 2 16:16 .
drwxr-xr-x. 3 root root 21 Nov 2 14:32 ..

# cat pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-data-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    path: /mnt/nfs-export/postgres/
    server: 127.0.0.1

# oc create -f pv.yaml
# wget https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/image/db-templates/postgresql-replica-rhel7.json
# sed -i -e 's|/rhscl/|/rhscl_beta/|' postgresql-replica-rhel7.json
# oc new-app postgresql-replica-rhel7.json
# oc get pod
NAME                        READY     STATUS    RESTARTS   AGE
postgresql-master-1-fjz17   1/1       Running   0          2m
postgresql-slave-1-xqj6k    1/1       Running   2          2m
Note that my NFS export is world writable. It must either be world writable or writable by the user assigned to this project. Note also that my NFS export is EMPTY. You cannot have old files laying around in there.
All this to say, I believe this is working properly. Make sure your NFS mount is clean and writable by the eventual SCC-assigned user and it works.
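Those two preconditions (clean and writable) can be sanity-checked on the NFS server before creating the PV. A small sketch, where `EXPORT_DIR` and `/tmp/pv-precheck-demo` are illustrative placeholders for the real export path:

```shell
#!/bin/sh
# Sketch: verify an export directory is writable and empty before
# pointing a PersistentVolume at it. EXPORT_DIR is a placeholder.
EXPORT_DIR=${EXPORT_DIR:-/tmp/pv-precheck-demo}
rm -rf "$EXPORT_DIR" && mkdir -p "$EXPORT_DIR" && chmod 777 "$EXPORT_DIR"

# Must be writable by the container user...
[ -w "$EXPORT_DIR" ] && echo "writable: yes"
# ...and must contain no leftover files from a previous deployment.
[ -z "$(ls -A "$EXPORT_DIR")" ] && echo "empty: yes"
```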
Our steps for creating the shared directory are as follows:

mkdir /nfs
chown -R nfsnobody:nfsnobody /nfs
chmod 777 /nfs
echo '/nfs *(rw)' >> /etc/exports
exportfs -a
setsebool -P virt_use_nfs 1
The slave image in that template does not even use persistent storage; it uses emptydir storage. Also, the image referenced in that template (registry.access.redhat.com/rhscl/postgresql-94-rhel7) is not actually available, so I'm having trouble following the recreate steps. However, like Eric, I was able to set up an OpenShift cluster with an NFS persistent volume and, after building the latest postgresql-94-rhel7 image locally and editing the template to use the local image, was able to start up the master and slave pods without issue. I think we need to simplify this. Are you able to run the postgresql image standalone (not using the replica template)?
lowering severity as: 1) it's associated with replication scenarios which are samples and not part of the product 2) it appears to be working correctly in 2 other environments.
@ben, you'll see I had to: sed -i -e 's|/rhscl/|/rhscl_beta/|' postgresql-replica-rhel7.json instead of rebuild the image myself. But yeah....
@ben, when we test on an OSE env, we pull the image from rcm and tag it into the registry, so the image is correct. I tested this today and it works fine; see the comments here: https://bugzilla.redhat.com/show_bug.cgi?id=1260571#c26 . You can mark this bug ON_QA.
Thanks!
Based on: [root@openshift-152 ~]# docker logs 141b952577b4 /usr/bin/container-entrypoint: line 3: exec: postgres-master: not found Seems like the rhscl postgres 9.4 image is missing the postgres-master script. SCL team owns this.
(In reply to Ben Parees from comment #33) > Based on: > [root@openshift-152 ~]# docker logs 141b952577b4 > /usr/bin/container-entrypoint: line 3: exec: postgres-master: not found > > Seems like the rhscl postgres 9.4 image is missing the postgres-master > script. > > SCL team owns this. This sounds like a different issue. What code exactly depends on the 'postgres-master' command? This was changed as part of https://github.com/openshift/postgresql/pull/77; the documented (and default) command should be 'run-postgresql-master'.
(In reply to Pavel Raiskup from comment #34) > This sounds like different issue. I meant: different from the original purpose of this BZ. It would probably be worth filing a new bug for it. > What code exactly depends on 'postgres-master' command? This has been > changed as part of https://github.com/openshift/postgresql/pull/77 while the > documented (and default) command should be 'run-postgresql-master'. Ping? Do we need to restore the postgres-master executable (as a symlink to run-postgresql-master)?
Adding needinfo for Wen: can you provide the info requested by Pavel in comment #34? It seems to me like something is running with an old replication configuration.
From https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/image/db-templates/postgresql-replica-rhel7.json

| "containers": [
| {
| "name": "postgresql-master",
| "image": "registry.access.redhat.com/rhscl/postgresql-94-rhel7",
| "args": [
| "postgres-master"
| ],

You should now use "args": ["run-postgresql-master"] or "args": ["run-postgresql-slave"].
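A minimal way to patch a local copy of the template accordingly; the JSON fragment written below is a stand-in for the real template content, not a copy of it:

```shell
#!/bin/sh
# Sketch: rewrite the obsolete arg names to the new run-postgresql-*
# commands. The file content here is an illustrative fragment only.
TEMPLATE=postgresql-replica-rhel7.json
cat > "$TEMPLATE" <<'EOF'
"args": [ "postgres-master" ],
"args": [ "postgres-slave" ],
EOF

# Replace both the master and the slave command names in place.
sed -i -e 's/"postgres-master"/"run-postgresql-master"/' \
       -e 's/"postgres-slave"/"run-postgresql-slave"/' "$TEMPLATE"
cat "$TEMPLATE"
```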
I used : "image": "ci.dev.openshift.redhat.com:5000/rhscl/postgresql-94-rhel7", "args": [ "run-postgresql-master" ], default slave context is : "image": "ci.dev.openshift.redhat.com:5000/rhscl/postgresql-94-rhel7", "args": [ "postgres-slave" ], I updated like this or not ,also not working "args": [ "run-postgresql-slave" ],
(In reply to wewang from comment #43) > I updated like this or not ,also not working > "args": [ > "run-postgresql-slave" > ], What is the result of having it set like this ^^? Anyway, the file https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/image/db-templates/postgresql-replica-rhel7.json still does not have this fix, right?
Sorry, wrong bug.
About ""run-postgresql-slave",you can see Comment #40 the errors are as follow: # docker ps -a |grep postgresql-slave-1-722lm ffa90272efab ci.dev.openshift.redhat.com:5000/rhscl/postgresql-94-rhel7 "container-entrypoint" 33 seconds ago Exited (1) 32 seconds ago k8s_postgresql-slave.b2787d91_postgresql-slave-1-722lm_wewang3_8e88577e-8840-11e5-88a6-0eeb114f7c11_45532283 d210f3b3b950 ci.dev.openshift.redhat.com:5000/rhscl/postgresql-94-rhel7 "container-entrypoint" About a minute ago Exited (1) About a minute ago k8s_postgresql-slave.b2787d91_postgresql-slave-1-722lm_wewang3_8e88577e-8840-11e5-88a6-0eeb114f7c11_feafd32f c1ec106d139e ci.dev.openshift.redhat.com:5000/rhscl/postgresql-94-rhel7 "container-entrypoint" About a minute ago Exited (1) About a minute ago k8s_postgresql-slave.b2787d91_postgresql-slave-1-722lm_wewang3_8e88577e-8840-11e5-88a6-0eeb114f7c11_02538bd3 b2fa2e6fc745 openshift/origin-pod:latest "/pod" 2 minutes ago Up 2 minutes k8s_POD.da11fee8_postgresql-slave-1-722lm_wewang3_8e88577e-8840-11e5-88a6-0eeb114f7c11_aeaab028 # docker logs ffa90272efab Initializing PostgreSQL slave ... pg_basebackup: directory "/var/lib/pgsql/data/userdata" exists but is not empty
Today I tested on AEP again, using "args": ["run-postgresql-master"] and "args": ["run-postgresql-slave"].

Version of the psql image: openshift3/postgresql-92-rhel7 59a40b28bfc5

[root@dhcp-128-91 aep]# oc get pods
NAME                        READY     STATUS             RESTARTS   AGE
postgresql-master-1-hkx1x   1/1       Running            0          47s
postgresql-slave-1-na8ng    0/1       CrashLoopBackOff   2          47s

# oc logs postgresql-master-1-hkx1x
waiting for server to start....FATAL: data directory "/var/lib/pgsql/data/userdata" has wrong ownership
HINT: The server must be started by the user that owns the data directory.
pg_ctl: could not start server
Examine the log output.
.... stopped waiting
I'm working on a fix already, taking this (hope you don't mind)
This should work after https://github.com/openshift/postgresql/pull/82 is merged and with latest OpenShift master.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2015:2580