Description of problem:
The postgresql image with persistent storage does not work when using a CEPH PV.

Version-Release number of selected component (if applicable):
3.1

How reproducible:
Add CEPH PV and deploy the postgresql image

Steps to Reproduce:
1. Install OSE
2. Add CEPH PV
3. Deploy postgresql-persistent template
4. Add fsGroup security setting, otherwise it will bail out with an mkdir error

Actual results:
waiting for server to start....FATAL: data directory "/var/lib/pgsql/data/userdata" has group or world access
DETAIL: Permissions should be u=rwx (0700).
stopped waiting
pg_ctl: could not start server

Expected results:
postgresql should start

Additional info:
1) This needs to be added (Step 4 above) or the following error will occur:
   "mkdir: cannot create directory '/var/lib/pgsql/data/userdata': Permission denied"

   securityContext:
     fsGroup: 26

2) Both 9.2 and 9.4 (from RH-SCL) show this behaviour.
3) Initially reported by a customer, but has also been reproduced with the steps above.
4) According to the postgresql maintainer, postgresql attempts to fix the permissions (chmod(pg_data, S_IRWXU)) and correctly checks the return value for zero. The permission check is doing a "if (stat_buf.st_mode & (S_IRWXG | S_IRWXO)) then error out".
5) Full log from startup:

The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.utf8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are disabled.

fixing permissions on existing directory /var/lib/pgsql/data/userdata ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
creating template1 database in /var/lib/pgsql/data/userdata/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating collations ... ok
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
loading PL/pgSQL server-side language ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
copying template1 to postgres ... ok

WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
syncing data to disk ... ok

Success. You can now start the database server using:

    postgres -D /var/lib/pgsql/data/userdata
or
    pg_ctl -D /var/lib/pgsql/data/userdata -l logfile start

waiting for server to start....FATAL: data directory "/var/lib/pgsql/data/userdata" has group or world access
DETAIL: Permissions should be u=rwx (0700).
stopped waiting
pg_ctl: could not start server
Examine the log output.
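For reference, the permission check quoted in item 4 can be illustrated with a minimal Python sketch. This is not PostgreSQL's code; the function names `has_group_or_world_access` and `check_data_dir` are hypothetical, and the sketch only mirrors the quoted bit test using the same `S_IRWXG` / `S_IRWXO` constants from Python's `stat` module.

```python
import os
import stat
import tempfile


def has_group_or_world_access(mode: int) -> bool:
    """Return True if any group or other permission bit is set.

    Mirrors the quoted test: st_mode & (S_IRWXG | S_IRWXO).
    """
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))


def check_data_dir(path: str) -> str:
    """Describe whether a directory would pass the quoted check."""
    mode = os.stat(path).st_mode
    perms = "%04o" % stat.S_IMODE(mode)
    if has_group_or_world_access(mode):
        return "has group or world access (mode %s)" % perms
    return "ok (mode %s)" % perms


if __name__ == "__main__":
    # Demonstrate on a throwaway directory rather than a real data directory.
    with tempfile.TemporaryDirectory() as d:
        os.chmod(d, 0o770)          # group bits set: the check fails
        print(check_data_dir(d))
        os.chmod(d, stat.S_IRWXU)   # 0700: the check passes
        print(check_data_dir(d))
```

Any mode with a group or other bit set trips the test, so a directory that is given group access again after postgresql has run its chmod would produce the FATAL message shown in the log.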
Additional info:
6) This can't be reproduced with an OpenStack-based PV.
Please see this comment for more information: https://bugzilla.redhat.com/show_bug.cgi?id=1298938#c7 *** This bug has been marked as a duplicate of bug 1298938 ***
Is this Ceph RBD or CephFS? Could you post the PV YAML?
That's rbd.

# more ceph-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: "ceph-pv"
spec:
  capacity:
    storage: "2Gi"
  accessModes:
    - "ReadWriteOnce"
  rbd:
    monitors:
      - "192.168.0.8:6789"
    pool: rbd
    image: ose
    user: admin
    secretRef:
      name: ceph-secret
    fsType: ext4
    readOnly: false
  persistentVolumeReclaimPolicy: "Recycle"
Thanks for the info, Harald.

It looks like what is happening is this: setting the fsGroup sets the permissions so that postgresql is allowed to edit the directory; postgresql then sets the permissions it needs, but openshift resets the permissions to the defaults again.

Harald, could you confirm this by watching the permissions on the drive? So:
- figure out the path to the drive ('mount | grep rbd' after the pod starts)
- run something like:
  while true; do sleep 1; ls -ld </path/to/global/mount/of/rbd/drive>; done

I don't think there is a short-term fix, but as a work-around you could launch the project once as you do above. Then, once the drive is formatted and fsGroup has been applied to it, disable the setting of fsGroup and relaunch the application. The second time around postgresql should have the permissions to "fix" the data directory permissions. Or you can just chmod/chown the directory from the command line.
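As an alternative to the ls loop, here is a minimal Python sketch that polls a directory and reports only when its mode or ownership changes, which makes a reset easier to spot. The function names `describe` and `watch` and their parameters are hypothetical, and the path in the usage note is the same placeholder as in the shell loop.

```python
import os
import stat
import time


def describe(path):
    """Return the mode string and numeric owner/group of a path."""
    st = os.stat(path)
    return "%s uid=%d gid=%d" % (stat.filemode(st.st_mode), st.st_uid, st.st_gid)


def watch(path, interval=1.0, iterations=None, sleep=time.sleep):
    """Yield (time, description) whenever mode or ownership changes.

    Polls every `interval` seconds; runs forever when `iterations` is None.
    The first poll always yields, so the starting state is recorded.
    """
    last = None
    polls = 0
    while iterations is None or polls < iterations:
        current = describe(path)
        if current != last:
            yield time.strftime("%H:%M:%S"), current
            last = current
        polls += 1
        sleep(interval)
```

Usage would look something like `for ts, state in watch("</path/to/global/mount/of/rbd/drive>"): print(ts, state)`, with the placeholder replaced by the path found via 'mount | grep rbd'.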
Hi Sami, I can confirm that behaviour. When debugging the issue for the initial report, I did mount the rbd on another host and changed the perms as postgresql would expect them to be. But a bit later they've been reverted back automatically. br Hari