Bug 1279744 - postgresql-92-rhel7 cannot startup on AEP env
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.0.0
Hardware: Unspecified Unspecified
Priority: medium
Severity: medium
Assigned To: Paul Morie
QA Contact: Liang Xia
Keywords: UpcomingRelease
Reported: 2015-11-10 02:53 EST by Wang Haoran
Modified: 2016-01-26 14:17 EST
Fixed In Version: atomic-openshift-3.1.0.4-1.git.0.064715c.el7aos
Doc Type: Bug Fix
Last Closed: 2016-01-26 14:17:10 EST
Type: Bug


Description Wang Haoran 2015-11-10 02:53:48 EST
Description of problem:

The postgresql-92-rhel7 image with ID c10e6b2e643e cannot start up in the AEP environment, but it can start on an EC2 instance.
Version-Release number of selected component (if applicable):


How reproducible:

always
Steps to Reproduce:
1. Create a project in the AEP environment.
2. Run: oc process -f https://raw.githubusercontent.com/openshift/origin/master/examples/db-templates/postgresql-ephemeral-template.json | oc create -f -
3. Check the pod status.

Actual results:
the pod status:
[vagrant@ose test]$ oc get pod
NAME                 READY     STATUS             RESTARTS   AGE
postgresql-1-qtnin   0/1       CrashLoopBackOff   14         44m

log:
[vagrant@ose test]$ oc logs -f postgresql-1-qtnin
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
 
The database cluster will be initialized with locale "en_US.utf8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
 
fixing permissions on existing directory /var/lib/pgsql/data/userdata ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 32MB
creating configuration files ... ok
creating template1 database in /var/lib/pgsql/data/userdata/base/1 ... ok
initializing pg_authid ... ok
initializing dependencies ... ok
creating system views ... ok
loading system objects' descriptions ... ok
creating collations ... ok
creating conversions ... ok
creating dictionaries ... ok
setting privileges on built-in objects ... ok
creating information schema ... ok
loading PL/pgSQL server-side language ... ok
vacuuming database template1 ... ok
copying template1 to template0 ... ok
 
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
copying template1 to postgres ... ok
 
Success. You can now start the database server using:
 
    postgres -D /var/lib/pgsql/data/userdata
or
    pg_ctl -D /var/lib/pgsql/data/userdata -l logfile start
 
waiting for server to start.... done
server started
waiting for server to shut down.... done
server stopped
waiting for server to start....FATAL:  data directory "/var/lib/pgsql/data/userdata" has group or world access
DETAIL:  Permissions should be u=rwx (0700).
.... stopped waiting
pg_ctl: could not start server
Examine the log output.
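
The FATAL message in the log above is raised when the data directory grants any permission bit to group or others. As a rough way to reproduce that test from a shell inside the container, the sketch below masks the octal mode with 077. The function name is made up for illustration, and the sketch assumes GNU stat (for -c '%a').

```shell
# Hypothetical helper, not part of the image: succeeds (exit 0) when the
# given directory grants any permission bit to group or others.
# Assumes GNU stat, whose %a format prints the mode in octal.
has_group_or_world_access() {
    _mode=$(stat -c '%a' "$1") || return 2
    # A leading 0 makes the shell read the value as octal; 077 masks
    # the group and other rwx bits.
    [ $(( 0$_mode & 077 )) -ne 0 ]
}
```

For example, running `has_group_or_world_access /var/lib/pgsql/data/userdata && echo "group or world access"` in the failing pod would be expected to print the message if the mode is the cause.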

Expected results:

The pod should start successfully.
Additional info:
Comment 1 Wang Haoran 2015-11-10 03:03:42 EST
Version-Release number of selected component (if applicable):
openshift v3.1.0.3
kubernetes v1.1.0-origin-1107-g4c8e6f4
oc v3.1.0.3
openshift3/postgresql-92-rhel7    c10e6b2e643e
Comment 4 Ben Parees 2015-11-10 14:58:44 EST
Directories created in an EmptyDir volume behave differently from directories created in ephemeral storage.

EmptyDir volumes are getting a different default permission set and group ownership:

/var/lb/pgsql/data (not a typo) is mounted as EmptyDir:
bash-4.2$ mkdir /var/lb/pgsql/data/newdata
bash-4.2$ ls -l /var/lb/pgsql/data/
total 8
drwxr-sr-x. 2 1000040000 1000040000 4096 Nov 10 19:21 newdata



/var/lib/pgsql/data is ephemeral storage:

bash-4.2$ mkdir /var/lib/pgsql/data/newdata
bash-4.2$ ls -l /var/lib/pgsql/data/
total 8
drwxr-xr-x.  2 1000040000 root 4096 Nov 10 19:18 newdata


Having group-write permission on the created directory is causing postgres to throw an error (probably mysql and mongo too).

We can possibly fix this in the images (modify the permissions after creating the dirs), but I'd like the storage team to take a look first to see if this is really how we want EmptyDir to behave (I assume/hope it doesn't match how NFS behaves...)
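
The image-side fix mentioned above (modify the permissions after creating the directories) could look roughly like the following. This is only a sketch and the helper name is hypothetical. Note also that the log in the description shows the first server start succeeding after initdb fixed the permissions, so a one-time chmod may not be enough if the mode is changed again afterwards.

```shell
# Hypothetical helper: create a directory and restrict it to the owner,
# independent of the umask or of defaults inherited from the volume.
make_private_dir() {
    mkdir -p "$1" && chmod u=rwx,go= "$1"
}
```

Usage would be along the lines of `make_private_dir /var/lib/pgsql/data/userdata` in the image's startup script, before the server is launched.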
Comment 5 Mark Turansky 2015-11-10 15:03:41 EST
Reassigning to Paul Morie, as this is his feature and he understands the code.
Comment 6 Ben Parees 2015-11-10 16:02:24 EST
Confirmed that setting fsGroup and supplementalGroups to RunAsAny allows the postgres image to work with an EmptyDir volume again.
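
One way to make this change from the command line is sketched below. It assumes that `restricted` is the SCC governing the pod (the comment above does not say which SCC was changed), that the caller has cluster-admin rights, and that `oc patch` accepts this patch for SCC objects in the release in use. Both function names are illustrative.

```shell
# Print a JSON patch that sets both group strategies to RunAsAny.
scc_group_patch() {
    printf '%s\n' '{"fsGroup":{"type":"RunAsAny"},"supplementalGroups":{"type":"RunAsAny"}}'
}

# Apply the patch to the named SCC. The default target "restricted" is an
# assumption; pass a different SCC name as the first argument if needed.
relax_scc_groups() {
    oc patch scc "${1:-restricted}" -p "$(scc_group_patch)"
}
```

Calling `relax_scc_groups` with no argument would patch `restricted`; the result can be checked with `oc get scc`.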
Comment 7 Dan McPherson 2015-11-10 19:26:21 EST
The issue seen should be fixed with:

https://github.com/openshift/origin/pull/5839

Please open a new bug if not. I am leaving this bug open to track the longer-term issue raised here.
Comment 8 Wang Haoran 2015-11-11 00:08:14 EST
Verified with version atomic-openshift-3.1.0.4-1.git.0.064715c.el7aos
[root@openshift-137 ~]# oc get  scc
NAME         PRIV      CAPS      HOSTDIR   SELINUX     RUNASUSER          FSGROUP    SUPGROUP   PRIORITY
anyuid       false     []        false     MustRunAs   RunAsAny           RunAsAny   RunAsAny   10
hostaccess   false     []        true      MustRunAs   MustRunAsRange     RunAsAny   RunAsAny   <none>
hostmount    false     []        true      MustRunAs   MustRunAsRange     RunAsAny   RunAsAny   <none>
nonroot      false     []        false     MustRunAs   MustRunAsNonRoot   RunAsAny   RunAsAny   <none>
privileged   true      []        true      RunAsAny    RunAsAny           RunAsAny   RunAsAny   <none>
restricted   false     []        false     MustRunAs   MustRunAsRange     RunAsAny   RunAsAny   <none>
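
A check like the one above can be scripted. The sketch below reads `oc get scc` output on standard input and succeeds only if every row shows RunAsAny in both the FSGROUP and SUPGROUP columns. It locates the columns by header name and assumes that no column value contains whitespace. The function name is hypothetical.

```shell
# Hypothetical helper: exit 0 only if every SCC row read from stdin has
# RunAsAny in both the FSGROUP and SUPGROUP columns.
all_groups_run_as_any() {
    awk '
        NF == 0 { next }                      # skip blank lines
        !seen_header {                        # first non-blank line: find the columns
            for (i = 1; i <= NF; i++) {
                if ($i == "FSGROUP")  f = i
                if ($i == "SUPGROUP") s = i
            }
            seen_header = 1
            next
        }
        f == 0 || s == 0 || $f != "RunAsAny" || $s != "RunAsAny" { bad = 1 }
        END { exit (bad || !seen_header) }    # fail on a bad row or missing header
    '
}
```

For example: `oc get scc | all_groups_run_as_any && echo "all SCCs use RunAsAny for both group strategies"`.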
Comment 10 errata-xmlrpc 2016-01-26 14:17:10 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2016:0070
