| Summary: | Postgres8 Monitoring Service fails | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | W. de Heiden <wdh> | ||||||
| Component: | selinux-policy | Assignee: | Miroslav Grepl <mgrepl> | ||||||
| Status: | CLOSED DUPLICATE | QA Contact: | Milos Malik <mmalik> | ||||||
| Severity: | medium | Docs Contact: | |||||||
| Priority: | medium | ||||||||
| Version: | 6.2 | CC: | agk, ccaulfie, cluster-maint, dwalsh, jkortus, joherr, lhh, lnovich, mmalik, mtruneck, nkinder, rpeterso, sbradley, teigland, tmarshal | ||||||
| Target Milestone: | rc | ||||||||
| Target Release: | --- | ||||||||
| Hardware: | x86_64 | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | selinux-policy-3.7.19-160.el6 | Doc Type: | Bug Fix | ||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | |||||||||
| : | 875794 (view as bug list) | Environment: | |||||||
| Last Closed: | 2013-08-07 11:33:25 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Bug Depends On: | |||||||||
| Bug Blocks: | 875794 | ||||||||
| Attachments: |
|
||||||||
I found a third issue as well. After simulating a hard crash (power off) in a virtual environment, Postgres doesn't fail over. Because of the hard crash, the PID file remains, causing Postgres to fail to start on the failover node:

FATAL: lock file "postmaster.pid" already exists
HINT: Is another postmaster (PID 1755) running in data directory "/var/lib/pgsql/data"?
FATAL: bogus data in lock file "postmaster.pid": ""
FATAL: bogus data in lock file "postmaster.pid": ""

Removing the PID file (/var/lib/pgsql/data/postmaster.pid) manually makes it possible to start Postgres. This is unacceptable in an HA setup; Postgres should fail over automatically.

Both PID-file issues seem to be SELinux related. Disabling SELinux or setting it to permissive solves them:
- No more repeated stopping/starting of Postgres.
- After a hard power-off of a server, it fails over correctly.

What avc messages are you seeing?

ausearch -m avc -ts recent
time->Tue May 22 09:59:06 2012
type=SYSCALL msg=audit(1337673546.661:198): arch=c000003e syscall=2 success=no exit=-13 a0=1312bb0 a1=241 a2=1b6 a3=0 items=0 ppid=15715 pid=15717 auid=0 uid=26 gid=26 euid=26 suid=26 fsuid=26 egid=26 sgid=26 fsgid=26 tty=(none) ses=1 comm="postmaster" exe="/usr/bin/postgres" subj=unconfined_u:system_r:postgresql_t:s0 key=(null)
type=AVC msg=audit(1337673546.661:198): avc: denied { write } for pid=15717 comm="postmaster" name="postgres-8:postgres_server.pid" dev=dm-0 ino=142382 scontext=unconfined_u:system_r:postgresql_t:s0 tcontext=unconfined_u:object_r:rgmanager_var_run_t:s0 tclass=file
----
time->Tue May 22 10:00:16 2012
type=SYSCALL msg=audit(1337673616.469:211): arch=c000003e syscall=137 success=yes exit=0 a0=2305714 a1=7f8abb5fd440 a2=37011b3dc8 a3=7f8abb5fd2b8 items=0 ppid=1 pid=1670 auid=4294967295 uid=175 gid=175 euid=175 suid=175 fsuid=175 egid=175 sgid=175 fsgid=175 tty=(none) ses=4294967295 comm="rhev-agentd.py" exe="/usr/bin/python" subj=system_u:system_r:rhev_agentd_t:s0 key=(null)
type=AVC msg=audit(1337673616.469:211): avc: denied { search } for pid=1670 comm="rhev-agentd.py" name="/" dev=sysfs ino=1 scontext=system_u:system_r:rhev_agentd_t:s0 tcontext=system_u:object_r:sysfs_t:s0 tclass=dir
----
time->Tue May 22 10:00:16 2012
type=SYSCALL msg=audit(1337673616.469:212): arch=c000003e syscall=4 success=yes exit=0 a0=2305714 a1=7f8abb5fd3b0 a2=7f8abb5fd3b0 a3=7f8abb5fd2b8 items=0 ppid=1 pid=1670 auid=4294967295 uid=175 gid=175 euid=175 suid=175 fsuid=175 egid=175 sgid=175 fsgid=175 tty=(none) ses=4294967295 comm="rhev-agentd.py" exe="/usr/bin/python" subj=system_u:system_r:rhev_agentd_t:s0 key=(null)
type=AVC msg=audit(1337673616.469:212): avc: denied { getattr } for pid=1670 comm="rhev-agentd.py" path="/sys/kernel/config" dev=configfs ino=42210 scontext=system_u:system_r:rhev_agentd_t:s0 tcontext=system_u:object_r:configfs_t:s0 tclass=dir
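The raw records above are hard to scan by eye. As a reading aid, here is a minimal sketch that reduces each AVC record to one line showing the command, the denied permissions, the object class, and the source and target types. The function name summarize_avc is hypothetical, and the pattern assumes the field order seen in the records in this report (comm, then scontext, tcontext, tclass).

```shell
#!/bin/sh
# Reading aid only: print one line per AVC denial read from stdin, as
#   <comm>: <permissions> on <tclass> (<source type> -> <target type>)
# SYSCALL records, timestamps and "----" separators are skipped.
# Assumes the field order comm ... scontext ... tcontext ... tclass.
summarize_avc() {
    sed -n 's/^type=AVC .*denied  *{ *\([^}]*[^ }]\) *} .*comm="\([^"]*\)".*scontext=[^:]*:[^:]*:\([^:]*\):.*tcontext=[^:]*:[^:]*:\([^:]*\):.*tclass=\([a-z_]*\).*$/\2: \1 on \5 (\3 -> \4)/p'
}
```

It would be used as ausearch -m avc -ts recent | summarize_avc. For the first denial in this report, the interesting part is the target type: postmaster is writing to a file labeled rgmanager_var_run_t.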
All of this was generated on a brand-new cluster installation with the most recent updates. Attached is a file with all the error messages from ausearch -m avc -ts recent.
Created attachment 585957 [details]
"ausearch -m avc -ts recent"
Output from ausearch -m avc -ts recent
The problem is that postgres-8:postgres_server.pid is labeled rgmanager_var_run_t. I will backport some fixes from Fedora which could fix this problem.

I added a fix.

I'm getting the following denials with the default postgresql service (selinux-policy-3.7.19-194.el6.noarch). Some of these may be caused by the fact that the config is (by default) generated to /etc/cluster/postgres-8/.../postgresql.conf.
audit2allow result:
#============= postgresql_t ==============
allow postgresql_t cluster_conf_t:dir search;
allow postgresql_t cluster_conf_t:file { read getattr open };
#!!!! The source type 'postgresql_t' can write to a 'file' of the following types:
# postgresql_tmp_t, postgresql_log_t, hugetlbfs_t, postgresql_lock_t, postgresql_db_t, security_t, faillog_t, lastlog_t, pcscd_var_run_t, postgresql_var_run_t, root_t, security_t, krb5_host_rcache_t
allow postgresql_t rgmanager_var_run_t:file { write getattr open };
raw denials:
time->Wed Jan 16 11:16:25 2013
type=SYSCALL msg=audit(1358356585.171:202): arch=c000003e syscall=4 success=no exit=-13 a0=dc4ad0 a1=7fff2662d4d0 a2=7fff2662d4d0 a3=7365726774736f70 items=0 ppid=11164 pid=11166 auid=0 uid=26 gid=26 euid=26 suid=26 fsuid=26 egid=26 sgid=26 fsgid=26 tty=(none) ses=2 comm="postmaster" exe="/usr/bin/postgres" subj=unconfined_u:system_r:postgresql_t:s0 key=(null)
type=AVC msg=audit(1358356585.171:202): avc: denied { search } for pid=11166 comm="postmaster" name="cluster" dev=dm-0 ino=24637 scontext=unconfined_u:system_r:postgresql_t:s0 tcontext=system_u:object_r:cluster_conf_t:s0 tclass=dir
----
time->Wed Jan 16 11:18:03 2013
type=SYSCALL msg=audit(1358356683.748:216): arch=c000003e syscall=4 success=yes exit=0 a0=1f0dad0 a1=7fffea491800 a2=7fffea491800 a3=7365726774736f70 items=0 ppid=12123 pid=12125 auid=0 uid=26 gid=26 euid=26 suid=26 fsuid=26 egid=26 sgid=26 fsgid=26 tty=(none) ses=2 comm="postmaster" exe="/usr/bin/postgres" subj=unconfined_u:system_r:postgresql_t:s0 key=(null)
type=AVC msg=audit(1358356683.748:216): avc: denied { getattr } for pid=12125 comm="postmaster" path="/etc/cluster/postgres-8/postgres-8:pgsql/postgresql.conf" dev=dm-0 ino=65346 scontext=unconfined_u:system_r:postgresql_t:s0 tcontext=unconfined_u:object_r:cluster_conf_t:s0 tclass=file
type=AVC msg=audit(1358356683.748:216): avc: denied { search } for pid=12125 comm="postmaster" name="postgres-8" dev=dm-0 ino=65343 scontext=unconfined_u:system_r:postgresql_t:s0 tcontext=unconfined_u:object_r:cluster_conf_t:s0 tclass=dir
type=AVC msg=audit(1358356683.748:216): avc: denied { search } for pid=12125 comm="postmaster" name="cluster" dev=dm-0 ino=24637 scontext=unconfined_u:system_r:postgresql_t:s0 tcontext=system_u:object_r:cluster_conf_t:s0 tclass=dir
----
time->Wed Jan 16 11:18:03 2013
type=SYSCALL msg=audit(1358356683.748:217): arch=c000003e syscall=2 success=yes exit=3 a0=1f0dad0 a1=0 a2=1b6 a3=0 items=0 ppid=12123 pid=12125 auid=0 uid=26 gid=26 euid=26 suid=26 fsuid=26 egid=26 sgid=26 fsgid=26 tty=(none) ses=2 comm="postmaster" exe="/usr/bin/postgres" subj=unconfined_u:system_r:postgresql_t:s0 key=(null)
type=AVC msg=audit(1358356683.748:217): avc: denied { open } for pid=12125 comm="postmaster" name="postgresql.conf" dev=dm-0 ino=65346 scontext=unconfined_u:system_r:postgresql_t:s0 tcontext=unconfined_u:object_r:cluster_conf_t:s0 tclass=file
type=AVC msg=audit(1358356683.748:217): avc: denied { read } for pid=12125 comm="postmaster" name="postgresql.conf" dev=dm-0 ino=65346 scontext=unconfined_u:system_r:postgresql_t:s0 tcontext=unconfined_u:object_r:cluster_conf_t:s0 tclass=file
----
time->Wed Jan 16 11:18:04 2013
type=SYSCALL msg=audit(1358356684.139:218): arch=c000003e syscall=2 success=yes exit=5 a0=1f0db40 a1=241 a2=1b6 a3=0 items=0 ppid=12123 pid=12125 auid=0 uid=26 gid=26 euid=26 suid=26 fsuid=26 egid=26 sgid=26 fsgid=26 tty=(none) ses=2 comm="postmaster" exe="/usr/bin/postgres" subj=unconfined_u:system_r:postgresql_t:s0 key=(null)
type=AVC msg=audit(1358356684.139:218): avc: denied { open } for pid=12125 comm="postmaster" name="postgres-8:pgsql.pid" dev=dm-0 ino=65345 scontext=unconfined_u:system_r:postgresql_t:s0 tcontext=unconfined_u:object_r:rgmanager_var_run_t:s0 tclass=file
type=AVC msg=audit(1358356684.139:218): avc: denied { write } for pid=12125 comm="postmaster" name="postgres-8:pgsql.pid" dev=dm-0 ino=65345 scontext=unconfined_u:system_r:postgresql_t:s0 tcontext=unconfined_u:object_r:rgmanager_var_run_t:s0 tclass=file
----
time->Wed Jan 16 11:18:04 2013
type=SYSCALL msg=audit(1358356684.140:219): arch=c000003e syscall=5 success=yes exit=0 a0=5 a1=7fffea490fd0 a2=7fffea490fd0 a3=0 items=0 ppid=12123 pid=12125 auid=0 uid=26 gid=26 euid=26 suid=26 fsuid=26 egid=26 sgid=26 fsgid=26 tty=(none) ses=2 comm="postmaster" exe="/usr/bin/postgres" subj=unconfined_u:system_r:postgresql_t:s0 key=(null)
type=AVC msg=audit(1358356684.140:219): avc: denied { getattr } for pid=12125 comm="postmaster" path="/var/run/cluster/postgres-8/postgres-8:pgsql.pid" dev=dm-0 ino=65345 scontext=unconfined_u:system_r:postgresql_t:s0 tcontext=unconfined_u:object_r:rgmanager_var_run_t:s0 tclass=file
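To see how the raw denials above collapse into the audit2allow result, here is a rough sketch that groups AVC records by source type, target type and class, and prints one allow rule per group. The function name avc_to_allow_rules is hypothetical, and this only imitates the grouping step; the real audit2allow does considerably more. It is meant for reading the logs, not as a policy recommendation.

```shell
#!/bin/sh
# Rough imitation of the grouping audit2allow performs: read audit
# records on stdin and print one allow rule per
# (source type, target type, class) triple, in first-seen order.
avc_to_allow_rules() {
    awk '
    /^type=AVC / && / denied / {
        # Permissions sit between the braces, e.g. "{ read getattr }".
        if (!match($0, /[{][^}]*[}]/)) next
        perms = substr($0, RSTART + 1, RLENGTH - 2)

        src = tgt = cls = ""
        for (i = 1; i <= NF; i++) {
            # The SELinux type is the third colon-separated part of a context.
            if ($i ~ /^scontext=/) { split($i, c, ":"); src = c[3] }
            if ($i ~ /^tcontext=/) { split($i, c, ":"); tgt = c[3] }
            if ($i ~ /^tclass=/)   { cls = substr($i, 8) }
        }
        if (src == "" || tgt == "" || cls == "") next

        key = src " " tgt ":" cls
        if (!(key in seen)) { seen[key] = 1; order[++n] = key }

        # Record each permission once per triple, keeping first-seen order.
        m = split(perms, p, " ")
        for (j = 1; j <= m; j++) {
            if (!((key, p[j]) in has)) {
                has[key, p[j]] = 1
                list[key] = list[key] " " p[j]
            }
        }
    }
    END {
        for (k = 1; k <= n; k++)
            printf "allow %s {%s };\n", order[k], list[order[k]]
    }'
}
```

Fed the denials against the PID file, this yields a single rule for postgresql_t on rgmanager_var_run_t:file, which makes it easier to see that every one of those denials comes from the label on the PID file.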
cluster.conf snippet:
<rm>
  <resources>
    <postgres-8 config_file="/var/lib/pgsql/data/postgresql.conf" name="pgsql" postmaster_user="postgres" shutdown_wait="2"/>
    <ip address="192.168.102.101/24" sleeptime="10"/>
  </resources>
  <service autostart="0" name="pgsql" recovery="relocate">
    <postgres-8 ref="pgsql"/>
    <ip ref="192.168.102.101/24"/>
  </service>
</rm>
The databases were initialized via "service postgresql initdb", other than that everything is default after "yum install postgresql-server postgresql"
05aab5c230e27092d4de8bcd67d103dcef0ac1dd
6e4c91f3e6dbaf3ba5e4e0766936f00c53a63b84
2fd3610542693ffa08226499b9697916a06ea0cd

Fixes this in Rawhide. Miroslav, can you get this built into a RHEL package?

*** This bug has been marked as a duplicate of bug 915151 *** |
Created attachment 578684 [details]
cluster.conf with whitespace in service name

Description of problem:
Creating a postgres8 service with whitespace in the service name causes a problem when checking the PID file. Example:

ccs -h node1 --addsubservice postgres ip:fs:postgres-8 name="postgres server"

Starting the postgres8 service doesn't write a PID to the PID file, and service monitoring reports that it is not running although it is.

Version-Release number of selected component (if applicable):
resource-agents-3.9.2-7.el6.x86_64

How reproducible:

Steps to Reproduce:
1. Create a service like ccs -h node1 --addsubservice postgres ip:fs:postgres-8 name="postgres server". It will not complain about the whitespace. /var/log/cluster/rgmanager.log, however, will show:

Apr 18 19:25:53 rgmanager [postgres-8] Checking Existence Of File /var/lib/pgsql/data/postgresql.conf [postgres-8:postgres server] > Failed - File Is No
Apr 18 19:25:53 rgmanager [postgres-8] Verifying Configuration Of postgres-8:postgres server > Failed
Apr 18 19:25:53 rgmanager start on postgres-8 "postgres server" returned 2 (invalid argument(s))

Removing the whitespace fixes the problem. ccs, however, should warn when the service is created.

2. Missing PID: /var/log/cluster/rgmanager.log will show:

Apr 19 17:35:48 rgmanager Service service:postgres started
Apr 19 17:36:52 rgmanager [postgres-8] Verifying Configuration Of postgres-8:postgres_server
Apr 19 17:36:52 rgmanager [postgres-8] Verifying Configuration Of postgres-8:postgres_server > Succeed
Apr 19 17:36:52 rgmanager [postgres-8] Monitoring Service postgres-8:postgres_server
Apr 19 17:36:52 rgmanager [postgres-8] Monitoring Service postgres-8:postgres_server > Service Is Not Running
Apr 19 17:36:52 rgmanager status on postgres-8 "postgres_server" returned 1 (generic error)
Apr 19 17:36:52 rgmanager Stopping service service:postgres
Apr 19 17:36:52 rgmanager [postgres-8] Verifying Configuration Of postgres-8:postgres_server
Apr 19 17:36:52 rgmanager [postgres-8] Verifying Configuration Of postgres-8:postgres_server > Succeed
Apr 19 17:36:52 rgmanager [postgres-8] Stopping Service postgres-8:postgres_server
Apr 19 17:36:52 rgmanager [postgres-8] Stopping Service postgres-8:postgres_server > Succeed

Postgres now repeatedly stops and starts. The problem is caused by the script /usr/share/cluster/utils/ra-skelet.sh, which not only checks for the existence of the PID file but also tries to read the PID from it:

read pid < "$pid_file"
if [ -z "$pid" ]; then
    return $OCF_ERR_GENERIC
fi

Since there is no PID, the check fails.
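The check quoted above treats an empty PID file the same as a stopped service, and the hard power-off case in this report adds a third situation: a PID file that names a process which no longer exists. Here is a minimal sketch of a check that tells these cases apart, assuming a POSIX shell. The function name pid_file_state and the state names are hypothetical and not part of ra-skelet.sh.

```shell
#!/bin/sh
# Classify a PID file. Prints one of: missing, empty, invalid, stale, running.
# Returns 0 only when the recorded process exists.
pid_file_state() {
    pid_file=$1

    if [ ! -e "$pid_file" ]; then
        echo missing
        return 1
    fi

    # read returns non-zero on an empty file (or a last line without a
    # newline); "|| true" lets us inspect whatever was read either way.
    pid=
    read -r pid < "$pid_file" || true

    case $pid in
        '')       echo empty;   return 1 ;;
        *[!0-9]*) echo invalid; return 1 ;;
    esac

    # kill -0 delivers no signal; it only reports whether the PID can be
    # signalled. Note: it also fails for another user's process unless
    # the caller is root.
    if kill -0 "$pid" 2>/dev/null; then
        echo running
        return 0
    fi

    echo stale
    return 1
}
```

With a check like this, the empty file seen under SELinux enforcing would show up as "empty" and a PID file left over from a crash as "stale", instead of both being reported as a generic error.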
Making a quick change to /usr/share/cluster/postgres-8.sh seems to solve the problem:

### Bugfix
pgrep -u $OCF_RESKEY_postmaster_user -f /usr/bin/postmaster > $PSQL_pid_file
### Bugfix

/var/log/cluster/rgmanager.log now shows:

Apr 19 17:43:12 rgmanager [postgres-8] Generating New Config File /postgres-8/postgres-8:postgres_server/postgresql.conf From /var/lib/pgsql/data/postgr
Apr 19 17:43:12 rgmanager [postgres-8] Generating New Config File /postgres-8/postgres-8:postgres_server/postgresql.conf From /var/lib/pgsql/data/postgr
Apr 19 17:43:12 rgmanager [mysql] Starting Service mysql:mysql_server > Succeed
Apr 19 17:43:14 rgmanager [postgres-8] Starting Service postgres-8:postgres_server > Succeed
Apr 19 17:43:14 rgmanager Service service:postgres started
Apr 19 17:44:17 rgmanager [postgres-8] Verifying Configuration Of postgres-8:postgres_server
Apr 19 17:44:17 rgmanager [postgres-8] Verifying Configuration Of postgres-8:postgres_server > Succeed
Apr 19 17:44:17 rgmanager [postgres-8] Monitoring Service postgres-8:postgres_server
Apr 19 17:44:17 rgmanager [postgres-8] Monitoring Service postgres-8:postgres_server > Service Is Running