Hide Forgot
Overcloud deployment fails in both 1 and 3 controller setup(s) (without containers): 22:35:21 Error: /Stage[main]/Tripleo::Profile::Pacemaker::Database::Mysql/Exec[galera-ready]: Failed to call refresh: /usr/bin/clustercheck >/dev/null returned 1 instead of one of [0] 22:35:21 Error: /Stage[main]/Tripleo::Profile::Pacemaker::Database::Mysql/Exec[galera-ready]: /usr/bin/clustercheck >/dev/null returned 1 instead of one of [0] 22:35:21 Error: Failed to apply catalog: Execution of '/usr/bin/mysql -NBe SELECT CONCAT(User, '@',Host) AS User FROM mysql.user' returned 1: ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2) Snapshot: 2017-06-21.5 openstack-tripleo-heat-templates -7.0.0-0.20170616123155.el7ost puppet-tripleo-7.1.1-0.20170615141731.el7ost
looking at the captured var/log/cluster/corosync.log on the overcloud, I see: Jun 21 22:04:49 galera(galera)[23595]: INFO: now attempting to detect last commit version using 'mysqld_safe --wsrep-recover' Jun 21 22:04:49 galera(galera)[23595]: ERROR: Unable to detect last known write sequence number Jun 21 18:04:49 [16546] controller-0 lrmd: notice: operation_finished: galera_start_0:23595:stderr [ cat: /var/lib/mysql/grastate.dat: No such file or directory ] Jun 21 18:04:49 [16546] controller-0 lrmd: notice: operation_finished: galera_start_0:23595:stderr [ ocf-exit-reason:Unable to detect last known write sequence number ] Which shows that the galera resource agent could not parse the initial empty last-seqno from mysql. Consequently resource start operation failed, and galera never went up. hence the reported clustercheck failure. At this stage it's unclear why. Investigation ongoing
OK, it's probably a SELinux issue: [0;31mtype=AVC msg=audit(1498082689.658:317): avc: denied { open } for pid=24495 comm="touch" path="/tmp/tmp.cgvP0Qe4oc" dev="vda2" ino=910 scontext=system_u:system_r:mysqld_safe_t:s0 tcontext=system_u:object_r:cluster_tmp_t:s0 tclass=file[0m [0;31mtype=AVC msg=audit(1498082689.659:318): avc: denied { setattr } for pid=24496 comm="chown" name="tmp.cgvP0Qe4oc" dev="vda2" ino=910 scontext=system_u:system_r:mysqld_safe_t:s0 tcontext=system_u:object_r:cluster_tmp_t:s0 tclass=file[0m [0;31mtype=AVC msg=audit(1498082689.682:319): avc: denied { open } for pid=23728 comm="mysqld_safe" path="/tmp/tmp.cgvP0Qe4oc" dev="vda2" ino=910 scontext=system_u:system_r:mysqld_safe_t:s0 tcontext=system_u:object_r:cluster_tmp_t:s0 tclass=file[0m [0;31mtype=AVC msg=audit(1498082689.682:320): avc: denied { open } for pid=23728 comm="mysqld_safe" path="/tmp/tmp.cgvP0Qe4oc" dev="vda2" ino=910 scontext=system_u:system_r:mysqld_safe_t:s0 tcontext=system_u:object_r:cluster_tmp_t:s0 tclass=file[0m [0;31mtype=AVC msg=audit(1498082689.689:321): avc: denied { open } for pid=23728 comm="mysqld_safe" path="/tmp/tmp.cgvP0Qe4oc" dev="vda2" ino=910 scontext=system_u:system_r:mysqld_safe_t:s0 tcontext=system_u:object_r:cluster_tmp_t:s0 tclass=file[0m [0;31mtype=AVC msg=audit(1498082689.689:322): avc: denied { open } for pid=23728 comm="mysqld_safe" path="/tmp/tmp.cgvP0Qe4oc" dev="vda2" ino=910 scontext=system_u:system_r:mysqld_safe_t:s0 tcontext=system_u:object_r:cluster_tmp_t:s0 tclass=file[0m [0;31mtype=AVC msg=audit(1498082689.691:323): avc: denied { open } for pid=24546 comm="mysqld_safe" path="/tmp/tmp.cgvP0Qe4oc" dev="vda2" ino=910 scontext=system_u:system_r:mysqld_safe_t:s0 tcontext=system_u:object_r:cluster_tmp_t:s0 tclass=file[0m [0;31mtype=AVC msg=audit(1498082689.691:324): avc: denied { open } for pid=24546 comm="mysqld_safe" path="/tmp/tmp.cgvP0Qe4oc" dev="vda2" ino=910 scontext=system_u:system_r:mysqld_safe_t:s0 tcontext=system_u:object_r:cluster_tmp_t:s0 tclass=file[0m [0;31mtype=AVC msg=audit(1498082689.695:325): avc: denied { open } for pid=23728 comm="mysqld_safe" path="/tmp/tmp.cgvP0Qe4oc" dev="vda2" ino=910 scontext=system_u:system_r:mysqld_safe_t:s0 tcontext=system_u:object_r:cluster_tmp_t:s0 tclass=file[0m [0;31mtype=AVC msg=audit(1498082689.695:326): avc: denied { open } for pid=23728 comm="mysqld_safe" path="/tmp/tmp.cgvP0Qe4oc" dev="vda2" ino=910 scontext=system_u:system_r:mysqld_safe_t:s0 tcontext=system_u:object_r:cluster_tmp_t:s0 tclass=file[0m [0;31m[0m [0;36m...ignoring[0m The temporary file that the resource agent needs to create to hold the output of mysqld_safe is being denied. SELinux issue seems similar to https://bugzilla.redhat.com/show_bug.cgi?id=1439182
Attila, could you tell if openstack-selinux-0.8.5-6.el7ost or greater was available in the overcloud image that you used for deployment?
I have exactly the same error. # rpm -qa openstack-selinux openstack-selinux-0.8.7-1.el7ost.noarch
Lon can you peek at the AVC and if this is real? Also can we get the job run in permissive and capture the /var/log/audit/audit.log file from that run please?
Created attachment 1290761 [details] selinux audit.log from enforcing run on controller
Are there logs from a permissive run? Without them, this is going to end up being whack-a-mole with AVCs... On the surface, this looks like a base selinux-policy issue concerning clustering and management of mysqld. Granting access to cluster_tmp_t isn't likely to be dangerous, but, doing it within openstack-selinux should be a last resort (as this will likely affect RHEL customers who are not using OpenStack at all)
Hmm. [root@localhost ~]# audit2why type=AVC msg=audit(1498082689.658:317): avc: denied { open } for pid=24495 comm="touch" path="/tmp/tmp.cgvP0Qe4oc" dev="vda2" ino=910 scontext=system_u:system_r:mysqld_safe_t:s0 tcontext=system_u:object_r:cluster_tmp_t:s0 tclass=file type=AVC msg=audit(1498082689.658:317): avc: denied { open } for pid=24495 comm="touch" path="/tmp/tmp.cgvP0Qe4oc" dev="vda2" ino=910 scontext=system_u:system_r:mysqld_safe_t:s0 tcontext=system_u:object_r:cluster_tmp_t:s0 tclass=file Was caused by: Unknown - would be allowed by active policy Possible mismatch between this policy and the one under which the audit message was generated. Possible mismatch between current in-memory boolean settings vs. permanent ones. [root@localhost ~]# rpm -q selinux-policy selinux-policy-3.13.1-102.el7_3.15.noarch [root@localhost ~]# rpm -q openstack-selinux openstack-selinux-0.8.6-2.el7ost.noarch
Updating to 0.8.7-1 causes this to begin occurring.
https://github.com/redhat-openstack/openstack-selinux/commit/74ddc0e5317694ccbab7997f85cb862b8d01c003
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:3462