Bug 1433714 - Galera fails to start on controller during split stack deployment
Summary: Galera fails to start on controller during split stack deployment
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-selinux
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 11.0 (Ocata)
Assignee: Ryan Hallisey
QA Contact: Udi Shkalim
URL:
Whiteboard:
Depends On:
Blocks: 1337784 1432353
Reported: 2017-03-19 12:47 UTC by Gurenko Alex
Modified: 2017-03-23 11:30 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-23 11:30:47 UTC
Target Upstream Version:
Embargoed:


Attachments:
AVC errors from the controller (2.85 KB, text/plain)
2017-03-19 12:47 UTC, Gurenko Alex

Description Gurenko Alex 2017-03-19 12:47:31 UTC
Created attachment 1264613 [details]
AVC errors from the controller

Description of problem: Galera fails to start on the controller, which causes clustercheck to fail and the deployment to fail.


Version-Release number of selected component (if applicable):


How reproducible:

Follow the deployment described in the parent bug.


Steps to Reproduce:
1. https://polarion.engineering.redhat.com/polarion/#/project/RHELOpenStackPlatform/workitem?id=RHELOSP-20870
2. Wait until the deployment fails.
3. Log in to the controller and check pcs status.

Actual results:

pcs status returns:

Failed Actions:
* galera_start_0 on controller-0 'unknown error' (1): call=7, status=complete, exitreason='Unable to detect last known write sequence number',
    last-rc-change='Thu Mar 16 14:03:27 2017', queued=1ms, exec=4803ms

Expected results:

No galera issues during the deployment

Additional info:

The following commands result in AVC denied messages and do not restore Galera:

mysqld_safe --user=mysql --datadir=/var/lib/mysql --tc-heuristic-recover commit --wsrep-recover
pcs resource cleanup galera
pcs resource enable galera

After running setenforce 0 and re-running the cleanup and enable commands, the service recovers and clustercheck passes with a 200 OK status.

# clustercheck
HTTP/1.1 200 OK
Content-Type: text/plain
Connection: close
Content-Length: 32

Galera cluster node is synced.
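
The clustercheck result above can also be tested from a script. The following is a minimal sketch, assuming the response has the shape shown in this report; the helper name galera_synced is hypothetical and is not part of clustercheck or any package.

```shell
#!/bin/sh
# galera_synced (hypothetical helper): reads a clustercheck response on stdin
# and succeeds only if the first line is an HTTP 200 OK status line.
galera_synced() {
    status_line=
    # Only the first line carries the HTTP status; tolerate a missing newline.
    IFS= read -r status_line || [ -n "$status_line" ] || return 1
    case "$status_line" in
        # The trailing * also accepts a carriage return at the end of the line.
        "HTTP/1.1 200 OK"*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a controller it could be used as `clustercheck | galera_synced && echo "node is synced"`.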

Comment 1 Ryan Hallisey 2017-03-19 15:22:29 UTC
I'm not sure why there are AVCs from cluster_tmp_t when the boolean daemons_enable_cluster_mode is turned on by openstack-selinux. Maybe it's not on?
  `getsebool daemons_enable_cluster_mode`

type=AVC msg=audit(1489672619.764:107): avc:  denied  { getattr } for  pid=10148 comm="ovs-ctl" path="/usr/bin/hostname" dev="vda1" ino=8522292 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:hostname_exec_t:s0 tclass=file

There's also the AVC from openvswitch, but I don't know if that's breaking anything here.
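
For readers comparing several denials, the fields of an AVC record such as the one above can be pulled apart with standard tools. This is a rough sketch only: the helper names avc_field and avc_perms are hypothetical, and the parsing assumes field values contain no spaces, which holds for the record quoted here but not for every path.

```shell
#!/bin/sh
# avc_field (hypothetical helper): print the value of one key=value field
# from an AVC record. Usage: avc_field <key> <record>
avc_field() {
    key=$1
    record=$2
    # Split on spaces, keep the token that starts with "key=", drop quotes.
    printf '%s\n' "$record" | tr ' ' '\n' | sed -n "s/^${key}=//p" | tr -d '"' | head -n 1
}

# avc_perms (hypothetical helper): print the denied permission(s) that
# appear between the braces of an AVC record.
avc_perms() {
    printf '%s\n' "$1" | sed -n 's/.*{ \(.*\) }.*/\1/p'
}

# The openvswitch denial quoted in comment 1.
record='type=AVC msg=audit(1489672619.764:107): avc:  denied  { getattr } for  pid=10148 comm="ovs-ctl" path="/usr/bin/hostname" dev="vda1" ino=8522292 scontext=system_u:system_r:openvswitch_t:s0 tcontext=system_u:object_r:hostname_exec_t:s0 tclass=file'

echo "denied: $(avc_perms "$record")"
echo "source: $(avc_field scontext "$record")"
echo "target: $(avc_field tcontext "$record")"
echo "class:  $(avc_field tclass "$record")"
```

The source context (scontext) identifies the confined process and the target context (tcontext) the object it was denied access to, so sorting denials by these two fields separates the openvswitch denial from any Galera or cluster ones.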

Comment 2 Gurenko Alex 2017-03-19 18:06:10 UTC
(In reply to Ryan Hallisey from comment #1)
> I'm not sure why there are AVCs from cluster_tmp_t when the boolean
> daemons_enable_cluster_mode is turned on by openstack-selinux. Maybe it's
> not on?
>   `getsebool daemons_enable_cluster_mode`
> 
> type=AVC msg=audit(1489672619.764:107): avc:  denied  { getattr } for 
> pid=10148 comm="ovs-ctl" path="/usr/bin/hostname" dev="vda1" ino=8522292
> scontext=system_u:system_r:openvswitch_t:s0
> tcontext=system_u:object_r:hostname_exec_t:s0 tclass=file
> 
> There's also the AVC from openvswitch, but I don't know if that's breaking
> anything here.

There is a separate bug for the openvswitch AVC, but I guess I have not yet reached the point where it becomes a problem. I will check the boolean.
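
Checking the boolean can be scripted as well. A minimal sketch, assuming getsebool prints its usual `name --> on` or `name --> off` line; the helper name sebool_is_on is hypothetical.

```shell
#!/bin/sh
# sebool_is_on (hypothetical helper): succeed if a line of getsebool output
# reports the boolean as on.
# Usage: sebool_is_on "<line printed by getsebool>"
sebool_is_on() {
    case "$1" in
        *" --> on") return 0 ;;
        *) return 1 ;;
    esac
}
```

On the controller this could be called as `sebool_is_on "$(getsebool daemons_enable_cluster_mode)" || echo "boolean is off"`. It only reports the state of the boolean and does not change it.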

Comment 3 Dan Macpherson 2017-03-23 07:19:27 UTC
Confirming this issue. I hit the same error and the "setenforce 0" corrects it. Definitely seems to be an SELinux issue.

Comment 4 James Slagle 2017-03-23 11:30:33 UTC
This was actually caused by https://bugzilla.redhat.com/show_bug.cgi?id=1434996

