Bug 1236980 - [SELinux]: RHEL 7.1 CTDB node goes to DISCONNECTED/BANNED state when multiple nodes are rebooted
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: samba
Version: 3.1
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.1.0
Assigned To: Jose A. Rivera
QA Contact: surabhi
Keywords: Regression
Depends On: 1224879
Blocks: 1202842 1212796 1241095
Reported: 2015-06-30 02:21 EDT by surabhi
Modified: 2015-08-04 10:14 EDT (History)
CC List: 12 users

See Also:
Fixed In Version: glusterfs-3.7.1-10, selinux-policy-3.13.1-33.el7
Doc Type: Bug Fix
Doc Text:
After multiple CTDB cluster nodes were rebooted one after another while I/O from a Windows client was running, the status of the cluster was incorrectly displayed as UNHEALTHY and the status of the nodes as BANNED or DISCONNECTED. With this update, the related SELinux policy no longer prevents signal transmission between the CTDB cluster and certain Samba processes. As a result, the status of the cluster and the nodes displays properly in the above situation.
Story Points: ---
Clone Of:
: 1241095 (view as bug list)
Environment:
Last Closed: 2015-07-29 01:08:25 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments


External Trackers
Tracker ID: Red Hat Product Errata RHSA-2015:1495
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Gluster Storage 3.1 update
Last Updated: 2015-07-29 04:26:26 EDT

Flags: None
Description surabhi 2015-06-30 02:21:24 EDT
Description of problem:

The CTDB cluster does not return to a healthy state when multiple nodes are rebooted one after the other while I/O is running from a Windows client.

1st time:
**************
In a 4-node CTDB cluster, when two nodes were rebooted one after the other, the rebooted nodes came back but remained in UNHEALTHY state, and the two other nodes went to BANNED state.

2nd time:
************
In a 4-node CTDB cluster, when two nodes were rebooted one after the other, the rebooted nodes came back but remained in UNHEALTHY state, and the two other nodes went to DISCONNECTED state.

This happens even without running I/O.

Version-Release number of selected component (if applicable):
ctdb2.5-2.5.5-2.el7rhgs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create a CTDB setup.
2. Mount the volume using the VIP.
3. Start I/O from a Windows client.
4. Reboot node 1; check ctdb status.
5. Reboot node 3; check ctdb status.
6. Wait for both nodes to come up; check ctdb status.
7. ctdb status shows the nodes in UNHEALTHY/DISCONNECTED state.
8. In one scenario a node goes to BANNED state.
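The status checks in steps 4-7 can be scripted. Below is a minimal sketch that flags any node not in OK state; the here-doc sample and IP addresses are illustrative stand-ins, and on a live node `status="$(ctdb status)"` would replace them (the parsing assumes the stock `ctdb status` line format):

```shell
# Flag CTDB nodes that are not in OK state.
# The sample output below stands in for: status="$(ctdb status)"
status='Number of nodes:4
pnn:0 10.70.37.1     OK (THIS NODE)
pnn:1 10.70.37.2     UNHEALTHY
pnn:2 10.70.37.3     DISCONNECTED
pnn:3 10.70.37.4     BANNED'

# Print the address of every node whose state column is not OK
bad_nodes=$(printf '%s\n' "$status" | awk '/^pnn:/ && $3 != "OK" {print $2}')

if [ -n "$bad_nodes" ]; then
    printf 'Nodes not OK:\n%s\n' "$bad_nodes"
else
    echo "All nodes OK"
fi
```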

Actual results:
CTDB cluster is UNHEALTHY.
Nodes go to BANNED/DISCONNECTED state.

Expected results:

Once all the nodes come up, the cluster should be up and all nodes should be in OK state.

Additional info:

When the test was run in SELinux enforcing mode, there were AVC denials related to ctdb and iptables:
type=AVC msg=audit(06/30/2015 01:25:33.897:367) : avc:  denied  { read } for  pid=4431 comm=iptables path=/var/lib/ctdb/iptables-ctdb.flock dev="dm-0" ino=67681652 scontext=system_u:system_r:iptables_t:s0 tcontext=system_u:object_r:ctdbd_var_lib_t:s0 tclass=file 
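The denial above translates mechanically into an allow rule. A local policy module along these lines (a sketch of what `audit2allow` would emit, not the rule as it shipped in selinux-policy) would silence it:

```
# iptables-ctdb.te -- illustrative local module derived from the AVC above;
# the actual fix was delivered in the selinux-policy packages instead.
module iptables-ctdb 1.0;

require {
	type iptables_t;
	type ctdbd_var_lib_t;
	class file read;
}

# Let iptables read the ctdb lock file under /var/lib/ctdb
allow iptables_t ctdbd_var_lib_t:file read;
```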

Switched SELinux to permissive mode; the cluster still does not come to a healthy state.

Will provide the sosreports.
Comment 4 surabhi 2015-07-03 04:53:22 EDT
Even with the new build CTDB 2.5.5-3, the nodes do not come to a healthy state after reboot.

Seeing the following AVC denials when a system is rebooted and tries to fail back:
 type=AVC msg=audit(07/03/2015 01:30:25.839:154) : avc:  denied  { block_suspend } for  pid=31332 comm=smbd capability=block_suspend  scontext=system_u:system_r:smbd_t:s0 tcontext=system_u:system_r:smbd_t:s0 tclass=capability2
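As with the iptables denial, this maps to a single rule; a hedged sketch of the corresponding local-module fragment (again illustrative, not the shipped policy):

```
require {
	type smbd_t;
	class capability2 block_suspend;
}

# Permit smbd to exercise the block_suspend capability it is requesting
allow smbd_t smbd_t:capability2 block_suspend;
```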
Comment 5 surabhi 2015-07-03 04:59:19 EDT
Worked with smb-dev and the SELinux team to root-cause this; it appears to be an SELinux issue.
The fix has to come in the next SELinux policy build for RHEL 7.1.
The SELinux BZ for RHEL 7.1 is https://bugzilla.redhat.com/show_bug.cgi?id=1224879
Comment 11 surabhi 2015-07-08 08:33:11 EDT
With the policy provided in comment 9, after multiple node reboots all nodes come to OK state.

No AVC denials related to iptables, winbind, or ctdb are seen.
Please include these policies in the RHEL 7.1 selinux-policy build.
Comment 12 surabhi 2015-07-09 03:19:29 EDT
With comment 25 in BZ https://bugzilla.redhat.com/show_bug.cgi?id=1224879, all the AVC denials are now fixed. A RHEL 7 SELinux policy build is needed to verify the bug.
Comment 13 surabhi 2015-07-15 05:37:00 EDT
With SELinux policy build :

selinux-policy-targeted-3.13.1-32.el7.noarch
selinux-policy-3.13.1-32.el7.noarch

I am seeing the following AVC denials, which were not seen in the earlier build.
Worked with Milos on this and found that the rule
allow ctdbd_t systemd_systemctl_exec_t : file { ioctl read getattr lock execute execute_no_trans open };
is present in the .31.el7 build but is missing from the .32.el7 build.
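A regression like this can be caught by diffing the rule lists of the two builds. A sketch of the comparison; on real hosts the two inputs would come from running `sesearch --allow -s ctdbd_t -c file` against each build, and the here-strings below are illustrative stand-ins:

```shell
# Stand-ins for sesearch output from the .31 and .32 policy builds
rules_31='allow ctdbd_t ctdbd_var_lib_t:file { read write open };
allow ctdbd_t systemd_systemctl_exec_t:file { ioctl read getattr lock execute execute_no_trans open };'
rules_32='allow ctdbd_t ctdbd_var_lib_t:file { read write open };'

# Report every rule present in the old build but absent from the new one
missing=$(printf '%s\n' "$rules_31" | while IFS= read -r rule; do
    printf '%s\n' "$rules_32" | grep -qxF "$rule" || printf '%s\n' "$rule"
done)

echo "Rules missing from the newer build:"
printf '%s\n' "$missing"
```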

Updated RHEL policy BZ : https://bugzilla.redhat.com/show_bug.cgi?id=1224879
Comment 14 Miroslav Grepl 2015-07-15 07:02:04 EDT
It is strange.

Lukas,
can you check it?
Comment 15 Lukas Vrabec 2015-07-15 07:04:50 EDT
This is very strange. 
Actually, I'm working on this issue.
Comment 17 Lukas Vrabec 2015-07-15 09:17:42 EDT
commit ce652d6c62c6d38d1dab05b862cecc863075d28c
Author: Lukas Vrabec <lvrabec@redhat.com>
Date:   Wed Jul 15 14:01:16 2015 +0200

    Allow ctdbd_t send signull to samba_unconfined_net_t.

commit 4aea5f1b161c8e711f593cf123de3b155ba71229
Author: Lukas Vrabec <lvrabec@redhat.com>
Date:   Wed Jul 15 14:00:39 2015 +0200

    Add samba_signull_unconfined_net()

commit 645b04ea4006f4f25f606662cdf9b526df7226e5
Author: Lukas Vrabec <lvrabec@redhat.com>
Date:   Wed Jul 15 13:44:41 2015 +0200

    Add samba_signull_winbind()
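The two interfaces named in the commits follow the usual refpolicy signull pattern. A sketch of what `samba_signull_winbind()` plausibly expands to; the body is an assumption based on refpolicy conventions, and the real definition lives in the selinux-policy samba module:

```
interface(`samba_signull_winbind',`
	gen_require(`
		type winbind_t;
	')

	# Allow the caller domain to send SIGNULL (existence probe) to winbind
	allow $1 winbind_t:process signull;
')
```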
Comment 18 Lukas Vrabec 2015-07-15 10:48:15 EDT
I made a new selinux-policy build with the fixes.
Comment 19 Rejy M Cyriac 2015-07-15 11:18:07 EDT
We need a RHEL 7.1.z build for the BZ to be moved to ON_QA

The fix is to be tested with the new selinux-policy-3.13.1-33.el7 build
Comment 26 surabhi 2015-07-16 02:35:15 EDT
With the builds selinux-policy-3.13.1-33.el7.noarch and selinux-policy-targeted-3.13.1-33.el7.noarch, no AVC denials are seen and all CTDB nodes come to OK state after rebooting multiple nodes.

A 7.1.z build is still needed for this bug.
Moving it to VERIFIED with this build, which is for 7.2.
Comment 27 errata-xmlrpc 2015-07-29 01:08:25 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html
