Bug 1303550 - [vdsm] Require selinux-policy fix for CephFS (platform bug 1365640 - released 2016-Sep-15)
Status: CLOSED CURRENTRELEASE
Product: vdsm
Classification: oVirt
Component: General
4.17.18
x86_64 Unspecified
high Severity high (vote)
: ovirt-4.0.5
: 4.18.15.1
Assigned To: Allon Mureinik
Kevin Alon Goldblatt
: Reopened
Depends On: 1365640
Blocks:
Reported: 2016-02-01 04:42 EST by Elad
Modified: 2016-11-29 08:09 EST (History)
7 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-11-24 04:40:01 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
ykaul: ovirt-4.0.z?
ylavi: exception?
ebenahar: planning_ack?
tnisan: devel_ack+
rule-engine: testing_ack+


Attachments
sosreport from hypervisor (6.19 MB, application/x-xz)
2016-02-01 04:42 EST, Elad
no flags Details
engine.log (737.55 KB, application/x-gzip)
2016-02-01 04:53 EST, Elad
no flags Details


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 64505 master MERGED spec: Require selinux-policy-targeted for CephFS 2016-09-30 07:57 EDT
oVirt gerrit 65054 ovirt-4.0 MERGED spec: Require selinux-policy-targeted for CephFS 2016-10-03 09:40 EDT
oVirt gerrit 65305 ovirt-4.0.5 MERGED spec: Require selinux-policy-targeted for CephFS 2016-10-11 07:53 EDT
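The merged patches above are spec-level dependency bumps. The exact Requires line and minimum version are not reproduced in this report; as a sketch of the shape of such a change (the real minimum version comes from the fix for platform bug 1365640, shown here as a placeholder):

```
# vdsm.spec sketch: pull in the selinux-policy build that allows
# sanlock to access CephFS mounts (version is a placeholder)
Requires: selinux-policy-targeted >= 3.13.1-XXX
```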

Description Elad 2016-02-01 04:42:59 EST
Created attachment 1120040 [details]
sosreport from hypervisor

Description of problem:
Sanlock fails to acquire a lock when a POSIXFS CephFS storage domain is attached to a storage pool (either an existing storage pool or a new one).

Version-Release number of selected component (if applicable):
vdsm-4.17.18-0.el7ev.noarch
sanlock-3.2.4-2.el7_2.x86_64
rhevm-3.6.2.6-0.1.el6.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a storage domain of POSIXFS compliant with ceph VFS type
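The domain in that step reduces to three fields in the engine's New Domain dialog; a sketch with the values taken from the logs in this report (the dialog labels are approximate):

```
Storage Type: POSIX compliant FS
Path:         10.35.65.18:6789:/1111
VFS Type:     ceph
```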


Actual results:

Storage domain creation succeeds; the storage domain structure is created under the specified location on the Ceph server.

engine.log:

2016-02-01 09:33:20,452 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStorageDomainVDSCommand] (ajp-/127.0.0.1:8702-4) [1332e789] START, CreateStorageDomainVDSCommand(HostName = green-vdsa, CreateStorageDomainVDSCommandParameters:{runAsync='true', hostId='3f291f8a-23af-4bfb-a545-7c51badee0f5', storageDomain='StorageDomainStatic:{name='ceph1', id='c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c'}', args='10.35.65.18:6789:/1111'}), log id: 6bc1e249
2016-02-01 09:33:21,065 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStorageDomainVDSCommand] (ajp-/127.0.0.1:8702-4) [1332e789] FINISH, CreateStorageDomainVDSCommand, log id: 6bc1e249


Attaching the storage domain fails with a sanlock exception:

sanlock.log:

2016-02-01 11:33:25+0200 64601 [24096]: s3 lockspace c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c:1:/rhev/data-center/mnt/10.35.65.18:6789:_1111/c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c/dom_md/ids:0
2016-02-01 11:33:25+0200 64601 [24675]: open error -13 /rhev/data-center/mnt/10.35.65.18:6789:_1111/c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c/dom_md/ids
2016-02-01 11:33:25+0200 64601 [24675]: s3 open_disk /rhev/data-center/mnt/10.35.65.18:6789:_1111/c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c/dom_md/ids error -13
2016-02-01 11:33:26+0200 64602 [24096]: s3 add_lockspace fail result -19

vdsm.log:

jsonrpc.Executor/1::ERROR::2016-02-01 11:33:26,406::task::866::Storage.TaskManager.Task::(_setError) Task=`8b5a8023-8ad7-4b04-b2d9-cd5f3989392e`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1210, in attachStorageDomain
    pool.attachSD(sdUUID)
  File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 896, in attachSD
    dom.acquireHostId(self.id)
  File "/usr/share/vdsm/storage/sd.py", line 533, in acquireHostId
    self._clusterLock.acquireHostId(hostId, async)
  File "/usr/share/vdsm/storage/clusterlock.py", line 234, in acquireHostId
    raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: (u'c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c', SanlockException(19, 'Sanlock lockspace add failure', 'No such device'))
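The numeric codes in the sanlock log are negated standard errno values, which is a quick way to read such failures; a minimal stdlib-only check:

```python
# Decode the sanlock error codes seen above: "open error -13" and
# "add_lockspace fail result -19" are negated errno values.
import errno
import os

for code in (13, 19):
    print(code, errno.errorcode[code], os.strerror(code))
# 13 is EACCES (permission denied: the open of dom_md/ids was blocked),
# 19 is ENODEV, which sanlock surfaces as 'No such device' above.
```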


Expected results:
Attaching the Ceph storage domain to a storage pool should succeed.

Additional info:
sosreport from hypervisor
Comment 1 Elad 2016-02-01 04:53 EST
Created attachment 1120041 [details]
engine.log
Comment 2 Fred Rolland 2016-03-02 07:47:15 EST
It looks like the sanlock user doesn't have permissions on the mount.
With NFS, this can be fixed on the export server by specifying all_squash/anonuid/anongid; see [1]

[root@RHEL7 ~]# su - sanlock -s /bin/bash
Last login: Wed Mar  2 14:06:34 IST 2016 on pts/0
-bash-4.2$ 
-bash-4.2$ 
-bash-4.2$ whoami
sanlock
-bash-4.2$ touch /rhev/data-center/mnt/ceph-1.qa.lab\:6789\:_1111/test
touch: cannot touch ‘/rhev/data-center/mnt/ceph-1.qa.lab:6789:_1111/test’: Permission denied
-bash-4.2$ exit
logout
[root@RHEL7 ~]# su - vdsm -s /bin/bash
Last login: Wed Mar  2 12:19:11 IST 2016 on pts/1
-bash-4.2$ touch /rhev/data-center/mnt/ceph-1.qa.lab\:6789\:_1111/test
-bash-4.2$ rm /rhev/data-center/mnt/ceph-1.qa.lab\:6789\:_1111/test
-bash-4.2$ 

[1] http://www.ovirt.org/documentation/how-to/troubleshooting/troubleshooting-nfs-storage-issues/
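For the NFS analogy in this comment, the export-side fix is a single line in /etc/exports; a sketch with a hypothetical export path (uid/gid 36 is vdsm:kvm, so sanlock and vdsm can both reach the domain metadata):

```
# /etc/exports sketch: squash every client to uid/gid 36
/export/data  *(rw,sync,all_squash,anonuid=36,anongid=36)
```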
Comment 3 Fred Rolland 2016-03-02 10:04:49 EST
Permissions of the directory:
ll
total 0
drwxr-xr-x 1 vdsm kvm 0 Mar  2 14:08 1111
Comment 4 Fred Rolland 2016-03-06 10:50:11 EST
Changing the root directory's permissions solves this issue for me:
chown 36:36 1111/

A storage domain was created, and a disk was added successfully.

Elad, can we close this one ?
Comment 5 Elad 2016-03-07 04:15:30 EST
Yes
Comment 6 Fred Rolland 2016-03-07 05:15:47 EST
SELinux was blocking sanlock. Moving to permissive mode solves this specific issue.
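A quick way to confirm the SELinux mode in play here is to read the selinuxfs enforce flag, which is what `getenforce` reports; a minimal sketch, assuming the standard /sys/fs/selinux mount point:

```python
# Report the current SELinux mode by reading the selinuxfs enforce
# flag; returns "disabled" when selinuxfs is not mounted at all.
from pathlib import Path

def selinux_mode(enforce_file="/sys/fs/selinux/enforce"):
    p = Path(enforce_file)
    if not p.exists():
        return "disabled"
    return "enforcing" if p.read_text().strip() == "1" else "permissive"
```

Running `setenforce 0` flips a live system to permissive, which is how the observation in this comment can be reproduced without a reboot.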
Comment 7 Elad 2016-03-07 07:05:38 EST
Fred, I think we should reopen this bug, since sanlock cannot acquire the lock while SELinux is in enforcing mode.
Comment 8 Fred Rolland 2016-03-07 07:51:06 EST
Sure, no problem
Comment 9 Allon Mureinik 2016-03-07 08:13:30 EST
(In reply to Elad from comment #7)
> Fred, I think we should re-open this bug since sanlock cannot acquire lock
> while working in enforcing.

Yup.
We need two things here:
1. Report a bug on selinux-policy with these details (hopefully, with a reproducer that doesn't involve oVirt) [will probably need a couple of BZs for both Fedora and RHEL]
2. Reopen this BZ and make it dependent on the selinux bug.
Comment 10 Tal Nisan 2016-04-14 04:48:15 EDT
Moving to 3.6.7 as bug 1315332 will not be converged for 3.6.6
Comment 11 Yaniv Lavi (Dary) 2016-05-23 09:21:45 EDT
oVirt 4.0 beta has been released, moving to RC milestone.
Comment 12 Yaniv Lavi (Dary) 2016-05-23 09:24:33 EDT
oVirt 4.0 beta has been released, moving to RC milestone.
Comment 14 Allon Mureinik 2016-10-04 06:34:41 EDT
Taking the BZ in order to take responsibility if one of my patches is faulty.
Comment 15 Kevin Alon Goldblatt 2016-11-02 11:47:12 EDT
Verified with code:
----------------------
vdsm-4.18.999-761.git2137fe6.el7.centos.x86_64
rhevm-4.0.5-0.1.el7ev.noarch

Verified with the following scenario:
--------------------------------------
Created a ceph storage domain 

Actual results:
The domain was created successfully
No SanlockException was thrown
No errors reported
Was able to create a disk on the storage domain successfully

Moving to VERIFIED!
Comment 16 Allon Mureinik 2016-11-06 09:50:41 EST
(In reply to Kevin Alon Goldblatt from comment #15)
> Verified with the following scenario:
> --------------------------------------
> Created a ceph storage domain 
Just to confirm - you mean a POSIXFS domain with an underlying CephFS, right?
Comment 17 Allon Mureinik 2016-11-24 04:40:01 EST
4.0.5 has been released, closing.
Comment 18 Kevin Alon Goldblatt 2016-11-29 08:09:11 EST
(In reply to Allon Mureinik from comment #16)
> (In reply to Kevin Alon Goldblatt from comment #15)
> > Verified with the following scenario:
> > --------------------------------------
> > Created a ceph storage domain 
> Just to confirm - you mean a POSIXFS domain with an underlying CephFS, right?

I created a storage domain of POSIXFS compliant with ceph VFS type
Comment 19 Kevin Alon Goldblatt 2016-11-29 08:09:54 EST
(In reply to Kevin Alon Goldblatt from comment #18)
> (In reply to Allon Mureinik from comment #16)
> > (In reply to Kevin Alon Goldblatt from comment #15)
> > > Verified with the following scenario:
> > > --------------------------------------
> > > Created a ceph storage domain 
> > Just to confirm - you mean a POSIXFS domain with an underlying CephFS, right?
> 
> I created a storage domain of POSIXFS compliant with ceph VFS type

Yes I created a storage domain of POSIXFS compliant with ceph VFS type
