Bug 1303550 - [vdsm] Require selinux-policy fix for CephFS (platform bug 1365640 - released 2016-Sep-15)
Summary: [vdsm] Require selinux-policy fix for CephFS (platform bug 1365640 - released 2016-Sep-15)
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: 4.17.18
Hardware: x86_64
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-4.0.5
Target Release: 4.18.15.1
Assignee: Allon Mureinik
QA Contact: Kevin Alon Goldblatt
URL:
Whiteboard:
Depends On: 1365640
Blocks:
 
Reported: 2016-02-01 09:42 UTC by Elad
Modified: 2016-11-29 13:09 UTC
CC List: 7 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-11-24 09:40:01 UTC
oVirt Team: Storage
Embargoed:
ykaul: ovirt-4.0.z?
ylavi: exception?
ebenahar: planning_ack?
tnisan: devel_ack+
rule-engine: testing_ack+


Attachments
sosreport from hypervisor (6.19 MB, application/x-xz) - 2016-02-01 09:42 UTC, Elad (no flags)
engine.log (737.55 KB, application/x-gzip) - 2016-02-01 09:53 UTC, Elad (no flags)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1095615 0 high CLOSED [RFE] Allow the use of CephFS as a storage domain within RHEV 2022-07-09 08:31:13 UTC
oVirt gerrit 64505 0 master MERGED spec: Require selinux-policy-targeted for CephFS 2020-09-10 12:51:17 UTC
oVirt gerrit 65054 0 ovirt-4.0 MERGED spec: Require selinux-policy-targeted for CephFS 2020-09-10 12:51:17 UTC
oVirt gerrit 65305 0 ovirt-4.0.5 MERGED spec: Require selinux-policy-targeted for CephFS 2020-09-10 12:51:17 UTC

Internal Links: 1095615

Description Elad 2016-02-01 09:42:59 UTC
Created attachment 1120040 [details]
sosreport from hypervisor

Description of problem:
Sanlock fails to acquire a lock when a POSIX-compliant (POSIXFS) CephFS storage domain is attached to a storage pool (either an existing pool or a new one).

Version-Release number of selected component (if applicable):
vdsm-4.17.18-0.el7ev.noarch
sanlock-3.2.4-2.el7_2.x86_64
rhevm-3.6.2.6-0.1.el6.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a POSIX-compliant (POSIXFS) storage domain with the 'ceph' VFS type
2. Attach the new domain to a storage pool
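For reference, on the host such a domain is roughly equivalent to a plain kernel CephFS mount under /rhev/data-center/mnt (the monitor address and export path below are taken from the logs in this report; the auth options are an assumption):

mount -t ceph 10.35.65.18:6789:/1111 /rhev/data-center/mnt/10.35.65.18:6789:_1111 -o name=admin,secretfile=/etc/ceph/admin.secret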


Actual results:

Storage domain creation succeeds; the storage domain structure is created under the specified location on the Ceph server.

engine.log:

2016-02-01 09:33:20,452 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStorageDomainVDSCommand] (ajp-/127.0.0.1:8702-4) [1332e789] START, CreateStorageDomainVDSCommand(HostName = green-vdsa, CreateStorageDomainVDSCommandParameters:{runAsync='true', hostId='3f291f8a-23af-4bfb-a545-7c51badee0f5', storageDomain='StorageDomainStatic:{name='ceph1', id='c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c'}', args='10.35.65.18:6789:/1111'}), log id: 6bc1e249
2016-02-01 09:33:21,065 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStorageDomainVDSCommand] (ajp-/127.0.0.1:8702-4) [1332e789] FINISH, CreateStorageDomainVDSCommand, log id: 6bc1e249


Attaching the storage domain to the pool fails with a sanlock exception:

sanlock.log:

2016-02-01 11:33:25+0200 64601 [24096]: s3 lockspace c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c:1:/rhev/data-center/mnt/10.35.65.18:6789:_1111/c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c/dom_md/ids:0
2016-02-01 11:33:25+0200 64601 [24675]: open error -13 /rhev/data-center/mnt/10.35.65.18:6789:_1111/c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c/dom_md/ids
2016-02-01 11:33:25+0200 64601 [24675]: s3 open_disk /rhev/data-center/mnt/10.35.65.18:6789:_1111/c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c/dom_md/ids error -13
2016-02-01 11:33:26+0200 64602 [24096]: s3 add_lockspace fail result -19

vdsm.log:

jsonrpc.Executor/1::ERROR::2016-02-01 11:33:26,406::task::866::Storage.TaskManager.Task::(_setError) Task=`8b5a8023-8ad7-4b04-b2d9-cd5f3989392e`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1210, in attachStorageDomain
    pool.attachSD(sdUUID)
  File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 896, in attachSD
    dom.acquireHostId(self.id)
  File "/usr/share/vdsm/storage/sd.py", line 533, in acquireHostId
    self._clusterLock.acquireHostId(hostId, async)
  File "/usr/share/vdsm/storage/clusterlock.py", line 234, in acquireHostId
    raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: (u'c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c', SanlockException(19, 'Sanlock lockspace add failure', 'No such device'))


Expected results:
Attaching the Ceph storage domain to a storage pool should succeed.

Additional info:
sosreport from hypervisor

Comment 1 Elad 2016-02-01 09:53:06 UTC
Created attachment 1120041 [details]
engine.log

Comment 2 Fred Rolland 2016-03-02 12:47:15 UTC
It looks like the sanlock user doesn't have permissions on the mount.
With NFS, this can be fixed on the export server by specifying all_squash/anonuid/anongid; see [1].
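For illustration, a hypothetical /etc/exports entry along those lines (36:36 being the vdsm:kvm UID/GID oVirt expects; the export path is a placeholder):

/exports/data    *(rw,sync,all_squash,anonuid=36,anongid=36)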

[root@RHEL7 ~]# su - sanlock -s /bin/bash
Last login: Wed Mar  2 14:06:34 IST 2016 on pts/0
-bash-4.2$ 
-bash-4.2$ 
-bash-4.2$ whoami
sanlock
-bash-4.2$ touch /rhev/data-center/mnt/ceph-1.qa.lab\:6789\:_1111/test
touch: cannot touch ‘/rhev/data-center/mnt/ceph-1.qa.lab:6789:_1111/test’: Permission denied
-bash-4.2$ exit
logout
[root@RHEL7 ~]# su - vdsm -s /bin/bash
Last login: Wed Mar  2 12:19:11 IST 2016 on pts/1
-bash-4.2$ touch /rhev/data-center/mnt/ceph-1.qa.lab\:6789\:_1111/test
-bash-4.2$ rm /rhev/data-center/mnt/ceph-1.qa.lab\:6789\:_1111/test
-bash-4.2$ 

[1] http://www.ovirt.org/documentation/how-to/troubleshooting/troubleshooting-nfs-storage-issues/

Comment 3 Fred Rolland 2016-03-02 15:04:49 UTC
Permissions of the directory:
ll
total 0
drwxr-xr-x 1 vdsm kvm 0 Mar  2 14:08 1111

Comment 4 Fred Rolland 2016-03-06 15:50:11 UTC
Changing the root directory permissions solves this issue for me:
chown 36:36 1111/

A storage domain was created, and a disk was added successfully.

Elad, can we close this one?

Comment 5 Elad 2016-03-07 09:15:30 UTC
Yes

Comment 6 Fred Rolland 2016-03-07 10:15:47 UTC
SELinux was blocking sanlock. Switching to permissive mode resolves this specific issue.
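For the record, the workaround can be applied and checked with the standard SELinux tools (this is temporary, does not persist across reboots, and the proper fix is the selinux-policy update tracked in bug 1365640):

getenforce      # reports "Enforcing" before the change
setenforce 0    # switch to permissive until the next reboot
getenforce      # now reports "Permissive"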

Comment 7 Elad 2016-03-07 12:05:38 UTC
Fred, I think we should re-open this bug, since sanlock cannot acquire the lock while SELinux is in enforcing mode.

Comment 8 Fred Rolland 2016-03-07 12:51:06 UTC
Sure, no problem

Comment 9 Allon Mureinik 2016-03-07 13:13:30 UTC
(In reply to Elad from comment #7)
> Fred, I think we should re-open this bug since sanlock cannot acquire lock
> while working in enforcing.

Yup.
We need two things here:
1. Report a bug on selinux-policy with these details (hopefully with a reproducer that doesn't involve oVirt; a minimal sketch follows after this list) [will probably need a couple of BZs, for both Fedora and RHEL]
2. Reopen this BZ and make it dependent on the selinux bug.
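A minimal non-oVirt reproducer sketch for such a selinux-policy report could look like the following (the monitor address and path are taken from this report's logs; the secret file location is a placeholder, and the exact AVC denial is an assumption to be confirmed from the audit log):

mount -t ceph 10.35.65.18:6789:/1111 /mnt/cephtest -o name=admin,secretfile=/etc/ceph/admin.secret
chown 36:36 /mnt/cephtest
runuser -u vdsm -- touch /mnt/cephtest/testfile      # succeeds
runuser -u sanlock -- cat /mnt/cephtest/testfile     # "Permission denied" while enforcing
ausearch -m avc -ts recent                           # should show the denial against sanlock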

Comment 10 Tal Nisan 2016-04-14 08:48:15 UTC
Moving to 3.6.7 as bug 1315332 will not be converged for 3.6.6

Comment 11 Yaniv Lavi 2016-05-23 13:21:45 UTC
oVirt 4.0 beta has been released, moving to RC milestone.

Comment 12 Yaniv Lavi 2016-05-23 13:24:33 UTC
oVirt 4.0 beta has been released, moving to RC milestone.

Comment 14 Allon Mureinik 2016-10-04 10:34:41 UTC
Taking the BZ in order to take responsibility if one of my patches is faulty.
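For reference, the patches linked above (gerrit 64505, 65054, 65305) add a requirement on the fixed selinux-policy-targeted build to the vdsm spec file (presumably a versioned Requires), so installing or upgrading vdsm pulls in a policy that lets sanlock access CephFS. Once such a build is installed, the dependency can be checked with standard rpm queries, e.g.:

rpm -q --requires vdsm | grep -i selinux-policy
rpm -q selinux-policy-targeted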

Comment 15 Kevin Alon Goldblatt 2016-11-02 15:47:12 UTC
Verified with code:
----------------------
vdsm-4.18.999-761.git2137fe6.el7.centos.x86_64
rhevm-4.0.5-0.1.el7ev.noarch

Verified with the following scenario:
--------------------------------------
Created a ceph storage domain 

Actual Results:
The domain was created successfully
No SanlockException was thrown
No errors were reported
A disk was created on the storage domain successfully

Moving to VERIFIED!

Comment 16 Allon Mureinik 2016-11-06 14:50:41 UTC
(In reply to Kevin Alon Goldblatt from comment #15)
> Verified with the following scenario:
> --------------------------------------
> Created a ceph storage domain 
Just to confirm - you mean a POSIXFS domain with an underlying CephFS, right?

Comment 17 Allon Mureinik 2016-11-24 09:40:01 UTC
4.0.5 has been released, closing.

Comment 18 Kevin Alon Goldblatt 2016-11-29 13:09:11 UTC
(In reply to Allon Mureinik from comment #16)
> (In reply to Kevin Alon Goldblatt from comment #15)
> > Verified with the following scenario:
> > --------------------------------------
> > Created a ceph storage domain 
> Just to confirm - you mean a POSIXFS domain with an underlying CephFS, right?

I created a POSIX-compliant (POSIXFS) storage domain with the 'ceph' VFS type.

Comment 19 Kevin Alon Goldblatt 2016-11-29 13:09:54 UTC
(In reply to Kevin Alon Goldblatt from comment #18)
> (In reply to Allon Mureinik from comment #16)
> > (In reply to Kevin Alon Goldblatt from comment #15)
> > > Verified with the following scenario:
> > > --------------------------------------
> > > Created a ceph storage domain 
> > Just to confirm - you mean a POSIXFS domain with an underlying CephFS, right?
> 
> I created a POSIX-compliant (POSIXFS) storage domain with the 'ceph' VFS type.

Yes, I created a POSIX-compliant (POSIXFS) storage domain with the 'ceph' VFS type.

