Bug 1303550
| Field | Value |
| --- | --- |
| Summary | [vdsm] Require selinux-policy fix for CephFS (platform bug 1365640 - released 2016-Sep-15) |
| Product | [oVirt] vdsm |
| Component | General |
| Version | 4.17.18 |
| Hardware | x86_64 |
| OS | Unspecified |
| Status | CLOSED CURRENTRELEASE |
| Severity | high |
| Priority | high |
| Reporter | Elad <ebenahar> |
| Assignee | Allon Mureinik <amureini> |
| QA Contact | Kevin Alon Goldblatt <kgoldbla> |
| CC | amureini, bugs, ebenahar, frolland, kgoldbla, tnisan, ylavi |
| Target Milestone | ovirt-4.0.5 |
| Target Release | 4.18.15.1 |
| Keywords | Reopened |
| Flags | ykaul: ovirt-4.0.z?, ylavi: exception?, ebenahar: planning_ack?, tnisan: devel_ack+, rule-engine: testing_ack+ |
| Doc Type | Bug Fix |
| Type | Bug |
| oVirt Team | Storage |
| Last Closed | 2016-11-24 09:40:01 UTC |
| Bug Depends On | 1365640 |
| Attachments | engine.log (attachment 1120041), sosreport from hypervisor (attachment 1120040) |
Created attachment 1120041 [details]
engine.log

It looks like the sanlock user doesn't have permissions on the mount. For NFS, this can be fixed on the export server by specifying all_squash/anonuid/anongid, see [1]:

    [root@RHEL7 ~]# su - sanlock -s /bin/bash
    Last login: Wed Mar  2 14:06:34 IST 2016 on pts/0
    -bash-4.2$ whoami
    sanlock
    -bash-4.2$ touch /rhev/data-center/mnt/ceph-1.qa.lab\:6789\:_1111/test
    touch: cannot touch ‘/rhev/data-center/mnt/ceph-1.qa.lab:6789:_1111/test’: Permission denied
    -bash-4.2$ exit
    logout
    [root@RHEL7 ~]# su - vdsm -s /bin/bash
    Last login: Wed Mar  2 12:19:11 IST 2016 on pts/1
    -bash-4.2$ touch /rhev/data-center/mnt/ceph-1.qa.lab\:6789\:_1111/test
    -bash-4.2$ rm /rhev/data-center/mnt/ceph-1.qa.lab\:6789\:_1111/test

[1] http://www.ovirt.org/documentation/how-to/troubleshooting/troubleshooting-nfs-storage-issues/

Permissions of the directory:

    $ ll
    total 0
    drwxr-xr-x 1 vdsm kvm 0 Mar  2 14:08 1111

Changing the root directory permissions solved this issue for me:

    chown 36:36 1111/

A storage domain was created, and a disk was added successfully.

Elad, can we close this one?

Yes, SELinux was blocking sanlock. Moving to permissive mode solves this specific issue.

Fred, I think we should re-open this bug since sanlock cannot acquire a lock while working in enforcing mode.

Sure, no problem.

(In reply to Elad from comment #7)
> Fred, I think we should re-open this bug since sanlock cannot acquire a lock
> while working in enforcing mode.

Yup. We need two things here:
1. Report a bug on selinux-policy with these details (hopefully, with a reproducer that doesn't involve oVirt) [will probably need a couple of BZs for both Fedora and RHEL]
2. Reopen this BZ and make it dependent on the selinux bug.

Moving to 3.6.7 as bug 1315332 will not be converged for 3.6.6.

oVirt 4.0 beta has been released, moving to RC milestone.

oVirt 4.0 beta has been released, moving to RC milestone.

Taking the BZ in order to take responsibility if one of my patches is faulty.

Verified with code:

    vdsm-4.18.999-761.git2137fe6.el7.centos.x86_64
    rhevm-4.0.5-0.1.el7ev.noarch

Verified with the following scenario:
- Created a Ceph storage domain

Actual results:
- The domain was created successfully
- No SanlockException was thrown
- No errors were reported
- A disk was created on the storage domain successfully

Moving to VERIFIED!

(In reply to Kevin Alon Goldblatt from comment #15)
> Verified with the following scenario:
> Created a Ceph storage domain

Just to confirm - you mean a POSIXFS domain with an underlying CephFS, right?

4.0.5 has been released, closing.

(In reply to Allon Mureinik from comment #16)
> Just to confirm - you mean a POSIXFS domain with an underlying CephFS, right?

I created a storage domain of POSIXFS type with the ceph VFS type.

(In reply to Kevin Alon Goldblatt from comment #18)
> I created a storage domain of POSIXFS type with the ceph VFS type.

Yes, I created a storage domain of POSIXFS type with the ceph VFS type.
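As a side note on the NFS squashing workaround mentioned at the top of this thread: on an NFS export server the options would look roughly like the entry below. This is only an illustrative sketch; the export path and client specification are placeholders, and 36:36 is the vdsm:kvm UID/GID pair referenced by the chown above.

    # /etc/exports - illustrative entry only; path and client spec are placeholders
    /exports/ovirt-data  *(rw,sync,all_squash,anonuid=36,anongid=36)

    # re-export after editing
    exportfs -ra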
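For the SELinux side discussed above (enforcing vs. permissive, and filing a selinux-policy bug with a reproducer), a minimal way to confirm that an AVC denial is what blocks sanlock could look like the following. These are standard SELinux tools, not commands taken from the attached logs, and the outcome described in the comments is expected rather than re-verified here.

    # Check the current mode; switch to permissive temporarily, for diagnosis only
    getenforce
    setenforce 0    # if the attach now succeeds, SELinux policy is the blocker
    setenforce 1

    # Look for AVC denials against sanlock around the failure time
    ausearch -m avc -ts recent | grep sanlock

    # Summarize the denials as a candidate allow rule to attach to the selinux-policy bug
    ausearch -m avc -ts recent | audit2allow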
Created attachment 1120040 [details]
sosreport from hypervisor

Description of problem:
Sanlock fails to acquire a lock when attaching a POSIXFS (CephFS) storage domain to a storage pool (either an existing storage pool or a new one).

Version-Release number of selected component (if applicable):
vdsm-4.17.18-0.el7ev.noarch
sanlock-3.2.4-2.el7_2.x86_64
rhevm-3.6.2.6-0.1.el6.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a storage domain of POSIXFS type with the ceph VFS type
2. Attach the storage domain to a storage pool

Actual results:
Storage domain creation succeeds and the storage domain structure is created under the specified location on the Ceph server.

engine.log:

    2016-02-01 09:33:20,452 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStorageDomainVDSCommand] (ajp-/127.0.0.1:8702-4) [1332e789] START, CreateStorageDomainVDSCommand(HostName = green-vdsa, CreateStorageDomainVDSCommandParameters:{runAsync='true', hostId='3f291f8a-23af-4bfb-a545-7c51badee0f5', storageDomain='StorageDomainStatic:{name='ceph1', id='c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c'}', args='10.35.65.18:6789:/1111'}), log id: 6bc1e249
    2016-02-01 09:33:21,065 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStorageDomainVDSCommand] (ajp-/127.0.0.1:8702-4) [1332e789] FINISH, CreateStorageDomainVDSCommand, log id: 6bc1e249

Attaching the storage domain then fails with a sanlock exception.

sanlock.log:

    2016-02-01 11:33:25+0200 64601 [24096]: s3 lockspace c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c:1:/rhev/data-center/mnt/10.35.65.18:6789:_1111/c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c/dom_md/ids:0
    2016-02-01 11:33:25+0200 64601 [24675]: open error -13 /rhev/data-center/mnt/10.35.65.18:6789:_1111/c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c/dom_md/ids
    2016-02-01 11:33:25+0200 64601 [24675]: s3 open_disk /rhev/data-center/mnt/10.35.65.18:6789:_1111/c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c/dom_md/ids error -13
    2016-02-01 11:33:26+0200 64602 [24096]: s3 add_lockspace fail result -19

vdsm.log:

    jsonrpc.Executor/1::ERROR::2016-02-01 11:33:26,406::task::866::Storage.TaskManager.Task::(_setError) Task=`8b5a8023-8ad7-4b04-b2d9-cd5f3989392e`::Unexpected error
    Traceback (most recent call last):
      File "/usr/share/vdsm/storage/task.py", line 873, in _run
        return fn(*args, **kargs)
      File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
        res = f(*args, **kwargs)
      File "/usr/share/vdsm/storage/hsm.py", line 1210, in attachStorageDomain
        pool.attachSD(sdUUID)
      File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
        return method(self, *args, **kwargs)
      File "/usr/share/vdsm/storage/sp.py", line 896, in attachSD
        dom.acquireHostId(self.id)
      File "/usr/share/vdsm/storage/sd.py", line 533, in acquireHostId
        self._clusterLock.acquireHostId(hostId, async)
      File "/usr/share/vdsm/storage/clusterlock.py", line 234, in acquireHostId
        raise se.AcquireHostIdFailure(self._sdUUID, e)
    AcquireHostIdFailure: Cannot acquire host id: (u'c7ca9e53-7a7e-4c3f-8293-5c8b393caa5c', SanlockException(19, 'Sanlock lockspace add failure', 'No such device'))

Expected results:
Attaching the CephFS storage domain to a storage pool should succeed.

Additional info:
sosreport from hypervisor is attached.
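A rough sketch of the oVirt-free reproducer requested in the comments above, assuming a plain CephFS kernel mount and the standard sanlock command-line syntax. The mount point, lockspace name, and secret-file path are assumptions; the monitor address and export path are taken from this report, and the expected failure (open error -13, EACCES) mirrors the attached sanlock.log but has not been re-verified here.

    # Mount CephFS with the kernel client, roughly as vdsm does
    # (the secretfile path is a placeholder)
    mkdir -p /mnt/cephtest
    mount -t ceph 10.35.65.18:6789:/1111 /mnt/cephtest -o name=admin,secretfile=/etc/ceph/admin.secret

    # Create an ids file owned by vdsm:kvm (36:36), as on a real storage domain
    install -o 36 -g 36 -m 0660 /dev/null /mnt/cephtest/ids
    truncate -s 1M /mnt/cephtest/ids

    # Initialize a lockspace on the file, then ask the sanlock daemon to join it;
    # with the broken policy in enforcing mode, add_lockspace is expected to fail with EACCES
    sanlock direct init -s TESTLS:0:/mnt/cephtest/ids:0
    sanlock client add_lockspace -s TESTLS:1:/mnt/cephtest/ids:0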