Description of problem:

Activating an NFS storage domain fails when sanlock fails to acquire a host id:

2018-06-21 19:34:00,569+0300 INFO (jsonrpc/4) [storage.SANLock] Acquiring host id for domain 08272182-9fb1-4609-bd3b-0246b66eafa3 (id=250, async=False) (clusterlock:284)
2018-06-21 19:34:00,570+0300 INFO (jsonrpc/4) [vdsm.api] FINISH createStoragePool error=Cannot acquire host id: (u'08272182-9fb1-4609-bd3b-0246b66eafa3', SanlockException(2, 'Sanlock lockspace add failure', 'No such file or directory')) from=::ffff:10.35.0.102,51372, flow_id=34fd12d, task_id=ecdfeeb2-f9c3-4b1c-853b-91c7929022fd (api:51)
2018-06-21 19:34:00,570+0300 ERROR (jsonrpc/4) [storage.TaskManager.Task] (Task='ecdfeeb2-f9c3-4b1c-853b-91c7929022fd') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<decorator-gen-31>", line 2, in createStoragePool
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1003, in createStoragePool
    leaseParams)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 634, in create
    self._acquireTemporaryClusterLock(msdUUID, leaseParams)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 565, in _acquireTemporaryClusterLock
    msd.acquireHostId(self.id)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 829, in acquireHostId
    self._manifest.acquireHostId(hostId, async)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 455, in acquireHostId
    self._domainLock.acquireHostId(hostId, async)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 315, in acquireHostId
    raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: (u'08272182-9fb1-4609-bd3b-0246b66eafa3', SanlockException(2, 'Sanlock lockspace add failure', 'No such file or directory'))

# cat /var/log/sanlock.log
2018-06-20 01:12:47 1769 [7039]: sanlock daemon started 3.6.0 host b262e149-c91c-4ad6-84bb-f21f898133fc.voodoo1.tl
2018-06-21 04:06:43 98604 [7039]: helper pid 7040 term signal 15
2018-06-21 04:07:00 5 [843]: lockfile open error /var/run/sanlock/sanlock.pid: Permission denied
2018-06-21 18:01:37 4 [823]: lockfile open error /var/run/sanlock/sanlock.pid: Permission denied
2018-06-21 19:03:28 3716 [6551]: lockfile open error /var/run/sanlock/sanlock.pid: Permission denied
2018-06-21 19:06:08 3875 [7030]: lockfile open error /var/run/sanlock/sanlock.pid: Permission denied
2018-06-21 19:19:50 4698 [8344]: lockfile open error /var/run/sanlock/sanlock.pid: Permission denied

Enabling permissive mode avoids this issue:

# ausearch -m avc -ts recent
----
time->Thu Jun 21 19:38:19 2018
type=AVC msg=audit(1529599099.639:855): avc: denied { dac_override } for pid=9851 comm="sanlock" capability=1 scontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 tcontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 tclass=capability permissive=1

# ps -efZ | grep sanlock | grep -v grep
system_u:system_r:sanlock_t:s0-s0:c0.c1023 sanlock 9851 1 0 19:38 ? 00:00:00 /usr/sbin/sanlock daemon
system_u:system_r:sanlock_t:s0-s0:c0.c1023 root 9852 9851 0 19:38 ? 00:00:00 /usr/sbin/sanlock daemon

Version-Release number of selected component (if applicable):

# rpm -qa | egrep 'sanlock|selinux' | sort
libselinux-2.8-1.fc28.x86_64
libselinux-utils-2.8-1.fc28.x86_64
libvirt-lock-sanlock-4.2.0-1.fc28.x86_64
python2-libselinux-2.8-1.fc28.x86_64
python2-sanlock-3.6.0-2.fc28.x86_64
python3-libselinux-2.8-1.fc28.x86_64
rpm-plugin-selinux-4.14.1-9.fc28.x86_64
sanlock-3.6.0-2.fc28.x86_64
sanlock-lib-3.6.0-2.fc28.x86_64
selinux-policy-3.14.1-32.fc28.noarch
selinux-policy-targeted-3.14.1-32.fc28.noarch

How reproducible:

100%

Steps to Reproduce:

- Create NFS storage domain
- Activate inactive storage domain

Actual results:

Activating the storage domain fails because of the sanlock failure.

Expected results:

Activating the storage domain succeeds.

Workaround:

Use permissive mode:

# setenforce 0
Miroslav, can you take a look?
Also adding Lukas Vrabec, the Fedora maintainer.
Hi,

Does the sanlock daemon require the DAC_OVERRIDE capability? We removed quite a lot of allow rules that granted the DAC_OVERRIDE capability to daemons from the distribution SELinux policy. In most cases it's an issue on the application/daemon side, not on the SELinux side. For more info see the following article: https://lukas-vrabec.com/index.php/2018/07/03/why-do-you-see-dac_override-selinux-denials/

I can help you identify which file has Unix permissions that are too tight. Please turn on full auditing in the audit daemon and reproduce the scenario: https://lukas-vrabec.com/index.php/2018/07/16/how-to-enable-full-auditing-in-audit-daemon/

Thanks,
Lukas.
Thanks Lukas, by default sanlock is using "sanlock" (179) for both its uid and gid. From reading the links, it sounds like sanlock should instead use gid of "root", is that correct? Nir, set gname = root in sanlock.conf to try this and see if it works.
David, yes, I agree. Apache and other services use the same fix.
Nir, can you confirm that using group "root" will not upset things on nfs with root squash?
----
type=PROCTITLE msg=audit(07/17/2018 09:51:29.082:1878) : proctitle=/usr/sbin/sanlock daemon
type=PATH msg=audit(07/17/2018 09:51:29.082:1878) : item=0 name=/var/run/sanlock/ inode=444150 dev=00:16 mode=dir,755 ouid=sanlock ogid=sanlock rdev=00:00 obj=system_u:object_r:sanlock_var_run_t:s0 nametype=PARENT cap_fp=none cap_fi=none cap_fe=0 cap_fver=0
type=CWD msg=audit(07/17/2018 09:51:29.082:1878) : cwd=/
type=SYSCALL msg=audit(07/17/2018 09:51:29.082:1878) : arch=x86_64 syscall=openat success=no exit=EACCES(Permission denied) a0=0xffffff9c a1=0x7fff47fc0630 a2=O_WRONLY|O_CREAT|O_CLOEXEC a3=0x1a4 items=1 ppid=1 pid=15264 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=sanlock exe=/usr/sbin/sanlock subj=system_u:system_r:sanlock_t:s0-s0:c0.c1023 key=(null)
type=AVC msg=audit(07/17/2018 09:51:29.082:1878) : avc: denied { dac_override } for pid=15264 comm=sanlock capability=dac_override scontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 tcontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 tclass=capability permissive=0
----
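For readers decoding the record above: the PATH item shows a directory with mode 755 owned by sanlock:sanlock, while the SYSCALL item shows the caller running as root:root. A minimal sketch of the classic owner/group/other check follows, assuming no ACLs and no capabilities. The function name dac_allows is hypothetical, and the value 179 for the sanlock uid and gid is the default mentioned earlier in this thread.

```python
# Simplified model of the Unix discretionary access check (DAC).
# Assumptions: no POSIX ACLs, no capabilities such as CAP_DAC_OVERRIDE.

WRITE = 0o2  # write bit within one rwx triplet

ROOT = 0
SANLOCK = 179  # default sanlock uid/gid, per the discussion in this bug


def dac_allows(mode, owner_uid, owner_gid, uid, gids, want):
    """Return True if plain permission bits grant `want` to the caller."""
    if uid == owner_uid:
        granted = (mode >> 6) & 0o7   # owner triplet
    elif owner_gid in gids:
        granted = (mode >> 3) & 0o7   # group triplet
    else:
        granted = mode & 0o7          # "other" triplet
    return (granted & want) == want


# /var/run/sanlock/ from the audit record: mode 755, sanlock:sanlock.
# Caller is root:root, so only the "other" bits (r-x) apply.
root_can_write = dac_allows(0o755, SANLOCK, SANLOCK, ROOT, {ROOT}, WRITE)

# The same directory accessed by the sanlock user uses the owner bits (rwx).
sanlock_can_write = dac_allows(0o755, SANLOCK, SANLOCK, SANLOCK, {SANLOCK}, WRITE)

print(root_can_write, sanlock_can_write)
```

In this model root is treated like any other user, which is what happens once SELinux policy denies dac_override: creating a file in the directory needs write permission that the "other" bits do not grant.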
Nir, have you been able to check if setting "gname = root" in sanlock.conf fixes the problem? If so, then I'll check in the fix to the code to use root as the default group.
Sorry, I did not have time to check this yet. Can you explain this change? Why would we want to run sanlock with group root? I would like to run sanlock with group sanlock, and change the SELinux policy so sanlock can access its own pid file.
Apparently group root is the right way to do this according to the experts, but I don't understand it well enough to explain it myself. The links in comment 3 explain it pretty well.
(In reply to David Teigland from comment #10)

Based on: https://lukas-vrabec.com/index.php/2018/07/03/why-do-you-see-dac_override-selinux-denials/

The problem is:

1. sanlock is running as root, before dropping permissions
2. sanlock tries to write to /var/run/sanlock/sanlock.pid, owned by sanlock:sanlock

So why not change /var/run/sanlock/sanlock.pid ownership to root:root?

I tried this patch on RHEL 7.5:

diff --git a/src/lockfile.c b/src/lockfile.c
index 5a2518e..e837d49 100644
--- a/src/lockfile.c
+++ b/src/lockfile.c
@@ -89,13 +89,6 @@ int lockfile(const char *dir, const char *name, int uid, int gid)
 		goto fail;
 	}
 
-	rv = fchown(fd, uid, gid);
-	if (rv < 0) {
-		log_error("lockfile fchown error %s: %s",
-			  path, strerror(errno));
-		goto fail;
-	}
-
 	return fd;
  fail:
 	close(fd);

With this patch:

# ls -lhZ /run/sanlock/
-rw-r--r--. root    root    system_u:object_r:sanlock_var_run_t:s0 sanlock.pid
srw-rw----. sanlock sanlock system_u:object_r:sanlock_var_run_t:s0 sanlock.sock

I have not tested this on Fedora 28 yet (environment issue).
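The effect of dropping the fchown() call can be sketched outside of sanlock. The following is only an illustration in Python, not the sanlock implementation: the function name lockfile mirrors the C function in the diff, but the locking and pid-writing steps are assumptions about what a typical pid lockfile helper does. The open flags and mode are the ones visible in the audit record (O_WRONLY|O_CREAT|O_CLOEXEC, a3=0x1a4, which is octal 644).

```python
import fcntl
import os


def lockfile(directory, name):
    """Create and lock a pid file without changing its ownership.

    Illustrative sketch only; not the sanlock code.
    """
    path = os.path.join(directory, name)
    # Flags and mode taken from the openat call in the audit record.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_CLOEXEC, 0o644)
    try:
        # Assumed step: take an exclusive lock, failing instead of
        # blocking if another process already holds it.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        # Assumed step: record our pid in the file.
        os.ftruncate(fd, 0)
        os.write(fd, b"%d\n" % os.getpid())
        # No fchown() here: the file keeps the uid of the process that
        # created it, so a later open for writing by that same user
        # passes the owner permission bits.
    except OSError:
        os.close(fd)
        raise
    return fd
```

When the creating process is root, the resulting file is owned by root, which matches the root:root ownership of sanlock.pid in the ls output above.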
Nir, that looks good to me if it resolves the issue.
(In reply to David Teigland from comment #12) Ok, I'll test it and post a patch to sanlock.
Nir, is this still a problem?
I tested this patch on Fedora 28, and it fixes the issue. https://lists.fedorahosted.org/archives/list/sanlock-devel@lists.fedorahosted.org/thread/ZTG4E264IMHVFSVDKDPA6HSGBUFBQ2JI/ Milos, can you review this patch?
LGTM.
David, we are still missing the Fedora package with the fix, right?
sanlock-3.6.0-8.fc29 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2019-a27b970ddf
sanlock-3.6.0-4.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2019-ea4ecdc166
sanlock-3.6.0-8.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-a27b970ddf
sanlock-3.6.0-4.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-ea4ecdc166
sanlock-3.6.0-4.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.
sanlock-3.6.0-8.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.