Bug 1593853 - Sanlock fails to create lockfile on Fedora 28
Summary: Sanlock fails to create lockfile on Fedora 28
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: sanlock
Version: 28
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: David Teigland
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: oVirt_on_Fedora
Reported: 2018-06-21 16:50 UTC by Nir Soffer
Modified: 2020-05-20 23:20 UTC

Fixed In Version: sanlock-3.6.0-4.fc28 sanlock-3.6.0-8.fc29
Clone Of:
Environment:
Last Closed: 2018-12-06 14:52:31 UTC
Type: Bug
Embargoed:


Description Nir Soffer 2018-06-21 16:50:02 UTC
Description of problem:

Activating an NFS storage domain fails when sanlock fails to acquire a host id:

2018-06-21 19:34:00,569+0300 INFO  (jsonrpc/4) [storage.SANLock] Acquiring host id for domain 08272182-9fb1-4609-bd3b-0246b66eafa3 (id=250, async=False) (clusterlock:284)
2018-06-21 19:34:00,570+0300 INFO  (jsonrpc/4) [vdsm.api] FINISH createStoragePool error=Cannot acquire host id: (u'08272182-9fb1-4609-bd3b-0246b66eafa3', SanlockException(2, 'Sanlock lockspace add failure', 'No such file or directory')) from=::ffff:10.35.0.102,51372, flow_id=34fd12d, task_id=ecdfeeb2-f9c3-4b1c-853b-91c7929022fd (api:51)
2018-06-21 19:34:00,570+0300 ERROR (jsonrpc/4) [storage.TaskManager.Task] (Task='ecdfeeb2-f9c3-4b1c-853b-91c7929022fd') Unexpected error (task:875)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/storage/task.py", line 882, in _run
    return fn(*args, **kargs)
  File "<decorator-gen-31>", line 2, in createStoragePool
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 49, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/hsm.py", line 1003, in createStoragePool
    leaseParams)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 634, in create
    self._acquireTemporaryClusterLock(msdUUID, leaseParams)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sp.py", line 565, in _acquireTemporaryClusterLock
    msd.acquireHostId(self.id)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 829, in acquireHostId
    self._manifest.acquireHostId(hostId, async)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sd.py", line 455, in acquireHostId
    self._domainLock.acquireHostId(hostId, async)
  File "/usr/lib/python2.7/site-packages/vdsm/storage/clusterlock.py", line 315, in acquireHostId
    raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id: (u'08272182-9fb1-4609-bd3b-0246b66eafa3', SanlockException(2, 'Sanlock lockspace add failure', 'No such file or directory'))


# cat /var/log/sanlock.log
2018-06-20 01:12:47 1769 [7039]: sanlock daemon started 3.6.0 host b262e149-c91c-4ad6-84bb-f21f898133fc.voodoo1.tl
2018-06-21 04:06:43 98604 [7039]: helper pid 7040 term signal 15
2018-06-21 04:07:00 5 [843]: lockfile open error /var/run/sanlock/sanlock.pid: Permission denied
2018-06-21 18:01:37 4 [823]: lockfile open error /var/run/sanlock/sanlock.pid: Permission denied
2018-06-21 19:03:28 3716 [6551]: lockfile open error /var/run/sanlock/sanlock.pid: Permission denied
2018-06-21 19:06:08 3875 [7030]: lockfile open error /var/run/sanlock/sanlock.pid: Permission denied
2018-06-21 19:19:50 4698 [8344]: lockfile open error /var/run/sanlock/sanlock.pid: Permission denied


Enabling permissive mode avoids this issue:

# ausearch -m avc -ts recent
----
time->Thu Jun 21 19:38:19 2018
type=AVC msg=audit(1529599099.639:855): avc:  denied  { dac_override } for  pid=9851 comm="sanlock" capability=1  scontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 tcontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 tclass=capability permissive=1
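
For readers less used to raw AVC records, here is a minimal Python sketch that splits the record above into fields. The helper names (`parse_avc`, `capability_name`) and the small capability table are illustrative assumptions, not part of any audit tooling; the numbers in the table are the first few values from linux/capability.h, where 1 is CAP_DAC_OVERRIDE. Note that permissive=1 means the denial was logged but not enforced.

```python
import re

# First few capability numbers from linux/capability.h (subset only).
CAP_NAMES = {0: "chown", 1: "dac_override", 2: "dac_read_search", 3: "fowner"}

# Hypothetical helper regex: captures the denied permissions and the rest of the record.
AVC_RE = re.compile(r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}\s+for\s+(?P<rest>.*)")

SAMPLE = (
    'type=AVC msg=audit(1529599099.639:855): avc:  denied  { dac_override } for  '
    'pid=9851 comm="sanlock" capability=1  '
    'scontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 '
    'tcontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 '
    'tclass=capability permissive=1'
)


def parse_avc(line):
    """Return the key=value fields of an AVC denial as a dict, or None if no match."""
    m = AVC_RE.search(line)
    if m is None:
        return None
    pairs = re.findall(r'(\w+)=("[^"]*"|\S+)', m.group("rest"))
    fields = {key: value.strip('"') for key, value in pairs}
    fields["perms"] = m.group("perms").split()
    return fields


def capability_name(value):
    """Map a numeric capability to its name; pass through already-decoded names."""
    return CAP_NAMES.get(int(value), value) if value.isdigit() else value


record = parse_avc(SAMPLE)
print(record["comm"], record["perms"], capability_name(record["capability"]))
print("enforced" if record["permissive"] == "0" else "logged only (permissive)")
```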

# ps -efZ | grep sanlock | grep -v grep
system_u:system_r:sanlock_t:s0-s0:c0.c1023 sanlock 9851 1  0 19:38 ?   00:00:00 /usr/sbin/sanlock daemon
system_u:system_r:sanlock_t:s0-s0:c0.c1023 root 9852 9851  0 19:38 ?   00:00:00 /usr/sbin/sanlock daemon


Version-Release number of selected component (if applicable):
# rpm -qa | egrep 'sanlock|selinux' | sort
libselinux-2.8-1.fc28.x86_64
libselinux-utils-2.8-1.fc28.x86_64
libvirt-lock-sanlock-4.2.0-1.fc28.x86_64
python2-libselinux-2.8-1.fc28.x86_64
python2-sanlock-3.6.0-2.fc28.x86_64
python3-libselinux-2.8-1.fc28.x86_64
rpm-plugin-selinux-4.14.1-9.fc28.x86_64
sanlock-3.6.0-2.fc28.x86_64
sanlock-lib-3.6.0-2.fc28.x86_64
selinux-policy-3.14.1-32.fc28.noarch
selinux-policy-targeted-3.14.1-32.fc28.noarch


How reproducible:
100%

Steps to Reproduce:
- Create NFS storage domain
- Activate inactive storage domain


Actual results:
Activating storage domain fails because of the sanlock failure.

Expected results:
Activating storage domain succeeds.

Workaround:
Use permissive mode:
# setenforce 0

Comment 1 Nir Soffer 2018-06-21 16:51:19 UTC
Miroslav, can you take a look?

Comment 2 Sandro Bonazzola 2018-06-28 15:54:10 UTC
Also adding Lukas Vrabec, the Fedora maintainer.

Comment 3 Lukas Vrabec 2018-07-16 09:26:57 UTC
Hi, 

Does the sanlock daemon require the DAC_OVERRIDE capability? We removed quite a lot of allow rules granting the DAC_OVERRIDE capability to daemons from the distribution SELinux policy. In most cases it's an issue on the application/daemon side, not the SELinux side. For more info, see the following article:
https://lukas-vrabec.com/index.php/2018/07/03/why-do-you-see-dac_override-selinux-denials/

I can help you identify which file has too-tight Unix permissions. Please turn on full auditing in the audit daemon and reproduce the scenario:
https://lukas-vrabec.com/index.php/2018/07/16/how-to-enable-full-auditing-in-audit-daemon/

Thanks,
Lukas.

Comment 4 David Teigland 2018-07-16 14:53:23 UTC
Thanks Lukas, by default sanlock is using "sanlock" (179) for both its uid and gid.  From reading the links, it sounds like sanlock should instead use gid of "root", is that correct?

Nir, set gname = root in sanlock.conf to try this and see if it works.

Comment 5 Lukas Vrabec 2018-07-17 10:00:12 UTC
David, 

Yes, agreed. Apache and other services use the same fix.

Comment 6 David Teigland 2018-07-17 14:59:36 UTC
Nir, can you confirm that using group "root" will not upset things on nfs with root squash?

Comment 7 Milos Malik 2018-07-18 08:28:06 UTC
----
type=PROCTITLE msg=audit(07/17/2018 09:51:29.082:1878) : proctitle=/usr/sbin/sanlock daemon 
type=PATH msg=audit(07/17/2018 09:51:29.082:1878) : item=0 name=/var/run/sanlock/ inode=444150 dev=00:16 mode=dir,755 ouid=sanlock ogid=sanlock rdev=00:00 obj=system_u:object_r:sanlock_var_run_t:s0 nametype=PARENT cap_fp=none cap_fi=none cap_fe=0 cap_fver=0 
type=CWD msg=audit(07/17/2018 09:51:29.082:1878) : cwd=/ 
type=SYSCALL msg=audit(07/17/2018 09:51:29.082:1878) : arch=x86_64 syscall=openat success=no exit=EACCES(Permission denied) a0=0xffffff9c a1=0x7fff47fc0630 a2=O_WRONLY|O_CREAT|O_CLOEXEC a3=0x1a4 items=1 ppid=1 pid=15264 auid=unset uid=root gid=root euid=root suid=root fsuid=root egid=root sgid=root fsgid=root tty=(none) ses=unset comm=sanlock exe=/usr/sbin/sanlock subj=system_u:system_r:sanlock_t:s0-s0:c0.c1023 key=(null) 
type=AVC msg=audit(07/17/2018 09:51:29.082:1878) : avc:  denied  { dac_override } for  pid=15264 comm=sanlock capability=dac_override  scontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 tcontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 tclass=capability permissive=0 
----
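
The hex arguments in the SYSCALL record above can be decoded by hand. The following is a small Python sketch of that arithmetic, not output from any audit tool: a0 is the dirfd argument of openat (a signed 32-bit value) and a3 is the creation mode. The helper name `to_signed32` is an illustrative assumption, and AT_FDCWD is written out as the Linux value -100 because Python's os module does not export it.

```python
import stat

# Linux value of AT_FDCWD from <fcntl.h>; defined here because Python does not export it.
AT_FDCWD = -100


def to_signed32(value):
    """Interpret a 32-bit hex field from the audit record as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


# a0=0xffffff9c: the dirfd argument, meaning "relative to the current directory".
dirfd = to_signed32(0xFFFFFF9C)

# a3=0x1a4: the mode requested for the file created by O_CREAT.
mode = 0x1A4
perm_string = stat.filemode(stat.S_IFREG | mode)

print(dirfd == AT_FDCWD, oct(mode), perm_string)
```

So the failing call is an openat relative to the current directory, asking to create the file with mode 0644, made by a process whose uid and gid are both root.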

Comment 8 David Teigland 2018-07-27 14:35:33 UTC
Nir, have you been able to check if setting "gname = root" in sanlock.conf fixes the problem?  If so, then I'll check in the fix to the code to use root as the default group.

Comment 9 Nir Soffer 2018-07-27 14:54:20 UTC
Sorry, I did not have time to check this yet.

Can you explain this change? Why would we want to run sanlock with group root?

I would like to run sanlock with group sanlock, and change selinux policy so
sanlock can access its own pid file.

Comment 10 David Teigland 2018-07-27 15:00:32 UTC
Apparently group root is the right way to do this according to the experts, but I don't understand it well enough to explain myself.  The links in comment 3 explain it pretty well.

Comment 11 Nir Soffer 2018-07-29 17:35:00 UTC
(In reply to David Teigland from comment #10)
Based on:
https://lukas-vrabec.com/index.php/2018/07/03/why-do-you-see-dac_override-selinux-denials/

The problem is:

1. sanlock is running as root, before dropping privileges
2. sanlock tries to write to /var/run/sanlock/sanlock.pid, owned by sanlock:sanlock

So why not change /var/run/sanlock/sanlock.pid ownership to root:root?
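
The two points above can be illustrated with a simplified Python model of the classic Unix owner/group/other write check. This is only a sketch: it ignores ACLs and other kernel details, the function name `dac_allows_write` is made up for illustration, and it assumes the pid file has mode 0644 (the a3=0x1a4 value in comment 7) and that root's only group is root. The id 179 for sanlock comes from comment 4.

```python
import stat

ROOT_ID = 0
SANLOCK_ID = 179  # uid and gid of "sanlock" according to comment 4


def dac_allows_write(mode, owner_uid, owner_gid, fsuid, fsgids, cap_dac_override=False):
    """Simplified DAC write check: exactly one of owner, group, or other bits applies."""
    if cap_dac_override:
        # CAP_DAC_OVERRIDE bypasses the permission bits entirely.
        return True
    if fsuid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if owner_gid in fsgids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


PID_FILE_MODE = 0o644  # assumed mode of sanlock.pid

# root writing to a file owned by sanlock:sanlock falls through to the "other" bits.
before = dac_allows_write(PID_FILE_MODE, SANLOCK_ID, SANLOCK_ID, ROOT_ID, {ROOT_ID})

# Same access when the SELinux policy still grants dac_override.
with_cap = dac_allows_write(PID_FILE_MODE, SANLOCK_ID, SANLOCK_ID, ROOT_ID, {ROOT_ID},
                            cap_dac_override=True)

# root writing to a file owned by root:root uses the owner bits.
after = dac_allows_write(PID_FILE_MODE, ROOT_ID, ROOT_ID, ROOT_ID, {ROOT_ID})

print(before, with_cap, after)
```

In this model, root needs dac_override only while the file belongs to another user; once the file is root:root, the ordinary owner write bit is enough.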

I tried this patch on RHEL 7.5:

diff --git a/src/lockfile.c b/src/lockfile.c
index 5a2518e..e837d49 100644
--- a/src/lockfile.c
+++ b/src/lockfile.c
@@ -89,13 +89,6 @@ int lockfile(const char *dir, const char *name, int uid, int gid)
                goto fail;
        }
 
-       rv = fchown(fd, uid, gid);
-       if (rv < 0) {
-               log_error("lockfile fchown error %s: %s",
-                         path, strerror(errno));
-               goto fail;
-       }
-
        return fd;
  fail:
        close(fd);


With this patch:

# ls -lhZ /run/sanlock/
-rw-r--r--. root    root    system_u:object_r:sanlock_var_run_t:s0 sanlock.pid
srw-rw----. sanlock sanlock system_u:object_r:sanlock_var_run_t:s0 sanlock.sock

Did not test on Fedora 28 yet (environment issue).

Comment 12 David Teigland 2018-07-30 19:03:39 UTC
Nir, that looks good to me if it resolves the issue.

Comment 13 Nir Soffer 2018-07-30 19:18:14 UTC
(In reply to David Teigland from comment #12)
Ok, I'll test it and post a patch to sanlock.

Comment 14 David Teigland 2018-11-07 20:39:58 UTC
Nir, is this still a problem?

Comment 15 Nir Soffer 2018-11-29 22:21:04 UTC
I tested this patch on Fedora 28, and it fixes the issue.
https://lists.fedorahosted.org/archives/list/sanlock-devel@lists.fedorahosted.org/thread/ZTG4E264IMHVFSVDKDPA6HSGBUFBQ2JI/

Milos, can you review this patch?

Comment 16 Lukas Vrabec 2018-12-06 13:08:33 UTC
LGTM.

Comment 17 Nir Soffer 2019-01-15 12:07:57 UTC
David, we are still missing the Fedora package with the fix, right?

Comment 18 Fedora Update System 2019-01-30 20:58:20 UTC
sanlock-3.6.0-8.fc29 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2019-a27b970ddf

Comment 19 Fedora Update System 2019-01-30 20:59:56 UTC
sanlock-3.6.0-4.fc28 has been submitted as an update to Fedora 28. https://bodhi.fedoraproject.org/updates/FEDORA-2019-ea4ecdc166

Comment 20 Fedora Update System 2019-01-31 02:30:30 UTC
sanlock-3.6.0-8.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-a27b970ddf

Comment 21 Fedora Update System 2019-01-31 03:26:02 UTC
sanlock-3.6.0-4.fc28 has been pushed to the Fedora 28 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-ea4ecdc166

Comment 22 Fedora Update System 2019-03-20 21:17:10 UTC
sanlock-3.6.0-4.fc28 has been pushed to the Fedora 28 stable repository. If problems still persist, please make note of it in this bug report.

Comment 23 Fedora Update System 2019-03-20 22:12:29 UTC
sanlock-3.6.0-8.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.

Comment 24 Fedora Update System 2019-03-21 14:40:09 UTC
sanlock-3.6.0-8.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report.

