Bug 1141658

Summary: [RHEL7] Host reboot after blocking connection to master domain because selinux prevents sanlock from terminating or killing vdsm
Product: Red Hat Enterprise Virtualization Manager Reporter: Elad <ebenahar>
Component: vdsm    Assignee: Nir Soffer <nsoffer>
Status: CLOSED CURRENTRELEASE QA Contact: Elad <ebenahar>
Severity: urgent Docs Contact:
Priority: medium    
Version: 3.5.0    CC: amureini, bazulay, ebenahar, ecohen, gklein, iheim, laravot, lkuchlan, lpeer, mgrepl, michal.skrivanek, nsoffer, ogofen, scohen, teigland, tnisan, yeylon
Target Milestone: ---    Keywords: Regression, ZStream
Target Release: 3.5.0   
Hardware: x86_64   
OS: Unspecified   
Whiteboard: storage
Fixed In Version: vdsm-4.16.8.1-4.el6ev Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1175109 (view as bug list) Environment:
Last Closed: 2015-02-16 13:41:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage    RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1142454, 1152594, 1172147    
Bug Blocks: 1164308, 1164311, 1175109    
Attachments (Description / Flags):
  logs from engine and host        none
  3.4.2 reproduction               none
  logs-3                           none
  old_policy_logs                  none
  old_policy_logs                  none
  /var/log/ from host 20.11.14     none
  logs and image                   none
  22.12.14                         none

Description Elad 2014-09-15 07:20:08 UTC
Created attachment 937453 [details]
logs from engine and host

Description of problem:
If the host loses its connectivity to the storage server where the master domain is located, the host is fenced.

Version-Release number of selected component (if applicable):
rhev3.5 v3.1
rhel7 host
vdsm-4.16.3-3.el7.beta.x86_64
sanlock-3.1.0-2.el7.x86_64
rhevm-3.5.0-0.12.beta.el6ev.noarch

How reproducible:
Always, both with block and file

Steps to Reproduce:
1. Have 2 storage domains of different types (NFS and iSCSI in my case) in a DC with 1 host
2. Disconnect the host from the master domain using iptables
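
For reference, a minimal sketch of the block used in step 2 (the storage server address below is a placeholder, not the one from this setup):

  iptables -A OUTPUT -d 192.0.2.10 -j DROP   # drop traffic to the server hosting the master domain
  iptables -D OUTPUT -d 192.0.2.10 -j DROP   # remove the rule later to restore connectivity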


Actual results:
After 2 minutes, the DC moves to non-responsive and the host is fenced.

When testing with NFS as master domain:

Thread-59::ERROR::2014-09-15 09:47:23,961::dispatcher::79::Storage.Dispatcher::(wrapper) Connection timed out
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 71, in wrapper
    result = ctask.prepare(func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 103, in wrapper
    return m(self, *a, **kw)
  File "/usr/share/vdsm/storage/task.py", line 1179, in prepare
    raise self.error
Timeout: Connection timed out
Thread-59::DEBUG::2014-09-15 09:47:23,961::vm::2299::vm.Vm::(_startUnderlyingVm) vmId=`e79dd187-3be3-4659-8201-e00b818c3064`::_ongoingCreations released
Thread-59::ERROR::2014-09-15 09:47:23,961::vm::2336::vm.Vm::(_startUnderlyingVm) vmId=`e79dd187-3be3-4659-8201-e00b818c3064`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 2276, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 3311, in _run
    devices = self.buildConfDevices()
  File "/usr/share/vdsm/virt/vm.py", line 2073, in buildConfDevices
    self._normalizeVdsmImg(drv)
  File "/usr/share/vdsm/virt/vm.py", line 1996, in _normalizeVdsmImg
    drv['volumeID'])
StorageUnavailableError: ('Failed to get size for volume %s', 'b8cb8732-f9e3-4d19-abb2-68f00cb1b04d')



iSCSI as master domain case:

 Thread-72::ERROR::2014-09-15 10:01:44,831::dispatcher::79::Storage.Dispatcher::(wrapper) (2, 'Sanlock resource read failure', 'No such file or directory')
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 71, in wrapper
    result = ctask.prepare(func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 103, in wrapper
    return m(self, *a, **kw)
  File "/usr/share/vdsm/storage/task.py", line 1179, in prepare
    raise self.error
SanlockException: (2, 'Sanlock resource read failure', 'No such file or directory')

Reconstruct doesn't take place although there are other domains in the DC
 

Expected results:
The host shouldn't be fenced when it loses connectivity to the master domain.

Additional info: logs from engine and host

Comment 2 Nir Soffer 2014-09-16 11:36:08 UTC
(In reply to Elad from comment #0)
I think there is some confusion here about the term "fenced". Fencing is not related to this.

When the host running the SPM loses access to the master domain, where sanlock keeps the SPM resource, sanlock will terminate vdsm. If sanlock cannot terminate vdsm, it will reboot the host.
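
For reference, a minimal sketch of how the sanlock side of this can be watched during a reproduction (assuming the standard sanlock client tool and the default log location):

  sanlock client status          # lockspaces and resources currently held, including the SPM lease
  tail -f /var/log/sanlock.log   # renewal errors and the kill attempts against vdsm during the outage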

Did you test this with an older version and the host did not reboot in this case? If not, this is not a regression.

Comment 3 Allon Mureinik 2014-09-16 12:30:46 UTC
In a single host scenario, why don't we just reconstruct master to another domain?

Comment 4 Elad 2014-09-16 12:32:05 UTC
(In reply to Nir Soffer from comment #2)
> (In reply to Elad from comment #0)
> I think there is some confusion here about the term "fenced". Fencing is not
> related to this.
> 
Nir, 
When the master domain is blocked and there are other domains reachable from the hosts in the DC that can take the master role, reconstruct should take place.
The host shouldn't be rebooted.

> When the host running the SPM loses access to the master domain, where sanlock
> keeps the SPM resource, sanlock will terminate vdsm. If sanlock cannot
> terminate vdsm, it will reboot the host.
> 
> Did you test this with an older version and the host did not reboot in this
> case? If not, this is not a regression.

I set the Regression flag

Comment 5 Elad 2014-09-16 12:35:31 UTC
This is a regression because reconstruct doesn't work in case the master domain is blocked.

Comment 6 Nir Soffer 2014-09-16 13:16:13 UTC
Elad, maybe I was not clear. This is not a regression unless you prove that the same scenario works on an earlier version. If you can reproduce it in 3.4, there is no regression here.

We try to avoid regressions in new versions; this has a cost, delaying releases and work on new features. So we cannot mark a bug as a regression without proof that it is one.

Comment 7 Elad 2014-09-16 14:24:57 UTC
Nir, reconstruct is not a new feature.
We execute the scenario using our automation frequently.

I'm working on reproducing it on 3.4.2 but it is a waste of time. This is clearly a regression.

Comment 8 Elad 2014-09-16 14:49:31 UTC
Created attachment 938058 [details]
3.4.2 reproduction

Reproduced on 3.4.2:
Blocked the connection to the master domain at ~17:34.

The host wasn't rebooted when it couldn't access the master domain. Reconstruct took place after a few minutes:

2014-09-16 17:41:15,416 INFO  [org.ovirt.engine.core.bll.eventqueue.EventQueueMonitor] (org.ovirt.thread.pool-4-thread-8) Finished reconstruct for pool 00000002-0002-0002-0002-00000000031e. Clearing event queue
2014-09-16 17:41:15,416 INFO  [org.ovirt.engine.core.bll.eventqueue.EventQueueMonitor] (org.ovirt.thread.pool-4-thread-8) The following operation Event [eventType=DOMAINFAILOVER, storagePoolId=00000002-0002-0002-0002-00000000031e, domainId=ad240d89-609f-4fbc-8c57-21fdf677d524, vdsId=null, description=] was cancelled, because of reconstruct was run before




Used rhev3.4.2 av11.2
vdsm-4.14.13-2.el6ev.x86_64
rhevm-3.4.2-1.1.el6ev.noarch
sanlock-2.8-1.el6.x86_64

Comment 9 Allon Mureinik 2014-09-16 15:08:42 UTC
There are too many moving parts here...

RHEV 3.5 which manifests the bug (see comment 0):
rhev3.5 v3.1
rhel7 host
vdsm-4.16.3-3.el7.beta.x86_64
sanlock-3.1.0-2.el7.x86_64
rhevm-3.5.0-0.12.beta.el6ev.noarch

RHEV 3.4.2 which does not manifest the bug (see comment 8):
rhev3.4.2 av11.2
vdsm-4.14.13-2.el6ev.x86_64
rhevm-3.4.2-1.1.el6ev.noarch
sanlock-2.8-1.el6.x86_64

Let's see if this is a vdsm regression (my intuition: not likely) or a platform regression.

Comment 10 Allon Mureinik 2014-09-16 15:13:44 UTC
Elad, can we double-check with RHEV 3.4.z on RHEL7?

Comment 11 Elad 2014-09-16 16:00:27 UTC
(In reply to Allon Mureinik from comment #9)
> There are too many moving parts here...
> 
> RHEV 3.5 which manifests the bug (see comment 0):
> rhev3.5 v3.1
> rhel7 host
> vdsm-4.16.3-3.el7.beta.x86_64
> sanlock-3.1.0-2.el7.x86_64
> rhevm-3.5.0-0.12.beta.el6ev.noarch
The bug was opened on 3.5


> RHEV 3.4.2 which does not manifest the bug (see comment 8):
> rhev3.4.2 av11.2
> vdsm-4.14.13-2.el6ev.x86_64
> rhevm-3.4.2-1.1.el6ev.noarch
> sanlock-2.8-1.el6.x86_64
I was asked by Nir to reproduce it on an earlier version to check if it's a regression

> Let's see if this is a vdsm regression (my intuition: not likely) or a
> platform regression.

Comment 12 Nir Soffer 2014-09-16 17:23:59 UTC
(In reply to Elad from comment #11)
> I was asked by Nir to reproduce it on an earlier version to check if it's a
> regression

Right, but you tested a previous version of vdsm with a previous version of rhel. So this proves that there is a regression in vdsm+rhel. It would be more useful to know whether this is a regression of vdsm or of rhel.

The most interesting test would be running rhev 3.4 on rhel 7. If it does not work, run rhev-3.5 on rhel 6.5.

Comment 13 Nir Soffer 2014-09-16 18:18:26 UTC
Event 1
=======

1. Sanlock fails to read the lockspace

Sep 15 09:35:48 green-vdsa sanlock[1306]: 2014-09-15 09:35:48+0300 58020 [3937]: s1 delta_renew read rv -202 offset 0 /rhev/data-center/mnt/lion.qa.lab.tlv.redhat.com:_export_elad_2/3c98701d-c674-46c5-b37d-df90b2955671/dom_md/ids
Sep 15 09:35:48 green-vdsa sanlock[1306]: 2014-09-15 09:35:48+0300 58020 [3937]: s1 renewal error -202 delta_length 10 last_success 57989

2. These errors from wdmd are suspicious

Sep 15 09:36:37 green-vdsa wdmd[1298]: test warning now 58068 ping 58058 close 0 renewal 57989 expire 58069 client 1306 sanlock_3c98701d-c674-46c5-b37d-df90b2955671:1
Sep 15 09:36:37 green-vdsa wdmd[1298]: /dev/watchdog0 closed unclean
Sep 15 09:36:37 green-vdsa kernel: watchdog watchdog0: watchdog did not stop!

3. It looks like the watchdog is starting a countdown, and we are going to
   reboot in 59 seconds, while sanlock is still trying to check the lease...

Sep 15 09:36:37 green-vdsa wdmd[1298]: test failed rem 59 now 58069 ping 58058 close 58068 renewal 57989 expire 58069 client 1306 sanlock_3c98701d-c674-46c5-b37d-df90b2955671:1

4. After 80 seconds, the lease is considered failed, and sanlock tries to terminate
   vdsm (expected)

Sep 15 09:36:38 green-vdsa sanlock[1306]: 2014-09-15 09:36:38+0300 58069 [1306]: s1 check_our_lease failed 80
Sep 15 09:36:38 green-vdsa sanlock[1306]: 2014-09-15 09:36:38+0300 58069 [1306]: s1 kill 3289 sig 15 count 1

5. While sanlock tries to kill vdsm, we are still getting warnings from wdmd

Sep 15 09:37:08 green-vdsa wdmd[1298]: test failed rem 29 now 58099 ping 58058 close 58068 renewal 57989 expire 58069 client 1306 sanlock_3c98701d-c674-46c5-b37d-df90b2955671:1

6. The last message from sanlock is this:

Sep 15 09:37:08 green-vdsa sanlock[1306]: 2014-09-15 09:37:08+0300 58100 [1306]: s1 kill 3289 sig 15 count 32

7. Then we have this blob of non-printable data in the log. This may
   happen during an unclean shutdown.

\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00.\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00Sep 15 09:40:10 localhost rsyslogd: [origin software="rsyslogd" swVersion="7.4.7" x-pid="1220" x-info="http://www.rsyslog.com"] start

8. The machine was rebooted

Sep 15 09:39:52 localhost journal: Runtime journal is using 8.0M (max 794.4M, leaving 1.1G of free 7.7G, current limit 794.4M).
Sep 15 09:39:52 localhost kernel: CPU0 microcode updated early to revision 0xd2, date = 2010-10-01

Notes: in the vdsm log, we don't see any message about receiving signal 15. This can
explain why sanlock is failing to terminate vdsm - it looks like it sends the signal
to a non-existing process.
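
A hedged sketch of checks that could confirm this on a live reproduction (3289 is the pid from the log above; substitute the pid sanlock is currently targeting):

  ps -o pid,stat,wchan,comm -p 3289          # STAT "D" would mean blocked in uninterruptible sleep
  grep -E 'State|SigCgt' /proc/3289/status   # state and caught-signal mask of the target pid
  pgrep -f /usr/share/vdsm/vdsm              # the pid vdsm is actually running as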


Event 2
=======

1. Sanlock started

Sep 15 09:40:10 localhost systemd-sanlock: Starting sanlock: [  OK  ]

2. Sanlock fails to read the lockspace

Sep 15 09:46:01 localhost sanlock[1301]: 2014-09-15 09:46:01+0300 369 [4209]: s1 delta_renew read rv -202 offset 0 /rhev/data-center/mnt/lion.qa.lab.tlv.redhat.com:_export_elad_2/3c98701d-c674-46c5-b37d-df90b2955671/dom_md/ids

3. Again, suspicious warnings from wdmd - 40 seconds after the first sanlock failure

Sep 15 09:46:40 localhost wdmd[1348]: test warning now 409 ping 399 close 0 renewal 339 expire 419 client 1301 sanlock_3c98701d-c674-46c5-b37d-df90b2955671:1
Sep 15 09:46:40 localhost wdmd[1348]: /dev/watchdog0 closed unclean
Sep 15 09:46:40 localhost kernel: watchdog watchdog0: watchdog did not stop!

4. wdmd seems to be in the middle of a countdown to reboot - 50 seconds left?

Sep 15 09:46:50 localhost wdmd[1348]: test failed rem 50 now 419 ping 399 close 409 renewal 339 expire 419 client 1301 sanlock_3c98701d-c674-46c5-b37d-df90b2955671:1

5. Sanlock fails to check the lease after 80 seconds, and tries to terminate vdsm

Sep 15 09:46:50 localhost sanlock[1301]: 2014-09-15 09:46:50+0300 419 [1301]: s1 check_our_lease failed 80
Sep 15 09:46:50 localhost sanlock[1301]: 2014-09-15 09:46:50+0300 419 [1301]: s1 kill 3575 sig 15 count 1

6. After 40 SIGTERM retries, sanlock tries to kill vdsm with SIGKILL

Sep 15 09:47:29 localhost sanlock[1301]: 2014-09-15 09:47:29+0300 458 [1301]: s1 kill 3575 sig 15 count 40
Sep 15 09:47:30 localhost sanlock[1301]: 2014-09-15 09:47:30+0300 459 [1301]: s1 kill 3575 sig 9 count 41

7. After 6 more retries, vdsm is still not killed - this can happen if:
   - Vdsm is blocked in the D state in the kernel - unlikely, since we do
     anything that can block in a separate process
   - Or sanlock is sending the signal to a non-existing process

Sep 15 09:47:35 localhost sanlock[1301]: 2014-09-15 09:47:35+0300 464 [1301]: s1 kill 3575 sig 9 count 46

8. Finally the watchdog reboots the machine, and again we get a blob of non-printable data

Sep 15 09:47:36 localhost wdmd[1348]: test failed rem 5 now 464 ping 399 close 409 renewal 339 expire 419 client 1301 sanlock_3c98701d-c674-46c5-b37d-df90b2955671:1
\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00...\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00Sep 15 09:50:13 localhost rsyslogd: [origin software="rsyslogd" swVersion="7.4.7" x-pid="1251" x-info="http://www.rsyslog.com"] start

In the vdsm log, we don't see the expected "Received signal 15" message. It looks like
sanlock is not using the correct pid for vdsm.


Event 3
=======

Same as 2:

Sep 15 10:00:14 localhost sanlock[1344]: 2014-09-15 10:00:14+0300 619 [4000]: s3 delta_renew read rv -202 offset 0 /dev/b4e237e3-9a52-4ab3-947a-30d452497c37/ids
Sep 15 10:00:53 localhost sanlock[1344]: 2014-09-15 10:00:53+0300 659 [1344]: s3 check_our_lease warning 70 last_success 589
Sep 15 10:00:54 localhost wdmd[1357]: test warning now 659 ping 649 close 0 renewal 589 expire 669 client 1344 sanlock_b4e237e3-9a52-4ab3-947a-30d452497c37:1
Sep 15 10:00:54 localhost wdmd[1357]: /dev/watchdog0 closed unclean
Sep 15 10:00:54 localhost kernel: watchdog watchdog0: watchdog did not stop!
Sep 15 10:01:03 localhost wdmd[1357]: test failed rem 50 now 669 ping 649 close 659 renewal 589 expire 669 client 1344 sanlock_b4e237e3-9a52-4ab3-947a-30d452497c37:1
Sep 15 10:01:03 localhost sanlock[1344]: 2014-09-15 10:01:03+0300 669 [1344]: s3 check_our_lease failed 80
Sep 15 10:01:03 localhost sanlock[1344]: 2014-09-15 10:01:03+0300 669 [1344]: s3 kill 3285 sig 15 count 1
Sep 15 10:01:42 localhost sanlock[1344]: 2014-09-15 10:01:42+0300 708 [1344]: s3 kill 3285 sig 15 count 40
Sep 15 10:01:47 localhost sanlock[1344]: 2014-09-15 10:01:47+0300 713 [1344]: s3 kill 3285 sig 9 count 45
Sep 15 10:01:47 localhost wdmd[1357]: test failed rem 6 now 713 ping 649 close 659 renewal 589 expire 669 client 1344 sanlock_b4e237e3-9a52-4ab3-947a-30d452497c37:1

So the "fencing" of the host seems to be caused by sanlock/wdmd issue. The rest
is probably caused by the unexpected reboot of the machine. 

David, can you take a look at this?

Comment 14 David Teigland 2014-09-16 19:31:25 UTC
It seems that the kill(SIGTERM) from sanlock to vdsm is not being delivered.
One possibility is that vdsm is blocked in the D state, but there is no
other evidence for that.  Another possibility is that selinux is blocking
the signal.  audit.log contains a lot of complaining about sanlock, but I
cannot decipher what it means.  Until we sort out what the sanlock/selinux
problem is, it may be worth testing in the permissive mode to check if things
work.

Comment 15 Nir Soffer 2014-09-16 19:44:13 UTC
David opened bug 1142454 about the selinux messages.

Elad, please repeat this test on rhel 7 in permissive mode.
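
For reference, a minimal sketch of the permissive-mode check (standard SELinux tooling, to be reverted after the test):

  getenforce      # expected: Enforcing
  setenforce 0    # switch to permissive for this test only
  getenforce      # expected: Permissive
  # ... block the master domain and watch whether the host survives ...
  setenforce 1    # back to enforcing when done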

Comment 16 Nir Soffer 2014-09-16 21:05:00 UTC
I reproduced this on my setup easily.

Checking audit.log.1, we can see that sanlock-helper is not allowed to send signals.

In the log we can see:

1. 40 denials of syscall 62 (kill) with argument f (SIGTERM)

type=AVC msg=audit(1410899479.040:5224): avc:  denied  { signal } for  pid=18646 comm="sanlock-helper" scontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 tcontext=system_u:system_r:init_t:s0 tclass=process
type=SYSCALL msg=audit(1410899479.040:5224): arch=c000003e syscall=62 success=no exit=-13 a0=493b a1=f a2=200 a3=0 items=0 ppid=18645 pid=18646 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="sanlock-helper" exe="/usr/sbin/sanlock" subj=system_u:system_r:sanlock_t:s0-s0:c0.c1023 key=(null)

2. 9 denials of syscall 62 (kill) with argument 9 (SIGKILL)

type=AVC msg=audit(1410899519.273:5271): avc:  denied  { sigkill } for  pid=18646 comm="sanlock-helper" scontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 tcontext=system_u:system_r:init_t:s0 tclass=process
type=SYSCALL msg=audit(1410899519.273:5271): arch=c000003e syscall=62 success=no exit=-13 a0=493b a1=9 a2=200 a3=0 items=0 ppid=18645 pid=18646 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="sanlock-helper" exe="/usr/sbin/sanlock" subj=system_u:system_r:sanlock_t:s0-s0:c0.c1023 key=(null)

This matches sanlock's logic exactly, and what I could see when watching sanlock.log while blocking the storage.

So this is a selinux issue; this bug is blocked on bug 1142454.
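
For completeness, a hedged sketch of how the denials can be extracted and turned into a temporary local policy module until the fix from bug 1142454 is available (the module name is arbitrary):

  ausearch -m avc -c sanlock-helper                                  # list the signal/sigkill denials
  ausearch -m avc -c sanlock-helper | audit2allow -M sanlock_local   # generate sanlock_local.te/.pp
  semodule -i sanlock_local.pp                                       # load the generated module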

Comment 17 Elad 2014-09-17 07:54:13 UTC
Created attachment 938371 [details]
logs-3

It seems to be a rhel7 issue while selinux is enforcing.
Tested with a rhel7 host on rhev3.4.2: the host is rebooted after the master domain has been unreachable for ~2 minutes.

First renewal error:

Sep 17 10:35:57 green-vdsb sanlock[24301]: 2014-09-17 10:35:57+0300 77160 [24863]: s2 renewal error -202 delta_length 10 last_success 77119
Sep 17 10:36:08 green-vdsb sanlock[24301]: 2014-09-17 10:36:08+0300 77171 [24863]: s2 delta_renew read rv -202 offset 0 /rhev/data-center/mnt/lion.qa.lab.tlv.redhat.com:_export_elad_6/f7d48355-0beb-46bb-86cd-400e1
2605612/dom_md/ids

Host is rebooted:

Sep 17 10:37:23 green-vdsb sanlock[24301]: 2014-09-17 10:37:23+0300 77246 [24863]: s2 renewal error -2 delta_length 10 last_success 77119
Sep 17 10:37:23 green-vdsb wdmd[24322]: test failed rem 9 now 77246 ping 77185 close 77195 renewal 77119 expire 77199 client 24301 sanlock_f7d48355-0beb-46bb-86cd-400e12605612:1


Tested again with selinux in permissive mode and the bug does not reproduce. The host is not rebooted and reconstruct succeeds.

2014-09-17 10:50:37,578 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.ReconstructMasterVDSCommand] (org.ovirt.thread.pool-4-thread-26) [63da4302] FINISH, ReconstructMasterVDSCommand, log id: 6e41fc4b
2014-09-17 10:50:37,589 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.RefreshStoragePoolVDSCommand] (org.ovirt.thread.pool-4-thread-25) [63da4302] START, RefreshStoragePoolVDSCommand(HostName = gb, HostId = 44f824e1-1af1-4af1-ade7-af997dc692d6, storagePoolId = b40d9615-5e01-4ad5-8ece-6a705740e293, masterStorageDomainId=ea367f2f-8591-400f-bd1f-48a1edab6fdc, masterVersion=2), log id: 819c767

Logs attached

Comment 18 Nir Soffer 2014-09-17 22:37:30 UTC
I reproduced the same issue on Fedora 19.

1. 40 retries to send SIGTERM:

type=AVC msg=audit(1410992928.276:1447): avc:  denied  { signal } for  pid=3878 comm="sanlock-helper" scontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 tcontext=system_u:system_r:initrc_t:s0 tclass=process
type=SYSCALL msg=audit(1410992928.276:1447): arch=c000003e syscall=62 success=no exit=-13 a0=f7e a1=f a2=200 a3=4 items=0 ppid=3877 pid=3878 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="sanlock-helper" exe="/usr/sbin/sanlock" subj=system_u:system_r:sanlock_t:s0-s0:c0.c1023 key=(null)

2. A couple of retries to send SIGKILL

type=AVC msg=audit(1410992976.124:1495): avc:  denied  { sigkill } for  pid=3878 comm="sanlock-helper" scontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 tcontext=system_u:system_r:initrc_t:s0 tclass=process
type=SYSCALL msg=audit(1410992976.124:1495): arch=c000003e syscall=62 success=no exit=-13 a0=f7e a1=9 a2=200 a3=4 items=0 ppid=3877 pid=3878 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="sanlock-helper" exe="/usr/sbin/sanlock" subj=system_u:system_r:sanlock_t:s0-s0:c0.c1023 key=(null)

On Fedora 20, the behavior is a little better:

1. 40 retries to send SIGTERM

type=AVC msg=audit(1410991380.283:100996): avc:  denied  { signal } for  pid=678 comm="sanlock-helper" scontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 tcontext=system_u:system_r:virtd_t:s0-s0:c0.c1023 tclass=process permissive=0
type=SYSCALL msg=audit(1410991380.283:100996): arch=c000003e syscall=62 success=no exit=-13 a0=79a a1=f a2=200 a3=6 items=0 ppid=677 pid=678 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="sanlock-helper" exe="/usr/sbin/sanlock" subj=system_u:system_r:sanlock_t:s0-s0:c0.c1023 key=(null)

2. The first attempt to send SIGKILL succeeds, and the host
   is not rebooted.

So this bug has existed upstream since Fedora 19 came out, and we could have detected and fixed it *before* rhel7 was released.

Comment 19 Nir Soffer 2014-09-17 22:58:31 UTC
The watchdog timings are correct; the host reboots 50-60 seconds after sanlock fails to maintain the lease.

In Event 1 (see comment 13), the log stops after 32 retries, but this does not mean that the machine was rebooted at that point. This is likely missing data that was not flushed to disk during the unclean shutdown.

Comment 21 Nir Soffer 2014-09-30 14:17:04 UTC
For automation we can use permissive mode on the rhel7 hosts.

Other than that, this must be fixed in the selinux policy.
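
A narrower workaround for the automation hosts, until the policy fix is available, could be to make only the sanlock domain permissive instead of the whole system (a sketch, assuming semanage from policycoreutils-python is installed):

  semanage permissive -a sanlock_t   # only sanlock_t denials are ignored
  semanage permissive -l             # list permissive domains
  semanage permissive -d sanlock_t   # remove once the fixed selinux-policy is installed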

Comment 22 Allon Mureinik 2014-10-02 13:05:00 UTC
(In reply to Nir Soffer from comment #21)
> For automation we can use permissive mode on the rhel7 hosts.
> 
> Other than that, this must be fixed in the selinux policy.
Removing the betablocker based on this comment. This is, however, a GA blocker.

Comment 23 Nir Soffer 2014-10-12 12:47:59 UTC
Ori, please do not change bug severity. Contact Allon for such changes.

Comment 24 Ori Gofen 2014-10-12 13:05:29 UTC
Sorry Nir, I should have posted a comment as well. 
I encountered this bug while running tests on IBM-PPC; the effect it has on the host caused more than just a reboot.
After discussing this issue with Meital, we decided to set this bug to urgent.

Comment 25 Michal Skrivanek 2014-10-13 10:47:27 UTC
(In reply to Ori from comment #24)
> Sorry Nir, I should have posted a comment as well. 
> I encountered this bug while running tests on IBM-PPC; the effect it has on
> the host caused more than just a reboot.
> After discussing this issue with Meital, we decided to set this bug to urgent

So what does it cause? Do the logs match the observations in the previous comments?

Comment 27 Ori Gofen 2014-10-14 07:17:40 UTC
Michal, currently PPC hosts run a custom IBM-PPC OS on top of fedora20.
When a host reboot is executed, the boot loader doesn't choose the default option and gets stuck on bootstrap; hence, the host can't rejoin its previous setup, which renders the DC non-operational.
This behavior, obviously, is not related to vdsm in any way. From the logs we observe a similarity to rhel7 hosts, though BZ #1151835 is a separate bug and needs to be verified once the fix is merged into the IBM-PPC/fedora OS.

Comment 30 Allon Mureinik 2014-10-21 15:28:01 UTC
Isn't this a dup of bug 1151835 ?

Comment 31 Nir Soffer 2014-10-21 18:53:43 UTC
(In reply to Allon Mureinik from comment #30)
> Isn't this a dup of bug 1151835 ?

No, bug 1151835 may be a dup of this. "May" because I have not seen any evidence yet that it is indeed this issue.

Comment 33 Allon Mureinik 2014-10-30 11:16:57 UTC
(In reply to Elad from comment #32)
> The provided link [1] doesn't have any rpm file under it 
> 
> [1] https://brewweb.devel.redhat.com/taskinfo?taskID=8095125

Here you go:
https://brewweb.devel.redhat.com/buildinfo?buildID=391552

Comment 34 Elad 2014-11-02 08:42:09 UTC
Setting back the need-info flag. Will be answered soon

Comment 35 Elad 2014-11-02 09:46:46 UTC
Verified whether the bug still reproduces using the latest vdsm build (vt8) with selinux-policy-3.12.1-153.el7.noarch, and it does. Attaching the logs of this reproduction (old_policy_logs).

Nov  2 11:07:38 green-vdsb wdmd[12647]: test failed rem 20 now 256999 ping 256949 close 256959 renewal 256882 expire 256962 client 12666 sanlock_a0f9e8e2-3647-4370-957a-b21654320621:2

===========================================================================


Updated selinux to selinux-policy-3.13.1-6.el7.noarch. The bug also reproduces with this selinux policy.

Nov  2 11:32:56 green-vdsb wdmd[859]: test failed rem -1 now 367 ping 296 close 306 renewal 236 expire 316 client 850 sanlock_a0f9e8e2-3647-4370-957a-b21654320621:2

Logs attached (new_policy_logs).

Used:
libselinux-2.2.2-6.el7.x86_64
selinux-policy-targeted-3.13.1-6.el7.noarch
libselinux-utils-2.2.2-6.el7.x86_64
libselinux-python-2.2.2-6.el7.x86_64
selinux-policy-3.13.1-6.el7.noarch
libselinux-ruby-2.2.2-6.el7.x86_64

vdsm-4.16.7.2-1.el7.x86_64

Comment 36 Elad 2014-11-02 09:47:41 UTC
Created attachment 952866 [details]
old_policy_logs

Comment 37 Elad 2014-11-02 09:48:16 UTC
Created attachment 952867 [details]
old_policy_logs

Comment 38 Elad 2014-11-02 09:50:28 UTC
(In reply to Elad from comment #37)
> Created attachment 952867 [details]
> old_policy_logs

Those are the logs from the second reproduction, with the new selinux policy.

Comment 39 Nir Soffer 2014-11-02 14:49:30 UTC
(In reply to Elad from comment #35)
> Nov  2 11:32:56 green-vdsb wdmd[859]: test failed rem -1 now 367 ping 296
> close 306 renewal 236 expire 316 client 850
> sanlock_a0f9e8e2-3647-4370-957a-b21654320621:2

Elad, this sanlock log has nothing to do with this bug.

Comment 40 Elad 2014-11-02 14:53:00 UTC
(In reply to Nir Soffer from comment #39)
> (In reply to Elad from comment #35)
> > Nov  2 11:32:56 green-vdsb wdmd[859]: test failed rem -1 now 367 ping 296
> > close 306 renewal 236 expire 316 client 850
> > sanlock_a0f9e8e2-3647-4370-957a-b21654320621:2
> 
> Elad, this sanlock log has nothing to do with this bug.

OK. You can use it as a time indication for the logs attached

Comment 41 Nir Soffer 2014-11-02 14:59:31 UTC
Elad, looking in sanlock log from attachment 952867 [details] (see comment 37),
I see two events.

1. Event ending at 2014-11-02 11:07:53+0200

2014-11-02 11:07:02+0200 256962 [12666]: s1 kill 21431 sig 15 count 1
...
2014-11-02 11:07:40+0200 257001 [12666]: s1 kill 21431 sig 15 count 40
2014-11-02 11:07:41+0200 257002 [12666]: s1 kill 21431 sig 9 count 41
...
2014-11-02 11:07:53+0200 257014 [12666]: s1 kill 21431 sig 9 count 53
<junk>
2014-11-02 11:10:57+0200 17 [850]: sanlock daemon started 3.1.0 host c7edbb8b-d110-4628-b305-b12afe4cea0e.green-vdsb

This event shows that sanlock is not allowed to send SIGKILL.

2. Event ending at 

2014-11-02 11:32:06+0200 316 [850]: s1 kill 3097 sig 15 count 1
...
2014-11-02 11:32:43+0200 354 [850]: s1 kill 3097 sig 15 count 39
<junk>
2014-11-02 11:35:54+0200 17 [871]: sanlock daemon started 3.1.0 host ecc23218-e17e-4581-a675-f46194a8bd71.green-vdsb

We don't know what happened here, since we don't have enough logs.

Can you confirm that both events happened while new selinux-policy was installed?

Comment 42 Elad 2014-11-02 15:03:38 UTC
(In reply to Nir Soffer from comment #41)
> Elad, looking in sanlock log from attachment 952867 [details] (see comment
> 37),
> I see two events.
> 
> 1. Event ending at 2014-11-02 11:07:53+0200
> 
> 2014-11-02 11:07:02+0200 256962 [12666]: s1 kill 21431 sig 15 count 1
> ...
> 2014-11-02 11:07:40+0200 257001 [12666]: s1 kill 21431 sig 15 count 40
> 2014-11-02 11:07:41+0200 257002 [12666]: s1 kill 21431 sig 9 count 41
> ...
> 2014-11-02 11:07:53+0200 257014 [12666]: s1 kill 21431 sig 9 count 53
> <junk>
> 2014-11-02 11:10:57+0200 17 [850]: sanlock daemon started 3.1.0 host
> c7edbb8b-d110-4628-b305-b12afe4cea0e.green-vdsb
> 
> This event shows that sanlock is not allowed to send SIGKILL
> 
> 2. Event ending at 
> 
> 2014-11-02 11:32:06+0200 316 [850]: s1 kill 3097 sig 15 count 1
> ...
> 2014-11-02 11:32:43+0200 354 [850]: s1 kill 3097 sig 15 count 39
> <junk>
> 2014-11-02 11:35:54+0200 17 [871]: sanlock daemon started 3.1.0 host
> ecc23218-e17e-4581-a675-f46194a8bd71.green-vdsb
> 
> We don't know what happened here, since we don't have enough logs.
> 
> Can you confirm that both events happened while new selinux-policy was
> installed?
No, event #1 occurred while the old selinux policy was installed. 
Event #2 occurred while the new selinux policy was installed

Comment 43 Nir Soffer 2014-11-02 15:07:40 UTC
(In reply to Elad from comment #42)
> (In reply to Nir Soffer from comment #41)
> > Can you confirm that both events happened while new selinux-policy was
> > installed?
> No, event #1 occurred while the old selinux policy was installed. 
> Event #2 occurred while the new selinux policy was installed

Ok, I need to see audit.log from this timeframe to tell if you reproduced
this or not.

The logs you uploaded do not include audit logs.

Comment 44 Elad 2014-11-02 15:19:47 UTC
I've uploaded all the content under /var/log.

Comment 45 Nir Soffer 2014-11-02 15:46:27 UTC
Miroslav, can you confirm that the fix for allowing sanlock to send signals to vdsm is included in these packages?

selinux-policy-targeted-3.13.1-6.el7.noarch
selinux-policy-3.13.1-6.el7.noarch

We have several rhel 7 hosts in QE where /var/log/audit
is empty - do you know why it is empty and how to configure
it so we have the audit logs again, or how to get the sanlock
AVCs without those logs?

Comment 46 Nir Soffer 2014-11-02 15:53:38 UTC
I verified the selinux fix with this build:
selinux-policy-3.12.1-153.el7_0.12.noarch
selinux-policy-targeted-3.12.1-153.el7_0.12.noarch

Maybe 3.13.1-6 does not include the required patches?

Comment 47 Miroslav Grepl 2014-11-03 15:43:03 UTC
What AVC are you getting with 3.13.1-6?

Comment 48 Nir Soffer 2014-11-03 16:09:50 UTC
(In reply to Miroslav Grepl from comment #47)
> What AVC are you getting with 3.13.1-6?

None, since I don't have audit.log - what is the way to get these
without the logs?

Comment 49 Miroslav Grepl 2014-11-05 07:48:48 UTC
Is auditd running?
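
A quick sketch of how to check and enable it on the QE hosts (systemd commands on rhel7):

  systemctl is-active auditd   # should print "active"
  systemctl enable auditd      # start at boot
  systemctl start auditd       # start now; AVCs then land in /var/log/audit/audit.log
  auditctl -s                  # kernel audit status; "enabled 1" means events are being collected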

Comment 50 Nir Soffer 2014-11-14 14:41:51 UTC
Elad, please answer Miroslav (see comment 49)

Comment 51 Elad 2014-11-20 10:30:07 UTC
Created attachment 959290 [details]
/var/log/ from host 20.11.14

(In reply to Nir Soffer from comment #50)
> Elad, please answer Miroslav (see comment 49)

Reproduced while auditd was running, attaching the logs.

Master domain was blocked at:

 Nov 20 12:22:03 green-vdsb iscsid: Kernel reported iSCSI connection 1:0 error (1011 - ISCSI_ERR_CONN_FAILED: iSCSI connection failed) state (3)
------------------------------------------BLOCKING MASTER DOMAIN-------------------------------------------



[root@green-vdsb ~]# rpm -qa |grep selinux
libselinux-2.2.2-6.el7.x86_64
selinux-policy-3.13.1-6.el7.noarch
libselinux-ruby-2.2.2-6.el7.x86_64
libselinux-utils-2.2.2-6.el7.x86_64
selinux-policy-targeted-3.13.1-6.el7.noarch
libselinux-python-2.2.2-6.el7.x86_64

vdsm-4.16.7.4-1.el7ev.x86_64

Red Hat Enterprise Linux Server release 7.0 (Maipo)

sanlock-3.1.0-2.el7.x86_64

Comment 52 Allon Mureinik 2014-12-09 12:28:38 UTC
Miroslav, audit.log is attached - can you take a look please?

Comment 53 Nir Soffer 2014-12-11 15:03:21 UTC
In audit.log, we can see 40 instances of:

type=AVC msg=audit(1416478990.034:655): avc:  denied  { signal } for  pid=739 comm="sanlock-helper" scontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 tcontext=system_u:system_r:unconfined_service_t:s0 tclass=process

And 8 of:

type=AVC msg=audit(1416479029.486:695): avc:  denied  { sigkill } for  pid=739 comm="sanlock-helper" scontext=system_u:system_r:sanlock_t:s0-s0:c0.c1023 tcontext=system_u:system_r:unconfined_service_t:s0 tclass=process

So the issue is reproduced with the installed packages.

Miroslav, it looks like the fix is not available in the version tested by Elad.
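
A hedged way to check whether an installed selinux-policy build already contains the needed rules (sesearch comes from the setools packages; the target type may be init_t, initrc_t or unconfined_service_t depending on how vdsm ends up labeled):

  sesearch --allow -s sanlock_t -t unconfined_service_t -c process -p signal
  sesearch --allow -s sanlock_t -t unconfined_service_t -c process -p sigkill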

Comment 54 Nir Soffer 2014-12-11 15:10:24 UTC
Elad, can you attach the output of these commands?

ps -efZ | grep initrc_t
ps -efZ | grep sanlock
ps -efZ | grep vdsm

Comment 56 Elad 2014-12-14 08:48:08 UTC
[root@RHEL7Server ~]# ps -efZ | grep initrc_t
unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 root 5655 5603  0 10:47 pts/1 00:00:00 grep --color=auto initrc_t



[root@RHEL7Server ~]# ps -efZ | grep sanlock
system_u:system_r:wdmd_t:s0     root      1511     1  0 10:35 ?        00:00:00 wdmd -G sanlock
system_u:system_r:sanlock_t:s0-s0:c0.c1023 sanlock 1519 1  0 10:35 ?   00:00:00 sanlock daemon -U sanlock -G sanlock
system_u:system_r:sanlock_t:s0-s0:c0.c1023 root 1531 1519  0 10:35 ?   00:00:00 sanlock daemon -U sanlock -G sanlock
unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 root 5698 5603  0 10:47 pts/1 00:00:00 grep --color=auto sanlock
                                                                                                               
[root@RHEL7Server ~]# ps -efZ | grep vdsm                                                                                                                                                                            
system_u:system_r:virtd_t:s0-s0:c0.c1023 root 1421 1  0 10:35 ?        00:00:00 /usr/bin/python /usr/share/vdsm/supervdsmServer --sockfile /var/run/vdsm/svdsm.sock
system_u:system_r:virtd_t:s0-s0:c0.c1023 qemu 3880 1421  0 10:36 ?     00:00:00 [supervdsmServer] <defunct>
system_u:system_r:virtd_t:s0-s0:c0.c1023 vdsm 5174 1  5 10:45 ?        00:00:04 /usr/bin/python /usr/share/vdsm/vdsm
system_u:system_r:virtd_t:s0-s0:c0.c1023 vdsm 5542 5174  0 10:46 ?     00:00:00 /usr/libexec/ioprocess --read-pipe-fd 39 --write-pipe-fd 38 --max-threads 60 --max-queued-requests -1
unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 root 5702 5603  0 10:47 pts/1 00:00:00 grep --color=auto vdsm

Comment 57 Elad 2014-12-14 08:58:41 UTC
Nir, the output from the previous comment is from a RHEL7 host with the SELinux packages provided in comment #55.
I tested the scenario again using those packages and it doesn't reproduce: the host doesn't reboot, reconstruct takes place, and the problem seems to be solved.

libselinux-2.2.2-6.el7.x86_64
selinux-policy-3.12.1-153.el7_0.13.noarch
selinux-policy-sandbox-3.12.1-153.el7_0.13.noarch
libselinux-utils-2.2.2-6.el7.x86_64
libselinux-python-2.2.2-6.el7.x86_64
selinux-policy-minimum-3.12.1-153.el7_0.13.noarch
selinux-policy-mls-3.12.1-153.el7_0.13.noarch
selinux-policy-targeted-3.12.1-153.el7_0.13.noarch
selinux-policy-devel-3.12.1-153.el7_0.13.noarch
selinux-policy-doc-3.12.1-153.el7_0.13.noarch

Comment 58 Nir Soffer 2014-12-14 12:47:17 UTC
We will have to require this version in the vdsm spec. Waiting until a version containing this fix is released.

Comment 59 Elad 2014-12-14 13:37:28 UTC
Nir, FYI, reconstruct takes place but does not finish successfully, because sanlock doesn't kill vdsm when the master domain is unreachable. https://bugzilla.redhat.com/show_bug.cgi?id=1149448

Comment 60 Nir Soffer 2014-12-16 22:29:44 UTC
Customers have never seen this issue; no doc text required.

Comment 62 lkuchlan 2014-12-17 13:49:19 UTC
Created attachment 970122 [details]
logs and image

It still reproduces.
Please find the logs attached.

Comment 63 lkuchlan 2014-12-18 08:32:22 UTC
Please disregard my previous comment; it is attached to another bug (id=1149448).

Comment 64 Elad 2014-12-22 07:55:57 UTC
Created attachment 971904 [details]
22.12.14

Tried to verify the issue. Blocked connectivity to the master domain (NFS) while there is another active domain (iSCSI) in the DC.
The host doesn't reboot and vdsm is killed. But it seems that the reconstruct flow gets stuck.

2014-12-22 09:11:08,708 WARN  [org.ovirt.engine.core.bll.storage.ReconstructMasterDomainCommand] (org.ovirt.thread.pool-7-thread-4) [6389c2e0] CanDoAction of action ReconstructMasterDomain failed. Reasons:VAR__ACTION__RECONSTRUCT_MASTER,VAR__TYPE__STORAGE__DOMAIN,ACTION_TYPE_FAILED_NO_VDS_IN_POOL


The host doesn't recover; it is stuck in state 'connecting' from now on.
Both storage domains are in active status.

Uploading the logs


Nir, Liron, I guess this is something which is not related to this specific bug. Should I report it separately or re-open the current one?

Comment 65 Nir Soffer 2014-12-22 13:11:18 UTC
(In reply to Elad from comment #64)
> Nir, Liron, I guess this is something which is not related to this specific
> bug. Should I report it separately or re-open the current one?

Please open a new bug for this new issue.

Comment 66 Elad 2014-12-23 13:07:29 UTC
Moving this one to VERIFIED since sanlock is able to kill vdsm once the master domain is unreachable.

Verified using rhev 3.5 vt13.4