Bug 1150243 - [3.5-6.6] RHEV-H Fail to start vm on libvirt error: open disk image permission denied.
Summary: [3.5-6.6] RHEV-H Fail to start vm on libvirt error: open disk image permissio...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-node-plugin-vdsm
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.5.0
Assignee: Fabian Deutsch
QA Contact: Ilanit Stein
URL:
Whiteboard: node
Depends On: 1150377
Blocks: 1122979 rhev35betablocker 1156038 1156105 rhev35rcblocker rhev35gablocker
 
Reported: 2014-10-07 18:42 UTC by Ilanit Stein
Modified: 2016-02-10 20:06 UTC (History)
28 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1156105 (view as bug list)
Environment:
Last Closed: 2015-02-12 14:02:39 UTC
oVirt Team: Node
Target Upstream Version:


Attachments
All logs of automatic vms test (3.32 MB, application/x-bzip)
2014-10-07 18:45 UTC, Ilanit Stein
logs (201.06 KB, application/x-gzip)
2014-10-08 06:35 UTC, Douglas Schilling Landgraf
rhevm-side logs (358.77 KB, application/x-xz)
2014-10-08 13:18 UTC, Fabian Deutsch
rhevh-side logs (150.04 KB, application/x-xz)
2014-10-08 13:19 UTC, Fabian Deutsch
vm_starts_successful (281.48 KB, image/png)
2014-10-14 14:13 UTC, Ying Cui
log for rhev-hypervisor6-6.6-20141021.0.el6ev (vdsm log is 3 hours before engine time) (1.38 MB, application/x-bzip)
2014-10-21 14:47 UTC, Ilanit Stein


Links
System ID Priority Status Summary Last Updated
oVirt gerrit 34400 master MERGED Add 03-vdsm-sebool-config handler Never
oVirt gerrit 34401 ovirt-3.5 MERGED Add 03-vdsm-sebool-config handler Never
oVirt gerrit 34510 master MERGED hooks: Update sebool hook to stay in sync w/ vdsm Never
oVirt gerrit 34511 ovirt-3.5 MERGED hooks: Update sebool hook to stay in sync w/ vdsm Never

Description Ilanit Stein 2014-10-07 18:42:43 UTC
Description of problem:
The VM fails to start in any way (Run, Run Once).

libvirt error from vdsm.log:

Thread-2608::DEBUG::2014-10-07 16:04:30,152::libvirtconnection::143::root::(wrapper) Unknown libvirterror: ecode: 1 edom: 10 level: 2 message: internal error process exited while connecting to monitor: 2014-10-07T16:04:29.965332Z qemu-kvm: -drive file=/rhev/data-center/68be8be1-1be6-4ef6-b0b7-bbd299c74ed9/a08453a4-d429-4e98-82bc-7d2a5a94bd0e/images/2d345fed-b095-4db3-8bb0-ac7ed4b55a26/b99e3e95-6e6b-4088-bcad-ca0afb03d02f,if=none,id=drive-virtio-disk0,format=qcow2,serial=2d345fed-b095-4db3-8bb0-ac7ed4b55a26,cache=none,werror=stop,rerror=stop,aio=threads: could not open disk image /rhev/data-center/68be8be1-1be6-4ef6-b0b7-bbd299c74ed9/a08453a4-d429-4e98-82bc-7d2a5a94bd0e/images/2d345fed-b095-4db3-8bb0-ac7ed4b55a26/b99e3e95-6e6b-4088-bcad-ca0afb03d02f: Permission denied

Thread-2608::DEBUG::2014-10-07 16:04:30,152::vm::2289::vm.Vm::(_startUnderlyingVm) vmId=`eb4c8cf2-cbfc-4f03-9ffc-d80e247bda83`::_ongoingCreations released
Thread-2608::ERROR::2014-10-07 16:04:30,152::vm::2326::vm.Vm::(_startUnderlyingVm) vmId=`eb4c8cf2-cbfc-4f03-9ffc-d80e247bda83`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 2266, in _startUnderlyingVm
  File "/usr/share/vdsm/virt/vm.py", line 3368, in _run
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2709, in createXML
libvirtError: internal error process exited while connecting to monitor: 2014-10-07T16:04:29.965332Z qemu-kvm: -drive file=/rhev/data-center/68be8be1-1be6-4ef6-b0b7-bbd299c74ed9/a08453a4-d429-4e98-82bc-7d2a5a94bd0e/images/2d345fed-b095-4db3-8bb0-ac7ed4b55a26/b99e3e95-6e6b-4088-bcad-ca0afb03d02f,if=none,id=drive-virtio-disk0,format=qcow2,serial=2d345fed-b095-4db3-8bb0-ac7ed4b55a26,cache=none,werror=stop,rerror=stop,aio=threads: could not open disk image /rhev/data-center/68be8be1-1be6-4ef6-b0b7-bbd299c74ed9/a08453a4-d429-4e98-82bc-7d2a5a94bd0e/images/2d345fed-b095-4db3-8bb0-ac7ed4b55a26/b99e3e95-6e6b-4088-bcad-ca0afb03d02f: Permission denied

Thread-2608::DEBUG::2014-10-07 16:04:30,154::vm::2838::vm.Vm::(setDownStatus) vmId=`eb4c8cf2-cbfc-4f03-9ffc-d80e247bda83`::Changed state to Down: internal error process exited while connecting to monitor: 2014-10-07T16:04:29.965332Z qemu-kvm: -drive file=/rhev/data-center/68be8be1-1be6-4ef6-b0b7-bbd299c74ed9/a08453a4-d429-4e98-82bc-7d2a5a94bd0e/images/2d345fed-b095-4db3-8bb0-ac7ed4b55a26/b99e3e95-6e6b-4088-bcad-ca0afb03d02f,if=none,id=drive-virtio-disk0,format=qcow2,serial=2d345fed-b095-4db3-8bb0-ac7ed4b55a26,cache=none,werror=stop,rerror=stop,aio=threads: could not open disk image /rhev/data-center/68be8be1-1be6-4ef6-b0b7-bbd299c74ed9/a08453a4-d429-4e98-82bc-7d2a5a94bd0e/images/2d345fed-b095-4db3-8bb0-ac7ed4b55a26/b99e3e95-6e6b-4088-bcad-ca0afb03d02f: Permission denied
 (code=1)

Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Hypervisor release 6.6 (20141007.0.el6ev)

How reproducible:
100%
Failed all start VM test cases in automation test.
http://jenkins.qa.lab.tlv.redhat.com:8080/view/Compute/view/3.5-git/view/Dashboard/job/3.5-git-compute-virt-reg_vms_rhevh-nfs/1/

Comment 1 Ilanit Stein 2014-10-07 18:45:00 UTC
Created attachment 944711 [details]
All logs of automatic vms test

Comment 3 Douglas Schilling Landgraf 2014-10-08 06:35:01 UTC
Hi Dan,

As a quick test, I changed SELinux to permissive and tried to start a VM based on NFS storage, but with no success. Any idea?

vdsm.log (full attached in .tar.gz)
=============================================
Dummy-154::DEBUG::2014-10-08 06:17:03,803::storage_mailbox::731::Storage.Misc.excCmd::(_checkForMail) SUCCESS: <err> = '1+0 records in\n1+0 records out\n1024000 bytes (1.0 MB) copied, 0.290565 s, 3.5 MB/s\n'; <rc> = 0
Thread-378::DEBUG::2014-10-08 06:17:04,180::libvirtconnection::143::root::(wrapper) Unknown libvirterror: ecode: 1 edom: 10 level: 2 message: internal error process exited while connecting to monitor: qemu-kvm: -drive file=/rhev/data-center/mnt/192.168.1.103:_nfs_nfs06/b14ec19c-daa8-4d24-bdbf-35cf6c2c32d1/images/11111111-1111-1111-1111-111111111111/CentOS-6.5-x86_64-minimal.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw,serial=: Duplicate ID 'drive-ide0-1-0' for drive

Thread-378::DEBUG::2014-10-08 06:17:04,182::vm::2289::vm.Vm::(_startUnderlyingVm) vmId=`004cd6f4-9b4b-4868-ac7e-3be0a74080bc`::_ongoingCreations released
Thread-378::ERROR::2014-10-08 06:17:04,182::vm::2326::vm.Vm::(_startUnderlyingVm) vmId=`004cd6f4-9b4b-4868-ac7e-3be0a74080bc`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 2266, in _startUnderlyingVm
  File "/usr/share/vdsm/virt/vm.py", line 3368, in _run
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2709, in createXML
libvirtError: internal error process exited while connecting to monitor: qemu-kvm: -drive file=/rhev/data-center/mnt/192.168.1.103:_nfs_nfs06/b14ec19c-daa8-4d24-bdbf-35cf6c2c32d1/images/11111111-1111-1111-1111-111111111111/CentOS-6.5-x86_64-minimal.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw,serial=: Duplicate ID 'drive-ide0-1-0' for drive

Thread-378::DEBUG::2014-10-08 06:17:04,185::vm::2838::vm.Vm::(setDownStatus) vmId=`004cd6f4-9b4b-4868-ac7e-3be0a74080bc`::Changed state to Down: internal error process exited while connecting to monitor: qemu-kvm: -drive file=/rhev/data-center/mnt/192.168.1.103:_nfs_nfs06/b14ec19c-daa8-4d24-bdbf-35cf6c2c32d1/images/11111111-1111-1111-1111-111111111111/CentOS-6.5-x86_64-minimal.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw,serial=: Duplicate ID 'drive-ide0-1-0' for drive
 (code=1)

Comment 4 Douglas Schilling Landgraf 2014-10-08 06:35:49 UTC
Created attachment 944862 [details]
logs

Comment 5 Francesco Romani 2014-10-08 07:01:20 UTC
Likely related:

https://bugzilla.redhat.com/show_bug.cgi?id=1130915

because of


Thread-378::DEBUG::2014-10-08 06:17:04,185::vm::2838::vm.Vm::(setDownStatus) vmId=`004cd6f4-9b4b-4868-ac7e-3be0a74080bc`::Changed state to Down: internal error process exited while connecting to monitor: qemu-kvm: -drive file=/rhev/data-center/mnt/192.168.1.103:_nfs_nfs06/b14ec19c-daa8-4d24-bdbf-35cf6c2c32d1/images/11111111-1111-1111-1111-111111111111/CentOS-6.5-x86_64-minimal.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw,serial=: Duplicate ID 'drive-ide0-1-0' for drive
 (code=1)

Comment 6 Omer Frenkel 2014-10-08 07:17:11 UTC
(In reply to Francesco Romani from comment #5)

Actually i assume its: Bug 1149637 - Cannot start VM with attached ISO

as a workaround - when you create the vm, don't attach a cd, you can edit the vm and add the cd after the create process is done

Comment 7 Ying Cui 2014-10-08 07:20:57 UTC
Virt QE failed to install rhevh 6.6 20141007.0.el6ev on a physical machine (bug #1150377), so we need to wait for bug #1150377 to be fixed first, then check this bug on the QE side.

Comment 8 Ilanit Stein 2014-10-08 08:56:21 UTC
(In reply to Omer Frenkel from comment #6)

The VM run has no iso attached.
I just created a new VM, added a disk, and pressed Run -> it failed to run.

Comment 9 Fabian Deutsch 2014-10-08 08:59:08 UTC
To me it looks like there are two problems: the one described in the description (permission denied on the image) and the one pointed out by Francesco in comment 5.

Ilanit, what ovirt engine build did you use?

Comment 10 Fabian Deutsch 2014-10-08 09:29:07 UTC
Ilanit, could you please also provide the permissions on the images:

$ ls -shal <images dir>

Comment 11 Fabian Deutsch 2014-10-08 13:16:40 UTC
I do not see this error with

rhevm-3.5.0-0.14.beta.el6ev.noarch
RHEV-H 6.6-20141007.0.el6ev

Steps I tried:

1. Install RHEV-H and RHEV-M
2. Export another path on the RHEV-M side using NFS
3. Add new domain for that NFS exported path
4. Create a new VM without disks
5. Start VM (worked)
6. Add a 4GB disk to the VM, residing on the data domain from 3.
7. Started VM (works)


Please provide the exact steps for how you ran into the problem given in the description.

Comment 12 Fabian Deutsch 2014-10-08 13:18:39 UTC
Created attachment 945000 [details]
rhevm-side logs

Comment 13 Fabian Deutsch 2014-10-08 13:19:19 UTC
Created attachment 945001 [details]
rhevh-side logs

This and the previous logs are from a test where I could not reproduce this issue.

Please provide more information.

Comment 14 Fabian Deutsch 2014-10-08 13:41:47 UTC
Even attaching an ISO and starting the VM is working.

Comment 15 Fabian Deutsch 2014-10-08 13:44:21 UTC
Ying, can you please try to reproduce this problem?

Comment 16 Fabian Deutsch 2014-10-08 19:12:28 UTC
Moving this to vdsm for further investigation.

Comment 17 Douglas Schilling Landgraf 2014-10-08 20:01:20 UTC
(In reply to Omer Frenkel from comment #6)
> (In reply to Francesco Romani from comment #5)
> 
> Actually i assume its: Bug 1149637 - Cannot start VM with attached ISO
> 
> as a workaround - when you create the vm, don't attach a cd, you can edit
> the vm and add the cd after the create process is done

Hi Omer and Francesco,

Thanks for the input, the error I saw in comment#3 is indeed a dup of bz#1149637. The workaround Omer shared in comment#6 worked like a charm.

About the original "Permission denied" report from comment #1, neither Fabian nor I could reproduce it. For now, we have added additional SELinux rules, based on the audit file in the VM, for the next build.

http://gerrit.ovirt.org/#/c/33947/
http://gerrit.ovirt.org/#/c/33946/
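For reference, the generic way to turn recorded AVC denials into a loadable policy module is sketched below. The module name `vdsm_local` is a placeholder, and the actual rules added for the build are in the gerrit patches above, so treat this as a sketch rather than what was merged:

```shell
# Sketch only (run as root on the affected host): build and load a local
# SELinux policy module from the denials recorded in the audit log.
# "vdsm_local" is a made-up module name, not taken from the linked patches.
audit2allow -a -M vdsm_local   # reads the audit log, writes vdsm_local.te and vdsm_local.pp
semodule -i vdsm_local.pp      # install the compiled module
```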

Comment 18 Dan Kenigsberg 2014-10-08 20:45:44 UTC
"Permission denied" went away in permissive mode; selinux rules posted to node.

Comment 19 Douglas Schilling Landgraf 2014-10-08 20:56:05 UTC
(In reply to Dan Kenigsberg from comment #18)
> "Permission denied" went away in permissive mode; selinux rules posted to
> node.

The tests were executed in SELinux Enforcing and Permissive modes. Let's keep this in the ovirt-node component for now.

Comment 21 Ying Cui 2014-10-09 10:55:02 UTC
(In reply to Fabian Deutsch from comment #15)
> Ying, can you please try to reproduce this problem?

Yes, I am trying to reproduce this issue with the 6.6 build now, but installing rhevh 6.6 fails (see bug #1150377), and upgrading rhevh 6.5.z (node 3.0) to 6.6 (node 3.1) fails too (bug coming soon), so I cannot provide more for this bug from the QE side until then.

I will try to reproduce this bug when bug 1150377 is fixed.

Comment 22 Ilanit Stein 2014-10-12 06:12:06 UTC
(in reply to comment #9) ovirt engine build: vt5 (rhevm-3.5.0-0.14.beta.el6ev)

The rhev-h hosts were installed by vdsm-upgrade.

Comment 23 Fabian Deutsch 2014-10-14 13:22:20 UTC
Moving this to MODIFIED, because it strongly looks like this is an SELinux error only (see comment 18).

Comment 24 Ying Cui 2014-10-14 13:24:40 UTC
Bug 1150377 is fixed in this build:
el6 http://brewweb.devel.redhat.com/brew/taskinfo?taskID=8090904

I tested this bug on the above build; I cannot reproduce it, and there is no libvirtError.

Test version:
ovirt-node-3.1.0-0.22.20141010git96b7ca3.el6.noarch
Red Hat Enterprise Virtualization Hypervisor release 6.6 (20141008.0.el6ev)
rhevm 3.5.0-0.14.beta.el6ev
vdsm-4.16.6-1.el6ev.x86_64

Test steps:
1. Install the RHEVH 6.6 build successfully.
2. Add rhevh from the rhevm portal.
3. Add an NFS domain.
4. Start a VM with disks: successful.

Comment 26 Ying Cui 2014-10-14 14:13:02 UTC
Created attachment 946924 [details]
vm_starts_successful

Comment 28 Fabian Deutsch 2014-10-21 10:35:49 UTC
Ilanit, you can install the rpm on the RHEV-M host, then a suggestion will appear to upgrade any RHEV-H instance in your cluster to this newer version.

Does this help?

Comment 30 Ilanit Stein 2014-10-21 14:45:52 UTC
Tested rhev-hypervisor6-6.6-20141021.0.el6ev with the automatic test:

Found the permission failure again:

Thread-70::ERROR::2014-10-21 12:21:34,948::vm::2341::vm.Vm::(_startUnderlyingVm) vmId=`5b090261-dde9-4ede-a216-8be98c1b1301`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 2301, in _startUnderlyingVm
  File "/usr/share/vdsm/vm.py", line 3252, in _run
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 92, in wrapper
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2709, in createXML
libvirtError: internal error process exited while connecting to monitor: 2014-10-21T12:21:34.784994Z qemu-kvm: -drive file=/rhev/data-center/94bfd2ed-a533-4bc3-b708-ab7936ce7a05/5080386f-c0c0-463c-b427-4ab074874075/images/f9695bfd-6972-4a77-b434-e4293d970bfb/bc33285d-81ae-46ec-808a-87797441595b,if=none,id=drive-virtio-disk0,format=qcow2,serial=f9695bfd-6972-4a77-b434-e4293d970bfb,cache=none,werror=stop,rerror=stop,aio=threads: could not open disk image /rhev/data-center/94bfd2ed-a533-4bc3-b708-ab7936ce7a05/5080386f-c0c0-463c-b427-4ab074874075/images/f9695bfd-6972-4a77-b434-e4293d970bfb/bc33285d-81ae-46ec-808a-87797441595b: Permission denied

logs attached.

Comment 31 Ilanit Stein 2014-10-21 14:47:58 UTC
Created attachment 949003 [details]
log for rhev-hypervisor6-6.6-20141021.0.el6ev (vdsm log is 3 hours before engine time)

You can look at host 10.35.64.190, @12:21

Comment 32 Fabian Deutsch 2014-10-21 15:07:37 UTC
Let me add that Ilanit verified that it works when permissive mode is used.

Comment 33 Fabian Deutsch 2014-10-21 18:21:26 UTC
Ilanit, can you please try to reproduce this issue manually? 

It seems that both of your failures appeared in the automation.
We are curious to get the exact steps and version to reproduce this problem manually.

Comment 35 Fabian Deutsch 2014-10-22 06:01:50 UTC
According to Ryan's investigations, this looks like an SELinux issue which prevents sanlock from accessing NFS mounts. Can this also be reproduced on RHEL? (Also see my question in comment 33.)

Comment 36 Ilanit Stein 2014-10-22 10:13:08 UTC
(in reply to comment #33)
Reproduced it manually (simply tried to start a VM) on rhev-h 6.6 (20141021.0.el6ev).
I tried it on 2 hosts, and on both starting the VM failed with:

VM mig is down. Exit message: internal error process exited while connecting to monitor: 2014-10-22T09:54:02.709210Z qemu-kvm: -drive file=/rhev/data-center/19128d10-126e-44a5-9774-bb7593dc069a/cc5dd25e-7d16-421e-9213-3ab7c4f0158a/images/daa34cb8-07fc-4b6e-829c-8ce5089904c9/38189aee-98d8-4780-9cbf-abb2473a701d,if=none,id=drive-virtio-disk0,format=raw,serial=daa34cb8-07fc-4b6e-829c-8ce5089904c9,cache=none,werror=stop,rerror=stop,aio=threads: could not open disk image /rhev/data-center/19128d10-126e-44a5-9774-bb7593dc069a/cc5dd25e-7d16-421e-9213-3ab7c4f0158a/images/daa34cb8-07fc-4b6e-829c-8ce5089904c9/38189aee-98d8-4780-9cbf-abb2473a701d: Permission denied
.

(in reply to comment #34)
[root@rose01 ~]# audit2allow -a
#============= sanlock_t ==============
#!!!! This avc can be allowed using the boolean 'sanlock_use_nfs'
allow sanlock_t nfs_t:dir search;
#!!!! This avc can be allowed using the boolean 'sanlock_use_nfs'
allow sanlock_t nfs_t:file open;
#============= sshd_t ==============
allow sshd_t var_log_t:file write;
#============= svirt_t ==============
#!!!! This avc can be allowed using the boolean 'virt_use_nfs'
allow svirt_t nfs_t:file open;
#!!!! This avc can be allowed using the boolean 'virt_use_nfs'
allow svirt_t nfs_t:filesystem getattr;

(in reply to comment #35)
Starting a VM works OK on rhel6.6 (with policycoreutils-python-2.0.83-19.47.el6_6.1.x86_64) and SELinux enforcing, with the same engine, av12.2.
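As a quick diagnostic for this class of failure, the `getsebool -a` output can be filtered for the booleans named in the denials above. This is a hypothetical helper, not part of vdsm; the boolean names are taken from the audit2allow output:

```shell
# Hypothetical helper: given `getsebool -a` style output on stdin, print the
# booleans named in the AVC denials above that are still off.
check_off() {
    required="virt_use_nfs virt_use_sanlock sanlock_use_nfs"
    off=""
    while read -r name _arrow value; do
        for want in $required; do
            if [ "$name" = "$want" ] && [ "$value" = "off" ]; then
                off="$off $name"
            fi
        done
    done
    echo "${off# }"    # trim the leading space
}

# Demo on canned sample values (on a live host: getsebool -a | check_off):
printf 'virt_use_nfs --> off\nvirt_use_sanlock --> off\nvirt_use_sysfs --> on\n' | check_off
# -> virt_use_nfs virt_use_sanlock
```

On an affected host, turning the reported booleans on persistently (`setsebool -P virt_use_nfs on`, and likewise for the others) is the manual workaround.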

Comment 37 Ying Cui 2014-10-22 10:40:01 UTC
If this can be reproduced on the rhevh 6.6 for 3.4.z build (20141021.0.el6ev), we need to clone it to z-stream, flag to 3.4.z.

Comment 38 Fabian Deutsch 2014-10-22 10:44:16 UTC
Thanks Ilanit.

(In reply to Ilanit Stein from comment #36)
> (in replay to comment #33)

...

> 
> (in replay to comment #34)
> [root@rose01 ~]# audit2allow -a
> #============= sanlock_t ==============
> #!!!! This avc can be allowed using the boolean 'sanlock_use_nfs'

^^ This line suggests that setsebool was not run (which should have set sanlock_use_nfs to on).

Comment 39 Fabian Deutsch 2014-10-22 10:50:25 UTC
Yes, the booleans are not set correctly IIUIC.

[root@rose01 ~]# getsebool -a | grep virt
virt_use_comm --> off
virt_use_execmem --> off
virt_use_fusefs --> off
virt_use_nfs --> off
virt_use_samba --> off
virt_use_sanlock --> off
virt_use_sysfs --> on
virt_use_usb --> on
virt_use_xserver --> off

[root@rose01 log]# rpm -q ovirt-node vdsm policycoreutils-python
ovirt-node-3.0.1-19.el6.18.noarch
vdsm-4.14.17-1.el6ev.x86_64
policycoreutils-python-2.0.83-19.47.el6_6.1.x86_64

[root@rose01 log]# cat /etc/system-release
Red Hat Enterprise Virtualization Hypervisor release 6.6 (20141021.0.el6ev)


Alon, is there a logfile that can tell us whether vdsm failed to update those booleans?

Comment 40 Fabian Deutsch 2014-10-22 11:03:28 UTC
Ilanit, did you add rose01 from RHEV-M side or from RHEV-H side?

Comment 41 haiyang,dong 2014-10-22 11:08:58 UTC
(In reply to Ilanit Stein from comment #36)
> (in replay to comment #33)
> Reproduce it manually (simply try to start vm) on rhev-h 6.6
> (20141021.0.el6ev)
> I tried in on 2 hosts, and on both start vm failed on:

Hey istein@,
I tried to reproduce this bug on my host, but I could not install rhev-hypervisor6-6.6-20141021.0.el6ev successfully on a physical machine; an Error Exception is displayed (see https://bugzilla.redhat.com/show_bug.cgi?id=1150377#c20).

How did you install the rhev-hypervisor6-6.6-20141021.0.el6ev iso on your host? Did you install it in a VM or on a physical machine?

> 
> VM mig is down. Exit message: internal error process exited while connecting
> to monitor: 2014-10-22T09:54:02.709210Z qemu-kvm: -drive
> file=/rhev/data-center/19128d10-126e-44a5-9774-bb7593dc069a/cc5dd25e-7d16-
> 421e-9213-3ab7c4f0158a/images/daa34cb8-07fc-4b6e-829c-8ce5089904c9/38189aee-
> 98d8-4780-9cbf-abb2473a701d,if=none,id=drive-virtio-disk0,format=raw,
> serial=daa34cb8-07fc-4b6e-829c-8ce5089904c9,cache=none,werror=stop,
> rerror=stop,aio=threads: could not open disk image
> /rhev/data-center/19128d10-126e-44a5-9774-bb7593dc069a/cc5dd25e-7d16-421e-
> 9213-3ab7c4f0158a/images/daa34cb8-07fc-4b6e-829c-8ce5089904c9/38189aee-98d8-
> 4780-9cbf-abb2473a701d: Permission denied
> .
> 
> (in replay to comment #34)
> [root@rose01 ~]# audit2allow -a
> #============= sanlock_t ==============
> #!!!! This avc can be allowed using the boolean 'sanlock_use_nfs'
> allow sanlock_t nfs_t:dir search;
> #!!!! This avc can be allowed using the boolean 'sanlock_use_nfs'
> allow sanlock_t nfs_t:file open;
> #============= sshd_t ==============
> allow sshd_t var_log_t:file write;
> #============= svirt_t ==============
> #!!!! This avc can be allowed using the boolean 'virt_use_nfs'
> allow svirt_t nfs_t:file open;
> #!!!! This avc can be allowed using the boolean 'virt_use_nfs'
> allow svirt_t nfs_t:filesystem getattr;
> 
> (in replay to comment #35)
> Start VM work OK on rhel6.6 (with
> policycoreutils-python-2.0.83-19.47.el6_6.1.x86_64) and selinux enforced,
> with the same engine, av12.2.

Comment 42 Fabian Deutsch 2014-10-22 11:46:17 UTC
The host was added to the cluster from the engine side.

Comment 43 Ilanit Stein 2014-10-22 11:56:58 UTC
(in reply to comment #41)
The rhev-h host was installed from a PXE server with rhev-h 6.5, and then I ran a vdsm upgrade on it, by:
1. Download the rhev-h rpm from brew
2. Extract the RPM: rpm2cpio <rpm_file> | cpio -idmv
3. Move the iso to /data/updates/ovirt-node-image.iso
4. Run /usr/share/vdsm-reg/vdsm-upgrade

Comment 44 haiyang,dong 2014-10-22 12:40:08 UTC
(In reply to Ilanit Stein from comment #43)
> (in reply to comment #41)
> The rhev-h host was installed from pxe server with a rhev-h 6.5, and then I
> run vdsm upgrade on it, by:
> 1. Download rhev-h rpm from brew
> 2. Extract the RPM > rpm2cpio <rpm_file> | cpio -idmv
> 3. Move the iso to /data/updates/ovirt-node-image.iso
> 4. Run /usr/share/vdsm-reg/vdsm-upgrade

Hey istein@,

Following your steps, I could now reproduce it, thanks.

Comment 45 Yaniv Bronhaim 2014-10-22 14:07:22 UTC
(In reply to Fabian Deutsch from comment #39)
> Yes, the booleans are not set correctly IIUIC.
> 
> [root@rose01 ~]# getsebool -a | grep virt
> virt_use_comm --> off
> virt_use_execmem --> off
> virt_use_fusefs --> off
> virt_use_nfs --> off
> virt_use_samba --> off
> virt_use_sanlock --> off
> virt_use_sysfs --> on
> virt_use_usb --> on
> virt_use_xserver --> off
> 
> [root@rose01 log]# rpm -q ovirt-node vdsm policycoreutils-python
> ovirt-node-3.0.1-19.el6.18.noarch
> vdsm-4.14.17-1.el6ev.x86_64
> policycoreutils-python-2.0.83-19.47.el6_6.1.x86_64
> 
> [root@rose01 log]# cat /etc/system-release
> Red Hat Enterprise Virtualization Hypervisor release 6.6 (20141021.0.el6ev)
> 
> 
> Alon, can you tell if some logfile can tell us if vdsm failed to update
> those booleans?

You might see it in syslog/journalctl. vdsm sets the booleans during installation. Currently we have a problem that if SELinux is permissive during vdsm installation, nothing will be set. The verb that sets the booleans is vdsm-tool sebool-config (we don't have a validation check currently, but we'll add one). So you can run that and see if the issue is solved afterwards.

Comment 46 Fabian Deutsch 2014-10-22 14:13:59 UTC
SELinux is not available at build time in brew; this might prevent the booleans from getting set correctly.

Running that verb at least sets the correct booleans; I still need to verify that the registration now works.

[root@dhcp-8-185 admin]# getsebool -a | grep virt
virt_use_comm --> off
virt_use_execmem --> off
virt_use_fusefs --> off
virt_use_nfs --> off
virt_use_samba --> off
virt_use_sanlock --> off
virt_use_sysfs --> on
virt_use_usb --> on
virt_use_xserver --> off
[root@dhcp-8-185 admin]# vdsm-tool sebool-config
[root@dhcp-8-185 admin]# getsebool -a | grep virt
virt_use_comm --> off
virt_use_execmem --> off
virt_use_fusefs --> on
virt_use_nfs --> on
virt_use_samba --> on
virt_use_sanlock --> on
virt_use_sysfs --> on
virt_use_usb --> on
virt_use_xserver --> off
[root@dhcp-8-185 admin]#

Comment 47 Fabian Deutsch 2014-10-22 14:46:26 UTC
Summary so far:

Running vdsm-tool sebool-config on RHEV-H will set the correct booleans and the VMs can be started.

Background:
Spawning VMs is prohibited by SELinux.
This is due to the fact that some SELinux booleans are not set correctly.
AFAIU they are not set correctly because they are set during the %post part of the vdsm install; this does not work with RHEV-H because brew is not running SELinux (which prevents the booleans from being set correctly in the image).

Probably we are only seeing this on 7.0 and 6.6, because the SELinux policy got stricter and is now preventing _all_ the calls necessary to spawn the VM.

Solution:
The booleans must be set at some other time, not during the build.

Comment 50 Barak 2014-10-23 12:30:44 UTC
I see 2 options for the short term:
1 - setting it on host-deploy / or upgrade
2 - check it on vdsm start and rerunning the config-tool ... this may slow down the vdsm start significantly.

Danken, Alon thoughts ?

Comment 51 Fabian Deutsch 2014-10-23 13:12:19 UTC
Just a note: ovirt-node-plugin-vdsm already uses some hooks to do configuration at runtime; maybe this mechanism is the right way to implement Barak's suggestion 2)

http://gerrit.ovirt.org/gitweb?p=ovirt-node-plugin-vdsm.git;a=tree;f=hooks/on-boot;hb=HEAD

Comment 52 Yaniv Bronhaim 2014-10-23 13:18:31 UTC
(In reply to Barak from comment #50)
> I see 2 options for the short term:
> 1 - setting it on host-deploy / or upgrade

No need to do that. It refers only to the persistence of the sebool configuration, which relates only to Node. So we will use the startup hook to set the booleans on each start, for now.
 
> 2 - check it on vdsm start and rerunning the config-tool ... this may slow
> down the vdsm start significantly.

Same as above. But later on we will move the sebool config to be part of the configure call, which will also persist the configuration. We'll post the fix ASAP. Anyhow, the workaround for 3.4 should be prepared by Fabian.

> 
> Danken, Alon thoughts ?

Comment 53 Alon Bar-Lev 2014-10-23 13:20:14 UTC
We run vdsm-tool configure during host-deploy; everything vdsm requires should be there, using proper or improper patches. Managing such cases should be done from the vdsm side.

Comment 54 Barak 2014-10-23 13:29:12 UTC
(In reply to Barak from comment #50)
> I see 2 options for the short term:
> 1 - setting it on host-deploy / or upgrade
> 2 - check it on vdsm start and rerunning the config-tool ... this may slow
> down the vdsm start significantly.
> 
> Danken, Alon thoughts ?

After talking to Danken ... it looks there are 2 additional options:
- We can try and do it on the boot time as well .... IIRC there are some hooks for boot time in rhevh
- we can add it to vdsm-reg start service

Comment 55 Barak 2014-10-23 13:30:56 UTC
(In reply to Alon Bar-Lev from comment #53)
> we run vdsm-tool configure during host-deploy, everything vdsm requires
> should be there using proper or inproper patches. Managing such cases should
> be from vdsm side.

But then it will not be persisted.

Comment 56 Alon Bar-Lev 2014-10-23 13:39:26 UTC
(In reply to Barak from comment #55)
> (In reply to Alon Bar-Lev from comment #53)
> > we run vdsm-tool configure during host-deploy, everything vdsm requires
> > should be there using proper or inproper patches. Managing such cases should
> > be from vdsm side.
> 
> But than it will not be persisted

I am not sure why anything that is done one way or another will or will not be persisted... if host-deploy does X, or host-deploy calls program Y to do X, it should be the same.

If someone can explain the exact issue in simple terms, it would be great.

Comment 57 Fabian Deutsch 2014-10-23 13:56:29 UTC
The reason why the SELinux policy cannot be persisted is that SELinux is loaded early at boot time, before the Node-specific scripts jump in and bind-mount the correct policy into place.
It is a known limitation that the persistence mechanism of Node cannot be used for items related to the early boot process.

A workaround could also have been to reload the SELinux policy after Node's bind-mounting happened, but I am not sure if that works; that is why we went with the safe way of setting the booleans during boot.
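For completeness, that rejected alternative would have amounted to reloading the policy once Node's bind mounts are in place; a sketch, untested as the comment says:

```shell
# Hypothetical alternative (explicitly not chosen above): once the persisted
# SELinux configuration has been bind-mounted into /etc/selinux by Node's
# early-boot scripts, ask the kernel to reload the policy so the persisted
# values take effect.
load_policy
```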

Comment 58 Fabian Deutsch 2014-10-23 14:56:08 UTC
*** Bug 1156038 has been marked as a duplicate of this bug. ***

Comment 60 Alon Bar-Lev 2014-10-23 18:06:00 UTC
(In reply to Fabian Deutsch from comment #57)
> The reason why the selinux policy can not be persisted is that the selinux
> is loaded early at boot time, before the node specific scripts jump in and
> bind mount the correct policy into place.
> It is a known limitation that the persistence mechanism of Node can not be
> used on items related to the early boot process.
> 
> A workaround could have also been to reload the selinux after the
> bind-mounting of Node happened, but I am not sure if that works, that is why
> we went with the safe way to set the booleans during boot.

In case this is Node-specific, a Node-specific solution should be implemented.

We should not expose Node difficulties to the world (including persisting configuration files, but this is old history).

Discuss this with the SELinux guys and find a solution; there is a way to reload the policy, and I do not see why it can't be done.

Also, solving it for component x will always fail to apply to components y and z.

Comment 61 Fabian Deutsch 2014-10-24 07:01:01 UTC
(In reply to Alon Bar-Lev from comment #60)
> (In reply to Fabian Deutsch from comment #57)
> > The reason why the selinux policy can not be persisted is that the selinux
> > is loaded early at boot time, before the node specific scripts jump in and
> > bind mount the correct policy into place.
> > It is a known limitation that the persistence mechanism of Node can not be
> > used on items related to the early boot process.
> > 
> > A workaround could have also been to reload the selinux after the
> > bind-mounting of Node happened, but I am not sure if that works, that is why
> > we went with the safe way to set the booleans during boot.
> 
> in case this is node specific, a node specific solution should be
> implemented.

Yep, that is why it's now implemented in ovirt-node-plugin-vdsm

> we should not expose node difficulties to the world (including persisting
> configuration files, but this is old history).

You are right, but we need to work with what we have today, and there it's difficult to hide all the details.

> discuss this with selinux guys and find a solution, there is a way to reload
> policy, I do not see why can't it be done.

Yep, we are aware of it, and we are actually investigating a different approach to address the problems from the ground up.

> also, solving it for component x will always fail to apply it to components
> y and z.

Sorry, I don't understand this.

Comment 62 Alon Bar-Lev 2014-10-24 07:07:32 UTC
(In reply to Fabian Deutsch from comment #61)
> (In reply to Alon Bar-Lev from comment #60)
> > also, solving it for component x will always fail to apply it to components
> > y and z.
> 
> Sorry, I don't understand this.

Providing a vdsm-specific solution and not a generic solution will likely fail for other components.

Comment 63 Fabian Deutsch 2014-10-24 07:34:21 UTC
(In reply to Alon Bar-Lev from comment #62)
> (In reply to Fabian Deutsch from comment #61)
> > (In reply to Alon Bar-Lev from comment #60)
> > > also, solving it for component x will always fail to apply it to components
> > > y and z.
> > 
> > Sorry, I don't understand this.
> 
> providing vdsm specific solution and not generic solution will likely fail
> for other components.

Thanks - yes, that's why we want to address this from the ground up.

Comment 64 Ilanit Stein 2014-11-24 06:26:00 UTC
Verified on RHEV-H 6.6 for RHEV 3.5 beta 5 (rhev-hypervisor6-6.6-20141119.0)
VM automatic test PASSED.

Comment 66 Fabian Deutsch 2015-02-12 14:02:39 UTC
RHEV 3.5.0 has been released. I am closing this bug because it has been VERIFIED.

