Bug 824155
| Summary: | Can't enable VM resources | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Madison Kelly <mkelly> | ||||
| Component: | selinux-policy | Assignee: | Miroslav Grepl <mgrepl> | ||||
| Status: | CLOSED NOTABUG | QA Contact: | Milos Malik <mmalik> | ||||
| Severity: | unspecified | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 6.2 | CC: | agk, cluster-maint, dwalsh, lhh, mgrepl, mmalik, mnovacek, mtruneck | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | All | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2012-12-19 13:34:21 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
|
||||||
In vm.sh, I've added an 'echo' of the 'cmdline' variable which spits out: ============= cmdline C: [virsh create /shared/definitions//vm01-id.xml] May 22 16:42:42 il0050 rgmanager[2542]: start on vm "vm01-id" returned 1 (generic error) May 22 16:42:42 il0050 rgmanager[2542]: #68: Failed to start vm:vm01-id; return value: 1 ============= If I manually run 'virsh create /shared/definitions//vm01-id.xml', the VM starts and the exit code $? is 0. I've narrowed it down now, but it makes no sense from what I can tell. I've edited /usr/share/cluster/vm.sh thusly: ============ --- /usr/share/cluster/vm.sh 2011-12-07 17:40:15.000000000 -0500 +++ /usr/share/cluster/test.vm.sh 2012-05-22 16:49:43.000000000 -0400 @@ -427,7 +427,9 @@ ocf_log debug "$cmdline" + echo "A ?: [$?] - cmdline: [$cmdline]" >> /var/log/messages $cmdline + echo "B ?: [$?]" >> /var/log/messages return $? } ============ When this runs; ============ A ?: [0] - cmdline: [virsh create /shared/definitions//vm01-id.xml] B ?: [1] ============ When I run the command directly though, it's just fine: ============ [root@il0050 files]# virsh create /shared/definitions//vm01-id.xml Domain vm01-id created from /shared/definitions//vm01-id.xml ============ I'm stumped. Changing to 'resource-agents' as this seems to be a problem with 'vm.sh' specifically. SElinux denial? From /var/log/audit/audit.log: ============ type=VIRT_RESOURCE msg=audit(1337731120.514:37): user pid=3356 uid=0 auid=4294967295 ses=4294967295 msg='virt=kvm resrc=cgroup reason=allow vm="vm01-id" uuid=557bf583-dfa8-d6cf-c61c-f3b992dbf320 cgroup="/cgroup/devices/libvirt/qemu/vm01-id/" class=path path=/dev/null rdev=01:03 acl=rw: exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=? res=success' ============ You called that one. Now if you'll excuse me, I need to build an effigy to selinux... I can't close this, mind doing the honours? NOTABUG, DIGIMISSEDTHEOBVIOUS. :) It's still a bug, but in selinux-policy. Reopening then. So it works in permissive mode? I disabled selinux, but I expect permissive would have worked just fine as it doesn't block. If you wish, I can test this. If so, do you want me to watch for any particular log file or other output? It would be great. You just need to test it in permissive mode and then execute # ausearch -m avc -ts recent =========
[root@il0049 ~]# clustat
Cluster Status for cc-cluster-A @ Fri May 25 16:15:28 2012
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
il0049.optical-online.com 1 Online, Local, rgmanager
il0050.optical-online.com 2 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:storage_il49 il0049.optical-online.com started
service:storage_il50 il0050.optical-online.com started
vm:vm01-id (none) disabled
=========
[root@il0049 ~]# sestatus
SELinux status: enabled
SELinuxfs mount: /selinux
Current mode: permissive
Mode from config file: permissive
Policy version: 24
Policy from config file: targeted
=========
[root@il0049 ~]# clusvcadm -F -e vm:vm01-id
Local machine trying to enable vm:vm01-id...Success
vm:vm01-id is now running on il0049.optical-online.com
=========
[root@il0049 ~]# ausearch -m avc -ts recent
----
time->Fri May 25 16:17:17 2012
type=SYSCALL msg=audit(1337977037.785:48): arch=c000003e syscall=2 success=yes exit=11 a0=1aef6c0 a1=0 a2=7fff0bd8e890 a3=7fff0bd8e5d0 items=0 ppid=4564 pid=4652 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="virsh" exe="/usr/bin/virsh" subj=system_u:system_r:xm_t:s0 key=(null)
type=AVC msg=audit(1337977037.785:48): avc: denied { open } for pid=4652 comm="virsh" name="vm01-id.xml" dev=dm-0 ino=131140 scontext=system_u:system_r:xm_t:s0 tcontext=unconfined_u:object_r:file_t:s0 tclass=file
type=AVC msg=audit(1337977037.785:48): avc: denied { read } for pid=4652 comm="virsh" name="vm01-id.xml" dev=dm-0 ino=131140 scontext=system_u:system_r:xm_t:s0 tcontext=unconfined_u:object_r:file_t:s0 tclass=file
=========
[root@il0049 ~]# clustat
Cluster Status for cc-cluster-A @ Fri May 25 16:17:29 2012
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
il0049.optical-online.com 1 Online, Local, rgmanager
il0050.optical-online.com 2 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:storage_il49 il0049.optical-online.com started
service:storage_il50 il0050.optical-online.com started
vm:vm01-id il0049.optical-online.com started
You need to fix labelinf for "vm:vm01-id" file. Then it should work. file_t indicates that the file has no label. Execute # restorecon -R -v PATHO/vm:vm01-id This did not solve the problem.
=========
[root@il0049 ~]# clustat
Cluster Status for cc-cluster-A @ Mon May 28 14:37:39 2012
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
il0049.optical-online.com 1 Online, Local, rgmanager
il0050.optical-online.com 2 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:storage_il49 il0049.optical-online.com started
service:storage_il50 il0050.optical-online.com started
vm:vm01-id (none) disabled
=========
[root@il0049 ~]# sestatus
SELinux status: enabled
SELinuxfs mount: /selinux
Current mode: enforcing
Mode from config file: enforcing
Policy version: 24
Policy from config file: targeted
=========
[root@il0049 ~]# restorecon -R -v PATHO/vm:vm01-id
=========
[root@il0049 ~]# clusvcadm -F -e vm:vm01-id
Local machine trying to enable vm:vm01-id...Failure
=========[ Syslog from il0049 ]=========
May 28 14:47:35 il0049 rgmanager[2437]: start on vm "vm01-id" returned 1 (generic error)
May 28 14:47:35 il0049 rgmanager[2437]: #68: Failed to start vm:vm01-id; return value: 1
May 28 14:47:35 il0049 rgmanager[2437]: Stopping service vm:vm01-id
May 28 14:47:35 il0049 rgmanager[2437]: Service vm:vm01-id is recovering
=========[ Syslog from il0050 ]=========
May 28 14:47:35 il0050 rgmanager[2453]: Recovering failed service vm:vm01-id
May 28 14:47:36 il0050 rgmanager[2453]: start on vm "vm01-id" returned 1 (generic error)
May 28 14:47:36 il0050 rgmanager[2453]: #68: Failed to start vm:vm01-id; return value: 1
May 28 14:47:36 il0050 rgmanager[2453]: Stopping service vm:vm01-id
May 28 14:47:36 il0050 rgmanager[2453]: Service vm:vm01-id is recovering
(it sits here with the service in 'recovering' state.
=========
[root@il0049 ~]# ausearch -m avc -ts recent
----
time->Mon May 28 14:42:27 2012
type=SYSCALL msg=audit(1338230547.712:9): arch=c000003e syscall=21 success=no exit=-13 a0=173e208 a1=1 a2=7fff19afe240 a3=100 items=0 ppid=1941 pid=1942 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="find" exe="/bin/find" subj=system_u:system_r:corosync_t:s0 key=(null)
type=AVC msg=audit(1338230547.712:9): avc: denied { execute } for pid=1942 comm="find" name="vm.sh.test" dev=md2 ino=393922 scontext=system_u:system_r:corosync_t:s0 tcontext=unconfined_u:object_r:usr_t:s0 tclass=file
=========
Cluster Status for cc-cluster-A @ Mon May 28 14:51:38 2012
Member Status: Quorate
Member Name ID Status
------ ---- ---- ------
il0049.optical-online.com 1 Online, Local, rgmanager
il0050.optical-online.com 2 Online, rgmanager
Service Name Owner (Last) State
------- ---- ----- ------ -----
service:storage_il49 il0049.optical-online.com started
service:storage_il50 il0050.optical-online.com started
vm:vm01-id none recovering
Ok, this is a different issue. Where is the "vm.sh.test" exactly located? I believe that I had left it in the same directory as vm.sh. When I removed it, things started working. actually # matchpathcon /usr/share/cluster/test.vm.sh /usr/share/cluster/test.vm.sh system_u:object_r:bin_t:s0 so I guess you needed just execute restorecon on this script. The file got wrong context because of the wrong file extension. All files in /usr/share/cluster ending with .sh will get the correct context. |
Created attachment 586170 [details] Related cluster.conf Description of problem: I've got a basic cluster (cluster.conf will be attached) with one VM. When I try to enable this VM, it fails. I've tried increasing rgmanager's log verbosity but nothing else is recorded beyond what is shown below. When I try to enable a VM, the following is recorded in syslog: ==[ command ]============= [root@il0049 ~]# clusvcadm -e vm:vm01-id Local machine trying to enable vm:vm01-id...Failure ==[ node 1 ]============== May 22 16:29:29 il0049 rgmanager[2525]: Starting disabled service vm:vm01-id May 22 16:29:29 il0049 rgmanager[2525]: start on vm "vm01-id" returned 1 (generic error) May 22 16:29:29 il0049 rgmanager[2525]: #68: Failed to start vm:vm01-id; return value: 1 May 22 16:29:29 il0049 rgmanager[2525]: Stopping service vm:vm01-id May 22 16:29:30 il0049 rgmanager[2525]: Service vm:vm01-id is recovering May 22 16:29:30 il0049 rgmanager[2525]: #71: Relocating failed service vm:vm01-id May 22 16:29:31 il0049 rgmanager[2525]: Service vm:vm01-id is stopped ==[ node 2 ]============== May 22 16:29:30 il0050 rgmanager[2542]: Recovering failed service vm:vm01-id May 22 16:29:31 il0050 rgmanager[2542]: start on vm "vm01-id" returned 1 (generic error) May 22 16:29:31 il0050 rgmanager[2542]: #68: Failed to start vm:vm01-id; return value: 1 May 22 16:29:31 il0050 rgmanager[2542]: Stopping service vm:vm01-id May 22 16:29:31 il0050 rgmanager[2542]: Service vm:vm01-id is recovering ========================== Version-Release number of selected component (if applicable): rgmanager-3.0.12.1-5.el6.x86_64 resource-agents-3.9.2-7.el6.x86_64 How reproducible: Rare; This is the first cluster I've run into this problem on. Steps to Reproduce: 1. Create a cluster using the cluster.conf attached. 2. Try to start the VM using clusvcadm -e ... 3. Actual results: Fails on the target, tries to recover on peer, fails and enters "failed" state. Expected results: Start the VM, or at least generate a slightly more useful error. Additional info: I worry that I am making a dumb mistake and this will act as a permanent recording of my silliness.