Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 824155

Summary: Can't enable VM resources
Product: Red Hat Enterprise Linux 6 Reporter: Madison Kelly <mkelly>
Component: selinux-policyAssignee: Miroslav Grepl <mgrepl>
Status: CLOSED NOTABUG QA Contact: Milos Malik <mmalik>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 6.2CC: agk, cluster-maint, dwalsh, lhh, mgrepl, mmalik, mnovacek, mtruneck
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-12-19 13:34:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Related cluster.conf none

Description Madison Kelly 2012-05-22 20:31:52 UTC
Created attachment 586170 [details]
Related cluster.conf

Description of problem:

I've got a basic cluster (cluster.conf will be attached) with one VM. When I try to enable this VM, it fails. I've tried increasing rgmanager's log verbosity but nothing else is recorded beyond what is shown below.

When I try to enable a VM, the following is recorded in syslog:
==[ command ]=============
[root@il0049 ~]# clusvcadm -e vm:vm01-id
Local machine trying to enable vm:vm01-id...Failure
==[ node 1 ]==============
May 22 16:29:29 il0049 rgmanager[2525]: Starting disabled service vm:vm01-id
May 22 16:29:29 il0049 rgmanager[2525]: start on vm "vm01-id" returned 1 (generic error)
May 22 16:29:29 il0049 rgmanager[2525]: #68: Failed to start vm:vm01-id; return value: 1
May 22 16:29:29 il0049 rgmanager[2525]: Stopping service vm:vm01-id
May 22 16:29:30 il0049 rgmanager[2525]: Service vm:vm01-id is recovering
May 22 16:29:30 il0049 rgmanager[2525]: #71: Relocating failed service vm:vm01-id
May 22 16:29:31 il0049 rgmanager[2525]: Service vm:vm01-id is stopped
==[ node 2 ]==============
May 22 16:29:30 il0050 rgmanager[2542]: Recovering failed service vm:vm01-id
May 22 16:29:31 il0050 rgmanager[2542]: start on vm "vm01-id" returned 1 (generic error)
May 22 16:29:31 il0050 rgmanager[2542]: #68: Failed to start vm:vm01-id; return value: 1
May 22 16:29:31 il0050 rgmanager[2542]: Stopping service vm:vm01-id
May 22 16:29:31 il0050 rgmanager[2542]: Service vm:vm01-id is recovering
==========================

Version-Release number of selected component (if applicable):

rgmanager-3.0.12.1-5.el6.x86_64
resource-agents-3.9.2-7.el6.x86_64

How reproducible:

Rare; This is the first cluster I've run into this problem on.

Steps to Reproduce:
1. Create a cluster using the cluster.conf attached.
2. Try to start the VM using clusvcadm -e ...
3.
  
Actual results:

Fails on the target, tries to recover on peer, fails and enters "failed" state.

Expected results:

Start the VM, or at least generate a slightly more useful error.

Additional info:

I worry that I am making a dumb mistake and this will act as a permanent recording of my silliness.

Comment 1 Madison Kelly 2012-05-22 20:44:48 UTC
In vm.sh, I've added an 'echo' of the 'cmdline' variable which spits out:

=============
cmdline C: [virsh create /shared/definitions//vm01-id.xml]
May 22 16:42:42 il0050 rgmanager[2542]: start on vm "vm01-id" returned 1 (generic error)
May 22 16:42:42 il0050 rgmanager[2542]: #68: Failed to start vm:vm01-id; return value: 1
=============

If I manually run 'virsh create /shared/definitions//vm01-id.xml', the VM starts and the exit code $? is 0.

Comment 2 Madison Kelly 2012-05-22 20:54:34 UTC
I've narrowed it down now, but it makes no sense from what I can tell. I've edited /usr/share/cluster/vm.sh thusly:

============
--- /usr/share/cluster/vm.sh	2011-12-07 17:40:15.000000000 -0500
+++ /usr/share/cluster/test.vm.sh	2012-05-22 16:49:43.000000000 -0400
@@ -427,7 +427,9 @@
 
 	ocf_log debug "$cmdline"
 
+	echo "A ?: [$?] - cmdline: [$cmdline]" >> /var/log/messages
 	$cmdline
+	echo "B ?: [$?]" >> /var/log/messages
 	return $?
 }
============

When this runs;

============
A ?: [0] - cmdline: [virsh create /shared/definitions//vm01-id.xml]
B ?: [1]
============

When I run the command directly though, it's just fine:

============
[root@il0050 files]# virsh create /shared/definitions//vm01-id.xml 
Domain vm01-id created from /shared/definitions//vm01-id.xml
============

I'm stumped.

Comment 3 Madison Kelly 2012-05-22 20:56:19 UTC
Changing to 'resource-agents' as this seems to be a problem with 'vm.sh' specifically.

Comment 4 Lon Hohberger 2012-05-22 21:00:42 UTC
SElinux denial?

Comment 6 Madison Kelly 2012-05-23 00:04:09 UTC
From /var/log/audit/audit.log:
============
type=VIRT_RESOURCE msg=audit(1337731120.514:37): user pid=3356 uid=0 auid=4294967295 ses=4294967295 msg='virt=kvm resrc=cgroup reason=allow vm="vm01-id" uuid=557bf583-dfa8-d6cf-c61c-f3b992dbf320 cgroup="/cgroup/devices/libvirt/qemu/vm01-id/" class=path path=/dev/null rdev=01:03 acl=rw: exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=? res=success'
============

You called that one. Now if you'll excuse me, I need to build an effigy to selinux...

Comment 7 Madison Kelly 2012-05-23 00:06:28 UTC
I can't close this, mind doing the honours? NOTABUG, DIGIMISSEDTHEOBVIOUS. :)

Comment 8 Lon Hohberger 2012-05-23 14:43:55 UTC
It's still a bug, but in selinux-policy.

Comment 9 Madison Kelly 2012-05-23 14:53:52 UTC
Reopening then.

Comment 10 Miroslav Grepl 2012-05-24 11:22:33 UTC
So it works in permissive mode?

Comment 11 Madison Kelly 2012-05-24 16:26:40 UTC
I disabled selinux, but I expect permissive would have worked just fine as it doesn't block. If you wish, I can test this. If so, do you want me to watch for any particular log file or other output?

Comment 12 Miroslav Grepl 2012-05-25 06:40:37 UTC
It would be great. You just need to test it in permissive mode and then execute

# ausearch -m avc -ts recent

Comment 13 Madison Kelly 2012-05-25 20:18:06 UTC
=========
[root@il0049 ~]# clustat
Cluster Status for cc-cluster-A @ Fri May 25 16:15:28 2012
Member Status: Quorate

 Member Name                             ID   Status
 ------ ----                             ---- ------
 il0049.optical-online.com                   1 Online, Local, rgmanager
 il0050.optical-online.com                   2 Online, rgmanager

 Service Name                   Owner (Last)                   State         
 ------- ----                   ----- ------                   -----         
 service:storage_il49           il0049.optical-online.com      started       
 service:storage_il50           il0050.optical-online.com      started       
 vm:vm01-id                     (none)                         disabled      
=========
[root@il0049 ~]# sestatus 
SELinux status:                 enabled
SELinuxfs mount:                /selinux
Current mode:                   permissive
Mode from config file:          permissive
Policy version:                 24
Policy from config file:        targeted
=========
[root@il0049 ~]# clusvcadm -F -e vm:vm01-id
Local machine trying to enable vm:vm01-id...Success
vm:vm01-id is now running on il0049.optical-online.com
=========
[root@il0049 ~]# ausearch -m avc -ts recent
----
time->Fri May 25 16:17:17 2012
type=SYSCALL msg=audit(1337977037.785:48): arch=c000003e syscall=2 success=yes exit=11 a0=1aef6c0 a1=0 a2=7fff0bd8e890 a3=7fff0bd8e5d0 items=0 ppid=4564 pid=4652 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="virsh" exe="/usr/bin/virsh" subj=system_u:system_r:xm_t:s0 key=(null)
type=AVC msg=audit(1337977037.785:48): avc:  denied  { open } for  pid=4652 comm="virsh" name="vm01-id.xml" dev=dm-0 ino=131140 scontext=system_u:system_r:xm_t:s0 tcontext=unconfined_u:object_r:file_t:s0 tclass=file
type=AVC msg=audit(1337977037.785:48): avc:  denied  { read } for  pid=4652 comm="virsh" name="vm01-id.xml" dev=dm-0 ino=131140 scontext=system_u:system_r:xm_t:s0 tcontext=unconfined_u:object_r:file_t:s0 tclass=file
=========
[root@il0049 ~]# clustat
Cluster Status for cc-cluster-A @ Fri May 25 16:17:29 2012
Member Status: Quorate

 Member Name                             ID   Status
 ------ ----                             ---- ------
 il0049.optical-online.com                   1 Online, Local, rgmanager
 il0050.optical-online.com                   2 Online, rgmanager

 Service Name                   Owner (Last)                   State         
 ------- ----                   ----- ------                   -----         
 service:storage_il49           il0049.optical-online.com      started       
 service:storage_il50           il0050.optical-online.com      started       
 vm:vm01-id                     il0049.optical-online.com      started

Comment 14 Miroslav Grepl 2012-05-27 19:29:44 UTC
You need to fix labelinf for "vm:vm01-id" file. Then it should work.

file_t indicates that the file has no label.

Execute

# restorecon -R -v PATHO/vm:vm01-id

Comment 15 Madison Kelly 2012-05-28 19:03:54 UTC
This did not solve the problem.

=========
[root@il0049 ~]# clustat 
Cluster Status for cc-cluster-A @ Mon May 28 14:37:39 2012
Member Status: Quorate

 Member Name                             ID   Status
 ------ ----                             ---- ------
 il0049.optical-online.com                   1 Online, Local, rgmanager
 il0050.optical-online.com                   2 Online, rgmanager

 Service Name                   Owner (Last)                   State         
 ------- ----                   ----- ------                   -----         
 service:storage_il49           il0049.optical-online.com      started       
 service:storage_il50           il0050.optical-online.com      started       
 vm:vm01-id                     (none)                         disabled      
=========
[root@il0049 ~]# sestatus 
SELinux status:                 enabled
SELinuxfs mount:                /selinux
Current mode:                   enforcing
Mode from config file:          enforcing
Policy version:                 24
Policy from config file:        targeted
=========
[root@il0049 ~]# restorecon -R -v PATHO/vm:vm01-id
=========
[root@il0049 ~]# clusvcadm -F -e vm:vm01-id
Local machine trying to enable vm:vm01-id...Failure
=========[ Syslog from il0049 ]=========
May 28 14:47:35 il0049 rgmanager[2437]: start on vm "vm01-id" returned 1 (generic error)
May 28 14:47:35 il0049 rgmanager[2437]: #68: Failed to start vm:vm01-id; return value: 1
May 28 14:47:35 il0049 rgmanager[2437]: Stopping service vm:vm01-id
May 28 14:47:35 il0049 rgmanager[2437]: Service vm:vm01-id is recovering
=========[ Syslog from il0050 ]=========
May 28 14:47:35 il0050 rgmanager[2453]: Recovering failed service vm:vm01-id
May 28 14:47:36 il0050 rgmanager[2453]: start on vm "vm01-id" returned 1 (generic error)
May 28 14:47:36 il0050 rgmanager[2453]: #68: Failed to start vm:vm01-id; return value: 1
May 28 14:47:36 il0050 rgmanager[2453]: Stopping service vm:vm01-id
May 28 14:47:36 il0050 rgmanager[2453]: Service vm:vm01-id is recovering
(it sits here with the service in 'recovering' state.
=========
[root@il0049 ~]# ausearch -m avc -ts recent
----
time->Mon May 28 14:42:27 2012
type=SYSCALL msg=audit(1338230547.712:9): arch=c000003e syscall=21 success=no exit=-13 a0=173e208 a1=1 a2=7fff19afe240 a3=100 items=0 ppid=1941 pid=1942 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="find" exe="/bin/find" subj=system_u:system_r:corosync_t:s0 key=(null)
type=AVC msg=audit(1338230547.712:9): avc:  denied  { execute } for  pid=1942 comm="find" name="vm.sh.test" dev=md2 ino=393922 scontext=system_u:system_r:corosync_t:s0 tcontext=unconfined_u:object_r:usr_t:s0 tclass=file
=========
Cluster Status for cc-cluster-A @ Mon May 28 14:51:38 2012
Member Status: Quorate

 Member Name                             ID   Status
 ------ ----                             ---- ------
 il0049.optical-online.com                   1 Online, Local, rgmanager
 il0050.optical-online.com                   2 Online, rgmanager

 Service Name                   Owner (Last)                   State         
 ------- ----                   ----- ------                   -----         
 service:storage_il49           il0049.optical-online.com      started       
 service:storage_il50           il0050.optical-online.com      started       
 vm:vm01-id                     none                           recovering

Comment 16 Miroslav Grepl 2012-05-29 05:43:21 UTC
Ok, this is a different issue. Where is the "vm.sh.test" exactly located?

Comment 18 Madison Kelly 2012-09-05 12:52:19 UTC
I believe that I had left it in the same directory as vm.sh. When I removed it, things started working.

Comment 19 Miroslav Grepl 2012-09-11 08:37:03 UTC
actually

# matchpathcon /usr/share/cluster/test.vm.sh
/usr/share/cluster/test.vm.sh	system_u:object_r:bin_t:s0

so I guess you needed just execute restorecon on this script.

Comment 23 Miroslav Grepl 2012-12-19 13:34:21 UTC
The file got wrong context because of the wrong file extension. All files in /usr/share/cluster ending with .sh will get the correct context.