Description of problem:
1. When SELinux is enforcing and a KVM guest is configured to use hugepages, starting the guest triggers SELinux AVC denials, and HugePages_Free stays equal to HugePages_Total.
2. When SELinux is permissive and a KVM guest is configured to use hugepages, HugePages_Free decreases 95->81->42->31->9->4->1->0. After that, libvirtd hangs.

Version-Release number of selected component (if applicable):
- RHEL5u6-20100924-Server-x86_64-kvm
- kernel-2.6.18-223.el5
- kvm-83-199.el5
- libvirt-0.8.2-6.el5

How reproducible:
3/3

Steps to Reproduce:
With SELinux enforcing:
1. # getenforce
   Enforcing
2. # mkdir /dev/hugepages
3. # mount -t hugetlbfs hugetlbfs /dev/hugepages
4. # sysctl vm.nr_hugepages=100
5. # service libvirtd restart
6. # more /proc/meminfo | grep Huge
   HugePages_Total: 100
   HugePages_Free: 100
   HugePages_Rsvd: 0
   Hugepagesize: 2048 kB
7. Add the hugepage configuration to the guest config XML:
   <memoryBacking><hugepages/></memoryBacking>
8. # virsh start rhel5-6
   Domain rhel5-6 started
9. # more /proc/meminfo | grep Huge
   HugePages_Total: 100
   HugePages_Free: 100
   HugePages_Rsvd: 0
   Hugepagesize: 2048 kB

With SELinux permissive:
10. # virsh destroy rhel5-6
11. # setenforce 0
    # getenforce
    Permissive
12. # virsh start rhel5-6
    Domain rhel5-6 started
13. # more /proc/meminfo | grep Huge
    HugePages_Total: 100
    HugePages_Free: 95
    HugePages_Rsvd: 0
    Hugepagesize: 2048 kB
    [root@dhcp-93-197 ~]# more /proc/meminfo | grep Huge
    HugePages_Total: 100
    HugePages_Free: 81
    HugePages_Rsvd: 0
    Hugepagesize: 2048 kB
    [root@dhcp-93-197 ~]# more /proc/meminfo | grep Huge
    HugePages_Total: 100
    HugePages_Free: 56
    HugePages_Rsvd: 0
    Hugepagesize: 2048 kB
    ........
    [root@dhcp-93-197 ~]# more /proc/meminfo | grep Huge
    HugePages_Total: 100
    HugePages_Free: 1
    HugePages_Rsvd: 0
    Hugepagesize: 2048 kB
    [root@dhcp-93-197 ~]# more /proc/meminfo | grep Huge
    HugePages_Total: 100
    HugePages_Free: 0
    HugePages_Rsvd: 0
    Hugepagesize: 2048 kB
14. # virsh list --all
    Id Name State
    ----------------------------------

Actual results:
With SELinux enforcing:
8. SELinux reports AVC denial warnings. HugePages_Free does not change and stays equal to HugePages_Total.
With SELinux permissive:
13. HugePages_Free decreases: 95, 81, ... 9, 4, 1, 0.
14. `virsh list --all` lists no guests and libvirtd hangs.

Expected results:
In both enforcing and permissive mode, hugepage setup should not be denied and libvirtd should not hang.

Additional info:
Attaching the output of tail -f /var/log/messages and tail -f /var/log/audit/audit.log for reference.
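The HugePages_* counters polled by hand in step 13 can also be read programmatically. The following is only a minimal sketch, assuming the /proc/meminfo layout shown in the steps above; the function name `parse_hugepages` and the `SAMPLE` text are illustrative and not part of any tool mentioned in this report.

```python
def parse_hugepages(meminfo_text):
    """Return the Huge* fields of /proc/meminfo-style text as a dict of ints."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        key = key.strip()
        if not sep or not key.startswith("Huge"):
            continue
        # The value is the first token; a trailing unit such as "kB" is dropped.
        fields[key] = int(rest.split()[0])
    return fields


# Sample text copied from the first reading in step 13 (hypothetical stand-in
# for the live file).
SAMPLE = """\
HugePages_Total: 100
HugePages_Free: 95
HugePages_Rsvd: 0
Hugepagesize: 2048 kB
"""

if __name__ == "__main__":
    # On a live host, read the real file instead of SAMPLE:
    #   with open("/proc/meminfo") as f:
    #       text = f.read()
    info = parse_hugepages(SAMPLE)
    in_use = info["HugePages_Total"] - info["HugePages_Free"]
    print(f"{in_use} of {info['HugePages_Total']} huge pages in use")
```

Calling this in a loop with a short sleep would reproduce the manual polling from step 13 without retyping the command.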
Created attachment 450165 [details] /var/log/messages
Created attachment 450166 [details] /var/log/audit/audit.log
In another terminal tab, execute:
15. # virsh destroy 2
    error: Failed to destroy domain 2
    error: Timed out during operation: cannot acquire state change lock

# tail /var/log/messages
Sep 28 19:56:10 dhcp-93-197 last message repeated 3 times
Sep 28 20:26:31 dhcp-93-197 libvirtd: 20:26:31.005: error : qemuDomainObjBeginJobWithDriver:405 : Timed out during operation: cannot acquire state change lock
I then continued with a few more operations, in case they help:
16. # service libvirtd status
    libvirtd (pid 4624) is running...
17. # service libvirtd restart
    Stopping libvirtd daemon: [ OK ]
    Starting libvirtd daemon: [ OK ]
18. # virsh list --all
    Id Name State
    ----------------------------------

After step 18, libvirtd still hangs.

# tail /var/log/messages
Sep 28 20:38:27 dhcp-93-197 libvirtd: 20:38:27.692: warning : qemudDispatchSignalEvent:396 : Shutting down on signal 15
Sep 28 20:38:32 dhcp-93-197 libvirtd: 20:38:32.036: error : virRunWithHook:857 : internal error '/sbin/iptables --table filter --delete INPUT --in-interface virbr0 --protocol udp --destination-port 69 --jump ACCEPT' exited with non-zero status 1 and signal 0: iptables: Bad rule (does a matching rule exist in that chain?)
Sep 28 20:38:32 dhcp-93-197 libvirtd: 20:38:32.497: warning : qemudStartup:1656 : Unable to create cgroup for driver: No such device or address
Testing on libvirt-python-0.8.2-7.el5 and libvirt-0.8.2-7.el5 with SELinux enforcing:
1. Mount hugetlbfs, e.g.:
   # mkdir /dev/hugepages
   # mount -t hugetlbfs hugetlbfs /dev/hugepages
2. Reserve memory for huge pages, e.g.:
   # sysctl vm.nr_hugepages=100
3. Restart the libvirtd service:
   # service libvirtd restart
4. Check meminfo before the guest starts:
   # more /proc/meminfo | grep Huge
5. Run a guest with the following XML added to the domain XML:
   <memoryBacking><hugepages/></memoryBacking>
6. Confirm the guest booted correctly.
7. Check HugePages_Free in meminfo again:
   # more /proc/meminfo | grep Huge
   HugePages_Total: 100
   HugePages_Free: 100
   HugePages_Rsvd: 0
   Hugepagesize: 2048 kB

HugePages_Free is unchanged. /var/log/audit/audit.log shows:

type=AVC msg=audit(1287568512.348:3191): avc: denied { unlink } for pid=11776 comm="qemu-kvm" name="kvm.c0G1pP" dev=hugetlbfs ino=41452 scontext=system_u:system_r:svirt_t:s0:c126,c371 tcontext=system_u:object_r:hugetlbfs_t:s0 tclass=file
type=SYSCALL msg=audit(1287568512.348:3191): arch=c000003e syscall=87 success=no exit=-13 a0=19517ee0 a1=c2 a2=7 a3=0 items=0 ppid=1 pid=11776 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=513 comm="qemu-kvm" exe="/usr/libexec/qemu-kvm" subj=system_u:system_r:svirt_t:s0:c126,c371 key=(null)
type=AVC msg=audit(1287568512.348:3192): avc: denied { write } for pid=11776 comm="qemu-kvm" name="kvm.c0G1pP" dev=hugetlbfs ino=41452 scontext=system_u:system_r:svirt_t:s0:c126,c371 tcontext=system_u:object_r:hugetlbfs_t:s0 tclass=file
type=SYSCALL msg=audit(1287568512.348:3192): arch=c000003e syscall=77 success=no exit=-13 a0=8 a1=41600000 a2=ffe00000 a3=0 items=0 ppid=1 pid=11776 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=513 comm="qemu-kvm" exe="/usr/libexec/qemu-kvm" subj=system_u:system_r:svirt_t:s0:c126,c371 key=(null)
type=AVC msg=audit(1287568512.348:3193): avc: denied { read } for pid=11776 comm="qemu-kvm" path="/dev/hugepages/libvirt/qemu/kvm.c0G1pP" dev=hugetlbfs ino=41452 scontext=system_u:system_r:svirt_t:s0:c126,c371 tcontext=system_u:object_r:hugetlbfs_t:s0 tclass=file
type=SYSCALL msg=audit(1287568512.348:3193): arch=c000003e syscall=9 success=no exit=-13 a0=0 a1=41600000 a2=3 a3=2 items=0 ppid=1 pid=11776 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=513 comm="qemu-kvm" exe="/usr/libexec/qemu-kvm" subj=system_u:system_r:svirt_t:s0:c126,c371 key=(null)
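The audit records above boil down to three denied permissions for the same source and target types. As a rough illustration, the sketch below extracts that summary with a regular expression. It assumes the `avc: denied { ... } ... scontext=... tcontext=... tclass=...` field order seen in these records; the names `AVC_RE` and `summarize_denials` are hypothetical, and this is not a substitute for the real audit tooling.

```python
import re

# Matches one AVC denial and captures the permission list, both contexts,
# and the object class. Assumes the field order seen in the records above.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}.*?"
    r"scontext=(?P<scontext>\S+)\s+"
    r"tcontext=(?P<tcontext>\S+)\s+"
    r"tclass=(?P<tclass>\S+)"
)


def summarize_denials(log_text):
    """Return sorted (permission, source type, target type, class) tuples."""
    denials = set()
    for match in AVC_RE.finditer(log_text):
        # A context is user:role:type:level; the type is the third field.
        source_type = match.group("scontext").split(":")[2]
        target_type = match.group("tcontext").split(":")[2]
        for perm in match.group("perms").split():
            denials.add((perm, source_type, target_type, match.group("tclass")))
    return sorted(denials)


# The three AVC records from the log above (SYSCALL records omitted).
SAMPLE_LOG = (
    'type=AVC msg=audit(1287568512.348:3191): avc: denied { unlink } for pid=11776 comm="qemu-kvm" name="kvm.c0G1pP" dev=hugetlbfs ino=41452 scontext=system_u:system_r:svirt_t:s0:c126,c371 tcontext=system_u:object_r:hugetlbfs_t:s0 tclass=file\n'
    'type=AVC msg=audit(1287568512.348:3192): avc: denied { write } for pid=11776 comm="qemu-kvm" name="kvm.c0G1pP" dev=hugetlbfs ino=41452 scontext=system_u:system_r:svirt_t:s0:c126,c371 tcontext=system_u:object_r:hugetlbfs_t:s0 tclass=file\n'
    'type=AVC msg=audit(1287568512.348:3193): avc: denied { read } for pid=11776 comm="qemu-kvm" path="/dev/hugepages/libvirt/qemu/kvm.c0G1pP" dev=hugetlbfs ino=41452 scontext=system_u:system_r:svirt_t:s0:c126,c371 tcontext=system_u:object_r:hugetlbfs_t:s0 tclass=file\n'
)

if __name__ == "__main__":
    for perm, source, target, tclass in summarize_denials(SAMPLE_LOG):
        print(f"{source} denied {perm} on {target}:{tclass}")
```

For this log the summary is that svirt_t is denied read, write and unlink on files of type hugetlbfs_t.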
I was able to reproduce this locally. Apparently there are not enough huge pages available for the guest's memory, and the kernel tries to kill qemu-kvm ("kernel: VM: killing process qemu-kvm" message in the log) when HugePages_Free gets to zero and that is still not enough. Interestingly, the kernel doesn't succeed in killing the qemu-kvm process: I can still see it in the process list, and /proc/PID/status says it's in sleeping state. Sending SIGKILL kills it; SIGTERM doesn't. Also, libvirt is not really stuck, it's just waiting on that qemu-kvm process. Setting the guest's memory to fit into HugePages_Total makes everything work, although only in permissive mode.
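Whether a guest "fits into HugePages_Total" is simple arithmetic. The sketch below is an illustration only: the guest memory size is not stated in this report, so the 512 MiB figure is a made-up example, and the helper names are hypothetical. It also ignores any memory qemu-kvm may need beyond the guest's RAM.

```python
def hugepages_needed(guest_mem_kib, hugepage_kib):
    """Number of huge pages required to back guest_mem_kib (rounded up)."""
    return -(-guest_mem_kib // hugepage_kib)  # ceiling division


def fits_in_pool(guest_mem_kib, total_pages, hugepage_kib):
    """True if the guest's memory can be backed entirely by the pool."""
    return hugepages_needed(guest_mem_kib, hugepage_kib) <= total_pages


if __name__ == "__main__":
    total_pages = 100      # HugePages_Total from the report
    hugepage_kib = 2048    # Hugepagesize from the report
    pool_kib = total_pages * hugepage_kib
    print(f"pool size: {pool_kib} KiB ({pool_kib // 1024} MiB)")

    guest_mem_kib = 512 * 1024  # hypothetical guest size, not from the report
    needed = hugepages_needed(guest_mem_kib, hugepage_kib)
    print(f"guest needs {needed} pages, fits: "
          f"{fits_in_pool(guest_mem_kib, total_pages, hugepage_kib)}")
```

With the reported settings the pool is 100 x 2048 kB = 200 MiB, so any guest configured with more memory than that would exhaust the pool in the way described.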
I'm reassigning to selinux, as it doesn't look like we can do anything about this behavior in libvirt.
Miroslav, I think we have another bug about allowing svirt to manage hugepages, since hugepages does not support labeling in RHEL5.
Yes, we have. I am closing this bug as DUPLICATE. *** This bug has been marked as a duplicate of bug 652644 ***