Bug 638161
Summary: | Set hugepage for kvm guest will cause libvirtd hang
---|---
Product: | Red Hat Enterprise Linux 5
Component: | selinux-policy
Version: | 5.6
Hardware: | All
OS: | Linux
Status: | CLOSED DUPLICATE
Severity: | high
Priority: | low
Target Milestone: | rc
Reporter: | zhanghaiyan <yoyzhang>
Assignee: | Miroslav Grepl <mgrepl>
QA Contact: | BaseOS QE Security Team <qe-baseos-security>
CC: | amit.shah, dallan, dwalsh, eblake, jdenemar, kxiong, llim, virt-maint, xen-maint
Doc Type: | Bug Fix
Last Closed: | 2010-11-18 13:05:12 UTC

Description
zhanghaiyan
2010-09-28 12:17:25 UTC
Created attachment 450165 [details]
/var/log/messages
Created attachment 450166 [details]
/var/log/audit/audit.log
In another terminal tab, execute:

15. # virsh destroy 2
    error: Failed to destroy domain 2
    error: Timed out during operation: cannot acquire state change lock

    # tail /var/log/messages
    Sep 28 19:56:10 dhcp-93-197 last message repeated 3 times
    Sep 28 20:26:31 dhcp-93-197 libvirtd: 20:26:31.005: error : qemuDomainObjBeginJobWithDriver:405 : Timed out during operation: cannot acquire state change lock

I then continued with a few more operations, in the hope that their output provides some help:

16. # service libvirtd status
    libvirtd (pid 4624) is running...

17. # service libvirtd restart
    Stopping libvirtd daemon: [ OK ]
    Starting libvirtd daemon: [ OK ]

18. # virsh list --all
    Id Name State
    ----------------------------------

After step 18, libvirtd still hangs there.

    # tail /var/log/messages
    Sep 28 20:38:27 dhcp-93-197 libvirtd: 20:38:27.692: warning : qemudDispatchSignalEvent:396 : Shutting down on signal 15
    Sep 28 20:38:32 dhcp-93-197 libvirtd: 20:38:32.036: error : virRunWithHook:857 : internal error '/sbin/iptables --table filter --delete INPUT --in-interface virbr0 --protocol udp --destination-port 69 --jump ACCEPT' exited with non-zero status 1 and signal 0: iptables: Bad rule (does a matching rule exist in that chain?)
    Sep 28 20:38:32 dhcp-93-197 libvirtd: 20:38:32.497: warning : qemudStartup:1656 : Unable to create cgroup for driver: No such device or address

Testing on
    libvirt-python-0.8.2-7.el5
    libvirt-0.8.2-7.el5
with SELinux set to enforcing:

1. Set up huge pages on the host:
   a) mount hugetlbfs, e.g.:
        # mkdir /dev/hugepages
        # mount -t hugetlbfs hugetlbfs /dev/hugepages
   b) reserve memory for huge pages, e.g.:
        # sysctl vm.nr_hugepages=100
   c) restart the libvirtd service:
        # service libvirtd restart
   d) check meminfo before the guest starts:
        # more /proc/meminfo |grep Huge
2. Run a guest with the following XML added to the domain XML:
        <memoryBacking><hugepages/></memoryBacking>
3. Confirm the guest booted correctly.
4. Check HugePages_Free in meminfo again:
        # more /proc/meminfo |grep Huge
        HugePages_Total: 100
        HugePages_Free: 100
        HugePages_Rsvd: 0
        Hugepagesize: 2048 kB

HugePages_Free has not changed. /var/log/audit/audit.log shows:

    type=AVC msg=audit(1287568512.348:3191): avc: denied { unlink } for pid=11776 comm="qemu-kvm" name="kvm.c0G1pP" dev=hugetlbfs ino=41452 scontext=system_u:system_r:svirt_t:s0:c126,c371 tcontext=system_u:object_r:hugetlbfs_t:s0 tclass=file
    type=SYSCALL msg=audit(1287568512.348:3191): arch=c000003e syscall=87 success=no exit=-13 a0=19517ee0 a1=c2 a2=7 a3=0 items=0 ppid=1 pid=11776 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=513 comm="qemu-kvm" exe="/usr/libexec/qemu-kvm" subj=system_u:system_r:svirt_t:s0:c126,c371 key=(null)
    type=AVC msg=audit(1287568512.348:3192): avc: denied { write } for pid=11776 comm="qemu-kvm" name="kvm.c0G1pP" dev=hugetlbfs ino=41452 scontext=system_u:system_r:svirt_t:s0:c126,c371 tcontext=system_u:object_r:hugetlbfs_t:s0 tclass=file
    type=SYSCALL msg=audit(1287568512.348:3192): arch=c000003e syscall=77 success=no exit=-13 a0=8 a1=41600000 a2=ffe00000 a3=0 items=0 ppid=1 pid=11776 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=513 comm="qemu-kvm" exe="/usr/libexec/qemu-kvm" subj=system_u:system_r:svirt_t:s0:c126,c371 key=(null)
    type=AVC msg=audit(1287568512.348:3193): avc: denied { read } for pid=11776 comm="qemu-kvm" path="/dev/hugepages/libvirt/qemu/kvm.c0G1pP" dev=hugetlbfs ino=41452 scontext=system_u:system_r:svirt_t:s0:c126,c371 tcontext=system_u:object_r:hugetlbfs_t:s0 tclass=file
    type=SYSCALL msg=audit(1287568512.348:3193): arch=c000003e syscall=9 success=no exit=-13 a0=0 a1=41600000 a2=3 a3=2 items=0 ppid=1 pid=11776 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=513 comm="qemu-kvm" exe="/usr/libexec/qemu-kvm" subj=system_u:system_r:svirt_t:s0:c126,c371 key=(null)

I was able to reproduce this locally... Apparently there are not enough huge pages available for the guest's memory, and the kernel tries to kill qemu-kvm ("kernel: VM: killing process qemu-kvm" message in the log) when HugePages_Free gets to zero and it's still not enough. Interestingly, the kernel doesn't succeed in killing the qemu-kvm process: I can still see it in the process list, and /proc/PID/status says it's in sleeping state. Sending SIGKILL kills it; SIGTERM doesn't. Also, libvirt is not really stuck, it's just waiting on that qemu-kvm process. Setting the guest's memory to fit into HugePages_Total makes everything work, although in permissive mode only.

I'm reassigning to selinux, as it doesn't look like we can do anything about this behavior in libvirt.

Miroslav, I think we have another bug about allowing svirt to manage hugepages, since hugepages do not support labeling in RHEL5.

Yes, we have. I am closing this bug as DUPLICATE.

*** This bug has been marked as a duplicate of bug 652644 ***
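
For anyone checking a host before starting a hugepage-backed guest, the observation above (the guest only works when its memory fits into the reserved huge pages) can be sketched as a small Python check. This is only an illustration, not part of libvirt: the function names are made up here, the guest sizes (128 MiB and 512 MiB) are hypothetical because the report does not state the guest's memory, and qemu-kvm may need somewhat more than the guest's RAM, so treat the result as a lower bound. The sample text is the meminfo output quoted in this report.

```python
import re

# Sample taken from the meminfo output quoted in this bug report.
SAMPLE = """HugePages_Total:   100
HugePages_Free:    100
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
"""


def parse_hugepage_info(meminfo_text):
    """Return the HugePages_* counters and Hugepagesize (in kB) as a dict."""
    info = {}
    for line in meminfo_text.splitlines():
        match = re.match(r"^(HugePages_\w+|Hugepagesize):\s+(\d+)", line)
        if match:
            info[match.group(1)] = int(match.group(2))
    return info


def pages_needed(guest_mem_kib, hugepage_kib):
    """Number of huge pages needed to back guest_mem_kib (rounded up)."""
    return -(-guest_mem_kib // hugepage_kib)  # ceiling division


def guest_fits(meminfo_text, guest_mem_kib):
    """Return (fits, pages_needed, pages_free) for a guest of the given size."""
    info = parse_hugepage_info(meminfo_text)
    need = pages_needed(guest_mem_kib, info["Hugepagesize"])
    free = info["HugePages_Free"]
    return need <= free, need, free


if __name__ == "__main__":
    # Hypothetical guest sizes; on a real host, read /proc/meminfo instead of SAMPLE.
    for mem_kib in (128 * 1024, 512 * 1024):
        fits, need, free = guest_fits(SAMPLE, mem_kib)
        verdict = "fits" if fits else "does NOT fit"
        print(f"{mem_kib} KiB guest needs {need} pages, {free} free: {verdict}")
```

With vm.nr_hugepages=100 and 2048 kB pages, only 200 MiB is reserved, so any guest larger than that would exhaust HugePages_Free. Note that this check says nothing about the SELinux denials, which are tracked in the duplicate bug.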