Bug 983446

Summary: guest hangs if CPUs are queried frequently during PXE boot (both macvtap and openvswitch backends)
Product: Red Hat Enterprise Linux 7
Reporter: Qian Guo <qiguo>
Component: qemu-kvm
Assignee: Dr. David Alan Gilbert <dgilbert>
Status: CLOSED CURRENTRELEASE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: high
Docs Contact:
Priority: high
Version: 7.0
CC: acathrow, bsarathy, chayang, dgilbert, dyasny, flang, hhuang, juzhang, knoel, michen, mkenneth, mrezanin, mst, shuang, sluo, tburke, virt-maint, xutian
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 852612
Environment:
Last Closed: 2014-06-13 09:46:47 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 852612
Bug Blocks:

Comment 2 Qian Guo 2013-07-11 09:04:36 UTC
Additional info:

1. Tested with a bridge-utils based bridge and did not hit this issue, so it appears to be related only to macvtap.

2. Hit with both vhost=on and vhost=off.

So I am changing the title to say that only macvtap guests hit this.

Comment 3 Qian Guo 2013-07-12 02:42:03 UTC
Sorry, I forgot to paste the component versions:

# uname -r
3.10.0-0.rc7.64.el7.x86_64
# rpm -q qemu-kvm
qemu-kvm-1.5.1-2.el7.x86_64

Thanks !
Qian Guo

Comment 4 Qian Guo 2013-07-25 07:09:27 UTC
Also hit this bug with openvswitch; the qemu-kvm command line was:
/usr/libexec/qemu-kvm -M pc -device pci-bridge,id=bridge1,chassis_nr=1 -cpu Penryn -m 6G -smp 4,sockets=1,cores=4,threads=1 -enable-kvm -nodefaults -nodefconfig -drive file=/home/pxe222.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,werror=stop,rerror=stop,aio=native -device virtio-scsi-pci,bus=bridge1,addr=0x1,id=virtio-disk0, -device scsi-hd,bus=virtio-disk0.0,drive=drive-virtio-disk0,id=scsi-hd1 -spice port=5930,disable-ticketing -global qxl-vga.vram_size=67108864 -vga qxl -monitor stdio -netdev tap,id=vnet0,vhost=on,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown,queues=4 -device virtio-net-pci,mq=on,vectors=9,bus=bridge1,addr=0x3,netdev=vnet0,mac=54:52:1a:4b:c2:01,id=vnic1 -boot menu=on -qmp tcp:0:4444,server,nowait -monitor unix:/tmp/monitor1,server,nowait

Components:
host kernel: 
# uname -r
3.10.0-2.el7.x86_64

rpm build version:
# rpm -q qemu-kvm
qemu-kvm-1.5.1-2.el7.x86_64

So I am changing the bug title.

Comment 5 Dr. David Alan Gilbert 2014-01-16 11:37:49 UTC
Following this through from the original cloned bug (852612):
It was fixed in RHEL6's qemu-kvm-0.12.1.2-2.325.el6 by these commits:

  3adb4b49c0/549bc787b - kvm: x86: Fix DPL write back of segment registers
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index c720124..f8796cd 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -909,7 +909,7 @@ static void set_seg(struct kvm_segment *lhs, const SegmentCache *rhs)
     lhs->limit = rhs->limit;
     lhs->type = (flags >> DESC_TYPE_SHIFT) & 15;
     lhs->present = (flags & DESC_P_MASK) != 0;
-    lhs->dpl = rhs->selector & 3;
+    lhs->dpl = (flags >> DESC_DPL_SHIFT) & 3;
     lhs->db = (flags >> DESC_B_SHIFT) & 1;
     lhs->s = (flags & DESC_S_MASK) != 0;
     lhs->l = (flags >> DESC_L_SHIFT) & 1;

and that line is in the target-i386/kvm.c set_seg in RHEL7 git
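
To make the one-line change above concrete, here is a minimal sketch of the two ways of deriving the DPL. The helper names dpl_from_selector and dpl_from_flags are hypothetical and are not in qemu-kvm; DESC_DPL_SHIFT is assumed to be 13, which is the position of the DPL field in the descriptor attribute word as QEMU's i386 headers define it.

```c
#include <stdint.h>

/* Assumed value: the DPL field occupies bits 13-14 of the cached
 * descriptor flags (type in bits 8-11, S in bit 12, P in bit 15). */
#define DESC_DPL_SHIFT 13

/* Pre-fix behaviour: reports the selector's RPL (low two bits) as the DPL. */
static unsigned dpl_from_selector(uint16_t selector)
{
    return selector & 3;
}

/* Post-fix behaviour: reads the DPL field out of the descriptor flags. */
static unsigned dpl_from_flags(uint32_t flags)
{
    return (flags >> DESC_DPL_SHIFT) & 3;
}
```

The two only agree when the selector's RPL happens to equal the descriptor's DPL. For a selector of 0x0010 paired with flags 0xF300 (present, DPL 3), the old expression yields 0 and the new one yields 3.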

-------------
  93f2dc950c/5ac41b20c - kvm: x86: Remove obsolete SS.RPL/DPL alignment
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index f8796cd..f6634a6 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -1037,13 +1037,6 @@ void kvm_arch_load_regs(CPUState *env)
            set_seg(&sregs.fs, &env->segs[R_FS]);
            set_seg(&sregs.gs, &env->segs[R_GS]);
            set_seg(&sregs.ss, &env->segs[R_SS]);
-
-           if (env->cr[0] & CR0_PE_MASK) {
-               /* force ss cpl to cs cpl */
-               sregs.ss.selector = (sregs.ss.selector & ~3) |
-                       (sregs.cs.selector & 3);
-               sregs.ss.dpl = sregs.ss.selector & 3;
-           }
     }
 
     set_seg(&sregs.tr, &env->tr);

The comment 'force ss cpl to cs cpl' isn't in the RHEL7 code base. kvm_arch_load_regs no longer exists, but kvm_put_sregs in target-i386/kvm.c has what looks like the same set_seg calls to load the state, and it doesn't have the CR0_PE_MASK test.

So it looks like both of those fixes are in the RHEL7 tree.
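
For reference, the effect of the block removed by the second commit can be sketched in isolation. This is an illustration only: struct seg_sketch is a hypothetical, pared-down stand-in for the two kvm_segment fields involved, and the function name is made up for this example.

```c
#include <stdint.h>

/* Hypothetical stand-in for the selector and dpl fields of kvm_segment. */
struct seg_sketch {
    uint16_t selector;
    uint8_t dpl;
};

/* Mirrors the removed code: copy the low two bits of the CS selector into
 * the SS selector, then derive SS.DPL from the rewritten selector. */
static void force_ss_cpl_to_cs_cpl(struct seg_sketch *ss,
                                   const struct seg_sketch *cs)
{
    ss->selector = (uint16_t)((ss->selector & ~3u) | (cs->selector & 3u));
    ss->dpl = ss->selector & 3;
}
```

With the block gone, SS keeps whatever selector and DPL set_seg derived from the guest state, instead of being overwritten from CS.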

Comment 6 Dr. David Alan Gilbert 2014-01-16 12:45:52 UTC
Since the code all looks present and correct -> ON_QA

Comment 7 Miroslav Rezanina 2014-02-13 07:01:32 UTC
Changing to MODIFIED to fulfill the errata process.

Comment 12 Dr. David Alan Gilbert 2014-02-14 19:43:05 UTC
Sibiao:
  The reason I didn't fill in the 'fixed version' is that, as I say in Comment 10, the original analysis I did was wrong. The problem isn't the same as in the bug this was cloned from, so we don't actually know whether the bug is fixed; that is why I asked for the retest.

Comment 13 Sibiao Luo 2014-02-17 03:04:05 UTC
Verified this issue with the same steps and qemu-kvm command line as comment #0, using the macvtap backend on the latest qemu-kvm-1.5.3-47.el7.x86_64. The guest works well without hitting any 'KVM internal error' in the qemu monitor when CPUs are queried frequently during PXE boot.

host info:
# uname -r && rpm -q qemu-kvm
3.10.0-86.el7.x86_64
qemu-kvm-1.5.3-47.el7.x86_64

Qemu-kvm command line:
# /usr/libexec/qemu-kvm -M pc -device pci-bridge,id=bridge1,chassis_nr=1 -device pci-bridge,id=bridge2,bus=bridge1,addr=0x2,chassis_nr=1 -cpu Penryn -m 4G -smp 4,sockets=1,cores=4,threads=1 -enable-kvm -nodefaults -nodefconfig -monitor stdio -boot menu=on -qmp tcp:0:4444,server,nowait -spice port=5931,disable-ticketing -global qxl-vga.vram_size=67108864 -vga qxl -device qxl,id=video1,vram_size=67108864,bus=bridge2,addr=0x2 -drive file=/home/pxeboot.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,bus=bridge2,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0 -netdev tap,id=vnet0,vhost=on,fd=4 -device virtio-net-pci,bus=bridge2,addr=0x3,netdev=vnet0,mac=22:11:22:45:66:90,id=vnic1 4<>/dev/tap4 -monitor unix:/tmp/mon1,server,nowait

Results:
The guest works well without hitting any 'KVM internal error' in the qemu monitor when CPUs are queried frequently during PXE boot.

Based on the above, this issue has been fixed; moving to VERIFIED status. Please correct me if I have made any mistakes.

Best Regards,
sluo

Comment 14 Dr. David Alan Gilbert 2014-02-17 10:58:14 UTC
OK, it's good to know it's currently working; if comment #3 is correct and it was visible in RHEL7 then I don't think we can say exactly what fixed it, but good it seems to have gone away.

Comment 15 Ludek Smid 2014-06-13 09:46:47 UTC
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.