Bug 983446 - guest hang if query cpu frequently during pxe boot(both macvtap and openvswitch backend)
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Hardware: Unspecified Unspecified
Priority: high Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Dr. David Alan Gilbert
Virtualization Bugs
Depends On: 852612
Reported: 2013-07-11 04:56 EDT by Qian Guo
Modified: 2014-06-17 23:31 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 852612
Last Closed: 2014-06-13 05:46:47 EDT
Type: Bug

Comment 2 Qian Guo 2013-07-11 05:04:36 EDT
Additional info:

1. Tested this with a bridge-utils based bridge and did not hit the issue, so it is related only to macvtap.

2. Hit with both vhost=on and vhost=off.

So I changed the title to say that only macvtap guests hit this.
Comment 3 Qian Guo 2013-07-11 22:42:03 EDT
Sorry, I forgot to paste the component versions:

# uname -r
# rpm -q qemu-kvm

Thanks !
Qian Guo
Comment 4 Qian Guo 2013-07-25 03:09:27 EDT
Hit this bug with openvswitch as well; the qemu-kvm command line was:
/usr/libexec/qemu-kvm -M pc -device pci-bridge,id=bridge1,chassis_nr=1 -cpu Penryn -m 6G -smp 4,sockets=1,cores=4,threads=1 -enable-kvm -nodefaults -nodefconfig -drive file=/home/pxe222.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,werror=stop,rerror=stop,aio=native -device virtio-scsi-pci,bus=bridge1,addr=0x1,id=virtio-disk0, -device scsi-hd,bus=virtio-disk0.0,drive=drive-virtio-disk0,id=scsi-hd1 -spice port=5930,disable-ticketing -global qxl-vga.vram_size=67108864 -vga qxl -monitor stdio -netdev tap,id=vnet0,vhost=on,script=/etc/ovs-ifup,downscript=/etc/ovs-ifdown,queues=4 -device virtio-net-pci,mq=on,vectors=9,bus=bridge1,addr=0x3,netdev=vnet0,mac=54:52:1a:4b:c2:01,id=vnic1 -boot menu=on -qmp tcp:0:4444,server,nowait -monitor unix:/tmp/monitor1,server,nowait

host kernel: 
# uname -r

rpm build version:
# rpm -q qemu-kvm

So I changed the bug title.
Comment 5 Dr. David Alan Gilbert 2014-01-16 06:37:49 EST
Following this through from the original cloned bug (852612)
It was fixed in RHEL6's qemu-kvm- by the commits:

  3adb4b49c0/549bc787b - kvm: x86: Fix DPL write back of segment registers
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index c720124..f8796cd 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -909,7 +909,7 @@ static void set_seg(struct kvm_segment *lhs, const SegmentCache *rhs)
     lhs->limit = rhs->limit;
     lhs->type = (flags >> DESC_TYPE_SHIFT) & 15;
     lhs->present = (flags & DESC_P_MASK) != 0;
-    lhs->dpl = rhs->selector & 3;
+    lhs->dpl = (flags >> DESC_DPL_SHIFT) & 3;
     lhs->db = (flags >> DESC_B_SHIFT) & 1;
     lhs->s = (flags & DESC_S_MASK) != 0;
     lhs->l = (flags >> DESC_L_SHIFT) & 1;

That corrected line is present in set_seg in target-i386/kvm.c in the RHEL7 git tree.
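The difference between the old and new line can be sketched in isolation. This is a minimal illustration, not the QEMU source: the two helper names are hypothetical, DESC_DPL_SHIFT is assumed to be 13 as in QEMU's descriptor flag layout, and the sample selector and flag values are made up for the example.

```c
#include <stdint.h>

/* Assumed bit position of the DPL field in the cached descriptor flags. */
#define DESC_DPL_SHIFT 13

/* Old behaviour (hypothetical helper): treats the low two bits of the
 * selector, which are the RPL, as if they were the DPL. */
unsigned dpl_from_selector(uint16_t selector)
{
    return selector & 3;
}

/* Fixed behaviour (hypothetical helper): reads the DPL field from the
 * descriptor flags, independent of the selector's RPL bits. */
unsigned dpl_from_flags(uint32_t flags)
{
    return (flags >> DESC_DPL_SHIFT) & 3;
}
```

With an illustrative selector of 0x0013 (RPL 3) and flags of 0x9300 (present, DPL 0), the old expression yields 3 while the fixed one yields 0, so the two can disagree whenever the RPL and the descriptor's DPL differ.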

  93f2dc950c/5ac41b20c - kvm: x86: Remove obsolete SS.RPL/DPL alignment
diff --git a/qemu-kvm-x86.c b/qemu-kvm-x86.c
index f8796cd..f6634a6 100644
--- a/qemu-kvm-x86.c
+++ b/qemu-kvm-x86.c
@@ -1037,13 +1037,6 @@ void kvm_arch_load_regs(CPUState *env)
            set_seg(&sregs.fs, &env->segs[R_FS]);
            set_seg(&sregs.gs, &env->segs[R_GS]);
            set_seg(&sregs.ss, &env->segs[R_SS]);
-           if (env->cr[0] & CR0_PE_MASK) {
-               /* force ss cpl to cs cpl */
-               sregs.ss.selector = (sregs.ss.selector & ~3) |
-                       (sregs.cs.selector & 3);
-               sregs.ss.dpl = sregs.ss.selector & 3;
-           }
     set_seg(&sregs.tr, &env->tr);

The comment 'force ss cpl to cs cpl' isn't in the RHEL7 code base. kvm_arch_load_regs no longer exists, but kvm_put_sregs in target-i386/kvm.c has what looks like the same set_seg calls to load the state, and it doesn't have the CR0_PE_MASK test.
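For reference, the effect of the removed block on the SS selector can be sketched as a standalone function. This is only an illustration of the bit manipulation shown in the diff; the function name and the sample values are hypothetical and do not come from the QEMU tree.

```c
#include <stdint.h>

/* Sketch of the removed adjustment (hypothetical helper): keep the index
 * and table bits of the SS selector, but overwrite its low two bits with
 * the low two bits of the CS selector. */
uint16_t force_ss_rpl_to_cs(uint16_t ss_selector, uint16_t cs_selector)
{
    return (uint16_t)((ss_selector & ~3) | (cs_selector & 3));
}
```

For example, an SS selector of 0x0010 combined with a CS selector of 0x000b becomes 0x0013. The removed code then also set sregs.ss.dpl from those same low bits, which is the write-back that the second commit dropped.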

So it looks like both of those fixes are in the RHEL7 tree.
Comment 6 Dr. David Alan Gilbert 2014-01-16 07:45:52 EST
Since the code all looks present and correct -> ON_QA
Comment 7 Miroslav Rezanina 2014-02-13 02:01:32 EST
Changing to MODIFIED to fulfill the errata process.
Comment 12 Dr. David Alan Gilbert 2014-02-14 14:43:05 EST
The reason I didn't fill in the 'fixed version' is that, as I say in Comment 10, the original analysis I did was wrong. The problem isn't the same as the bug this was cloned from, so we don't actually know whether the bug is fixed; that is why I asked for the retest.
Comment 13 Sibiao Luo 2014-02-16 22:04:05 EST
Verified this issue with the same steps and qemu-kvm command line as comment #0, using the macvtap backend on the latest qemu-kvm-1.5.3-47.el7.x86_64. The guest works well and does not hit any 'KVM internal error' in the qemu monitor when cpus are queried frequently during pxe boot.

host info:
# uname -r && rpm -q qemu-kvm

Qemu-kvm command line:
# /usr/libexec/qemu-kvm -M pc -device pci-bridge,id=bridge1,chassis_nr=1 -device pci-bridge,id=bridge2,bus=bridge1,addr=0x2,chassis_nr=1 -cpu Penryn -m 4G -smp 4,sockets=1,cores=4,threads=1 -enable-kvm -nodefaults -nodefconfig -monitor stdio -boot menu=on -qmp tcp:0:4444,server,nowait -spice port=5931,disable-ticketing -global qxl-vga.vram_size=67108864 -vga qxl -device qxl,id=video1,vram_size=67108864,bus=bridge2,addr=0x2 -drive file=/home/pxeboot.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,bus=bridge2,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0 -netdev tap,id=vnet0,vhost=on,fd=4 -device virtio-net-pci,bus=bridge2,addr=0x3,netdev=vnet0,mac=22:11:22:45:66:90,id=vnic1 4<>/dev/tap4 -monitor unix:/tmp/mon1,server,nowait

Result: the guest works well and does not hit any 'KVM internal error' in the qemu monitor when cpus are queried frequently during pxe boot.

Based on the above, this issue has been fixed; moving to VERIFIED status. Please correct me if I made any mistakes.

Best Regards,
Comment 14 Dr. David Alan Gilbert 2014-02-17 05:58:14 EST
OK, it's good to know it's currently working. If comment #3 is correct and it was visible in RHEL7, then I don't think we can say exactly what fixed it, but it's good that it seems to have gone away.
Comment 15 Ludek Smid 2014-06-13 05:46:47 EDT
This request was resolved in Red Hat Enterprise Linux 7.0.

Contact your manager or support representative in case you have further questions about the request.
