Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 920472

Summary: [WHQL][netkvm]NDISTest 6.5 - [2 Machine] - SingleEtherType and NDISTest 6.5 - [2 Machine] - Stats failed on win2k8-R2/win7/win2012 OS on OVS
Product: Red Hat Enterprise Linux 7
Reporter: Min Deng <mdeng>
Component: openvswitch
Assignee: Yvugenfi <yvugenfi>
Status: CLOSED CURRENTRELEASE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 7.4
CC: ailan, atragler, dfleytma, fleitner, jhsiao, juzhang, knoel, lijin, lilu, mdeng, michen, rbalakri, rpacheco, virt-bugs, yvugenfi
Target Milestone: pre-dev-freeze
Keywords: Extras
Target Release: 7.4
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-09-28 14:27:12 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description                         Flags
2012log                             none
win2012-ovs-netkvm-build65.hckx     none
2012R2                              none
764                                 none
2k8R2                               none

Description Min Deng 2013-03-12 08:03:44 UTC
Description of problem:
  NDISTest 6.5 - [2 Machine] - SingleEtherType and NDISTest 6.5 - [2 Machine] - Stats failed on win2k8-R2/win7/win2012 OS
Version-Release number of selected component (if applicable):
 build 54
How reproducible:
 10 out of 10 runs failed

Steps to Reproduce:
1. Boot up two guests with the following command lines and submit the two jobs to HCK:
  /usr/libexec/qemu-kvm -M rhel6.4.0 -m 6G -smp 4 -cpu cpu64-rhel6,+x2apic,+sep -usbdevice tablet -drive file=win2012-nic2.raw,format=raw,if=none,id=drive-virtio0,boot=on,cache=none,werror=stop,rerror=stop -device ide-drive,drive=drive-virtio0,id=virtio-blk-pci0,bootindex=1 -netdev tap,sndbuf=0,id=hostnet0,script=/etc/qemu-ifup,downscript=no -device e1000,netdev=hostnet0,mac=00:32:48:11:13:08,bus=pci.0,addr=0x4 -uuid 12a99a62-c003-41d0-9c5f-b0c6c9dc24c5 -rtc base=localtime,clock=host,driftfix=slew -no-kvm-pit-reinjection -chardev socket,id=111a,path=/tmp/win8-32-nic-49-2,server,nowait -mon chardev=111a,mode=readline -name win2012-1 -netdev tap,sndbuf=0,vhost=on,id=hostnet1,script=/etc/ovs0-ifup,downscript=/etc/ovs-ifdown -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:54:60:1a:c3:7a,bus=pci.0,addr=0x7 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -monitor stdio -vnc :2 -vga cirrus
 /usr/libexec/qemu-kvm -M rhel6.4.0 -m 6G -smp 4 -cpu cpu64-rhel6,+x2apic,+sep -usbdevice tablet -drive file=win2012-nic1.raw,format=raw,if=none,id=drive-virtio0,boot=on,cache=none,werror=stop,rerror=stop -device ide-drive,drive=drive-virtio0,id=virtio-blk-pci0,bootindex=1 -netdev tap,sndbuf=0,id=hostnet0,script=/etc/qemu-ifup,downscript=no -device e1000,netdev=hostnet0,mac=00:33:19:17:13:28,bus=pci.0,addr=0x4 -uuid 643d9955-831c-4bd9-96dd-2ac54a6ee16e -rtc base=localtime,clock=host,driftfix=slew -no-kvm-pit-reinjection -chardev socket,id=111a,path=/tmp/win8-32-nic-49-1,server,nowait -mon chardev=111a,mode=readline -name win2012-1 -netdev tap,sndbuf=0,id=hostnet1,vhost=on,script=/etc/ovs0-ifup,downscript=/etc/ovs0-ifdown -device virtio-net-pci,netdev=hostnet1,id=net1,mac=00:34:70:1c:36:48,bus=pci.0,addr=0x7 -global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 -monitor stdio -vnc :1 -vga cirrus

Actual results:
The jobs always fail with errors.
Expected results:
The jobs pass.

Additional info: Related HCK files will be uploaded to the bug.

Comment 1 Min Deng 2013-03-13 02:18:16 UTC
Created attachment 709273 [details]
2012log

Comment 2 dawu 2013-03-13 03:44:58 UTC
This issue also occurs on other platforms and with other jobs in the OVS configuration. The failure is always reported in the logs as: "xx total breakpoints were hit in the protocol driver while this test was executing". The other affected platforms and jobs are:

win2k8-32: NDISTest 6.5 - [2 Machine] - SingleEtherType 
           NDISTest 6.5 - [2 Machine] - Stats

win2k8-64: NDISTest 6.5 - [2 Machine] - Stats

win8-32:   NDISTest 6.5 - [2 Machine] - Stats
           NDISTest 6.5 - [2 Machine] - PacketFilters
           NDISTest 6.5 - [2 Machine] - MultiCastAddress
win8-64:   NDISTest 6.5 - [2 Machine] - MultiCastAddress

The jobs above were each run 5 to 6 times and never passed. Some other NDISTest 6.5 jobs hit the same issue but eventually passed after 2 to 3 runs, or in some cases 5 runs.

Thanks
Best Regards,
Dawn

Comment 5 Dmitry Fleytman 2013-06-26 15:28:04 UTC
Hello,

This test passes on our setup with NetKVM build65 and OVS openvswitch-1.7.1-7.el6.
Please, retest.

Dmitry

Comment 6 lijin 2013-07-05 10:46:36 UTC
In the OVS environment, many jobs failed in the HCK test of NetKVM build 65. The OVS version is openvswitch-1.9.0-3.el6.x86_64.
Many jobs show this error in the logs: "xx total breakpoints were hit in the protocol driver while this test was executing". The affected platforms and jobs are:
win2012:
NDISTest 6.5 - [2 Machine] - SingleEtherType 
NDISTest 6.5 - [2 Machine] - Stats
NDISTest 6.5 - [2 Machine] - PacketFilters
NDISTest 6.5 - [2 Machine] - GlitchFreeDevice

win2k8-32: 
NDISTest 6.5 - [2 Machine] - SingleEtherType 
NDISTest 6.5 - [2 Machine] - Stats

win2k8-64: 
NDISTest 6.5 - [2 Machine] - Stats
NDISTest 6.5 - [2 Machine] - PacketFilters
NDISTest 6.5 - [2 Machine] - SingleEtherType 
NDISTest 6.5 - [2 Machine] - MultiCastAddress

win2k8-R2:
NDISTest 6.5 - [2 Machine] - Stats
NDISTest 6.5 - [2 Machine] - reset
NDISTest 6.5 - [2 Machine] - SingleEtherType 
NDISTest 6.5 - [2 Machine] - MultiCastAddress
NDISTest 6.5 - [2 Machine] - GlitchFreeDevice


win7-32:
NDISTest 6.5 - [2 Machine] - Stats
NDISTest 6.5 - [2 Machine] - LinkCheck
NDISTest 6.5 - [2 Machine] - InvalidPackets
NDISTest 6.5 - [2 Machine] - SingleEtherType 
NDISTest 6.5 - [2 Machine] - GlitchFreeDevice
NDISTest 6.5 - [2 Machine] - MultiCastAddress

win7-64:
NDISTest 6.5 - [2 Machine] - Stats
NDISTest 6.5 - [2 Machine] - SingleEtherType 
NDISTest 6.5 - [2 Machine] - InvalidPackets

win8-32:
NDISTest 6.5 - [2 Machine] - Stats
NDISTest 6.5 - [2 Machine] - SingleEtherType 
NDISTest 6.5 - [2 Machine] - InvalidPackets

win8-64:
NDISTest 6.5 - [2 Machine] - Stats
NDISTest 6.5 - [2 Machine] - SingleEtherType 
NDISTest 6.5 - [2 Machine] - InvalidPackets
NDISTest 6.5 - [2 Machine] - GlitchFreeDevice
NDISTest 6.5 - [2 Machine] - MultiCastAddress

Comment 7 Dmitry Fleytman 2013-07-05 11:04:07 UTC
Hi,

Thanks for the report. Please attach HCK logs (.hckx) for failed tests.
Also, could you please run the same tests on the same VMs/Hosts with Linux bridge and with openvswitch-1.7.1-7.el6?

Thanks in advance,
Dmitry

Comment 8 lijin 2013-07-08 01:52:37 UTC
Created attachment 770223 [details]
win2012-ovs-netkvm-build65.hckx

Comment 9 lijin 2013-07-08 01:58:59 UTC
The attachment "win2012-ovs-netkvm-build65.hckx" contains the HCK logs for win2012. I will upload more if needed.

Comment 10 lijin 2013-07-08 02:45:54 UTC
(In reply to Dmitry Fleytman from comment #7)
> Hi,
> 
> Thanks for the report. Please attach HCK logs (.hckx) for failed tests.
> Also, could you please run the same tests on the same VMs/Hosts with Linux
> bridge and with openvswitch-1.7.1-7.el6?
> 
> Thanks in advance,
> Dmitry

I ran the same tests on a win2012 guest:
1. With Linux bridge, the issue does not occur.
2. With openvswitch-1.7.1-7.el6, only one job, "NDISTest 6.5 - [2 Machine] - GlitchFreeDevice", passed; the others still failed with the same error "xx total breakpoints were hit in the protocol driver while this test was executing".
Please also note that with openvswitch-1.9.0-3.el6.x86_64, a few jobs can also pass if they are run several times.

Comment 14 Dmitry Fleytman 2013-07-18 15:13:12 UTC
The root cause of this (and some other) test failures on OVS configurations is out-of-order packets. Indeed, there is a problem in openvswitch - its user mode service may reorder incoming packets on some systems.

I've opened a discussion on the OVS mailing list and submitted a patch that fixes a user mode service problem:

http://openvswitch.org/pipermail/dev/2013-July/029742.html
http://openvswitch.org/pipermail/dev/2013-July/029743.html

While this patch fixes part of the problem, there are other scenarios that need to be investigated, so I am not changing the bug state for now.
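
For illustration only, here is a minimal sketch of how a receiver could flag reordered packets, assuming each test packet carries a monotonically increasing sequence number. The function name and the sequence-number scheme are hypothetical; they are not taken from NDISTest or from OVS, and the actual check performed by the test is not described in this bug.

```python
def count_out_of_order(seqs):
    """Count packets that arrive after a packet with a higher sequence number.

    `seqs` is the list of sequence numbers in the order they were received.
    Hypothetical helper, for illustration only.
    """
    highest_seen = None
    late = 0
    for seq in seqs:
        if highest_seen is not None and seq < highest_seen:
            # A newer packet was already delivered before this one.
            late += 1
        else:
            highest_seen = seq
    return late


in_order = count_out_of_order([0, 1, 2, 3])
reordered = count_out_of_order([0, 2, 1, 3])
print(in_order, reordered)
```

A non-zero count on the receiving guest would indicate that something between the two guests delivered packets in a different order than they were sent.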

Comment 15 Dmitry Fleytman 2013-10-01 12:16:37 UTC
This bug was fixed in openvswitch upstream.

commit 04a19fb8f4b8ba19a9805906aac7b30b65b57206
Author: Ben Pfaff <blp>
Date:   Thu Sep 19 11:03:47 2013 -0700

    ofproto-dpif-upcall: Forward packets in order of arrival.

    Until now, the code in ofproto-dpif-upcall (and the code that preceded it
    in ofproto-dpif) obtained a batch of incoming packets, inserted them into
    a hash table based on hashes of their flows, processed them, and then
    forwarded them in hash order.  Usually this maintains order within a single
    network connection, but because OVS's notion of a flow is so fine-grained,
    it can reorder packets within (e.g.) a TCP connection if two packets
    handled in a single batch have (e.g.) different ECN values.

    This commit fixes the problem by making ofproto-dpif-upcall always forward
    packets in the same order they were received.

    This is far from the minimal change necessary to avoid reordering packets.
    I think that the code is easier to understand afterward.

    Reported-by: Dmitry Fleytman <dfleytma>
    Signed-off-by: Ben Pfaff <blp>
    Acked-by: Jarno Rajahalme <jrajahalme>
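
The behaviour described in the commit message can be sketched with a toy model. This is not the OVS code: the `Packet` fields, the hash function, and the bucket count are assumptions chosen only to show how grouping a batch by flow hash can reorder packets of one connection when their ECN values differ.

```python
from collections import namedtuple

# Hypothetical packet: arrival sequence number, connection id, ECN value.
Packet = namedtuple("Packet", ["seq", "conn", "ecn"])


def toy_flow_hash(pkt, n_buckets=8):
    # Toy stand-in for a fine-grained flow hash: ECN is part of the flow key,
    # so two packets of the same connection can land in different buckets.
    return (pkt.conn * 31 + pkt.ecn) % n_buckets


def forward_in_hash_order(batch):
    # Model of the old behaviour: group the batch by flow hash,
    # then emit the packets bucket by bucket.
    buckets = {}
    for pkt in batch:
        buckets.setdefault(toy_flow_hash(pkt), []).append(pkt)
    out = []
    for key in sorted(buckets):
        out.extend(buckets[key])
    return out


def forward_in_arrival_order(batch):
    # Model of the fixed behaviour: keep the order of arrival.
    return list(batch)


# Three packets of one connection; the middle one has a different ECN value.
batch = [Packet(0, 1, 1), Packet(1, 1, 0), Packet(2, 1, 1)]
hash_order = [p.seq for p in forward_in_hash_order(batch)]
arrival_order = [p.seq for p in forward_in_arrival_order(batch)]
print(hash_order, arrival_order)
```

In this toy model the hash-ordered path emits sequence numbers 0, 2, 1, while the arrival-ordered path emits 0, 1, 2.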

Comment 17 Min Deng 2014-03-14 09:17:36 UTC
I'm afraid the fix did not fully resolve the issue. QE will upload the related HCK files to the bug as well. The issue still occurs with the following builds:
openvswitch-2.0.0-7.el7.x86_64
kernel-3.10.0-84.el7.x86_64 or kernel-3.10.0-105.el7.x86_64
qemu-kvm-rhev-1.5.3-46.el7.x86_64

win8-32     - Stats
              SingleEtherType
              InvalidPackets
Win2k8-R2   - SingleEtherType

win7-64     - InvalidPackets
              SingleEtherType

win2012-R2  - Stats
              SingleEtherType

Comment 18 Min Deng 2014-03-14 10:08:43 UTC
Created attachment 874305 [details]
2012R2

Comment 19 Min Deng 2014-03-14 10:19:59 UTC
Created attachment 874309 [details]
764

Comment 20 Min Deng 2014-03-14 10:22:27 UTC
Created attachment 874310 [details]
2k8R2

Comment 27 Min Deng 2015-11-19 04:15:15 UTC
Re-tested the bug with:
openvswitch-2.4.0-1.el7.x86_64.rpm
kernel-3.10.0-330.el7.x86_64
qemu-kvm-rhev-2.3.0-31.el7.x86_64
The Stats job still failed. I will upload the log to the bug.