Bug 1571496
Summary: | [Mellanox OVS offload] Always has packets loss | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | qding |
Component: | openvswitch | Assignee: | Marcelo Ricardo Leitner <mleitner> |
Status: | CLOSED NOTABUG | QA Contact: | qding |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 7.5 | CC: | atragler, ctrautma, mleitner, qding |
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Last Closed: | 2018-04-28 07:58:37 UTC | Type: | Bug |
Description
qding
2018-04-25 01:53:21 UTC
Did you enable hugepages in the guest?

(In reply to Marcelo Ricardo Leitner from comment #2)
> Did you enable hugepages in the guest?

Yes, see the trace below, excerpted from the beaker job:

[root@localhost ~]# grep -i huge /proc/meminfo
AnonHugePages: 26624 kB
HugePages_Total: 1
HugePages_Free: 1
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 1048576 kB
[root@localhost ~]# echo $?
0

You mentioned the card under test is 10Gbps:

[root@dell-per730-04 ~]# lspci -m -s 0000:04:00.0
04:00.0 ... "ConnectX-4 Lx Stand-up dual-port 10GbE MCX4121A-XCAT"

But the tests are running at a higher rate than that:

http://beaker-archive.app.eng.bos.redhat.com/beaker-logs/2018/04/24431/2443129/5073845/71093363/TESTOUT.log
running trial 001, rate 24.000000
cmd: python trex-txrx.py ... frame_size='64', ... rate=24.0, rate_unit='mpps', ....

For 10GbE it should be at most 14.4Mpps.

Is the test doing some sort of warm-up period before measuring packet drops?
You're using 1k flows, and with OVS offloading the process of learning a new flow is considerably slower. So some loss at the beginning of the test is to be expected.

(In reply to Marcelo Ricardo Leitner from comment #4)
> For 10GbE it should be at most 14.4Mpps.

After changing the starting rate to 14.4, the result is a bit better, but still not very good. Please see the log below.

> Is the test doing some sort of warm-up period before measuring packet drops?
> You're using 1k flows, and with OVS offloading the process of learning a new
> flow is considerably slower. So some loss at the beginning of the test is to
> be expected.

I'm afraid it does not do a warm-up. I'm using binary-search.py from https://github.com/atheurer/trafficgen, but I have no idea how to add one. I'll try to figure it out.
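As a rough cross-check of the 10GbE ceiling quoted in the discussion above, the packet rate at line rate can be worked out from the on-wire size of each frame. The sketch below assumes the usual Ethernet accounting, in which a frame size of 64 bytes includes the FCS and every frame additionally occupies 8 bytes of preamble/SFD and 12 bytes of inter-frame gap on the wire. The function and constant names are illustrative only and do not come from trafficgen or TRex.

```python
# Per-frame overhead on the wire, in bytes, that is not counted in frame_size.
# These are the standard Ethernet values; treating them as applicable to this
# test setup is an assumption.
PREAMBLE_SFD_BYTES = 8      # 7-byte preamble + 1-byte start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # minimum idle time between frames


def max_pps(link_bps: float, frame_size: int) -> float:
    """Theoretical maximum packets per second, per direction, for one port."""
    wire_bits = (frame_size + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / wire_bits


# 10GbE with 64-byte frames, as used in the test
rate = max_pps(10e9, 64)
print(f"{rate / 1e6:.2f} Mpps")
```

Under these assumptions the result is about 14.88 Mpps per direction, so a starting rate of 14.4 Mpps sits just below the theoretical limit, while 24 Mpps is well above it.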
running trial 001, rate 14.400000
(trial failed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 40.889866%, lost packets: 176644213)
(trial failed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 48.951702%, lost packets: 14686)
(trial failed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 37.490327%, lost packets: 161958204)
(trial failed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 47.655078%, lost packets: 14297)
running trial 002, rate 7.200000
(trial failed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 15.561260%, lost packets: 33612321)
(trial failed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 15.649478%, lost packets: 4695)
(trial failed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 10.876432%, lost packets: 23493094)
(trial failed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 10.546315%, lost packets: 3164)
running trial 003, rate 3.600000
(trial failed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000773%, lost packets: 835)
(trial passed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial failed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.033429%, lost packets: 36103)
(trial failed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.006666%, lost packets: 2)
running trial 004, rate 1.800000
(trial failed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000500%, lost packets: 270)
(trial passed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
running trial 005, rate 0.900000
(trial failed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000074%, lost packets: 20)
(trial passed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
running trial 006, rate 0.720000
(trial passed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
running trial 007, rate 0.810000
(trial failed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000012%, lost packets: 3)
(trial passed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial failed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000041%, lost packets: 10)
(trial passed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
running trial 008, rate 0.765000
(trial passed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
running trial 009, rate 0.787500
(trial failed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000114%, lost packets: 9)
(trial passed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
running trial 010, rate 0.748125
(trial passed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
running trial 011, rate 0.767813
(trial passed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
===============================================
rx_pps when loss=0.0
"rx_pps": 766056.1863758774,
"rx_pps": 766056.1863758774,
rx_pps_total=1532112.3727517548
===============================================

Hi Marcelo,

I finally solved the mystery. The problem was in my script: it started binary-search too soon after starting the t-rex server. Now the result looks good, so I'm closing the bug. Thank you.

===============================================
grep -e 'running trial' -e 'percent loss' /tmp/binary_search.log
running trial 001, rate 14.400000
(trial failed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 7.037941%, lost packets: 30403905)
(trial failed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 37.925402%, lost packets: 11378)
(trial failed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 6.503911%, lost packets: 28096897)
(trial failed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 37.135429%, lost packets: 11141)
running trial 002, rate 7.200000
(trial passed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
running trial 003, rate 10.800000
(trial failed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000094%, lost packets: 303)
(trial failed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 12.430000%, lost packets: 3729)
(trial passed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial failed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 18.366667%, lost packets: 5510)
running trial 004, rate 9.000000
(trial failed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000047%, lost packets: 127)
(trial passed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
running trial 005, rate 8.100000
(trial failed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000018%, lost packets: 44)
(trial passed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
running trial 006, rate 7.650000
(trial failed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000013%, lost packets: 29)
(trial passed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
running trial 007, rate 7.267500
(trial failed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000013%, lost packets: 29)
(trial failed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.003333%, lost packets: 1)
(trial passed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
running trial 008, rate 6.904125
(trial passed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
running trial 009, rate 7.085813
(trial failed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000004%, lost packets: 3)
(trial passed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
running trial 010, rate 6.731522
(trial passed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
running trial 011, rate 6.908667
(trial passed requirement, percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 0 -> 1, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
(trial passed requirement, latency percent loss, device pair: 1 -> 0, requested: 0.000000%, achieved: 0.000000%, lost packets: 0)
===============================================
rx_pps when loss=0.0
"rx_pps": 6895643.298502119,
"rx_pps": 6895643.298502119,
rx_pps_total=13791286.597004238
===============================================
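The root cause reported in this thread was a wrapper script that launched binary-search too soon after starting the t-rex server. One possible way to avoid that race, sketched below, is to poll the server's TCP port until it accepts connections before starting the search. This is not the reporter's actual fix, which the thread does not show. The helper name `wait_for_port` is hypothetical, and the port number mentioned afterwards is an assumption about a default TRex configuration that should be checked against the local setup.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 1.0) -> bool:
    """Return True once host:port accepts a TCP connection, or False
    if it is still unreachable when `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (refused, unreachable, timed out): retry.
            time.sleep(interval)
    return False
```

A wrapper script could call something like `wait_for_port("localhost", 4501)` and start binary-search.py only when it returns True (4501 is assumed here to be the server's RPC port). An open port only shows that the server is listening, not that it has finished initializing, so a short extra delay or a warm-up trial may still be needed.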