Bug 1776816 - [ovs2.11][RHEL7.7] PF/VF Port statistics get over-run in OVS offload datapath
Summary: [ovs2.11][RHEL7.7] PF/VF Port statistics get over-run in OVS offload datapath
Keywords:
Status: MODIFIED
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch2.11
Version: FDB 18.11
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Kevin Traynor
QA Contact: qding
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-11-26 12:43 UTC by Pradipta Kumar Sahoo
Modified: 2023-07-13 07:25 UTC (History)

Fixed In Version: openvswitch2.17-2.17.0-70.el8fdp, openvswitch2.17-2.17.0-61.el9fdp, openvswitch3.1-3.1.0-2.el8fdp, openvswitch3.1-3.1.0-1.el9fdp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-342 0 None None None 2022-01-24 04:50:00 UTC

Description Pradipta Kumar Sahoo 2019-11-26 12:43:16 UTC
Description of problem:
By default, the OVS datapath packet counter is shown with up to 10 digits.

During an offload performance test at 53 Mpps, we noticed that the OVS offload datapath flow statistics overrun within 100 seconds, with the packet counter peaking at ~53000000.
We would expect the packet counter to reach at least 9999999999 before resetting. We also need to think about the ideal counter size when running over a 100G NIC.


Version-Release number of selected component (if applicable):
	Red Hat OpenStack Platform release 13.0.9 (Queens)
	Red Hat Enterprise Linux Server release 7.7 (Maipo)
	openstack-neutron-openvswitch-12.1.0-2.el7ost.noarch
	openvswitch2.11-2.11.0-26.el7fdp.x86_64
	openvswitch2.11-test-2.11.0-26.el7fdp.noarch
	python-openvswitch2.11-2.11.0-26.el7fdp.x86_64

How reproducible:
100% reproducible in the lab.


Steps to Reproduce:
  # ovs-dpctl dump-flows -m type=offloaded

BR,
Pradipta

Comment 2 Pradipta Kumar Sahoo 2021-02-09 15:44:50 UTC
I still see the same behaviour in OSP 16.1/16.2 with ovs2.13. Updating the log for further review.

- The OVS offload datapath counter is usually shown with 10 digits, e.g. “packets:4279836592”.
- In the throughput test, the datapath packet counter overran once it reached roughly 4.2 billion (~4200000000).
- We would expect the counter to reach about 10 billion (9999999999) before resetting.
- Also, we need to think about the ideal size of the packet counter when running over 100G network bandwidth.
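
To put the counter-size question in numbers, the sketch below estimates how long counters of different widths would last at 100G line rate. It assumes minimum-size 64-byte Ethernet frames with 20 bytes of per-frame wire overhead (preamble plus inter-frame gap), and it treats the observed ceiling of about 4.2 billion as a 32-bit limit, which is an assumption at this point in the report. The function names are illustrative only; nothing here is taken from the OVS or kernel sources.

```python
# Rough sizing sketch: how long does a packet counter last at line rate?
# Assumptions (not from the bug report): 64-byte frames and 20 bytes of
# per-frame wire overhead (8-byte preamble/SFD + 12-byte inter-frame gap).

WIRE_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps: float, frame_bytes: int = 64) -> float:
    """Maximum packets per second for a given link speed and frame size."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_per_frame


def seconds_until_limit(limit: int, pps: float) -> float:
    """Seconds for a counter starting at zero to reach `limit` at `pps`."""
    return limit / pps


if __name__ == "__main__":
    pps = line_rate_pps(100e9)
    limits = {
        "32-bit binary": 2**32,
        "10 decimal digits": 10**10,
        "64-bit binary": 2**64,
    }
    print(f"100G line rate at 64B frames: {pps / 1e6:.1f} Mpps")
    for name, limit in limits.items():
        print(f"{name}: {seconds_until_limit(limit, pps):.3g} s")
```

Under these assumptions a 32-bit counter lasts well under a minute at 100G, while a 64-bit counter lasts for thousands of years, so the counter width matters far more than the number of decimal digits displayed.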

http://pbench.perf.lab.eng.bos.redhat.com/results/perf122.perf.lab.eng.bos.redhat.com/trafficgen_RHOSP16.1-RHEL8.2-OVS-OFFLOAD-PVP-Update-LossTests_tg:trex_r:none_fs:64,128,256,512,1024,1500_nf:1024_fm:si_td:bi_ml:0.002,0.0005,0.0001_tt:bs__2020-09-25T15:10:19/1-bidirectional-64B-1024flows-0.002pct_drop/sample3/tools-default/nfv-compute-rt-offload-0/openvswitch/ovs-offload-stats.txt

Comment 6 Mike Pattrick 2022-01-24 04:43:22 UTC
Hello Pradipta,

Good find on this. I've done a little investigation and found that, until recently, tc only exposed a 32-bit variable for the packet count from the Linux kernel. It maxes out at 4294967295 and then rolls over.

However, in late 2019 an extension was added to make 64-bit variables accessible.

I will investigate making this packet counter available from dump-flows instead of the 32-bit one.

This issue only affects offloaded datapaths, as OVS internals use 64-bit numbers to represent packet counts.
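
The 32-bit limit described in this comment can be illustrated with a little arithmetic. The sketch below computes how long a 32-bit counter lasts at the 53 Mpps rate from the original report, and shows one common way for a monitoring script to recover the packet delta between two samples despite a rollover. It assumes the counter wraps at most once between samples. The helper names are hypothetical, and this is only a possible workaround on the reading side, not the approach proposed above, which is to expose the 64-bit counter.

```python
# Sketch of 32-bit counter rollover and a wrap-tolerant delta calculation.
# Assumption: the counter wraps at most once between two samples.

UINT32_MODULUS = 2**32  # a 32-bit counter holds 0 .. 4294967295


def seconds_to_wrap(pps: float) -> float:
    """Seconds for a 32-bit counter starting at zero to wrap at `pps`."""
    return UINT32_MODULUS / pps


def wrapped_delta(previous: int, current: int) -> int:
    """Packets counted between two samples, tolerating a single wrap."""
    return (current - previous) % UINT32_MODULUS


if __name__ == "__main__":
    print(f"wrap after {seconds_to_wrap(53e6):.1f} s at 53 Mpps")
    # Counter sampled just below the ceiling, then again after it rolled over.
    print(wrapped_delta(4294967290, 10))  # 16
```

At 53 Mpps the wrap comes after roughly 81 seconds, which is consistent with the overrun "within 100sec" described in the original report.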

Comment 7 Zhiqian Guan 2022-12-08 07:34:22 UTC
shift to qding

Comment 8 Kevin Traynor 2023-06-16 10:53:13 UTC
Thanks Mike. The patch merged here: https://github.com/openvswitch/ovs/commit/006e1c6dbfbadf474c17c8fa1ea358918d371588

I have updated the Fixed In Version field with the first downstream releases containing this patch.

