Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
The FDP team is no longer accepting new bugs in Bugzilla; please report your issues under the FDP project in Jira. Thanks.

Bug 2109452

Summary: [OVN][OVS HWOFFLOAD] It takes some time to start offloading packets
Product: Red Hat Enterprise Linux Fast Datapath
Component: openvswitch
Sub component: ovs-hw-offload
Reporter: Miguel Angel Nieto <mnietoji>
Assignee: Marcelo Ricardo Leitner <mleitner>
Status: CLOSED DUPLICATE
Severity: low
Priority: low
CC: apevec, chrisw, ctrautma, fleitner, hakhande, mlavalle, mleitner, oblaut, qding, ralonsoh, supadhya
Version: FDP 22.A
Keywords: AutomationBlocker, Regression
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2023-01-05 10:39:56 UTC
Bug Depends On: 2006605

Description Miguel Angel Nieto 2022-07-21 09:40:11 UTC
Description of problem:
In 16.2, when a new UDP flow is received, only the first packet is not offloaded (running tcpdump on the representor port captures exactly 1 packet).
In 17.0, however, tcpdump captures many more packets on the representor port, so offloading the flow appears to take longer.

I am testing with iperf, sending UDP packets for 10 seconds. Around 850 packets are sent and tcpdump captures around 80 of them, so I would estimate that offloading takes around 1 second, whereas in 16.2 it took only a few milliseconds.
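As a back-of-envelope check (a sketch based only on the approximate packet counts above, not a measurement), the reported numbers can be turned into a latency estimate:

```shell
# Rough offload-latency estimate from the figures in this report:
# ~850 UDP packets sent over 10 s by iperf, ~80 of them seen on the
# representor port before the flow is offloaded to hardware.
sent=850; duration=10; captured=80
awk -v s="$sent" -v d="$duration" -v c="$captured" \
  'BEGIN { pps = s / d; printf "estimated offload latency: %.2f s\n", c / pps }'
```

At ~85 packets/s, seeing ~80 packets on the representor corresponds to roughly 0.94 s before offloading takes effect, which matches the "around 1 second" estimate above.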


Version-Release number of selected component (if applicable):
RHOS-17.0-RHEL-9-20220701.n.1
openvswitch2.17-2.17.0-30.el9fdp.x86_64


How reproducible:
1. Deploy an OVN hw-offload setup
2. Create 2 instances using a VLAN provider network
3. Run tcpdump on the representor ports
4. Run iperf between the VMs

tcpdump should capture only the first packet of each flow, but it is capturing more.
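A hypothetical sketch of the measurement step (interface and VM names are placeholders, not taken from the report); the runnable part below only demonstrates extracting the packet count from tcpdump's exit summary:

```shell
# On the compute node, watch the VF representor while iperf runs in the VMs:
#   tcpdump -nn -i <representor> udp -w /tmp/rep.pcap &   # step 3
#   # VM1: iperf -s -u         VM2: iperf -c <vm1_ip> -u -t 10   # step 4
#   kill %1
#
# On exit, tcpdump prints a summary such as "80 packets captured"; extract
# the count to compare against the expected ~1 packet per new flow:
summary='80 packets captured'    # stand-in for tcpdump's real exit summary
captured=$(printf '%s\n' "$summary" | awk '{print $1}')
echo "$captured"
```

Once the flow is offloaded to hardware, the representor port should stop seeing packets, so the captured count approximates how many packets went through the slow path.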



Actual results:
It takes a long time to start offloading packets.


Expected results:
Offloading should start immediately after receiving the first packet of a flow.


Additional info:
I will include logs and an sos report.

Comment 3 Marcelo Ricardo Leitner 2022-07-22 15:29:24 UTC
(In reply to Miguel Angel Nieto from comment #0)
> tcpdump should capture only the first packet, but it is capturing more
> 

This

> Expected results:
> Offloading should start inmediately after receiving the first packet of a
> flow

And this are wrong expectations. Offloading of tc flows is serialized with the upcall handling, but there is nothing saying the OVS kernel can't queue more packets on the upcall netlink socket while vswitchd hasn't yet been scheduled to handle the first one. Similarly, while vswitchd is handling the first packet, subsequent ones may already be triggering a miss in hardware and get handled by tc software or even vswitchd.

ovs 2.15 had one way of spreading the upcall load, and from 2.16 onwards it changed. It did impact the OVS learn rate, but was reported mainly in terms of how well it scales:
https://bugzilla.redhat.com/show_bug.cgi?id=2006605

I don't know if it also impacted the insert latency. Maybe it did.

> Actual results:
> It takes  a long time to start offloading packets

Is conntrack involved? If so, this gets even worse, because offloading of conntrack entries is asynchronous.

Comment 4 Miguel Angel Nieto 2022-07-29 09:43:21 UTC
(In reply to Marcelo Ricardo Leitner from comment #3)
> (In reply to Miguel Angel Nieto from comment #0)
> > tcpdump should capture only the first packet, but it is capturing more
> > 
> 
> This
> 
> > Expected results:
> > Offloading should start inmediately after receiving the first packet of a
> > flow
> 
> And this are wrong expectations. Offloading of tc flows is serialized with
> the upcall handling, but there is nothing saying that ovs kernel can't queue
> more packets in the upcall netlink socket while vswitchd wasn't scheduled
> yet to handle the first one. Similarly, while vswitchd might be handling the
> first packet, subsequent ones may be triggering a miss in hw already, and
> get handled by tc sw or even vswitchd.

What I am pointing out here is a difference in behaviour compared to OSP 16.2. The offload test cases are tailored to the behaviour of OSP 16.2 and they are failing in 17.0.
My test cases use a single flow, but I am wondering if this could be an issue in a scenario with many short-duration flows. In that case, it could happen that most of the traffic is not offloaded.
> 
> ovs 2.15 had one way of spreading the upcall load, and from 2.16 onwards it
> changed. It did impact on the ovs learn rate, but reported mainly on how
> much it scales:
> https://bugzilla.redhat.com/show_bug.cgi?id=2006605
> 
> I don't know if it also impacted the insert latency. Maybe it did.
> 
> > Actual results:
> > It takes  a long time to start offloading packets
> 
> Is conntrack involved? If yes, this gets even worse, because offloading of
> conntrack entries is async.

I only tested it without conntrack.


What should I do with this BZ: should it be investigated further to check whether it could cause issues in some scenarios, or should I close it directly?

Comment 6 Marcelo Ricardo Leitner 2022-08-02 15:35:36 UTC
(In reply to Miguel Angel Nieto from comment #4)
> (In reply to Marcelo Ricardo Leitner from comment #3)
> > (In reply to Miguel Angel Nieto from comment #0)
> > > tcpdump should capture only the first packet, but it is capturing more
> > > 
> > 
> > This
> > 
> > > Expected results:
> > > Offloading should start inmediately after receiving the first packet of a
> > > flow
> > 
> > And this are wrong expectations. Offloading of tc flows is serialized with
> > the upcall handling, but there is nothing saying that ovs kernel can't queue
> > more packets in the upcall netlink socket while vswitchd wasn't scheduled
> > yet to handle the first one. Similarly, while vswitchd might be handling the
> > first packet, subsequent ones may be triggering a miss in hw already, and
> > get handled by tc sw or even vswitchd.
> 
> What I am pointing out here is a difference in the behaviour compared to 
> osp 16.2. Offfload testcases are tailored to the behaviour of osp16.2 and
> they are  failing in 17.0
> My testcases uses a single flow, but i am wondering if this could  be an
> issue in a scenario in which there are many short duration flows. In that
> case, it could happen that most of the traffic is not offloaded.

Yes, that's a possibility. Short flows can be a pain for HWOL.

> > 
> > ovs 2.15 had one way of spreading the upcall load, and from 2.16 onwards it
> > changed. It did impact on the ovs learn rate, but reported mainly on how
> > much it scales:
> > https://bugzilla.redhat.com/show_bug.cgi?id=2006605
> > 
> > I don't know if it also impacted the insert latency. Maybe it did.
> > 
> > > Actual results:
> > > It takes  a long time to start offloading packets
> > 
> > Is conntrack involved? If yes, this gets even worse, because offloading of
> > conntrack entries is async.
> 
> I only tested it without conntrack.

Then it should have pretty much the same behavior as before. I'm not aware of any changes, other than CT, that could change things that drastically.

> 
> 
> What should I do with this bz, should it be futher investigated to check if
> it could cause issues in some scenarios or do I directly close it?

I suggest you pick one of the failing use cases and we focus on it here.
Then, after some initial triaging, I believe you will be able to check the others and see whether it's the same issue or not.

The issue that Rashid mentioned is this one:
https://bugzilla.redhat.com/show_bug.cgi?id=2110018

If you take 'ovs-appctl dpctl/dump-flows -m' output during the test and spot a clone() action in there, it's very likely the same issue.
So far, I have only seen it affect CT flows, but I think it could affect non-CT use cases as well.
And if it does, the packets would never get offloaded.

So first, let's pick a use case, ideally the simplest non-working one, and take a dpctl dump as above after the test has been running for some time. We will use the stats to check which flows the test is exercising and to understand where the issue might be.
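The clone() check described above can be scripted. The dump lines below are abbreviated, illustrative stand-ins for 'dpctl/dump-flows -m' output (the ufids and actions are invented for the example); only the grep pattern is the point:

```shell
# On the host you would capture the dump with:
#   ovs-appctl dpctl/dump-flows -m > /tmp/dump-flows.txt
# Illustrative stand-in for that output (not real data):
cat > /tmp/dump-flows.txt <<'EOF'
ufid:11111111-2222-3333-4444-555555555555, recirc_id(0),in_port(2),eth_type(0x0800), packets:812, bytes:65000, used:0.130s, actions:clone(ct(commit),3)
ufid:66666666-7777-8888-9999-000000000000, recirc_id(0),in_port(3),eth_type(0x0806), packets:3, bytes:180, used:4.200s, actions:2
EOF
# Count in-use flows whose actions include clone() -- per the comment above,
# such flows are the likely non-offloadable ones:
grep -c 'actions:.*clone(' /tmp/dump-flows.txt
```

A nonzero count on a flow that the test traffic is actually hitting (check the packets: counters) would point at the same issue as bug 2110018.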

Haresh, I'd appreciate it if you could take a look too.

Thanks!

Comment 7 Miguel Angel Nieto 2022-10-03 14:22:28 UTC
I think this issue is related to the kernel version.

I cannot reproduce the issue when using a newer kernel:
Linux computehwoffload-r730 5.14.0-168.el9.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Sep 23 09:31:26 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux

Comment 10 Miguel Angel Nieto 2023-01-05 10:39:56 UTC
I think this is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=2108213

*** This bug has been marked as a duplicate of bug 2108213 ***