Bug 1957786

Summary: Move blocked and natted bits from ct_label to ct_mark
Product: Red Hat Enterprise Linux Fast Datapath Reporter: Alaa Hleihel (NVIDIA Mellanox) <ahleihel>
Component: OVN Assignee: Dumitru Ceara <dceara>
Status: CLOSED ERRATA QA Contact: Jianlin Shi <jishi>
Severity: unspecified Docs Contact:
Priority: high    
Version: RHEL 8.0 CC: adrianc, ctrautma, dceara, mleitner, mmichels, zshi
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2097221 (view as bug list) Environment:
Last Closed: 2022-06-30 17:59:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2097221    

Description Alaa Hleihel (NVIDIA Mellanox) 2021-05-06 13:31:29 UTC
As discussed during our meeting, NVIDIA asked to move blocked and natted bits from ct_label to ct_mark

Comment 1 Dumitru Ceara 2021-06-21 14:53:31 UTC
Hi Alaa,

Would it be OK to make this BZ public?  That is, remove the "mellanox" group?

Thanks,
Dumitru

Comment 2 Alaa Hleihel (NVIDIA Mellanox) 2021-06-21 14:55:58 UTC
(In reply to Dumitru Ceara from comment #1)
> Hi Alaa,
> 
> Would it be OK to make this BZ public?  That is, remove the "mellanox" group?
> 
> Thanks,
> Dumitru

Hi, Dumitru.
Sure, no problem.

Comment 3 Marcelo Ricardo Leitner 2021-07-07 21:41:29 UTC
Hey Alaa, it was agreed some meetings ago that Han was going to look into this, so I'll reassign this bz to you to reflect that.
There is a bz account for Han Zhou but, well, I'm not sure it is his.

Comment 4 Alaa Hleihel (NVIDIA Mellanox) 2021-07-08 06:29:27 UTC
Right.
Also, copying some info from the mail-thread:

````
Hardware offload won't work for flows that match on ct_label with a mask.

We currently do that in OVN because we use ct_label for a few things:

- match on ecmp_reply_eth address (48 bits)
- match on ecmp_reply_port (16 bits)
- match on ct_label.blocked (1 bit)
- match on ct_label.natted (1 bit)

The recommendation was to move the bits we match against with a mask (i.e., blocked, natted and ecmp_reply_port) to ct_mark because the hardware supports masked matches of ct_mark (32 bits).
````
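
For illustration only, here is a minimal Python sketch of how the three masked fields could share a 32-bit ct_mark and what the resulting value/mask matches might look like. The bit positions (blocked at bit 0, natted at bit 1, ecmp_reply_port in the upper 16 bits) and the helper names are assumptions made for this example, not taken from the mail thread. The `ct_mark=value/mask` string follows the usual OpenFlow-style match notation.

```python
# Sketch only: the bit layout below is an assumption, not confirmed by this BZ.
CT_MARK_WIDTH = 32           # ct_mark is a 32-bit conntrack field

BLOCKED_SHIFT = 0            # assumed position of the "blocked" bit
NATTED_SHIFT = 1             # assumed position of the "natted" bit
ECMP_REPLY_PORT_SHIFT = 16   # assumed start of ecmp_reply_port
ECMP_REPLY_PORT_WIDTH = 16   # 16 bits, as listed in the mail thread


def field_mask(shift: int, width: int) -> int:
    """Return a mask covering `width` bits starting at bit `shift`."""
    return ((1 << width) - 1) << shift


def masked_match(value: int, shift: int, width: int) -> str:
    """Format a masked ct_mark match for one field."""
    mask = field_mask(shift, width)
    return f"ct_mark=0x{(value << shift) & mask:x}/0x{mask:x}"


def moved_bits() -> int:
    """Total bits moved out of ct_label: blocked + natted + ecmp_reply_port."""
    return 1 + 1 + ECMP_REPLY_PORT_WIDTH


if __name__ == "__main__":
    print(masked_match(1, BLOCKED_SHIFT, 1))
    print(masked_match(1, NATTED_SHIFT, 1))
    print(masked_match(5, ECMP_REPLY_PORT_SHIFT, ECMP_REPLY_PORT_WIDTH))
    print(moved_bits() <= CT_MARK_WIDTH)
```

The three moved fields need 18 bits in total, so they fit within the 32 bits of ct_mark, while the 48-bit ecmp_reply_eth address does not and stays in ct_label.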

Comment 5 Alaa Hleihel (NVIDIA Mellanox) 2021-07-13 07:53:47 UTC
note to self: internal ticket SDN-915

Comment 7 Marcelo Ricardo Leitner 2022-04-27 23:30:42 UTC
Sorry but I forgot, what are the plans again here now? Get it into downstream on the next version and test it?

Comment 8 Dumitru Ceara 2022-04-28 09:57:38 UTC
(In reply to Marcelo Ricardo Leitner from comment #7)
> Sorry but I forgot, what are the plans again here now? Get it into
> downstream on the next version and test it?

We have two options:

1. Let it be picked up by FDP when we rebase on top of upstream
   ovn22.06.0 (scheduled on June 3rd). This is a personal guess but I
   think this won't be picked up before OCP 4.12.

2. Have it backported upstream to the upstream LTS branch (branch-22.03),
   and then pick it up in the next FDP release (22.D). AFAIK we're
   past feature freeze in OCP 4.11 so this potentially won't be picked
   up before OCP 4.12 either.

Han Zhou (NVidia) mentioned that he'll be looking into doing the
backport to the upstream LTS branch (point 2 above).

Comment 10 Dumitru Ceara 2022-05-23 19:42:43 UTC
Fix merged upstream to main, branch-22.03 and backport posted for branch-21.12:

https://patchwork.ozlabs.org/project/ovn/list/?series=301671&state=*

Comment 11 Marcelo Ricardo Leitner 2022-06-01 18:33:59 UTC
Hi folks. Can we be sure that the fix will be included in OCP 4.11?
Late last week Numan included the fix in the downstream branches, but that's where my knowledge ends.

Comment 12 Dumitru Ceara 2022-06-02 09:25:31 UTC
(In reply to Marcelo Ricardo Leitner from comment #11)
> Hi folks. Can we be sure that the fix will be included in OCP 4.11?
> Late last week Numan included the fix in the downstream branches, but that's
> where my knowledge ends.

Hi Marcelo,

It's already in there:
https://github.com/openshift/ovn-kubernetes/blob/f88113aa1c9a59936ce4d09f383f829a8f9f2c22/Dockerfile#L36

Since:
https://github.com/openshift/ovn-kubernetes/commit/7d557e065cae715cab1863b35e944c0dc934d371

I don't think we have an OVN 22.06 errata including this yet though.  Once that happens this BZ will also be moved to MODIFIED.

Thanks,
Dumitru

Comment 15 Jianlin Shi 2022-06-13 03:02:47 UTC
Ran the basic OVN hw-offload test case on ovn22.03-22.03.0-52; no new issues found. Setting to VERIFIED.

Comment 16 Marcelo Ricardo Leitner 2022-06-21 13:00:18 UTC
FWIW, the request for OCP 4.10 to include this fix is being tracked at: https://bugzilla.redhat.com/show_bug.cgi?id=2097221

Comment 18 errata-xmlrpc 2022-06-30 17:59:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:5446