Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 2097661

Summary: conntrack creates two rules for the same traffic in different zones
Product: Red Hat Enterprise Linux 8
Reporter: Rodolfo Alonso <ralonsoh>
Component: iptables
Assignee: Phil Sutter <psutter>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: qe-baseos-daemons
Severity: high
Docs Contact:
Priority: medium
Version: 8.4
CC: fwestpha, todoleza
Target Milestone: rc
Flags: pm-rhel: mirror+
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-08-24 14:35:23 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2097324

Comment 3 Phil Sutter 2022-06-22 12:44:12 UTC
Rodolfo,

Can you perhaps paste the relevant iptables rules involved in the communication
between the two VMs? I don't quite understand how conntrack kills communication
between them (in the original problem). Or the other way round, how it is
required to allow communication.

Thanks, Phil

Comment 4 Rodolfo Alonso 2022-06-29 09:41:58 UTC
Hello Phil:

Two VMs with one port each. Both VMs are on the same host.
1) Security group rule in place that allows ingress traffic for TCP 5555 [1].
[root@compute-0 /]# conntrack -L | grep 5555
tcp      6 431892 ESTABLISHED src=192.168.0.92 dst=192.168.0.87 sport=53288 dport=5555 src=192.168.0.87 dst=192.168.0.92 sport=5555 dport=53288 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=4098 use=1
tcp      6 431892 ESTABLISHED src=192.168.0.92 dst=192.168.0.87 sport=53288 dport=5555 src=192.168.0.87 dst=192.168.0.92 sport=5555 dport=53288 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=4097 use=1


2) Security group rule deleted; the iptables rule has been deleted [2]. The CT rule of the client port is deleted; the CT rule of the server is not.
[root@compute-0 /]# conntrack -L | grep 5555
tcp      6 431806 ESTABLISHED src=192.168.0.92 dst=192.168.0.87 sport=53288 dport=5555 src=192.168.0.87 dst=192.168.0.92 sport=5555 dport=53288 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=4097 use=1


3) From the server VM, I send new traffic. The CT rule in the client VM zone is created again:
[root@compute-0 /]# conntrack -L | grep 5555
tcp      6 431997 ESTABLISHED src=192.168.0.92 dst=192.168.0.87 sport=53288 dport=5555 src=192.168.0.87 dst=192.168.0.92 sport=5555 dport=53288 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=4098 use=1
tcp      6 431997 ESTABLISHED src=192.168.0.92 dst=192.168.0.87 sport=53288 dport=5555 src=192.168.0.87 dst=192.168.0.92 sport=5555 dport=53288 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=4097 use=1


Is the CT entry in zone 4097 allowing the creation of the CT entry in zone 4098 in (3)?

Regards.

[1]https://paste.opendev.org/show/btS0JilajTRXsUnfGyT4/
[2]https://paste.opendev.org/show/bGDq4Ypzm4dNmIF2LNET/

Comment 6 Florian Westphal 2022-07-14 13:31:40 UTC
(In reply to Rodolfo Alonso from comment #4)

From initial description:

> tcp      6 428926 ESTABLISHED src=10.100.0.23 dst=10.100.0.26 sport=35496 dport=6666 .. zone=4134 use=1
> tcp      6 428893 ESTABLISHED src=10.100.0.26 dst=10.100.0.23 sport=6666 dport=35496 .. zone=4112 use=1

If I understand correctly, the setup should mimic this physical setup:

Client --> bridge1 ---> OVS ---> bridge2 ---->server
box1       box2         box3      box4          box5  
           "zone 4134"        "zone 4112" 

But that is not what it is doing from conntrack's perspective. If it were, the above pair would look like this:

  tcp      6 428926 ESTABLISHED src=10.100.0.23 dst=10.100.0.26 sport=35496 dport=6666 .. zone=4134 use=1
  tcp      6 428926 ESTABLISHED src=10.100.0.23 dst=10.100.0.26 sport=35496 dport=6666 .. zone=4112 use=1

Because a packet from box1 passes through both box2 and box4 (and box3, but that doesn't track).
But it doesn't; the tuples in zone 4112 are swapped.

So what is happening is that the zoning creates a unidirectional view:
Packets sent by box1 are placed in zone 4134.
Packets sent by box5 are placed in zone 4112.

OTOH, in comment #4 you provide:
> tcp      6 431892 ESTABLISHED src=192.168.0.92 dst=192.168.0.87 sport=53288 dport=5555 ... zone=4098 use=1
> tcp      6 431892 ESTABLISHED src=192.168.0.92 dst=192.168.0.87 sport=53288 dport=5555 ... zone=4097 use=1

which looks like the expected bidirectional view, i.e. box1 conntrack detects the packet,
then box2 detects it again in zone 4097.

Which is correct in your specific case?  Is the entry in the second zone expected to be inverted or not?
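
To check this mechanically, the original-direction tuples of the entries can be compared per zone. Below is a minimal Python sketch, assuming the `conntrack -L` text format shown in this bug. The helper names `parse_entry`, `invert` and `compare_zones` are made up for illustration and are not part of conntrack-tools.

```python
import re

# Sample taken from comment #4; in practice this would be `conntrack -L` output.
SAMPLE = """\
tcp      6 431892 ESTABLISHED src=192.168.0.92 dst=192.168.0.87 sport=53288 dport=5555 src=192.168.0.87 dst=192.168.0.92 sport=5555 dport=53288 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=4098 use=1
tcp      6 431892 ESTABLISHED src=192.168.0.92 dst=192.168.0.87 sport=53288 dport=5555 src=192.168.0.87 dst=192.168.0.92 sport=5555 dport=53288 [ASSURED] mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=4097 use=1
"""

TUPLE_RE = re.compile(r"src=(\S+) dst=(\S+) sport=(\d+) dport=(\d+)")
ZONE_RE = re.compile(r"\bzone=(\d+)")


def parse_entry(line):
    """Return (zone, original_tuple) for one conntrack line, or None."""
    tuples = TUPLE_RE.findall(line)
    zone = ZONE_RE.search(line)
    if not tuples or zone is None:
        return None
    # The first tuple on the line is the original direction.
    src, dst, sport, dport = tuples[0]
    return int(zone.group(1)), (src, dst, int(sport), int(dport))


def invert(tup):
    """Swap source and destination of a (src, dst, sport, dport) tuple."""
    src, dst, sport, dport = tup
    return (dst, src, dport, sport)


def compare_zones(text):
    """Label each cross-zone pair of entries as 'same' or 'inverted'."""
    entries = [e for e in map(parse_entry, text.splitlines()) if e]
    results = []
    for i, (zone_a, tup_a) in enumerate(entries):
        for zone_b, tup_b in entries[i + 1:]:
            if zone_a == zone_b:
                continue
            if tup_a == tup_b:
                results.append((zone_a, zone_b, "same"))
            elif tup_a == invert(tup_b):
                results.append((zone_a, zone_b, "inverted"))
    return results


print(compare_zones(SAMPLE))
```

For the two lines from comment #4 this reports the pair as "same"; the pair from the initial description would come out as "inverted".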

> Is the CT entry in zone 4097 allowing the creation of the CT entry in zone 4098 in (3)?

A zone serves as a "virtual" distinction marker; as far as conntrack is concerned, those two entries are completely unrelated.
There is no internal logic in conntrack that would create the second entry "because the other-zone entry exists".

What is strange is that the re-creation happens with the correct, original tuple.
If the reply from the server recreated the entry, you would see one where the direction is reversed.

I suspect this:

1. Flow gets flushed
2. box5 sends a packet back to box1
3. box4 has a conntrack entry, iptables "established" rule permits packet
4. the conntrack entry is not removed in ovs
5. box2 picks up the packet with the packet having retained its packet<->conntrack association
6. iptables "established" rule permits the packet (which is still in zone 4097)
7. box1 sends a packet back to box5
8. box2 picks this packet up as a new packet, in zone 4098

That explains why the "re-created" entry has the correct direction.
It doesn't explain why this client -> server packet is allowed in the absence of the "dport 5555 allowed"
rule, though. Perhaps another iptables rule allows the packet to pass?
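
The direction argument in the steps above can be illustrated with a toy model. This is only a sketch in Python under strong assumptions: `ToyConntrack` is a made-up class keyed by (zone, connection), it is not how the kernel stores entries, and the "packet keeps its conntrack association" part of the hypothesis is modelled simply by not doing a lookup in zone 4098 for the server's reply.

```python
class ToyConntrack:
    """Toy table keyed by (zone, connection); not the kernel's data structure."""

    def __init__(self):
        # (zone, direction-independent key) -> original-direction tuple
        self.entries = {}

    @staticmethod
    def _key(zone, tup):
        src, dst, sport, dport = tup
        # Both directions of a connection map to the same key within a zone.
        return (zone, frozenset([(src, sport), (dst, dport)]))

    def lookup(self, zone, tup):
        return self.entries.get(self._key(zone, tup))

    def see_packet(self, zone, tup):
        """Track a packet; on a miss, create an entry with this packet's direction."""
        key = self._key(zone, tup)
        created = key not in self.entries
        if created:
            self.entries[key] = tup
        return created

    def delete(self, zone, tup):
        self.entries.pop(self._key(zone, tup), None)


CLIENT_TO_SERVER = ("192.168.0.92", "192.168.0.87", 53288, 5555)
SERVER_TO_CLIENT = ("192.168.0.87", "192.168.0.92", 5555, 53288)

ct = ToyConntrack()
ct.see_packet(4098, CLIENT_TO_SERVER)  # client-side zone
ct.see_packet(4097, CLIENT_TO_SERVER)  # server-side zone

# Step 1: only the client-side entry is flushed.
ct.delete(4098, CLIENT_TO_SERVER)

# Steps 2-6: the server's reply matches the surviving zone 4097 entry.
# Assumption: no fresh lookup happens in zone 4098 for this packet.
reply_created_entry = ct.see_packet(4097, SERVER_TO_CLIENT)

# Steps 7-8: the client's next packet is new in zone 4098.
recreated = ct.see_packet(4098, CLIENT_TO_SERVER)

print(reply_created_entry, recreated, ct.lookup(4098, SERVER_TO_CLIENT))
```

In this model the re-created zone 4098 entry has the client -> server direction, matching what was observed. Had the server's reply been tracked as a new packet in zone 4098 instead, the stored original direction would be server -> client.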

You could try to use a rule like this to narrow this down:

iptables -t raw -I PREROUTING -i <bridge interface name> -p tcp --dport 5555 -j TRACE
or even 
iptables -t raw -I PREROUTING -p tcp --dport 5555 -j TRACE

and then run "xtables-monitor --trace" and see what evaluation order is shown for a packet.
If it's too much output, you might want to embed a "-m limit --limit 6/min --limit-burst 1" or something.
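
For reference, this is where the limit match would sit in the rule. It is a small Python sketch that only assembles and prints the command line and does not run anything; `trace_rule` is a hypothetical helper, and the rate and burst values are just the ones suggested above.

```python
import shlex


def trace_rule(dport, in_iface=None, rate="6/min", burst=1):
    """Build the argv for a raw-table TRACE rule, optionally rate-limited."""
    argv = ["iptables", "-t", "raw", "-I", "PREROUTING"]
    if in_iface:
        argv += ["-i", in_iface]
    argv += ["-p", "tcp", "--dport", str(dport)]
    if rate:
        # The limit match goes before the target so only matching
        # packets within the rate are traced.
        argv += ["-m", "limit", "--limit", rate, "--limit-burst", str(burst)]
    argv += ["-j", "TRACE"]
    return argv


print(shlex.join(trace_rule(5555)))
# prints: iptables -t raw -I PREROUTING -p tcp --dport 5555 -m limit --limit 6/min --limit-burst 1 -j TRACE
```

Passing `in_iface` adds the "-i <bridge interface name>" restriction from the first variant; passing `rate=None` gives the unlimited rule.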

Comment 7 Phil Sutter 2022-08-10 15:36:45 UTC
Any news here?

Comment 8 Rodolfo Alonso 2022-08-24 14:35:23 UTC
Hello Phil and Florian:

First of all, sorry for the delay. I've tried to test in the way you suggested, but I can't find any reason that explains why this is happening, and I can't provide better or more information on this BZ.

I've tested manually removing both rules (from both CT zones). That works as expected. However, I was also expecting it to work while a CT rule from another zone still exists; a packet from one zone should not use a rule from another one.

In any case, we'll fix this issue in the Neutron firewall. However, I think you should try to reproduce it locally.

Regards.