Bug 1399987

Summary: [RFE] allow to limit conntrack entries per tenant to avoid "nf_conntrack: table full, dropping packet"
Product: Red Hat OpenStack
Reporter: Pablo Iranzo Gómez <pablo.iranzo>
Component: openstack-neutron
Assignee: OSP Team <rhos-maint>
Status: CLOSED UPSTREAM
QA Contact: Toni Freger <tfreger>
Severity: medium
Docs Contact:
Priority: high
Version: 17.0 (Wallaby)
CC: adhingra, afariasa, alisci, aruffin, astupnik, bcafarel, cfontain, chrisw, dhill, erinn.looneytriggs, fherrman, hakhande, ihrachys, jlibosva, johender, jraju, madgupta, majopela, pablo.iranzo, pbarta, ralonsoh, rhos-maint, rlondhe, shtiwari, skaplons, srevivo, yiche
Target Milestone: ---
Keywords: FutureFeature, Reopened, ZStream
Target Release: ---
Flags: johender: needinfo-, pablo.iranzo: needinfo-
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1531074
Environment:
Last Closed: 2024-12-03 15:36:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1531074
Bug Blocks: 1509630, 2049755

Description Pablo Iranzo Gómez 2016-11-30 08:17:13 UTC
Description of problem:

A tenant can cause network issues for other tenants: nf_conntrack: table full, dropping packet.

In our cloud, a JMeter performance test running on two instances caused network issues for other tenants.

In the /var/log/messages on the compute node we see the following message:
"nf_conntrack: table full, dropping packet."


This Gerrit change https://review.openstack.org/#/c/275769/ raises the limit to 500,000, but that is only a workaround, because a single tenant can still consume entries up to the new limit.


It's possible to limit bandwidth on a port (https://access.redhat.com/documentation/en/red-hat-openstack-platform/8/paged/networking-guide/chapter-10-configure-quality-of-service-qos), but you cannot limit the conntrack sessions for an instance, port or tenant.
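
To see how close a compute node is to the "table full" condition described above, the global counters can be read from procfs. The following is a minimal sketch, assuming the nf_conntrack module is loaded and exposes its usual nf_conntrack_count and nf_conntrack_max sysctl files; the function name conntrack_usage is only illustrative. Note that it reports the global table only, which is exactly the limitation this RFE is about.

```python
from pathlib import Path

# Standard sysctl location exposed by the nf_conntrack module (assumed loaded).
PROC_DIR = Path("/proc/sys/net/netfilter")


def conntrack_usage(proc_dir=PROC_DIR):
    """Return (count, max, fraction used) for the global conntrack table."""
    count = int((proc_dir / "nf_conntrack_count").read_text().strip())
    limit = int((proc_dir / "nf_conntrack_max").read_text().strip())
    return count, limit, count / limit


if __name__ == "__main__":
    count, limit, used = conntrack_usage()
    print(f"conntrack entries: {count}/{limit} ({used:.1%} used)")
```

Because the counters are global, a high value here does not say which tenant is responsible.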

Comment 2 Nir Yechiel 2017-06-12 11:28:31 UTC
Assaf, can we have someone from the team look into this issue? If it makes sense, it seems like a straightforward fix on the TripleO side.

Thanks,
Nir

Comment 10 Miguel Angel Ajo 2018-01-04 14:08:42 UTC
I have looked into the issue, and the only option, as @jlibosva said, is to have the kernel create separate hash tables (or at least separate counts [2]) per conntrack zone. Checking [1], we can see that the kernel creates a single big table for the whole system.

In more recent kernels, the max count is still global [3], and so is the hash table [4].


[1] https://access.redhat.com/labs/psb/versions/kernel-3.10.0-693.11.1.el7/net/netfilter/nf_conntrack_core.c#line481

[2] https://access.redhat.com/labs/psb/versions/kernel-3.10.0-693.11.1.el7/net/netfilter/nf_conntrack_core.c#line868

[3] https://elixir.free-electrons.com/linux/v4.15-rc6/source/net/netfilter/nf_conntrack_core.c#L1109

[4] https://elixir.free-electrons.com/linux/v4.15-rc6/source/net/netfilter/nf_conntrack_core.c#L74
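
The difference between a single global count and per-zone counts can be shown with a toy model. This is only a sketch of the idea in Python, not the kernel's data structures; the class ConntrackTable, its zone_limit parameter and the simulate helper are hypothetical names introduced for illustration.

```python
from collections import Counter


class ConntrackTable:
    """Toy model: one global table, optionally with a per-zone cap."""

    def __init__(self, global_max, zone_limit=None):
        self.global_max = global_max
        self.zone_limit = zone_limit  # None models a purely global limit
        self.per_zone = Counter()

    def total(self):
        return sum(self.per_zone.values())

    def add(self, zone):
        """Try to track one new connection; return False if it is dropped."""
        if self.total() >= self.global_max:
            return False  # corresponds to "table full, dropping packet"
        if self.zone_limit is not None and self.per_zone[zone] >= self.zone_limit:
            return False  # only the zone that hit its own cap is refused
        self.per_zone[zone] += 1
        return True


def simulate(table):
    """Zone 1 is a noisy tenant opening 10 connections; then zone 2 tries one."""
    for _ in range(10):
        table.add(zone=1)
    return table.add(zone=2)


if __name__ == "__main__":
    print(simulate(ConntrackTable(global_max=8)))                # zone 2 is dropped
    print(simulate(ConntrackTable(global_max=8, zone_limit=4)))  # zone 2 is accepted
```

With only the global limit, the noisy zone fills the table and the other zone's first connection is dropped; with a per-zone count, the noisy zone is stopped at its own cap.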

Comment 11 Miguel Angel Ajo 2018-01-04 14:20:52 UTC
I have filed a bug against kernel/netfilter on RHEL 8:

https://bugzilla.redhat.com/show_bug.cgi?id=1531074

Comment 12 Petr Barta 2018-01-09 09:56:17 UTC
Hello Miguel,
  Thanks for the information.

  From the customer's point of view, the important questions are whether the request to implement this (creation of separate hash tables) is feasible, what the requirements are, and when it can be implemented.

  Is there any info we are able to pass to the customer with regard to this? I see a new BZ was created targeting RHEL 8, so I guess it will take time. Do you think that it can be backported to RHEL 7 and related OSP environments?

Thanks,
Petr

Comment 13 Miguel Angel Ajo 2018-01-18 09:16:14 UTC
(In reply to Petr Barta from comment #12)
> Hello Miguel,
>   Thanks for the information.
> 
>   From the customer point of view important question is, whether the request
> to implement this (creation of separate hash tables) is feasible, what are
> the requirements, when this can be implemented.
> 
>   Is there any info we are able to pass to the customer with regard to this?
> I see there was new BZ created, with target to RHEL8, so I guess it will
> take time. Do you think that it can be backported to RHEL7 and related OSP
> environments?
> 
> Thanks,
> Petr

Hey Petr, we need to ask on the RHEL bug I opened against the netfilter component. I know how it could be done, but I don't have expertise in upstream kernel development and backports. Let's ask the experts in that area.

Comment 14 Miguel Angel Ajo 2018-01-18 09:18:22 UTC
Petr, we already had a positive answer from the kernel developers. Please keep an eye on https://bugzilla.redhat.com/show_bug.cgi?id=1531074#c1 and ask them about timelines :)

Comment 15 Petr Barta 2018-01-18 09:26:22 UTC
Hello Miguel,
  ok, thanks for the info, will monitor the kernel bz and will ask there.

BR,
Petr

Comment 16 Jakub Libosvar 2018-05-15 13:57:38 UTC
*** Bug 1558462 has been marked as a duplicate of this bug. ***

Comment 19 David Hill 2019-07-11 20:02:33 UTC
This happened again in OSP 10.

Comment 22 Mauro Oddi 2019-12-06 15:40:29 UTC
Hi Assaf, 

Could you provide a timeline to the customer regarding this feature request?
The kernel now provides a way to limit connections per ct zone.
If there is a blueprint on this that can be shared, that would be appreciated too.


Thanks and Best Regards,
Mauro S. Oddi
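
For reference, one existing interface to per-zone conntrack limits is the Open vSwitch dpctl/ct-set-limits command. The sketch below only builds the argument list and does not execute it. It assumes an OVS version and kernel datapath that support zone limits; the zone ids and limit values are hypothetical, and how Neutron would map ports or tenants to zones is not addressed here.

```python
def ct_set_limits_cmd(zone_limits, default=None):
    """Build an `ovs-appctl dpctl/ct-set-limits` argument list.

    zone_limits maps a conntrack zone id to the maximum number of
    connections allowed in that zone; default is the optional limit
    applied to zones without an explicit entry.
    """
    cmd = ["ovs-appctl", "dpctl/ct-set-limits"]
    if default is not None:
        cmd.append(f"default={default}")
    for zone, limit in sorted(zone_limits.items()):
        cmd.append(f"zone={zone},limit={limit}")
    return cmd


if __name__ == "__main__":
    # Hypothetical zones 10 and 11; the command is printed, not run.
    print(" ".join(ct_set_limits_cmd({10: 20000, 11: 20000}, default=100000)))
```

The resulting list could be passed to subprocess.run on a host where the limits should be applied; the current values can be inspected with dpctl/ct-get-limits.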

Comment 25 Mike Burns 2020-08-26 21:25:10 UTC
This Release is retired.  If this bug is still relevant, please reopen and retarget to an open release.

Comment 26 David Hill 2020-08-27 12:32:05 UTC
It's an RFE.

Comment 41 Ihar Hrachyshka 2023-04-26 12:06:04 UTC
I believe OVN doesn't yet support zone limits set per port, so additional work is needed in OVN before we can implement this in Neutron. A counterpart bug should be filed against the ovn component, in addition to this one, to track the RFE there.

Comment 42 Alex Stupnikov 2023-04-26 13:04:59 UTC
Thanks for your help and advice. I have reported bug #2189924 for OVN and would appreciate a second look from anyone involved. I think that this bug's metadata should be updated: there is a different blocker now and focus here should be switched to OVN.

Comment 53 Red Hat Bugzilla 2025-04-18 04:25:02 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days