Bug 1399987 - [RFE] allow to limit conntrack entries per tenant to avoid "nf_conntrack: table full, dropping packet"
Summary: [RFE] allow to limit conntrack entries per tenant to avoid "nf_conntrack: tab...
Keywords:
Status: NEW
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 17.0 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
: ---
Assignee: OSP Team
QA Contact: Toni Freger
URL:
Whiteboard:
Duplicates: 1558462 (view as bug list)
Depends On: 1531074
Blocks: 1509630 2049755
Reported: 2016-11-30 08:17 UTC by Pablo Iranzo Gómez
Modified: 2023-07-11 05:09 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1531074 (view as bug list)
Environment:
Last Closed: 2020-08-26 21:25:10 UTC
Target Upstream Version:
Embargoed:
alisci: needinfo? (rhos-maint)
johender: needinfo-
pablo.iranzo: needinfo-




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | OSP-4277 | 0 | None | None | None | 2022-02-02 16:35:09 UTC
Red Hat Knowledge Base (Solution) | 4280221 | 0 | None | None | None | 2019-07-11 20:24:12 UTC

Description Pablo Iranzo Gómez 2016-11-30 08:17:13 UTC
Description of problem:

A tenant can cause network issues for other tenants: nf_conntrack: table full, dropping packet.

In our cloud, a JMeter performance test running on two instances caused network issues for other tenants.

In the /var/log/messages on the compute node we see the following message:
"nf_conntrack: table full, dropping packet."


This Gerrit change (https://review.openstack.org/#/c/275769/) raises the limit to 500,000, but that is only a workaround: a single tenant can still grow its usage up to the new limit.


It's possible to limit bandwidth on a port (https://access.redhat.com/documentation/en/red-hat-openstack-platform/8/paged/networking-guide/chapter-10-configure-quality-of-service-qos), but you cannot limit the conntrack sessions for an instance, port or tenant.
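
To see how close a compute node is to the "table full" condition described above, the global counters can be read directly. This is a minimal sketch, assuming the nf_conntrack module is loaded and exposes nf_conntrack_count and nf_conntrack_max under /proc/sys/net/netfilter; the function name and the sysctl_dir parameter are illustrative only and are not part of neutron or TripleO.

```python
from pathlib import Path


def conntrack_usage(sysctl_dir="/proc/sys/net/netfilter"):
    """Return (count, maximum, fraction_used) for the global conntrack table.

    sysctl_dir is a parameter only so the function can be pointed at a
    different directory; the default is the usual procfs location.
    """
    base = Path(sysctl_dir)
    # Both files hold a single integer followed by a newline.
    count = int((base / "nf_conntrack_count").read_text().strip())
    maximum = int((base / "nf_conntrack_max").read_text().strip())
    return count, maximum, count / maximum
```

Calling conntrack_usage() on a compute node returns the current entry count, the configured maximum and the fraction in use. Both numbers are system-wide, so they show that the table is filling up but not which tenant is filling it.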

Comment 2 Nir Yechiel 2017-06-12 11:28:31 UTC
Assaf, can we have someone from the team look into this issue? If it makes sense, it seems like a straightforward fix on the TripleO side.

Thanks,
Nir

Comment 10 Miguel Angel Ajo 2018-01-04 14:08:42 UTC
I have looked into the issue, and the only option, as @jlibosva said, is to have the kernel create separate hash tables (or at least separate counts [2]) per conntrack zone. Checking [1], we can see that the kernel creates a single big table for the whole system.

In more recent kernels, the max count is still global [3], and so is the hash table [4].


[1] https://access.redhat.com/labs/psb/versions/kernel-3.10.0-693.11.1.el7/net/netfilter/nf_conntrack_core.c#line481

[2] https://access.redhat.com/labs/psb/versions/kernel-3.10.0-693.11.1.el7/net/netfilter/nf_conntrack_core.c#line868

[3] https://elixir.free-electrons.com/linux/v4.15-rc6/source/net/netfilter/nf_conntrack_core.c#L1109

[4] https://elixir.free-electrons.com/linux/v4.15-rc6/source/net/netfilter/nf_conntrack_core.c#L74
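
Although the kernel keeps a single global count, the entries themselves can be grouped by zone from userspace to find out which zone is filling the table. This is a rough sketch, assuming the entries are listed in the /proc/net/nf_conntrack text format and that entries in a non-default zone carry a "zone=N" field; entries without that field are counted as zone 0. The function name is hypothetical, and direction-specific fields such as "zone-orig=" are not handled.

```python
import re
from collections import Counter

# Matches the "zone=N" field assumed to appear on non-default-zone entries.
ZONE_RE = re.compile(r"\bzone=(\d+)")


def entries_per_zone(lines):
    """Count conntrack entries per zone from an iterable of text lines."""
    counts = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        match = ZONE_RE.search(line)
        # Entries with no zone field are attributed to the default zone 0.
        counts[int(match.group(1)) if match else 0] += 1
    return counts
```

On a compute node this could be fed with the open file, for example entries_per_zone(open("/proc/net/nf_conntrack")), and the result inspected with most_common(). It only reports usage; it does not enforce any limit.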

Comment 11 Miguel Angel Ajo 2018-01-04 14:20:52 UTC
I have filed a bug against kernel/netfilter on RHEL 8:

https://bugzilla.redhat.com/show_bug.cgi?id=1531074

Comment 12 Petr Barta 2018-01-09 09:56:17 UTC
Hello Miguel,
  Thanks for the information.

  From the customer's point of view, the important question is whether the request to implement this (creation of separate hash tables) is feasible, what the requirements are, and when it can be implemented.

  Is there any info we are able to pass to the customer with regard to this? I see a new BZ was created, targeting RHEL8, so I guess it will take time. Do you think it can be backported to RHEL7 and related OSP environments?

Thanks,
Petr

Comment 13 Miguel Angel Ajo 2018-01-18 09:16:14 UTC
(In reply to Petr Barta from comment #12)
> Hello Miguel,
>   Thanks for the information.
> 
>   From the customer point of view important question is, whether the request
> to implement this (creation of separate hash tables) is feasible, what are
> the requirements, when this can be implemented.
> 
>   Is there any info we are able to pass to the customer with regard to this?
> I see there was new BZ created, with target to RHEL8, so I guess it will
> take time. Do you think that it can be backported to RHEL7 and related OSP
> environments?
> 
> Thanks,
> Petr

Hey Petr, we need to ask on the RHEL bug I opened against the netfilter component. I know how it could be done, but I don't have expertise in upstream kernel development and backports. Let's ask the experts in that area.

Comment 14 Miguel Angel Ajo 2018-01-18 09:18:22 UTC
Petr, we already have a positive answer from the kernel developers. Please keep an eye on https://bugzilla.redhat.com/show_bug.cgi?id=1531074#c1 and ask them about timelines :)

Comment 15 Petr Barta 2018-01-18 09:26:22 UTC
Hello Miguel,
  ok, thanks for the info, will monitor the kernel bz and will ask there.

BR,
Petr

Comment 16 Jakub Libosvar 2018-05-15 13:57:38 UTC
*** Bug 1558462 has been marked as a duplicate of this bug. ***

Comment 19 David Hill 2019-07-11 20:02:33 UTC
This happened again in OSP10.

Comment 22 Mauro Oddi 2019-12-06 15:40:29 UTC
Hi Assaf, 

Could you provide any timeline to the customer regarding this feature request?
The kernel now provides a way to limit connections per ct zone.
If there is any blueprint on this that can be shared, that would be appreciated too.


Thanks and Best Regards,
Mauro S. Oddi
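
For illustration, one way the per-zone limit mentioned above could be driven by hand is sketched below. It assumes the Open vSwitch kernel datapath and its dpctl/ct-set-limits command, run through ovs-appctl. The helper only builds the argument list and does not run anything; its name and the zone-to-limit mapping are hypothetical, and nothing here is wired into neutron.

```python
def ct_set_limits_argv(zone_limits, default_limit=None):
    """Build an ovs-appctl argument list that sets per-zone conntrack limits.

    zone_limits maps a conntrack zone id to its maximum number of entries.
    default_limit, if given, applies to zones without an explicit limit.
    """
    argv = ["ovs-appctl", "dpctl/ct-set-limits"]
    if default_limit is not None:
        argv.append(f"default={default_limit}")
    for zone, limit in sorted(zone_limits.items()):
        # Conntrack zone ids are 16-bit values.
        if not 0 <= zone <= 65535:
            raise ValueError(f"zone id out of range: {zone}")
        if limit < 0:
            raise ValueError(f"limit must be non-negative: {limit}")
        argv.append(f"zone={zone},limit={limit}")
    return argv
```

For example, ct_set_limits_argv({5: 10000}, default_limit=50000) yields an argument list that could be passed to subprocess.run on a compute node. Which zone belongs to which port or tenant would still have to come from the networking backend.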

Comment 25 Mike Burns 2020-08-26 21:25:10 UTC
This Release is retired.  If this bug is still relevant, please reopen and retarget to an open release.

Comment 26 David Hill 2020-08-27 12:32:05 UTC
It's an RFE.

Comment 41 Ihar Hrachyshka 2023-04-26 12:06:04 UTC
I believe OVN doesn't yet support zone limits set per port, so additional work is due in OVN before we can implement it in neutron. A counterpart bug should be created against the ovn component, in addition to this one, to track the RFE there.

Comment 42 Alex Stupnikov 2023-04-26 13:04:59 UTC
Thanks for your help and advice. I have reported bug #2189924 for OVN and would appreciate a second look from anyone involved. I think that this bug's metadata should be updated: there is a different blocker now and focus here should be switched to OVN.

