Bug 2123225 - [Octavia] Spam of "nf_conntrack: table full, dropping packet" messages during performance tests
Summary: [Octavia] Spam of "nf_conntrack: table full, dropping packet" messages during performance tests
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-octavia
Version: 16.1 (Train)
Hardware: All
OS: All
Priority: high
Severity: high
Target Milestone: z9
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Gregory Thiemonge
QA Contact: Omer Schwartz
URL:
Whiteboard:
Depends On: 2122016
Blocks: 2123226 2125612
 
Reported: 2022-09-01 07:40 UTC by Gregory Thiemonge
Modified: 2022-12-07 20:27 UTC
CC: 12 users

Fixed In Version: openstack-octavia-5.0.3-1.20220906163309.8c32d2e.el8ost
Doc Type: Bug Fix
Doc Text:
Before this update, Conntrack was enabled in the Amphora VM for any type of packet, but it is only required for the User Datagram Protocol (UDP) and Stream Control Transmission Protocol (SCTP). With this update, Conntrack is now disabled for Transmission Control Protocol (TCP) flows, preventing some performance issues when a user generates a lot of connections that fill the Conntrack table.
Clone Of: 2122016
Clones: 2123226
Environment:
Last Closed: 2022-12-07 20:27:09 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 773601 0 None MERGED Fix nf_conntrack_buckets sysctl in Amphora 2022-09-01 08:00:04 UTC
OpenStack gerrit 807147 0 None MERGED Update nr_open limit value in the amphora 2022-09-01 08:00:04 UTC
OpenStack gerrit 847544 0 None MERGED Set sensible nf_conntrack_max value in amphora 2022-09-01 08:00:04 UTC
OpenStack gerrit 854928 0 None MERGED Disable conntrack for TCP flows in the amphora 2022-09-01 08:00:04 UTC
Red Hat Issue Tracker OSP-18496 0 None None None 2022-09-01 07:57:16 UTC
Red Hat Product Errata RHBA-2022:8795 0 None None None 2022-12-07 20:27:42 UTC

Description Gregory Thiemonge 2022-09-01 07:40:53 UTC
+++ This bug was initially created as a clone of Bug #2122016 +++

Description of problem:

One of our customers is running performance tests for their web portal, which is built on top of a Shift on Stack environment. One of the problems we have found, which correlates perfectly with client errors, is a flood of "nf_conntrack: table full, dropping packet" messages on the amphora's console.

From the tests we can see that Octavia starts spamming these errors when "/proc/sys/net/netfilter/nf_conntrack_count" reaches around 32000.
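
A quick way to see how close the table is to its limit is to compare the current count against the configured maximum on the amphora (generic commands shown for illustration, not taken from the customer's test run):

  $ cat /proc/sys/net/netfilter/nf_conntrack_count
  $ cat /proc/sys/net/netfilter/nf_conntrack_max

Once the count reaches the max, the kernel drops new connections and logs the "table full" message.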

I have found two related bugs fixed in newer versions:

Bug/fix 1:
nf_conntrack: table full, dropping packet
https://bugzilla.redhat.com/show_bug.cgi?id=1869771      (fixed in RHOSP 16.2)
https://review.opendev.org/c/openstack/octavia/+/748749/ (fix)

Bug/fix 2:
https://storyboard.openstack.org/#!/story/2008979        (is not backported to RHOSP 16)
https://review.opendev.org/c/openstack/octavia/+/796608


It doesn't look like these fixes will be released for RHOSP 13, so I am wondering whether there is a supported way to apply a workaround for this problem and prevent a DoS situation for the Amphora?
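
A common stop-gap until a fixed amphora image is available (unsupported, shown only as an illustration with an arbitrary value) is to raise the limit by hand on the affected amphora; the change is not persisted and is lost whenever the amphora is rebuilt or fails over:

  $ sudo sysctl -w net.netfilter.nf_conntrack_max=262144

This only delays the point at which the table fills up; the fix tracked by this bug avoids filling the table in the first place by no longer tracking TCP flows.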

Version-Release number of selected component (if applicable):
Red Hat OpenStack Platform release 13.0.13 (Queens)

Comment 10 Omer Schwartz 2022-11-22 10:11:41 UTC
I ran the following verification steps on a SINGLE topology Octavia LB:

(overcloud) [stack@undercloud-0 ~]$ cat core_puddle_version 
RHOS-16.1-RHEL-8-20221116.n.1

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer create --vip-subnet external_subnet --name lb1
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2022-11-22T10:03:51                  |
| description         |                                      |
| flavor_id           | None                                 |
| id                  | 3318fe17-d9c7-4e82-9104-48a5279352b2 |
| listeners           |                                      |
| name                | lb1                                  |
| operating_status    | OFFLINE                              |
| pools               |                                      |
| project_id          | 5fc6b2b45c3a4dd6858ccda5db8ecc2a     |
| provider            | amphora                              |
| provisioning_status | PENDING_CREATE                       |
| updated_at          | None                                 |
| vip_address         | 10.0.0.205                           |
| vip_network_id      | 73f82788-e281-409f-85cc-15c309a7da02 |
| vip_port_id         | 5b9a2346-c375-4228-86c2-7b78776a3079 |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | c2256ac8-48e5-4788-adc6-57e5c7935691 |
+---------------------+--------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer listener create --protocol HTTP --protocol-port 80 --name listener1 lb1
+-----------------------------+--------------------------------------+
| Field                       | Value                                |
+-----------------------------+--------------------------------------+
| admin_state_up              | True                                 |
| connection_limit            | -1                                   |
| created_at                  | 2022-11-22T10:06:22                  |
| default_pool_id             | None                                 |
| default_tls_container_ref   | None                                 |
| description                 |                                      |
| id                          | e4849055-6963-4649-ac3b-1509575afc72 |
| insert_headers              | None                                 |
| l7policies                  |                                      |
| loadbalancers               | 3318fe17-d9c7-4e82-9104-48a5279352b2 |
| name                        | listener1                            |
| operating_status            | OFFLINE                              |
| project_id                  | 5fc6b2b45c3a4dd6858ccda5db8ecc2a     |
| protocol                    | HTTP                                 |
| protocol_port               | 80                                   |
| provisioning_status         | PENDING_CREATE                       |
| sni_container_refs          | []                                   |
| timeout_client_data         | 50000                                |
| timeout_member_connect      | 5000                                 |
| timeout_member_data         | 50000                                |
| timeout_tcp_inspect         | 0                                    |
| updated_at                  | None                                 |
| client_ca_tls_container_ref | None                                 |
| client_authentication       | NONE                                 |
| client_crl_container_ref    | None                                 |
| allowed_cidrs               | None                                 |
+-----------------------------+--------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer pool create --protocol HTTP --listener listener1 --lb-algorithm ROUND_ROBIN --name pool1
+----------------------+--------------------------------------+
| Field                | Value                                |
+----------------------+--------------------------------------+
| admin_state_up       | True                                 |
| created_at           | 2022-11-22T10:06:26                  |
| description          |                                      |
| healthmonitor_id     |                                      |
| id                   | 84ac5125-2148-49ca-af34-efc67e37a86e |
| lb_algorithm         | ROUND_ROBIN                          |
| listeners            | e4849055-6963-4649-ac3b-1509575afc72 |
| loadbalancers        | 3318fe17-d9c7-4e82-9104-48a5279352b2 |
| members              |                                      |
| name                 | pool1                                |
| operating_status     | OFFLINE                              |
| project_id           | 5fc6b2b45c3a4dd6858ccda5db8ecc2a     |
| protocol             | HTTP                                 |
| provisioning_status  | PENDING_CREATE                       |
| session_persistence  | None                                 |
| updated_at           | None                                 |
| tls_container_ref    | None                                 |
| ca_tls_container_ref | None                                 |
| crl_container_ref    | None                                 |
| tls_enabled          | False                                |
+----------------------+--------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ curl 10.0.0.217




While cURLing the LB, I SSHed into the amphora and made sure the conntrack table did not contain any entries:
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer amphora list --loadbalancer lb1
+--------------------------------------+--------------------------------------+-----------+------------+---------------+------------+
| id                                   | loadbalancer_id                      | status    | role       | lb_network_ip | ha_ip      |
+--------------------------------------+--------------------------------------+-----------+------------+---------------+------------+
| 41c4119f-4ffc-43b1-9e0d-7a7cb0ddcd0a | 3318fe17-d9c7-4e82-9104-48a5279352b2 | ALLOCATED | STANDALONE | 172.24.2.137  | 10.0.0.205 |
+--------------------------------------+--------------------------------------+-----------+------------+---------------+------------+
(overcloud) [stack@undercloud-0 ~]$ eval $(ssh-agent)
Agent pid 636273
(overcloud) [stack@undercloud-0 ~]$ ssh-add
Identity added: /home/stack/.ssh/id_rsa (/home/stack/.ssh/id_rsa)
Identity added: /home/stack/.ssh/id_ecdsa (stack.local)
(overcloud) [stack@undercloud-0 ~]$ ssh -A controller-0.ctlplane
Warning: Permanently added 'controller-0.ctlplane,192.168.24.47' (ECDSA) to the list of known hosts.
This system is not registered to Red Hat Insights. See https://cloud.redhat.com/
To register this system, run: insights-client --register

Last login: Tue Nov 22 09:59:00 2022 from 192.168.24.1
[heat-admin@controller-0 ~]$ ssh cloud-user@172.24.2.137
The authenticity of host '172.24.2.137 (172.24.2.137)' can't be established.
ECDSA key fingerprint is SHA256:UQJnBheqp4I9MLdaA+r0B6DLRwSpDSODjXxZfw6onO4.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
[cloud-user@amphora-41c4119f-4ffc-43b1-9e0d-7a7cb0ddcd0a ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-41c4119f-4ffc-43b1-9e0d-7a7cb0ddcd0a ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-41c4119f-4ffc-43b1-9e0d-7a7cb0ddcd0a ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-41c4119f-4ffc-43b1-9e0d-7a7cb0ddcd0a ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-41c4119f-4ffc-43b1-9e0d-7a7cb0ddcd0a ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-41c4119f-4ffc-43b1-9e0d-7a7cb0ddcd0a ~]$
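
An empty /proc/net/nf_conntrack while HTTP traffic is flowing is the expected result, since the fix stops tracking TCP flows. As an extra spot-check one could also list the no-track rules and watch the counter from inside the namespace; the commands below assume the rules are installed with iptables in the raw table, which may differ from the exact mechanism used in the amphora:

  $ sudo ip netns exec amphora-haproxy iptables -t raw -S
  $ sudo ip netns exec amphora-haproxy cat /proc/sys/net/netfilter/nf_conntrack_count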


Looks good to me. I am moving the BZ status to VERIFIED.

Comment 16 errata-xmlrpc 2022-12-07 20:27:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.9 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8795

