Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 2123226

Summary: [Octavia] Spam of "nf_conntrack: table full, dropping packet" messages during performance tests
Product: Red Hat OpenStack Reporter: Gregory Thiemonge <gthiemon>
Component: openstack-octavia Assignee: Gregory Thiemonge <gthiemon>
Status: CLOSED ERRATA QA Contact: Omer Schwartz <oschwart>
Severity: high Docs Contact:
Priority: high    
Version: 16.2 (Train) CC: astupnik, bbonguar, cmuresan, fpiccion, gkadam, gregraka, gthiemon, jraju, jvisser, lpeer, majopela, nilushko, njohnston, oschwart, rcernin, scohen, tvainio
Target Milestone: z4 Keywords: Triaged
Target Release: 16.2 (Train on RHEL 8.4)   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: openstack-octavia-5.1.3-2.20220906154809.58e2e13.el8ost Doc Type: Bug Fix
Doc Text:
Before this update, VM instances (amphorae) for the Red Hat OpenStack Platform (RHOSP) Load-balancing service (octavia) could experience performance issues when a large number of connections filled the network connection tracking (conntrack) table. The cause was that conntrack was enabled for all packet types, including TCP, which does not require it. In RHOSP 16.2.4, amphora performance has improved because conntrack is disabled for TCP packets and is enabled only for UDP and SCTP packets.
Story Points: ---
Clone Of: 2123225
: 2125612 (view as bug list) Environment:
Last Closed: 2022-12-07 19:24:09 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2122016, 2123225    
Bug Blocks: 2125612    

Description Gregory Thiemonge 2022-09-01 07:42:29 UTC
+++ This bug was initially created as a clone of Bug #2123225 +++

+++ This bug was initially created as a clone of Bug #2122016 +++

Description of problem:

One of our customers is running performance tests for their web portal, built on top of a Shift on Stack environment. One of the problems we have found, which correlates perfectly with client errors, is a flood of "nf_conntrack: table full, dropping packet" messages in the amphora's console.

From the tests we can see that Octavia starts emitting these errors when "/proc/sys/net/netfilter/nf_conntrack_count" reaches around 32000.
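The counter above can be compared against the table limit to see how close the amphora is to dropping packets. The following is an illustrative sketch (the helper name and the 32768 limit are assumptions; nf_conntrack_max is sized from the amphora's memory, so the real value may differ, though the ~32000 figure reported here suggests a limit near 32768):

```shell
#!/bin/sh
# Compute conntrack table usage as an integer percentage.
# Arguments: current entry count, table limit.
conntrack_usage_pct() {
  count=$1
  max=$2
  echo $(( count * 100 / max ))
}

# On the amphora the inputs would come from the real counters, e.g.:
#   sudo ip netns exec amphora-haproxy sysctl -n net.netfilter.nf_conntrack_count
#   sudo ip netns exec amphora-haproxy sysctl -n net.netfilter.nf_conntrack_max
conntrack_usage_pct 32000 32768
```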

I have found two related bugs fixed in newer versions:

Bug/fix 1:
nf_conntrack: table full, dropping packet
https://bugzilla.redhat.com/show_bug.cgi?id=1869771      (fixed in RHOSP 16.2)
https://review.opendev.org/c/openstack/octavia/+/748749/ (fix)

Bug/fix 2:
https://storyboard.openstack.org/#!/story/2008979        (is not backported to RHOSP 16)
https://review.opendev.org/c/openstack/octavia/+/796608
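The second fix disables conntrack for TCP flows in the amphora. Illustratively (this is not the literal patch, which lives in the amphora agent), the effect corresponds to raw-table NOTRACK rules that bypass connection tracking for TCP while keeping it for UDP and SCTP:

```shell
# Illustration only: mark TCP packets NOTRACK in the raw table so they never
# consume conntrack entries; UDP and SCTP remain tracked.
iptables -t raw -A PREROUTING -p tcp -j NOTRACK
iptables -t raw -A OUTPUT     -p tcp -j NOTRACK
```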


It doesn't look like these fixes will be released for RHOSP 13, so I am wondering whether there is a supported workaround for this problem that would prevent a DoS situation for the amphora.
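Absent a backport, a common stop-gap (an assumption on my part, not something confirmed in this BZ) is to enlarge the conntrack table and shorten entry lifetimes inside the amphora's network namespace, so the table fills later and drains faster:

```shell
# Run as root inside the amphora; values are examples, tune to available memory.
ip netns exec amphora-haproxy sysctl -w net.netfilter.nf_conntrack_max=262144
ip netns exec amphora-haproxy sysctl -w net.netfilter.nf_conntrack_tcp_timeout_established=3600
```

Note this only delays table exhaustion; the real fix is to stop tracking TCP flows entirely, as the upstream patches do.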

Version-Release number of selected component (if applicable):
Red Hat OpenStack Platform release 13.0.13 (Queens)

Comment 13 Omer Schwartz 2022-11-22 10:32:09 UTC
I ran the following verification steps on a SINGLE topology Octavia LB:

(overcloud) [stack@undercloud-0 ~]$ cat core_puddle_version
RHOS-16.2-RHEL-8-20221104.n.0


(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer create --vip-subnet external_subnet --name lb1
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2022-11-22T10:23:43                  |
| description         |                                      |
| flavor_id           | None                                 |
| id                  | c19b2b7f-8479-4111-ad17-8d7a967f96ec |
| listeners           |                                      |
| name                | lb1                                  |
| operating_status    | OFFLINE                              |
| pools               |                                      |
| project_id          | 55791abb3f5a43a2ad29f7ea68eca414     |
| provider            | amphora                              |
| provisioning_status | PENDING_CREATE                       |
| updated_at          | None                                 |
| vip_address         | 10.0.0.202                           |
| vip_network_id      | 74a35f12-fd6d-4daf-9582-9afb72ff3618 |
| vip_port_id         | 9c237af6-c825-44fb-ac1f-a17ca2d213ae |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | 09e042c3-9c4b-492e-a47b-34ac6dd96a82 |
+---------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer listener create --protocol HTTP --protocol-port 80 --name listener1 lb1
+-----------------------------+--------------------------------------+
| Field                       | Value                                |
+-----------------------------+--------------------------------------+
| admin_state_up              | True                                 |
| connection_limit            | -1                                   |
| created_at                  | 2022-11-22T10:26:43                  |
| default_pool_id             | None                                 |
| default_tls_container_ref   | None                                 |
| description                 |                                      |
| id                          | 70833ca7-91c7-42b7-a4d4-2a4a4643c94d |
| insert_headers              | None                                 |
| l7policies                  |                                      |
| loadbalancers               | c19b2b7f-8479-4111-ad17-8d7a967f96ec |
| name                        | listener1                            |
| operating_status            | OFFLINE                              |
| project_id                  | 55791abb3f5a43a2ad29f7ea68eca414     |
| protocol                    | HTTP                                 |
| protocol_port               | 80                                   |
| provisioning_status         | PENDING_CREATE                       |
| sni_container_refs          | []                                   |
| timeout_client_data         | 50000                                |
| timeout_member_connect      | 5000                                 |
| timeout_member_data         | 50000                                |
| timeout_tcp_inspect         | 0                                    |
| updated_at                  | None                                 |
| client_ca_tls_container_ref | None                                 |
| client_authentication       | NONE                                 |
| client_crl_container_ref    | None                                 |
| allowed_cidrs               | None                                 |
+-----------------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer pool create --protocol HTTP --listener listener1 --lb-algorithm ROUND_ROBIN --name pool1
+----------------------+--------------------------------------+
| Field                | Value                                |
+----------------------+--------------------------------------+
| admin_state_up       | True                                 |
| created_at           | 2022-11-22T10:26:48                  |
| description          |                                      |
| healthmonitor_id     |                                      |
| id                   | 46e7db8f-1733-4035-abe9-a95d83edda25 |
| lb_algorithm         | ROUND_ROBIN                          |
| listeners            | 70833ca7-91c7-42b7-a4d4-2a4a4643c94d |
| loadbalancers        | c19b2b7f-8479-4111-ad17-8d7a967f96ec |
| members              |                                      |
| name                 | pool1                                |
| operating_status     | OFFLINE                              |
| project_id           | 55791abb3f5a43a2ad29f7ea68eca414     |
| protocol             | HTTP                                 |
| provisioning_status  | PENDING_CREATE                       |
| session_persistence  | None                                 |
| updated_at           | None                                 |
| tls_container_ref    | None                                 |
| ca_tls_container_ref | None                                 |
| crl_container_ref    | None                                 |
| tls_enabled          | False                                |
+----------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ curl 10.0.0.202




While cURLing the LB, I SSHed into the amphora and confirmed that the conntrack table did not contain any entries:
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer amphora list --loadbalancer lb1
+--------------------------------------+--------------------------------------+-----------+------------+---------------+------------+
| id                                   | loadbalancer_id                      | status    | role       | lb_network_ip | ha_ip      |
+--------------------------------------+--------------------------------------+-----------+------------+---------------+------------+
| 40b2a023-67d5-4a50-be46-46775e641bd8 | c19b2b7f-8479-4111-ad17-8d7a967f96ec | ALLOCATED | STANDALONE | 172.24.0.173  | 10.0.0.202 |
+--------------------------------------+--------------------------------------+-----------+------------+---------------+------------+
(overcloud) [stack@undercloud-0 ~]$ eval $(ssh-agent)
Agent pid 264617
(overcloud) [stack@undercloud-0 ~]$ ssh-add
Identity added: /home/stack/.ssh/id_rsa (/home/stack/.ssh/id_rsa)
Identity added: /home/stack/.ssh/id_ecdsa (stack.local)
(overcloud) [stack@undercloud-0 ~]$ ssh -A controller-0.ctlplane
Warning: Permanently added 'controller-0.ctlplane,192.168.24.29' (ECDSA) to the list of known hosts.
Last login: Tue Nov 22 10:20:03 2022 from 192.168.24.1
[heat-admin@controller-0 ~]$ ssh cloud-user@172.24.0.173
The authenticity of host '172.24.0.173 (172.24.0.173)' can't be established.
ECDSA key fingerprint is SHA256:wn+ML5k2TQCCVUzfg2M6AQGzc2jqDDi+wh0nu9D90Ho.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '172.24.0.173' (ECDSA) to the list of known hosts.
[cloud-user@amphora-40b2a023-67d5-4a50-be46-46775e641bd8 ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-40b2a023-67d5-4a50-be46-46775e641bd8 ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-40b2a023-67d5-4a50-be46-46775e641bd8 ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-40b2a023-67d5-4a50-be46-46775e641bd8 ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-40b2a023-67d5-4a50-be46-46775e641bd8 ~]$



Looks good to me. I am moving the BZ status to VERIFIED.

Comment 23 errata-xmlrpc 2022-12-07 19:24:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.4), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8794

Comment 27 Red Hat Bugzilla 2023-09-19 04:25:38 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days