+++ This bug was initially created as a clone of Bug #2123225 +++
+++ This bug was initially created as a clone of Bug #2122016 +++

Description of problem:

One of our customers is running performance tests for their Web portal, built on top of a Shift on Stack environment. One of the problems we have found, which correlates perfectly with the client errors, is a spam of "nf_conntrack: table full, dropping packet" messages on the amphora's console. From the tests we can see that Octavia starts spamming these errors when "/proc/sys/net/netfilter/nf_conntrack_count" reaches around 32000.

I have found two related bugs fixed in newer versions:

Bug/fix 1: nf_conntrack: table full, dropping packet
https://bugzilla.redhat.com/show_bug.cgi?id=1869771 (fixed in RHOSP 16.2)
https://review.opendev.org/c/openstack/octavia/+/748749/ (fix)

Bug/fix 2:
https://storyboard.openstack.org/#!/story/2008979 (not backported to RHOSP 16)
https://review.opendev.org/c/openstack/octavia/+/796608

It does not look like these fixes will be released for RHOSP 13, so I am wondering whether there is a supported way to apply a workaround for this problem and prevent a DoS situation for the amphora.

Version-Release number of selected component (if applicable):
Red Hat OpenStack Platform release 13.0.13 (Queens)
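Until those fixes land, the only generic knob here is the standard kernel conntrack sysctls inside the amphora. A minimal sketch follows; this is an unsupported illustration only: the value shown is an arbitrary example, on some kernels net.netfilter.nf_conntrack_max is only writable from the init namespace, and any manual change is lost when the amphora is failed over or rebuilt:

# Illustration only -- not a supported workaround. Run inside the amphora.
# Compare current usage against the limit in the amphora-haproxy namespace:
$ sudo ip netns exec amphora-haproxy sysctl net.netfilter.nf_conntrack_count
$ sudo ip netns exec amphora-haproxy sysctl net.netfilter.nf_conntrack_max
# Raise the limit; 262144 is an example value -- each tracked entry costs
# roughly 300 bytes of kernel memory, so size it to the amphora's RAM:
$ sudo sysctl -w net.netfilter.nf_conntrack_max=262144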
I ran the following verification steps on a SINGLE topology Octavia LB:

(overcloud) [stack@undercloud-0 ~]$ cat core_puddle_version
RHOS-16.2-RHEL-8-20221104.n.0

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer create --vip-subnet external_subnet --name lb1
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2022-11-22T10:23:43                  |
| description         |                                      |
| flavor_id           | None                                 |
| id                  | c19b2b7f-8479-4111-ad17-8d7a967f96ec |
| listeners           |                                      |
| name                | lb1                                  |
| operating_status    | OFFLINE                              |
| pools               |                                      |
| project_id          | 55791abb3f5a43a2ad29f7ea68eca414     |
| provider            | amphora                              |
| provisioning_status | PENDING_CREATE                       |
| updated_at          | None                                 |
| vip_address         | 10.0.0.202                           |
| vip_network_id      | 74a35f12-fd6d-4daf-9582-9afb72ff3618 |
| vip_port_id         | 9c237af6-c825-44fb-ac1f-a17ca2d213ae |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | 09e042c3-9c4b-492e-a47b-34ac6dd96a82 |
+---------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer listener create --protocol HTTP --protocol-port 80 --name listener1 lb1
+-----------------------------+--------------------------------------+
| Field                       | Value                                |
+-----------------------------+--------------------------------------+
| admin_state_up              | True                                 |
| connection_limit            | -1                                   |
| created_at                  | 2022-11-22T10:26:43                  |
| default_pool_id             | None                                 |
| default_tls_container_ref   | None                                 |
| description                 |                                      |
| id                          | 70833ca7-91c7-42b7-a4d4-2a4a4643c94d |
| insert_headers              | None                                 |
| l7policies                  |                                      |
| loadbalancers               | c19b2b7f-8479-4111-ad17-8d7a967f96ec |
| name                        | listener1                            |
| operating_status            | OFFLINE                              |
| project_id                  | 55791abb3f5a43a2ad29f7ea68eca414     |
| protocol                    | HTTP                                 |
| protocol_port               | 80                                   |
| provisioning_status         | PENDING_CREATE                       |
| sni_container_refs          | []                                   |
| timeout_client_data         | 50000                                |
| timeout_member_connect      | 5000                                 |
| timeout_member_data         | 50000                                |
| timeout_tcp_inspect         | 0                                    |
| updated_at                  | None                                 |
| client_ca_tls_container_ref | None                                 |
| client_authentication       | NONE                                 |
| client_crl_container_ref    | None                                 |
| allowed_cidrs               | None                                 |
+-----------------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer pool create --protocol HTTP --listener listener1 --lb-algorithm ROUND_ROBIN --name pool1
+----------------------+--------------------------------------+
| Field                | Value                                |
+----------------------+--------------------------------------+
| admin_state_up       | True                                 |
| created_at           | 2022-11-22T10:26:48                  |
| description          |                                      |
| healthmonitor_id     |                                      |
| id                   | 46e7db8f-1733-4035-abe9-a95d83edda25 |
| lb_algorithm         | ROUND_ROBIN                          |
| listeners            | 70833ca7-91c7-42b7-a4d4-2a4a4643c94d |
| loadbalancers        | c19b2b7f-8479-4111-ad17-8d7a967f96ec |
| members              |                                      |
| name                 | pool1                                |
| operating_status     | OFFLINE                              |
| project_id           | 55791abb3f5a43a2ad29f7ea68eca414     |
| protocol             | HTTP                                 |
| provisioning_status  | PENDING_CREATE                       |
| session_persistence  | None                                 |
| updated_at           | None                                 |
| tls_container_ref    | None                                 |
| ca_tls_container_ref | None                                 |
| crl_container_ref    | None                                 |
| tls_enabled          | False                                |
+----------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ curl 10.0.0.202
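A single cURL only creates one flow; to actually load the conntrack table, a sustained loop against the VIP (10.0.0.202, from the create output above) could be run as well. A rough illustration -- any HTTP load generator would do:

# Illustrative load loop against the VIP:
$ for i in $(seq 1 1000); do curl -s -o /dev/null http://10.0.0.202/; done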
At the same time as cURLing the LB, I SSHed into the amphora and made sure the conntrack table did not contain any entries:

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer amphora list --loadbalancer lb1
+--------------------------------------+--------------------------------------+-----------+------------+---------------+------------+
| id                                   | loadbalancer_id                      | status    | role       | lb_network_ip | ha_ip      |
+--------------------------------------+--------------------------------------+-----------+------------+---------------+------------+
| 40b2a023-67d5-4a50-be46-46775e641bd8 | c19b2b7f-8479-4111-ad17-8d7a967f96ec | ALLOCATED | STANDALONE | 172.24.0.173  | 10.0.0.202 |
+--------------------------------------+--------------------------------------+-----------+------------+---------------+------------+

(overcloud) [stack@undercloud-0 ~]$ eval $(ssh-agent)
Agent pid 264617
(overcloud) [stack@undercloud-0 ~]$ ssh-add
Identity added: /home/stack/.ssh/id_rsa (/home/stack/.ssh/id_rsa)
Identity added: /home/stack/.ssh/id_ecdsa (stack.local)
(overcloud) [stack@undercloud-0 ~]$ ssh -A controller-0.ctlplane
Warning: Permanently added 'controller-0.ctlplane,192.168.24.29' (ECDSA) to the list of known hosts.
Last login: Tue Nov 22 10:20:03 2022 from 192.168.24.1
[heat-admin@controller-0 ~]$ ssh cloud-user@172.24.0.173
The authenticity of host '172.24.0.173 (172.24.0.173)' can't be established.
ECDSA key fingerprint is SHA256:wn+ML5k2TQCCVUzfg2M6AQGzc2jqDDi+wh0nu9D90Ho.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '172.24.0.173' (ECDSA) to the list of known hosts.
[cloud-user@amphora-40b2a023-67d5-4a50-be46-46775e641bd8 ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-40b2a023-67d5-4a50-be46-46775e641bd8 ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-40b2a023-67d5-4a50-be46-46775e641bd8 ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-40b2a023-67d5-4a50-be46-46775e641bd8 ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-40b2a023-67d5-4a50-be46-46775e641bd8 ~]$

Each cat returned no output, i.e. no tracked entries. Looks good to me. I am moving the BZ status to VERIFIED.
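As a side note for anyone reproducing this: rather than dumping /proc/net/nf_conntrack repeatedly, the counter the original report references can be watched live. A sketch, assuming watch(1) is present in the amphora image:

# Watch tracked-entry count against the limit in the amphora-haproxy namespace:
$ sudo ip netns exec amphora-haproxy watch -n1 \
    'cat /proc/sys/net/netfilter/nf_conntrack_count /proc/sys/net/netfilter/nf_conntrack_max'

With the fix in place the count should stay at zero here, matching the empty table dumps above.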
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.4), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8794
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.