+++ This bug was initially created as a clone of Bug #2122016 +++

Description of problem:

One of our customers is running performance tests for their web portal, which is built on top of a Shift on Stack environment. One of the problems we found, which correlates perfectly with client errors, is a flood of "nf_conntrack: table full, dropping packet" messages on the amphora's console. From the tests we can see that Octavia starts emitting these errors when "/proc/sys/net/netfilter/nf_conntrack_count" reaches around 32000.

I have found two related bugs fixed in newer versions:

Bug/fix 1: nf_conntrack: table full, dropping packet
https://bugzilla.redhat.com/show_bug.cgi?id=1869771 (fixed in RHOSP 16.2)
https://review.opendev.org/c/openstack/octavia/+/748749/ (fix)

Bug/fix 2:
https://storyboard.openstack.org/#!/story/2008979 (not backported to RHOSP 16)
https://review.opendev.org/c/openstack/octavia/+/796608

It does not look like these fixes will be released for RHOSP 13, so I am wondering whether there is a supported workaround for this problem that would prevent a DoS situation on the amphora.

Version-Release number of selected component (if applicable):
Red Hat OpenStack Platform release 13.0.13 (Queens)
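For context on the observed ~32000 ceiling: it matches the common kernel default of net.netfilter.nf_conntrack_max = 32768. A possible interim mitigation, sketched here only as an illustration of the kind of workaround being asked about (the file path and value are assumptions, this is not a Red Hat-supported procedure, and changes made inside an amphora do not survive failover or image rebuild), is to raise that limit:

```ini
# Hypothetical sysctl drop-in inside the amphora, e.g.
# /etc/sysctl.d/90-conntrack.conf; the value 262144 is illustrative.
# Note that conntrack entries consume kernel memory, so the limit
# should not be raised blindly on small amphora flavors.
net.netfilter.nf_conntrack_max = 262144
```

This only delays the "table full" condition under sustained load; the upstream fixes linked above address the root cause instead.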
I ran the following verification steps on a SINGLE topology Octavia LB:

(overcloud) [stack@undercloud-0 ~]$ cat core_puddle_version
RHOS-16.1-RHEL-8-20221116.n.1

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer create --vip-subnet external_subnet --name lb1
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2022-11-22T10:03:51                  |
| description         |                                      |
| flavor_id           | None                                 |
| id                  | 3318fe17-d9c7-4e82-9104-48a5279352b2 |
| listeners           |                                      |
| name                | lb1                                  |
| operating_status    | OFFLINE                              |
| pools               |                                      |
| project_id          | 5fc6b2b45c3a4dd6858ccda5db8ecc2a     |
| provider            | amphora                              |
| provisioning_status | PENDING_CREATE                       |
| updated_at          | None                                 |
| vip_address         | 10.0.0.205                           |
| vip_network_id      | 73f82788-e281-409f-85cc-15c309a7da02 |
| vip_port_id         | 5b9a2346-c375-4228-86c2-7b78776a3079 |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | c2256ac8-48e5-4788-adc6-57e5c7935691 |
+---------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer listener create --protocol HTTP --protocol-port 80 --name listener1 lb1
+-----------------------------+--------------------------------------+
| Field                       | Value                                |
+-----------------------------+--------------------------------------+
| admin_state_up              | True                                 |
| connection_limit            | -1                                   |
| created_at                  | 2022-11-22T10:06:22                  |
| default_pool_id             | None                                 |
| default_tls_container_ref   | None                                 |
| description                 |                                      |
| id                          | e4849055-6963-4649-ac3b-1509575afc72 |
| insert_headers              | None                                 |
| l7policies                  |                                      |
| loadbalancers               | 3318fe17-d9c7-4e82-9104-48a5279352b2 |
| name                        | listener1                            |
| operating_status            | OFFLINE                              |
| project_id                  | 5fc6b2b45c3a4dd6858ccda5db8ecc2a     |
| protocol                    | HTTP                                 |
| protocol_port               | 80                                   |
| provisioning_status         | PENDING_CREATE                       |
| sni_container_refs          | []                                   |
| timeout_client_data         | 50000                                |
| timeout_member_connect      | 5000                                 |
| timeout_member_data         | 50000                                |
| timeout_tcp_inspect         | 0                                    |
| updated_at                  | None                                 |
| client_ca_tls_container_ref | None                                 |
| client_authentication       | NONE                                 |
| client_crl_container_ref    | None                                 |
| allowed_cidrs               | None                                 |
+-----------------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer pool create --protocol HTTP --listener listener1 --lb-algorithm ROUND_ROBIN --name pool1
+----------------------+--------------------------------------+
| Field                | Value                                |
+----------------------+--------------------------------------+
| admin_state_up       | True                                 |
| created_at           | 2022-11-22T10:06:26                  |
| description          |                                      |
| healthmonitor_id     |                                      |
| id                   | 84ac5125-2148-49ca-af34-efc67e37a86e |
| lb_algorithm         | ROUND_ROBIN                          |
| listeners            | e4849055-6963-4649-ac3b-1509575afc72 |
| loadbalancers        | 3318fe17-d9c7-4e82-9104-48a5279352b2 |
| members              |                                      |
| name                 | pool1                                |
| operating_status     | OFFLINE                              |
| project_id           | 5fc6b2b45c3a4dd6858ccda5db8ecc2a     |
| protocol             | HTTP                                 |
| provisioning_status  | PENDING_CREATE                       |
| session_persistence  | None                                 |
| updated_at           | None                                 |
| tls_container_ref    | None                                 |
| ca_tls_container_ref | None                                 |
| crl_container_ref    | None                                 |
| tls_enabled          | False                                |
+----------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ curl 10.0.0.217

While cURLing the LB, I SSHed into the amphora and confirmed that the conntrack table did not contain any entries:

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer amphora list --loadbalancer lb1
+--------------------------------------+--------------------------------------+-----------+------------+---------------+------------+
| id                                   | loadbalancer_id                      | status    | role       | lb_network_ip | ha_ip      |
+--------------------------------------+--------------------------------------+-----------+------------+---------------+------------+
| 41c4119f-4ffc-43b1-9e0d-7a7cb0ddcd0a | 3318fe17-d9c7-4e82-9104-48a5279352b2 | ALLOCATED | STANDALONE | 172.24.2.137  | 10.0.0.205 |
+--------------------------------------+--------------------------------------+-----------+------------+---------------+------------+

(overcloud) [stack@undercloud-0 ~]$ eval $(ssh-agent)
Agent pid 636273
(overcloud) [stack@undercloud-0 ~]$ ssh-add
Identity added: /home/stack/.ssh/id_rsa (/home/stack/.ssh/id_rsa)
Identity added: /home/stack/.ssh/id_ecdsa (stack.local)
(overcloud) [stack@undercloud-0 ~]$ ssh -A controller-0.ctlplane
Warning: Permanently added 'controller-0.ctlplane,192.168.24.47' (ECDSA) to the list of known hosts.
This system is not registered to Red Hat Insights. See https://cloud.redhat.com/
To register this system, run: insights-client --register
Last login: Tue Nov 22 09:59:00 2022 from 192.168.24.1
[heat-admin@controller-0 ~]$ ssh cloud-user@172.24.2.137
The authenticity of host '172.24.2.137 (172.24.2.137)' can't be established.
ECDSA key fingerprint is SHA256:UQJnBheqp4I9MLdaA+r0B6DLRwSpDSODjXxZfw6onO4.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
[cloud-user@amphora-41c4119f-4ffc-43b1-9e0d-7a7cb0ddcd0a ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-41c4119f-4ffc-43b1-9e0d-7a7cb0ddcd0a ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-41c4119f-4ffc-43b1-9e0d-7a7cb0ddcd0a ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-41c4119f-4ffc-43b1-9e0d-7a7cb0ddcd0a ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-41c4119f-4ffc-43b1-9e0d-7a7cb0ddcd0a ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-41c4119f-4ffc-43b1-9e0d-7a7cb0ddcd0a ~]$

Looks good to me. I am moving the BZ status to VERIFIED.
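The empty /proc/net/nf_conntrack output above confirms that the fixed amphora no longer tracks these connections. As a sketch of how the original failure mode can be spotted before packets are dropped, the two kernel conntrack counters can be compared; the helper names and the 0.9 threshold here are mine, not part of Octavia:

```python
# Sketch of a conntrack-pressure check mirroring the verification above.
# The /proc paths are the standard kernel conntrack counters; the helper
# names and the 0.9 threshold are illustrative, not part of Octavia.
from pathlib import Path

CONNTRACK_COUNT = Path("/proc/sys/net/netfilter/nf_conntrack_count")
CONNTRACK_MAX = Path("/proc/sys/net/netfilter/nf_conntrack_max")


def conntrack_utilization(count: int, limit: int) -> float:
    """Fraction of the conntrack table in use (0.0 to 1.0)."""
    if limit <= 0:
        raise ValueError("nf_conntrack_max must be positive")
    return count / limit


def table_nearly_full(count: int, limit: int, threshold: float = 0.9) -> bool:
    """True when new flows risk 'nf_conntrack: table full, dropping packet'."""
    return conntrack_utilization(count, limit) >= threshold


if __name__ == "__main__":
    # On a real amphora this would run inside the amphora-haproxy
    # namespace, e.g. "ip netns exec amphora-haproxy python3 check.py".
    count = int(CONNTRACK_COUNT.read_text())
    limit = int(CONNTRACK_MAX.read_text())
    print(f"conntrack: {count}/{limit} ({conntrack_utilization(count, limit):.1%})")
```

With the numbers from the original report, a count of 32000 against the default limit of 32768 is about 98% utilization, i.e. nearly full; the empty table seen above gives 0%.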
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.9 bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8795