+++ This bug was initially created as a clone of Bug #2123226 +++
+++ This bug was initially created as a clone of Bug #2123225 +++
+++ This bug was initially created as a clone of Bug #2122016 +++

Description of problem:

One of our customers is running performance tests for their web portal, which is built on top of a Shift on Stack environment. One problem we found correlates perfectly with client errors: a flood of "nf_conntrack: table full, dropping packet" messages in the amphora's console. The tests show that Octavia starts logging these errors when "/proc/sys/net/netfilter/nf_conntrack_count" reaches around 32000.

I have found two related bugs fixed in newer versions:

Bug/fix 1: nf_conntrack: table full, dropping packet
https://bugzilla.redhat.com/show_bug.cgi?id=1869771 (fixed in RHOSP 16.2)
https://review.opendev.org/c/openstack/octavia/+/748749/ (fix)

Bug/fix 2:
https://storyboard.openstack.org/#!/story/2008979 (not backported to RHOSP 16)
https://review.opendev.org/c/openstack/octavia/+/796608

It doesn't look like these fixes will be released for RHOSP 13, so I am wondering whether there is a supported workaround for this problem that would prevent a DoS situation for the amphora.

Version-Release number of selected component (if applicable):
Red Hat OpenStack Platform release 13.0.13 (Queens)
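The symptom in the description (drops starting once nf_conntrack_count climbs toward the table limit) can be watched with a small script. The following is a minimal sketch, not part of the original report: the function names and the 0.9 warning threshold are illustrative assumptions, and only the two procfs paths are standard kernel interfaces.

```python
from pathlib import Path

# Standard procfs locations exposed by the nf_conntrack kernel module.
COUNT_PATH = "/proc/sys/net/netfilter/nf_conntrack_count"
MAX_PATH = "/proc/sys/net/netfilter/nf_conntrack_max"


def read_int(path):
    """Read a single integer from a procfs-style file."""
    return int(Path(path).read_text().strip())


def conntrack_usage(count_path=COUNT_PATH, max_path=MAX_PATH):
    """Return (count, limit, fraction_used) for the conntrack table."""
    count = read_int(count_path)
    limit = read_int(max_path)
    return count, limit, count / limit


def is_near_full(count, limit, threshold=0.9):
    """True when the table is at or above `threshold` of its limit.

    The 0.9 default is an illustrative assumption, not a value from the report.
    """
    return count >= threshold * limit


def report():
    """Print the current usage; call this on the host being inspected."""
    count, limit, used = conntrack_usage()
    print(f"conntrack entries: {count}/{limit} ({used:.1%})")
    if is_near_full(count, limit):
        print("warning: conntrack table is close to full")
```

On an amphora, report() would probably need to run inside the amphora-haproxy network namespace (for example through "ip netns exec"), since the connection count is tracked per namespace.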
I ran the following verification steps on a SINGLE topology Octavia LB:

(overcloud) [stack@undercloud-0 ~]$ cat core_puddle_version
RHOS-17.1-RHEL-9-20230131.n.2

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer create --vip-subnet external_subnet --name lb1 --wait
/usr/lib/python3.9/site-packages/osc_lib/utils/__init__.py:448: DeprecationWarning: The usage of formatter functions is now discouraged. Consider using cliff.columns.FormattableColumn instead. See reviews linked with bug 1687955 for more detail.
  warnings.warn(
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| availability_zone   | None                                 |
| created_at          | 2023-02-21T09:56:42                  |
| description         |                                      |
| flavor_id           | None                                 |
| id                  | 6c234d54-008d-4966-b2c4-f1bfd8a3d605 |
| listeners           |                                      |
| name                | lb1                                  |
| operating_status    | ONLINE                               |
| pools               |                                      |
| project_id          | 946cd27e13f14b7395cac4de6dc82abe     |
| provider            | amphora                              |
| provisioning_status | ACTIVE                               |
| updated_at          | 2023-02-21T09:57:51                  |
| vip_address         | 10.0.0.159                           |
| vip_network_id      | c0c8a991-388f-447c-9a9a-59d3d0a9290a |
| vip_port_id         | 4279ca68-48f5-4117-bea9-3b59458576a7 |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | c8e98308-413b-4a36-898d-7588327f02af |
| tags                |                                      |
+---------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer listener create --protocol HTTP --protocol-port 80 --name listener1 lb1
/usr/lib/python3.9/site-packages/osc_lib/utils/__init__.py:448: DeprecationWarning: The usage of formatter functions is now discouraged. Consider using cliff.columns.FormattableColumn instead. See reviews linked with bug 1687955 for more detail.
  warnings.warn(
+-----------------------------+--------------------------------------+
| Field                       | Value                                |
+-----------------------------+--------------------------------------+
| admin_state_up              | True                                 |
| connection_limit            | -1                                   |
| created_at                  | 2023-02-21T09:59:17                  |
| default_pool_id             | None                                 |
| default_tls_container_ref   | None                                 |
| description                 |                                      |
| id                          | eefd33d3-14a3-4477-b09b-0f15f82dc76b |
| insert_headers              | None                                 |
| l7policies                  |                                      |
| loadbalancers               | 6c234d54-008d-4966-b2c4-f1bfd8a3d605 |
| name                        | listener1                            |
| operating_status            | OFFLINE                              |
| project_id                  | 946cd27e13f14b7395cac4de6dc82abe     |
| protocol                    | HTTP                                 |
| protocol_port               | 80                                   |
| provisioning_status         | PENDING_CREATE                       |
| sni_container_refs          | []                                   |
| timeout_client_data         | 50000                                |
| timeout_member_connect      | 5000                                 |
| timeout_member_data         | 50000                                |
| timeout_tcp_inspect         | 0                                    |
| updated_at                  | None                                 |
| client_ca_tls_container_ref | None                                 |
| client_authentication       | NONE                                 |
| client_crl_container_ref    | None                                 |
| allowed_cidrs               | None                                 |
| tls_ciphers                 | None                                 |
| tls_versions                | None                                 |
| alpn_protocols              | None                                 |
| tags                        |                                      |
+-----------------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer pool create --protocol HTTP --listener listener1 --lb-algorithm ROUND_ROBIN --name pool1
/usr/lib/python3.9/site-packages/osc_lib/utils/__init__.py:448: DeprecationWarning: The usage of formatter functions is now discouraged. Consider using cliff.columns.FormattableColumn instead. See reviews linked with bug 1687955 for more detail.
  warnings.warn(
+----------------------+--------------------------------------+
| Field                | Value                                |
+----------------------+--------------------------------------+
| admin_state_up       | True                                 |
| created_at           | 2023-02-21T09:59:22                  |
| description          |                                      |
| healthmonitor_id     |                                      |
| id                   | 65276094-29b4-4832-b53c-307296d0f8e3 |
| lb_algorithm         | ROUND_ROBIN                          |
| listeners            | eefd33d3-14a3-4477-b09b-0f15f82dc76b |
| loadbalancers        | 6c234d54-008d-4966-b2c4-f1bfd8a3d605 |
| members              |                                      |
| name                 | pool1                                |
| operating_status     | OFFLINE                              |
| project_id           | 946cd27e13f14b7395cac4de6dc82abe     |
| protocol             | HTTP                                 |
| provisioning_status  | PENDING_CREATE                       |
| session_persistence  | None                                 |
| updated_at           | None                                 |
| tls_container_ref    | None                                 |
| ca_tls_container_ref | None                                 |
| crl_container_ref    | None                                 |
| tls_enabled          | False                                |
| tls_ciphers          | None                                 |
| tls_versions         | None                                 |
| tags                 |                                      |
| alpn_protocols       | None                                 |
+----------------------+--------------------------------------+

(overcloud) [stack@undercloud-0 ~]$ for i in {1..500}; do curl 10.0.0.159; done

While curling the LB, I SSHed into the amphora and verified that the conntrack table did not contain any entries:

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer amphora list --loadbalancer lb1
+--------------------------------------+--------------------------------------+-----------+------------+---------------+------------+
| id                                   | loadbalancer_id                      | status    | role       | lb_network_ip | ha_ip      |
+--------------------------------------+--------------------------------------+-----------+------------+---------------+------------+
| 03deb08b-6062-42fd-b623-43fdbfc3dd78 | 6c234d54-008d-4966-b2c4-f1bfd8a3d605 | ALLOCATED | STANDALONE | 172.24.0.56   | 10.0.0.159 |
+--------------------------------------+--------------------------------------+-----------+------------+---------------+------------+

[stack@undercloud-0 ~]$ eval $(ssh-agent)
Agent pid 898688
[stack@undercloud-0 ~]$ sudo -E ssh-add /etc/octavia/ssh/octavia_id_rsa
Identity added: /etc/octavia/ssh/octavia_id_rsa (root.local)
[stack@undercloud-0 ~]$ ssh -A -t tripleo-admin ssh cloud-user.0.56
Warning: Permanently added 'controller-0.ctlplane' (ED25519) to the list of known hosts.
[cloud-user@amphora-03deb08b-6062-42fd-b623-43fdbfc3dd78 ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-03deb08b-6062-42fd-b623-43fdbfc3dd78 ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-03deb08b-6062-42fd-b623-43fdbfc3dd78 ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-03deb08b-6062-42fd-b623-43fdbfc3dd78 ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-03deb08b-6062-42fd-b623-43fdbfc3dd78 ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-03deb08b-6062-42fd-b623-43fdbfc3dd78 ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-03deb08b-6062-42fd-b623-43fdbfc3dd78 ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack
[cloud-user@amphora-03deb08b-6062-42fd-b623-43fdbfc3dd78 ~]$ sudo ip netns exec amphora-haproxy cat /proc/net/nf_conntrack

Looks good to me. I am moving the BZ status to VERIFIED.
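The manual check above (dumping /proc/net/nf_conntrack inside the amphora-haproxy namespace several times and confirming the output is empty) could be scripted roughly as follows. This is only a sketch under assumptions: the helper names are hypothetical, and it reuses the exact command from the transcript rather than any Octavia API.

```python
import subprocess

# Same command as in the verification transcript, split into arguments.
CONNTRACK_CMD = [
    "sudo", "ip", "netns", "exec", "amphora-haproxy",
    "cat", "/proc/net/nf_conntrack",
]


def count_entries(dump):
    """Count conntrack entries in a dump, one entry per non-blank line."""
    return sum(1 for line in dump.splitlines() if line.strip())


def conntrack_is_empty(cmd=CONNTRACK_CMD):
    """Run the dump command and report whether it printed no entries."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return count_entries(result.stdout) == 0
```

Calling conntrack_is_empty() repeatedly on the amphora while the curl loop is running would mirror the repeated manual checks shown in the transcript.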
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Release of components for Red Hat OpenStack Platform 17.1 (Wallaby)), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2023:4577