Description of problem:

Some IPv6 tests:

octavia_tempest_plugin.tests.scenario.v2.test_ipv6_traffic_ops.IPv6TrafficOperationsScenarioTest
- test_ipv6_http_LC_listener_with_allowed_cidrs
- test_ipv6_http_SI_listener_with_allowed_cidrs
- test_ipv6_tcp_SI_listener_with_allowed_cidrs

fail on CI when running with the ACTIVE_STANDBY topology.

Links to the failed tests:
https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/network/view/octavia/job/DFG-network-octavia-17.0_director-rhel-virthost-3cont_3comp-ipv4-geneve-actstby/13/testReport/octavia_tempest_plugin.tests.scenario.v2.test_ipv6_traffic_ops/IPv6TrafficOperationsScenarioTest/Finally_Steps___test_ipv6_http_LC_listener_with_allowed_cidrs_id_9bead31b_0760_4c8f_b70a_f758fc5edd6a_/
https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/network/view/octavia/job/DFG-network-octavia-17.0_director-rhel-virthost-3cont_3comp-ipv4-geneve-actstby/13/testReport/octavia_tempest_plugin.tests.scenario.v2.test_ipv6_traffic_ops/IPv6TrafficOperationsScenarioTest/Finally_Steps___test_ipv6_http_SI_listener_with_allowed_cidrs_id_d1256195_3d85_4ffd_bda3_1c0ab78b8ce1_/
https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/network/view/octavia/job/DFG-network-octavia-17.0_director-rhel-virthost-3cont_3comp-ipv4-geneve-actstby/13/testReport/octavia_tempest_plugin.tests.scenario.v2.test_ipv6_traffic_ops/IPv6TrafficOperationsScenarioTest/Finally_Steps___test_ipv6_tcp_SI_listener_with_allowed_cidrs_id_bf8504b6_b95a_4f8a_9032_ab432db46eec_/

Version-Release number of selected component (if applicable):
17

How reproducible:
100%

Steps to Reproduce:
1. Run the Active/Standby job for OSP 17.

Actual results:
Some IPv6 tests fail.

Expected results:
All tests should pass.
It looks like all the requests were forwarded to the same member:

2022-02-11 14:03:06,693 318518 DEBUG [octavia_tempest_plugin.tests.validators] Loadbalancer wait for load balancer response totals: {'1': 25636}

This needs investigation, but the open review https://review.opendev.org/c/openstack/octavia/+/828606 fixes a similar issue with IPv6 members.
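For context, a minimal sketch of the kind of check the validators perform: send repeated requests to the load balancer VIP and tally which member answers each one. The function name, URL, and counting logic below are illustrative assumptions, not the plugin's actual code.

```python
# Illustrative sketch only: tally which backend member answers each request
# sent to the load balancer VIP, similar in spirit to the round-robin check in
# octavia_tempest_plugin.tests.validators.
import collections
import requests

def tally_member_responses(vip_url, request_count=60, timeout=5):
    """Send repeated requests to the VIP and count responses per member.

    Each test web server replies with a member identifier in its body
    (e.g. '1' or '2'), so an even spread indicates round-robin is working.
    """
    totals = collections.Counter()
    for _ in range(request_count):
        body = requests.get(vip_url, timeout=timeout).text.strip()
        totals[body] += 1
    return dict(totals)

# A result such as {'1': 25636} (all requests answered by member 1), as seen
# in the failed run above, means traffic never reached the second member.
# print(tally_member_responses('http://[2001:db8::10]:80/'))
```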
One potentially related issue that I reproduced in my environment: the load balancer loses connectivity to its members when a new member is added (observed in the IPv6 scenario tests). The IPv6 tests trigger a race condition in Octavia's management of member ports.

I. Reproduction steps

Using the same load balancer:

test 1:
1. Create a member on subnet A: a port on subnet A is attached to the amphora.
2. Create a member on subnet B: a port on subnet B is attached to the amphora.
3. Delete the members: the ports are not updated.

test 2:
4. Create a member on subnet A: the port is already attached, but Octavia notices that the port on subnet B is unused, so it unplugs that port.
5. Create a member on subnet B: Octavia gets the list of ports attached to the amphora; it should create a port on subnet B, but the subnet B port is still in the list, so it doesn't update the ports.
6. The port is then removed from the list of attached ports and is missing from the amphora, so the load balancer cannot open a connection to subnet B.

II. Details

Unplugging a port from a server can take many seconds (deleting the port from the server is fast, but it can take up to 8 seconds on OSP for the change to be committed in the DB), and Octavia doesn't wait for the removal to complete. So if a new member is added between the unplug API call and the effective deletion from the DB, the amphora may be left with a bad network configuration.

Waiting for the DB update in Octavia would fix this issue, but it would also increase the duration of the member create flow.

Change proposed in https://review.opendev.org/c/openstack/octavia/+/829805
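A minimal sketch of the idea behind the proposed fix: after asking the compute service to detach a member port, poll until the port no longer appears attached before continuing with the next member operation. This is not the code from review 829805; `list_attached_port_ids` is a hypothetical callable standing in for whatever client call Octavia uses to list the amphora's interfaces.

```python
# Sketch of "wait for the detach to be committed" logic; assumptions noted above.
import time

class PortDetachTimeout(Exception):
    pass

def wait_for_port_detach(list_attached_port_ids, port_id,
                         timeout=30, interval=1):
    """Block until `port_id` disappears from the server's attached ports.

    The detach API call returns quickly, but the backing database can take
    several seconds to reflect the change; acting on a stale port list during
    that window is what leaves the amphora mis-plugged.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if port_id not in list_attached_port_ids():
            return
        time.sleep(interval)
    raise PortDetachTimeout(
        'port %s still attached after %s seconds' % (port_id, timeout))
```

The trade-off mentioned above applies directly to this loop: every member create that triggers an unplug would block for up to `timeout` seconds, lengthening the member create flow.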
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2022:6543