Description of problem:
The load balancer does not respond to requests from the pool member subnet when the load balancer and the pool members are in different subnets; this worked in OpenStack 16.2.5. We suspect the issue is related to code removed by [1].

Version-Release number of selected component (if applicable):
OpenStack 17.1.3
octavia-amphora-17.1-20240516.1.x86_64.qcow2

How reproducible:
Always

Steps to Reproduce:
1. In a running stack, create a load balancer in a private subnet (e.g. private-lb-subnet).
2. Add a pool member to the load balancer that is in another private subnet (e.g. private-pool-subnet).
3. Create a router routing between both subnets.

Actual results:
- The load balancer handles requests from private-lb-subnet (expected behaviour).
- The load balancer does not handle requests from private-pool-subnet.

Expected results:
- The load balancer handles requests from both private-pool-subnet and private-lb-subnet.

Additional info:
[1] https://review.opendev.org/c/openstack/octavia/+/807310

Workaround:
[root@amphora ~]# ip netns exec amphora-haproxy ip route add default via 10.10.10.1 dev eth1 onlink table 1
Some notes on this issue:

A default route is missing from table 1 in ACTIVE_STANDBY:

```
[cloud-user@amphora-46e9b6ca-82a2-4317-a54d-a0a7de2769d2 ~]$ sudo ip netns exec amphora-haproxy ip route
default via 10.0.2.1 dev eth1 proto static onlink
10.0.1.0/24 dev eth1 proto kernel scope link src 10.0.1.89
10.0.2.0/24 dev eth1 proto kernel scope link src 10.0.2.55
[cloud-user@amphora-46e9b6ca-82a2-4317-a54d-a0a7de2769d2 ~]$ sudo ip netns exec amphora-haproxy ip route show table 1
10.0.2.0/24 dev eth1 proto keepalived scope link src 10.0.2.87
```

Before Wallaby, the route was added by the ifcfg scripts:
https://github.com/openstack/octavia/blob/unmaintained/victoria/octavia/amphorae/backends/agent/api_server/templates/rh_route_ethX.conf.j2#L25

In Wallaby/Xena/Yoga, it is added only when not in ACTIVE_STANDBY:
https://github.com/openstack/octavia/blob/unmaintained/wallaby/octavia/amphorae/backends/utils/interface_file.py#L132-L139

Since Zed, it is handled by keepalived:
https://github.com/openstack/octavia/blob/master/octavia/amphorae/drivers/keepalived/jinja/templates/keepalived_base.template#L60
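As a quick check on an affected amphora, the `ip route show table 1` output can be inspected for a default route. A minimal sketch (the helper and its name are my own illustration, not part of Octavia):

```python
# Hypothetical helper: detect whether a routing-table dump contains a
# default route. Feed it the output of:
#   sudo ip netns exec amphora-haproxy ip route show table 1
def has_default_route(ip_route_output: str) -> bool:
    return any(line.split()[:1] == ["default"]
               for line in ip_route_output.splitlines() if line.strip())

# Output captured from the affected ACTIVE_STANDBY amphora above:
table_1 = "10.0.2.0/24 dev eth1 proto keepalived scope link src 10.0.2.87"
print(has_default_route(table_1))  # False: return traffic via table 1 has no gateway
```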
The bug occurs when:
- the VIP is on subnet1 (e.g. 10.1.0.12 with VRRP IP 10.1.0.100)
- a member is on subnet2, and subnet2 is plugged into the amphora (e.g. 10.2.0.91 in the amphora)
- subnet1 and subnet2 are plugged into a router
- a client on subnet2 (10.2.0.80) tries to reach the VIP

In the amphora-haproxy namespace, the interfaces are configured:

```
$ sudo ip -n amphora-haproxy -br a
eth1 UP 10.1.0.100/24 10.1.0.12/32
eth2 UP 10.2.0.91/24
```

There's an explicit rule for the VIP:

```
$ sudo ip -n amphora-haproxy -br rule
[..]
100: from 10.1.0.12 lookup 1 proto keepalived
```

Default routing table:

```
$ sudo ip -n amphora-haproxy -br route
default via 10.1.0.1 dev eth1 proto static onlink
10.1.0.0/24 dev eth1 proto kernel scope link src 10.1.0.100
10.2.0.0/24 dev eth2 proto kernel scope link src 10.2.0.91
```

Table 1:

```
$ sudo ip -n amphora-haproxy -br route show table 1
10.1.0.0/24 dev eth1 proto keepalived scope link src 10.1.0.12
```

A TCP SYN packet from the client (10.2.0.80) to the VIP (10.1.0.12) is received on eth1. The kernel must send a SYN-ACK back to the client on the same interface to acknowledge the connection. Because the SYN-ACK is emitted from 10.1.0.12, the `from 10.1.0.12` rule directs the lookup to routing table 1; that table has no route to 10.2.0.0/24 and no default route, so the packet is dropped. A default route in table 1 via `10.1.0.1 dev eth1` would fix this issue.
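The lookup sequence above can be modeled with a small sketch. This is a simplified model of Linux policy routing (source-based rule selects a table, then longest-prefix match within the table), not kernel code; the table contents mirror the `ip route` output shown above and all names are my own:

```python
import ipaddress

# Rule 100: from 10.1.0.12 lookup 1 (installed by keepalived)
RULES = [("10.1.0.12", 1)]
TABLES = {
    "main": ["0.0.0.0/0", "10.1.0.0/24", "10.2.0.0/24"],
    1: ["10.1.0.0/24"],  # table 1: no default route, no 10.2.0.0/24
}

def route_lookup(src: str, dst: str):
    """Pick the table from the source address, then longest-prefix match."""
    table = next((t for ip, t in RULES if ip == src), "main")
    dst_ip = ipaddress.ip_address(dst)
    matches = [ipaddress.ip_network(r) for r in TABLES[table]
               if dst_ip in ipaddress.ip_network(r)]
    if not matches:
        return None  # no route in the selected table: packet is dropped
    return str(max(matches, key=lambda n: n.prefixlen))

# SYN-ACK sourced from the VIP toward the client on subnet2 finds no route:
print(route_lookup("10.1.0.12", "10.2.0.80"))   # None -> dropped
# Traffic sourced from the VRRP address uses the main table and succeeds:
print(route_lookup("10.1.0.100", "10.2.0.80"))  # 10.2.0.0/24
```

Adding `0.0.0.0/0` to table 1 (the workaround above) makes the first lookup succeed via the gateway.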
Hello Gregory,

The customer tried a hotfix on a testing infrastructure. They applied the procedure detailed in https://access.redhat.com/solutions/6214931 to update the default amphora image with a slight code modification. They are asking whether this hotfix can be deployed on the production infrastructure without side effects.

```
diff interface_file.py interface_file.py.ori
132,140c132,139
<         # OASIS CASE 03908793
<         # if topology != consts.TOPOLOGY_ACTIVE_STANDBY:
<         self.routes.append({
<             consts.DST: (
<                 "::/0" if ip_version == 6 else "0.0.0.0/0"),
<             consts.GATEWAY: gateway,
<             consts.FLAGS: [consts.ONLINK],
<             consts.TABLE: 1,
<         })
---
>         if topology != consts.TOPOLOGY_ACTIVE_STANDBY:
>             self.routes.append({
>                 consts.DST: (
>                     "::/0" if ip_version == 6 else "0.0.0.0/0"),
>                 consts.GATEWAY: gateway,
>                 consts.FLAGS: [consts.ONLINK],
>                 consts.TABLE: 1,
>             })
```

Regards,
Conrado
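For reference, the effect of the customer's change can be sketched as a standalone model. This is a simplified stand-in for the Wallaby-era logic in `octavia/amphorae/backends/utils/interface_file.py`, not the real module: the function, the `patched` flag, and the plain-string keys are my own, replacing the `consts.*` constants used upstream.

```python
TOPOLOGY_ACTIVE_STANDBY = "ACTIVE_STANDBY"

def build_table1_routes(topology, gateway, ip_version, patched):
    """Return the default route(s) added to table 1 for the VIP interface."""
    routes = []
    # Upstream (unpatched): the table-1 default route is skipped in
    # ACTIVE_STANDBY, on the assumption that keepalived installs it.
    # Customer hotfix (patched): always add it, regardless of topology.
    if patched or topology != TOPOLOGY_ACTIVE_STANDBY:
        routes.append({
            "dst": "::/0" if ip_version == 6 else "0.0.0.0/0",
            "gateway": gateway,
            "flags": ["onlink"],
            "table": 1,
        })
    return routes

# Unpatched ACTIVE_STANDBY: no default route in table 1 (the bug).
print(build_table1_routes("ACTIVE_STANDBY", "10.1.0.1", 4, patched=False))  # []
# Patched: the route is always present.
print(build_table1_routes("ACTIVE_STANDBY", "10.1.0.1", 4, patched=True))
```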
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (RHOSP 17.1.4 bug fix and enhancement advisory), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:9974
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days