Bug 2306799
| Summary: | [Openstack 17.1.3] Octavia loadbalancer not allow members to access the VIP by default | ||
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Conrado Gusso Bozza <cgussobo> |
| Component: | openstack-octavia | Assignee: | Gregory Thiemonge <gthiemon> |
| Status: | CLOSED ERRATA | QA Contact: | Arkady Shtempler <ashtempl> |
| Severity: | medium | Docs Contact: | Greg Rakauskas <gregraka> |
| Priority: | medium | ||
| Version: | 17.1 (Wallaby) | CC: | bcafarel, beagles, chrisbro, dhill, gthiemon, mariel, mburns, parthee, tweining, yatanaka |
| Target Milestone: | z4 | Keywords: | Triaged |
| Target Release: | 17.1 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | openstack-octavia-8.0.2-17.1.20240829160807.8cbe692.el9ost | Doc Type: | Bug Fix |
| Doc Text: | Before this update, load balancers could not reply to requests from hosts attached to a subnet that is also used as a member subnet in the same load balancer. This was caused by a missing default network route on the VIP interface of the load balancer. In RHOSP 17.1.4, the missing route has been added, and connectivity is restored. | Story Points: | --- |
| Last Closed: | 2024-11-21 09:42:19 UTC | Type: | Bug |
Description
Conrado Gusso Bozza
2024-08-21 16:54:49 UTC
Some notes on this issue:

A default route is missing from table 1 in ACTIVE_STANDBY:

```
[cloud-user@amphora-46e9b6ca-82a2-4317-a54d-a0a7de2769d2 ~]$ sudo ip netns exec amphora-haproxy ip route
default via 10.0.2.1 dev eth1 proto static onlink
10.0.1.0/24 dev eth1 proto kernel scope link src 10.0.1.89
10.0.2.0/24 dev eth1 proto kernel scope link src 10.0.2.55
[cloud-user@amphora-46e9b6ca-82a2-4317-a54d-a0a7de2769d2 ~]$ sudo ip netns exec amphora-haproxy ip route show table 1
10.0.2.0/24 dev eth1 proto keepalived scope link src 10.0.2.87
```

Before Wallaby, the route was added by the ifcfg scripts:
https://github.com/openstack/octavia/blob/unmaintained/victoria/octavia/amphorae/backends/agent/api_server/templates/rh_route_ethX.conf.j2#L25

In Wallaby/Xena/Yoga, it is added only when the topology is not ACTIVE_STANDBY:
https://github.com/openstack/octavia/blob/unmaintained/wallaby/octavia/amphorae/backends/utils/interface_file.py#L132-L139

Since Zed, it is handled by keepalived:
https://github.com/openstack/octavia/blob/master/octavia/amphorae/drivers/keepalived/jinja/templates/keepalived_base.template#L60

The bug occurs when:
- the VIP is on subnet1 (e.g. 10.1.0.12 with VRRP IP 10.1.0.100)
- a member is on subnet2, and subnet2 is plugged into the amphora (e.g. 10.2.0.91 in the amphora)
- subnet1 and subnet2 are plugged into a router
- a client on subnet2 (10.2.0.80) tries to reach the VIP

In the amphora-haproxy namespace, the interfaces are configured:

```
$ sudo ip -n amphora-haproxy -br a
eth1 UP 10.1.0.100/24 10.1.0.12/32
eth2 UP 10.2.0.91/24
```

There's an explicit rule for the VIP:

```
$ sudo ip -n amphora-haproxy -br rules
[..]
100: from 10.1.0.12 lookup 1 proto keepalived
```

Default routing table:

```
$ sudo ip -n amphora-haproxy -br route
default via 10.1.0.1 dev eth1 proto static onlink
10.1.0.0/24 dev eth1 proto kernel scope link src 10.1.0.100
10.2.0.0/24 dev eth2 proto kernel scope link src 10.2.0.91
```

Table 1:

```
$ sudo ip -n amphora-haproxy -br route show table 1
10.1.0.0/24 dev eth1 proto keepalived scope link src 10.1.0.12
```

A TCP SYN packet from the client (10.2.0.80) to the VIP (10.1.0.12) is received on eth1. The Linux kernel must send a SYN-ACK packet to the client on the same interface to acknowledge the connection. Because the SYN-ACK packet is emitted from 10.1.0.12, it uses routing table 1, but that table has no route to 10.2.0.0/24 and no default route, so the packet is dropped.

A default route in table 1 via `10.1.0.1 dev eth1` would fix this issue.

Hello Gregory,

The customer tried a hotfix on a testing infrastructure. They applied the procedure detailed in https://access.redhat.com/solutions/6214931 to update the default amphora image, with a slight modification of the code. They are asking whether this hotfix can be deployed on production infrastructure without side effects.

```
diff interface_file.py interface_file.py.ori
132,140c132,139
< # OASIS CASE 03908793
< # if topology != consts.TOPOLOGY_ACTIVE_STANDBY:
< self.routes.append({
<     consts.DST: (
<         "::/0" if ip_version == 6 else "0.0.0.0/0"),
<     consts.GATEWAY: gateway,
<     consts.FLAGS: [consts.ONLINK],
<     consts.TABLE: 1,
< })
---
> if topology != consts.TOPOLOGY_ACTIVE_STANDBY:
>     self.routes.append({
>         consts.DST: (
>             "::/0" if ip_version == 6 else "0.0.0.0/0"),
>         consts.GATEWAY: gateway,
>         consts.FLAGS: [consts.ONLINK],
>         consts.TABLE: 1,
>     })
```

Regards,
Conrado

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (RHOSP 17.1.4 bug fix and enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2024:9974

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.
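The table 1 lookup described in the analysis above can be illustrated with a small Python sketch. It is a simplified model, not Octavia or kernel code: it performs a longest-prefix match against table 1 only (the table that rule 100 selects for packets sourced from the VIP) and does not model the rest of the kernel's rule list or any Neutron port filtering. The `lookup` helper and the dictionary layout of the routes are hypothetical names chosen for illustration; the addresses are the ones from the example in the report.

```python
import ipaddress


def lookup(table, dst):
    """Return the longest-prefix route in `table` that contains `dst`, or None.

    Hypothetical helper: models a single routing table only.
    """
    dst_ip = ipaddress.ip_address(dst)
    matches = [r for r in table if dst_ip in ipaddress.ip_network(r["dst"])]
    return max(
        matches,
        key=lambda r: ipaddress.ip_network(r["dst"]).prefixlen,
        default=None,
    )


# Table 1 as shown in the report: only the connected route for the VIP subnet.
TABLE_1 = [{"dst": "10.1.0.0/24", "dev": "eth1"}]

# Table 1 with the default route suggested in the report added.
TABLE_1_FIXED = TABLE_1 + [{"dst": "0.0.0.0/0", "via": "10.1.0.1", "dev": "eth1"}]

CLIENT = "10.2.0.80"  # client on subnet2, as in the report's example

before = lookup(TABLE_1, CLIENT)       # no matching route in table 1
after = lookup(TABLE_1_FIXED, CLIENT)  # matches the added default route

print("without default route:", before)
print("with default route:   ", after)
```

In this model the reply to the client finds no route in the original table 1, while the table with the added default route sends it through the gateway `10.1.0.1` on `eth1`, which matches the fix proposed in the report.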