Bug 1934643
| Summary: | Need BFD failover capability on ECMP routes | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Tim Rozet <trozet> |
| Component: | Networking | Assignee: | Federico Paolinelli <fpaoline> |
| Networking sub component: | ovn-kubernetes | QA Contact: | Ross Brattain <rbrattai> |
| Status: | CLOSED ERRATA | Type: | Bug |
| Severity: | urgent | Priority: | urgent |
| Version: | 4.7 | Target Release: | 4.8.0 |
| Hardware: | Unspecified | OS: | Unspecified |
| Doc Type: | No Doc Update | CC: | aconstan, pibanezr, rbrattai, zzhao |
| : | 1934645 (view as bug list) | Bug Blocks: | 1934645 |
| Last Closed: | 2021-07-27 22:49:27 UTC | | |
Description
Tim Rozet
2021-03-03 15:41:22 UTC
I see the annotations causing the BFD sessions to be created, but I don't have external BFD routers to test with.
4.8.0-0.nightly-2021-03-16-073618
oc annotate ns t1 k8s.ovn.org/routing-external-gws=10.242.0.1,10.242.0.2
oc annotate ns t1 k8s.ovn.org/bfd-enabled=""
GR_rbrattai-o48v24-2pkgr-worker-ghvzc
IPv4 Routes
10.128.2.10 10.242.0.1 src-ip ecmp ecmp-symmetric-reply
10.128.2.10 10.242.0.2 src-ip ecmp ecmp-symmetric-reply
10.128.0.0/14 100.64.0.1 dst-ip
0.0.0.0/0 172.31.248.1 dst-ip rtoe-GR_rbrattai-o48v24-2pkgr-worker-ghvzc
sh-4.4# ovn-nbctl --format=table find BFD
_uuid detect_mult dst_ip external_ids logical_port min_rx min_tx options status
------------------------------------ ----------- ------------ ------------ ------------------------------------------ ------ ------ ------- ------
94add884-a7b0-4f80-ae31-b36a369e3cc8 [] "10.242.0.1" {} rtoe-GR_rbrattai-o48v24-2pkgr-worker-ghvzc [] [] {} down
eb70052f-4ae0-4fa5-9bb5-9f7eb5605895 [] "10.242.0.2" {} rtoe-GR_rbrattai-o48v24-2pkgr-worker-ghvzc [] [] {} down
sh-4.4# ovn-nbctl --format=table find Logical_Router_Static_Route
_uuid bfd external_ids ip_prefix nexthop options output_port policy
------------------------------------ ------------------------------------ ------------ --------------- -------------- ----------------------------- ------------------------------------------ ------
89e72ba6-61f2-441a-a1a1-c3ec2bdbe185 [] {} "0.0.0.0/0" "172.31.248.1" {} rtoe-GR_rbrattai-o48v24-2pkgr-worker-c5qrb []
118c64c8-9966-49f9-81a9-1956dd031f37 94add884-a7b0-4f80-ae31-b36a369e3cc8 {} "10.128.2.10" "10.242.0.1" {ecmp_symmetric_reply="true"} rtoe-GR_rbrattai-o48v24-2pkgr-worker-ghvzc src-ip
87a76309-8ea6-46af-a300-e217b5e117e8 [] {} "0.0.0.0/0" "172.31.248.1" {} rtoe-GR_rbrattai-o48v24-2pkgr-master-2 []
a9faaa8a-0ee0-4671-bf8b-330b05b4eab3 [] {} "0.0.0.0/0" "172.31.248.1" {} rtoe-GR_rbrattai-o48v24-2pkgr-worker-ghvzc []
d517ca12-c978-4bf1-9896-f8dbc1686bfe [] {} "0.0.0.0/0" "172.31.248.1" {} rtoe-GR_rbrattai-o48v24-2pkgr-master-1 []
cbf0a1b4-6be7-43bf-94aa-50267735bb85 eb70052f-4ae0-4fa5-9bb5-9f7eb5605895 {} "10.128.2.10" "10.242.0.2" {ecmp_symmetric_reply="true"} rtoe-GR_rbrattai-o48v24-2pkgr-worker-ghvzc src-ip
sh-4.4# ovs-ofctl -O OpenFlow13 dump-flows br-ex | grep 3784
cookie=0xdeff105, duration=34429.320s, table=1, n_packets=0, n_bytes=0, priority=13,udp,in_port=1,tp_dst=3784 actions=output:2,LOCAL
sh-4.4# ovs-ofctl -O OpenFlow13 dump-flows br-int | grep tp_dst=3784
cookie=0xb1b9218c, duration=1797.314s, table=11, n_packets=0, n_bytes=0, priority=110,udp6,metadata=0x11,ipv6_dst=fe80::250:56ff:feac:61dc,tp_dst=3784 actions=controller(userdata=00.00.00.17.00.00.00.00)
cookie=0x8403fc30, duration=1797.314s, table=11, n_packets=0, n_bytes=0, priority=110,udp,metadata=0x11,nw_dst=172.31.249.221,tp_dst=3784 actions=controller(userdata=00.00.00.17.00.00.00.00)
cookie=0x92c08538, duration=1797.314s, table=11, n_packets=0, n_bytes=0, priority=110,udp6,metadata=0x11,ipv6_src=fe80::250:56ff:feac:61dc,tp_dst=3784 actions=resubmit(,12)
cookie=0xb450fcd2, duration=1797.314s, table=11, n_packets=4105, n_bytes=270930, priority=110,udp,metadata=0x11,nw_src=172.31.249.221,tp_dst=3784 actions=resubmit(,12)
cookie=0x30f403ae, duration=1797.316s, table=18, n_packets=0, n_bytes=0, priority=130,udp6,reg14=0x2,metadata=0x11,ipv6_dst=fe80::/64,tp_dst=3784 actions=load:0->OXM_OF_PKT_REG4[32..47],move:NXM_NX_IPV6_DST[]->NXM_NX_XXREG0[],set_field:0xfe80000000000000025056fffeac61dc->xxreg1,set_field:00:50:56:ac:61:dc->eth_src,set_field:0x2->reg15,load:0x1->NXM_NX_REG10[0],resubmit(,19)
cookie=0xc7f30157, duration=1797.316s, table=18, n_packets=0, n_bytes=0, priority=48,udp,metadata=0x11,nw_dst=172.31.248.0/23,tp_dst=3784 actions=load:0->OXM_OF_PKT_REG4[32..47],move:NXM_OF_IP_DST[]->NXM_NX_XXREG0[96..127],load:0xac1ff9dd->NXM_NX_XXREG0[64..95],set_field:00:50:56:ac:61:dc->eth_src,set_field:0x2->reg15,load:0x1->NXM_NX_REG10[0],resubmit(,19)
cookie=0x42621563, duration=1797.316s, table=18, n_packets=4105, n_bytes=270930, priority=2,udp,metadata=0x11,tp_dst=3784 actions=load:0->OXM_OF_PKT_REG4[32..47],load:0xac1ff801->NXM_NX_XXREG0[96..127],load:0xac1ff9dd->NXM_NX_XXREG0[64..95],set_field:00:50:56:ac:61:dc->eth_src,set_field:0x2->reg15,load:0x1->NXM_NX_REG10[0],resubmit(,19)
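The session above can be spot-checked offline. A minimal sketch, using sample rows abridged from the `ovn-nbctl --format=table find BFD` output above (UUIDs truncated, gateway port name shortened — both are stand-ins), that confirms one BFD row was created per annotated gateway:

```shell
# Abridged sample of the `find BFD` table output, inlined so this runs
# without a live NB database. Port name shortened for readability.
nbctl_output='94add884 [] "10.242.0.1" {} rtoe-GR_worker [] [] {} down
eb70052f [] "10.242.0.2" {} rtoe-GR_worker [] [] {} down'

expected_gws=2   # two gateways were listed in the routing-external-gws annotation
bfd_rows=$(printf '%s\n' "$nbctl_output" | grep -c 'rtoe-GR_')

if [ "$bfd_rows" -eq "$expected_gws" ]; then
  echo "BFD sessions created for all $expected_gws gateways"
else
  echo "missing BFD sessions: found $bfd_rows of $expected_gws" >&2
fi
```

Note the rows report `down` here because, as stated above, no external BFD routers were available; row count, not state, is what this check verifies.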
@trozet Could you help check if comment 3 is enough to verify this bug? Thanks.

In case it's not enough: I used FRR to test and CI this; maybe it helps: http://docs.frrouting.org/en/latest/bfd.html

To enable bfdd you need to run:

sed -i 's/^bfdd=no/bfdd=yes/g' /etc/frr/daemons

and to add a peer:

cat << EOF >> /etc/frr/frr.conf
bfd
 peer 172.18.0.4
  no shutdown
 !
!
EOF

vtysh -c "show bfd peers" shows the coupled peers, and ovn-nbctl find bfd shows the peer status from the OVN point of view.

If you are unable to run a BFD client to verify, you can at minimum verify that you see BFD control packets coming from the node. At least that provides some indication that BFD is functioning from the OVN side. FRR was easy enough.
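The FRR steps above can be collected into one script. A sketch that writes to a scratch directory instead of `/etc/frr`, so the edits can be inspected without root (the peer address 172.18.0.4 is the example from the comment and is environment-specific):

```shell
# Scratch directory stands in for /etc/frr; FRR_DIR can be overridden.
FRR_DIR="${FRR_DIR:-$(mktemp -d)}"

# Stand-in for the stock daemons file, then the same sed edit as above
# to turn on bfdd.
printf 'bfdd=no\n' > "$FRR_DIR/daemons"
sed -i 's/^bfdd=yes\|^bfdd=no/bfdd=yes/' "$FRR_DIR/daemons"

# Append the BFD peer stanza from the comment above.
cat << 'EOF' >> "$FRR_DIR/frr.conf"
bfd
 peer 172.18.0.4
  no shutdown
 !
!
EOF

echo "wrote FRR config under $FRR_DIR"
```

After restarting FRR against the real `/etc/frr`, `vtysh -c "show bfd peers"` should list the peer as in comment output further below.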
sh-4.4# ovn-nbctl find bfd
_uuid : 82f83d39-5769-463e-88de-fae888bc480b
detect_mult : []
dst_ip : "10.0.24.40"
external_ids : {}
logical_port : rtoe-GR_ip-10-0-142-146.compute.internal
min_rx : []
min_tx : []
options : {}
status : up
sh-4.4# ovn-nbctl find Logical_Router_Static_Route bfd!=[]
_uuid : 7b542cfc-3853-4da5-b1ab-c45ecdcf5f51
bfd : 82f83d39-5769-463e-88de-fae888bc480b
external_ids : {}
ip_prefix : "10.129.3.121"
nexthop : "10.0.24.40"
options : {ecmp_symmetric_reply="true"}
output_port : rtoe-GR_ip-10-0-142-146.compute.internal
policy : src-ip
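The two record-format queries above can also be checked offline. A minimal sketch, using output abridged from the `ovn-nbctl find bfd` record above, that flags any session whose `status` is not `up`:

```shell
# Abridged record-format output from `ovn-nbctl find bfd`, inlined so the
# check runs without a live NB database.
find_bfd_output='_uuid               : 82f83d39-5769-463e-88de-fae888bc480b
dst_ip              : "10.0.24.40"
logical_port        : rtoe-GR_ip-10-0-142-146.compute.internal
status              : up'

# Count `status` fields whose value is anything other than "up".
down_count=$(printf '%s\n' "$find_bfd_output" \
  | awk '$1 == "status" && $3 != "up" {n++} END {print n+0}')
echo "sessions not up: $down_count"
```

A static route is BFD-tracked when its `bfd` column references the session's `_uuid`, as the `Logical_Router_Static_Route bfd!=[]` query above shows.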
[root@ip-10-0-24-40 frr]# tcpdump -i eth0 port '(3784 or 3785)'
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
01:56:46.476071 IP ip-10-0-142-146.compute.internal.49152 > ip-10-0-24-40.compute.internal.bfd-control: BFDv1, Control, State Up, Flags: [none], length: 24
01:56:46.649478 IP ip-10-0-24-40.compute.internal.49152 > ip-10-0-142-146.compute.internal.bfd-control: BFDv1, Control, State Up, Flags: [none], length: 24
[root@ip-10-0-24-40 frr]# vtysh -c "show bfd peers"
BFD Peers:
peer 10.0.142.146
ID: 1
Remote ID: 3771289794
Status: up
Uptime: 2 minute(s), 3 second(s)
Diagnostics: ok
Remote diagnostics: ok
Local timers:
Receive interval: 300ms
Transmission interval: 300ms
Echo transmission interval: disabled
Remote timers:
Receive interval: 1000ms
Transmission interval: 1000ms
Echo transmission interval: 0ms
Internet Protocol Version 4, Src: 10.0.142.146, Dst: 10.0.24.40
0100 .... = Version: 4
.... 0101 = Header Length: 20 bytes (5)
Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
0000 00.. = Differentiated Services Codepoint: Default (0)
.... ..00 = Explicit Congestion Notification: Not ECN-Capable Transport (0)
Total Length: 52
Identification: 0x0000 (0)
Flags: 0x40, Don't fragment
0... .... = Reserved bit: Not set
.1.. .... = Don't fragment: Set
..0. .... = More fragments: Not set
Fragment Offset: 0
Time to Live: 255
Protocol: UDP (17)
Header Checksum: 0xc0fe [validation disabled]
[Header checksum status: Unverified]
Source Address: 10.0.142.146
Destination Address: 10.0.24.40
User Datagram Protocol, Src Port: 49152, Dst Port: 3784
Source Port: 49152
Destination Port: 3784
Length: 32
[Checksum: [missing]]
[Checksum Status: Not present]
[Stream index: 0]
[Timestamps]
[Time since first frame: 0.000000000 seconds]
[Time since previous frame: 0.000000000 seconds]
UDP payload (24 bytes)
BFD Control message
001. .... = Protocol Version: 1
...0 0000 = Diagnostic Code: No Diagnostic (0x00)
11.. .... = Session State: Up (0x3)
Message Flags: 0xc0
0... .. = Poll: Not set
.0.. .. = Final: Not set
..0. .. = Control Plane Independent: Not set
...0 .. = Authentication Present: Not set
.... 0. = Demand: Not set
.... .0 = Multipoint: Not set
Detect Time Multiplier: 5 (= 5000 ms Detection time)
Message Length: 24 bytes
My Discriminator: 0xe0c950c2
Your Discriminator: 0x00000001
Desired Min TX Interval: 1000 ms (1000000 us)
Required Min RX Interval: 1000 ms (1000000 us)
Required Min Echo Interval: 0 ms (0 us)
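The 5000 ms detection time noted in the decode above follows directly from the advertised timers: per RFC 5880, the asynchronous-mode detection time at the receiver is the Detect Time Multiplier times the negotiated transmit interval. A quick arithmetic check with the values from this packet:

```shell
# Values from the BFD Control message decode above:
# Detect Time Multiplier = 5, Desired Min TX Interval = 1000 ms.
detect_mult=5
tx_interval_ms=1000
detection_time_ms=$((detect_mult * tx_interval_ms))
echo "detection time: ${detection_time_ms} ms"
```

So with these timers, a failed ECMP next hop is declared down after five missed control packets, about 5 seconds, at which point OVN withdraws the corresponding static route.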
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438