Bug 2109059

Summary: Reply to arp requests on interfaces with no ip
Product: OpenShift Container Platform Reporter: Federico Paolinelli <fpaoline>
Component: Networking Assignee: Federico Paolinelli <fpaoline>
Networking sub component: Metal LB QA Contact: Greg Kopels <gkopels>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium    
Version: 4.11   
Target Milestone: ---   
Target Release: 4.12.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-01-17 19:53:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2109489    

Description Federico Paolinelli 2022-07-20 11:03:10 UTC
Description of problem:

Raised upstream. One MetalLB use case is to run it on VLAN interfaces that have no IP address assigned (to save IPs on the LAN).
MetalLB must be able to reply to ARP requests on interfaces with no IP assigned.
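
As a rough illustration of the "interface with no IP" condition, the Python sketch below checks whether an `ip addr show` block contains an IPv4 (`inet`) line. The helper name `has_ipv4` is hypothetical and this is not how MetalLB performs the check; the sample text is the vlan10 output quoted later in this bug, which carries only a link-local IPv6 address.

```python
def has_ipv4(ip_addr_block: str) -> bool:
    """Return True if the `ip addr show` text contains an `inet` (IPv4) line."""
    for line in ip_addr_block.splitlines():
        fields = line.split()
        # `inet` marks IPv4 addresses; `inet6` marks IPv6 and is ignored here.
        if fields and fields[0] == "inet":
            return True
    return False


# Sample copied from the verification comment in this bug (hypothetical use).
SAMPLE = """\
767: vlan10@ens5f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 50:7c:6f:16:bc:a8 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::ea96:cccc:bda0:ba6c/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
"""

print(has_ipv4(SAMPLE))  # prints False: only an inet6 address is present
```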

This is going to reopen https://bugzilla.redhat.com/show_bug.cgi?id=2068303 , and the right solution to that bz is what is being worked on in https://github.com/metallb/metallb/blob/main/design/layer2-bind-interfaces.md

Comment 3 Federico Paolinelli 2022-07-21 12:07:39 UTC
*** Bug 2107516 has been marked as a duplicate of this bug. ***

Comment 7 errata-xmlrpc 2023-01-17 19:53:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399

Comment 8 Greg Kopels 2023-01-18 09:47:15 UTC
OCP-4.12.0-rc.6

1. I created a VLAN interface using an interface not connected to br-ex (the node management interface - 10.46.56.14).

 767: vlan10@ens5f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
     link/ether 50:7c:6f:16:bc:a8 brd ff:ff:ff:ff:ff:ff
     inet6 fe80::ea96:cccc:bda0:ba6c/64 scope link noprefixroute 
        valid_lft forever preferred_lft forever

2. Deployed a pod on master0. The pod is connected to the management interface of the master using a MACVLAN interface (IP address 10.46.56.132)

3. Created an L2 service with gateway 10.46.56.131.

[gkopels@ ~]$ oc get service -n metallb-test 
NAME            TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)        AGE
service-bn5t9   LoadBalancer   172.30.141.98   10.46.56.131   80:31299/TCP   5m21s

4. From the pod on the master I sent an arping to the gateway 10.46.56.131
 [root@testpod-vcrss /]# arping -I net1 10.46.56.131

 ARPING 10.46.56.131 from 10.46.56.132 net1
 Unicast reply from 10.46.56.131 [52:54:00:9D:52:91]  0.957ms Master0 br-ex mac
 Unicast reply from 10.46.56.131 [34:48:ED:F3:E2:2C]  2.421ms
 Unicast reply from 10.46.56.131 [34:48:ED:F3:E2:2C]  1.313ms
 Unicast reply from 10.46.56.131 [34:48:ED:F3:E2:2C]  0.909ms
 Unicast reply from 10.46.56.131 [34:48:ED:F3:E2:2C]  0.876ms
 Unicast reply from 10.46.56.131 [34:48:ED:F3:E2:2C]  0.946ms

The only two interfaces that answered were the gateway on the announcing node and the master0 interface where the pod is deployed.
I was unable to get the L2 vlan10 interface to answer the ARP request.
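
The observation above (two distinct responders) can be checked mechanically by grouping the arping replies by MAC address. The following Python sketch does that under the assumption that replies use the `Unicast reply from <ip> [<mac>]` format shown in the output above; the function name `replies_by_mac` is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   Unicast reply from 10.46.56.131 [34:48:ED:F3:E2:2C]  2.421ms
REPLY_RE = re.compile(r"Unicast reply from (\S+) \[([0-9A-Fa-f:]{17})\]")


def replies_by_mac(arping_output: str) -> dict:
    """Count arping replies per responding MAC address."""
    counts = Counter()
    for line in arping_output.splitlines():
        match = REPLY_RE.search(line)
        if match:
            counts[match.group(2).upper()] += 1
    return dict(counts)


# Sample copied from the arping run in this comment.
SAMPLE = """\
ARPING 10.46.56.131 from 10.46.56.132 net1
Unicast reply from 10.46.56.131 [52:54:00:9D:52:91]  0.957ms
Unicast reply from 10.46.56.131 [34:48:ED:F3:E2:2C]  2.421ms
Unicast reply from 10.46.56.131 [34:48:ED:F3:E2:2C]  1.313ms
Unicast reply from 10.46.56.131 [34:48:ED:F3:E2:2C]  0.909ms
Unicast reply from 10.46.56.131 [34:48:ED:F3:E2:2C]  0.876ms
Unicast reply from 10.46.56.131 [34:48:ED:F3:E2:2C]  0.946ms
"""

for mac, count in replies_by_mac(SAMPLE).items():
    print(mac, count)
```

The vlan10 MAC (50:7c:6f:16:bc:a8) does not appear among the keys, which matches what was seen in this run.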

Comment 9 Greg Kopels 2023-01-18 14:43:34 UTC
1. The new VLAN interface receives the ARP request:

sh-4.4# tcpdump -i vlan10 arp
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on vlan10, link-type EN10MB (Ethernet), capture size 262144 bytes
14:15:11.456822 ARP, Request who-has 10.46.56.131 (Broadcast) tell 10.46.56.131, length 46
14:15:11.456851 ARP, Reply 10.46.56.131 is-at 50:7c:6f:16:bd:98 (oui Unknown), length 46
14:15:12.557653 ARP, Request who-has 10.46.56.131 (Broadcast) tell 10.46.56.131, length 46

2. However, because it is configured with VLAN ID 10, it is not propagated across the switch.
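
For repeat verification, the ARP replies seen on the VLAN interface can be pulled out of a saved tcpdump capture. This is a minimal Python sketch, assuming the text format shown in the capture above (`ARP, Reply <ip> is-at <mac>`); the function name `arp_replies` is hypothetical.

```python
import re

# Matches the reply form printed by tcpdump in the capture above, e.g.:
#   ARP, Reply 10.46.56.131 is-at 50:7c:6f:16:bd:98 (oui Unknown), length 46
ARP_REPLY_RE = re.compile(
    r"ARP, Reply (?P<ip>\d+\.\d+\.\d+\.\d+) is-at (?P<mac>[0-9A-Fa-f:]{17})"
)


def arp_replies(tcpdump_text: str) -> list:
    """Return (ip, mac) pairs for every ARP reply line in the capture text."""
    return [
        (m.group("ip"), m.group("mac").lower())
        for m in ARP_REPLY_RE.finditer(tcpdump_text)
    ]


# Sample copied from the tcpdump run in this comment.
SAMPLE = """\
14:15:11.456822 ARP, Request who-has 10.46.56.131 (Broadcast) tell 10.46.56.131, length 46
14:15:11.456851 ARP, Reply 10.46.56.131 is-at 50:7c:6f:16:bd:98 (oui Unknown), length 46
14:15:12.557653 ARP, Request who-has 10.46.56.131 (Broadcast) tell 10.46.56.131, length 46
"""

print(arp_replies(SAMPLE))
```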

The bug fix is validated.