Bug 1391666

Summary: [configuration] RHEL 7.3 bad networking behaviour on Hyper-V
Product: Red Hat Enterprise Linux 7 Reporter: Zsolt Dudás <v-zsduda>
Component: NetworkManager Assignee: Lubomir Rintel <lrintel>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Desktop QE <desktop-qa-list>
Severity: medium Docs Contact:
Priority: medium    
Version: 7.3 CC: ailan, atragler, bgalvani, boyang, cavery, fgiudici, hhei, jingli, jopoulso, kys, ldu, leiwang, lrintel, mleitner, nmeier, rkhan, sukulkar, thaller, v-chvale, vkuznets, v-zsduda, xiaofwan, xuli, yacao
Target Milestone: rc Flags: bgalvani: needinfo? (v-zsduda)
Target Release: 7.3   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-07-12 13:03:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Zsolt Dudás 2016-11-03 17:25:51 UTC
Description of problem:
We observed unexpected behavior on RHEL 7.3 when multiple synthetic network adapters (NICs) are attached. It is easiest to test and confirm with VLAN tagging.
In this case the VM has 2 NICs attached, eth0 and eth1. If we set eth1's VLAN tag to 2, for example, the interface is still pingable from outside. However, if we set eth0's tag to 2, we get the result we would have expected for eth1: the VM is no longer pingable, except from adapters that have VLAN ID set to 2. It behaves the same way whether I route through eth0 or eth1.

Version-Release number of selected component (if applicable):
RHEL 7.3 RC2

How reproducible: 100%


Steps to Reproduce:
1. Create a RHEL 7.3 VM with 2 NICs
2. Set them up, check if they are correctly configured, check their MAC addresses
3. Ping the second NIC (eth1) from another VM
4. Set the second NIC's VLAN ID from Hyper-V Manager to 2, for example (Set-VMNetworkAdapterVlan -VMNetworkAdapter $secondNIC -Access -VlanID 2)
5. At this point ping should fail (Destination Host Unreachable), but it does not
6. Uncheck "Enable virtual LAN identification" on the second NIC (or Set-VMNetworkAdapterVlan -VMNetworkAdapter $secondNIC -Untagged)
7. Set the first NIC's (eth0) VLAN ID from Hyper-V Manager to 2, for example (Set-VMNetworkAdapterVlan -VMNetworkAdapter $firstNIC -Access -VlanID 2)
8. At this point ping fails, even though we are pinging the other interface, which is currently untagged
9. It works again only if we set the sender's VLAN ID to the same number

Actual results:
When I ping the IP of an interface whose adapter has a VLAN ID set from Hyper-V Manager, the interface is still reachable.

Expected results:
Ping should fail on the interface that has VLAN ID set on the associated synthetic network adapter. It should only be reachable from other adapters with the same VLAN ID.

Additional info:
Please note that rp_filter (reverse path filtering) is off.
We did not hit this issue in any earlier RHEL 7.x version; it first appeared in 7.3.
Throughout my testing the issue always showed up on the second interface in line; however, this might vary.
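
For anyone reproducing this, the following is a minimal sketch (not part of the original report) for recording the per-interface settings that usually matter when two NICs share a subnet. It assumes the standard procfs layout under /proc/sys/net/ipv4/conf; the function name show_arp_sysctls is hypothetical. Note that with the kernel default arp_ignore=0, Linux answers ARP requests for any local address on any interface, which may or may not be related to what is seen here.

```shell
#!/bin/sh
# Print rp_filter, arp_ignore and arp_announce for every interface.
# The optional first argument overrides the procfs directory, which is
# handy for testing; by default the live kernel settings are read.
show_arp_sysctls() {
    conf_dir=${1:-/proc/sys/net/ipv4/conf}
    for dir in "$conf_dir"/*/; do
        iface=$(basename "$dir")
        for key in rp_filter arp_ignore arp_announce; do
            # Skip keys that are missing or unreadable on this system.
            if [ -r "$dir$key" ]; then
                printf '%s %s=%s\n' "$iface" "$key" "$(cat "$dir$key")"
            fi
        done
    done
}
```

Calling show_arp_sysctls with no arguments on the guest and attaching the output would show whether the values differ between the 7.2 and 7.3 installations.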

Comment 1 Yaju Cao 2016-11-04 08:44:47 UTC
Hi, Zsolt

Do the 2 NICs in the VM use the same vSwitch, and do you ping from outside?

If yes, I don't think this is an issue. Since the 2 NICs are in the same subnet, routing can be ambiguous, and either NIC may be chosen.

In my test, below is the VM with 2 NICs:
--------------------------------------------------------------
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:15:5d:83:3b:33 brd ff:ff:ff:ff:ff:ff
    inet 10.73.131.111/23 brd 10.73.131.255 scope global dynamic eth0
       valid_lft 35557sec preferred_lft 35557sec
    inet6 2620:52:0:4982:215:5dff:fe83:3b33/64 scope global mngtmpaddr dynamic 
       valid_lft 2591999sec preferred_lft 604799sec
    inet6 fe80::215:5dff:fe83:3b33/64 scope link 
       valid_lft forever preferred_lft forever
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:15:5d:83:3b:4d brd ff:ff:ff:ff:ff:ff
    inet 10.73.131.202/23 brd 10.73.131.255 scope global dynamic eth1
       valid_lft 42921sec preferred_lft 42921sec
    inet6 2620:52:0:4982:215:5dff:fe83:3b4d/64 scope global mngtmpaddr dynamic 
       valid_lft 2591999sec preferred_lft 604799sec
    inet6 fe80::215:5dff:fe83:3b4d/64 scope link 
       valid_lft forever preferred_lft forever
------------------------------------------------------------

However, arping 10.73.131.202 (eth1) returns the MAC of the first NIC, eth0. So even if eth1 is set to VLAN 2 in Hyper-V Manager (or eth1 is brought down), eth0 still replies to pings for 10.73.131.202.

------------------------------------------------------------
[root@bootp-73-131-123 ~]# arping -I eth0 10.73.131.111
ARPING 10.73.131.111 from 10.73.131.123 eth0
Unicast reply from 10.73.131.111 [00:15:5D:83:3B:33]  0.707ms
Unicast reply from 10.73.131.111 [00:15:5D:83:3B:33]  0.663ms
Unicast reply from 10.73.131.111 [00:15:5D:83:3B:33]  0.627ms
Unicast reply from 10.73.131.111 [00:15:5D:83:3B:33]  0.650ms
^CSent 4 probes (1 broadcast(s))
Received 4 response(s)
[root@bootp-73-131-123 ~]# arping -I eth0 10.73.131.202
ARPING 10.73.131.202 from 10.73.131.123 eth0
Unicast reply from 10.73.131.202 [00:15:5D:83:3B:33]  0.669ms
Unicast reply from 10.73.131.202 [00:15:5D:83:3B:33]  0.650ms
Unicast reply from 10.73.131.202 [00:15:5D:83:3B:33]  0.635ms
-------------------------------------------------------------
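
The comparison above can be scripted. This is a rough sketch, assuming the iputils arping output format shown in this comment ("Unicast reply from IP [MAC]"); the helper names arping_macs and same_responder are made up for illustration.

```shell
#!/bin/sh
# Print the unique MAC addresses found in arping output read from stdin.
arping_macs() {
    awk '/reply from/ {
        # Field 5 looks like "[00:15:5D:83:3B:33]"; drop the brackets.
        print toupper(substr($5, 2, length($5) - 2))
    }' | sort -u
}

# Succeed when two saved arping outputs were answered by the same MAC.
same_responder() {
    mac_a=$(arping_macs < "$1")
    mac_b=$(arping_macs < "$2")
    [ -n "$mac_a" ] && [ "$mac_a" = "$mac_b" ]
}
```

For example, save the output of "arping -c 3 -I eth0 10.73.131.111" and "arping -c 3 -I eth0 10.73.131.202" to two files and pass both to same_responder; success means one NIC answers for both addresses.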

Also, I don't understand the reason for this configuration. What is the use case?

Comment 3 Zsolt Dudás 2016-11-08 15:35:30 UTC
Hi Yaju,

We are doing some networking tests in Hyper-V and one of the NICs is communicating with our testing framework while on the other one we do the actual tests. In this case we test VLAN Tagging on the second NIC, the first remains untouched as it shouldn't interrupt communication with the framework. They are both connected to the same vSwitch.
We ran our tests like this all the time and didn't have any issues from RHEL 5.2 to RHEL 7.2; 7.3 is the first version where we hit it.

Thanks,
Zsolt

Comment 4 Beniamino Galvani 2016-11-08 17:12:17 UTC
Hi,

can you please share your network configuration, i.e. the output of:

grep ^ /etc/sysconfig/network-scripts/ifcfg-*
ip addr
ip route

and, while ping is running, a capture of the traffic on the
interface which is wrongly responding to ping?
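
One possible way to take such a capture is sketched below. The tcpdump options used (-i, -w, -r, -e, -n) are standard; the function name, the DRY_RUN switch and the output file name are assumptions made for this example only.

```shell
#!/bin/sh
# Capture ARP and ICMP traffic for one address on one interface.
# Usage: capture_ping_traffic IFACE TARGET_IP   (needs root unless DRY_RUN=1)
capture_ping_traffic() {
    iface=$1
    target_ip=$2
    outfile="${iface}-ping.pcap"
    filter="arp or (icmp and host ${target_ip})"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Only print the command that would be run.
        printf "tcpdump -i %s -w %s '%s'\n" "$iface" "$outfile" "$filter"
    else
        # -w writes raw packets to a file that can be attached to the bug.
        tcpdump -i "$iface" -w "$outfile" "$filter"
    fi
}
```

Run it on the interface that answers unexpectedly while ping is running from the other VM, stop it with Ctrl-C, and review the result with "tcpdump -e -n -r eth1-ping.pcap" to see which MAC addresses appear in the replies.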

Comment 5 Yaju Cao 2016-11-09 05:46:22 UTC
(In reply to Zsolt Dudás from comment #3)
> Hi Yaju,
> 
> We are doing some networking tests in Hyper-v and one of the NICs is
> communicating with our testing framework while on the other one we do the
> actual tests. In this case we test VLAN Tagging on the second NIC, the first
> remains untouched as it shouldn't interrupt communication with the
> framework. They are both connected to the same vSwitch.

If this is for testing purposes, I suggest using a different vSwitch on a different subnet, or just adding a NIC attached to a private vSwitch.

> We ran our tests like this all the time and didn't have any issues from RHEL
> 5.2 to RHEL 7.2, 7.3 is the first version where we hit it.

I tried with RHEL 7.2, and it shows the same behavior as in this issue.

If you still consider this an issue, I think it may be caused by the Hyper-V host side: the vSwitch may not have cleared its MAC-IP (ARP) mapping table after you set the VLAN on the VM NIC. I am not sure whether it should do so; this is just my guess. You could compare the results on Hyper-V 2016 and Hyper-V 2012 R2 to see whether something changed on the vSwitch side.

> 
> Thanks,
> Zsolt

Comment 6 sushil kulkarni 2017-07-12 13:03:45 UTC
Hi,

I am closing this bug for now. If you have further information, please re-open.

Thanks,
Sushil