Bug 1067802 - ixgbe in SR-IOV mode does not respect unicast promiscuous mode in internal switch
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Nikolay Aleksandrov
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-02-21 05:53 UTC by David Gibson
Modified: 2018-12-06 15:55 UTC
CC List: 11 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2014-07-08 16:00:54 UTC
Target Upstream Version:
Embargoed:


Links
System                             ID      Private  Priority  Status  Summary  Last Updated
Red Hat Knowledge Base (Solution)  735403  0        None      None    None     Never

Description David Gibson 2014-02-21 05:53:35 UTC
Description of problem:

I believe this is a hardware/firmware bug, not a driver bug - feel free to add Intel to the case.

When 

Version-Release number of selected component (if applicable):

kernel-2.6.32-431.5.1.el6.x86_64 (but I don't think it's relevant)

04:00.0 Ethernet controller: Intel Corporation Ethernet 10G 2P X520 Adapter (rev 01)

Haven't figured out how to retrieve the NIC firmware version.

Steps to Reproduce:
1. Configure an ixgbe card with at least one virtual function
2. Attach the ixgbe's physical function to a Linux bridge, br0
3. Add a VLAN to br0, as br0.NNN (haven't yet verified if the problem also exists without VLANs)
4. Attach another bridge, brNNN to br0.NNN
5. Use libvirt to create a vm guest0, and assign a virtual function of the NIC to it with vlan set to NNN.
6. Create another vm, guest1, with a virtio virtual NIC bridged to the host's brNNN.
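
The host side of steps 1 to 4 can be sketched as a short script. This is a dry-run sketch that only prints the commands it would run. The PF name `eth2` and VLAN ID `100` are placeholders that do not come from the report, and it assumes VFs are created through the ixgbe `max_vfs` module parameter. Steps 5 and 6 (the libvirt guest definitions) are not covered.

```shell
#!/bin/sh
# Dry-run sketch of reproduction steps 1-4: prints commands instead of running them.
# PF and VLAN are placeholders, not values from the report.
PF=${PF:-eth2}
VLAN=${VLAN:-100}

repro_cmds() {
    # Step 1: load ixgbe with one virtual function
    # (the parameter only takes effect when the module is loaded).
    echo "modprobe ixgbe max_vfs=1"
    # Step 2: attach the physical function to bridge br0.
    echo "brctl addbr br0"
    echo "brctl addif br0 $PF"
    echo "ip link set $PF up"
    echo "ip link set br0 up"
    # Step 3: add the VLAN on top of br0.
    echo "ip link add link br0 name br0.$VLAN type vlan id $VLAN"
    echo "ip link set br0.$VLAN up"
    # Step 4: attach a second bridge, brNNN, to br0.NNN.
    echo "brctl addbr br$VLAN"
    echo "brctl addif br$VLAN br0.$VLAN"
    echo "ip link set br$VLAN up"
}

repro_cmds
```

Reviewing the printed commands before piping them to a root shell keeps the sketch safe to run on a machine without the hardware.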

Expected results:

Both guest0 and guest1 are on the same logical VLAN, by different methods.  They should have full virtual LAN connectivity between them.  Likewise they should have LAN connectivity with the host, which is also on the VLAN.

Actual results:

guest0 can ping the host via its VF NIC, but it cannot ping guest1.  tcpdump shows that guest0 successfully ARPs guest1, however its unicast ICMP packets thereafter never reach the host's br0, and therefore never make it into guest1.

Additional info:

Counter-intuitively, the ping will work if guest0 and guest1 are on different hosts, but otherwise identically configured.

In fact the behaviour matches the precise description given in the 82599 datasheet, §7.10.3.3.1.  The problem is that the loopback packet switching doesn't respect the unicast promiscuous mode bit set on the PF (because a bridge is attached).
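
The reported behaviour can be summarised as a small decision function. This is a toy model of what the report describes, not the 82599's actual switching logic. The function name `pf_receives` and both MAC addresses are hypothetical.

```shell
#!/bin/sh
# Toy model of the behaviour described in the report, NOT the 82599's real logic.
# pf_receives SRC DST_MAC PROMISC FILTER_MAC...
#   SRC:     "external" (frame from the wire) or "vf" (looped back from a local VF)
#   PROMISC: 1 if unicast promiscuous mode is set on the PF, else 0
# Prints "yes" if the PF would see the frame, "no" otherwise.
pf_receives() {
    src=$1 dst=$2 promisc=$3
    shift 3
    # An exact match in the PF's unicast MAC filter always delivers the frame.
    for mac in "$@"; do
        if [ "$mac" = "$dst" ]; then
            echo yes
            return 0
        fi
    done
    # Promiscuous mode only helps frames arriving from the wire;
    # the internal loopback path ignores it.
    if [ "$src" = "external" ] && [ "$promisc" -eq 1 ]; then
        echo yes
        return 0
    fi
    echo no
}

HOST_MAC=00:1b:21:00:00:01     # hypothetical PF address
GUEST1_MAC=52:54:00:00:00:01   # hypothetical guest1 address on the bridge

pf_receives external "$GUEST1_MAC" 1 "$HOST_MAC"                # guest0 on another host: yes
pf_receives vf       "$GUEST1_MAC" 1 "$HOST_MAC"                # guest0 on the same host: no
pf_receives vf       "$GUEST1_MAC" 1 "$HOST_MAC" "$GUEST1_MAC"  # guest1's MAC in the filter: yes
```

The second call is the failing case from the report: promiscuous mode is set, yet a frame looped back from a local VF is dropped unless its destination is in the PF's MAC filter.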

Comment 1 Andy Gospodarek 2014-02-21 19:51:05 UTC
I suspect the VLAN spoof checking that exists in the hardware (and is enabled by default in RHEL6) is the issue here.  Unfortunately there is not an easy way to disable this for individual VFs.

Do you have a local reproducer for this?  If so, you could consider this patch as a test to see if this resolves the issue:

diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 6c449e7..87a1e3a 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -3348,7 +3348,7 @@ static void ixgbe_configure_virtualization(struct ixgbe_adapter *adapter)
        /* Enable MAC Anti-Spoofing */
        hw->mac.ops.set_mac_anti_spoofing(hw,
                                          (adapter->antispoofing_enabled =
-                                          (adapter->num_vfs != 0)),
+                                          false),
                                          adapter->num_vfs);
 }

Comment 5 David Gibson 2014-02-23 23:24:35 UTC
Andy,

* Yes, I have a local reproducer, although that's on borrowed hardware that I might not have for much longer.

* I don't think this is plausibly caused by the anti-spoofing:
    - guest0 (with the VF) isn't trying to send or receive packets for any VLAN it's not assigned to, nor any MAC other than the one it's assigned.
    - As noted, guest0 *can* ping guest1 if it's located on a different host.  So, the NIC is willing to send the packets out the physical interface, and send the replies from an external source back to the VF.  It just doesn't loop back the unicast packets from the VF to the PF.
    - The unicast packets that aren't getting through *are* addressed to a MAC that isn't the host's normal MAC (it's the MAC of guest1 on the bridge).  So they would usually be filtered, but the unicast promiscuous mode bit (which the bridge code enables) should allow them to come through.
    - Again, since this works with guest1 on another host, the card correctly receives the guest1 destined packets on the PF if they come from externally, just not if they come from a local VF.

* I'll attempt to test your patch anyway, but as above, I don't think it will help.

Comment 6 David Gibson 2014-02-24 23:46:36 UTC
There is a discussion on what appears to be the same problem at https://communities.intel.com/thread/38613

However, it doesn't quite make sense to me.  It implies this is a limitation in the driver which has been fixed upstream; however, as described above, the real problem seems to be that the internal switch routing logic in the firmware doesn't check the unicast promiscuous mode of individual VFs or PFs.

Reading that I can think of two possible workarounds:
 1) Put the NIC into VEPA mode, if you have a VEPA-capable switch.  The driver will then tell the NIC to make no attempt to forward packets between VFs; instead, all packets will go to the external switch, which is expected to hairpin them back.

AFAICT VEPA mode is available upstream, but not in RHEL 6.5

 2) Manually add MAC addresses for any interfaces on the Linux bridge to the MAC filter on the PF.

AFAICT there isn't a userspace way of doing this, however, in either upstream or RHEL.
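
For workaround (1), a kernel and iproute2 that support it can select the NIC's embedded-switch mode through `bridge link`. A minimal dry-run sketch, assuming the PF is named `eth2` (a placeholder) and that the ixgbe driver in use honours the setting. As noted above, this is not expected to be available on RHEL 6.5. The helper name `hwmode_cmd` is hypothetical.

```shell
#!/bin/sh
# Dry-run sketch: print the iproute2 command that selects the NIC's embedded-switch mode.
# "vepa" sends all VF traffic to the external switch; "veb" switches it inside the NIC.
hwmode_cmd() {
    dev=$1 mode=$2
    case "$mode" in
        vepa|veb) echo "bridge link set dev $dev hwmode $mode" ;;
        *) echo "unknown mode: $mode (expected vepa or veb)" >&2; return 1 ;;
    esac
}

hwmode_cmd eth2 vepa   # eth2 is a placeholder PF name
```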


The parts of that thread about manually adding VF addresses to the bridge forwarding database make no sense to me - the problem isn't the Linux bridge forwarding to the wrong places, it's that the ixgbe internal switch doesn't present unicast packets from the VF to the PF unless they match the PF's MAC, even though it is in promiscuous mode.

Comment 7 David Gibson 2014-02-25 00:00:27 UTC
My mistake, there is a way of using workaround (2) above in RHEL.

Assuming the ixgbe PF is ethX, then for each MAC address XX:XX:XX:XX:XX:XX used by a VM on the Linux bridge, run:
        # ip link add macXXXXXXXXXXXX link ethX type macvlan
        # ip link set macXXXXXXXXXXXX address XX:XX:XX:XX:XX:XX up

This forces the VM's MAC address onto the PF's unicast MAC filter, allowing packets from the VF to be received by the PF and thereby forwarded onto the bridged VM.
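
With several VMs on the bridge, the two commands above can be generated per MAC address. A minimal dry-run sketch: the helper name `macvlan_cmds`, the PF name `eth2`, and the example MAC addresses are placeholders, and the function only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Dry-run sketch: print the two workaround commands for each VM MAC address.
# Usage: macvlan_cmds PF MAC...   (PF name and MACs below are placeholders)
macvlan_cmds() {
    pf=$1
    shift
    for mac in "$@"; do
        # Interface name: "mac" + 12 hex digits = 15 characters, the longest
        # interface name Linux accepts.
        name="mac$(echo "$mac" | tr -d ':' | tr 'A-F' 'a-f')"
        echo "ip link add $name link $pf type macvlan"
        echo "ip link set $name address $mac up"
    done
}

macvlan_cmds eth2 52:54:00:12:34:56 52:54:00:ab:cd:ef
```

Deriving the interface name from the MAC address keeps each macvlan device traceable to the VM it was created for.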

Comment 8 John Ronciak 2014-03-21 17:49:16 UTC
Exactly what "firmware" are you talking about?  The Intel 10 gig HW does not have firmware.  So I can't follow what you are saying about this issue.

Comment 9 David Gibson 2014-03-26 05:02:40 UTC
>  The Intel 10 gig HW does not have firmware.

Really?  I find it hard to believe all the card's complex features are implemented without any firmware at all.

But I guess if the firmware isn't updatable, then there's nothing we can do.

Do you have any idea what sort of driver side fix was envisaged in the discussion at https://communities.intel.com/thread/38613?

As far as I can tell this is a hardware erratum, and the only way to fix it in the driver is with a very ugly workaround to detect when the interface is bridged, monitor MACs learned by the bridge and automatically add them to the hardware's MAC filter.

Comment 10 Vlad Yasevich 2014-03-26 13:23:36 UTC
(In reply to David Gibson from comment #6)
> 
> The parts of that thread about manually adding VF addresses to the bridge
> forwarding database make no sense to me - the problem isn't the Linux bridge
> forwarding to the wrong places, it's that the ixgbe internal switch doesn't
> present unicast packets from the VF to the PF unless they match the PF's
> MAC, even though it is in promiscuous mode.

I can shed some light on this for you.  The above workaround has to do with
how the /sbin/bridge command operates.  It has 2 modes of operation:
  1) Operate on a master device
  2) Operate on the specified device itself.

The default mode of operation is 2 (device itself).  So when you issue a command:
   bridge fdb add XX:XX:XX:XX:XX:XX dev eth0

you are actually adding the MAC address to the eth0 MAC filter table, similar to
what macvlan does.  This happens even if eth0 is a bridge port.
This functionality is not in RHEL 6.

-vlad
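
The two modes described in comment 10 correspond to the `self` and `master` keywords of `bridge fdb`. A minimal dry-run sketch that makes the default explicit: the helper name `fdb_cmd` and the MAC address are placeholders, and per the comment above the command is not available on RHEL 6.

```shell
#!/bin/sh
# Dry-run sketch of the two /sbin/bridge modes from comment 10.
# fdb_cmd DEV MAC [self|master]; the mode defaults to "self", as bridge itself does.
fdb_cmd() {
    dev=$1 mac=$2 mode=${3:-self}
    case "$mode" in
        self)   echo "bridge fdb add $mac dev $dev self" ;;    # the device's own MAC filter table
        master) echo "bridge fdb add $mac dev $dev master" ;;  # forwarding database of the master bridge
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

fdb_cmd eth0 52:54:00:12:34:56          # placeholder MAC, default "self" mode
fdb_cmd eth0 52:54:00:12:34:56 master
```

Only the `self` form touches the NIC's MAC filter, which is why it has the same effect as the macvlan workaround.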

Comment 11 David Gibson 2014-03-27 00:52:40 UTC
Vlad, thanks for that clarification.

Andy,

As noted in passing in c#9, I think it's at least theoretically possible to make this an automated workaround in the driver, by monitoring the bridge forwarding db and adjusting the MAC filter accordingly.

That approach would certainly be ugly, but it's the only way I can see to work around this hardware bug.  Does that method seem at all feasible to you?

Comment 12 Vlad Yasevich 2014-03-27 00:59:12 UTC
(In reply to David Gibson from comment #11)
> Vlad, thanks for that clarification.
> 
> Andy,
> 
> As noted in passing in c#9, I think it's at least theoretically possible to
> make this an automated workaround in the driver, by monitoring the bridge
> forwarding db and adjusting the MAC filter accordingly.
> 
> That approach would certainly be ugly, but it's the only way I can see to
> work around this hardware bug.  Does that method seem at all feasible to you?

Hi David

This approach has been considered upstream and rejected.  The currently proposed solution involves libvirt monitoring the configuration of the guest and programming things appropriately.  That's been deferred until 7.1.

-vlad

Comment 13 Terry Bowling 2014-06-28 12:12:59 UTC
What is the status of this related to Vlad's comment 12?  Is there another BZ we should track for this libvirt monitoring?  Is it still on track for 7.1?  Do we need this BZ updated to track 7.1?

Comment 14 Nikolay Aleksandrov 2014-07-08 08:44:44 UTC
After talking to Vlad, he pointed me at these two bugzillas:
https://bugzilla.redhat.com/show_bug.cgi?id=896669 (kernel)
https://bugzilla.redhat.com/show_bug.cgi?id=1099210 (user-space)

The kernel fixes for this have been rejected upstream, so we're left relying on the user-space fix.  I therefore propose to close this bugzilla as WONTFIX and to continue monitoring the user-space fix.

What do you think?

Comment 15 John Ronciak 2014-07-08 15:56:24 UTC
Is there another choice?  I don't think there is.

Comment 16 Nikolay Aleksandrov 2014-07-08 16:00:54 UTC
Closing as WONTFIX and going forward with the user-space solution mentioned in comment #14.

