Description of problem:
We have deployed a Wireshark monitoring server on the latest F19. Traffic is mirrored from multiple sources, and the receiving interfaces are bonded into the "MONI" interface. Furthermore, the switches performing the mirroring show the 802.1Q tags in one direction but not in the other, so when we need capture filters we launch Wireshark like this:

  export FILTER="net 10.1.1.1/28"
  wireshark -i MONI -k -f "$FILTER or ( vlan and $FILTER )"

It quickly became apparent that some packets with an 802.1Q VLAN tag were no longer captured, as the SCTP packets we are interested in were visible in one direction only. When capturing without filters, we can see some other 802.1Q packets, but not the packets we are interested in.

We use kernel-PAE-3.10.9-200.fc19.i686. Everything is back to normal and works correctly when I execute the following and reboot:

  rpm -i kernel-PAE-3.9.5-301.fc19.i686.rpm --oldpackage

(kernel-3.9.5-301.fc19.i686.rpm is also OK.)

I presume libpcap-1.4.0-1.fc19.i686 and kernel-PAE-3.10.9-200.fc19.i686 have some kind of incompatibility, but I don't know where to begin to analyse this issue more deeply.

Thanks for your help,
Regards,
Yann.

Version-Release number of selected component (if applicable):
  # rpm -q kernel-PAE
  kernel-PAE-3.10.9-200.fc19.i686
  kernel-PAE-3.9.5-301.fc19.i686
  # rpm -q libpcap
  libpcap-1.4.0-1.fc19.i686

How reproducible:
I don't know how to reproduce this outside of our lab.

Steps to Reproduce:
1. Boot on kernel-PAE-3.10.9-200.fc19.i686 and launch wireshark -i MONI -k -f "$FILTER or ( vlan and $FILTER )" as usual.

Actual results:
-> some particular SCTP 802.1Q packets are not captured

Expected results:
-> all packets are captured, as when we boot on kernel-PAE-3.9.5-301.fc19.i686

Additional info:
As a workaround, it is possible to create all possible VLANs on the machine:

  for i in {0..4094}; do vconfig add MONI $i ; done

After that, however, the ifconfig command is rather slow to display the configuration. It looks as if the interface is not in full promiscuous mode: the kernel only lets a packet be captured if the corresponding VLAN has been created.
Hi Yann, could you please share the pcap of one of those problematic packets that are not captured via 3.10.9 but are captured via 3.9.5 ? Do they have vlan prio bits set? thanks, Michele
Created attachment 793507
capture with the 3.9.5 kernel

In this capture with the 3.9.5 kernel, the 802.1Q packets do not have the VLAN prio bits set.
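For anyone checking their own captures for the prio bits asked about above, here is a minimal sketch in C of how the 802.1Q tag fields can be read from a raw Ethernet frame. It assumes the standard single-tag layout (TPID 0x8100 at bytes 12-13, TCI at bytes 14-15). The names `vlan_tag`, `parse_vlan_tag` and `ETH_TPID_8021Q` are made up for this illustration and do not come from the kernel, libpcap or Wireshark.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name for the 802.1Q Tag Protocol Identifier value. */
#define ETH_TPID_8021Q 0x8100u

/* Hypothetical container for the decoded tag fields. */
struct vlan_tag {
    int      present; /* 1 if the frame carries an 802.1Q tag, else 0 */
    uint8_t  pcp;     /* priority code point ("prio bits"), 3 bits */
    uint8_t  dei;     /* drop eligible indicator, 1 bit */
    uint16_t vid;     /* VLAN identifier, 12 bits */
};

/* Decode the outer 802.1Q tag of an Ethernet frame, if there is one. */
struct vlan_tag parse_vlan_tag(const uint8_t *frame, size_t len)
{
    struct vlan_tag tag = {0, 0, 0, 0};

    /* Need 2 x 6 MAC bytes + 2 TPID bytes + 2 TCI bytes. */
    if (frame == NULL || len < 16)
        return tag;

    uint16_t tpid = (uint16_t)((frame[12] << 8) | frame[13]);
    if (tpid != ETH_TPID_8021Q)
        return tag; /* untagged frame */

    uint16_t tci = (uint16_t)((frame[14] << 8) | frame[15]);
    tag.present = 1;
    tag.pcp = (uint8_t)(tci >> 13);
    tag.dei = (uint8_t)((tci >> 12) & 0x1);
    tag.vid = (uint16_t)(tci & 0x0FFF);
    return tag;
}
```

A frame whose bytes 12-15 are 81 00 00 64, for instance, decodes as tagged with VLAN 100 and prio bits 0, which is the "prio bits not set" case described for the attached capture.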
Hi,

Below are a few new clues:

  last OK kernel version: 3.9.11
  first NOK kernel version: 3.10.1

The MONI interface is a bond of 6 interfaces on a TIGW1U server: 2 with the e1000 driver + 4 with the igb driver.

  packets on e1000 -> OK, 802.1Q packets captured
  packets on igb -> NOK, 802.1Q packets captured only if the VLAN is created

I tried to compile a 3.10.1 kernel with drivers/net/ethernet/intel/igb taken from 3.9.11, but I failed.

The bonding subsystem is probably not involved, as capturing without any bonding gives the same results.

Regards,
Yann.
Erratum: it is OK with the e1000e driver (not e1000).
Comparing 3.9.11 and 3.10.1, and thanks to the code comments, I tried removing some code about "VT mode". That restored the OK situation (802.1Q packets are captured with the igb driver):

--- linux-3.10.1/drivers/net/ethernet/intel/igb/igb_main.c
+++ linux-3.10.1.yBO/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3738,8 +3738,8 @@ static void igb_set_rx_mode(struct net_d
 	if (netdev->flags & IFF_PROMISC) {
 		u32 mrqc = rd32(E1000_MRQC);
 		/* retain VLAN HW filtering if in VT mode */
-		if (mrqc & E1000_MRQC_ENABLE_VMDQ)
-			rctl |= E1000_RCTL_VFE;
 		rctl |= (E1000_RCTL_UPE | E1000_RCTL_MPE);
 		vmolr |= (E1000_VMOLR_ROPE | E1000_VMOLR_MPME);
 	} else {

Sadly, I don't even understand why it worked or what the side effects could be ...
Hi Yann,

thanks for the additional info. So the commit that brought in this change in behaviour is the following:

commit 6f3dc319ec5c101e1e927e55d593ad6637648fe5
Author: Greg Rose <gregory.v.rose>
Date:   Tue Mar 26 06:19:41 2013 +0000

    igb: Retain HW VLAN filtering while in promiscuous + VT mode

    When using the new bridge FDB interface to allow SR-IOV virtual function
    network devices to communicate with SW bridged network devices the
    physical function is placed into promiscuous mode and hardware VLAN
    filtering is disabled. This defeats the ability to use VLAN tagging to
    isolate user networks. When the device is in promiscuous mode and VT
    mode simultaneously ensure that VLAN hardware filtering remains enabled.

    Signed-off-by: Greg Rose <gregory.v.rose>
    Tested-by: Sibai Li <sibai.li>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher>

Ideally we can come up with a small, contained reproducer (maybe injecting some traffic via http://netsniff-ng.org/ and collecting it via tshark) to demonstrate the issue. If it's too complex we can just try to ping e1000-devel.

I'll see if I can give it a shot in the next few days.

cheers,
Michele
Unfortunately, replaying packets locally with netsniff-ng does not reproduce the issue. However, I can update the "How reproducible:" section using a secondary PC.

How reproducible:
On a server (for example a TIGW1U) with some network interfaces using the igb driver (Intel Corporation Gigabit VT Quad Port Server Adapter) and kernel 3.10.1, launch a capture, for example:

  tcpdump -i p6p6 "vlan and host 10.32.112.169"

Plug an Ethernet cable from interface p6p6 to a PC, and on the PC replay a .pcap containing some 802.1Q packets:

  netsniff-ng --in /tmp/802.1Q.pcap --out enp0s25

Expected result: 802.1Q packets are captured
Actual result: 802.1Q packets are not captured

Possible workaround 1 -> fall back to the 3.9.11 kernel
Possible workaround 2 -> capture on an interface that does not use the igb driver (e1000e is OK)
Possible workaround 3 -> create all possible VLANs ( for i in {0..4094}; do vconfig add p6p6 $i ; done )
Possible workaround 4 -> recompile the kernel with the patch below:

--- linux-3.10.1/drivers/net/ethernet/intel/igb/igb_main.c
+++ linux-3.10.1.new/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3738,8 +3738,8 @@ static void igb_set_rx_mode(struct net_d
 	if (netdev->flags & IFF_PROMISC) {
 		u32 mrqc = rd32(E1000_MRQC);
 		/* retain VLAN HW filtering if in VT mode */
-		if (mrqc & E1000_MRQC_ENABLE_VMDQ)
-			rctl |= E1000_RCTL_VFE;
 		rctl |= (E1000_RCTL_UPE | E1000_RCTL_MPE);
 		vmolr |= (E1000_VMOLR_ROPE | E1000_VMOLR_MPME);
 	} else {
Created attachment 794046
802.1Q pcap for replaying with netsniff-ng (used in "How reproducible:" in comment #8)
Thanks Yann, that's perfect. I'll raise it to e1000-devel in the next days. Will keep you posted.
Hi Yann,

never mind. No need to harass e1000-devel. This has already been fixed upstream:

commit 7e44892c1b6bb499cb2f6d5c0f4afcc077a26074
Author: Emil Tantilov <emil.s.tantilov>
Date:   Fri Jul 26 05:46:36 2013 -0700

    igb: fix vlan filtering in promisc mode when not in VT mode

    This patch fixes a VT mode check to make sure VLAN filters are disabled
    when in promisc mode and VT is not enabled.

    The problem with the previous check was that:
    E1000_MRQC_ENABLE_VMDQ is defined as 0x00000003
    but when not in VT mode:
    mrqc |= E1000_MRQC_ENABLE_RSS_4Q (0x00000002)

    So the above check will trigger regardless if VT mode is being used or not.

    Signed-off-by: Emil Tantilov <emil.s.tantilov>
    Tested-by: Aaron Brown <aaron.f.brown>
    Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher>
    Signed-off-by: David S. Miller <davem>

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index 6a0c1b6..c1d72c0 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3739,9 +3739,8 @@ static void igb_set_rx_mode(struct net_device *netdev)
 	rctl &= ~(E1000_RCTL_UPE | E1000_RCTL_MPE | E1000_RCTL_VFE);
 
 	if (netdev->flags & IFF_PROMISC) {
-		u32 mrqc = rd32(E1000_MRQC);
 		/* retain VLAN HW filtering if in VT mode */
-		if (mrqc & E1000_MRQC_ENABLE_VMDQ)
+		if (adapter->vfs_allocated_count)
 			rctl |= E1000_RCTL_VFE;
 		rctl |= (E1000_RCTL_UPE | E1000_RCTL_MPE);
 		vmolr |= (E1000_VMOLR_ROPE | E1000_VMOLR_MPME);

I don't think it's really material for stable (aka 3.10.x), so either you move to 3.11/3.9 or someone from the Fedora kernel maintainers includes this one in an update.

cheers,
Michele
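The mask overlap described in that commit message can be illustrated with a small standalone sketch in C. Only the two constant values are taken from the commit message. The function names are hypothetical, and this is a simplified model of the two checks, not the driver code itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as quoted in the commit message. */
#define E1000_MRQC_ENABLE_VMDQ   0x00000003u
#define E1000_MRQC_ENABLE_RSS_4Q 0x00000002u

/*
 * Model of the 3.10.x check (hypothetical function name): VLAN hardware
 * filtering is retained whenever the MRQC value shares ANY bit with the
 * VMDQ value. RSS_4Q (0x2) shares bit 1 with VMDQ (0x3), so the test also
 * passes when the adapter is not in VT mode.
 */
bool old_check_retains_vlan_filter(uint32_t mrqc)
{
    return (mrqc & E1000_MRQC_ENABLE_VMDQ) != 0;
}

/*
 * Model of the fixed check (hypothetical function name): VLAN hardware
 * filtering is retained only when virtual functions are allocated.
 */
bool new_check_retains_vlan_filter(unsigned int vfs_allocated_count)
{
    return vfs_allocated_count != 0;
}
```

Calling old_check_retains_vlan_filter(E1000_MRQC_ENABLE_RSS_4Q) returns true, so under this model a promiscuous igb port without any VFs keeps VLAN filtering on, which matches the symptom that tagged packets are only seen once the VLAN is created. new_check_retains_vlan_filter(0) returns false for the same setup.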
We can look at grabbing that soon.
we moved to 3.9.11 and issue is solved, thanks for your help.
Fedora 19 has been rebased to 3.11.1 in git. An update should make it out with the patch mentioned in comment #11 soon.
Hi, all OK running 3.11.1-200.fc19.i686.PAE, can be closed, thanks a lot
Thank you.