Description of problem: Though removing vlan ether device, original ether's status is keeping PROMISC MULTICAST mode. How reproducible: Always Steps to Reproduce: 1.Create new vlan ether. # vconfig etadd eth2 100 # ifconfig eth2.100 eth2.100 Link encap:Ethernet HWaddr 00:00:87:E2:D4:02 BROADCAST MULTICAST MTU:1500 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 b) TX bytes:0 (0.0 b) # cat /sys/class/net/eth2/flags 0x1003 2.Change new ether's MAC address. # ifconfig eth2.100 hw ether 22.:22:22:44:44:44 # ifconfig eth2.100 eth2.100 Link encap:Ethernet HWaddr 22:22:22:44:44:44 BROADCAST MULTICAST MTU:1500 Metric:1 RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:0 (0.0 b) TX bytes:0 (0.0 b) 3.Check eth2's status. # ifconfig eth2 eth2 Link encap:Ethernet HWaddr 00:00:87:E2:D4:02 UP BROADCAST RUNNING PROMISC MULTICAST MTU:1500 Metric:1 <<< RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 b) TX bytes:0 (0.0 b) Memory:80100000-80120000 # cat /sys/class/net/eth2.100/flags 0x1002 # cat /sys/calass/net/eth2/flags 0x1103 <<< PROMISC MULTICAST BIT ON 4.Delete vlan ether # vconfig rem eth2.100 # ifconfig eth2. eth2 Link encap:Ethernet HWaddr 00:00:87:E2:D4:02 UP BROADCAST RUNNING PROMISC MULTICAST MTU:1500 Metric:1 << STILL PROMISC MULTICAST RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 b) TX bytes:0 (0.0 b) Memory:80100000-80120000 # cat /proc/sys/class/net/bond/eth2/flags 0x1103 <<< # ifconfig eth2 -rpromisc # ifconfig eth2 eth2 Link encap:Ethernet HWaddr 00:00:87:E2:D4:02 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 <<< RELEASE PROMISC MULTICAST RX packets:0 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 b) TX bytes:0 (0.0 b) Memory:80100000-80120000 # cat /sys/class/net/eth2/flags 0x1103 <<< STILL PROMISC MULTICAST Actual results: Not release PROMISC MULTICAST Expected results: Release PROMISC MULTICAST Environment: e1000: eth2: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None 802.1Q VLAN Support v1.8 Ben Greear <greearb> Additional info: Our problem have already fixed on current upstream kernel. --- snip --- [VLAN]: Fix MAC address handling The VLAN MAC address handling is broken in multiple ways. When the address differs when setting it, the real device is put in promiscous mode twice, but never taken out again. Additionally it doesn't resync when the real device's address is changed and needlessly puts it in promiscous mode when the vlan device is still down. Fix by moving address handling to vlan_dev_open/vlan_dev_stop and properly deal with address changes in the device notifier. Also switch to dev_unicast_add (which needs the exact same handling). Since the set_mac_address handler is identical to the generic ethernet one with these changes, kill it and use ether_setup(). http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=8c979c26a0f093c13290320edda799d8335e50ae
4. State specific action requested of SEG I have looked in the git. Below is upstream patch requested by the vendor: 8c979c26a0f093c13290320edda799d8335e50ae Apparently this may need some more patches as: (vlan_group_get_device) 5c15bdec5c38f4ccf73ef2585fc80a6164de9554 (dev_unicast_delete/dev_unicast_add) 4417da668c0021903464f92db278ddae348e0299 Issue escalated to Support Engineering Group by: tumeya.
Created attachment 329816 [details] 5c15bdec5c38f4ccf73ef2585fc80a6164de9554.patch
Created attachment 329817 [details] 8c979c26a0f093c13290320edda799d8335e50ae.patch
Created attachment 329818 [details] 4417da668c0021903464f92db278ddae348e0299.patch
Created attachment 329819 [details] patch by customer based on kernel-2.6.18-92.1.6.el5.ia64
Updating PM score.
I'll try this myself on monday, but just in case you already have test results, is the problem reproducible upstream (or on an F10/F11 system)?
I'm going to try this on an upstream kernel to see if there is already a fix in existance that we can use. I should note however, that the state change on the promisc bit is a bit suspect, but the multicast bit should not have any relevance to this problem. All the multicast bit indicates is that the interface can send and recieve multicast frames, and for hardware that supports it, that flag should be set automatically, and only cleared administratively .
Created attachment 344499 [details] patch to track and undo promisc count on vlans Ok, looks like upstream fixed this via a pretty big rewrite on how we handle unicast lists. Can you try this patch, it seems somewhat more consice to me. I've not tested it yet, but it should work just fine. Also, I noticed in the reprouder given, the customer tried to undo the promisc flag by setting -rpromisc. Not sure if that was just a typo, but it should be -promisc instead. Either way, if lots of vlans are being used, it would still be broken, as the promisc count will be off (you would have to issue that command several times to clear the flag)
Created attachment 344731 [details] vlan-with-fixes-1.patch I did try fedora 10 running 2.6.10-rc6 and it worked fine. Actually, because of vlan_dev_set_mac_address() uses dev_unicast_*() it doesn't set promisc anymore. I also tried -128.el5 and could see the problem happening then I applied your patch with two extra changes. One is a simple typo and another is moving 'VLAN_DEV_INFO(dev)->promisc_count * -1;' to inside of if clause, otherwise dev is NULL and it would oops. ... struct net_device *dev = NULL; int ret; - + int undo_count = VLAN_DEV_INFO(dev)->promisc_count * -1; dev = dev_get_by_name(vlan_IF_name); ... modified patch is attached in case you need to check something. Anyway, it didn't work because on -128.el5 does: vlan_dev_set_mac_address() /* Increment our in-use promiscuity counter */ dev_set_promiscuity(VLAN_DEV_INFO(dev)->real_dev, 1); <-- increments 1 VLAN_DEV_INFO(dev)->promisc_count++; <-- patch counts that flgs |= IFF_PROMISC; dev_change_flags(VLAN_DEV_INFO(dev)->real_dev, flgs); which does: if ((flags ^ dev->gflags) & IFF_PROMISC) { int inc = (flags & IFF_PROMISC) ? +1 : -1; dev->gflags ^= IFF_PROMISC; dev_set_promiscuity(dev, inc); <--- increments another one. } so, the counter ends with 2. Removing vlan device leaves the counter with 1. Flavio
Ok, so you identified a spot I missed, thats great. Did you test with the obvious fix for the missed spot, and did it work? :)
It will work regarding with dev->flags but I'm not finding how it will work with dev->gflags. vlan_ioctl_handler() unregister_vlan_device() dev_set_promiscuity() <-- flags only. Another thing, changing macaddr will set promisc, then if you change the macaddr back to the real_dev's macaddr, it will leave promisc flag set on both flags and gflags too. See vlan_dev_set_mac_address(). Flavio
yes, you found another problem. You took the time to analyze it. Thank you, again now, please take the extra 5 minutes to fix it. You should just need to check the promiscuity value of the physical dev after chaning its promiscuity and change gflags if it returns to zero.
Created attachment 345357 [details] vlan patch with gflags fixed Another patch. On vlan_dev_set_mac_address() if there is no IFF_PROMISC in flags, call dev_change_flags() and set gflags if needed. Otherwise assume flags and gflags are okay and just increment one. On unregister_vlan_device() undo vlan promisc changes and if flags has left without IFF_PROMISC takes it out from gflags too. works here. Flavio
Thank you, yes. This looks good to me. I'll post it shortly. Thank you for fixing up those missing bits.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
in kernel-2.6.18-157.el5 You can download this test kernel from http://people.redhat.com/dzickus/el5 Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However feel free to provide a comment indicating that this fix has been verified.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on therefore solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2009-1243.html