Bug 481283 - [RHEL5.3] Original ether's status is keeping PROMISC MULTICAST mode
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.3
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assigned To: Neil Horman
QA Contact: Red Hat Kernel QE team
Depends On:
Blocks: 483701
Reported: 2009-01-23 07:33 EST by Flavio Leitner
Modified: 2011-05-27 13:03 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 481292 (view as bug list)
Environment:
Last Closed: 2009-09-02 04:10:15 EDT


Attachments
5c15bdec5c38f4ccf73ef2585fc80a6164de9554.patch (23.07 KB, patch)
2009-01-23 07:37 EST, Flavio Leitner
8c979c26a0f093c13290320edda799d8335e50ae.patch (7.67 KB, patch)
2009-01-23 07:38 EST, Flavio Leitner
4417da668c0021903464f92db278ddae348e0299.patch (9.46 KB, patch)
2009-01-23 07:38 EST, Flavio Leitner
patch by customer based on kernel-2.6.18-92.1.6.el5.ia64 (6.93 KB, patch)
2009-01-23 07:41 EST, Flavio Leitner
patch to track and undo promisc count on vlans (2.11 KB, patch)
2009-05-18 14:49 EDT, Neil Horman
vlan-with-fixes-1.patch (2.30 KB, patch)
2009-05-19 21:20 EDT, Flavio Leitner
vlan patch with gflags fixed (3.35 KB, patch)
2009-05-25 17:14 EDT, Flavio Leitner

Description Flavio Leitner 2009-01-23 07:33:32 EST
Description of problem:
Even after the VLAN Ethernet device is removed,
the original Ethernet device stays in PROMISC MULTICAST mode.

How reproducible:
Always

Steps to Reproduce:
1. Create a new VLAN Ethernet device.
# vconfig add eth2 100
# ifconfig eth2.100
eth2.100  Link encap:Ethernet  HWaddr 00:00:87:E2:D4:02
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
# cat /sys/class/net/eth2/flags
0x1003

2. Change the new device's MAC address.
# ifconfig eth2.100 hw ether 22:22:22:44:44:44
# ifconfig eth2.100
eth2.100  Link encap:Ethernet  HWaddr 22:22:22:44:44:44  
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

3. Check eth2's status.
# ifconfig eth2
eth2      Link encap:Ethernet  HWaddr 00:00:87:E2:D4:02  
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1 <<<
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Memory:80100000-80120000
# cat /sys/class/net/eth2.100/flags
0x1002
# cat /sys/class/net/eth2/flags
0x1103  <<< PROMISC MULTICAST BIT ON

4. Delete the VLAN Ethernet device.
# vconfig rem eth2.100
# ifconfig eth2
eth2      Link encap:Ethernet  HWaddr 00:00:87:E2:D4:02  
          UP BROADCAST RUNNING PROMISC MULTICAST  MTU:1500  Metric:1 << STILL PROMISC MULTICAST
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Memory:80100000-80120000
# cat /sys/class/net/eth2/flags
0x1103 <<<
# ifconfig eth2 -rpromisc
# ifconfig eth2
eth2      Link encap:Ethernet  HWaddr 00:00:87:E2:D4:02  
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1  <<< RELEASE PROMISC MULTICAST
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Memory:80100000-80120000

# cat /sys/class/net/eth2/flags
0x1103         <<< STILL PROMISC MULTICAST
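
The hexadecimal values read from /sys/class/net/<dev>/flags in the steps above can be decoded by hand. The following is a minimal sketch, not part of the original report, that maps the four bits relevant here to their names. The bit values are the ones defined in <linux/if.h>; the helper name decode_flags() is made up for this illustration.

```c
#include <string.h>

/* Interface flag bits, with the values defined in <linux/if.h>. */
#define IFF_UP        0x0001u
#define IFF_BROADCAST 0x0002u
#define IFF_PROMISC   0x0100u
#define IFF_MULTICAST 0x1000u

/*
 * Hypothetical helper: write the names of the flags set in `flags` into
 * `buf` (capacity `len`, which must be at least 1), separated by single
 * spaces. Only the four bits listed above are decoded; others are ignored.
 */
static void decode_flags(unsigned int flags, char *buf, size_t len)
{
    static const struct {
        unsigned int bit;
        const char *name;
    } table[] = {
        { IFF_UP,        "UP" },
        { IFF_BROADCAST, "BROADCAST" },
        { IFF_PROMISC,   "PROMISC" },
        { IFF_MULTICAST, "MULTICAST" },
    };
    size_t i;

    buf[0] = '\0';
    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (!(flags & table[i].bit))
            continue;
        if (buf[0] != '\0')
            strncat(buf, " ", len - strlen(buf) - 1);
        strncat(buf, table[i].name, len - strlen(buf) - 1);
    }
}
```

With this helper, 0x1003 decodes to UP, BROADCAST and MULTICAST, and 0x1103 differs from it only by the PROMISC bit (0x0100), which is the bit this report is about.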

Actual results:
PROMISC MULTICAST is not released.

Expected results:
PROMISC MULTICAST is released.

Environment:

e1000: eth2: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>


Additional info:
Our problem has already been fixed in the current upstream kernel.

--- snip ---
[VLAN]: Fix MAC address handling

The VLAN MAC address handling is broken in multiple ways. When the address
differs when setting it, the real device is put in promiscous mode twice,
but never taken out again. Additionally it doesn't resync when the real
device's address is changed and needlessly puts it in promiscous mode when
the vlan device is still down.

Fix by moving address handling to vlan_dev_open/vlan_dev_stop and properly
deal with address changes in the device notifier. Also switch to
dev_unicast_add (which needs the exact same handling).

Since the set_mac_address handler is identical to the generic ethernet one
with these changes, kill it and use ether_setup().
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=8c979c26a0f093c13290320edda799d8335e50ae
Comment 1 Flavio Leitner 2009-01-23 07:35:41 EST
 4. State specific action requested of SEG
   I have looked in the git tree. Below is the upstream patch requested by the vendor:
     8c979c26a0f093c13290320edda799d8335e50ae
   Apparently this may also need some more patches, such as:
     (vlan_group_get_device)
     5c15bdec5c38f4ccf73ef2585fc80a6164de9554
     (dev_unicast_delete/dev_unicast_add)
     4417da668c0021903464f92db278ddae348e0299
   
Issue escalated to Support Engineering Group by: tumeya.
Comment 2 Flavio Leitner 2009-01-23 07:37:53 EST
Created attachment 329816 [details]
5c15bdec5c38f4ccf73ef2585fc80a6164de9554.patch
Comment 3 Flavio Leitner 2009-01-23 07:38:31 EST
Created attachment 329817 [details]
8c979c26a0f093c13290320edda799d8335e50ae.patch
Comment 4 Flavio Leitner 2009-01-23 07:38:56 EST
Created attachment 329818 [details]
4417da668c0021903464f92db278ddae348e0299.patch
Comment 5 Flavio Leitner 2009-01-23 07:41:36 EST
Created attachment 329819 [details]
patch by customer based on kernel-2.6.18-92.1.6.el5.ia64
Comment 6 RHEL Product and Program Management 2009-02-16 10:25:23 EST
Updating PM score.
Comment 8 Neil Horman 2009-05-16 13:08:14 EDT
I'll try this myself on Monday, but just in case you already have test results, is the problem reproducible upstream (or on an F10/F11 system)?
Comment 9 Neil Horman 2009-05-18 13:19:12 EDT
I'm going to try this on an upstream kernel to see if there is already a fix in existence that we can use.  I should note, however, that while the state change on the promisc bit is a bit suspect, the multicast bit should not have any relevance to this problem.  All the multicast bit indicates is that the interface can send and receive multicast frames; for hardware that supports it, that flag should be set automatically and only cleared administratively.
Comment 10 Neil Horman 2009-05-18 14:49:38 EDT
Created attachment 344499 [details]
patch to track and undo promisc count on vlans

Ok, looks like upstream fixed this via a pretty big rewrite of how we handle unicast lists.  Can you try this patch? It seems somewhat more concise to me.  I've not tested it yet, but it should work just fine.

Also, I noticed that in the reproducer given, the customer tried to undo the promisc flag by setting -rpromisc.  Not sure if that was just a typo, but it should be -promisc instead.  Either way, if lots of vlans are being used, it would still be broken, as the promisc count will be off (you would have to issue that command several times to clear the flag).
Comment 11 Flavio Leitner 2009-05-19 21:20:54 EDT
Created attachment 344731 [details]
vlan-with-fixes-1.patch

I tried Fedora 10 running 2.6.30-rc6 and it worked fine. Because vlan_dev_set_mac_address() uses dev_unicast_*() there, it doesn't set promisc anymore.

I also tried -128.el5 and could see the problem happening. Then I applied your patch with two extra changes. One is a simple typo fix, and the other is moving 'VLAN_DEV_INFO(dev)->promisc_count * -1;' inside the if clause; otherwise dev is NULL and it would oops.
...
 	struct net_device *dev = NULL;
 	int ret;
-
+	int undo_count = VLAN_DEV_INFO(dev)->promisc_count * -1;
 
 	dev = dev_get_by_name(vlan_IF_name);
...
modified patch is attached in case you need to check something.

Anyway, it didn't work, because -128.el5 does:
vlan_dev_set_mac_address()
  /* Increment our in-use promiscuity counter */
  dev_set_promiscuity(VLAN_DEV_INFO(dev)->real_dev, 1); <-- increments 1
  VLAN_DEV_INFO(dev)->promisc_count++;                  <-- patch counts that
  flgs |= IFF_PROMISC;
  dev_change_flags(VLAN_DEV_INFO(dev)->real_dev, flgs);
  which does:
        if ((flags ^ dev->gflags) & IFF_PROMISC) {
                int inc = (flags & IFF_PROMISC) ? +1 : -1;
                dev->gflags ^= IFF_PROMISC;
                dev_set_promiscuity(dev, inc);   <--- increments another one.
        }

So the counter ends at 2. Removing the vlan device leaves the counter at 1.
Flavio
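
The double increment described in the comment above can be reproduced in a small userspace model. This is only a sketch under stated assumptions: struct model_dev and the model_* functions are hypothetical stand-ins written for this illustration, not kernel code, and they mimic only the IFF_PROMISC handling quoted above (a counter in dev_set_promiscuity() plus the gflags branch of dev_change_flags()).

```c
#define IFF_PROMISC 0x100u

/* Hypothetical stand-in for the few net_device fields involved. */
struct model_dev {
    unsigned int flags;   /* effective flags, as reported by sysfs */
    unsigned int gflags;  /* flags recorded through dev_change_flags() */
    int promiscuity;      /* promiscuous-mode reference count */
};

/* Models dev_set_promiscuity(): the flag is cleared only at count zero. */
static void model_set_promiscuity(struct model_dev *dev, int inc)
{
    dev->promiscuity += inc;
    if (dev->promiscuity == 0)
        dev->flags &= ~IFF_PROMISC;
    else
        dev->flags |= IFF_PROMISC;
}

/* Models only the IFF_PROMISC branch of dev_change_flags() quoted above. */
static void model_change_flags(struct model_dev *dev, unsigned int flags)
{
    if ((flags ^ dev->gflags) & IFF_PROMISC) {
        int inc = (flags & IFF_PROMISC) ? +1 : -1;

        dev->gflags ^= IFF_PROMISC;
        model_set_promiscuity(dev, inc);   /* second increment */
    }
}

/* Models the -128.el5 MAC-change path with the first patch's counter. */
static void model_vlan_set_mac(struct model_dev *real_dev, int *vlan_count)
{
    model_set_promiscuity(real_dev, 1);    /* first increment */
    (*vlan_count)++;                       /* the patch counts only this one */
    model_change_flags(real_dev, real_dev->flags | IFF_PROMISC);
}

/* Models the first patch's undo on VLAN removal. */
static void model_vlan_unregister(struct model_dev *real_dev, int *vlan_count)
{
    model_set_promiscuity(real_dev, -*vlan_count);
    *vlan_count = 0;
}
```

Starting from flags 0x1003 with a count of zero, one call to model_vlan_set_mac() leaves the count at 2 while the VLAN has recorded only 1, so model_vlan_unregister() brings it back to 1 and IFF_PROMISC stays set in both flags and gflags.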
Comment 12 Neil Horman 2009-05-19 22:16:19 EDT
Ok, so you identified a spot I missed, that's great.  Did you test with the obvious fix for the missed spot, and did it work? :)
Comment 13 Flavio Leitner 2009-05-21 18:12:55 EDT
It will work as far as dev->flags is concerned, but I can't see how it will work with dev->gflags.

vlan_ioctl_handler()
   unregister_vlan_device()
      dev_set_promiscuity() <-- flags only.

Another thing: changing the macaddr will set promisc; if you then change the macaddr back to the real_dev's macaddr, it will leave the promisc flag set in both flags and gflags. See vlan_dev_set_mac_address().

Flavio
Comment 14 Neil Horman 2009-05-21 20:32:54 EDT
yes, you found another problem.

You took the time to analyze it, thank you again.  Now please take the extra 5 minutes to fix it.  You should just need to check the promiscuity value of the physical dev after changing its promiscuity, and change gflags if it returns to zero.
Comment 15 Flavio Leitner 2009-05-25 17:14:17 EDT
Created attachment 345357 [details]
vlan patch with gflags fixed

Another patch.

In vlan_dev_set_mac_address(), if IFF_PROMISC is not set in flags, call dev_change_flags() and set gflags if needed. Otherwise assume flags and gflags are okay and just increment the counter by one.

In unregister_vlan_device(), undo the vlan's promisc changes, and if flags is left without IFF_PROMISC, take it out of gflags too.

Works here.
Flavio
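
The approach described in the comment above can be sketched as a userspace model. This is one possible reading of the description, not the attached patch: struct netdev_model and the fixed_* functions are hypothetical names for this illustration, and the gflags update is written directly instead of going through dev_change_flags().

```c
#define IFF_PROMISC 0x100u

/* Hypothetical stand-in for the few net_device fields involved. */
struct netdev_model {
    unsigned int flags;   /* effective flags */
    unsigned int gflags;  /* flags recorded through dev_change_flags() */
    int promiscuity;      /* promiscuous-mode reference count */
};

/* Models dev_set_promiscuity(): the flag is cleared only at count zero. */
static void fixed_set_promiscuity(struct netdev_model *dev, int inc)
{
    dev->promiscuity += inc;
    if (dev->promiscuity == 0)
        dev->flags &= ~IFF_PROMISC;
    else
        dev->flags |= IFF_PROMISC;
}

/*
 * MAC change on the VLAN: take exactly one reference on the real device.
 * If the real device was not promiscuous yet, also record the flag in
 * gflags (assumption: this is the effect of the dev_change_flags() call).
 */
static void fixed_vlan_set_mac(struct netdev_model *real_dev, int *vlan_count)
{
    if (!(real_dev->flags & IFF_PROMISC))
        real_dev->gflags |= IFF_PROMISC;
    fixed_set_promiscuity(real_dev, 1);
    (*vlan_count)++;
}

/*
 * VLAN removal: drop every reference the VLAN took, then clear gflags
 * if the real device is no longer promiscuous.
 */
static void fixed_vlan_unregister(struct netdev_model *real_dev, int *vlan_count)
{
    fixed_set_promiscuity(real_dev, -*vlan_count);
    *vlan_count = 0;
    if (!(real_dev->flags & IFF_PROMISC))
        real_dev->gflags &= ~IFF_PROMISC;
}
```

In this model a real device that starts at flags 0x1003 returns to 0x1003 with gflags cleared after the VLAN is removed, while a device that was already promiscuous for another reason keeps its own reference and its flag.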
Comment 16 Neil Horman 2009-05-26 06:41:43 EDT
Thank you, yes.  This looks good to me.  I'll post it shortly.   Thank you for fixing up those missing bits.
Comment 19 RHEL Product and Program Management 2009-05-28 10:10:25 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 32 Don Zickus 2009-07-07 11:04:56 EDT
in kernel-2.6.18-157.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.
Comment 37 errata-xmlrpc 2009-09-02 04:10:15 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-1243.html
