Bug 702479 - Addition of bond1 to modules and network-scripts causes kernel panic
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Andy Gospodarek
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2011-05-05 19:06 UTC by Greg Charles
Modified: 2014-06-29 23:03 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-08-30 20:20:33 UTC
Target Upstream Version:



Description Greg Charles 2011-05-05 19:06:19 UTC
Description of problem:
Hardware:  Proliant BL465c G7 Blade
Ethernet devices eth0 - eth5 configured.  Initial interface is configured using VLAN tagging; eth0/eth1 are bonded with ifcfg-bond0 and ifcfg-bond0.56.  Attempting to create another bonded interface, bond1, using eth2/eth3.  (Using RedHat doc http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Deployment_Guide/sec-Using_Channel_Bonding.html as a reference.)  After adding necessary entries to /etc/modprobe.d/bonding.conf and creating ifcfg-bond1 and altering ifcfg-eth2 and ifcfg-eth3 as slaves, then rebooting, kernel will panic.

Version-Release number of selected component (if applicable):
net-tools-1.60-104.el6.x86_64


How reproducible:
Input necessary entries into /etc/modprobe.d/bonding.conf;
Add file /etc/sysconfig/network-scripts/ifcfg-bond1
Alter /etc/sysconfig/network-scripts/ifcfg-eth2 and ifcfg-eth3 as slaves to bond1
Reboot

Steps to Reproduce:
1. Add "alias bond1 bonding" to /etc/modprobe.d/bonding.conf
2. Create /etc/sysconfig/network-scripts/ifcfg-bond1:
     DEVICE="bond0.56"
     BOOTPROTO="static"
     IPADDR="1.2.3.4"
     NETMASK="255.255.252.0"
     ONBOOT="yes"
     BONDING_OPTS="mode=1 miimon=200"
3. Alter /etc/sysconfig/network-scripts/ifcfg-eth2 (and eth3) as follows:
     DEVICE="eth2"
     HWADDR="78:AC:C0:FE:09:FC"
     MASTER="bond1"
     SLAVE="yes"
     ONBOOT="yes"
4. Reboot (init 6)
  
Actual results:
Kernel will panic with the following message at console:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa04912e0>] bond_inet6addr_event+0x90/0x220 [bonding]
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/net/bond1/bonding/slaves
CPU 12
Modules linked in: bonding 8021q garp stp llc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state


Expected results:
bond1 is created as an extra bonded interface using eth2/eth3 as slaves.

Additional results:
I can create the bond1 interface manually using the above steps followed by "ifup bond1"; the interface is then created with no panic.  However, it will not survive a reboot.


Additional info:
Public interface is using bond0 and since this is a blade server plugged into a chassis using Flex10 fabric, the public interface is VLAN-tagged, so the interface is bond0.56.  That is working as expected.  When bond1 (non-VLAN-tagged) is introduced, the kernel will panic on every boot.

Comment 2 RHEL Program Management 2011-05-06 06:00:26 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.

Comment 3 Greg Charles 2011-05-16 17:28:23 UTC
Tried to create the second bonded interface using eth2 and eth3 and create it as another VLAN tagged device by using default tag # of "0" (bond1.0).  Server still panicked on boot with the following error:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa04962e0>] bond_inet6addr_event+0x90/0x220 [bonding]
PGD 0
Oops: 0000 [#1]

Comment 4 Andy Gospodarek 2011-08-10 15:05:37 UTC
(In reply to comment #0)
> Description of problem:
> Hardware:  Proliant BL465c G7 Blade
> Ethernet devices eth0 - eth5 configured.  Initial interface is configured using
> VLAN tagging; eth0/eth1 are bonded with ifcfg-bond0 and ifcfg-bond0.56. 
> Attempting to create another bonded interface, bond1, using eth2/eth3.  (Using
> RedHat doc
> http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Deployment_Guide/sec-Using_Channel_Bonding.html
> as a reference.)  After adding necessary entries to
> /etc/modprobe.d/bonding.conf and creating ifcfg-bond1 and altering ifcfg-eth2
> and ifcfg-eth3 as slaves, then rebooting, kernel will panic.
> 
> Version-Release number of selected component (if applicable):
> net-tools-1.60-104.el6.x86_64
> 
> 
> How reproducible:
> Input necessary entries into /etc/modprobe.d/bonding.conf;
> Add file /etc/sysconfig/network-scripts/ifcfg-bond1
> Alter /etc/sysconfig/network-scripts/ifcfg-eth2 and ifcfg-eth3 as slaves to
> bond1
> Reboot
> 
> Steps to Reproduce:
> 1. Add "alias bond1 bonding" to /etc/modprobe.d/bonding.conf
> 2. Create /etc/sysconfig/network-scripts/ifcfg-bond1:
>      DEVICE="bond0.56"

Are you sure you meant to put "bond0.56" here?  It seems like it should say "bond1".
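
For illustration only: if the file from step 2 is meant to define the second bond itself, a version whose DEVICE entry matches the file name might look like the sketch below.  The address, netmask, and bonding options are copied from the steps to reproduce; only DEVICE is changed, and this is an assumed correction, not the reporter's confirmed file.

```shell
# /etc/sysconfig/network-scripts/ifcfg-bond1
# Sketch only: same values as in the report, with DEVICE changed
# (an assumption) so that it matches the file name.
DEVICE="bond1"
BOOTPROTO="static"
IPADDR="1.2.3.4"
NETMASK="255.255.252.0"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=200"
```

The slave files from step 3 already point at MASTER="bond1", so under this assumption they would not need to change.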

Comment 5 Greg Charles 2011-08-10 15:22:30 UTC
Unfortunately the environment where I started this configuration had to be rebuilt for other deployments, so I do not have that original configuration handy.  However, I will be installing 4 new G7 blades in the near future using RHEL 6.1 and RHCS, and I will configure them as stated above to address the issue of whether to enter bond0.56 or bond1.  The URL I had referred to for how to set up VLAN tagging was http://www.cyberciti.biz/tips/howto-configure-linux-virtual-local-area-network-vlan.html.

BTW, the bond0.56 interface worked without a problem; it was the introduction of a second bonded interface, no matter what I named it, that caused the panic.

Comment 6 Andy Gospodarek 2011-08-10 15:40:32 UTC
(In reply to comment #5)
> Unfortunately the environment where I started this configuration had to be
> rebuilt for other deployments, so I do not have that original configuration
> handy.  However, I will be installing 4 new G7 blades in the near future using
> RHEL 6.1 and RHCS, and I will configure them as stated above to address the
> issue of whether to enter bond0.56 or bond1. 

I asked because I think you may be hitting an odd issue: the device named in the ifcfg-bond1 file is the same as the VLAN interface for bond0, which might be a problem.  I'm still investigating.

Comment 7 Andy Gospodarek 2011-08-10 17:14:05 UTC
Greg, do you remember what you were running when you saw this?  If this was a fresh RHEL6.1 install that is fine, but I'm seeing a different version of net-tools (and you didn't indicate what kernel was running), so I wasn't sure.

Comment 8 Greg Charles 2011-08-10 17:19:03 UTC
Kernel version is 2.6.32-131.2.1.el6.x86_64

Comment 9 Andy Gospodarek 2011-08-10 18:09:57 UTC
That's interesting.  I was ready to close this as a duplicate of 683496, but the fix for that bug was added in 2.6.32-128.el6, so it must not be that.  Odd that the part of the backtrace you have pasted is exactly the same:

(from https://bugzilla.redhat.com/show_bug.cgi?id=683496#c0)

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa04ff2e0>] bond_inet6addr_event+0x90/0x220 [bonding]
PGD 3bb63067 PUD 3bba9067 PMD 0 
Oops: 0000 [#1] SMP 
last sysfs file: /sys/devices/virtual/net/bla0/bridge/forward_delay
CPU 0 
Modules linked in: 8021q garp bonding be2iscsi iscsi_boot_sysfs bnx2i cnic uio
cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa
ib_mad ip6table_filter ib_core ip6_tables ib_addr iscsi_tcp libiscsi_tcp
libiscsi ebtable_nat scsi_transport_iscsi ebtables ipt_MASQUERADE iptable_nat
nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT
xt_CHECKSUM iptable_mangle bridge stp llc autofs4 sunrpc ipv6 xt_physdev
iptable_filter ip_tables dm_mirror dm_region_hash dm_log dm_multipath vhost_net
macvtap macvlan tun microcode virtio_console virtio_balloon snd_intel8x0
snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore
snd_page_alloc virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk
virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last
unloaded: speedstep_lib]

Modules linked in: 8021q garp bonding be2iscsi iscsi_boot_sysfs bnx2i cnic uio
cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa
ib_mad ip6table_filter ib_core ip6_tables ib_addr iscsi_tcp libiscsi_tcp
libiscsi ebtable_nat scsi_transport_iscsi ebtables ipt_MASQUERADE iptable_nat
nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT
xt_CHECKSUM iptable_mangle bridge stp llc autofs4 sunrpc ipv6 xt_physdev
iptable_filter ip_tables dm_mirror dm_region_hash dm_log dm_multipath vhost_net
macvtap macvlan tun microcode virtio_console virtio_balloon snd_intel8x0
snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore
snd_page_alloc virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk
virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last
unloaded: speedstep_lib]
Pid: 3164, comm: ip Not tainted 2.6.32-117.el6.x86_64 #1 Bochs
RIP: 0010:[<ffffffffa04ff2e0>]  [<ffffffffa04ff2e0>]
bond_inet6addr_event+0x90/0x220 [bonding]
RSP: 0018:ffff88003bb65588  EFLAGS: 00010246
RAX: ffff88003d1a7280 RBX: ffff88003ce9ea40 RCX: ffff88003ac38140
RDX: ffff88003ce5aec0 RSI: 0000000000000001 RDI: ffff88003bbea000
RBP: ffff88003bb65598 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: ffff88003ce9e6c0 R12: ffffffffa0506aa0
R13: 00000000ffffffff R14: 0000000000000000 R15: 0000000000000000
FS:  00007fb2df94d700(0000) GS:ffff88000a200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 000000003b8be000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ip (pid: 3164, threadinfo ffff88003bb64000, task ffff88003d08cb00)
Stack:
 0000000000000001 ffff88003ce5aec0 ffff88003bb655d8 ffffffff814de605
<0> ffff88003bb655e8 0000000000000000 ffff88003bb30c00 ffff88003bb30c08
<0> 00000000ffffffff 0000000000000002 ffff88003bb655e8 ffffffff814de66a
Call Trace:
 [<ffffffff814de605>] notifier_call_chain+0x55/0x80
 [<ffffffff814de66a>] atomic_notifier_call_chain+0x1a/0x20
 [<ffffffffa0278968>] ipv6_add_addr+0x1a8/0x450 [ipv6]
 [<ffffffffa027cb0a>] addrconf_add_linklocal+0x4a/0xa0 [ipv6]
 [<ffffffffa027ebcd>] addrconf_notify+0x52d/0x980 [ipv6]
 [<ffffffff8144a21a>] ? rt_cache_flush+0x2a/0x210
 [<ffffffff8142baeb>] ? rtmsg_ifinfo+0x18b/0x260
 [<ffffffff814de605>] notifier_call_chain+0x55/0x80
 [<ffffffff81093c46>] raw_notifier_call_chain+0x16/0x20
 [<ffffffff8141c84b>] call_netdevice_notifiers+0x1b/0x20
 [<ffffffff8141d51e>] dev_open+0xce/0x100
 [<ffffffff8141cc21>] dev_change_flags+0xa1/0x1d0
 [<ffffffff81429c75>] do_setlink+0x1f5/0x860
 [<ffffffffa04f08eb>] ? bond_get_stats+0x17b/0x190 [bonding]
 [<ffffffff81277194>] ? nla_parse+0x34/0x110
 [<ffffffff8142a72e>] rtnl_newlink+0x44e/0x530
 [<ffffffff8115a88b>] ? __kmalloc_node+0x7b/0x100
 [<ffffffff81429860>] rtnetlink_rcv_msg+0x1e0/0x220
 [<ffffffff81429680>] ? rtnetlink_rcv_msg+0x0/0x220
 [<ffffffff81442c59>] netlink_rcv_skb+0xa9/0xd0
 [<ffffffff81429665>] rtnetlink_rcv+0x25/0x40
 [<ffffffff814428be>] netlink_unicast+0x2de/0x2f0
 [<ffffffff81443250>] netlink_sendmsg+0x200/0x2e0
 [<ffffffff8140a4a3>] ? sock_recvmsg+0x133/0x160
 [<ffffffff8140b97e>] sock_sendmsg+0x11e/0x150
 [<ffffffff8108dce0>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8115394a>] ? alloc_pages_current+0x9a/0x100
 [<ffffffff8140a0b4>] ? move_addr_to_kernel+0x64/0x70
 [<ffffffff81415786>] ? verify_iovec+0x66/0xd0
 [<ffffffff8140bc43>] sys_sendmsg+0x233/0x3a0
 [<ffffffff81041364>] ? __do_page_fault+0x1e4/0x480
 [<ffffffff8113ce9c>] ? do_brk+0x26c/0x350
 [<ffffffff810d1572>] ? audit_syscall_entry+0x272/0x2a0
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
Code: 80 03 00 00 49 8d 9b 80 03 00 00 eb 3e 66 0f 1f 44 00 00 44 0f b7 50 14
4d 8b 83 90 03 00 00 45 89 d1 66 41 c1 e9 09 45 0f b7 c9 <4f> 8b 4c c8 20 45 31
c0 4d 85 c9 74 0b 41 81 e2 ff 01 00 00 4f 
RIP  [<ffffffffa04ff2e0>] bond_inet6addr_event+0x90/0x220 [bonding]
 RSP <ffff88003bb65588>
CR2: 0000000000000020

Comment 10 Andy Gospodarek 2011-08-10 19:32:43 UTC
I was able to reproduce this once with the part of the patch that I suspected would resolve this from bug 683496, but on shutdown.  Trying to see how easy it is to reproduce again.

Here is the log:

bonding: bond0: releasing active interface eth1
[  OK  ]
Shutting down interface bond1:  bonding: bond1: Removing slave eth4
bonding: bond1: releasing active interface eth4
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa01ff790>] bond_inet6addr_event+0x90/0x220 [bonding]
PGD c6b4e6067 PUD c6e752067 PMD 0 
Oops: 0000 [#1] SMP 
last sysfs file: /sys/devices/virtual/net/bond1/bonding/arp_ip_target
CPU 37 
Modules linked in: autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table 8021q garp stp llc bonding(U) ipv6 dm_mirror dm_region_hash dm_log microcode ghes hed i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core sg igb dca ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif ahci megaraid_sas dm_mod [last unloaded: scsi_wait_scan]

Modules linked in: autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table 8021q garp stp llc bonding(U) ipv6 dm_mirror dm_region_hash dm_log microcode ghes hed i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core sg igb dca ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif ahci megaraid_sas dm_mod [last unloaded: scsi_wait_scan]
Pid: 6869, comm: ip Tainted: G   M       ----------------   2.6.32-131.0.15.el6.x86_64 #1 QSSC-S4R
RIP: 0010:[<ffffffffa01ff790>]  [<ffffffffa01ff790>] bond_inet6addr_event+0x90/0x220 [bonding]
RSP: 0018:ffff880465831988  EFLAGS: 00010256
RAX: ffff880c6e7547c0 RBX: ffff88086bcb9a88 RCX: ffff88086bcb9a88
RDX: ffff88046591d5c0 RSI: 0000000000000002 RDI: ffff88086bc05000
RBP: ffff880465831998 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: ffff88086bcb9700 R12: ffff88086bc05810
R13: 00000000ffffffff R14: 0000000000000000 R15: 0000000000000000
FS:  00007f8b21cb7700(0000) GS:ffff88088e640000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000020 CR3: 0000000c6d24e000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ip (pid: 6869, threadinfo ffff880465830000, task ffff88046b07ab40)
Stack:
 0000000000000002 ffff88046591d5c0 ffff8804658319d8 ffffffff814e0cf5
<0> ffff88046591d5c0 ffff88046591d5c0 00000000fffd8111 0000000000000000
<0> ffff88046591d501 ffff88046591d5dc ffff8804658319e8 ffffffff814e0d5a
Call Trace:
 [<ffffffff814e0cf5>] notifier_call_chain+0x55/0x80
 [<ffffffff814e0d5a>] atomic_notifier_call_chain+0x1a/0x20
 [<ffffffffa0195198>] ipv6_del_addr+0x1c8/0x4a0 [ipv6]
 [<ffffffffa0196ab8>] inet6_addr_del+0xd8/0x120 [ipv6]
 [<ffffffffa0196b84>] inet6_rtm_deladdr+0x84/0x90 [ipv6]
 [<ffffffff8142bbb0>] rtnetlink_rcv_msg+0x1e0/0x220
 [<ffffffff8142b9d0>] ? rtnetlink_rcv_msg+0x0/0x220
 [<ffffffff81444fc9>] netlink_rcv_skb+0xa9/0xd0
 [<ffffffff8142b9b5>] rtnetlink_rcv+0x25/0x40
 [<ffffffff81444c2e>] netlink_unicast+0x2de/0x2f0
 [<ffffffff814455c0>] netlink_sendmsg+0x200/0x2e0
 [<ffffffff8140dabe>] sock_sendmsg+0x11e/0x150
 [<ffffffff8108e160>] ? autoremove_wake_function+0x0/0x40
 [<ffffffff8140be83>] ? move_addr_to_user+0x93/0xb0
 [<ffffffff8140c00e>] ? __sys_recvmsg+0x16e/0x2f0
 [<ffffffff810415d4>] ? __do_page_fault+0x1e4/0x480
 [<ffffffff8140e029>] sys_sendto+0x139/0x190
 [<ffffffff810d1b62>] ? audit_syscall_entry+0x272/0x2a0
 [<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
Code: 88 03 00 00 49 8d 9b 88 03 00 00 eb 3e 66 0f 1f 44 00 00 44 0f b7 50 14 4d 8b 83 98 03 00 00 45 89 d1 66 41 c1 e9 09 45 0f b7 c9 <4f> 8b 4c c8 20 45 31 c0 4d 85 c9 74 0b 41 81 e2 ff 01 00 00 4f 
RIP  [<ffffffffa01ff790>] bond_inet6addr_event+0x90/0x220 [bonding]
 RSP <ffff880465831988>
CR2: 0000000000000020

Comment 11 Andy Gospodarek 2011-08-11 19:45:12 UTC
It looks like this is easy to reproduce with a RHEL6.1 kernel with the bits that should fix this reverted, but I was *unable* to reproduce with a stock 6.1 kernel.

It may not change the fact that you can reproduce a panic, but I want to point out a few things from the configuration that should be changed.

1.  I see in an email conversation that these lines are in bonding.conf:

alias bond0 bonding
alias bond0.56 bonding
alias bond1 bonding
alias bond1.42 bonding

The aliases for bond0.56 and bond1.42 are not needed.  Remove those.

2.  Make sure that BONDING_OPTS is *not* set in ifcfg-bond0.56 and ifcfg-bond1.42. It is not needed as VLANs are just stacked on top of the existing interfaces and do not need the same parameters.

3.  As Matthew pointed out in the email, the names of the ifcfg files should match the device entries.

So ifcfg-bond0 should list DEVICE=bond0, ifcfg-bond1 should list DEVICE=bond1, ifcfg-bond0.56 should list DEVICE=bond0.56, etc.
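
The naming rule in point 3 can be checked mechanically.  The following is a minimal sketch, not part of the initscripts package: check_ifcfg_names is a hypothetical helper name, and it assumes each file sets DEVICE on a line of its own, as in the examples in this bug.

```shell
# Hypothetical helper: report ifcfg-* files whose DEVICE= entry does not
# match the suffix of the file name (ifcfg-bond1 should list DEVICE=bond1).
check_ifcfg_names() {
    dir="${1:-/etc/sysconfig/network-scripts}"
    rc=0
    for f in "$dir"/ifcfg-*; do
        [ -f "$f" ] || continue        # skip when the glob matched nothing
        expected="${f##*/ifcfg-}"      # e.g. .../ifcfg-bond0.56 -> bond0.56
        # Take the last DEVICE= assignment; strip quotes and stray whitespace.
        actual=$(sed -n 's/^[[:space:]]*DEVICE=//p' "$f" | tail -n 1 | tr -d "\"' \t\r")
        if [ "$actual" != "$expected" ]; then
            echo "MISMATCH: $f has DEVICE=$actual, expected $expected"
            rc=1
        fi
    done
    return "$rc"
}
```

Calling check_ifcfg_names /etc/sysconfig/network-scripts on the configuration from the steps to reproduce would be expected to flag ifcfg-bond1, since that file lists DEVICE="bond0.56".  Backup copies such as ifcfg-eth0.bak would also be flagged, because the sketch compares against everything after the "ifcfg-" prefix.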

Comment 12 Andy Gospodarek 2011-08-11 19:47:01 UTC
Here's a full log of my most recent panic.  You can see the bonding driver is version 3.6.0.1, which is my modified driver with a patch reverted.

Bringing up loopback interface:  [  OK  ]
Bringing up interface bond0:  type=1400 audit(1313034184.438:4): avc:  denied  { sys_module } for  pid=4996 comm="ethtool" capability=16  scontext=system_u:system_r:ifconfig_t:s0 tcontext=system_u:system_r:ifconfig_t:s0 tclass=capability
Ethernet Channel Bonding Driver: v3.6.0.1 (September 26, 2009)
bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details.
bonding: bond0 is being created...
bonding: cannot add bond bond0; already exists
bonding: Bond creation failed.
bonding: bond0: setting mode to active-backup (1).
bonding: bond0: Setting MII monitoring interval to 200.
ADDRCONF(NETDEV_UP): bond0: link is not ready
bonding: bond0: Adding slave eth1.
bonding: bond0: enslaving eth1 as a backup interface with a down link.
igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
bonding: bond0: link status definitely up for interface eth1.
bonding: bond0: making interface eth1 the new active one.
bonding: bond0: first active interface up!
ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[  OK  ]
Bringing up interface bond1:  type=1400 audit(1313034188.964:5): avc:  denied  { sys_module } for  pid=5110 comm="ethtool" capability=16  scontext=system_u:system_r:ifconfig_t:s0 tcontext=system_u:system_r:ifconfig_t:s0 tclass=capability
bonding: bond1 is being created...
bonding: bond1: setting mode to active-backup (1).
bonding: bond1: Setting MII monitoring interval to 200.
ADDRCONF(NETDEV_UP): bond1: link is not ready
bonding: bond1: Adding slave eth4.
bonding: bond1: enslaving eth4 as a backup interface with a down link.
igb: eth4 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
bonding: bond1: link status definitely up for interface eth4.
bonding: bond1: making interface eth4 the new active one.
bonding: bond1: first active interface up!
ADDRCONF(NETDEV_CHANGE): bond1: link becomes ready
[  OK  ]
Bringing up interface eth0:
Determining IP information for eth0...igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
ADDRCONF(NETDEV_UP): eth0: link is not ready
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
 done.
[  OK  ]
Bringing up interface bond0.100:  802.1Q VLAN Support v1.8 Ben Greear <greearb>
All bugs added by David S. Miller <davem>
8021q: adding VLAN 0 to HW filter on device eth0
8021q: adding VLAN 0 to HW filter on device eth1
8021q: adding VLAN 0 to HW filter on device eth4
8021q: adding VLAN 0 to HW filter on device bond0
8021q: adding VLAN 0 to HW filter on device bond1
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa01ff790>] bond_inet6addr_event+0x90/0x220 [bonding]
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/net/bond1/broadcast
CPU 0
Modules linked in: 8021q garp stp llc bonding(U) ipv6 dm_mirror dm_region_hash dm_log microcode ghes hed i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core sg igb dca ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif ahci megaraid_sas dm_mod [last unloaded: scsi_wait_scan]

Modules linked in: 8021q garp stp llc bonding(U) ipv6 dm_mirror dm_region_hash dm_log microcode ghes hed i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core sg igb dca ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif ahci megaraid_sas dm_mod [last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper Tainted: G   M       ----------------   2.6.32-131.0.15.el6.x86_64 #1 QSSC-S4R
RIP: 0010:[<ffffffffa01ff790>]  [<ffffffffa01ff790>] bond_inet6addr_event+0x90/0x220 [bonding]
RSP: 0018:ffff8800366037b0  EFLAGS: 00010256
RAX: ffff88046cba2b40 RBX: ffff88106dc69a88 RCX: ffff88106dc69a88
RDX: ffff88046c12b380 RSI: 0000000000000001 RDI: ffff88106bc1c000
RBP: ffff8800366037c0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: ffff88106dc69700 R12: ffffffffa02070a0
R13: 00000000ffffffff R14: 0000000000000000 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff880036600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 0000000868f0d000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a2d020)
Stack:
 0000000000000001 ffff88046c12b380 ffff880036603800 ffffffff814e0cf5
<0> ffff880036603810 ffff88106440d980 ffff88086dd84c00 ffff88086dd84c08
<0> 00000000ffffffff 000000000000000e ffff880036603810 ffffffff814e0d5a
Call Trace:
 <IRQ>
 [<ffffffff814e0cf5>] notifier_call_chain+0x55/0x80
 [<ffffffff814e0d5a>] atomic_notifier_call_chain+0x1a/0x20
 [<ffffffffa0191968>] ipv6_add_addr+0x1a8/0x450 [ipv6]
 [<ffffffffa01962d2>] addrconf_prefix_rcv+0x292/0x9a0 [ipv6]
 [<ffffffff81429b57>] ? neigh_update+0x247/0x540
 [<ffffffffa01a37ed>] ndisc_rcv+0x63d/0x1070 [ipv6]
 [<ffffffff814ddc79>] ? _write_unlock_bh+0x19/0x20
 [<ffffffff81416ba6>] ? skb_checksum+0x56/0x2e0
 [<ffffffff81417afd>] ? __skb_checksum_complete_head+0x1d/0x70
 [<ffffffffa01aabd8>] icmpv6_rcv+0x3f8/0x840 [ipv6]
 [<ffffffffa019c189>] ? ip6_pol_route_input+0x19/0x20 [ipv6]
 [<ffffffffa01bf9e1>] ? fib6_rule_action+0xd1/0x1f0 [ipv6]
 [<ffffffffa019c170>] ? ip6_pol_route_input+0x0/0x20 [ipv6]
 [<ffffffff814ddc95>] ? _read_unlock_bh+0x15/0x20
 [<ffffffffa01ae105>] ? ipv6_chk_mcast_addr+0x85/0x190 [ipv6]
 [<ffffffffa019087a>] ip6_input_finish+0x16a/0x410 [ipv6]
 [<ffffffffa0190b78>] ip6_input+0x58/0x60 [ipv6]
 [<ffffffffa0190c98>] ip6_mc_input+0x118/0x250 [ipv6]
 [<ffffffffa019025f>] ip6_rcv_finish+0x3f/0x50 [ipv6]
 [<ffffffffa01905f8>] ipv6_rcv+0x388/0x4a0 [ipv6]
 [<ffffffff8141dd6b>] __netif_receive_skb+0x39b/0x6b0
 [<ffffffff81420048>] netif_receive_skb+0x58/0x60
 [<ffffffff81420150>] napi_skb_finish+0x50/0x70
 [<ffffffff81422489>] napi_gro_receive+0x39/0x50
 [<ffffffffa0118d27>] igb_poll+0x887/0xe60 [igb]
 [<ffffffff8100df89>] ? handle_irq+0x49/0xa0
 [<ffffffff8106f4c8>] ? irq_exit+0x48/0x90
 [<ffffffff814e3505>] ? do_IRQ+0x75/0xf0
 [<ffffffff814225a3>] net_rx_action+0x103/0x2f0
 [<ffffffff8106f717>] __do_softirq+0xb7/0x1e0
 [<ffffffff810d6940>] ? handle_IRQ_event+0x60/0x170
 [<ffffffff8100c2cc>] call_softirq+0x1c/0x30
 [<ffffffff8100df05>] do_softirq+0x65/0xa0
 [<ffffffff8106f505>] irq_exit+0x85/0x90
 [<ffffffff814e3505>] do_IRQ+0x75/0xf0
 [<ffffffff8100bad3>] ret_from_intr+0x0/0x11
 <EOI>
 [<ffffffff812bb86e>] ? intel_idle+0xde/0x170
 [<ffffffff812bb851>] ? intel_idle+0xc1/0x170
 [<ffffffff8109519d>] ? sched_clock_cpu+0xcd/0x110
 [<ffffffff813eccb7>] cpuidle_idle_call+0xa7/0x140
 [<ffffffff81009e96>] cpu_idle+0xb6/0x110
 [<ffffffff814c376a>] rest_init+0x7a/0x80
 [<ffffffff81bbdf28>] start_kernel+0x41d/0x429
 [<ffffffff81bbd33a>] x86_64_start_reservations+0x125/0x129
 [<ffffffff81bbd438>] x86_64_start_kernel+0xfa/0x109

Comment 13 Andy Gospodarek 2011-08-15 13:25:57 UTC
Greg, if you can, please post the log showing what was happening with bonding and VLANs, as well as the full panic log (similar to what I posted in comment #12), on a 2.6.32-131-based kernel; that would probably be helpful.  Though the panic trace looks the same, I would like to see what the system was doing when this hit.

Can you also post the information from 'modinfo bonding' on that system once it is installed again?
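
A minimal sketch of how the requested details could be gathered in one pass.  collect_bonding_info is a hypothetical helper name; it relies only on uname, modinfo, and the /proc/net/bonding status files that the bonding driver exposes for each bond, and it skips whatever is not available on the system.

```shell
# Hypothetical helper: print the kernel release, the bonding module
# information, and the per-bond status files, if present.
collect_bonding_info() {
    echo "== kernel =="
    uname -r
    echo "== modinfo bonding =="
    if command -v modinfo >/dev/null 2>&1; then
        modinfo bonding 2>&1 || true   # tolerate a missing module
    else
        echo "modinfo not found"
    fi
    echo "== /proc/net/bonding =="
    for f in /proc/net/bonding/*; do
        [ -r "$f" ] || continue        # no bonds configured, or glob unmatched
        echo "--- $f"
        cat "$f"
    done
    return 0
}
```

Redirecting the output to a file, for example collect_bonding_info > bonding-info.txt, gives a single text file that can be pasted into a comment.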

Thanks!

Comment 14 Andy Gospodarek 2011-08-30 20:20:33 UTC
I'm closing this as INSUFFICIENT DATA; I have a lot of bugs just hanging in limbo like this.

Please reopen if you are still having the problem and I'll be happy to take a look.

Thanks!

Comment 15 Greg Charles 2011-08-31 11:19:10 UTC
My apologies for not updating this report sooner, but we have finally installed some blades in our infrastructure and re-tested creation of the bond1 interface in addition to the bond0 VLAN config, following Andy's recommendation to match the device names in the config files to the file names.  There were no problems in either creation of the device or subsequent reboots.

Comment 16 Andy Gospodarek 2011-08-31 12:41:54 UTC
No problem, Greg.  Thanks for the update.

Glad to hear all is working well!

