Description of problem:
Hardware: Proliant BL465c G7 Blade
Ethernet devices eth0 - eth5 configured. Initial interface is configured using VLAN tagging; eth0/eth1 are bonded with ifcfg-bond0 and ifcfg-bond0.56. Attempting to create another bonded interface, bond1, using eth2/eth3. (Using RedHat doc http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Deployment_Guide/sec-Using_Channel_Bonding.html as a reference.) After adding necessary entries to /etc/modprobe.d/bonding.conf and creating ifcfg-bond1 and altering ifcfg-eth2 and ifcfg-eth3 as slaves, then rebooting, kernel will panic.

Version-Release number of selected component (if applicable):
net-tools-1.60-104.el6.x86_64

How reproducible:
Input necessary entries into /etc/modprobe.d/bonding.conf;
Add file /etc/sysconfig/network-scripts/ifcfg-bond1
Alter /etc/sysconfig/network-scripts/ifcfg-eth2 and ifcfg-eth3 as slaves to bond1
Reboot

Steps to Reproduce:
1. Add "alias bond1 bonding" to /etc/modprobe.d/bonding.conf
2. Create /etc/sysconfig/network-scripts/ifcfg-bond1:
DEVICE="bond0.56"
BOOTPROTO="static"
IPADDR="1.2.3.4"
NETMASK="255.255.252.0"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=200"
3. Alter /etc/sysconfig/network-scripts/ifcfg-eth2 (and eth3) as follows:
DEVICE="eth2"
HWADDR="78:AC:C0:FE:09:FC"
MASTER="bond1"
SLAVE="yes"
ONBOOT="yes"
4. Reboot (init 6)

Actual results:
Kernel will panic with the following message at console:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa04912e0>] bond_inet6addr_event+0x90/0x220 [bonding]
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/net/bond1/bonding/slaves
CPU 12
Modules linked in: bonding 8021q garp stp llc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state

Expected results:
bond1 is created as an extra bonded interface using eth2/eth3 as slaves.

Additional results:
I can create the bond1 interface manually using the above steps and then "ifup bond1"; then interface will be created with no panic. However, it will not survive a reboot.

Additional info:
Public interface is using bond0 and since this is a blade server plugged into a chassis using Flex10 fabric, the public interface is VLAN-tagged, so the interface is bond0.56. That is working as expected. When bond1 (non-VLAN-tagged) is introduced, the kernel will panic on every boot.
Since RHEL 6.1 External Beta has begun, and this bug remains unresolved, it has been rejected as it is not proposed as exception or blocker. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.
Tried to create the second bonded interface using eth2 and eth3, this time as another VLAN-tagged device using the default tag number "0" (bond1.0). Server still panicked on boot with the following error:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa04962e0>] bond_inet6addr_event+0x90/0x220 [bonding]
PGD 0
Oops: 0000 [#1]
(In reply to comment #0)
> Description of problem:
> Hardware: Proliant BL465c G7 Blade
> Ethernet devices eth0 - eth5 configured. Initial interface is configured using
> VLAN tagging; eth0/eth1 are bonded with ifcfg-bond0 and ifcfg-bond0.56.
> Attempting to create another bonded interface, bond1, using eth2/eth3. (Using
> RedHat doc
> http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Deployment_Guide/sec-Using_Channel_Bonding.html
> as a reference.) After adding necessary entries to
> /etc/modprobe.d/bonding.conf and creating ifcfg-bond1 and altering ifcfg-eth2
> and ifcfg-eth3 as slaves, then rebooting, kernel will panic.
>
> Version-Release number of selected component (if applicable):
> net-tools-1.60-104.el6.x86_64
>
>
> How reproducible:
> Input necessary entries into /etc/modprobe.d/bonding.conf;
> Add file /etc/sysconfig/network-scripts/ifcfg-bond1
> Alter /etc/sysconfig/network-scripts/ifcfg-eth2 and ifcfg-eth3 as slaves to
> bond1
> Reboot
>
> Steps to Reproduce:
> 1. Add "alias bond1 bonding" to /etc/modprobe.d/bonding.conf
> 2. Create /etc/sysconfig/network-scripts/ifcfg-bond1:
> DEVICE="bond0.56"

Are you sure you meant to put "bond0.56" here? It seems like it should say "bond1".
Unfortunately the environment where I started this configuration had to be rebuilt for other deployments, so I do not have that original configuration handy. However, I will be installing 4 new G7 blades in the near future using RHEL 6.1 and RHCS, and I will configure them as stated above to address the issue of whether to enter bond0.56 or bond1. The URL I had referred to for how to set up VLAN tagging was http://www.cyberciti.biz/tips/howto-configure-linux-virtual-local-area-network-vlan.html. BTW, the bond0.56 interface worked without a problem; it was the introduction of a second bonded interface, no matter what I named it, that caused the panic.
(In reply to comment #5)
> Unfortunately the environment where I started this configuration had to be
> rebuilt for other deployments, so I do not have that original configuration
> handy. However, I will be installing 4 new G7 blades in the near future using
> RHEL 6.1 and RHCS, and I will configure them as stated above to address the
> issue of whether to enter bond0.56 or bond1.

I asked because I think you may be hitting an odd issue: the device named in the ifcfg-bond1 file is the same as the VLAN interface for bond0, and that might be a problem. I'm still investigating.
Greg, do you remember what you were running when you saw this? If this was a fresh RHEL6.1 install that is fine, but I'm seeing a different version of net-tools (and you didn't indicate what kernel was running), so I wasn't sure.
Kernel version is 2.6.32-131.2.1.el6.x86_64
That's interesting. I was ready to close this as a duplicate of 683496, but the fix for that bug was added in 2.6.32-128.el6, so it must not be that. Odd that the part of the backtrace you have pasted is exactly the same:

(from https://bugzilla.redhat.com/show_bug.cgi?id=683496#c0)

BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa04ff2e0>] bond_inet6addr_event+0x90/0x220 [bonding]
PGD 3bb63067 PUD 3bba9067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/net/bla0/bridge/forward_delay
CPU 0
Modules linked in: 8021q garp bonding be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ip6table_filter ib_core ip6_tables ib_addr iscsi_tcp libiscsi_tcp libiscsi ebtable_nat scsi_transport_iscsi ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle bridge stp llc autofs4 sunrpc ipv6 xt_physdev iptable_filter ip_tables dm_mirror dm_region_hash dm_log dm_multipath vhost_net macvtap macvlan tun microcode virtio_console virtio_balloon snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speedstep_lib]

Modules linked in: 8021q garp bonding be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i cxgb4 cxgb3i libcxgbi cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ip6table_filter ib_core ip6_tables ib_addr iscsi_tcp libiscsi_tcp libiscsi ebtable_nat scsi_transport_iscsi ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle bridge stp llc autofs4 sunrpc ipv6 xt_physdev iptable_filter ip_tables dm_mirror dm_region_hash dm_log dm_multipath vhost_net macvtap macvlan tun microcode virtio_console virtio_balloon snd_intel8x0 snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc virtio_net i2c_piix4 i2c_core ext4 mbcache jbd2 virtio_blk virtio_pci virtio_ring virtio pata_acpi ata_generic ata_piix dm_mod [last unloaded: speedstep_lib]
Pid: 3164, comm: ip Not tainted 2.6.32-117.el6.x86_64 #1 Bochs
RIP: 0010:[<ffffffffa04ff2e0>] [<ffffffffa04ff2e0>] bond_inet6addr_event+0x90/0x220 [bonding]
RSP: 0018:ffff88003bb65588 EFLAGS: 00010246
RAX: ffff88003d1a7280 RBX: ffff88003ce9ea40 RCX: ffff88003ac38140
RDX: ffff88003ce5aec0 RSI: 0000000000000001 RDI: ffff88003bbea000
RBP: ffff88003bb65598 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: ffff88003ce9e6c0 R12: ffffffffa0506aa0
R13: 00000000ffffffff R14: 0000000000000000 R15: 0000000000000000
FS: 00007fb2df94d700(0000) GS:ffff88000a200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 000000003b8be000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ip (pid: 3164, threadinfo ffff88003bb64000, task ffff88003d08cb00)
Stack:
0000000000000001 ffff88003ce5aec0 ffff88003bb655d8 ffffffff814de605
<0> ffff88003bb655e8 0000000000000000 ffff88003bb30c00 ffff88003bb30c08
<0> 00000000ffffffff 0000000000000002 ffff88003bb655e8 ffffffff814de66a
Call Trace:
[<ffffffff814de605>] notifier_call_chain+0x55/0x80
[<ffffffff814de66a>] atomic_notifier_call_chain+0x1a/0x20
[<ffffffffa0278968>] ipv6_add_addr+0x1a8/0x450 [ipv6]
[<ffffffffa027cb0a>] addrconf_add_linklocal+0x4a/0xa0 [ipv6]
[<ffffffffa027ebcd>] addrconf_notify+0x52d/0x980 [ipv6]
[<ffffffff8144a21a>] ? rt_cache_flush+0x2a/0x210
[<ffffffff8142baeb>] ? rtmsg_ifinfo+0x18b/0x260
[<ffffffff814de605>] notifier_call_chain+0x55/0x80
[<ffffffff81093c46>] raw_notifier_call_chain+0x16/0x20
[<ffffffff8141c84b>] call_netdevice_notifiers+0x1b/0x20
[<ffffffff8141d51e>] dev_open+0xce/0x100
[<ffffffff8141cc21>] dev_change_flags+0xa1/0x1d0
[<ffffffff81429c75>] do_setlink+0x1f5/0x860
[<ffffffffa04f08eb>] ? bond_get_stats+0x17b/0x190 [bonding]
[<ffffffff81277194>] ? nla_parse+0x34/0x110
[<ffffffff8142a72e>] rtnl_newlink+0x44e/0x530
[<ffffffff8115a88b>] ? __kmalloc_node+0x7b/0x100
[<ffffffff81429860>] rtnetlink_rcv_msg+0x1e0/0x220
[<ffffffff81429680>] ? rtnetlink_rcv_msg+0x0/0x220
[<ffffffff81442c59>] netlink_rcv_skb+0xa9/0xd0
[<ffffffff81429665>] rtnetlink_rcv+0x25/0x40
[<ffffffff814428be>] netlink_unicast+0x2de/0x2f0
[<ffffffff81443250>] netlink_sendmsg+0x200/0x2e0
[<ffffffff8140a4a3>] ? sock_recvmsg+0x133/0x160
[<ffffffff8140b97e>] sock_sendmsg+0x11e/0x150
[<ffffffff8108dce0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff8115394a>] ? alloc_pages_current+0x9a/0x100
[<ffffffff8140a0b4>] ? move_addr_to_kernel+0x64/0x70
[<ffffffff81415786>] ? verify_iovec+0x66/0xd0
[<ffffffff8140bc43>] sys_sendmsg+0x233/0x3a0
[<ffffffff81041364>] ? __do_page_fault+0x1e4/0x480
[<ffffffff8113ce9c>] ? do_brk+0x26c/0x350
[<ffffffff810d1572>] ? audit_syscall_entry+0x272/0x2a0
[<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
Code: 80 03 00 00 49 8d 9b 80 03 00 00 eb 3e 66 0f 1f 44 00 00 44 0f b7 50 14 4d 8b 83 90 03 00 00 45 89 d1 66 41 c1 e9 09 45 0f b7 c9 <4f> 8b 4c c8 20 45 31 c0 4d 85 c9 74 0b 41 81 e2 ff 01 00 00 4f
RIP [<ffffffffa04ff2e0>] bond_inet6addr_event+0x90/0x220 [bonding]
RSP <ffff88003bb65588>
CR2: 0000000000000020
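The duplicate check in this comment comes down to a build-number comparison: the reporter's kernel (2.6.32-131.2.1.el6.x86_64) is newer than the build where the fix for bug 683496 landed (2.6.32-128.el6), while the kernel in the quoted trace (2.6.32-117.el6.x86_64) is older. Below is a minimal sketch of that comparison. The helper names `release_tuple` and `at_least` are hypothetical, and the parsing is a simplification that only handles strings shaped like the ones in this thread; it is not RPM's full version-comparison algorithm.

```python
def release_tuple(kernel):
    """Split e.g. '2.6.32-131.2.1.el6.x86_64' into ((2, 6, 32), (131, 2, 1)).

    Simplified on purpose: only the leading numeric release fields are kept,
    so the 'el6' and architecture suffixes are ignored.
    """
    version, _, release = kernel.partition("-")
    numbers = []
    for part in release.split("."):
        if not part.isdigit():
            break  # stop at the first non-numeric field such as 'el6'
        numbers.append(int(part))
    return tuple(int(p) for p in version.split(".")), tuple(numbers)


def at_least(kernel, baseline):
    """True if `kernel` is the same build as `baseline` or a newer one."""
    return release_tuple(kernel) >= release_tuple(baseline)


# Build named in this comment as the one that received the fix for bug 683496.
FIX_BUILD = "2.6.32-128.el6"

for kernel in ("2.6.32-131.2.1.el6.x86_64", "2.6.32-117.el6.x86_64"):
    print(kernel, "contains fix build or later:", at_least(kernel, FIX_BUILD))
```

A kernel that compares as newer than the fix build should already carry the fix, which is why the panic was not closed as a plain duplicate.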
I was able to reproduce this once with the part of the patch that I suspected would resolve this from bug 683496, but on shutdown. Trying to see how easy it is to reproduce again. Here is the log:

bonding: bond0: releasing active interface eth1
[ OK ]
Shutting down interface bond1: bonding: bond1: Removing slave eth4
bonding: bond1: releasing active interface eth4
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa01ff790>] bond_inet6addr_event+0x90/0x220 [bonding]
PGD c6b4e6067 PUD c6e752067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/net/bond1/bonding/arp_ip_target
CPU 37
Modules linked in: autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table 8021q garp stp llc bonding(U) ipv6 dm_mirror dm_region_hash dm_log microcode ghes hed i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core sg igb dca ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif ahci megaraid_sas dm_mod [last unloaded: scsi_wait_scan]

Modules linked in: autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table 8021q garp stp llc bonding(U) ipv6 dm_mirror dm_region_hash dm_log microcode ghes hed i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core sg igb dca ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif ahci megaraid_sas dm_mod [last unloaded: scsi_wait_scan]
Pid: 6869, comm: ip Tainted: G M ---------------- 2.6.32-131.0.15.el6.x86_64 #1 QSSC-S4R
RIP: 0010:[<ffffffffa01ff790>] [<ffffffffa01ff790>] bond_inet6addr_event+0x90/0x220 [bonding]
RSP: 0018:ffff880465831988 EFLAGS: 00010256
RAX: ffff880c6e7547c0 RBX: ffff88086bcb9a88 RCX: ffff88086bcb9a88
RDX: ffff88046591d5c0 RSI: 0000000000000002 RDI: ffff88086bc05000
RBP: ffff880465831998 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: ffff88086bcb9700 R12: ffff88086bc05810
R13: 00000000ffffffff R14: 0000000000000000 R15: 0000000000000000
FS: 00007f8b21cb7700(0000) GS:ffff88088e640000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000020 CR3: 0000000c6d24e000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process ip (pid: 6869, threadinfo ffff880465830000, task ffff88046b07ab40)
Stack:
0000000000000002 ffff88046591d5c0 ffff8804658319d8 ffffffff814e0cf5
<0> ffff88046591d5c0 ffff88046591d5c0 00000000fffd8111 0000000000000000
<0> ffff88046591d501 ffff88046591d5dc ffff8804658319e8 ffffffff814e0d5a
Call Trace:
[<ffffffff814e0cf5>] notifier_call_chain+0x55/0x80
[<ffffffff814e0d5a>] atomic_notifier_call_chain+0x1a/0x20
[<ffffffffa0195198>] ipv6_del_addr+0x1c8/0x4a0 [ipv6]
[<ffffffffa0196ab8>] inet6_addr_del+0xd8/0x120 [ipv6]
[<ffffffffa0196b84>] inet6_rtm_deladdr+0x84/0x90 [ipv6]
[<ffffffff8142bbb0>] rtnetlink_rcv_msg+0x1e0/0x220
[<ffffffff8142b9d0>] ? rtnetlink_rcv_msg+0x0/0x220
[<ffffffff81444fc9>] netlink_rcv_skb+0xa9/0xd0
[<ffffffff8142b9b5>] rtnetlink_rcv+0x25/0x40
[<ffffffff81444c2e>] netlink_unicast+0x2de/0x2f0
[<ffffffff814455c0>] netlink_sendmsg+0x200/0x2e0
[<ffffffff8140dabe>] sock_sendmsg+0x11e/0x150
[<ffffffff8108e160>] ? autoremove_wake_function+0x0/0x40
[<ffffffff8140be83>] ? move_addr_to_user+0x93/0xb0
[<ffffffff8140c00e>] ? __sys_recvmsg+0x16e/0x2f0
[<ffffffff810415d4>] ? __do_page_fault+0x1e4/0x480
[<ffffffff8140e029>] sys_sendto+0x139/0x190
[<ffffffff810d1b62>] ? audit_syscall_entry+0x272/0x2a0
[<ffffffff8100b172>] system_call_fastpath+0x16/0x1b
Code: 88 03 00 00 49 8d 9b 88 03 00 00 eb 3e 66 0f 1f 44 00 00 44 0f b7 50 14 4d 8b 83 98 03 00 00 45 89 d1 66 41 c1 e9 09 45 0f b7 c9 <4f> 8b 4c c8 20 45 31 c0 4d 85 c9 74 0b 41 81 e2 ff 01 00 00 4f
RIP [<ffffffffa01ff790>] bond_inet6addr_event+0x90/0x220 [bonding]
RSP <ffff880465831988>
CR2: 0000000000000020
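The oops reports pasted in this thread load the bonding module at different addresses, yet they all fault at the same symbolic location. When comparing traces like these it can help to pull out just the function, offset, and module from the "IP:" line and ignore the raw address. The following is a rough sketch of that; the regular expression and the `oops_signature` name are my own illustration rather than any existing tool, and the pattern only covers "IP:" lines formatted like the ones in these logs.

```python
import re

# Matches lines such as:
#   IP: [<ffffffffa04ff2e0>] bond_inet6addr_event+0x90/0x220 [bonding]
IP_LINE = re.compile(
    r"IP: \[<(?P<addr>[0-9a-f]{16})>\] "
    r"(?P<func>\w+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def oops_signature(line):
    """Return (function, offset, size, module) for an oops 'IP:' line, or None."""
    match = IP_LINE.search(line)
    if match is None:
        return None
    return (
        match.group("func"),
        int(match.group("offset"), 16),
        int(match.group("size"), 16),
        match.group("module"),
    )


# 'IP:' lines copied from the traces in this report.
ip_lines = [
    "IP: [<ffffffffa04912e0>] bond_inet6addr_event+0x90/0x220 [bonding]",
    "IP: [<ffffffffa04ff2e0>] bond_inet6addr_event+0x90/0x220 [bonding]",
    "IP: [<ffffffffa01ff790>] bond_inet6addr_event+0x90/0x220 [bonding]",
]

signatures = {oops_signature(line) for line in ip_lines}
print(signatures)  # a single entry means the traces share one fault location
```

Matching signatures only suggest the same failing code path; as the comments here show, the kernel build still has to be checked before calling two reports duplicates.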
It looks like this is easy to reproduce with a RHEL6.1 kernel with the bits that should fix this reverted, but I was *unable* to reproduce with a stock 6.1 kernel.

It may not change the fact that you can reproduce a panic, but I want to point out a few things from the configuration that should be changed.

1. I see in an email conversation that these lines are in bonding.conf:

alias bond0 bonding
alias bond0.56 bonding
alias bond1 bonding
alias bond1.42 bonding

The aliases for the bond0.56 and bond1.42 are not needed. Remove those.

2. Make sure that BONDING_OPTS is *not* set in ifcfg-bond0.56 and ifcfg-bond1.42. It is not needed as VLANs are just stacked on top of the existing interfaces and do not need the same parameters.

3. As Matthew pointed out in the email, the names of the ifcfg files should match the device entries. So ifcfg-bond0 should list DEVICE=bond0, ifcfg-bond1 should list DEVICE=bond1, ifcfg-bond0.56 should list DEVICE=bond0.56, etc.
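The three recommendations above are mechanical enough to check with a script. Here is a minimal sketch that works on file contents passed in as strings, so it touches nothing on disk. The function names `parse_ifcfg`, `check_ifcfg`, and `unneeded_aliases` are hypothetical, the parser handles only plain KEY=value lines (no shell expansion), and treating any name containing a dot as a VLAN device is an assumption based on the naming used in this report.

```python
def parse_ifcfg(text):
    """Parse simple KEY=value lines, dropping surrounding quotes."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def check_ifcfg(filename, text):
    """Return a list of problems for one ifcfg-<device> file."""
    problems = []
    settings = parse_ifcfg(text)
    expected = filename[len("ifcfg-"):] if filename.startswith("ifcfg-") else filename
    device = settings.get("DEVICE")
    if device != expected:
        problems.append(
            f"{filename}: DEVICE={device!r} does not match the file name "
            f"(expected {expected!r})"
        )
    # Assumption: a dot in the device name marks a VLAN stacked on a bond.
    if "." in expected and "BONDING_OPTS" in settings:
        problems.append(f"{filename}: BONDING_OPTS should not be set on a VLAN interface")
    return problems


def unneeded_aliases(bonding_conf):
    """Return 'alias <name> bonding' names that refer to VLAN devices."""
    names = []
    for line in bonding_conf.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == "alias" and fields[2] == "bonding":
            if "." in fields[1]:
                names.append(fields[1])
    return names


# ifcfg-bond1 exactly as given in the original description of this report.
REPORTED_BOND1 = '''DEVICE="bond0.56"
BOOTPROTO="static"
IPADDR="1.2.3.4"
NETMASK="255.255.252.0"
ONBOOT="yes"
BONDING_OPTS="mode=1 miimon=200"
'''

# bonding.conf lines quoted in item 1 above.
REPORTED_CONF = """alias bond0 bonding
alias bond0.56 bonding
alias bond1 bonding
alias bond1.42 bonding
"""

for problem in check_ifcfg("ifcfg-bond1", REPORTED_BOND1):
    print(problem)
print("aliases to remove:", unneeded_aliases(REPORTED_CONF))
```

Run against the configuration from the description, this flags the DEVICE mismatch in ifcfg-bond1 and the two VLAN aliases. It says nothing about whether those settings caused the panic.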
Here's a full log of my most recent panic. You can see the bonding driver is version 3.6.0.1, which is my modified driver with a patch reverted.

Bringing up loopback interface: [ OK ]
Bringing up interface bond0: type=1400 audit(1313034184.438:4): avc: denied { sys_module } for pid=4996 comm="ethtool" capability=16 scontext=system_u:system_r:ifconfig_t:s0 tcontext=system_u:system_r:ifconfig_t:s0 tclass=capability
Ethernet Channel Bonding Driver: v3.6.0.1 (September 26, 2009)
bonding: Warning: either miimon or arp_interval and arp_ip_target module parameters must be specified, otherwise bonding will not detect link failures! see bonding.txt for details.
bonding: bond0 is being created...
bonding: cannot add bond bond0; already exists
bonding: Bond creation failed.
bonding: bond0: setting mode to active-backup (1).
bonding: bond0: Setting MII monitoring interval to 200.
ADDRCONF(NETDEV_UP): bond0: link is not ready
bonding: bond0: Adding slave eth1.
bonding: bond0: enslaving eth1 as a backup interface with a down link.
igb: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
bonding: bond0: link status definitely up for interface eth1.
bonding: bond0: making interface eth1 the new active one.
bonding: bond0: first active interface up!
ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[ OK ]
Bringing up interface bond1: type=1400 audit(1313034188.964:5): avc: denied { sys_module } for pid=5110 comm="ethtool" capability=16 scontext=system_u:system_r:ifconfig_t:s0 tcontext=system_u:system_r:ifconfig_t:s0 tclass=capability
bonding: bond1 is being created...
bonding: bond1: setting mode to active-backup (1).
bonding: bond1: Setting MII monitoring interval to 200.
ADDRCONF(NETDEV_UP): bond1: link is not ready
bonding: bond1: Adding slave eth4.
bonding: bond1: enslaving eth4 as a backup interface with a down link.
igb: eth4 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
bonding: bond1: link status definitely up for interface eth4.
bonding: bond1: making interface eth4 the new active one.
bonding: bond1: first active interface up!
ADDRCONF(NETDEV_CHANGE): bond1: link becomes ready
[ OK ]
Bringing up interface eth0: Determining IP information for eth0...igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
ADDRCONF(NETDEV_UP): eth0: link is not ready
ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
done.
[ OK ]
Bringing up interface bond0.100: 802.1Q VLAN Support v1.8 Ben Greear <greearb>
All bugs added by David S. Miller <davem>
8021q: adding VLAN 0 to HW filter on device eth0
8021q: adding VLAN 0 to HW filter on device eth1
8021q: adding VLAN 0 to HW filter on device eth4
8021q: adding VLAN 0 to HW filter on device bond0
8021q: adding VLAN 0 to HW filter on device bond1
BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
IP: [<ffffffffa01ff790>] bond_inet6addr_event+0x90/0x220 [bonding]
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/virtual/net/bond1/broadcast
CPU 0
Modules linked in: 8021q garp stp llc bonding(U) ipv6 dm_mirror dm_region_hash dm_log microcode ghes hed i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core sg igb dca ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif ahci megaraid_sas dm_mod [last unloaded: scsi_wait_scan]

Modules linked in: 8021q garp stp llc bonding(U) ipv6 dm_mirror dm_region_hash dm_log microcode ghes hed i2c_i801 i2c_core iTCO_wdt iTCO_vendor_support ioatdma i7core_edac edac_core sg igb dca ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif ahci megaraid_sas dm_mod [last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper Tainted: G M ---------------- 2.6.32-131.0.15.el6.x86_64 #1 QSSC-S4R
RIP: 0010:[<ffffffffa01ff790>] [<ffffffffa01ff790>] bond_inet6addr_event+0x90/0x220 [bonding]
RSP: 0018:ffff8800366037b0 EFLAGS: 00010256
RAX: ffff88046cba2b40 RBX: ffff88106dc69a88 RCX: ffff88106dc69a88
RDX: ffff88046c12b380 RSI: 0000000000000001 RDI: ffff88106bc1c000
RBP: ffff8800366037c0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: ffff88106dc69700 R12: ffffffffa02070a0
R13: 00000000ffffffff R14: 0000000000000000 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff880036600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000020 CR3: 0000000868f0d000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a2d020)
Stack:
0000000000000001 ffff88046c12b380 ffff880036603800 ffffffff814e0cf5
<0> ffff880036603810 ffff88106440d980 ffff88086dd84c00 ffff88086dd84c08
<0> 00000000ffffffff 000000000000000e ffff880036603810 ffffffff814e0d5a
Call Trace:
<IRQ>
[<ffffffff814e0cf5>] notifier_call_chain+0x55/0x80
[<ffffffff814e0d5a>] atomic_notifier_call_chain+0x1a/0x20
[<ffffffffa0191968>] ipv6_add_addr+0x1a8/0x450 [ipv6]
[<ffffffffa01962d2>] addrconf_prefix_rcv+0x292/0x9a0 [ipv6]
[<ffffffff81429b57>] ? neigh_update+0x247/0x540
[<ffffffffa01a37ed>] ndisc_rcv+0x63d/0x1070 [ipv6]
[<ffffffff814ddc79>] ? _write_unlock_bh+0x19/0x20
[<ffffffff81416ba6>] ? skb_checksum+0x56/0x2e0
[<ffffffff81417afd>] ? __skb_checksum_complete_head+0x1d/0x70
[<ffffffffa01aabd8>] icmpv6_rcv+0x3f8/0x840 [ipv6]
[<ffffffffa019c189>] ? ip6_pol_route_input+0x19/0x20 [ipv6]
[<ffffffffa01bf9e1>] ? fib6_rule_action+0xd1/0x1f0 [ipv6]
[<ffffffffa019c170>] ? ip6_pol_route_input+0x0/0x20 [ipv6]
[<ffffffff814ddc95>] ? _read_unlock_bh+0x15/0x20
[<ffffffffa01ae105>] ? ipv6_chk_mcast_addr+0x85/0x190 [ipv6]
[<ffffffffa019087a>] ip6_input_finish+0x16a/0x410 [ipv6]
[<ffffffffa0190b78>] ip6_input+0x58/0x60 [ipv6]
[<ffffffffa0190c98>] ip6_mc_input+0x118/0x250 [ipv6]
[<ffffffffa019025f>] ip6_rcv_finish+0x3f/0x50 [ipv6]
[<ffffffffa01905f8>] ipv6_rcv+0x388/0x4a0 [ipv6]
[<ffffffff8141dd6b>] __netif_receive_skb+0x39b/0x6b0
[<ffffffff81420048>] netif_receive_skb+0x58/0x60
[<ffffffff81420150>] napi_skb_finish+0x50/0x70
[<ffffffff81422489>] napi_gro_receive+0x39/0x50
[<ffffffffa0118d27>] igb_poll+0x887/0xe60 [igb]
[<ffffffff8100df89>] ? handle_irq+0x49/0xa0
[<ffffffff8106f4c8>] ? irq_exit+0x48/0x90
[<ffffffff814e3505>] ? do_IRQ+0x75/0xf0
[<ffffffff814225a3>] net_rx_action+0x103/0x2f0
[<ffffffff8106f717>] __do_softirq+0xb7/0x1e0
[<ffffffff810d6940>] ? handle_IRQ_event+0x60/0x170
[<ffffffff8100c2cc>] call_softirq+0x1c/0x30
[<ffffffff8100df05>] do_softirq+0x65/0xa0
[<ffffffff8106f505>] irq_exit+0x85/0x90
[<ffffffff814e3505>] do_IRQ+0x75/0xf0
[<ffffffff8100bad3>] ret_from_intr+0x0/0x11
<EOI>
[<ffffffff812bb86e>] ? intel_idle+0xde/0x170
[<ffffffff812bb851>] ? intel_idle+0xc1/0x170
[<ffffffff8109519d>] ? sched_clock_cpu+0xcd/0x110
[<ffffffff813eccb7>] cpuidle_idle_call+0xa7/0x140
[<ffffffff81009e96>] cpu_idle+0xb6/0x110
[<ffffffff814c376a>] rest_init+0x7a/0x80
[<ffffffff81bbdf28>] start_kernel+0x41d/0x429
[<ffffffff81bbd33a>] x86_64_start_reservations+0x125/0x129
[<ffffffff81bbd438>] x86_64_start_kernel+0xfa/0x109
Greg, if you can, please post the log showing what was happening with bonding and VLANs, as well as the full panic log (similar to what I posted in comment #12), on a 2.6.32-131-based kernel. That would probably be helpful. Though the panic trace looks the same, I would like to see what the system was doing when this hit. Can you also post the information from 'modinfo bonding' on that system once it is installed again? Thanks!
I'm closing this as INSUFFICIENT DATA. I have a lot of bugs just hanging in limbo like this. Please reopen if you are still having the problem and I'll be happy to take a look. Thanks!
My apologies for not updating this report sooner, but we have finally installed some blades in our infrastructure and re-tested creation of the bond1 interface in addition to the bond0 VLAN config, following Andy's recommendation to match the device names in the config files. There were no problems in either creation of the device or subsequent reboots.
No problem, Greg. Thanks for the update. Glad to hear all is working well!