Bug 145959
| Summary: | kernel BUG at net/core/skbuff.c:228! | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Xavier Mertens <xavier> |
| Component: | kernel | Assignee: | John W. Linville <linville> |
| Status: | CLOSED WONTFIX | Severity: | medium |
| Priority: | medium | Version: | rawhide |
| Hardware: | i686 | OS: | Linux |
| CC: | davej, wtogami, xavier | Doc Type: | Bug Fix |
| Last Closed: | 2005-05-13 15:25:35 UTC | | |
Description
Xavier Mertens
2005-01-24 13:29:25 UTC
[root@rproxy2 ~]# ifconfig eth0 down
Warning: kfree_skb passed an skb still on a list (from c0267d93).
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:228!
invalid operand: 0000 [#1]
SMP
Modules linked in: ipt_state ip_conntrack ipt_LOG iptable_filter ip_tables md5 ipv6 i2c_dev i2c_core dm_mod video button battery ac uhci_hcd ehci_hcd e1000 floppy sg ext3 jbd megaraid_mbox megaraid_mm sd_mod scsi_mod
CPU: 1
EIP: 0060:[<c0267cb8>] Not tainted VLI
EFLAGS: 00010206 (2.6.10-1.741_FC3smp)
EIP is at __kfree_skb+0x19/0xf7
eax: 00000045 ebx: f79c33b8 ecx: c66cceb8 edx: c02ef542
esi: f79c3240 edi: f88ad1f4 ebp: 00000019 esp: c66cceb4
ds: 007b es: 007b ss: 0068
Process ifconfig (pid: 20127, threadinfo=c66cc000 task=c3fc9a40)
Stack: c02ef542 c0267d93 dd8a8c80 f79c33b8 f88e5627 00000100 000001f4 f79c3240
f79c3000 00001003 00000000 f88e4121 f88e4a1a f79c3240 00001042 f88e4a25
f79c3000 c026c38a f79c3000 c026d444 f7205780 ffffff9d f72057ac f7446080
Call Trace:
[<c0267d93>] __kfree_skb+0xf4/0xf7
[<f88e5627>] e1000_clean_rx_ring+0x48/0xf2 [e1000]
[<f88e4121>] e1000_down+0x82/0xc1 [e1000]
[<f88e4a1a>] e1000_close+0x0/0x1d [e1000]
[<f88e4a25>] e1000_close+0xb/0x1d [e1000]
[<c026c38a>] dev_close+0x57/0x77
[<c026d444>] dev_change_flags+0x48/0xee
[<c02a2345>] devinet_ioctl+0x26e/0x4de
[<c02a3e3d>] inet_ioctl+0x79/0xa5
[<c026522b>] sock_ioctl+0x22a/0x238
[<c0160adb>] sys_ioctl+0x1d5/0x1f2
[<c0103c97>] syscall_call+0x7/0xb
Code: e8 93 ff ff ff 89 da 5b a1 58 2e 43 c0 e9 5b 75 ed ff 53 52 89 04 24 83 78 08 00 74 18 ff 74 24 fc 68 42 f5 2e c0 e8 a5 62 eb ff <0f> 0b e4 00 0c f5 2e c0 59 5b 8b 04 24 8b 58 30 85 db 74 2c 8b
Segmentation fault

Hi,

Less than 24 hours later, same problem: new kernel panic! Here is the dump:

Unable to handle kernel NULL pointer dereference at virtual address 0000000c
printing eip:
f89710e0
*pde = 375d6001
Oops: 0000 [#1]
SMP
Modules linked in: ipt_state ip_conntrack ipt_LOG iptable_filter ip_tables md5 ipv6 i2c_dev i2c_core dm_mod video button battery ac uhci_hcd ehci_hcd e1000 floppy sg ext3 jbd megaraid_mbox megaraid_mm sd_mod scsi_mod
CPU: 1
EIP: 0060:[<f89710e0>] Not tainted VLI
EFLAGS: 00010246 (2.6.10-1.741_FC3smp)
EIP is at ipt_do_table+0xc4/0x2fc [ip_tables]
eax: 00000000 ebx: 00000000 ecx: 00000000 edx: 00000000
esi: f8b9c04c edi: e0585020 ebp: 00000070 esp: c03ace50
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c03ac000 task=f7f48530)
Stack: f8b9bc08 f8b9b180 f8976080 f7547000 ffffffff 00000000 00000000 f7547000
00000001 c03aced8 00000000 c03aced8 c03acedc c0433b88 00000001 f893b017
00000000 f893bac0 00000000 f893bb00 c027476f 00000000 c0282153 c03aced8
Call Trace:
[<f893b017>] ipt_hook+0x17/0x1c [iptable_filter]
[<c027476f>] nf_iterate+0x40/0x81
[<c0282153>] ip_local_deliver_finish+0x0/0x188
[<c0274a6d>] nf_hook_slow+0x47/0xb4
[<c0282153>] ip_local_deliver_finish+0x0/0x188
[<c028214c>] ip_local_deliver+0x1d7/0x1de
[<c0282153>] ip_local_deliver_finish+0x0/0x188
[<c0282891>] ip_rcv_finish+0x1b7/0x202
[<c0274aa9>] nf_hook_slow+0x83/0xb4
[<c0282695>] ip_rcv+0x3ba/0x3ff
[<c02826da>] ip_rcv_finish+0x0/0x202
[<c026cda7>] netif_receive_skb+0x1de/0x20c
[<f88e75d7>] e1000_clean_rx_irq+0x2fe/0x36b [e1000]
[<f88e6fba>] e1000_clean+0x3d/0xaf [e1000]
[<c026cf33>] net_rx_action+0x61/0xd8
[<c0121f60>] __do_softirq+0x4c/0xb1
[<c0105d9f>] do_softirq+0x41/0x48
=======================
[<c0105cd0>] do_IRQ+0x74/0x7e
[<c010467e>] common_interrupt+0x1a/0x20
[<c01020e8>] mwait_idle+0x33/0x42
[<c01020a0>] cpu_idle+0x26/0x3b
Code: 54 24 20 8b 7c 24 04 03 7c 91 20 03 74 91 0c 89 3c 24 8b 44 24 24 8b 10 8b 46 54 09 82 84 00 00 00 8b 54 24 14 8b 0e 0f b6 5e 53 <8b> 42 0c 8b 56 08 f6 c3 08 74 0c 21 d0 39 c8 0f 84 e7 01 00 00
<0>Kernel panic - not syncing: Fatal exception in interrupt

Let's try an updated e1000 driver...available here:

http://people.redhat.com/linville/kernels/fc3/

Please give that a try and see if it works any better for you...let me know...thanks!

Xavier, any word on the results of testing the kernels w/ the updated e1000 driver?

Hi John,

I was quite busy until today... :-/ I'll test your patched kernel tomorrow or today. I'll keep you informed.

Regards,
Xavier

Hi John,

Seems to be ok. Server has been up for more than 5 hours without a problem. Let's wait 24h. Thanks for providing a patch!

Hi John,

Server crashed once again :(

Unable to handle kernel NULL pointer dereference at virtual address 0000000c
printing eip:
f89710e0
*pde = 171cc001
Oops: 0000 [#1]
SMP
Modules linked in: ipt_state ip_conntrack ipt_LOG iptable_filter ip_tables md5
CPU: 1
EIP: 0060:[<f89710e0>] Not tainted VLI
EFLAGS: 00010246 (2.6.10-1.769.2.3_FC3.jwltest.1smp)
EIP is at ipt_do_table+0xc4/0x2fc [ip_tables]
eax: 00000000 ebx: 00000000 ecx: 00000000 edx: 00000000
esi: f8bc018c edi: d883c020 ebp: 00000070 esp: c03abe50
ds: 007b es: 007b ss: 0068
Process swapper (pid: 0, threadinfo=c03ab000 task=f7f48540)
Stack: f8bbfd48 f8bbf200 f8976080 f79ff000 ffffffff 00000000 00000000 f79ff000
00000001 c03abed8 00000000 c03abed8 c03abedc c0433ce8 00000001 f893b017
00000000 f893bac0 00000000 f893bb00 c027517b 00000000 c0282bfb c03abed8
Call Trace:
[<f893b017>] ipt_hook+0x17/0x1c [iptable_filter]
[<c027517b>] nf_iterate+0x40/0x81
[<c0282bfb>] ip_local_deliver_finish+0x0/0x188
[<c0275479>] nf_hook_slow+0x47/0xb4
[<c0282bfb>] ip_local_deliver_finish+0x0/0x188
[<c0282bf4>] ip_local_deliver+0x1d7/0x1de
[<c0282bfb>] ip_local_deliver_finish+0x0/0x188
[<c0283339>] ip_rcv_finish+0x1b7/0x202
[<c02754b5>] nf_hook_slow+0x83/0xb4
[<c028313d>] ip_rcv+0x3ba/0x3ff
[<c0283182>] ip_rcv_finish+0x0/0x202
[<c026d7b3>] netif_receive_skb+0x1de/0x20c
[<f88e7125>] e1000_clean_rx_irq+0x34a/0x3b9 [e1000]
[<f88e6b7d>] e1000_clean+0x40/0xd4 [e1000]
[<c026d93f>] net_rx_action+0x61/0xd8
[<c012211c>] __do_softirq+0x4c/0xb1
[<c0105db7>] do_softirq+0x41/0x48
=======================
[<c0105ce8>] do_IRQ+0x74/0x7e
[<c010467e>] common_interrupt+0x1a/0x20
[<c01020e8>] mwait_idle+0x33/0x42
[<c01020a0>] cpu_idle+0x26/0x3b
Code: 54 24 20 8b 7c 24 04 03 7c 91 20 03 74 91 0c 89 3c 24 8b 44 24 24 8b 10
<0>Kernel panic - not syncing: Fatal exception in interrupt

Hmmm...the crash is happening in the netfilter code...it isn't clear to me that this is actually an e1000 problem...

Would you mind attaching your iptables configuration? If you don't want to do so publicly, you can send it directly to me via e-mail. If you are even too paranoid for that, we can probably work out something else... :-)

Here we go...

IPT="/sbin/iptables"
MPB="/sbin/modprobe"
LSM="/sbin/lsmod"

# Get our IP config
LAN_IF=eth0
IP=`/sbin/ifconfig $LAN_IF | grep inet | cut -d : -f 2 | cut -d ' ' -f 1`
MASK=`/sbin/ifconfig $LAN_IF | grep Mas | cut -d : -f 4`
NET=$IP/$MASK
echo "Firewall applied on: $LAN_IF/$NET"

# Flush and zero the chains.
$IPT -F
$IPT -X
$IPT -Z

# Delete `nat' and `mangle' chains.
if ( $LSM | /bin/grep iptable_mangle > /dev/null ); then
  $IPT -t mangle -F
fi
if ( $LSM | /bin/grep iptable_nat > /dev/null ); then
  $IPT -t nat -F
fi

# Create a new log and drop (LD) convenience chain.
$IPT -N LD
$IPT -A LD -j LOG
$IPT -A LD -j DROP
STOP=LD
TOSOPT=8

# Allow all traffic on the loopback interface
$IPT -t filter -A INPUT -i lo -s 127.0.0.0/8 -d 127.0.0.0/8 -j ACCEPT
$IPT -t filter -A OUTPUT -o lo -s 127.0.0.0/8 -d 127.0.0.0/8 -j ACCEPT

# Turn on source address verification in kernel
if [ -e /proc/sys/net/ipv4/conf/all/rp_filter ]; then
  for f in /proc/sys/net/ipv4/conf/*/rp_filter
  do
    echo 2 > $f
  done
fi

# Turn on syn cookies protection in kernel
if [ -e /proc/sys/net/ipv4/tcp_syncookies ]; then
  echo 1 > /proc/sys/net/ipv4/tcp_syncookies
fi

# ICMP Dead Error Messages protection
if [ -e /proc/sys/net/ipv4/icmp_ignore_bogus_error_responses ]; then
  echo 1 > /proc/sys/net/ipv4/icmp_ignore_bogus_error_responses
fi

# ICMP Broadcasting protection
if [ -e /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts ]; then
  echo 1 > /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts
fi

# Turn off dynamic TCP/IP address hacking
if [ -e /proc/sys/net/ipv4/ip_dynaddr ]; then
  echo 0 > /proc/sys/net/ipv4/ip_dynaddr
fi

# Doubling current limit for ip_conntrack
if [ -e /proc/sys/net/ipv4/ip_conntrack_max ]; then
  echo 16384 > /proc/sys/net/ipv4/ip_conntrack_max
fi

# Accept ESTABLISHED sessions
$IPT -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Allow SSH
$IPT -t filter -A INPUT -p tcp -s 0/0 -d $NET --dport 22 -j ACCEPT

# Allow remote backup start from depot001
$IPT -t filter -A INPUT -p tcp -s 10.50.10.215/32 -d $NET --dport 2988 -j ACCEPT

# Monitoring
# tcp/1040 -> Nagios
# icmp
$IPT -t filter -A INPUT -p tcp -s 10.50.10.5/32 -d $NET --dport 1040 -j ACCEPT
$IPT -t filter -A INPUT -p icmp -s 10.50.10.5/32 -d $NET -j ACCEPT

# ntp.belga.be
$IPT -t filter -A INPUT -p tcp -s 10.50.10.5/32 -d $NET --dport 123 -j ACCEPT
$IPT -t filter -A INPUT -p udp -s 10.50.10.5/32 -d $NET --dport 123 -j ACCEPT

# Allow HTTP (Reverse proxy)
$IPT -t filter -A INPUT -p tcp -s 0/0 -d $NET --dport 80 -j ACCEPT
$IPT -t filter -A INPUT -p tcp -s 10.50.10.247 --sport 80 -d $NET -j ACCEPT
$IPT -t filter -A INPUT -p tcp -s 0/0 -d $NET --dport 443 -j ACCEPT

$IPT -A OUTPUT -j ACCEPT

# ---------------------------
# Do not log annoying traffic
# ---------------------------
# SMB
$IPT -t filter -A INPUT -p udp -s 0/0 -d $NET --dport 137:139 -j DROP
# IGMP
$IPT -t filter -A INPUT -p 2 -s 0/0 -d 0/0 -j DROP
# BOOTP
$IPT -t filter -A INPUT -p udp -s 0/0 -d 0/0 --dport 67:68 -j DROP
# UDP/694 (ftp heartbeat)
$IPT -t filter -A INPUT -p udp -s 192.168.2.0/24 -d 0/0 --dport 694 -j DROP

# Deny everything not let through earlier
$IPT -A INPUT -j $STOP

Xavier,

I've been doing some work w/ the e1000 driver for another issue. In fact, some of the symptoms with that issue look very much like what you have reported here. Along the way, I've updated to a later version and added a couple of fixes.

I'd like for you to re-conduct your tests with the newer version. I have pre-built test kernels in the same place as in comment 3. Please attempt to recreate the issue with those kernels and post the results. Thanks!

The e1000 drivers in the kernels @ comment 3 have been updated yet again. Please give them a try and report back the results ASAP. Thanks!

Closing due to lack of response. Please reopen with the results of running with the latest kernels from the location in comment 3 if the problem persists. Thanks!
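
A note for anyone re-reading the second oops above (the NULL pointer dereference in ipt_do_table): the faulting bytes marked in its Code: line are `<8b> 42 0c`. Read as an i386 instruction, that is `mov 0xc(%edx),%eax`, a load from edx plus 0x0c. This decoding is my own reading of the bytes and is not stated anywhere in the thread. The register dump shows edx: 00000000, so the load address works out to the reported fault address 0000000c. A minimal sketch of that arithmetic, where `fault_addr` is a hypothetical helper name and not part of any tool:

```shell
# Hypothetical helper (not from the report): add a base register value and a
# displacement, both given as hex digits without the 0x prefix, and print the
# sum in the 8-digit form used by the oops output.
fault_addr() {
    printf '%08x\n' $(( 0x$1 + 0x$2 ))
}

# edx from the register dump, displacement from the bytes "<8b> 42 0c"
fault_addr 00000000 0c
```

This only shows that the register dump and the fault address are consistent with each other. It does not say why the pointer held in edx was NULL.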