Bug 206630
| Field | Value |
|---|---|
| Summary | Kernel BUG in skb_gso_segment and crash. |
| Product | [Fedora] Fedora |
| Component | kernel |
| Version | 5 |
| Hardware | ppc64 |
| OS | Linux |
| Status | CLOSED CURRENTRELEASE |
| Severity | medium |
| Priority | medium |
| Reporter | Alexey Bozrikov <a> |
| Assignee | Herbert Xu <herbert.xu> |
| QA Contact | Brian Brock <bbrock> |
| CC | bugs-redhat, davej, master, pc, steve, ville.lindfors, wtogami |
| Fixed In Version | 2.6.18-1.2200.fc5 |
| Doc Type | Bug Fix |
| Last Closed | 2006-10-17 06:45:26 UTC |

Description
Alexey Bozrikov
2006-09-15 12:45:11 UTC
Created attachment 136352
debug information about kernel crash (stack backtrace, registers etc)

This was fixed ages ago. We really need to update the xen code in FC5.

*** Bug 206753 has been marked as a duplicate of this bug. ***

Same problem, i686smp system using the newer kernel 2.6.17-1.2187_FC5smp

BT from the last crash:

PID: 2646  TASK: cd0c7150  CPU: 0  COMMAND: "httpd"
 #0 [cdf1ea98] crash_kexec at c0444941
 #1 [cdf1eae0] die at c040547a
 #2 [cdf1eb20] do_invalid_op at c0405bf5
 #3 [cdf1ebd0] error_code (via invalid_op) at c04049d5
    EAX: 00000000  EBX: cd91b124  ECX: 000111a3  EDX: 000111a3  EBP: 000111a3
    DS:  007b      ESI: cd91b124  ES:  007b      EDI: 00000008
    CS:  0060      EIP: c05be839  ERR: ffffffff  EFLAGS: 00010297
 #4 [cdf1ec04] skb_gso_segment at c05be839
 #5 [cdf1ec18] dev_hard_start_xmit at c05bf955
 #6 [cdf1ec3c] __qdisc_run at c05cdf7a
 #7 [cdf1ec5c] dev_queue_xmit at c05c1371
 #8 [cdf1ec78] ip_output at c05de9ba
 #9 [cdf1eca4] ip_queue_xmit at c05de1ee
#10 [cdf1ed20] tcp_transmit_skb at c05eba0c
#11 [cdf1ed70] tcp_push_one at c05ed4c6
#12 [cdf1ed88] tcp_sendmsg at c05e3bb2
#13 [cdf1ee18] do_sock_write at c05b58ef
#14 [cdf1ee34] sock_writev at c05b7a4b
#15 [cdf1ef24] do_readv_writev at c046b73c
#16 [cdf1ef8c] vfs_writev at c046b877
#17 [cdf1ef9c] sys_writev at c046bce1
#18 [cdf1efb8] system_call at c0403e38
    EAX: ffffffda  EBX: 00000011  ECX: bffa3778  EDX: 00000004
    DS:  007b      ESI: 00000004  ES:  007b      EDI: 002e5ff4
    SS:  007b      ESP: bffa35b0  EBP: bffa35d8
    CS:  0073      EIP: 00ba9410  ERR: 00000092  EFLAGS: 00000246

crash> whatis skb_gso_segment
struct sk_buff *skb_gso_segment(struct sk_buff *, int);

Looks like that function is defined in the xen patch the rpmbuild process does to the core kernel code...

Just to notice, that same kernel version (2.6.17-1.2187_FC5smp #1 SMP Mon Sep 11 02:07:57 EDT 2006 ppc ppc ppc GNU/Linux) on 32-bit PPC machine (7025-F50) does NOT crash.

The new version doesn't crash on my i386 machine

Linux wooded.hillside.co.uk 2.6.17-1.2187_FC5 #1 Mon Sep 11 01:17:06 EDT 2006 i686 i686 i386 GNU/Linux

- and version

Linux rose.cantweb.co.uk 2.6.17-1.2174_FC5 #1 SMP Tue Aug 8 15:30:44 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux

is running happily on my X86_64. The invalid op looks like a compiler error .... is it?

The bug is in the NAT code so it only shows up if you have the NAT module loaded. As I said before, the bug is already fixed in rawhide pending another Xen update for FC5.

*** Bug 204220 has been marked as a duplicate of this bug. ***

The 2.6.18 (2189) kernel in FC5 testing should cure this.

I'm getting "switchroot: mount failed" when booting the 2.6.18 kernel from testing. Unfortunately I don't have access to the console myself, so getting any further debugging is going to be hard. The root FS is a software RAID5 running on 3 SATA drives on the following interfaces:

00:1f.2 IDE interface: Intel Corporation 6300ESB SATA Storage Controller (rev 02)
03:03.0 RAID bus controller: Silicon Image, Inc. Adaptec AAR-1210SA SATA HostRAID Controller (rev 02)

Is there an archive of old FC5 updates anywhere so that I can downgrade to a working kernel in the meantime? It seems that old updates are not available in the updates repository. :(

You need to make sure that you've upgraded the xen package as well as the kernel. If the problem persists, please file a new bug. Thanks.

*** Bug 209910 has been marked as a duplicate of this bug. ***

A new kernel update has been released (Version: 2.6.18-1.2200.fc5) based upon a new upstream kernel release. Please retest against this new kernel, as a large number of patches go into each upstream release, possibly including changes that may address this problem.

This bug has been placed in NEEDINFO state. Due to the large volume of inactive bugs in bugzilla, if this bug is still in this state in two weeks time, it will be closed. Should this bug still be relevant after this period, the reporter can reopen the bug at any time. Any other users on the Cc: list of this bug can request that the bug be reopened by adding a comment to the bug.

In the last few updates, some users upgrading from FC4->FC5 have reported that installing a kernel update has left their systems unbootable. If you have been affected by this problem please check you only have one version of device-mapper & lvm2 installed. See bug 207474 for further details.

If this bug is a problem preventing you from installing the release this version is filed against, please see bug 169613.

If this bug has been fixed, but you are now experiencing a different problem, please file a separate bug for the new problem. Thank you.

Kernel 2.6.18-1.2200.fc5.ppc64 seems to fix the problem. Connections initiated through iptables NAT do not crash kernel anymore.

Alexey
bozy

I can confirm that the new kernel 2.6.18-1.2200.fc5 stayed up today on my machine, so the bug I reported appears to be fixed. Many thanks for all your efforts.
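
For anyone triaging a similar box, the two conditions mentioned in this thread (a kernel older than the fixed build, and a NAT module loaded) can be checked with a rough sketch like the one below. It is not part of the original report. The module names `ip_nat` and `iptable_nat` are an assumption about netfilter on kernels of this era, the helper names are hypothetical, and the version comparison is deliberately conservative: it treats anything older than the "Fixed In Version" build as possibly affected, even though the 2189 testing kernel was also reported to cure the problem.

```python
import re

# Assumption: NAT module names on 2.6.17/2.6.18-era kernels; adjust as needed.
NAT_MODULES = {"ip_nat", "iptable_nat"}

# "Fixed In Version" taken from this bug report.
FIXED_RELEASE = "2.6.18-1.2200.fc5"


def loaded_nat_modules(proc_modules_text):
    """Return the sorted NAT module names found in /proc/modules-style text."""
    names = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return sorted(names & NAT_MODULES)


def release_key(release):
    """Turn '2.6.17-1.2187_FC5smp' into the tuple (2, 6, 17, 1, 2187)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def possibly_affected(release, proc_modules_text):
    """True if the kernel predates the fixed build and a NAT module is loaded."""
    older = release_key(release) < release_key(FIXED_RELEASE)
    return older and bool(loaded_nat_modules(proc_modules_text))
```

On a live system the inputs would come from `uname -r` and the contents of `/proc/modules`. A `False` result only means these two conditions were not both met; it does not prove the kernel is free of the bug.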