Bug 132153
Summary: | 643 netconsole oops | |
---|---|---|---
Product: | [Fedora] Fedora | Reporter: | Warren Togami <wtogami>
Component: | kernel | Assignee: | Jeff Moyer <jmoyer>
Status: | CLOSED RAWHIDE | QA Contact: | Brian Brock <bbrock>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | rawhide | CC: | davej, jmoyer, jturner, wtogami
Target Milestone: | --- | |
Target Release: | --- | |
Hardware: | All | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2004-11-30 22:13:57 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |
Description
Warren Togami
2004-09-09 10:40:18 UTC
Dave already has a patch set which addresses this.

A patch that addresses this problem was part of a netdump/netconsole/diskdump patch set that was committed to the kernel/devel tree yesterday (10/12): http://post-office.corp.redhat.com/archives/cvs-commits-list/2004-October/msg01066.html Please test with current kernels.

Still happens in 2.6.8-1.610

What version of the netdump user package is running on the client machine?

This is not netdump, but netconsole, which is similar in purpose but works on all archs (netdump is x86 only). It uses plain UDP and no authentication.

How are you loading the module? What parameters are passed? I cannot reproduce your problem on the 610 kernel. What hardware are you using? UP or SMP kernel? -Jeff

modprobe netconsole netconsole=@/eth1,6667.16.102/

The oops happens:
1) If you attempt to use netconsole on a device that does not support polling. dmesg will tell you if it failed.
2) During module unload after it failed.

Back to my question, what hardware are you using? Namely, which ethernet card? Do we have access to the system displaying the problem?

Oops, I have seen this problem on airo.ko and 3c59x.ko. I have access to this hardware, one being my laptop.

Does this still happen with .639?

It did in .637 earlier today. Really need 639 tested?

Nah, that should be close enough.

Created attachment 105861 [details]
Fix for dereferencing a null ifa_list in netpoll_setup.
Warren, can you try the attached patch, please?
Tried 643 + your patch on i686.

netconsole: local port 6665
netconsole: interface eth0
netconsole: remote port 6667
netconsole: remote IP 172.31.16.102
netconsole: remote ethernet address ff:ff:ff:ff:ff:ff
netconsole: eth0 doesn't support polling, aborting.
netconsole: failed to configure syslog service
netconsole: network logging started
Unable to handle kernel NULL pointer dereference at virtual address 00000184
printing eip:
022b710c
*pde = 00000000
Oops: 0002 [#1]
Modules linked in: netconsole radeon md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core ds sunrpc microcode dm_mod button battery ac yenta_socket pcmcia_core uhci_hcd ehci_hcd snd_intel8x0m snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc gameport snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore e1000 aes_i586 airo xfs
CPU: 0
EIP: 0060:[<022b710c>] Not tainted VLI
EFLAGS: 00010246 (2.6.9-1.643.builder3)
EIP is at netpoll_cleanup+0x143/0x152
eax: 00000000 ebx: 42b14cc0 ecx: 0234f600 edx: 42b15180
esi: 023523e0 edi: 00000000 ebp: 36c26000 esp: 36c26f60
ds: 007b es: 007b ss: 0068
Process modprobe (pid: 3808, threadinfo=36c26000 task=3659e030)
Stack: 42b15200 023523e0 42b14278 02139c39 00000000 6374656e 6f736e6f 0000656c
       00000000 39512a00 f6fff000 f7000000 021565c3 39512a00 3bc6cba0 021569a2
       3bc6c660 39512a00 36c26fc0 00000004 0211937e 36c26fc4 00000000 0892d0e8
Call Trace:
 [<42b14278>] cleanup_netconsole+0x1d/0x31 [netconsole]
 [<02139c39>] sys_delete_module+0x132/0x179
 [<021565c3>] unmap_vma_list+0xe/0x17
 [<021569a2>] do_munmap+0x20e/0x218
 [<0211937e>] do_page_fault+0x0/0x511
Code: <3>Debug: sleeping function called from invalid context at include/linux/rwsem.h:43
in_atomic():0[expected: 0], irqs_disabled():1
 [<0211c8b9>] __might_sleep+0x7d/0x88
 [<0215e282>] rw_vm+0x216/0x482
 [<022b70e1>] netpoll_cleanup+0x118/0x152
 [<022b70e1>] netpoll_cleanup+0x118/0x152
 [<0215e9d4>] get_user_size+0x30/0x57
 [<022b70e1>] netpoll_cleanup+0x118/0x152
 [<0210682b>] show_registers+0x109/0x15e
 [<02106a2f>] die+0x14a/0x241
 [<0211937e>] do_page_fault+0x0/0x511
 [<0211937e>] do_page_fault+0x0/0x511
 [<02119733>] do_page_fault+0x3b5/0x511
 [<022b710c>] netpoll_cleanup+0x143/0x152
 [<02153a08>] do_no_page+0x3b5/0x434
 [<02153c7c>] handle_mm_fault+0xe4/0x21e
 [<02151de6>] follow_page_pfn+0xec/0xfd
 [<0211937e>] do_page_fault+0x0/0x511
 [<022b710c>] netpoll_cleanup+0x143/0x152
 [<42b14278>] cleanup_netconsole+0x1d/0x31 [netconsole]
 [<02139c39>] sys_delete_module+0x132/0x179
 [<021565c3>] unmap_vma_list+0xe/0x17
 [<021569a2>] do_munmap+0x20e/0x218
 [<0211937e>] do_page_fault+0x0/0x511
Bad EIP value.

Created attachment 105897 [details]
Fix for netconsole module init code
Warren, I've tested this patch and it works for me. Please give it a try and
let me know the results.
I should note that this patch should be applied in addition to the previous one.

You mean the patches in Comment #14 and Comment #16 should both be applied?

Yes, that is correct.

Thanks! The combination of those two patches prevents the module from loading in error cases like this, so it avoids the oops during unloading. Is this intended? I won't be able to test the normal non-error case until the weekend.

Yes, basically, here is what is happening. You don't specify a local IP address, and one cannot be obtained automatically. In this case, how are we supposed to send IP datagrams? We can't. So, if you want your setup to work, you need to specify a source address. So, you want to change your modprobe line:

modprobe netconsole netconsole=@/eth1,6667.16.102/

to include the source IP address after the @ and before the /eth1. If you don't know your IP address, then this sounds like quite an unusual configuration. If you need more help with this, please contact me via email. I'd be happy to help. Thanks.
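For anyone checking their own modprobe line against the advice above, here is a minimal sketch (not part of either attached patch) that splits a netconsole= value into its fields and reports whether a source IP was given. It assumes the parameter layout described in the kernel's netconsole documentation, `[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]`, and the defaults as I recall them from there (the 6665 local port and broadcast MAC match the log in this report; the 6666 target port and eth0 device are from memory). The function and type names are made up for illustration, and the addresses are placeholders from the 192.0.2.0/24 documentation range. The modprobe line as it is printed in this report has no `@` before the target address, so it does not fit that layout and the sketch rejects it.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout (from the kernel netconsole documentation):
#   [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
_NETCONSOLE_RE = re.compile(
    r"^(?P<src_port>\d*)@(?P<src_ip>[^/]*)/(?P<dev>[^,]*),"
    r"(?P<tgt_port>\d*)@(?P<tgt_ip>[^/]+)/(?P<tgt_mac>.*)$"
)


class NetconsoleConfig(NamedTuple):
    """Hypothetical container for the parsed fields (illustration only)."""
    src_port: int
    src_ip: Optional[str]   # None when the field was left empty
    dev: str
    tgt_port: int
    tgt_ip: str
    tgt_mac: str


def parse_netconsole(value: str) -> NetconsoleConfig:
    """Parse the text after 'netconsole=' and fill in assumed defaults."""
    m = _NETCONSOLE_RE.match(value)
    if m is None:
        raise ValueError("not a recognisable netconsole= value: %r" % value)
    return NetconsoleConfig(
        src_port=int(m.group("src_port") or 6665),
        src_ip=m.group("src_ip") or None,
        dev=m.group("dev") or "eth0",
        tgt_port=int(m.group("tgt_port") or 6666),
        tgt_ip=m.group("tgt_ip"),
        tgt_mac=m.group("tgt_mac") or "ff:ff:ff:ff:ff:ff",
    )


def missing_source_ip(cfg: NetconsoleConfig) -> bool:
    """True when no source IP was given, the case discussed in this bug."""
    return cfg.src_ip is None


if __name__ == "__main__":
    # Placeholder addresses; substitute your own.
    for line in ("@/eth1,6667@192.0.2.10/",
                 "@192.0.2.5/eth1,6667@192.0.2.10/"):
        cfg = parse_netconsole(line)
        print(line, "->", "needs a source IP" if missing_source_ip(cfg) else "ok")
```

A value with nothing between the first `@` and the `/` is the case discussed above: the kernel then has to find a local address on the interface by itself, and if it cannot, the module should refuse to load once both patches are applied.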