Description of problem:
It seems that the kernel crashes every time an IPv4 route is added through a netlink socket with the NLM_F_ECHO flag specified. The 'routeexample.c' program demonstrates this problem when executed with root privileges on the 2.6.9-55.ELsmp kernel. Here is one demonstration:

------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:592!
invalid operand: 0000 [#1]
SMP
Modules linked in: md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core nfs lockd nfs_acl sunrpc uhci_hcd hw_random snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore 3c59x e100 mii floppy ata_piix libata scsi_mod dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod
CPU:    0
EIP:    0060:[<c027cfd2>]    Not tainted VLI
EFLAGS: 00010202   (2.6.9-55.ELsmp)
EIP is at pskb_expand_head+0x2c/0x114
eax: 00000001   ebx: cdd6e7a0   ecx: fffffeac   edx: 000000d0
esi: cffb4dc0   edi: cfd10400   ebp: 0000002c   esp: cce47c20
ds: 007b   es: 007b   ss: 0068
Process routeexample (pid: 4005, threadinfo=cce47000 task=cba9ae70)
Stack: cfd10400 00000000 00000154 cffb4dc0 cfd10400 cdd6e7a0 c0293507 000000d0
       00000000 cffb4dc0 00000fa5 00000040 00000000 00000000 00000000 000000d0
       cdd6e7a0 00000000 00000fa5 c02c2db0 00000018 cdd6e7a0 ce285360 cfd10400
Call Trace:
 [<c0293507>] netlink_broadcast+0x7d/0x2ce
 [<c02c2db0>] rtmsg_fib+0x7e/0x108
 [<c02c2e0d>] rtmsg_fib+0xdb/0x108
 [<c02c2877>] fn_hash_insert+0x375/0x39d
 [<c02c00c6>] inet_rtm_newroute+0x5a/0x66
 [<c02c006c>] inet_rtm_newroute+0x0/0x66
 [<c0289093>] rtnetlink_rcv+0x226/0x327
 [<c0293c32>] netlink_data_ready+0x14/0x44
 [<c029333f>] netlink_sendskb+0x52/0x6c
 [<c0293a4d>] netlink_sendmsg+0x271/0x280
 [<c02791dd>] sock_sendmsg+0xdb/0xf7
 [<c0141759>] filemap_nopage+0x194/0x302
 [<c012052d>] autoremove_wake_function+0x0/0x2d
 [<c015be7c>] fget+0x3b/0x42
 [<c027a4a3>] sys_sendto+0xc7/0xe2
 [<c011b01b>] do_page_fault+0x1ae/0x5c6
 [<c0278f2a>] sock_map_file+0x98/0x107
 [<c0292c3a>] netlink_create+0x90/0xf0
 [<c027a4d7>] sys_send+0x19/0x1d
 [<c027aca1>] sys_socketcall+0x151/0x1fb
 [<c02d5ee3>] syscall_call+0x7/0xb
Code: 57 56 53 89 c3 57 57 89 54 24 04 8b 80 ac 00 00 00 8b 54 24 1c 2b 83 a0 00 00 00 03 44 24 04 8d 2c 08 8b 83 9c 00 00 00 48 74 08 <0f> 0b 50 02 ca 98 30 c0 83 c5 7f 83 e5 80 8d 85 a0 00 00 00 e8
<0>Fatal exception: panic in 5 seconds
Kernel panic - not syncing: Fatal exception

This problem does not happen on RHEL-5.
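
For readers without access to the attachment, here is a minimal sketch of the kind of request described above: an RTM_NEWROUTE message for an IPv4 route with NLM_F_ECHO set, sent on a NETLINK_ROUTE socket. This is not the actual 'routeexample.c'. The names route_req, add_attr, build_newroute_echo and send_route_req are hypothetical, and the choice of attributes (RTA_DST plus RTA_OIF, link scope) is an assumption made only to produce a well-formed request.

```c
#include <sys/socket.h>
#include <arpa/inet.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical request layout: netlink header, route message, attribute space. */
struct route_req {
    struct nlmsghdr nh;
    struct rtmsg rt;
    char attrs[64];
};

/* Append one route attribute and grow nlmsg_len. Returns 0 on success, -1 if full. */
int add_attr(struct route_req *req, unsigned short type,
             const void *data, unsigned short len)
{
    size_t off = NLMSG_ALIGN(req->nh.nlmsg_len);
    struct rtattr *rta;

    if (off + RTA_LENGTH(len) > sizeof(*req))
        return -1;
    rta = (struct rtattr *)((char *)req + off);
    rta->rta_type = type;
    rta->rta_len = RTA_LENGTH(len);
    memcpy(RTA_DATA(rta), data, len);
    req->nh.nlmsg_len = off + RTA_LENGTH(len);
    return 0;
}

/* Fill in an "add IPv4 route" request with NLM_F_ECHO set. Nothing is sent here. */
int build_newroute_echo(struct route_req *req, const char *dst,
                        unsigned char prefix_len, unsigned int ifindex)
{
    struct in_addr addr;

    if (inet_pton(AF_INET, dst, &addr) != 1)
        return -1;

    memset(req, 0, sizeof(*req));
    req->nh.nlmsg_len = NLMSG_LENGTH(sizeof(struct rtmsg));
    req->nh.nlmsg_type = RTM_NEWROUTE;
    /* NLM_F_ECHO is the flag the report associates with the crash. */
    req->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_ECHO;

    req->rt.rtm_family = AF_INET;
    req->rt.rtm_dst_len = prefix_len;
    req->rt.rtm_table = RT_TABLE_MAIN;
    req->rt.rtm_protocol = RTPROT_BOOT;
    req->rt.rtm_scope = RT_SCOPE_LINK;   /* assumption: on-link route via a device */
    req->rt.rtm_type = RTN_UNICAST;

    if (add_attr(req, RTA_DST, &addr, sizeof(addr)) != 0)
        return -1;
    if (add_attr(req, RTA_OIF, &ifindex, sizeof(ifindex)) != 0)
        return -1;
    return 0;
}

/* Send the request to the kernel. Needs root; may panic an affected kernel. */
int send_route_req(const struct route_req *req)
{
    struct sockaddr_nl kernel;
    ssize_t n;
    int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

    if (fd < 0)
        return -1;
    memset(&kernel, 0, sizeof(kernel));
    kernel.nl_family = AF_NETLINK;       /* nl_pid 0 addresses the kernel */
    n = sendto(fd, req, req->nh.nlmsg_len, 0,
               (struct sockaddr *)&kernel, sizeof(kernel));
    close(fd);
    return n < 0 ? -1 : 0;
}
```

Building the message is harmless. Only call send_route_req() on a test machine, since the report indicates that an affected RHEL-4 kernel can panic when it processes such a request.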
Created attachment 156871 [details]
Example program that can reproduce the kernel crash

This example program can reproduce the RHEL-4 kernel crash. Just build it and run it (possibly a few times), and the problem will reproduce.

Chris Lalancette
Two upstream changesets look like they address the problem:

http://linux.bkbits.net:8080/linux-2.6/?PAGE=cset&REV=1.1966.10.121
http://linux.bkbits.net:8080/linux-2.6/?PAGE=cset&REV=1.1966.82.3

I'll be attaching backported versions of these soon.

Chris Lalancette
Created attachment 156874 [details]
[NETLINK]: Orphan SKBs in netlink_trim().

Patch 1/2 that seems to fix this problem.

Created attachment 156875 [details]
[NETLINK]: Unshare SKB, as necessary, in netlink_trim()

Patch 2/2 that seems to solve this problem.
*** This bug has been marked as a duplicate of 216752 ***
Clearing out bogus flags for reporting purposes. Chris Lalancette