Description of problem:

The system panicked in the spinlock code because the lock's magic value was corrupted. Before the panic, the system was running under low memory conditions, and many OOM events were seen in the log.

------------[ cut here ]------------
kernel BUG at include/asm/spinlock.h:133!
invalid operand: 0000 [#1]
SMP
Modules linked in: netconsole netdump autofs4 i2c_dev i2c_core nfs lockd nfs_acl sunrpc md5 ipv6 vmmemctl(U) cpufreq_powersave dm_mirror dm_multipath dm_mod button battery ac pcnet32 vmxnet(U) mii floppy ext3 jbd ata_piix libata mptscsih mptsas mptspi mptscsi mptbase sd_mod scsi_mod
CPU:    2
EIP:    0060:[<c02dc7e7>]    Not tainted VLI
EFLAGS: 00010246   (2.6.9-89.0.11.ELsmp)
EIP is at _spin_lock+0x1c/0x34
eax: c02f0760   ebx: f28667ec   ecx: f699dea8   edx: c029b506
esi: d98fbd80   edi: e262b540   ebp: 000000d0   esp: f699deac
ds: 007b   es: 007b   ss: 0068
Process csnmpd (pid: 5022, threadinfo=f699d000 task=f7b647f0)
Stack: e262b540 c029b506 000000d0 d2f71a00 f28667ec f28667c0 00000006 e262b540
       f28667ec f28667c0 f689ca80 c029b75e f7fe1c00 f7fd3dc0 d2f71a00 c0358910
       00000300 00000006 c029030a c02900c6 c02901d6 00000000 00000001 00000014
Call Trace:
 [<c029b506>] netlink_dump+0x46/0x18b
 [<c029b75e>] netlink_dump_start+0x113/0x11c
 [<c029030a>] rtnetlink_rcv+0x131/0x327
 [<c02900c6>] rtnetlink_dump_all+0x0/0x8f
 [<c02901d6>] rtnetlink_done+0x0/0x3
 [<c029b3d3>] netlink_data_ready+0x14/0x44
 [<c028a006>] dev_ioctl+0x76/0x4ab
 [<c015cc6b>] fd_install+0x21/0x40
 [<c02bdaff>] udp_ioctl+0x0/0xc5
 [<c02c3ca6>] inet_ioctl+0xa0/0xa5
 [<c02808b3>] sock_ioctl+0x28c/0x2b4
 [<c016dc0a>] sys_ioctl+0x227/0x269
 [<c0126fc5>] sys_gettimeofday+0x53/0xac
 [<c02ddcdb>] syscall_call+0x7/0xb
Code: 00 75 09 f0 81 02 00 00 00 01 30 c9 89 c8 c3 53 89 c3 81 78 04 ad 4e ad de 74 18 ff 74 24 04 68 60 07 2f c0 e8 3d 6d e4 ff 58 5a <0f> 0b 85 00 68 f7 2e c0 f0 fe 0b 79 09 f3 90 80 3b 00 7e f9 eb

Version-Release number of selected component (if applicable):
2.6.9-89.0.11.ELsmp

How reproducible:
Only once so far.
Steps to Reproduce:
1.
2.
3.

Actual results:
Kernel panic.

Expected results:
No panic.

Additional info:
vmcore is available at core-i386.gsslab.rdu.redhat.com:/cores/20100802103059/work
crash> bt
PID: 5022   TASK: f7b647f0   CPU: 2   COMMAND: "csnmpd"
 #0 [f699dd68] netpoll_start_netdump at f8960570
 #1 [f699dd88] die at c0106049
 #2 [f699ddbc] do_invalid_op at c0106424
 #3 [f699de6c] error_code (via invalid_op) at c02de7d1
    EAX: c02f0760  EBX: f28667ec  ECX: f699dea8  EDX: c029b506  EBP: 000000d0
    DS:  007b      ESI: d98fbd80  ES:  007b      EDI: e262b540
    CS:  0060      EIP: c02dc7e7  ERR: ffffffff  EFLAGS: 00010246
 #4 [f699dea8] _spin_lock at c02dc7e7
 #5 [f699deb0] netlink_dump at c029b501
 #6 [f699ded8] netlink_dump_start at c029b759
 #7 [f699def4] rtnetlink_rcv at c0290305
 #8 [f699df28] netlink_data_ready at c029b3d1
 #9 [f699df30] dev_ioctl at c028a000
#10 [f699df68] inet_ioctl at c02c3ca1
#11 [f699df7c] sock_ioctl at c02808b0
#12 [f699df94] sys_ioctl at c016dc07
#13 [f699dfc0] system_call at c02ddcd4
    EAX: 00000036  EBX: 00000009  ECX: 00008912  EDX: bfebdf58
    DS:  007b      ESI: bfebdf58  ES:  007b      EDI: 00000008
    SS:  007b      ESP: bfebde94  EBP: bfebded8
    CS:  0073      EIP: 004dc7a2  ERR: 00000036  EFLAGS: 00000206

Pick up the netlink_opt pointer from the stack:

crash> netlink_opt f28667c0
struct netlink_opt {
  pid = 0x0,
  groups = 0x1,
  dst_pid = 0x0,
  dst_groups = 0x0,
  state = 0xf5616e00,
  handler = 0xf5616e00,
  wait = {
    lock = {
      lock = 0x2b2f2507,
      magic = 0xf6eee580
    },
    task_list = {
      next = 0x7,
      prev = 0xdead4ead
    }
  },
  cb = 0xf28667e8,
  cb_lock = {
    lock = 0xf28667e8,
    magic = 0xc033b524   ---> The magic should be 0xdead4ead, but it was already corrupted, so the system panicked when acquiring the spinlock
  },
  data_ready = 0
}

Get the parameter passed to netlink_dump (a pointer to struct sock) from the EDI register. It looks good and is not corrupted, so f28667c0 should be a valid pointer:

crash> sock e262b540
struct sock {
  ...
  sk_lock = {
    slock = {
      lock = 0x1,
      magic = 0xdead4ead
    },
    owner = 0x0,
    wq = {
      lock = {
        lock = 0x1,
        magic = 0xdead4ead
      },
      task_list = {
        next = 0xe262b574,
        prev = 0xe262b574
      }
    }
  },
  sk_rcvbuf = 0x100000,
  sk_sleep = 0x0,
  sk_dst_cache = 0x0,
  sk_dst_lock = {
    lock = 0x1000000,
    magic = 0xdeaf1eed
  },
  sk_policy = {0x0, 0x0},
  sk_rmem_alloc = {
    counter = 0x1030
  },
  sk_receive_queue = {
    next = 0xe262b59c,
    prev = 0xe262b59c,
    qlen = 0x0,
    lock = {
      lock = 0x1,
      magic = 0xdead4ead
    }
  },
  sk_wmem_alloc = {
    counter = 0x0
  },
  sk_write_queue = {
    next = 0xe262b5b4,
    prev = 0xe262b5b4,
    qlen = 0x0,
    lock = {
      lock = 0x1,
      magic = 0xdead4ead
    }
  },
  sk_omem_alloc = {
    counter = 0x0
  },
  sk_wmem_queued = 0x0,
  sk_forward_alloc = 0x0,
  sk_allocation = 0xd0,
  sk_sndbuf = 0x100000,
  sk_flags = 0x1,
  sk_no_check = 0x0,
  sk_debug = 0x0,
  sk_rcvtstamp = 0x0,
  sk_no_largesend = 0x0,
  sk_route_caps = 0x0,
  sk_lingertime = 0x0,
  sk_hashent = 0x0,
  sk_backlog = {
    head = 0x0,
    tail = 0x0
  },
  sk_callback_lock = {
    lock = 0x1000000,
    magic = 0xdeaf1eed
  },
  sk_error_queue = {
    next = 0xe262b600,
    prev = 0xe262b600,
    qlen = 0x0,
    lock = {
      lock = 0x1,
      magic = 0xdead4ead
    }
  },
  sk_prot = 0x0,
  sk_err = 0x0,
  sk_err_soft = 0x0,
  sk_ack_backlog = 0x0,
  sk_max_ack_backlog = 0x0,
  sk_priority = 0x0,
  sk_type = 0x3,
  sk_localroute = 0x0,
  sk_protocol = 0x0,
  sk_peercred = {
    pid = 0x0,
    uid = 0xffffffff,
    gid = 0xffffffff
  },
  sk_rcvlowat = 0x1,
  sk_rcvtimeo = 0x7fffffff,
  sk_sndtimeo = 0x7fffffff,
  sk_filter = 0x0,
  sk_protinfo = 0xf28667c0,   ---> the pointer to structure netlink_opt
  sk_slab = 0xf7f06740,
  ...
}

Have a look at the source code of netlink_dump_start:

<snip>
int netlink_dump_start(struct sock *ssk, struct sk_buff *skb,
                       struct nlmsghdr *nlh,
                       int (*dump)(struct sk_buff *skb, struct netlink_callback *),
                       int (*done)(struct netlink_callback *))
{
        ...
        /* A dump is in progress...
         */
        spin_lock(&nlk->cb_lock);
        if (nlk->cb) {
                spin_unlock(&nlk->cb_lock);
                netlink_destroy_callback(cb);
                sock_put(sk);
                return -EBUSY;
        }
        nlk->cb = cb;
        spin_unlock(&nlk->cb_lock);

        netlink_dump(sk);
        return 0;
}
</snip>

Before calling netlink_dump, this function had already acquired and released "cb_lock" without any panic, so the lock must have been corrupted after that point. Check how recently the process "csnmpd" had run:

crash> px per_cpu__runqueues
PER-CPU DATA TYPE:
  struct runqueue per_cpu__runqueues;
PER-CPU ADDRESSES:
  [0]: c5055d80
  [1]: c505dd80
  [2]: c5065d80
  [3]: c506dd80
crash> runqueue c5065d80 | grep last
  timestamp_last_tick = 0x11d5301eebd680,
crash> task_struct.last_ran f7b647f0
  last_ran = 0x11d5301dd92e00,
crash> px (5019477258000000-5019477240000000)/1000000
$4 = 0x12   --> 0x12 = 18 ms

Search for references to f28667c0:

crash> search -k f28667c0
c5290108: f28667c0   --> in cpu1's array_cache of slab "size-64"
c5290168: f28667c0   --> the same as above
e262b648: f28667c0   --> the member sk_protinfo of sock e262b540
f5616dfc: f28667c0   --> used by an nfs_inode
f699dec0: f28667c0   --> in the stack of the panicked process
f699ded0: f28667c0   --> the same as above

crash> kmem -s | grep size-64
kmem: "size-1620" cache: object e1070300 on both cpu 2 and shared lists
kmem: "size-1620" cache: object e1070980 on both cpu 2 and shared lists
kmem: "size-1620" cache: object e112e2c0 on both cpu 2 and shared lists
kmem: "size-1620" cache: object e112e940 on both cpu 2 and shared lists
kmem: "size-1620" cache: object e112efc0 on both cpu 2 and shared lists
kmem: "size-1620" cache: object e112f640 on both cpu 2 and shared lists
kmem: "size-1620" cache: object ece7c280 on both cpu 2 and shared lists
kmem: "size-1620" cache: object ece7d600 on both cpu 2 and shared lists
kmem: "size-1620" cache: object d4ff55c0 on both cpu 2 and shared lists
kmem: "size-1620" cache: object cfca0f00 on both cpu 2 and shared lists
kmem: "size-1620" cache: object f3ec4840 on both cpu 2 and shared lists
kmem: "size-1620" cache: object dc0b0180 on both cpu 2 and shared lists
kmem: "size-1620" cache: object dc0b1500 on both cpu 2 and shared lists
kmem: "size-1620" cache: object f5bd0300 on both cpu 2 and shared lists
kmem: "size-1620" cache: object f541d040 on both cpu 2 and shared lists
f7fff340 size-64(DMA)    64       0       0     0   4k
f7fff240 size-64         64    4766   18971   311   4k

crash> kmem_cache_s.array f7fff240
  array = {0xc523f000, 0xc5290000, 0xc52c8000, 0xf7e02000, 0x0, 0x0, 0x0, 0x0,
           0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0,
           0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0},

crash> array_cache 0xc5290000
struct array_cache {
  avail = 0x26,
  limit = 0x78,
  batchcount = 0x3c,
  touched = 0x1
}

crash> rd 0xc5290000 100
c5290000:  00000026 00000078 0000003c 00000001   &...x...<.......
...
c52900e0:  f7ca7c40 f3314c40 f7ca7600 f7ca7a40   @|..@L1..v..@z..
c52900f0:  f7ca78c0 f7ca7d00 f7ca7b40 f689c4c0   .x...}..@{......
c5290100:  f2866e80 f2866200 f28667c0 f7c87300   .n...b...g...s..
                             ^^^^^^^^
c5290110:  e4617cc0 e7595c80 f7c89280 f6899400   .|a..\Y.........
c5290120:  f6899d00 e44f3e40 e44f3fc0 e44f3a80   ....@>O..?O..:O.
c5290130:  e44f37c0 f2866740 f6e12540 e44f3600   .7O.@g..@%...6O.
c5290140:  e44f3d80 f7b9bb40 f7b9be40 f7b9b500   .=O.@...@.......
c5290150:  f3314c80 f2866700 f2866340 f2866f00   .L1..g..@c...o..
c5290160:  f2866e80 f2866200 f28667c0 f7c87300   .n...b...g...s..
                             ^^^^^^^^
c5290170:  e4617cc0 e7595c80 f7c89280 e7594400   .|a..\Y......DY.
c5290180:  f7b359c0 eca01600 eca01600 eca01600   .Y..............

f28667c0 appears twice among the entries past the "avail" cursor (0x26) in cpu1's per-CPU array_cache of slab "size-64". I think this is a problem. Another question: how could the process "csnmpd", running on cpu2, be using a slab object cached on cpu1?
crash> nfs_inode.access_cache,access_cache_entry_lru 0xf5616d44
  access_cache = {
    rb_node = 0xf28667c0
  },
  access_cache_entry_lru = {
    next = 0xf28667d0,
    prev = 0xf28667d0
  },

crash> nfs_access_entry f28667c0
struct nfs_access_entry {
  rb_node = {
    rb_parent = 0x0,
    rb_color = 0x1,
    rb_right = 0x0,
    rb_left = 0x0
  },
  lru = {
    next = 0xf5616e00,
    prev = 0xf5616e00
  },
  jiffies = 0x2b2f2507,
  cred = 0xf6eee580,
  mask = 0x7
}

crash> rpc_cred.cr_magic 0xf6eee580
  cr_magic = 0xf4aa4f0,   -----> the same magic as defined in the source code

crash> px jiffies
jiffies = $6 = 0x2b2fe8fd
crash> px 0x2b2fe8fd-0x2b2f2507
$7 = 0xc3f6   -----> the entry was stamped before the last run of process "csnmpd"

So it appears that the size-64 object f28667c0 was first allocated as a netlink_opt and, while still in use, was erroneously handed out again as an nfs_access_entry. When "csnmpd" was scheduled back onto a CPU, the system panicked because the netlink_opt structure had been corrupted.