Description of problem:
On the RHTS machine hp-dl785g5-01.rhts.eng.bos.redhat.com, the kernel panics when running the /kernel/hotplug/cpusofthotplug test.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:
Initializing CPU#47
Six-Core AMD Opteron(tm) Processor 8439 SE stepping 00
CPU 13 is now offline
CPU 2 is now offline
CPU 15 is now offline
CPU 9 is now offline
CPU 10 is now offline
CPU 35 is now offline
Initializing CPU#9
Six-Core AMD Opteron(tm) Processor 8439 SE stepping 00
CPU 4 is now offline
CPU 40 is now offline
CPU 1 is now offline
CPU 11 is now offline
Unable to handle kernel NULL pointer dereference at 00000000000000c0 RIP:
 [<ffffffff80080312>] cacheinfo_cpu_callback+0x458/0x4c3
PGD 202835b067 PUD 202bd7e067 PMD 0
Oops: 0002 [1] SMP
last sysfs file: /kernel/kexec_crash_loaded
CPU 0
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 xfrm_nalgo crypto_api cpufreq_ondemand powernow_k8 freq_table dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport shpchp ide_cd cdrom i2c_piix4 i2c_core hpilo bnx2 serio_raw pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod qla2xxx scsi_transport_fc cciss sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 29539, comm: runtest.sh Not tainted 2.6.18-164.8.1.el5 #1
RIP: 0010:[<ffffffff80080312>] [<ffffffff80080312>] cacheinfo_cpu_callback+0x458/0x4c3
RSP: 0018:ffff812029e6dd28 EFLAGS: 00010293
RAX: 00000000000000a8 RBX: 000000000000000b RCX: 0000000000000000
RDX: 0000000000000023 RSI: 00000000000000ff RDI: ffff81042f9c9c40
RBP: ffff81042f9c9c28 R08: 000000000000001c R09: 0000000000000024
R10: ffff81042f9c9c40 R11: ffffffff8002f3d9 R12: 00000000000000a8
R13: 0000000000000003 R14: 0000000000000150 R15: ffff81041fe578c0
FS: 00002b7a7ced2e10(0000) GS:ffffffff803c1000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000000c0 CR3: 0000002029253000 CR4: 00000000000006e0
Process runtest.sh (pid: 29539, threadinfo ffff812029e6c000, task ffff81202feff860)
Stack: ffff810422acc300 0000000000000282 ffff810422acc300 ffff81042d89d880
 ffffffff8045b408 0000000000000282 ffff81010eaeb690 ffff81010eaeb680
 ffff81010eaeb690 ffffffff803333d0 ffffffff80333320 ffff81010eaeb690
Call Trace:
 [<ffffffff8014e7f1>] kobject_cleanup+0x62/0x7e
 [<ffffffff8014e80d>] kobject_release+0x0/0x9
 [<ffffffff8014e80d>] kobject_release+0x0/0x9
 [<ffffffff8014e80d>] kobject_release+0x0/0x9
 [<ffffffff80066eaa>] notifier_call_chain+0x20/0x32
 [<ffffffff800a45bb>] _cpu_down+0x191/0x265
 [<ffffffff800a46b8>] cpu_down+0x29/0x3d
 [<ffffffff801c3eaf>] store_online+0x29/0x67
 [<ffffffff8010ae69>] sysfs_write_file+0xb9/0xe8
 [<ffffffff80016942>] vfs_write+0xce/0x174
 [<ffffffff800171fa>] sys_write+0x45/0x6e
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

Code: f0 0f b3 58 18 48 8d 75 18 89 d7 e8 00 dd 0c 00 3d fe 00 00
RIP [<ffffffff80080312>] cacheinfo_cpu_callback+0x458/0x4c3
 RSP <ffff812029e6dd28>
CR2: 00000000000000c0
 <0>Kernel panic - not syncing: Fatal exception

Expected results:

Additional info:
On 2.6.18-164.el5, the kernel does not panic.
(In reply to comment #0)
> How reproducible:
> 100%
>
> Steps to Reproduce:
> 1. yum -y install rh-tests-kernel-hotplug-cpusofthotplug.noarch
> 2. cd /mnt/tests/kernel/hotplug/cpusofthotplug
> 3. make run
panic also happens on amd-drachma-01.lab.bos.redhat.com, i386 2.6.18-164.8.1 debug kernel.

console messages:
lockdep: not fixing up alternatives.
Booting processor 9/25 eip 11000
CPU 9 irqstacks, hard=c080e000 soft=c07ee000
Leaving ESR disabled.
CPU9: AMD Engineering Sample stepping 01
lockdep: not fixing up alternatives.
Booting processor 4/20 eip 11000
CPU 4 irqstacks, hard=c0809000 soft=c07e9000
Leaving ESR disabled.
CPU4: AMD Engineering Sample stepping 01
Breaking affinity for irq 122
Breaking affinity for irq 137
Breaking affinity for irq 145
CPU 23 is now offline
BUG: unable to handle kernel NULL pointer dereference at virtual address 0000004c
 printing eip:
c040dfca
*pde = 00000000
Oops: 0002 [#1] SMP
last sysfs file: /devices/system/cpu/cpu23/online
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc ipv6 xfrm_nalgo crypto_api cpufreq_ondemand powernow_k8 dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi ac parport_pc lp parport joydev sr_mod i2c_piix4 sg ide_cd i2c_core serio_raw bnx2 cdrom pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod usb_storage ahci libata mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU: 0
EIP: 0060:[<c040dfca>] Not tainted VLI
EFLAGS: 00010206 (2.6.18-164.8.1.el5debug #1)
EIP is at cacheinfo_cpu_callback+0x37c/0x3d7
eax: 0000003c ebx: 00000017 ecx: 00000014 edx: 00000014
esi: f7b53f04 edi: 0000003c ebp: 0000012c esp: f7bf4edc
ds: 007b es: 007b ss: 0068
Process runtest.sh (pid: 9365, ti=f7bf4000 task=f740ad20 task.ti=f7bf4000)
Stack: c05c967b c0826fdc f740ad20 c05c967b 00000003 00000007 c043b3cd c06a1e34
 c069ba20 00000017 00000007 c0628b5c 00000017 00000000 f7bcd1a0 f7bcd1a0
 c0440c65 ffffffff ff7fffff 00000017 fffffff0 f7f37000 00000002 c0440d21
Call Trace:
 [<c05c967b>] dev_cpu_callback+0x73/0xaa
 [<c05c967b>] dev_cpu_callback+0x73/0xaa
 [<c043b3cd>] trace_hardirqs_on+0xf8/0x118
 [<c0628b5c>] notifier_call_chain+0x19/0x29
 [<c0440c65>] _cpu_down+0x135/0x1cc
 [<c0440d21>] cpu_down+0x25/0x36
 [<c0566aba>] store_online+0x24/0x56
 [<c0566a96>] store_online+0x0/0x56
 [<c0563d8a>] sysdev_store+0x1e/0x22
 [<c04b57d9>] sysfs_write_file+0xa3/0xcd
 [<c04b5736>] sysfs_write_file+0x0/0xcd
 [<c047d45b>] vfs_write+0xa1/0x143
 [<c047da5b>] sys_write+0x3c/0x63
 [<c0404f93>] syscall_call+0x7/0xb
 =======================
Code: ed 31 ff 8b 1a c7 44 24 10 00 00 00 00 eb 4a 89 fe 03 34 9d e0 74 82 c0 8d 46 10 e8 e9 64 0e 00 eb 18 89 f8 03 04 8d e0 74 82 c0 <f0> 0f b3 58 10 8d 56 10 89 c8 e8 e9 64 0e 00 83 f8 1f 89 c1 7e
EIP: [<c040dfca>] cacheinfo_cpu_callback+0x37c/0x3d7 SS:ESP 0068:f7bf4edc
 <0>Kernel panic - not syncing: Fatal exception
2.6.18-164.el5PAE kernel finished the test smoothly.
This is reproducible with experimental.10 and the 164.1.7.el5 z-stream. P.
This same panic happens on -164.el5 (RHEL5.4 base kernel). Not a blocker for z-stream. Still a bug (obviously). P.
Do the kernels you are testing have the CPU hotplug patch from BZ 526770?
(In reply to comment #7) > The kernels you are testing has the CPU hot plug patch from BZ 526770? Yes. P.
Bhavna,

I've tracked this issue down to this RHEL5 commit:

commit 8c0ce9bfb7f4053fec9cfa70322a2391c7422314
Author: Bhavna Sarathy <bnagendr>
Date: Tue Sep 22 16:11:06 2009 -0400

    [x86] fix up L3 cache information for AMD Magny-cours

    Message-id: <20090922161302.7053.39434.sendpatchset>
    Patchwork-id: 20918
    O-Subject: [RHEL5.5 PATCH 2/3] Fix up L3 cache information for AMD Magny-cou
    Bugzilla: 513684
    RH-Acked-by: Christopher Lalancette <clalance>
    RH-Acked-by: Prarit Bhargava <prarit>

    Resolves BZ 513684

    Fixup L3 cache information for AMD multi-node processors.

Looking at the code now.

P.
arch/i386/kernel/cpu/intel_cacheinfo.c, line 474

	if ((index == 3) && (c->x86_vendor == X86_VENDOR_AMD)) {
		for_each_online_cpu(i) {
			if (cpuid4_info[i] == NULL)
				continue;
			this_leaf = CPUID4_INFO_IDX(i, index);
			this_leaf->shared_cpu_map = c[i].llc_shared_map;
		}
		return;
	}

Removing this chunk fixes the panic in comment #1.

P.
Can you please add AMD confidential group to this bug?
Sent to Andreas Hermann @ AMD in private email:

-------------------------

I'm currently attempting to resolve a panic in the hotplug code which is related to your patchset posted in these links:

http://lkml.org/lkml/2009/6/3/244
http://lkml.org/lkml/2009/6/3/246
http://lkml.org/lkml/2009/6/3/247
http://lkml.org/lkml/2009/6/3/248
http://lkml.org/lkml/2009/6/3/249
http://lkml.org/lkml/2009/6/3/251

(In the code below I am referencing a RHEL5 backport of this code. Please contact our onsite AMD engineer, Bhavna (cc'd), for details on accessing this codebase.)

When doing a random walk of CPU hotplug events, I occasionally see

Unable to handle kernel NULL pointer dereference at 00000000000000c0 RIP:
 [<ffffffff80080312>] cacheinfo_cpu_callback+0x458/0x4c3
PGD 202835b067 PUD 202bd7e067 PMD 0
Oops: 0002 [1] SMP
last sysfs file: /kernel/kexec_crash_loaded
CPU 0
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables ipv6 xfrm_nalgo crypto_api cpufreq_ondemand powernow_k8 freq_table dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport shpchp ide_cd cdrom i2c_piix4 i2c_core hpilo bnx2 serio_raw pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod qla2xxx scsi_transport_fc cciss sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 29539, comm: runtest.sh Not tainted 2.6.18-164.8.1.el5 #1
RIP: 0010:[<ffffffff80080312>] [<ffffffff80080312>] cacheinfo_cpu_callback+0x458/0x4c3
RSP: 0018:ffff812029e6dd28 EFLAGS: 00010293
RAX: 00000000000000a8 RBX: 000000000000000b RCX: 0000000000000000
RDX: 0000000000000023 RSI: 00000000000000ff RDI: ffff81042f9c9c40
RBP: ffff81042f9c9c28 R08: 000000000000001c R09: 0000000000000024
R10: ffff81042f9c9c40 R11: ffffffff8002f3d9 R12: 00000000000000a8
R13: 0000000000000003 R14: 0000000000000150 R15: ffff81041fe578c0
FS: 00002b7a7ced2e10(0000) GS:ffffffff803c1000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000000c0 CR3: 0000002029253000 CR4: 00000000000006e0
Process runtest.sh (pid: 29539, threadinfo ffff812029e6c000, task ffff81202feff860)
Stack: ffff810422acc300 0000000000000282 ffff810422acc300 ffff81042d89d880
 ffffffff8045b408 0000000000000282 ffff81010eaeb690 ffff81010eaeb680
 ffff81010eaeb690 ffffffff803333d0 ffffffff80333320 ffff81010eaeb690
Call Trace:
 [<ffffffff8014e7f1>] kobject_cleanup+0x62/0x7e
 [<ffffffff8014e80d>] kobject_release+0x0/0x9
 [<ffffffff8014e80d>] kobject_release+0x0/0x9
 [<ffffffff8014e80d>] kobject_release+0x0/0x9
 [<ffffffff80066eaa>] notifier_call_chain+0x20/0x32
 [<ffffffff800a45bb>] _cpu_down+0x191/0x265
 [<ffffffff800a46b8>] cpu_down+0x29/0x3d
 [<ffffffff801c3eaf>] store_online+0x29/0x67
 [<ffffffff8010ae69>] sysfs_write_file+0xb9/0xe8
 [<ffffffff80016942>] vfs_write+0xce/0x174
 [<ffffffff800171fa>] sys_write+0x45/0x6e
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

Code: f0 0f b3 58 18 48 8d 75 18 89 d7 e8 00 dd 0c 00 3d fe 00 00
RIP [<ffffffff80080312>] cacheinfo_cpu_callback+0x458/0x4c3
 RSP <ffff812029e6dd28>
CR2: 00000000000000c0

I believe I have discovered what the problem is, and it relates to the following two chunks of code in RHEL5 (sorry for the cut-and-paste). FWIW, this code is very similar to upstream...

static void __cpuinit cache_shared_cpu_map_setup(unsigned int cpu, int index)
{
	struct _cpuid4_info *this_leaf, *sibling_leaf;
	unsigned long num_threads_sharing;
	int index_msb, i;
	struct cpuinfo_x86 *c = cpu_data;

	if ((index == 3) && (c->x86_vendor == X86_VENDOR_AMD)) {
		for_each_online_cpu(i) {
			if (cpuid4_info[i] == NULL)
				continue;
			this_leaf = CPUID4_INFO_IDX(i, index);
			this_leaf->shared_cpu_map = c[i].llc_shared_map;
		}
		return;
	}

and

static void __cpuinit cache_remove_shared_cpu_map(unsigned int cpu, int index)
{
	struct _cpuid4_info *this_leaf, *sibling_leaf;
	int sibling;

	this_leaf = CPUID4_INFO_IDX(cpu, index);
	for_each_cpu_mask(sibling, this_leaf->shared_cpu_map) {
		sibling_leaf = CPUID4_INFO_IDX(sibling, index);
		cpu_clear(cpu, sibling_leaf->shared_cpu_map);
	}
}

When a cpu_up (protected by a mutex_lock) is executed, cache_shared_cpu_map_setup() is called. When a cpu_down (protected by the same mutex) is executed, cache_remove_shared_cpu_map() is called.

Consider the following example with two CPUs, A & B, which are siblings.

At boot time, the global cpuinfo_x86 cpu_data structs are populated such that A and B both have the same llc_shared_map. This value, AFAICT, is only added to and never has elements removed from it. For most uses, this value is *static* after the initial system boot.

DOWN CPU A

	results in cpuid4_info[A] = NULL

DOWN CPU B

	results in cpuid4_info[B] = NULL

UP CPU B

	sets cpuid4_info[B] not NULL

	and

	this_leaf = CPUID4_INFO_IDX(i, index);
	this_leaf->shared_cpu_map = c[i].llc_shared_map

	for all cpus EXCEPT A (because cpuid4_info[A] == NULL)

DOWN CPU B

	in cache_remove_shared_cpu_map()

	this_leaf = CPUID4_INFO_IDX(cpu, index);
	for_each_cpu_mask(sibling, this_leaf->shared_cpu_map) {
	^^^ this_leaf->shared_cpu_map = the static llc_shared_map, includes CPU A.
		sibling_leaf = CPUID4_INFO_IDX(sibling, index);
		^^^^ when we get to CPU A, sibling_leaf = NULL
		cpu_clear(cpu, sibling_leaf->shared_cpu_map);
		^^^^^ NULL pointer dereference.
	}

I'm not entirely sure what the correct fix is here. I'm somewhat confused about the loop in cache_shared_cpu_map_setup() -- I'm not sure why the code has to examine EVERY CPU in the system -- shouldn't it only look at the cpus in c[cpu].llc_shared_map?

I have attached a patch that seems to resolve the problem (and addresses my concern about the loop in cache_shared_cpu_map_setup()).

Andreas, do you have any other suggestions on a fix here? I'm obviously more than willing to defer to your opinion.

-------------------

Additionally, I have been running this patch for a few hours without any panics.

P.
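The DOWN A / DOWN B / UP B / DOWN B sequence from the email can be replayed in a small userspace model. This is only a sketch under stated assumptions: struct leaf, model_boot(), model_cpu_up(), model_cpu_down() and replay_sequence() are hypothetical names, the two arrays are toy stand-ins for cpu_data[i].llc_shared_map and cpuid4_info[i], and the "filtered" variant is one possible reading of the question about the setup loop, not the attached patch.

```c
#include <assert.h>
#include <stdlib.h>

#define NCPUS 2
#define CPU_A 0
#define CPU_B 1
#define BIT(n) (1UL << (n))

/* Toy stand-in for struct _cpuid4_info: only the L3 sharing mask. */
struct leaf {
    unsigned long shared_cpu_map;
};

/* Toy stand-ins for cpu_data[i].llc_shared_map and cpuid4_info[i]. */
static unsigned long llc_shared_map[NCPUS];
static struct leaf *cpuid4_info[NCPUS];

/* Boot: A and B are siblings; llc_shared_map is never shrunk afterwards. */
static void model_boot(void)
{
    for (int i = 0; i < NCPUS; i++) {
        llc_shared_map[i] = BIT(CPU_A) | BIT(CPU_B);
        free(cpuid4_info[i]);
        cpuid4_info[i] = calloc(1, sizeof(struct leaf));
        assert(cpuid4_info[i] != NULL);
        cpuid4_info[i]->shared_cpu_map = llc_shared_map[i];
    }
}

/*
 * cpu_up. filtered == 0 mirrors the quoted AMD L3 branch: every CPU that
 * has a leaf gets the full static mask. filtered == 1 is a hypothetical
 * alternative that links only siblings that currently have a leaf.
 */
static void model_cpu_up(int cpu, int filtered)
{
    assert(cpu >= 0 && cpu < NCPUS);
    if (cpuid4_info[cpu] == NULL)
        cpuid4_info[cpu] = calloc(1, sizeof(struct leaf));
    assert(cpuid4_info[cpu] != NULL);

    for (int i = 0; i < NCPUS; i++) {
        if (cpuid4_info[i] == NULL)
            continue;
        if (!filtered) {
            cpuid4_info[i]->shared_cpu_map = llc_shared_map[i];
        } else if (llc_shared_map[cpu] & BIT(i)) {
            cpuid4_info[cpu]->shared_cpu_map |= BIT(i);
            cpuid4_info[i]->shared_cpu_map |= BIT(cpu);
        }
    }
}

/*
 * cpu_down, mirroring cache_remove_shared_cpu_map(). Instead of
 * dereferencing a NULL sibling leaf (the oops), count it and move on.
 */
static int model_cpu_down(int cpu)
{
    int null_siblings = 0;
    struct leaf *this_leaf;

    assert(cpu >= 0 && cpu < NCPUS);
    this_leaf = cpuid4_info[cpu];
    if (this_leaf == NULL)
        return 0;

    for (int sibling = 0; sibling < NCPUS; sibling++) {
        if (!(this_leaf->shared_cpu_map & BIT(sibling)))
            continue;
        if (cpuid4_info[sibling] == NULL) {
            null_siblings++;    /* the kernel would oops here */
            continue;
        }
        cpuid4_info[sibling]->shared_cpu_map &= ~BIT(cpu);
    }
    free(this_leaf);
    cpuid4_info[cpu] = NULL;
    return null_siblings;
}

/* Replay DOWN A, DOWN B, UP B, DOWN B; return the NULL siblings reached. */
static int replay_sequence(int filtered)
{
    int hits = 0;

    model_boot();
    hits += model_cpu_down(CPU_A);
    hits += model_cpu_down(CPU_B);
    model_cpu_up(CPU_B, filtered);
    hits += model_cpu_down(CPU_B);
    return hits;
}
```

In this model replay_sequence(0) returns 1: the final DOWN of B walks the restored static mask and reaches CPU A, whose leaf is NULL. replay_sequence(1) returns 0, because B's mask is rebuilt only from siblings that still have a leaf.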
Created attachment 376107 [details] Initial patch
NOTE: THIS COMMENT MAY LEAD TO ANOTHER BZ BEING OPENED. I AM NOT SURE IF THE TWO ISSUES ARE RELATED. AS I SAID PREVIOUSLY, I'M SEEING THIS PANIC AS WELL AS THE PANIC IN COMMENT #1. THE PANIC BELOW, HOWEVER, SEEMS TO ALSO BE REPRODUCIBLE WITH 164.EL5.

So the first panic, in comment #1, seems to be resolved. I'm now left with

at 0000000000000024 RIP:
 [<ffffffff8843845f>] :powernow_k8:powernowk8_get+0x109/0x152
PGD 0
Oops: 0000 [1] SMP
last sysfs file: /class/cpuid/cpu10/dev
CPU 18
Modules linked in: autofs4(U) hidp(U) nfs(U) fscache(U) nfs_acl(U) rfcomm(U) l2cap(U) bluetooth(U) lockd(U) sunrpc(U) ip_conntrack_netbios_ns(U) ipt_REJECT(U) xt_state(U) ip_conntrack(U) nfnetlink(U) iptable_filter(U) ip_tables(U) ip6t_REJECT(U) xt_tcpudp(U) ip6table_filter(U) ip6_tables(U) x_tables(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) cpufreq_ondemand(U) powernow_k8(U) freq_table(U) dm_multipath(U) scsi_dh(U) video(U) hwmon(U) backlight(U) sbs(U) i2c_ec(U) dell_wmi(U) wmi(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) ide_cd(U) i2c_piix4(U) cdrom(U) bnx2(U) shpchp(U) i2c_core(U) amd64_edac_mod(U) hpilo(U) serio_raw(U) pcspkr(U) edac_mc(U) dm_raid45(U) dm_message(U) dm_region_hash(U) dm_mem_cache(U) dm_snapshot(U) dm_zero(U) dm_mirror(U) dm_log(U) dm_mod(U) qla2xxx(U) scsi_transport_fc(U) cciss(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Pid: 146, comm: events/0 Tainted: G 2.6.18.4 #15
RIP: 0010:[<ffffffff8843845f>] [<ffffffff8843845f>] :powernow_k8:powernowk8_get+0x109/0x152
RSP: 0018:ffff8103fed3fd50 EFLAGS: 00010202
RAX: 0000000000000020 RBX: ffff8103ffb84cc0 RCX: 00000000c0010063
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8103ffb84cc0
RBP: 0000000000000012 R08: ffff8103fed3e000 R09: 0000000000000001
R10: 00002b04acc08880 R11: ffff8103ffc268e0 R12: 0000000000000000
R13: 0000000000000282 R14: 0000000000000000 R15: ffffffff8006db05
FS: 00002b04ace17710(0000) GS:ffff8103ffd0a2c0(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000024 CR3: 0000000000201000 CR4: 00000000000006a0
Process events/0 (pid: 146, threadinfo ffff8103fed3e000, task ffff8103fed2a7a0)
Stack: ffffffffffffffff 0000000000000000 0000000000000000 ffff8103fed2a7a0
 0000000000040000 0000000000000000 0000000000000000 0000000000000000
 0000000000000001 0000000000000000 0000000000000000 0000000000000000
Call Trace:
 [<ffffffff8021d96d>] __cpufreq_get+0x24/0x5e
 [<ffffffff8021e64d>] cpufreq_get+0x28/0x42
 [<ffffffff8006db1b>] handle_cpufreq_delayed_get+0x16/0x39
 [<ffffffff8004d9db>] run_workqueue+0x94/0xe4
 [<ffffffff8004a246>] worker_thread+0x0/0x122
 [<ffffffff8004a336>] worker_thread+0xf0/0x122
 [<ffffffff8008c9ef>] default_wake_function+0x0/0xe
 [<ffffffff80032afe>] kthread+0xfe/0x132
 [<ffffffff8005dfb1>] child_rip+0xa/0x11
 [<ffffffff80032a00>] kthread+0x0/0x132
 [<ffffffff8005dfa7>] child_rip+0x0/0x11

Code: 44 8b 60 04 eb 0e 69 43 2c a0 86 01 00 44 8d a0 00 35 0c 00
RIP [<ffffffff8843845f>] :powernow_k8:powernowk8_get+0x109/0x152

P.
Created attachment 376973 [details] RHEL5 fix for this issue
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
in kernel-2.6.18-178.el5 You can download this test kernel from http://people.redhat.com/dzickus/el5 Please update the appropriate value in the Verified field (cf_verified) to indicate this fix has been successfully verified. Include a comment with verification details.
Script developed by Prarit to reproduce this bug:

#!/bin/bash

function version2 ()
{
	for k in `seq 1 10`; do
		for j in `seq 1 100000`; do
			let "i=($RANDOM % 126)+1"
			if [ $i -lt 64 ]; then
				echo "OFFLINING CPU $i"
				echo 0 > /sys/devices/system/cpu/cpu$i/online
			elif [ $i -gt 64 ]; then
				let "i=$i-64"
				echo "ONLINING CPU $i"
				echo 1 > /sys/devices/system/cpu/cpu$i/online
			fi
			dmesg | grep 'kref' >& /dev/null
			if [ $? -eq 0 ]; then
				exit 1
			fi
		done
	done
}

version2
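The arithmetic in the script decides which CPUs can be taken offline and which can be brought back. The sketch below mirrors only that mapping, from a draw in 1..126 to an action and a CPU number; it does not touch sysfs. The names map_draw(), max_cpu_for() and the enum are hypothetical and exist only for this illustration.

```c
#include <assert.h>

enum action { ACT_NONE, ACT_OFFLINE, ACT_ONLINE };

struct step {
    enum action act;
    int cpu;
};

/* Map one draw, i = ($RANDOM % 126) + 1, to the action the script takes. */
static struct step map_draw(int draw)
{
    struct step s = { ACT_NONE, 0 };

    assert(draw >= 1 && draw <= 126);
    if (draw < 64) {            /* [ $i -lt 64 ] */
        s.act = ACT_OFFLINE;
        s.cpu = draw;
    } else if (draw > 64) {     /* [ $i -gt 64 ], then i = i - 64 */
        s.act = ACT_ONLINE;
        s.cpu = draw - 64;
    }
    return s;                   /* draw == 64 takes neither branch */
}

/* Highest CPU number any draw can reach with the given action. */
static int max_cpu_for(enum action act)
{
    int max = 0;

    for (int draw = 1; draw <= 126; draw++) {
        struct step s = map_draw(draw);

        if (s.act == act && s.cpu > max)
            max = s.cpu;
    }
    return max;
}
```

Under this mapping the script can offline CPUs 1 to 63 but can only online CPUs 1 to 62, a draw of 64 does nothing, and CPU 0 is never touched. On a machine that has a cpu63, that CPU stays offline once the script has offlined it.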
I believe the 5.4.z flag needs to be set for this to be included in the 15Dec09 Z-stream release.
Tested the fix. We were able to trigger the issue on a Dinar with the RC1 kernel and a script similar to the one posted in comment #30 of Red Hat bug #541953. It took only a couple of seconds to take the box down. After installing and booting into the RC2 kernel (2.6.18-164.9.1.el5) we ran the script again. The box survived 2 hours of random CPU hotplug. It's safe to assume that this fixed the problem.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2010-0178.html