604974 – 32 bit Guest kernel panic when onlining cpu

Summary: 32 bit Guest kernel panic when onlining cpu

Keywords:
Status:	CLOSED DUPLICATE of bug 581722
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	6.0
Hardware:	All
OS:	Linux
Priority:	high
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Prarit Bhargava
QA Contact:	Red Hat Kernel QE team
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	599016

Reported:	2010-06-17 07:40 UTC by Joy Pu
Modified:	2010-07-16 18:33 UTC
CC List:	3 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2010-07-16 18:33:49 UTC
Target Upstream Version:
Embargoed:

Description Joy Pu 2010-06-17 07:40:44 UTC

Description:
Taking a CPU offline and bringing it back online in a 32-bit RHEL 6 guest causes a kernel panic. This reproduces with the 2.6.32-36 and 2.6.32-33 kernels, but not with the 2.6.32-25 kernel.

Version-Release number of selected component (if applicable):
host kernel: 2.6.32-33.el6.x86_64 
guest kernel: 2.6.32-36.el6.i686
# rpm -qa | grep qemu
qemu-kvm-0.12.1.2-2.68.el6.x86_64
qemu-kvm-debuginfo-0.12.1.2-2.68.el6.x86_64
qemu-img-0.12.1.2-2.68.el6.x86_64
gpxe-roms-qemu-0.9.7-6.3.el6.noarch
qemu-kvm-tools-0.12.1.2-2.68.el6.x86_64

How reproducible:
always

Steps to Reproduce:
1. Boot an SMP RHEL-6.0-32 guest.
2. Listen on the guest's serial console with nc:
# nc -U /tmp/serial-20100617-130306-gcxW
3. Take cpu1 offline and bring it back online with echo:
# echo 0 > /sys/devices/system/cpu/cpu1/online
# echo 1 > /sys/devices/system/cpu/cpu1/online
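
The two echo commands in step 3 can be wrapped in a small helper for repeated runs. This is a minimal sketch, not part of the original report: the function name `cycle_cpu` and the `SYSFS_CPU_ROOT` override are hypothetical names added for illustration, and the script assumes it runs as root inside the guest.

```shell
#!/bin/sh
# Sketch only: cycle_cpu and SYSFS_CPU_ROOT are hypothetical names, not from the report.
# SYSFS_CPU_ROOT defaults to the sysfs path used in step 3.
SYSFS_CPU_ROOT="${SYSFS_CPU_ROOT:-/sys/devices/system/cpu}"

# Take the given CPU offline, then bring it back online, printing the state after each write.
cycle_cpu() {
    cpu="$1"
    ctl="$SYSFS_CPU_ROOT/cpu$cpu/online"
    if [ ! -w "$ctl" ]; then
        echo "cannot write $ctl (missing file or insufficient privileges)" >&2
        return 1
    fi
    echo 0 > "$ctl" || return 1
    echo "cpu$cpu offline: $(cat "$ctl")"
    echo 1 > "$ctl" || return 1
    echo "cpu$cpu online: $(cat "$ctl")"
}
```

Calling `cycle_cpu 1` as root in the guest performs the same offline/online sequence as step 3. On an affected kernel the panic is expected on the second write, so the final status line may never be printed.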

Actual results:
The guest kernel panics when cpu1 is brought back online.

Expected results:
The guest takes cpu1 offline and brings it back online successfully.

Additional info:
1. The command line:
#  /root/work/autotest/client/tests/kvm/qemu -name vm1 -monitor tcp:0:6001,server,nowait -drive file=/root/work/autotest/client/tests/kvm/images/RHEL-Server-6.0-32-virtio.qcow2,if=virtio,cache=none,boot=on,aio=native -net nic,vlan=0,model=virtio,macaddr=02:30:0D:20:0b:95 -net tap,vlan=0,ifname=virtio_0_6001,script=/root/work/autotest/client/tests/kvm/scripts/qemu-ifup-switch,downscript=no,vhost=on -m 4096 -smp 2 -soundhw ac97 -redir tcp:5000::22 -vnc :0 -spice port=8000,disable-ticketing -usbdevice tablet -rtc-td-hack -cpu qemu64,+sse2 -no-kvm-pit-reinjection -serial unix:/tmp/serial-20100617-130306-gcxW,server,nowait -no-hpet

2. Host cpuinfo
model           : 2
model name      : AMD Phenom(tm) 8750 Triple-Core Processor
stepping        : 3
cpu MHz         : 1200.000
cache size      : 512 KB
physical id     : 0
siblings        : 3
core id         : 2
cpu cores       : 3
apicid          : 2
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat
pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm
3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16
popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse
3dnowprefetch osvw ibs
bogomips        : 4809.90
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

3. Kernel panic info
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu1/online
Modules linked in: autofs4(U) sunrpc(U) ip6t_REJECT(U) nf_conntrack_ipv6(U) ip6table_filter(U) ip6_tables(U) ipv6(U) dm_mirror(U) dm_region_hash(U) dm_log(U) snd_intel8x0(U) snd_ac97_codec(U) ac97_bus(U) snd_seq(U) snd_seq_device(U) snd_pcm(U) ppdev(U) i2c_piix4(U) snd_timer(U) parport_pc(U) i2c_core(U) parport(U) snd(U) soundcore(U) snd_page_alloc(U) sg(U) ext4(U) mbcache(U) jbd2(U) sr_mod(U) cdrom(U) ata_generic(U) pata_acpi(U) virtio_blk(U) virtio_net(U) virtio_pci(U) virtio_ring(U) virtio(U) ata_piix(U) dm_mod(U) [last unloaded: scsi_wait_scan]
Pid: 1652, comm: bash Tainted: G S      W  (2.6.32-36.el6.i686 #1) Bochs
EIP: 0060:[<c0443dc1>] EFLAGS: 00210046 CPU: 0
EIP is at scheduler_tick+0xe1/0x240
EAX: 00000000 EBX: c0825b80 ECX: f4500000 EDX: c1e09080
ESI: c1e09080 EDI: 000095a6 EBP: 000009fa ESP: f4501d30
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process bash (pid: 1652, ti=f4500000 task=f45d5a90 task.ti=f4500000)
Stack:
 00000e31 00000000 f45d5a90 f45d5a90 00000000 00000000 c1e04fc0 c045f71f
<0> f4501e18 7109d7f0 0000000e c047f209 00000000 c04735ad c1e04fc0 c1e04f30
<0> c1e04f00 c047f1b0 c0473c07 00000000 3d10eb68 3d10eb68 f4501de8 33e22200
Call Trace:
 [<c045f71f>] ? update_process_times+0x3f/0x60
 [<c047f209>] ? tick_sched_timer+0x59/0xd0
 [<c04735ad>] ? __remove_hrtimer+0x2d/0xa0
 [<c047f1b0>] ? tick_sched_timer+0x0/0xd0
 [<c0473c07>] ? __run_hrtimer+0x77/0x190
 [<c0473fb9>] ? hrtimer_interrupt+0x129/0x2a0
 [<c04b1425>] ? rcu_process_callbacks+0x35/0x40
 [<c0456935>] ? __do_softirq+0xb5/0x1b0
 [<c044e396>] ? copy_process+0x796/0xff0
 [<c042599f>] ? smp_apic_timer_interrupt+0x4f/0x90
 [<c040a335>] ? apic_timer_interrupt+0x31/0x38
 [<c044e396>] ? copy_process+0x796/0xff0
 [<c0819edf>] ? text_poke+0x1af/0x200
 [<c040ef26>] ? alternatives_smp_switch+0xe6/0x190
 [<c081d756>] ? _etext+0x0/0x2
 [<c081187d>] ? native_cpu_up+0x1a1/0xaa1
 [<c081227e>] ? do_fork_idle+0x0/0x17
 [<c08136bf>] ? _cpu_up+0x99/0x111
 [<c04f80a2>] ? handle_mm_fault+0x132/0x1d0
 [<c081377f>] ? cpu_up+0x48/0x57
 [<c0805af8>] ? store_online+0x58/0x80
 [<c0805aa0>] ? store_online+0x0/0x80
 [<c06a0825>] ? sysdev_store+0x25/0x40
 [<c0574b69>] ? sysfs_write_file+0x99/0x100
 [<c0574ad0>] ? sysfs_write_file+0x0/0x100
 [<c051b5a0>] ? vfs_write+0xa0/0x190
 [<c051c031>] ? sys_write+0x41/0x70
 [<c04098fb>] ? sysenter_do_call+0x12/0x28
Code: 38 04 00 00 39 d0 74 11 89 c1 29 d1 89 86 ac 04 00 00 f0 01 0d cc 99 af c0 8b 44 24 08 31 c9 8b 58 28 89 c2 89 f0 ff 53 44 89 f2 <f0> 66 c7 02 00 00 8b 44 24 08 e8 d0 ea 08 00 b8 80 70 ad c0 8b
EIP: [<c0443dc1>] scheduler_tick+0xe1/0x240 SS:ESP 0068:f4501d30

Comment 2 RHEL Program Management 2010-06-17 07:53:20 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 4 Peter Martuccelli 2010-06-25 15:23:19 UTC

This is a CPU online problem, not a hotplug issue.

Comment 6 Don Zickus 2010-07-16 18:20:32 UTC

I believe this is a duplicate of bz581722.

The hrtimers do not get shut down correctly. The patch hasn't made it into any kernel yet; otherwise I would just have you try it. The other bz has an attached patch, but I can whip up a scratch kernel with that patch too, if you want to verify it fixes your problem.

Comment 7 Prarit Bhargava 2010-07-16 18:33:49 UTC

Don, for now I'm dup'ing to 581722. If it turns out that this is not a dup, I can undup and go from there...

P.

*** This bug has been marked as a duplicate of bug 581722 ***
