Bug 654511 - Guest kernel sometimes panics on reboot on an AMD host
Summary: Guest kernel sometimes panics on reboot on an AMD host
Keywords:
Status: CLOSED DUPLICATE of bug 654532
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.1
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Gleb Natapov
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 580954
 
Reported: 2010-11-18 04:52 UTC by Suqin Huang
Modified: 2013-12-09 00:51 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-12-22 08:42:21 UTC
Target Upstream Version:
Embargoed:


Attachments
call trace in rhel6 guest (20.96 KB, image/png)
2010-11-18 04:59 UTC, Suqin Huang

Description Suqin Huang 2010-11-18 04:52:03 UTC
Description of problem:
Both rhel6-64 and rhel5-64 guest kernels sometimes panic when the guest is rebooted on an AMD host.

Version-Release number of selected component (if applicable):
qemu-kvm-0.12.1.2-2.119.el6.x86_64

How reproducible:
Sometimes (intermittent)

Steps to Reproduce:
1. Install a new guest and update it to the latest kernel.
/usr/libexec/qemu-kvm -drive file='/usr/images/RHEL-Server-5.5-64-virtio.qcow2',index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,format=qcow2,aio=native \
-device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 -device virtio-net-pci,mac=9a:63:1b:a8:15:25,netdev=idVl99xY,id=ndev00idVl99xY,bus=pci.0,addr=0x3 -netdev tap,id=idVl99xY,ifname='t0-120735-glc8',script='/usr/local/staf/test/RHEV/kvm-new/autotest/client/tests/kvm/scripts/qemu-ifup-switch',downscript='no' -m 2048 -smp 2,cores=1,threads=1,sockets=2 -cpu cpu64-rhel6,+sse2,+x2apic -spice port=8000,disable-ticketing -vga qxl -rtc base=utc,clock=host,driftfix=none -M rhel6.0.0 -usbdevice tablet -no-kvm-pit-reinjection -enable-kvm 
2. Reboot the guest.
3.
  
Actual results:


Expected results:


Additional info:
1.
2010-11-18 12:25:11: Booting processor 1/2 APIC 0x1
2010-11-18 12:25:16: Initializing CPU#1
2010-11-18 12:25:16: Stuck ??
2010-11-18 12:25:16: Inquiring remote APIC #1...
2010-11-18 12:25:16: ... APIC #1 ID: failed
2010-11-18 12:25:16: ... APIC #1 VERSION: failed
2010-11-18 12:25:16: ... APIC #1 SPIV: failed
2010-11-18 12:25:16: Unable to handle kernel paging request at ffff81000253d700 RIP:
2010-11-18 12:25:16:  [<ffff81000253d700>]
2010-11-18 12:25:16: PGD 10063 PUD 11063 PMD 80000000024001e3 PTE 0
2010-11-18 12:25:16: Oops: 0011 [1] SMP
2010-11-18 12:25:16: last sysfs file:
2010-11-18 12:25:16: CPU 0
2010-11-18 12:25:16: Modules linked in:
2010-11-18 12:25:16: Pid: 6, comm: ksoftirqd/1 Not tainted 2.6.18-232.el5 #1
2010-11-18 12:25:16: RIP: 0010:[<ffff81000253d700>]  [<ffff81000253d700>]
2010-11-18 12:25:16: RSP: 0000:ffff81007ff07ef0  EFLAGS: 00010246
2010-11-18 12:25:16: RAX: 0000000000000000 RBX: 0000000000010046 RCX: 0000000000000000
2010-11-18 12:25:16: RDX: ffff810002536420 RSI: ffff81007ff860c0 RDI: ffff81007ff860c0
2010-11-18 12:25:16: RBP: ffffffff8005dde9 R08: ffff81007ff06000 R09: 0000000000000000
2010-11-18 12:25:16: R10: ffffffff8049a420 R11: 0000000000000048 R12: ffff81007ff07ed8
2010-11-18 12:25:16: R13: 0000000000000018 R14: ffffffff802b4360 R15: 0000000000000001
2010-11-18 12:25:16: FS:  0000000000000000(0000) GS:ffffffff80424000(0000) knlGS:0000000000000000
2010-11-18 12:25:16: CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
2010-11-18 12:25:16: CR2: ffff81000253d700 CR3: 0000000000201000 CR4: 00000000000006e0
2010-11-18 12:25:17: Process ksoftirqd/1 (pid: 6, threadinfo ffff81007ff06000, task ffff81007ff860c0)
2010-11-18 12:25:17: Stack:  ffff81007ffa8100 00000000ffff8100 ffff81007ff15000 000000008005d67c
2010-11-18 12:25:17:  ffff81007ffa8100 0000000000000200 ffff81007ffa8460 0000000000000000
2010-11-18 12:25:17:  ffff81007ff91a0d 0000000000000000 0000000000000000 0000000000000000
2010-11-18 12:25:17: Call Trace:
2010-11-18 12:25:17:  [<ffffffff8006c966>] math_state_restore+0x2c/0x4c
2010-11-18 12:25:17:  [<ffffffff8005dde9>] error_exit+0x0/0x84
2010-11-18 12:25:17: 
2010-11-18 12:25:17: 
2010-11-18 12:25:17: Code: 00 60 f0 7f 00 81 ff ff 00 f0 f0 7f 00 81 ff ff 00 00 f1 7f
2010-11-18 12:25:17: RIP  [<ffff81000253d700>]
2010-11-18 12:25:17:  RSP <ffff81007ff07ef0>
2010-11-18 12:25:17: CR2: ffff81000253d700
2010-11-18 12:25:17:  <0>Kernel panic - not syncing: Fatal exception
2010-11-18 12:25:17: 
2010-11-18 12:31:06: (Process terminated with status 0)

2. 
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) KVM internal error. Suberror: 1
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) rax 0000000000000040 rbx 000000000000000a rcx 000000000000000a rdx 000000000000896a
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) rsi 0000000000000012 rdi 0000000000000008 rsp 0000000000000360 rbp 0000000000000000
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) r8  0000000000000000 r9  0000000000000000 r10 0000000000000000 r11 0000000000000000
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) r12 0000000000000000 r13 0000000000000000 r14 0000000000000000 r15 0000000000000000
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) rip 0000000000000004 rflags 00000046
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) cs 0600 (00006000/0000ffff p 1 dpl 0 db 0 s 1 type b l 0 g 0 avl 0)
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) ds 0040 (00000400/0000ffff p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) es 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) ss 9cc0 (0009cc00/0000ffff p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) fs 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) gs 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) tr 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0)
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) ldt 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0)
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) gdt 0/ffff
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) idt 0/ffff
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) cr0 10 cr2 0 cr3 0 cr4 0 cr8 0 efer 0
11/18 12:25:47 DEBUG|kvm_subpro:0700| (qemu) emulation failure, check dmesg for details

3. dmesg:
device t0-120735-glc8 entered promiscuous mode
switch: port 2(t0-120735-glc8) entering learning state
t0-120735-glc8: no IPv6 routers present
kvm: 8736: cpu0 unimplemented perfctr wrmsr: 0xc0010004 data 0x0
kvm: 8736: cpu0 unimplemented perfctr wrmsr: 0xc0010000 data 0x130076
kvm: 8736: cpu0 unimplemented perfctr wrmsr: 0xc0010004 data 0xffffffffffdcd4be
kvm: 8736: cpu0 unimplemented perfctr wrmsr: 0xc0010000 data 0x530076
kvm: 8736: cpu1 unimplemented perfctr wrmsr: 0xc0010004 data 0x0
kvm: 8736: cpu1 unimplemented perfctr wrmsr: 0xc0010000 data 0x130076
kvm: 8736: cpu1 unimplemented perfctr wrmsr: 0xc0010004 data 0xffffffffffdcd4be
kvm: 8736: cpu1 unimplemented perfctr wrmsr: 0xc0010000 data 0x530076
switch: port 2(t0-120735-glc8) entering forwarding state
kvm: 8736: cpu0 unimplemented perfctr wrmsr: 0xc0010004 data 0x0
kvm: 8736: cpu0 unimplemented perfctr wrmsr: 0xc0010000 data 0x130076
kvm: 8736: cpu0 unimplemented perfctr wrmsr: 0xc0010004 data 0xffffffffffdcd4be
kvm: 8736: cpu0 unimplemented perfctr wrmsr: 0xc0010000 data 0x530076
Ignoring delivery mode 3
Ignoring delivery mode 3
Ignoring delivery mode 3
switch: port 2(t0-120735-glc8) entering disabled state
device t0-120735-glc8 left promiscuous mode
switch: port 2(t0-120735-glc8) entering disabled state

4. host:  
2 CPUs and 4 GB of memory

processor	: 1
vendor_id	: AuthenticAMD
cpu family	: 15
model		: 107
model name	: AMD Athlon(tm) Dual Core Processor 4450B
stepping	: 2
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy 3dnowprefetch lbrv

5. guest:
both rhel5-64 and rhel6-64

6. Cannot reproduce on an Intel machine.
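
The two failure signatures quoted in items 1 and 2 above (the guest's "Kernel panic - not syncing" line and qemu's "KVM internal error" line) can be searched for mechanically when a reboot test is repeated many times. What follows is a minimal Python sketch and not part of the original report: the function name find_failures and the idea of scanning saved log text are assumptions made for illustration, and only the two marker strings are taken from the logs above.

```python
# Hypothetical helper (not from the report): scan captured guest-console or
# qemu-monitor output for the two failure signatures quoted in this bug.
MARKERS = (
    "Kernel panic - not syncing",  # guest serial console, item 1
    "KVM internal error",          # qemu monitor output, item 2
)


def find_failures(log_text):
    """Return (line_number, marker, line) for every line containing a marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        for marker in MARKERS:
            if marker in line:
                hits.append((number, marker, line.strip()))
    return hits


if __name__ == "__main__":
    # Sample lines copied from the logs in this report.
    sample = (
        "2010-11-18 12:25:17: CR2: ffff81000253d700\n"
        "2010-11-18 12:25:17:  <0>Kernel panic - not syncing: Fatal exception\n"
        "(qemu) KVM internal error. Suberror: 1\n"
    )
    for number, marker, line in find_failures(sample):
        print(f"line {number}: {marker!r} in: {line}")
```

An empty result only means that neither marker appeared in the captured text; it says nothing about failures that leave a different trace.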

Comment 1 Suqin Huang 2010-11-18 04:59:13 UTC
Created attachment 461209
call trace in rhel6 guest

Comment 3 Gleb Natapov 2010-11-22 13:56:50 UTC
What is the host kernel version? Can you attach the log output instead of pasting it here (it gets garbled)? Can you reproduce this on an AMD host with NPT?

Comment 4 Suqin Huang 2010-11-23 05:54:00 UTC
host kernel: 2.6.32-83.el6.x86_64

npt is enabled when this happens

[root@amd-4450b-4-1 parameters]# cat npt 
1

Comment 5 Gleb Natapov 2010-11-23 06:46:54 UTC
(In reply to comment #4)
> host kernel: 2.6.32-83.el6.x86_64
> 
> npt is enabled when this happens
> 
> [root@amd-4450b-4-1 parameters]# cat npt 
> 1

Your cpuinfo above shows that the CPU does not support NPT. Try on a CPU where npt is present in the CPU flags.
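
As this exchange shows, the npt module parameter reading 1 and the CPU actually listing npt among its flags are two separate checks. The flag check can be sketched in Python as below. This is an illustration only and not something posted in the thread: the helper names cpu_flags and has_npt are hypothetical, and the sample flag strings are shortened excerpts of the cpuinfo dumps in this bug.

```python
# Hypothetical helper (not from the thread): read the "flags" line of
# /proc/cpuinfo-style text and test for the npt (nested page tables) flag.
def cpu_flags(cpuinfo_text):
    """Return the flag set of the first processor entry, or an empty set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_npt(cpuinfo_text):
    """True if 'npt' appears as a whole flag (not as part of another flag)."""
    return "npt" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    # Shortened excerpts of the flag lines quoted in this bug.
    athlon_4450b = "flags\t\t: svm extapic cr8_legacy 3dnowprefetch lbrv"
    phenom_9600b = "flags\t\t: svm extapic cr8_legacy osvw ibs npt lbrv svm_lock"
    print("Athlon 4450B has npt:", has_npt(athlon_4450b))
    print("Phenom 9600B has npt:", has_npt(phenom_9600b))
```

On a live host the same function would be fed the contents of /proc/cpuinfo. Splitting the line into whole flags matters because a substring search would also need to be checked against names that merely resemble the flag.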

Comment 6 Suqin Huang 2010-11-23 08:45:58 UTC
Repeated 40 times; cannot reproduce on the 9600B.


processor	: 3
vendor_id	: AuthenticAMD
cpu family	: 16
model		: 2
model name	: AMD Phenom(tm) 9600B Quad-Core Processor
stepping	: 3
cpu MHz		: 1150.000
cache size	: 512 KB
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 3
initial apicid	: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 5
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid p

Comment 7 Suqin Huang 2010-11-23 08:53:28 UTC
processor	: 3
vendor_id	: AuthenticAMD
cpu family	: 16
model		: 2
model name	: AMD Phenom(tm) 9600B Quad-Core Processor
stepping	: 3
cpu MHz		: 1150.000
cache size	: 512 KB
physical id	: 0
siblings	: 4
core id		: 3
cpu cores	: 4
apicid		: 3
initial apicid	: 3
fpu		: yes
fpu_exception	: yes
cpuid level	: 5
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs npt lbrv svm_lock

Comment 8 Suqin Huang 2010-11-23 09:18:51 UTC
1. 
(qemu) KVM internal error. Suberror: 1
(qemu) rax 0000000000000040 rbx 000000000000000a rcx 000000000000000a rdx 000000000000896a
(qemu) rsi 0000000000000012 rdi 0000000000000008 rsp 0000000000000360 rbp 0000000000000000
(qemu) r8  0000000000000000 r9  0000000000000000 r10 0000000000000000 r11 0000000000000000
(qemu) r12 0000000000000000 r13 0000000000000000 r14 0000000000000000 r15 0000000000000000
(qemu) rip 0000000000000004 rflags 00000046
(qemu) cs 0600 (00006000/0000ffff p 1 dpl 0 db 0 s 1 type b l 0 g 0 avl 0)
(qemu) ds 0040 (00000400/0000ffff p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
(qemu) es 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
(qemu) ss 9cc0 (0009cc00/0000ffff p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
(qemu) fs 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
(qemu) gs 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 1 type 3 l 0 g 0 avl 0)
(qemu) tr 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 0 type b l 0 g 0 avl 0)
(qemu) ldt 0000 (00000000/0000ffff p 1 dpl 0 db 0 s 0 type 2 l 0 g 0 avl 0)
(qemu) gdt 0/ffff
(qemu) idt 0/ffff
(qemu) cr0 10 cr2 0 cr3 0 cr4 0 cr8 0 efer 0
(qemu) emulation failure, check dmesg for details

2. guest kernel
rhel5: 2.6.18-232.el5
rhel6: 2.6.32-71.el6.x86_64

Comment 9 Gleb Natapov 2010-12-21 06:54:18 UTC
Please retest with kernel-2.6.32-93.el6.

Comment 10 Suqin Huang 2010-12-22 08:24:43 UTC
repeat 100 times, can reproduce 
host:
2.6.32-93.el6.x86_64 && qemu-kvm-0.12.1.2-2.127.el6.x86_64  
guest:
rhel6: 2.6.32-71.el6.x86_64

Comment 11 Suqin Huang 2010-12-22 08:25:22 UTC
(In reply to comment #10)
> repeat 100 times, can reproduce 
I made a mistake: it cannot be reproduced.

> host:
> 2.6.32-93.el6.x86_64 && qemu-kvm-0.12.1.2-2.127.el6.x86_64  
> guest:
> rhel6: 2.6.32-71.el6.x86_64
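
A run of clean reboots, such as the 40 and 100 repetitions reported in this thread, bounds the failure rate but does not prove it is zero. The Python sketch below works through that arithmetic. It is an illustration added here, not something the reporter computed: the report never states a per-reboot failure probability, and both the 95% confidence level and the assumption that reboots fail independently are choices made for the example.

```python
# Illustrative arithmetic only; assumes independent reboots, which the report
# does not establish.
def prob_all_clean(p_fail, runs):
    """Probability of seeing zero failures in `runs` reboots."""
    return (1.0 - p_fail) ** runs


def max_rate_consistent_with_clean_runs(runs, confidence=0.95):
    """Largest per-reboot failure rate not ruled out by `runs` clean reboots."""
    return 1.0 - (1.0 - confidence) ** (1.0 / runs)


if __name__ == "__main__":
    for runs in (40, 100):
        bound = max_rate_consistent_with_clean_runs(runs)
        print(f"{runs} clean reboots: failure rate below {bound:.3f} "
              f"at 95% confidence")
```

The second function simply solves (1 - p)^n = 1 - confidence for p, so more clean runs tighten the bound but never bring it to zero.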

Comment 12 Gleb Natapov 2010-12-22 08:42:21 UTC

*** This bug has been marked as a duplicate of bug 654532 ***

