Bug 1377920

Summary: Guest fails reboot and causes kernel-panic
Product: Red Hat Enterprise Linux 7 Reporter: Pei Zhang <pezhang>
Component: qemu-kvm-rhevAssignee: Dr. David Alan Gilbert <dgilbert>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 7.3CC: chayang, coli, crose, dgilbert, hhuang, juzhang, michen, mrezanin, ngu, pbonzini, pezhang, prarit, qzhang, virt-maint, xfu, yduan
Target Milestone: rcKeywords: Regression
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: qemu-kvm-rhev-2.6.0-27.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-07 21:36:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Pei Zhang 2016-09-21 03:06:15 UTC
Description of problem:
Reboot guest will cause guest kernel-panic.

Version-Release number of selected component (if applicable):
Host:
3.10.0-508.rt56.413.el7.x86_64
qemu-kvm-rhev-2.6.0-26.el7.x86_64

Guest:
3.10.0-508.rt56.413.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot guest
# /usr/libexec/qemu-kvm -name rhel7.3 \
-cpu IvyBridge -m 4G \
-smp 8,sockets=1,cores=8,threads=1 \
-drive file=/home/pezhang/rhel7.3.qcow2,format=qcow2,if=none,id=drive-virtio-blk0,werror=stop,rerror=stop \
-device virtio-blk-pci,drive=drive-virtio-blk0,id=virtio-blk0 \
-monitor stdio \
-serial unix:/tmp/monitor,server,nowait \

2. Reboot guest, fail and kernel panic.
Rebooting.
[    0.000000] console [ttyS0] enabled
[    0.000000] allocated 67108864 bytes of page_cgroup
[    0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
[    0.000000] tsc: Detected 2299.996 MHz processor
[    0.208161] Calibrating delay loop (skipped) preset value.. 4599.99 BogoMIPS (lpj=2299996)
[    0.208162] pid_max: default: 32768 minimum: 301
[    0.208197] Security Framework initialized
[    0.208205] SELinux:  Initializing.
[    0.208224] SELinux:  Starting in permissive mode
[    0.208929] Dentry cache hash table entries: 524288 (order: 11, 8388608 bytes)
[    0.212585] Inode-cache hash table entries: 262144 (order: 9, 2097152 bytes)
[    0.213525] Mount-cache hash table entries: 4096
[    0.213802] Initializing cgroup subsys memory
[    0.213810] Initializing cgroup subsys devices
[    0.213811] Initializing cgroup subsys freezer
[    0.213812] Initializing cgroup subsys net_cls
[    0.213813] Initializing cgroup subsys blkio
[    0.213814] Initializing cgroup subsys perf_event
[    0.213815] Initializing cgroup subsys hugetlb
[    0.213816] Initializing cgroup subsys pids
[    0.213817] Initializing cgroup subsys net_prio
[    0.213922] CPU: Physical Processor ID: 0
[    0.213923] CPU: Processor Core ID: 0
[    0.214699] mce: CPU supports 10 MCE banks
[    0.214738] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
[    0.214738] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0
[    0.214739] tlb_flushall_shift: 2
[    0.215006] Freeing SMP alternatives: 24k freed
[    0.217266] ACPI: Core revision 20130517
[    0.218067] ACPI: All ACPI Tables successfully acquired
[    0.218098] ftrace: allocating 25317 entries in 99 pages
[    0.245401] smpboot: Max logical packages: 1
[    0.245402] smpboot: APIC(0) Converting physical 0 to logical package 0
[    0.245594] Enabling x2apic
[    0.245599] Enabled x2apic
[    0.245841] Switched APIC routing to physical x2apic.
[    0.246886] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[    0.246889] smpboot: CPU0: Intel Xeon E3-12xx v2 (Ivy Bridge) (fam: 06, model: 3a, stepping: 09)
[    0.246898] TSC deadline timer enabled
[    0.246909] Performance Events: unsupported p6 CPU model 58 no PMU driver, software events only.
[    0.247249] KVM setup paravirtual spinlock
[    0.258408] NMI watchdog: Disabling watchdog on nohz_full cores by default
[   10.541911] smpboot: Booting Node   0, Processors  #1
[   10.541911] smpboot: do_boot_cpu failed(-1) to wakeup CPU#1
[   20.824343]  #2
[   20.824344] smpboot: do_boot_cpu failed(-1) to wakeup CPU#2
[   31.106817]  #3
[   31.106818] smpboot: do_boot_cpu failed(-1) to wakeup CPU#3
[   41.389860]  #4
[   41.389860] smpboot: do_boot_cpu failed(-1) to wakeup CPU#4
[   51.672115]  #5
[   51.672115] smpboot: do_boot_cpu failed(-1) to wakeup CPU#5
[   61.954779]  #6
[   61.954779] smpboot: do_boot_cpu failed(-1) to wakeup CPU#6
[   61.958918]  #7 OK
[   72.237682] smpboot: do_boot_cpu failed(-1) to wakeup CPU#7
[   72.237704] Brought up 1 CPUs
[   72.237705] smpboot: Total of 1 processors activated (4599.99 BogoMIPS)
[   72.238064] ------------[ cut here ]------------
[   72.238065] kernel BUG at kernel/sched/core.c:8400!
[   72.238067] invalid opcode: 0000 [#1] PREEMPT SMP 
[   72.238067] Modules linked in:
[   72.238069] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-508.rt56.413.el7.x86_64 #1
[   72.238070] Hardware name: Red Hat KVM, BIOS 1.9.1-5.el7 04/01/2014
[   72.238071] task: ffff880136700000 ti: ffff880136708000 task.ti: ffff880136708000
[   72.238074] RIP: 0010:[<ffffffff81b58797>]  [<ffffffff81b58797>] sched_init_smp+0x4ac/0x4e3
[   72.238075] RSP: 0000:ffff88013670bbc0  EFLAGS: 00010282
[   72.238075] RAX: 00000000ffffffea RBX: 0000000000000000 RCX: 0000000000000008
[   72.238076] RDX: 0000000000000000 RSI: 00000000013c013a RDI: 0000000000000282
[   72.238076] RBP: ffff88013670be80 R08: 0000000000000000 R09: 0000000000000000
[   72.238077] R10: ffffffff81270e73 R11: ffff88013b00da00 R12: 000000000000000a
[   72.238077] R13: 000000000000000a R14: 000000000000000a R15: 0000000000000001
[   72.238078] FS:  0000000000000000(0000) GS:ffff88013fc00000(0000) knlGS:0000000000000000
[   72.238079] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   72.238080] CR2: 00000000ffffffff CR3: 00000000019b9000 CR4: 00000000001406f0
[   72.238083] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   72.238083] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[   72.238084] Stack:
[   72.238085]  0000000000000000 0000000000000000 00000000000000e0 ffff880136708000
[   72.238086]  ffff88013670bfd8 0000000000000282 ffff88013670bd88 ffffffff8110e780
[   72.238087]  ffff88013670bc20 ffffffff816864c8 0000000000000282 ffff880136700000
[   72.238088] Call Trace:
[   72.238092]  [<ffffffff8110e780>] ? cpu_stop_park+0x40/0x40
[   72.238095]  [<ffffffff816864c8>] ? preempt_schedule.part.88+0x42/0x5a
[   72.238097]  [<ffffffff81686509>] ? preempt_schedule+0x29/0x40
[   72.238099]  [<ffffffff81687c65>] ? _raw_spin_unlock_irqrestore+0x75/0x80
[   72.238101]  [<ffffffff810ef686>] ? rt_mutex_adjust_prio+0x36/0x40
[   72.238102]  [<ffffffff816869cf>] ? rt_spin_lock_slowunlock+0x3f/0x60
[   72.238104]  [<ffffffff816876c8>] ? __rt_spin_unlock+0x28/0x30
[   72.238106]  [<ffffffff810b0258>] ? lg_global_unlock+0x48/0x60
[   72.238108]  [<ffffffff8110e72f>] ? queue_stop_cpus_work.isra.4+0x8f/0xa0
[   72.238109]  [<ffffffff8110e780>] ? cpu_stop_park+0x40/0x40
[   72.238111]  [<ffffffff8110e780>] ? cpu_stop_park+0x40/0x40
[   72.238113]  [<ffffffff8110e910>] ? __stop_cpus+0x70/0xa0
[   72.238114]  [<ffffffff8110e780>] ? cpu_stop_park+0x40/0x40
[   72.238116]  [<ffffffff8168794e>] ? _mutex_unlock+0xe/0x10
[   72.238118]  [<ffffffff8110ec9f>] ? stop_cpus+0x3f/0x50
[   72.238120]  [<ffffffff81039490>] ? mtrr_restore+0xb0/0xb0
[   72.238122]  [<ffffffff8110ee41>] ? __stop_machine+0x101/0x150
[   72.238124]  [<ffffffff81039490>] ? mtrr_restore+0xb0/0xb0
[   72.238125]  [<ffffffff8168794e>] ? _mutex_unlock+0xe/0x10
[   72.238127]  [<ffffffff8107e236>] ? put_online_cpus+0x56/0x80
[   72.238129]  [<ffffffff8110eec1>] ? stop_machine+0x31/0x40
[   72.238131]  [<ffffffff81039eb3>] ? mtrr_aps_init+0x83/0xa0
[   72.238133]  [<ffffffff81b45292>] ? native_smp_cpus_done+0x10a/0x111
[   72.238135]  [<ffffffff81b35197>] kernel_init_freeable+0x13c/0x262
[   72.238140]  [<ffffffff81671be0>] ? rest_init+0x90/0x90
[   72.238142]  [<ffffffff81671bee>] kernel_init+0xe/0xf0
[   72.238144]  [<ffffffff816900d8>] ret_from_fork+0x58/0x90
[   72.238145]  [<ffffffff81671be0>] ? rest_init+0x90/0x90
[   72.238157] Code: 81 e8 1e a5 b1 ff 48 c7 c7 e0 64 9f 81 e8 12 a5 b1 ff 48 8d b5 50 fd ff ff 65 48 8b 3c 25 c0 d6 00 00 e8 ed 4c 56 ff 85 c0 79 02 <0f> 0b e8 72 b7 56 ff e8 3d 42 57 ff e8 91 04 00 00 48 8b 45 d0 
[   72.238159] RIP  [<ffffffff81b58797>] sched_init_smp+0x4ac/0x4e3
[   72.238159]  RSP <ffff88013670bbc0>
[   72.379033] ---[ end trace 0000000000000001 ]---
[   72.379035] Kernel panic - not syncing: Fatal exception


Actual results:
Guest fails reboot.

Expected results:
Guest should reboot well.

Additional info:
1. This is a regression bug
qemu-kvm-rhev-2.6.0-18.el7.x86_64   work
qemu-kvm-rhev-2.6.0-22.el7.x86_64   work
qemu-kvm-rhev-2.6.0-24.el7.x86_64   work
qemu-kvm-rhev-2.6.0-25.el7.x86_64   work
qemu-kvm-rhev-2.6.0-25.el7.x86_64   doesn't work

Comment 1 Pei Zhang 2016-09-21 03:30:17 UTC
Additional info(continued):
1. (fix typo in above)
qemu-kvm-rhev-2.6.0-26.el7.x86_64   doesn't work

2. Also test with rhel kernel(non-rt), hit same issue
Host:
3.10.0-509.el7.x86_64
qemu-kvm-rhev-2.6.0-26.el7.x86_64

Guest:
3.10.0-509.el7.x86_64

console info
[   10.365787] smpboot: do_boot_cpu failed(-1) to wakeup CPU#1

[   20.637017] smpboot: do_boot_cpu failed(-1) to wakeup CPU#2
[   30.907909] smpboot: do_boot_cpu failed(-1) to wakeup CPU#3
[   42.063551] smpboot: do_boot_cpu failed(-1) to wakeup CPU#1
[   52.318500] smpboot: do_boot_cpu failed(-1) to wakeup CPU#2
[   62.576511] smpboot: do_boot_cpu failed(-1) to wakeup CPU#3

Comment 2 Pei Zhang 2016-09-21 07:20:10 UTC
Additional info(continued)
3. rhel6.8GA hit this regression issue, windows doesn't.

rhel6.8 GA      doesn't work
windows 7       work

Comment 6 Dr. David Alan Gilbert 2016-09-21 09:34:55 UTC
OK, I can reproduce this bug - only happens on my rhel/Xeon hosts, not on my fedora/laptop even with the same qemu.

Comment 7 Dr. David Alan Gilbert 2016-09-21 12:38:54 UTC
If I add back the kvm_put_apic in kvm_arch_put_registers it seems to work again; so I'm guessing that means that something somewhere is changing the APIC state after the point we reset and before that point and we're losing the change.

Comment 8 Dr. David Alan Gilbert 2016-09-21 17:51:01 UTC
Note if I turn on
   apic=verbose debug ignore_loglevel

in the guest, I see it's stalled at 'calibrating APIC timer...'

If I make the code in kvm_arch_put_registers:

  kvm_put_apic
like it was before it apparently works (not checked the migrate)

but if I make it:
  kvm_get_apic
  kvm_put_apic

it fails just like the put isn't there; so again it sounds like some inconsistent state that the put is restoring;  so I do:

  x86_cpu_dump_local_apic_state
  kvm_get_apic
  x86_cpu_dump_local_apic_state
  kvm_put_apic

and expect to see the state that's different that's being lost without the put or with the get/put pair - nothing, no differences in the dump.

Hmm.
I add to the dump the dump of some other fields it hasn't got and the only ones that I've found different are initial_count_load_time and count_shift; however, if I preserve those around the get, it still fails - suggesting it's something else. Hmm.

Comment 9 Dr. David Alan Gilbert 2016-09-22 12:13:18 UTC
Debugging with Paolo;  this works with a host -490 kernel, breaks with a -491

Comment 10 Paolo Bonzini 2016-09-22 12:46:28 UTC
... but it's still a QEMU bug.

Comment 16 Miroslav Rezanina 2016-09-23 13:58:39 UTC
Fix included in qemu-kvm-rhev-2.6.0-27.el7

Comment 18 Pei Zhang 2016-09-24 02:08:36 UTC
==Verification==
Versions:
Host:
3.10.0-510.el7.x86_64
qemu-kvm-rhev-2.6.0-27.el7.x86_64

Guest:
Same as Comment14.

Steps:
Same as Comment14.

Results show:
Guest          Basic(boot/reboot/shutdown)    Migration
RHEL7.3            PASS                          PASS
RHEL6.8GA          PASS                          PASS
RHEL7.2GA          PASS                          PASS
Windows7           PASS                          PASS
Windows8           PASS                          PASS
Windows8.1         PASS                          PASS
Windows10          PASS                           \


So this bug has been fixed well, thank you.

Comment 19 Pei Zhang 2016-09-24 02:26:50 UTC
==Verification(update)==
Results show:
Guest          Basic(boot/reboot/shutdown)    Migration
...
Windows10          PASS                           PASS


Note: In this testing, win10 migration works. 
I will update this testing in bug[1]. 
[1]Bug 1357765 - windows10 guest hangs after migration

Comment 20 Pei Zhang 2016-09-24 02:27:52 UTC
Set this bug as 'VERIFIED' as Comment 18 and Comment 19.

Comment 21 Amit Shah 2016-09-27 11:25:31 UTC
*** Bug 1357765 has been marked as a duplicate of this bug. ***

Comment 23 errata-xmlrpc 2016-11-07 21:36:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-2673.html