Bug 1613277

Summary: kernel panic in init_amd_cacheinfo
Product: Red Hat Enterprise Linux 7
Reporter: Xiaodai Wang <xiaodwan>
Component: qemu-kvm-rhev
Assignee: Eduardo Habkost <ehabkost>
Status: CLOSED ERRATA
QA Contact: Guo, Zhiyi <zhguo>
Severity: medium
Docs Contact:
Priority: high
Version: 7.6
CC: aliang, chayang, coli, jinzhao, jiyan, juzhang, juzhou, michen, mrezanin, mtessun, mxie, mzhan, ptoscano, rjones, timao, tzheng, xiaodwan, xuwei
Target Milestone: rc
Keywords: Regression
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard: V2V
Fixed In Version: qemu-kvm-rhev-2.12.0-12.el7
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1619804
Environment:
Last Closed: 2018-11-01 11:13:16 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 910269
Attachments:
Description                   Flags
cpu info of the host          none
kernel panic log              none
qemu-system-x86_64-log.txt    none

Description Xiaodai Wang 2018-08-07 11:05:56 UTC
Description of problem:
Kernel panic while converting a guest with virt-v2v.

Version-Release number of selected component (if applicable):
kernel-3.10.0-931.el7.x86_64

How reproducible:
100% on host amd-9600b-8-1.englab.nay.redhat.com in beaker

Steps to Reproduce:
1. Run the virt-v2v command to convert any guest.
# virt-v2v  -ic vpx://root.75.182/data/10.73.72.61/?no_verify=1 -o rhev -os 10.73.194.236:/home/nfs_export -of raw -b ovirtmgmt -n ovirtmgmt esx6.0-rhel7.5-x86_64 -on esx6.0-rhel7.5-x86_64cxJ --password-file /tmp/v2v_vpx_passwd -v -x

Actual results:
2018-08-06 08:37:15,332 process          L0390 DEBUG| [stderr] [  111.079350] Kernel panic - not syncing: Fatal exception in interrupt

Expected results:
The guest should be converted successfully.

Additional info:
1. CI job logs and kernel panic backtrace log.
https://libvirt-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/v2v-RHEL-7.6-runtest-x86_64-matrix-esx6.0/lastCompletedBuild/testReport/rhev/convert_vm_to_ovirt/esx_vm_6_0_linux_7_5_arch_x86_64_raw_f_NFS_rhv/

Comment 2 Xiaodai Wang 2018-08-07 11:06:37 UTC
Created attachment 1473957 [details]
cpu info of the host

Comment 3 Richard W.M. Jones 2018-08-09 10:51:18 UTC
I cannot find the log.  Could you attach it to the bug or provide a
link to it, please?

Comment 5 Xiaodai Wang 2018-08-09 11:01:41 UTC
Created attachment 1474638 [details]
kernel panic log

Comment 6 Richard W.M. Jones 2018-08-09 11:05:15 UTC
2018-08-06 05:29:00,413 process          L0390 DEBUG| [stderr] Google, Inc.
2018-08-06 05:29:00,413 process          L0390 DEBUG| [stderr] Serial Graphics Adapter 12/29/13
2018-08-06 05:29:00,414 process          L0390 DEBUG| [stderr] SGABIOS $Id: sgabios.S 8 2010-04-22 00:03:40Z nlaredo $ (mockbuild@) Sun Dec 29 03:43:06 UTC 2013
2018-08-06 05:29:00,414 process          L0390 DEBUG| [stderr] Term: 80x24
2018-08-06 05:29:00,414 process          L0390 DEBUG| [stderr] 4 0
2018-08-06 05:29:00,417 process          L0390 DEBUG| [stderr] \x1b[2J
2018-08-06 05:29:00,417 process          L0390 DEBUG| [stderr] SeaBIOS (version 1.11.0-2.el7)
2018-08-06 05:29:00,417 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,418 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,418 process          L0390 DEBUG| [stderr] Machine UUID 868312ce-0540-4d5f-b8e1-013b3e9e9217
2018-08-06 05:29:00,418 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,458 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,459 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,459 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,459 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,459 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,459 process          L0390 DEBUG| [stderr] iPXE (http://ipxe.org) 00:05.0 C100 PCI2.10 PnP PMM+7CF94A30+7CEF4A30 C100
2018-08-06 05:29:00,460 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,460 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,460 process          L0390 DEBUG| [stderr] Press Ctrl-B to configure iPXE (PCI 00:05.0)...
2018-08-06 05:29:00,460 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,461 process          L0390 DEBUG| [stderr]                                                                                
2018-08-06 05:29:00,461 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,461 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,461 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,461 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,462 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,462 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,462 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,462 process          L0390 DEBUG| [stderr] Booting from ROM...
2018-08-06 05:29:00,463 process          L0390 DEBUG| [stderr] 
2018-08-06 05:29:00,946 process          L0390 DEBUG| [stderr] \x1b[2J[    0.000000] Initializing cgroup subsys cpuset
2018-08-06 05:29:00,946 process          L0390 DEBUG| [stderr] [    0.000000] Initializing cgroup subsys cpu
2018-08-06 05:29:00,946 process          L0390 DEBUG| [stderr] [    0.000000] Initializing cgroup subsys cpuacct
2018-08-06 05:29:00,947 process          L0390 DEBUG| [stderr] [    0.000000] Linux version 3.10.0-931.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-36) (GCC) ) #1 SMP Tue Jul 31 17:55:24 EDT 2018
2018-08-06 05:29:00,947 process          L0390 DEBUG| [stderr] [    0.000000] Command line: panic=1 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=/dev/sdb selinux=0 guestfs_verbose=1 guestfs_network=1 TERM=xterm-256color guestfs_identifier=v2v
2018-08-06 05:29:00,947 process          L0390 DEBUG| [stderr] [    0.000000] e820: BIOS-provided physical RAM map:
2018-08-06 05:29:00,947 process          L0390 DEBUG| [stderr] [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009f7ff] usable
2018-08-06 05:29:00,948 process          L0390 DEBUG| [stderr] [    0.000000] BIOS-e820: [mem 0x000000000009f800-0x000000000009ffff] reserved
2018-08-06 05:29:00,948 process          L0390 DEBUG| [stderr] [    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
2018-08-06 05:29:00,948 process          L0390 DEBUG| [stderr] [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007cffdfff] usable
2018-08-06 05:29:00,948 process          L0390 DEBUG| [stderr] [    0.000000] BIOS-e820: [mem 0x000000007cffe000-0x000000007cffffff] reserved
2018-08-06 05:29:00,949 process          L0390 DEBUG| [stderr] [    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
2018-08-06 05:29:00,949 process          L0390 DEBUG| [stderr] [    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
2018-08-06 05:29:00,949 process          L0390 DEBUG| [stderr] [    0.000000] NX (Execute Disable) protection: active
2018-08-06 05:29:00,949 process          L0390 DEBUG| [stderr] [    0.000000] SMBIOS 2.8 present.
2018-08-06 05:29:00,950 process          L0390 DEBUG| [stderr] [    0.000000] DMI: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
2018-08-06 05:29:00,950 process          L0390 DEBUG| [stderr] [    0.000000] Hypervisor detected: KVM
2018-08-06 05:29:00,960 process          L0390 DEBUG| [stderr] [    0.000000] AGP: No AGP bridge found
2018-08-06 05:29:00,960 process          L0390 DEBUG| [stderr] [    0.000000] e820: last_pfn = 0x7cffe max_arch_pfn = 0x400000000
2018-08-06 05:29:00,961 process          L0390 DEBUG| [stderr] [    0.000000] PAT configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- UC  
2018-08-06 05:29:00,961 process          L0390 DEBUG| [stderr] [    0.000000] found SMP MP-table at [mem 0x000f63b0-0x000f63bf] mapped at [ffffffffff2003b0]
2018-08-06 05:29:00,961 process          L0390 DEBUG| [stderr] [    0.000000] Using GB pages for direct mapping
2018-08-06 05:29:00,961 process          L0390 DEBUG| [stderr] [    0.000000] RAMDISK: [mem 0x7ccbf000-0x7cfeffff]
2018-08-06 05:29:00,962 process          L0390 DEBUG| [stderr] [    0.000000] Early table checksum verification disabled
2018-08-06 05:29:00,962 process          L0390 DEBUG| [stderr] [    0.000000] ACPI BIOS Error (bug): A valid RSDP was not found (20130517/tbxfroot-243)
2018-08-06 05:29:00,980 process          L0390 DEBUG| [stderr] [    0.000000] No NUMA configuration found
2018-08-06 05:29:00,981 process          L0390 DEBUG| [stderr] [    0.000000] Faking a node at [mem 0x0000000000000000-0x000000007cffdfff]
2018-08-06 05:29:00,981 process          L0390 DEBUG| [stderr] [    0.000000] NODE_DATA(0) allocated [mem 0x7cc98000-0x7ccbefff]
2018-08-06 05:29:00,981 process          L0390 DEBUG| [stderr] [    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
2018-08-06 05:29:00,981 process          L0390 DEBUG| [stderr] [    0.000000] kvm-clock: cpu 0, msr 0:7cc48001, primary cpu clock
2018-08-06 05:29:00,982 process          L0390 DEBUG| [stderr] [    0.000000] kvm-clock: using sched offset of 789991423 cycles
2018-08-06 05:29:00,982 process          L0390 DEBUG| [stderr] [    0.000000] Zone ranges:
2018-08-06 05:29:00,982 process          L0390 DEBUG| [stderr] [    0.000000]   DMA      [mem 0x00001000-0x00ffffff]
2018-08-06 05:29:00,982 process          L0390 DEBUG| [stderr] [    0.000000]   DMA32    [mem 0x01000000-0xffffffff]
2018-08-06 05:29:00,983 process          L0390 DEBUG| [stderr] [    0.000000]   Normal   empty
2018-08-06 05:29:00,983 process          L0390 DEBUG| [stderr] [    0.000000] Movable zone start for each node
2018-08-06 05:29:00,983 process          L0390 DEBUG| [stderr] [    0.000000] Early memory node ranges
2018-08-06 05:29:00,983 process          L0390 DEBUG| [stderr] [    0.000000]   node   0: [mem 0x00001000-0x0009efff]
2018-08-06 05:29:00,983 process          L0390 DEBUG| [stderr] [    0.000000]   node   0: [mem 0x00100000-0x7cffdfff]
2018-08-06 05:29:00,984 process          L0390 DEBUG| [stderr] [    0.000000] Initmem setup node 0 [mem 0x00001000-0x7cffdfff]
2018-08-06 05:29:00,984 process          L0390 DEBUG| [stderr] [    0.000000] SFI: Simple Firmware Interface v0.81 http://simplefirmware.org
2018-08-06 05:29:00,984 process          L0390 DEBUG| [stderr] [    0.000000] Intel MultiProcessor Specification v1.4
2018-08-06 05:29:01,032 process          L0390 DEBUG| [stderr] [    0.000000] MPTABLE: OEM ID: BOCHSCPU
2018-08-06 05:29:01,032 process          L0390 DEBUG| [stderr] [    0.000000] MPTABLE: Product ID: 0.1         
2018-08-06 05:29:01,032 process          L0390 DEBUG| [stderr] [    0.000000] MPTABLE: APIC at: 0xFEE00000
2018-08-06 05:29:01,033 process          L0390 DEBUG| [stderr] [    0.000000] Processor #0 (Bootup-CPU)
2018-08-06 05:29:01,033 process          L0390 DEBUG| [stderr] [    0.000000] IOAPIC[0]: apic_id 0, version 17, address 0xfec00000, GSI 0-23
2018-08-06 05:29:01,033 process          L0390 DEBUG| [stderr] [    0.000000] Processors: 1
2018-08-06 05:29:01,033 process          L0390 DEBUG| [stderr] [    0.000000] smpboot: Allowing 1 CPUs, 0 hotplug CPUs
2018-08-06 05:29:01,034 process          L0390 DEBUG| [stderr] [    0.000000] PM: Registered nosave memory: [mem 0x0009f000-0x0009ffff]
2018-08-06 05:29:01,034 process          L0390 DEBUG| [stderr] [    0.000000] PM: Registered nosave memory: [mem 0x000a0000-0x000effff]
2018-08-06 05:29:01,034 process          L0390 DEBUG| [stderr] [    0.000000] PM: Registered nosave memory: [mem 0x000f0000-0x000fffff]
2018-08-06 05:29:01,034 process          L0390 DEBUG| [stderr] [    0.000000] e820: [mem 0x7d000000-0xfeffbfff] available for PCI devices
2018-08-06 05:29:01,034 process          L0390 DEBUG| [stderr] [    0.000000] Booting paravirtualized kernel on KVM
2018-08-06 05:29:01,035 process          L0390 DEBUG| [stderr] [    0.000000] setup_percpu: NR_CPUS:5120 nr_cpumask_bits:1 nr_cpu_ids:1 nr_node_ids:1
2018-08-06 05:29:01,035 process          L0390 DEBUG| [stderr] [    0.000000] PERCPU: Embedded 37 pages/cpu @ffff8b86fca00000 s113432 r8192 d29928 u2097152
2018-08-06 05:29:01,035 process          L0390 DEBUG| [stderr] [    0.000000] KVM setup async PF for cpu 0
2018-08-06 05:29:01,035 process          L0390 DEBUG| [stderr] [    0.000000] kvm-stealtime: cpu 0, msr 7ca13500
2018-08-06 05:29:01,036 process          L0390 DEBUG| [stderr] [    0.000000] Built 1 zonelists in Node order, mobility grouping on.  Total pages: 503879
2018-08-06 05:29:01,036 process          L0390 DEBUG| [stderr] [    0.000000] Policy zone: DMA32
2018-08-06 05:29:01,036 process          L0390 DEBUG| [stderr] [    0.000000] Kernel command line: panic=1 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=/dev/sdb selinux=0 guestfs_verbose=1 guestfs_network=1 TERM=xterm-256color guestfs_identifier=v2v
2018-08-06 05:29:01,036 process          L0390 DEBUG| [stderr] [    0.000000] Disabling memory control group subsystem
2018-08-06 05:29:01,037 process          L0390 DEBUG| [stderr] [    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
2018-08-06 05:29:01,037 process          L0390 DEBUG| [stderr] [    0.000000] AGP: Checking aperture...
2018-08-06 05:29:01,037 process          L0390 DEBUG| [stderr] [    0.000000] AGP: No AGP bridge found
2018-08-06 05:29:01,037 process          L0390 DEBUG| [stderr] [    0.000000] Memory: 1991992k/2047992k available (7648k kernel code, 392k absent, 55608k reserved, 6078k data, 1872k init)
2018-08-06 05:29:01,038 process          L0390 DEBUG| [stderr] [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
2018-08-06 05:29:01,038 process          L0390 DEBUG| [stderr] [    0.000000] Hierarchical RCU implementation.
2018-08-06 05:29:01,038 process          L0390 DEBUG| [stderr] [    0.000000] \tRCU restricting CPUs from NR_CPUS=5120 to nr_cpu_ids=1.
2018-08-06 05:29:01,038 process          L0390 DEBUG| [stderr] [    0.000000] NR_IRQS:327936 nr_irqs:256 0
2018-08-06 05:29:01,039 process          L0390 DEBUG| [stderr] [    0.000000] Console: colour *CGA 80x25
2018-08-06 05:29:01,039 process          L0390 DEBUG| [stderr] [    0.000000] console [ttyS0] enabled
2018-08-06 05:29:01,039 process          L0390 DEBUG| [stderr] [    0.000000] tsc: Detected 2304.806 MHz processor
2018-08-06 05:29:01,039 process          L0390 DEBUG| [stderr] [    0.298620] Calibrating delay loop (skipped) preset value.. 4609.61 BogoMIPS (lpj=2304806)
2018-08-06 05:29:01,040 process          L0390 DEBUG| [stderr] [    0.300721] pid_max: default: 32768 minimum: 301
2018-08-06 05:29:01,040 process          L0390 DEBUG| [stderr] [    0.301905] Security Framework initialized
2018-08-06 05:29:01,040 process          L0390 DEBUG| [stderr] [    0.302943] SELinux:  Disabled at boot.
2018-08-06 05:29:01,040 process          L0390 DEBUG| [stderr] [    0.303910] Yama: becoming mindful.
2018-08-06 05:29:01,051 process          L0390 DEBUG| [stderr] [    0.305020] Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
2018-08-06 05:29:01,051 process          L0390 DEBUG| [stderr] [    0.308144] Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
2018-08-06 05:29:01,051 process          L0390 DEBUG| [stderr] [    0.310027] Mount-cache hash table entries: 4096 (order: 3, 32768 bytes)
2018-08-06 05:29:01,052 process          L0390 DEBUG| [stderr] [    0.311476] Mountpoint-cache hash table entries: 4096 (order: 3, 32768 bytes)
2018-08-06 05:29:01,052 process          L0390 DEBUG| [stderr] [    0.313147] Initializing cgroup subsys memory
2018-08-06 05:29:01,052 process          L0390 DEBUG| [stderr] [    0.314152] Initializing cgroup subsys devices
2018-08-06 05:29:01,052 process          L0390 DEBUG| [stderr] [    0.315254] Initializing cgroup subsys freezer
2018-08-06 05:29:01,053 process          L0390 DEBUG| [stderr] [    0.316369] Initializing cgroup subsys net_cls
2018-08-06 05:29:01,053 process          L0390 DEBUG| [stderr] [    0.317487] Initializing cgroup subsys blkio
2018-08-06 05:29:01,053 process          L0390 DEBUG| [stderr] [    0.318569] Initializing cgroup subsys perf_event
2018-08-06 05:29:01,053 process          L0390 DEBUG| [stderr] [    0.319744] Initializing cgroup subsys hugetlb
2018-08-06 05:29:01,054 process          L0390 DEBUG| [stderr] [    0.320849] Initializing cgroup subsys pids
2018-08-06 05:29:01,054 process          L0390 DEBUG| [stderr] [    0.321904] Initializing cgroup subsys net_prio
2018-08-06 05:29:01,280 process          L0390 DEBUG| [stderr] [    0.552256] random: fast init done
2018-08-06 05:29:14,399 process          L0390 DEBUG| [stderr] [   13.670840] random: crng init done
2018-08-06 05:30:51,744 process          L0390 DEBUG| [stderr] [  111.004247] BUG: unable to handle kernel NULL pointer dereference at 0000000000000102
2018-08-06 05:30:51,745 process          L0390 DEBUG| [stderr] [  111.006262] IP: [<ffffffff98cb69c2>] __queue_work+0x32/0x3e0
2018-08-06 05:30:51,745 process          L0390 DEBUG| [stderr] [  111.007711] PGD 0 
2018-08-06 05:30:51,745 process          L0390 DEBUG| [stderr] [  111.008285] Oops: 0000 [#1] SMP 
2018-08-06 05:30:51,746 process          L0390 DEBUG| [stderr] [  111.009163] Modules linked in:
2018-08-06 05:30:51,746 process          L0390 DEBUG| [stderr] [  111.009958] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.0-931.el7.x86_64 #1
2018-08-06 05:30:51,746 process          L0390 DEBUG| [stderr] [  111.011770] Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
2018-08-06 05:30:51,746 process          L0390 DEBUG| [stderr] [  111.013355] task: ffffffff99818480 ti: ffffffff99800000 task.ti: ffffffff99800000
2018-08-06 05:30:51,747 process          L0390 DEBUG| [stderr] [  111.015246] RIP: 0010:[<ffffffff98cb69c2>]  [<ffffffff98cb69c2>] __queue_work+0x32/0x3e0
2018-08-06 05:30:51,747 process          L0390 DEBUG| [stderr] [  111.017291] RSP: 0000:ffff8b86fca03e20  EFLAGS: 00010046
2018-08-06 05:30:51,747 process          L0390 DEBUG| [stderr] [  111.018617] RAX: 0000000000000082 RBX: 0000000000000087 RCX: 0000000000000000
2018-08-06 05:30:51,747 process          L0390 DEBUG| [stderr] [  111.020395] RDX: ffffffff998ee9a0 RSI: 0000000000000000 RDI: 0000000000001400
2018-08-06 05:30:51,752 process          L0390 DEBUG| [stderr] [  111.022167] RBP: ffff8b86fca03e58 R08: 0000000000000000 R09: 0000000000004000
2018-08-06 05:30:51,753 process          L0390 DEBUG| [stderr] [  111.023585] R10: ffffffff99e36bc8 R11: 0000000000007ffe R12: ffffffff998ee9a0
2018-08-06 05:30:51,753 process          L0390 DEBUG| [stderr] [  111.025021] R13: 0000000000001400 R14: 0000000000000000 R15: ffffffff996c1551
2018-08-06 05:30:51,753 process          L0390 DEBUG| [stderr] [  111.026682] FS:  0000000000000000(0000) GS:ffff8b86fca00000(0000) knlGS:0000000000000000
2018-08-06 05:30:51,753 process          L0390 DEBUG| [stderr] [  111.028698] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
2018-08-06 05:30:51,767 process          L0390 DEBUG| [stderr] [  111.030108] CR2: 0000000000000102 CR3: 0000000056210000 CR4: 00000000000006b0
2018-08-06 05:30:51,767 process          L0390 DEBUG| [stderr] [  111.031556] Call Trace:
2018-08-06 05:30:51,767 process          L0390 DEBUG| [stderr] [  111.032203]  <IRQ> 
2018-08-06 05:30:51,768 process          L0390 DEBUG| [stderr] [  111.032706]  [<ffffffff98cb6fc5>] queue_work_on+0x45/0x50
2018-08-06 05:30:51,768 process          L0390 DEBUG| [stderr] [  111.034135]  [<ffffffff99081a26>] credit_entropy_bits+0x1c6/0x290
2018-08-06 05:30:51,768 process          L0390 DEBUG| [stderr] [  111.035668]  [<ffffffff99082734>] ? add_interrupt_randomness+0x1c4/0x230
2018-08-06 05:30:51,768 process          L0390 DEBUG| [stderr] [  111.037347]  [<ffffffff99082734>] add_interrupt_randomness+0x1c4/0x230
2018-08-06 05:30:51,769 process          L0390 DEBUG| [stderr] [  111.038987]  [<ffffffff98d494df>] handle_irq_event_percpu+0x3f/0x80
2018-08-06 05:30:51,769 process          L0390 DEBUG| [stderr] [  111.040534]  [<ffffffff98d4955c>] handle_irq_event+0x3c/0x60
2018-08-06 05:30:51,769 process          L0390 DEBUG| [stderr] [  111.041957]  [<ffffffff98d4c663>] handle_level_irq+0x73/0xd0
2018-08-06 05:30:51,769 process          L0390 DEBUG| [stderr] [  111.043386]  [<ffffffff98c2e564>] handle_irq+0xe4/0x1a0
2018-08-06 05:30:51,799 process          L0390 DEBUG| [stderr] [  111.044693]  [<ffffffff98c9f028>] ? __local_bh_enable+0x28/0x90
2018-08-06 05:30:51,800 process          L0390 DEBUG| [stderr] [  111.045880]  [<ffffffff9937553d>] do_IRQ+0x4d/0xf0
2018-08-06 05:30:51,800 process          L0390 DEBUG| [stderr] [  111.046849]  [<ffffffff99367362>] common_interrupt+0x162/0x162
2018-08-06 05:30:51,800 process          L0390 DEBUG| [stderr] [  111.048152]  <EOI> 
2018-08-06 05:30:51,800 process          L0390 DEBUG| [stderr] [  111.048659]  [<ffffffff993674a6>] ? retint_restore_args+0x6/0x36
2018-08-06 05:30:51,801 process          L0390 DEBUG| [stderr] [  111.050241]  [<ffffffff98c6a511>] ? native_cpuid+0x11/0x20
2018-08-06 05:30:51,801 process          L0390 DEBUG| [stderr] [  111.051612]  [<ffffffff98c3c5fe>] find_num_cache_leaves.isra.0+0x6e/0xa0
2018-08-06 05:30:51,801 process          L0390 DEBUG| [stderr] [  111.053305]  [<ffffffff98c3dc39>] init_amd_cacheinfo+0x99/0xb0
2018-08-06 05:30:51,801 process          L0390 DEBUG| [stderr] [  111.054769]  [<ffffffff98c41f40>] init_amd+0xb0/0x880
2018-08-06 05:30:51,802 process          L0390 DEBUG| [stderr] [  111.056040]  [<ffffffff98c3f772>] identify_cpu+0x1c2/0x4d0
2018-08-06 05:30:51,802 process          L0390 DEBUG| [stderr] [  111.057429]  [<ffffffff99994f30>] identify_boot_cpu+0x10/0xa9
2018-08-06 05:30:51,802 process          L0390 DEBUG| [stderr] [  111.058874]  [<ffffffff99994fff>] check_bugs+0x21/0x22e
2018-08-06 05:30:51,802 process          L0390 DEBUG| [stderr] [  111.060203]  [<ffffffff99986198>] start_kernel+0x41d/0x467
2018-08-06 05:30:51,803 process          L0390 DEBUG| [stderr] [  111.061562]  [<ffffffff99985b7b>] ? repair_env_string+0x5c/0x5c
2018-08-06 05:30:51,803 process          L0390 DEBUG| [stderr] [  111.063064]  [<ffffffff99985120>] ? early_idt_handler_array+0x120/0x120
2018-08-06 05:30:51,803 process          L0390 DEBUG| [stderr] [  111.064728]  [<ffffffff9998572f>] x86_64_start_reservations+0x24/0x26
2018-08-06 05:30:51,803 process          L0390 DEBUG| [stderr] [  111.066341]  [<ffffffff99985885>] x86_64_start_kernel+0x154/0x177
2018-08-06 05:30:51,804 process          L0390 DEBUG| [stderr] [  111.067874]  [<ffffffff98c000d5>] start_cpu+0x5/0x14
2018-08-06 05:30:51,804 process          L0390 DEBUG| [stderr] [  111.069136] Code: 89 e5 41 57 41 56 49 89 f6 41 55 41 89 fd 41 54 49 89 d4 53 48 83 ec 10 89 7d d4 ff 14 25 80 40 83 99 f6 c4 02 0f 85 de 02 00 00 <41> f6 86 02 01 00 00 01 0f 85 78 02 00 00 49 c7 c7 48 7b 01 00 
2018-08-06 05:30:51,804 process          L0390 DEBUG| [stderr] [  111.075625] RIP  [<ffffffff98cb69c2>] __queue_work+0x32/0x3e0
2018-08-06 05:30:51,807 process          L0390 DEBUG| [stderr] [  111.077082]  RSP <ffff8b86fca03e20>
2018-08-06 05:30:51,807 process          L0390 DEBUG| [stderr] [  111.077782] CR2: 0000000000000102
2018-08-06 05:30:51,807 process          L0390 DEBUG| [stderr] [  111.078460] ---[ end trace b5e07c7d2de96d4f ]---
2018-08-06 05:30:51,808 process          L0390 DEBUG| [stderr] [  111.079398] Kernel panic - not syncing: Fatal exception in interrupt
2018-08-06 05:30:51,808 process          L0390 DEBUG| [stderr] [  111.080939] Rebooting in 1 seconds..libguestfs: child_cleanup: 0x1969990: child process died
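
The log above goes quiet for a long time before the oops: the last ordinary message is stamped 13.670840 and the BUG line 111.004247. The report does not say what the guest was doing in that interval; the call trace only shows that the interrupt arrived while the boot CPU was in find_num_cache_leaves. A minimal Python sketch for locating such a pause in a console log follows. The function name `largest_gap` is hypothetical, and the sample lines are copied from the log above with the harness prefix shortened.

```python
import re

# Matches the kernel's "[  seconds.micros]" printk timestamp anywhere in a line.
TS = re.compile(r"\[\s*(\d+\.\d{6})\]")

def largest_gap(lines):
    """Return (gap_seconds, before, after) for the widest pause between
    consecutive kernel timestamps, or None if fewer than two are found."""
    stamps = []
    for line in lines:
        m = TS.search(line)
        if m:
            stamps.append(float(m.group(1)))
    if len(stamps) < 2:
        return None
    return max((b - a, a, b) for a, b in zip(stamps, stamps[1:]))

# Three lines taken from the log in this comment (prefix shortened).
sample = [
    "[stderr] [    0.552256] random: fast init done",
    "[stderr] [   13.670840] random: crng init done",
    "[stderr] [  111.004247] BUG: unable to handle kernel NULL pointer dereference at 0000000000000102",
]
gap, before, after = largest_gap(sample)
print("%.3f s of silence between %.6f and %.6f" % (gap, before, after))
```

On these three lines the widest gap is the roughly 97 seconds between the crng message and the BUG line.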

Comment 7 Richard W.M. Jones 2018-08-09 11:17:00 UTC
[  110.924970] BUG: unable to handle kernel NULL pointer dereference at 0000000000000102
[  110.926766] IP: [<ffffffffb1cb69c2>] __queue_work+0x32/0x3e0
[  110.928061] PGD 0 
[  110.928545] Oops: 0000 [#1] SMP 
[  110.929353] Modules linked in:
[  110.930061] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.0-931.el7.x86_64 #1
[  110.931658] Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
[  110.933031] task: ffffffffb2818480 ti: ffffffffb2800000 task.ti: ffffffffb2800000
[  110.934703] RIP: 0010:[<ffffffffb1cb69c2>]  [<ffffffffb1cb69c2>] __queue_work+0x32/0x3e0
[  110.936518] RSP: 0000:ffff93611ee03e20  EFLAGS: 00010046
[  110.937663] RAX: 0000000000000082 RBX: 0000000000000087 RCX: 0000000000000000
[  110.939256] RDX: ffffffffb28ee9a0 RSI: 0000000000000000 RDI: 0000000000001400
[  110.940802] RBP: ffff93611ee03e58 R08: 0000000000000000 R09: 0000000000004000
[  110.942393] R10: ffffffffb2e36bc8 R11: 0000000000007ffe R12: ffffffffb28ee9a0
[  110.943969] R13: 0000000000001400 R14: 0000000000000000 R15: ffffffffb26c1551
[  110.945521] FS:  0000000000000000(0000) GS:ffff93611ee00000(0000) knlGS:0000000000000000
[  110.947336] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  110.948608] CR2: 0000000000000102 CR3: 000000001b210000 CR4: 00000000000006b0
[  110.950183] Call Trace:
[  110.950751]  <IRQ> 
[  110.951191]  [<ffffffffb1cb6fc5>] queue_work_on+0x45/0x50
[  110.952440]  [<ffffffffb2081a26>] credit_entropy_bits+0x1c6/0x290
[  110.953805]  [<ffffffffb2082734>] ? add_interrupt_randomness+0x1c4/0x230
[  110.955296]  [<ffffffffb2082734>] add_interrupt_randomness+0x1c4/0x230
[  110.956734]  [<ffffffffb1d494df>] handle_irq_event_percpu+0x3f/0x80
[  110.958134]  [<ffffffffb1d4955c>] handle_irq_event+0x3c/0x60
[  110.959393]  [<ffffffffb1d4c663>] handle_level_irq+0x73/0xd0
[  110.960649]  [<ffffffffb1c2e564>] handle_irq+0xe4/0x1a0
[  110.961820]  [<ffffffffb1c9f028>] ? __local_bh_enable+0x28/0x90
[  110.963139]  [<ffffffffb237553d>] do_IRQ+0x4d/0xf0
[  110.964210]  [<ffffffffb2367362>] common_interrupt+0x162/0x162
[  110.965506]  <EOI> 
[  110.965945]  [<ffffffffb23674a6>] ? retint_restore_args+0x6/0x36
[  110.967343]  [<ffffffffb1c6a511>] ? native_cpuid+0x11/0x20
[  110.968566]  [<ffffffffb1c3c5fe>] find_num_cache_leaves.isra.0+0x6e/0xa0
[  110.970055]  [<ffffffffb1c3dc39>] init_amd_cacheinfo+0x99/0xb0
[  110.971336]  [<ffffffffb1c41f40>] init_amd+0xb0/0x880
[  110.972468]  [<ffffffffb1c3f772>] identify_cpu+0x1c2/0x4d0
[  110.973702]  [<ffffffffb2994f30>] identify_boot_cpu+0x10/0xa9
[  110.974988]  [<ffffffffb2994fff>] check_bugs+0x21/0x22e
[  110.976162]  [<ffffffffb2986198>] start_kernel+0x41d/0x467
[  110.977395]  [<ffffffffb2985b7b>] ? repair_env_string+0x5c/0x5c
[  110.978727]  [<ffffffffb2985120>] ? early_idt_handler_array+0x120/0x120
[  110.980195]  [<ffffffffb298572f>] x86_64_start_reservations+0x24/0x26
[  110.981615]  [<ffffffffb2985885>] x86_64_start_kernel+0x154/0x177
[  110.982974]  [<ffffffffb1c000d5>] start_cpu+0x5/0x14
[  110.984076] Code: 89 e5 41 57 41 56 49 89 f6 41 55 41 89 fd 41 54 49 89 d4 53 48 83 ec 10 89 7d d4 ff 14 25 80 40 83 b2 f6 c4 02 0f 85 de 02 00 00 <41> f6 86 02 01 00 00 01 0f 85 78 02 00 00 49 c7 c7 48 7b 01 00 
[  110.990252] RIP  [<ffffffffb1cb69c2>] __queue_work+0x32/0x3e0
[  110.991564]  RSP <ffff93611ee03e20>
[  110.992378] CR2: 0000000000000102
[  110.993124] ---[ end trace b2fc48aec0ff4a88 ]---
[  110.994148] Kernel panic - not syncing: Fatal exception in interrupt
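
The trace above has two parts: the frames between <IRQ> and <EOI> ran in interrupt context, and the frames after <EOI> are the boot code that was interrupted. Splitting them makes it easier to see that the fault is in the interrupt path (__queue_work) while the interrupted code is the init_amd_cacheinfo path named in the bug summary. The sketch below is one way to do that split in Python; `split_trace` is a hypothetical helper, and the sample is an abridged excerpt of the trace above.

```python
import re

# One stack frame: "[<address>] [? ]symbol+0xoff/0xlen"
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")

def split_trace(lines):
    """Return (irq_frames, interrupted_frames) as lists of symbol names.

    Frames marked '?' by the kernel (unreliable) are skipped, and parsing
    stops at the 'Code:' line so the trailing RIP line is not counted.
    """
    irq, interrupted = [], []
    current = None
    for line in lines:
        if "Code:" in line:
            break
        if "<IRQ>" in line:
            current = irq
        elif "<EOI>" in line:
            current = interrupted
        m = FRAME.search(line)
        if m and current is not None and not m.group(1):
            current.append(m.group(2))
    return irq, interrupted

# Abridged excerpt of the trace in this comment.
sample = """\
[  110.950183] Call Trace:
[  110.950751]  <IRQ>
[  110.951191]  [<ffffffffb1cb6fc5>] queue_work_on+0x45/0x50
[  110.952440]  [<ffffffffb2081a26>] credit_entropy_bits+0x1c6/0x290
[  110.953805]  [<ffffffffb2082734>] ? add_interrupt_randomness+0x1c4/0x230
[  110.955296]  [<ffffffffb2082734>] add_interrupt_randomness+0x1c4/0x230
[  110.965506]  <EOI>
[  110.965945]  [<ffffffffb23674a6>] ? retint_restore_args+0x6/0x36
[  110.967343]  [<ffffffffb1c6a511>] ? native_cpuid+0x11/0x20
[  110.968566]  [<ffffffffb1c3c5fe>] find_num_cache_leaves.isra.0+0x6e/0xa0
[  110.970055]  [<ffffffffb1c3dc39>] init_amd_cacheinfo+0x99/0xb0
[  110.971336]  [<ffffffffb1c41f40>] init_amd+0xb0/0x880
[  110.984076] Code: (bytes omitted in this excerpt)
[  110.990252] RIP  [<ffffffffb1cb69c2>] __queue_work+0x32/0x3e0
""".splitlines()

irq_frames, interrupted_frames = split_trace(sample)
print("interrupt path:  ", " <- ".join(irq_frames))
print("interrupted code:", " <- ".join(interrupted_frames))
```

The same helper should work unchanged on the traces in comment 6 and comment 26, since only the addresses and a few offsets differ.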

Comment 8 Richard W.M. Jones 2018-08-09 12:51:34 UTC
I am bisecting this.

Comment 9 Richard W.M. Jones 2018-08-09 18:59:26 UTC
It's actually qemu-kvm-rhev, not the kernel.

FAILS: qemu-kvm-rhev-2.12.0-9.el7.x86_64

WORKS: qemu-kvm-rhev-2.10.0-21.el7_5.5.x86_64

Now I'm bisecting qemu ...

Comment 10 Richard W.M. Jones 2018-08-09 20:00:09 UTC
On the qemu-kvm-rhev rhv7/master-2.12.0 branch:

499c2933a848699a80edd44308e1c4f7497a8a66 is the first bad commit
commit 499c2933a848699a80edd44308e1c4f7497a8a66
Author: Eduardo Habkost <ehabkost>
Date:   Thu Jul 26 15:25:35 2018 +0200

    i386: Allow TOPOEXT to be enabled on older kernels
    
    RH-Author: Eduardo Habkost <ehabkost>
    Message-id: <20180726152535.4493-2-ehabkost>
    Patchwork-id: 81513
    O-Subject: [RHEL-7.6 qemu-kvm-rhev PATCH 1/1] i386: Allow TOPOEXT to be enabled on older kernels
    Bugzilla: 1608698
    RH-Acked-by: Laurent Vivier <lvivier>
    RH-Acked-by: Paolo Bonzini <pbonzini>
    RH-Acked-by: Igor Mammedov <imammedo>
    
    From: Babu Moger <babu.moger>
    
    Enabling TOPOEXT feature might cause compatibility issues if
    older kernels does not set this feature. Lets set this feature
    unconditionally.
    
    Signed-off-by: Babu Moger <babu.moger>
    Message-Id: <1528939107-17193-2-git-send-email-babu.moger>
    [ehabkost: rewrite comment and commit message]
    Signed-off-by: Eduardo Habkost <ehabkost>
    (cherry picked from commit f98bbd8304112187cafc3e636c31b2a3865d2717)
    Signed-off-by: Eduardo Habkost <ehabkost>
    
    Signed-off-by: Miroslav Rezanina <mrezanin>

:040000 040000 c663c63f012b1b1ab90ed02fb92c757f1cfa70dc 0f27545721a49224a8b31001061ec3043ae5a99e M	target
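
For readers unfamiliar with bisection, the search that produced the "first bad commit" line above is a binary search over the commit history. The Python sketch below shows the idea only; it is not how git bisect is implemented, and the commit names and the `is_bad` test are made up for illustration. In this bug the real test would be "boot the guest on the affected host and see whether it panics".

```python
def first_bad(commits, is_bad):
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the first one is known
    good, the last one is known bad, and every commit after the first bad
    one is also bad.
    """
    lo, hi = 0, len(commits) - 1   # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid               # breakage is at mid or earlier
        else:
            lo = mid               # breakage is after mid
    return commits[hi]

# Hypothetical history of ten commits where "c6" introduced the breakage.
commits = ["c%d" % i for i in range(10)]
tested = []

def is_bad(commit):
    tested.append(commit)          # record which commits had to be tested
    return int(commit[1:]) >= 6

culprit = first_bad(commits, is_bad)
print("first bad commit:", culprit, "after testing", tested)
```

Each step halves the remaining range, so even a long history needs only a handful of test boots.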

Comment 11 Richard W.M. Jones 2018-08-09 20:03:20 UTC
Adding Regression keyword since I have now verified that this
worked OK in RHEL 7.5.

Comment 13 Richard W.M. Jones 2018-08-09 20:29:46 UTC
Created attachment 1474813 [details]
qemu-system-x86_64-log.txt

Comment 14 Richard W.M. Jones 2018-08-09 20:41:48 UTC
Upstream results are the same:

f98bbd8304112187cafc3e636c31b2a3865d2717 is the first bad commit
commit f98bbd8304112187cafc3e636c31b2a3865d2717
Author: Babu Moger <babu.moger>
Date:   Wed Jun 13 21:18:22 2018 -0400

    i386: Allow TOPOEXT to be enabled on older kernels
    
    Enabling TOPOEXT feature might cause compatibility issues if
    older kernels does not set this feature. Lets set this feature
    unconditionally.
    
    Signed-off-by: Babu Moger <babu.moger>
    Message-Id: <1528939107-17193-2-git-send-email-babu.moger>
    [ehabkost: rewrite comment and commit message]
    Signed-off-by: Eduardo Habkost <ehabkost>

:040000 040000 9c16632c72b94e3edf417cbf4180b013973a28bd 16f9c6abfd2d035048d92d69fe824ca6d42b7cbe M	target

Comment 17 Richard W.M. Jones 2018-08-10 07:46:15 UTC
Eduardo posted a patch which works for me with upstream qemu:

https://lists.nongnu.org/archive/html/qemu-devel/2018-08/msg01641.html

I also tested Eduardo's scratch build of qemu-kvm-rhev
(qemu-kvm-rhev-2.12.0-9.el7.topoext.crash.fix.v1.x86_64) and
that also fixed the problem for me.

BTW a one-line command to test this on the affected AMD Phenom
machine is:

$ LIBGUESTFS_BACKEND=direct libguestfs-test-tool
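
If the check needs to run from a script, the one-liner above can be wrapped as in the following Python sketch. It assumes libguestfs-test-tool is on PATH; the helper names `appliance_panicked` and `run_test_tool` and the 600-second timeout are choices made for this sketch, not anything taken from the report. The panic marker is the string seen in the logs in this bug.

```python
import os
import subprocess

# Text printed by the guest kernel in the logs attached to this bug.
PANIC_MARKER = "Kernel panic - not syncing"

def appliance_panicked(output):
    """True if captured tool output contains a guest kernel panic line."""
    return PANIC_MARKER in output

def run_test_tool(timeout=600):
    """Run libguestfs-test-tool with LIBGUESTFS_BACKEND=direct.

    Returns (exit_status, panicked). The timeout is an arbitrary value
    chosen for this sketch.
    """
    env = dict(os.environ, LIBGUESTFS_BACKEND="direct")
    proc = subprocess.run(
        ["libguestfs-test-tool"],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # the appliance console goes to stderr
        universal_newlines=True,
        timeout=timeout,
    )
    return proc.returncode, appliance_panicked(proc.stdout)

# The marker check on its own, using a line from comment 7:
line = "[  110.994148] Kernel panic - not syncing: Fatal exception in interrupt"
print(appliance_panicked(line))
```

Calling run_test_tool() on the affected host with a broken qemu build should report a panic; with a fixed build it should not.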

Comment 20 Eduardo Habkost 2018-08-21 18:31:07 UTC
*** Bug 1614612 has been marked as a duplicate of this bug. ***

Comment 24 Miroslav Rezanina 2018-08-29 03:10:58 UTC
Fix included in qemu-kvm-rhev-2.12.0-12.el7
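
To check whether a given host package is at or past the build named above, a simplified comparison such as the following Python sketch may be enough. It compares only the numeric version and the leading release number, which is an assumption of this sketch and not RPM's real version-comparison algorithm. `build_tuple` and `carries_fix` are hypothetical names.

```python
import re

# qemu-kvm-rhev-2.12.0-12.el7, the "Fixed In Version" of this bug.
FIXED = (2, 12, 0, 12)

def build_tuple(nvr):
    """Extract (major, minor, patch, release) from a qemu-kvm-rhev NVR."""
    m = re.search(r"qemu-kvm-rhev-(\d+)\.(\d+)\.(\d+)-(\d+)", nvr)
    if not m:
        raise ValueError("not a qemu-kvm-rhev build: %r" % nvr)
    return tuple(int(g) for g in m.groups())

def carries_fix(nvr):
    """True if the build is at or after the release that includes the fix."""
    return build_tuple(nvr) >= FIXED

for nvr in (
    "qemu-kvm-rhev-2.12.0-9.el7.x86_64",    # reported as failing in comment 9
    "qemu-kvm-rhev-2.12.0-12.el7",          # build named in this comment
):
    print(nvr, carries_fix(nvr))
```

Two caveats: a False result does not mean a build is affected (2.10.0-21.el7_5.5 predates the regression and works, per comment 9), and scratch builds such as the one tested in comment 17 carry the fix without a higher release number.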

Comment 26 Guo, Zhiyi 2018-09-07 03:18:15 UTC
Reproduced this issue with qemu-kvm-rhev-2.12.0-11.el7.x86_64.

QEMU command line used:
/usr/libexec/qemu-kvm -name 75 -m 8G -machine pc,accel=kvm \
        -S \
        -cpu host,enforce \
        -smp 2,cores=2 \
        -monitor stdio \
        -qmp unix:/tmp/qmp1,server,nowait \
        -vnc :0 \
        -serial unix:/tmp/console,server,nowait \
        -uuid 115e11b2-a869-41b5-91cd-6a32a907be7e \
        -drive file=rhel7.6-20180904.qcow2,if=none,id=drive-scsi-disk0,format=qcow2,cache=none,werror=stop,rerror=stop -device ide-hd,drive=drive-scsi-disk0,id=scsi-disk0 \
        -device ich9-usb-uhci6 -device usb-tablet \
        -device qxl-vga

Booting a RHEL 7.6 guest with this QEMU command line produces the following guest kernel call trace:
[  110.720008] BUG: unable to handle kernel NULL pointer dereference at 0000000000000102
[  110.721000] IP: [<ffffffffab4b7742>] __queue_work+0x32/0x3e0
[  110.721000] PGD 0 
[  110.721000] Oops: 0000 [#1] SMP 
[  110.721000] Modules linked in:
[  110.721000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.10.0-943.el7.x86_64 #1
[  110.721000] Hardware name: Red Hat KVM, BIOS 1.11.0-2.el7 04/01/2014
[  110.721000] task: ffffffffac018480 ti: ffffffffac000000 task.ti: ffffffffac000000
[  110.721000] RIP: 0010:[<ffffffffab4b7742>]  [<ffffffffab4b7742>] __queue_work+0x32/0x3e0
[  110.721000] RSP: 0000:ffff9cd6bfc03e20  EFLAGS: 00010046
[  110.721000] RAX: 0000000000000082 RBX: 0000000000000087 RCX: 0000000000000000
[  110.721000] RDX: ffffffffac0ed4a0 RSI: 0000000000000000 RDI: 0000000000001400
[  110.721000] RBP: ffff9cd6bfc03e58 R08: 0000000000000000 R09: 0000000000004000
[  110.721000] R10: ffffffffac634c3c R11: 0000000000007ffe R12: ffffffffac0ed4a0
[  110.721000] R13: 0000000000001400 R14: 0000000000000000 R15: ffffffffabec091b
[  110.721000] FS:  0000000000000000(0000) GS:ffff9cd6bfc00000(0000) knlGS:0000000000000000
[  110.721000] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  110.721000] CR2: 0000000000000102 CR3: 000000000a610000 CR4: 00000000000006b0
[  110.721000] Call Trace:
[  110.721000]  <IRQ> 
[  110.721000]  [<ffffffffab4b7d45>] queue_work_on+0x45/0x50
[  110.721000]  [<ffffffffab884536>] credit_entropy_bits+0x1c6/0x290
[  110.721000]  [<ffffffffab885244>] ? add_interrupt_randomness+0x1c4/0x230
[  110.721000]  [<ffffffffab885244>] add_interrupt_randomness+0x1c4/0x230
[  110.721000]  [<ffffffffab54a05f>] handle_irq_event_percpu+0x3f/0x80
[  110.721000]  [<ffffffffab54a0dc>] handle_irq_event+0x3c/0x60
[  110.721000]  [<ffffffffab54d1e3>] handle_level_irq+0x73/0xd0
[  110.721000]  [<ffffffffab42e554>] handle_irq+0xe4/0x1a0
[  110.721000]  [<ffffffffab49fed8>] ? __local_bh_enable+0x28/0x90
[  110.721000]  [<ffffffffabb775dd>] do_IRQ+0x4d/0xf0
[  110.721000]  [<ffffffffabb69362>] common_interrupt+0x162/0x162
[  110.721000]  <EOI> 
[  110.721000]  [<ffffffffabb694a6>] ? retint_restore_args+0x6/0x36
[  110.721000]  [<ffffffffab46a511>] ? native_cpuid+0x11/0x20
[  110.721000]  [<ffffffffab43c50e>] find_num_cache_leaves.isra.0+0x6e/0xa0
[  110.721000]  [<ffffffffab43db49>] init_amd_cacheinfo+0x99/0xb0
[  110.721000]  [<ffffffffab4422ae>] init_amd+0x23e/0x7d0
[  110.721000]  [<ffffffffab43f662>] identify_cpu+0x1c2/0x560
[  110.721000]  [<ffffffffac193f49>] identify_boot_cpu+0x10/0xa9
[  110.721000]  [<ffffffffac194107>] check_bugs+0x21/0x2b6
[  110.721000]  [<ffffffffac18519d>] start_kernel+0x422/0x46c
[  110.721000]  [<ffffffffac184b7b>] ? repair_env_string+0x5c/0x5c
[  110.721000]  [<ffffffffac184120>] ? early_idt_handler_array+0x120/0x120
[  110.721000]  [<ffffffffac18472f>] x86_64_start_reservations+0x24/0x26
[  110.721000]  [<ffffffffac184885>] x86_64_start_kernel+0x154/0x177
[  110.721000]  [<ffffffffab4000d5>] start_cpu+0x5/0x14
[  110.721000] Code: 89 e5 41 57 41 56 49 89 f6 41 55 41 89 fd 41 54 49 89 d4 53 48 83 ec 10 89 7d d4 ff 14 25 40 42 03 ac f6 c4 02 0f 85 de 02 00 00 <41> f6 86 02 01 00 00 01 0f 85 78 02 00 00 49 c7 c7 48 7b 01 00 
[  110.721000] RIP  [<ffffffffab4b7742>] __queue_work+0x32/0x3e0
[  110.721000]  RSP <ffff9cd6bfc03e20>
[  110.721000] CR2: 0000000000000102
[  110.721000] ---[ end trace 0527cc9f758d9d62 ]---
[  110.721000] Kernel panic - not syncing: Fatal exception in interrupt

Verified the fix with qemu-kvm-rhev-2.12.0-13.el7.x86_64.

Booted the guest 10 times; the issue no longer occurs.

Comment 27 Guo, Zhiyi 2018-09-07 03:18:54 UTC
Verified per comment 26

Comment 28 errata-xmlrpc 2018-11-01 11:13:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3443