Description of problem:

This is using a SandyBridge CPU which has AVX instructions:
https://en.wikipedia.org/wiki/Advanced_Vector_Extensions

I'm booting a guest using <cpu mode="host-model"/>.

Inside the guest, when initializing an mdadm device (yes, this guest has RAID arrays inside), we see the trace attached below.

I think what is happening here:

(a) CPU flags are copied from host to guest, advertising 'avx'
(b) Guest tries to use 'avx'.
(c) KVM doesn't emulate it, so it all falls in a hole.

Perhaps libvirt should filter flags based on what KVM can actually do?

Version-Release number of selected component (if applicable):

qemu-1.2.0-16.fc18.x86_64
libvirt-0.10.2-3.fc18.x86_64
kernel-3.6.2-2.fc18.x86_64

How reproducible:

100%

Steps to Reproduce:

1. in libguestfs test suite: make -C tests/md check

Additional info:

Host CPU flags:

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid

mdadm --create --run r5t1 --level 5 --raid-devices 4 --spare-devices 1 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2 missing
mdadm: Defaulting to version 1.2 metadata
[ 5.131487] md: bind<sda2>
[ 5.132218] md: bind<sdb2>
[ 5.132966] md: bind<sdc2>
[ 5.133773] md: bind<sdd2>
[ 5.150258] async_tx: api initialized (async)
[ 5.152459] xor: automatically using best checksumming function:
[ 5.153064] invalid opcode: 0000 [#1] SMP
[ 5.153423] Modules linked in: xor(+) async_tx raid1 ghash_clmulni_intel microcode virtio_net virtio_scsi virtio_blk virtio_rng virtio_balloon virtio_mmio sparse_keymap rfkill sym53c8xx scsi_transport_spi crc8 crc_ccitt crc_itu_t crc32c_intel libcrc32c
[ 5.154012] CPU 0
[ 5.154012] Pid: 262, comm: modprobe Not tainted 3.6.2-2.fc18.x86_64.debug #1 Bochs Bochs
[ 5.154012] RIP: 0010:[<ffffffffa0095c6c>] [<ffffffffa0095c6c>] xor_avx_2+0x5c/0x270 [xor]
[ 5.154012] RSP: 0018:ffff88001abfdd00 EFLAGS: 00010202
[ 5.154012] RAX: 000000008005003b RBX: ffff8800192d0000 RCX: 0000000000000001
[ 5.154012] RDX: ffff8800192d3000 RSI: ffff8800192d0000 RDI: 0000000000001000
[ 5.154012] RBP: ffff88001abfddc8 R08: 0000000000000000 R09: 0000000000000000
[ 5.154012] R10: 0000000000000001 R11: 0000000000000000 R12: ffff8800192d3000
[ 5.154012] R13: 0000000000000008 R14: 000000008005003b R15: ffff8800192d0000
[ 5.154012] FS: 00007f1e5e769740(0000) GS:ffff88001f000000(0000) knlGS:0000000000000000
[ 5.154012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.154012] CR2: 00007fff1f8d5000 CR3: 0000000019278000 CR4: 00000000000007f0
[ 5.154012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5.154012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 5.154012] Process modprobe (pid: 262, threadinfo ffff88001abfc000, task ffff88001a0fa450)
[ 5.154012] Stack:
[ 5.154012] ffffffff8134df3e 0000000000000000 0000000000000000 0000000000000001
[ 5.154012] 0000000000000001 ffff88001abfdfd8 ffff88001abfc000 ffff88001a0fa450
[ 5.154012] ffff88001abfdd48 ffffffff8111c227 ffffffff816dcb30 0000000000000000
[ 5.154012] Call Trace:
[ 5.154012] [<ffffffff8134df3e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 5.154012] [<ffffffff8111c227>] ? rcu_irq_exit+0x87/0xd0
[ 5.154012] [<ffffffff816dcb30>] ? retint_restore_args+0x13/0x13
[ 5.154012] [<ffffffffa0096a69>] do_xor_speed+0x7d/0xe7 [xor]
[ 5.154012] [<ffffffffa0005075>] calibrate_xor_blocks+0x75/0x1000 [xor]
[ 5.154012] [<ffffffffa0005000>] ? 0xffffffffa0004fff
[ 5.154012] [<ffffffff8100212a>] do_one_initcall+0x12a/0x180
[ 5.154012] [<ffffffff810e3553>] sys_init_module+0x133/0x2340
[ 5.154012] [<ffffffff81362490>] ? ddebug_proc_open+0xd0/0xd0
[ 5.154012] [<ffffffff81099143>] ? up_write+0x23/0x40
[ 5.154012] [<ffffffff816e55e9>] system_call_fastpath+0x16/0x1b
[ 5.154012] Code: 98 00 00 00 31 c0 49 c1 ed 09 65 48 8b 04 25 b0 c8 00 00 83 80 44 e0 ff ff 01 e8 10 54 fb e0 66 90 49 89 c6 0f 06 66 66 90 66 90 <c5> fc 29 04 24 c5 fc 29 4c 24 20 c5 fc 29 54 24 40 c5 fc 29 5c
[ 5.154012] RIP [<ffffffffa0095c6c>] xor_avx_2+0x5c/0x270 [xor]
[ 5.154012] RSP <ffff88001abfdd00>
[ 5.174236] ---[ end trace 1500dad90bed99ad ]---
[ 5.174615] BUG: sleeping function called from invalid context at kernel/rwsem.c:20
[ 5.175227] in_atomic(): 1, irqs_disabled(): 0, pid: 262, name: modprobe
[ 5.175754] INFO: lockdep is turned off.
[ 5.176088] Pid: 262, comm: modprobe Tainted: G D 3.6.2-2.fc18.x86_64.debug #1
[ 5.176725] Call Trace:
[ 5.176922] [<ffffffff810a281a>] __might_sleep+0x18a/0x240
[ 5.177381] [<ffffffff816d9af6>] down_read+0x26/0x98
[ 5.177790] [<ffffffff81081e64>] exit_signals+0x24/0x130
[ 5.178243] [<ffffffff8106e6bd>] do_exit+0xbd/0xb00
[ 5.178629] [<ffffffff8106b348>] ? kmsg_dump+0x1b8/0x240
[ 5.179089] [<ffffffff8106b1b5>] ? kmsg_dump+0x25/0x240
[ 5.179499] [<ffffffff816dda5d>] oops_end+0x9d/0xe0
[ 5.179893] [<ffffffff8101d9e8>] die+0x58/0x90
[ 5.180282] [<ffffffff816dd320>] do_trap+0xc0/0x170
[ 5.180669] [<ffffffff8101ae96>] ? do_invalid_op+0x86/0xc0
[ 5.181147] [<ffffffff8101aec0>] do_invalid_op+0xb0/0xc0
[ 5.181563] [<ffffffffa0095c6c>] ? xor_avx_2+0x5c/0x270 [xor]
[ 5.182016] [<ffffffff8134df7d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[ 5.182559] [<ffffffff816dcb60>] ? restore_args+0x30/0x30
[ 5.182981] [<ffffffff816e663b>] invalid_op+0x1b/0x20
[ 5.183428] [<ffffffffa0095c6c>] ? xor_avx_2+0x5c/0x270 [xor]
[ 5.183878] [<ffffffff8134df3e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 5.184418] [<ffffffff8111c227>] ? rcu_irq_exit+0x87/0xd0
[ 5.184844] [<ffffffff816dcb30>] ? retint_restore_args+0x13/0x13
[ 5.185347] [<ffffffffa0096a69>] do_xor_speed+0x7d/0xe7 [xor]
[ 5.185816] [<ffffffffa0005075>] calibrate_xor_blocks+0x75/0x1000 [xor]
[ 5.186381] [<ffffffffa0005000>] ? 0xffffffffa0004fff
[ 5.186810] [<ffffffff8100212a>] do_one_initcall+0x12a/0x180
[ 5.187292] [<ffffffff810e3553>] sys_init_module+0x133/0x2340
[ 5.187775] [<ffffffff81362490>] ? ddebug_proc_open+0xd0/0xd0
[ 5.188273] [<ffffffff81099143>] ? up_write+0x23/0x40
[ 5.188692] [<ffffffff816e55e9>] system_call_fastpath+0x16/0x1b
[ 5.189202] BUG: scheduling while atomic: modprobe/262/0x10000003
[ 5.189683] INFO: lockdep is turned off.
[ 5.190034] Modules linked in: xor(+) async_tx raid1 ghash_clmulni_intel microcode virtio_net virtio_scsi virtio_blk virtio_rng virtio_balloon virtio_mmio sparse_keymap rfkill sym53c8xx scsi_transport_spi crc8 crc_ccitt crc_itu_t crc32c_intel libcrc32c
[ 5.192151] Pid: 262, comm: modprobe Tainted: G D 3.6.2-2.fc18.x86_64.debug #1
[ 5.192779] Call Trace:
[ 5.192979] [<ffffffff816cfa51>] __schedule_bug+0x67/0x75
[ 5.193452] [<ffffffff816da78b>] __schedule+0x98b/0x9f0
[ 5.193872] [<ffffffff810a500a>] __cond_resched+0x2a/0x40
[ 5.194356] [<ffffffff816da870>] _cond_resched+0x30/0x40
[ 5.194783] [<ffffffff816d9afb>] down_read+0x2b/0x98
[ 5.195211] [<ffffffff81081e64>] exit_signals+0x24/0x130
[ 5.195657] [<ffffffff8106e6bd>] do_exit+0xbd/0xb00
[ 5.196079] [<ffffffff8106b348>] ? kmsg_dump+0x1b8/0x240
[ 5.196524] [<ffffffff8106b1b5>] ? kmsg_dump+0x25/0x240
[ 5.196946] [<ffffffff816dda5d>] oops_end+0x9d/0xe0
[ 5.197372] [<ffffffff8101d9e8>] die+0x58/0x90
[ 5.197755] [<ffffffff816dd320>] do_trap+0xc0/0x170
[ 5.198182] [<ffffffff8101ae96>] ? do_invalid_op+0x86/0xc0
[ 5.198642] [<ffffffff8101aec0>] do_invalid_op+0xb0/0xc0
[ 5.199095] [<ffffffffa0095c6c>] ? xor_avx_2+0x5c/0x270 [xor]
[ 5.199557] [<ffffffff8134df7d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[ 5.200119] [<ffffffff816dcb60>] ? restore_args+0x30/0x30
[ 5.200551] [<ffffffff816e663b>] invalid_op+0x1b/0x20
[ 5.200975] [<ffffffffa0095c6c>] ? xor_avx_2+0x5c/0x270 [xor]
[ 5.201465] [<ffffffff8134df3e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 5.201992] [<ffffffff8111c227>] ? rcu_irq_exit+0x87/0xd0
[ 5.202455] [<ffffffff816dcb30>] ? retint_restore_args+0x13/0x13
[ 5.202944] [<ffffffffa0096a69>] do_xor_speed+0x7d/0xe7 [xor]
[ 5.203446] [<ffffffffa0005075>] calibrate_xor_blocks+0x75/0x1000 [xor]
[ 5.203976] [<ffffffffa0005000>] ? 0xffffffffa0004fff
[ 5.204432] [<ffffffff8100212a>] do_one_initcall+0x12a/0x180
[ 5.204887] [<ffffffff810e3553>] sys_init_module+0x133/0x2340
[ 5.205395] [<ffffffff81362490>] ? ddebug_proc_open+0xd0/0xd0
[ 5.205858] [<ffffffff81099143>] ? up_write+0x23/0x40
[ 5.206314] [<ffffffff816e55e9>] system_call_fastpath+0x16/0x1b
[ 5.206857] note: modprobe[262] exited with preempt_count 2
[ 5.207586] BUG: scheduling while atomic: modprobe/262/0x10000003
[ 5.208102] INFO: lockdep is turned off.
[ 5.208424] Modules linked in: xor(+) async_tx raid1 ghash_clmulni_intel microcode virtio_net virtio_scsi virtio_blk virtio_rng virtio_balloon virtio_mmio sparse_keymap rfkill sym53c8xx scsi_transport_spi crc8 crc_ccitt crc_itu_t crc32c_intel libcrc32c
[ 5.210496] Pid: 262, comm: modprobe Tainted: G D W 3.6.2-2.fc18.x86_64.debug #1
[ 5.211161] Call Trace:
[ 5.211369] [<ffffffff816cfa51>] __schedule_bug+0x67/0x75
[ 5.211825] [<ffffffff816da78b>] __schedule+0x98b/0x9f0
[ 5.212264] [<ffffffff810976d0>] ? lock_hrtimer_base.isra.20+0x30/0x60
[ 5.212772] [<ffffffff810a500a>] __cond_resched+0x2a/0x40
[ 5.213241] [<ffffffff816da870>] _cond_resched+0x30/0x40
[ 5.213657] [<ffffffff816d9afb>] down_read+0x2b/0x98
[ 5.214091] [<ffffffff810e7fae>] acct_collect+0x4e/0x1b0
[ 5.214507] [<ffffffff8106ee05>] do_exit+0x805/0xb00
[ 5.214896] [<ffffffff8106b348>] ? kmsg_dump+0x1b8/0x240
[ 5.215363] [<ffffffff8106b1b5>] ? kmsg_dump+0x25/0x240
[ 5.215773] [<ffffffff816dda5d>] oops_end+0x9d/0xe0
[ 5.216205] [<ffffffff8101d9e8>] die+0x58/0x90
[ 5.216557] [<ffffffff816dd320>] do_trap+0xc0/0x170
[ 5.216942] [<ffffffff8101ae96>] ? do_invalid_op+0x86/0xc0
[ 5.217419] [<ffffffff8101aec0>] do_invalid_op+0xb0/0xc0
[ 5.217834] [<ffffffffa0095c6c>] ? xor_avx_2+0x5c/0x270 [xor]
[ 5.218324] [<ffffffff8134df7d>] ? trace_hardirqs_off_thunk+0x3a/0x3c
[ 5.218834] [<ffffffff816dcb60>] ? restore_args+0x30/0x30
[ 5.219287] [<ffffffff816e663b>] invalid_op+0x1b/0x20
[ 5.219704] [<ffffffffa0095c6c>] ? xor_avx_2+0x5c/0x270 [xor]
[ 5.220181] [<ffffffff8134df3e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[ 5.220700] [<ffffffff8111c227>] ? rcu_irq_exit+0x87/0xd0
[ 5.221153] [<ffffffff816dcb30>] ? retint_restore_args+0x13/0x13
[ 5.221638] [<ffffffffa0096a69>] do_xor_speed+0x7d/0xe7 [xor]
[ 5.222140] [<ffffffffa0005075>] calibrate_xor_blocks+0x75/0x1000 [xor]
[ 5.222656] [<ffffffffa0005000>] ? 0xffffffffa0004fff
[ 5.223103] [<ffffffff8100212a>] do_one_initcall+0x12a/0x180
[ 5.223550] [<ffffffff810e3553>] sys_init_module+0x133/0x2340
[ 5.224034] [<ffffffff81362490>] ? ddebug_proc_open+0xd0/0xd0
[ 5.224492] [<ffffffff81099143>] ? up_write+0x23/0x40
[ 5.224887] [<ffffffff816e55e9>] system_call_fastpath+0x16/0x1b
[ 5.225927] md: personality for level 5 is not loaded!
[ 5.226402] md: md125 stopped.
[ 5.226664] md: unbind<sdd2>
[ 5.226902] md: export_rdev(sdd2)
mdadm: RUN_ARRAY failed: Invalid argument
[ 5.227520] md: unbind<sdc2>
[ 5.227759] md: export_rdev(sdc2)
[ 5.228085] md: unbind<sdb2>
[ 5.228343] md: export_rdev(sdb2)
[ 5.228626] md: unbind<sda2>
[ 5.228863] md: export_rdev(sda2)
Instruction that KVM failed to parse was:

     bfc:   c5 fc 29 04 24          vmovaps %ymm0,(%rsp)
Apparently you can't just change the libvirt XML to disable features that don't work:

  <cpu mode="host-model">
    <model fallback="allow"/>
    <feature policy="disable" name="avx"/>
  </cpu>

gives the error:

*stdin*:6: libguestfs: error: could not create appliance through libvirt: internal error Non-empty feature list specified without CPU model [code=1 domain=31]
(In reply to comment #2)
> Apparently you can't just change the libvirt XML to disable
> features that don't work:
>
>   <cpu mode="host-model">
>     <model fallback="allow"/>
>     <feature policy="disable" name="avx"/>
>   </cpu>

Right, bug 799354 is tracking that.
Could you share the QEMU command line generated by libvirt? I believe it does not explicitly mention avx, i.e., it gets there through the SandyBridge model, right? Anyway, avx is supposed to work with KVM since QEMU supports the SandyBridge model, which enables avx. Thus, it's either a QEMU or a kernel bug. I'm moving this bug to the former for further investigation.
LC_ALL=C LD_LIBRARY_PATH=/tmp/whenjobs2f2d92b86ba2111addc7e199fa77e648/libguestfs-1.19.53/src/.libs:/tmp/whenjobs2f2d92b86ba2111addc7e199fa77e648/libguestfs-1.19.53/gobject/.libs:/tmp/whenjobs2f2d92b86ba2111addc7e199fa77e648/libguestfs-1.19.53/ruby/ext/guestfs PATH=/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/rjones/.local/bin:/home/rjones/bin HOME=/home/rjones USER=rjones LOGNAME=rjones TMPDIR=/home/rjones/d/libguestfs/tmp /usr/bin/qemu-kvm -name guestfs-1t6f28e33d5tqbmu -S -M pc-1.2 -cpu Westmere,+rdtscp,+avx,+osxsave,+xsave,+tsc-deadline,+pdcm,+xtpr,+tm2,+est,+vmx,+ds_cpl,+monitor,+dtes64,+pclmuldq,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme -enable-kvm -m 500 -smp 1,sockets=1,cores=1,threads=1 -uuid 3ab7b5c6-31ff-2591-bc35-4d338347423c -nographic -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/home/rjones/.config/libvirt/qemu/lib/guestfs-1t6f28e33d5tqbmu.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-reboot -no-shutdown -no-acpi -kernel /home/rjones/d/libguestfs/tmp/.guestfs-1000/kernel.17966 -initrd /home/rjones/d/libguestfs/tmp/.guestfs-1000/initrd.17966 -append panic=1 console=ttyS0 udevtimeout=600 no_timer_check acpi=off printk.time=1 cgroup_disable=memory root=/dev/sde selinux=0 guestfs_verbose=1 TERM=xterm -device virtio-scsi-pci,id=scsi0,bus=pci.0,addr=0x3 -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4 -drive file=/home/rjones/d/libguestfs/tests/md/md-test1.img,if=none,id=drive-scsi0-0-0-0,format=raw,cache=none -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 -drive file=/home/rjones/d/libguestfs/tests/md/md-test2.img,if=none,id=drive-scsi0-0-1-0,format=raw,cache=none -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=1,lun=0,drive=drive-scsi0-0-1-0,id=scsi0-0-1-0 -drive file=/home/rjones/d/libguestfs/tests/md/md-test3.img,if=none,id=drive-scsi0-0-2-0,format=raw,cache=none 
-device scsi-hd,bus=scsi0.0,channel=0,scsi-id=2,lun=0,drive=drive-scsi0-0-2-0,id=scsi0-0-2-0 -drive file=/home/rjones/d/libguestfs/tests/md/md-test4.img,if=none,id=drive-scsi0-0-3-0,format=raw,cache=none -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=3,lun=0,drive=drive-scsi0-0-3-0,id=scsi0-0-3-0 -drive file=/home/rjones/d/libguestfs/tmp/libguestfsowvgCb/snapshot1,if=none,id=drive-scsi0-0-4-0,format=qcow2,cache=unsafe -device scsi-hd,bus=scsi0.0,channel=0,scsi-id=4,lun=0,drive=drive-scsi0-0-4-0,id=scsi0-0-4-0 -chardev socket,id=charserial0,path=/home/rjones/d/libguestfs/tmp/libguestfsowvgCb/console.sock -device isa-serial,chardev=charserial0,id=serial0 -chardev socket,id=charchannel0,path=/home/rjones/d/libguestfs/tmp/libguestfsowvgCb/guestfsd.sock -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.libguestfs.channel.0 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
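For anyone who wants to pull the CPU model and feature list out of a command line like the one above, here is a minimal Python sketch. The function name `parse_cpu_option` is made up for illustration, and the command line in the example is a shortened copy of the one libvirt generated (everything after `-m 500` is dropped). It only assumes the `-cpu MODEL,+feat,-feat` syntax visible in this report.

```python
import shlex

def parse_cpu_option(cmdline):
    """Return (model, enabled, disabled) from the -cpu argument.

    Hypothetical helper: it only understands the "MODEL,+feat,-feat"
    form that appears in this bug report.
    """
    args = shlex.split(cmdline)
    spec = args[args.index("-cpu") + 1]
    model, *features = spec.split(",")
    enabled = [f[1:] for f in features if f.startswith("+")]
    disabled = [f[1:] for f in features if f.startswith("-")]
    return model, enabled, disabled

# Shortened copy of the libvirt-generated command line from this report.
cmdline = (
    "/usr/bin/qemu-kvm -name guestfs-1t6f28e33d5tqbmu -S -M pc-1.2 "
    "-cpu Westmere,+rdtscp,+avx,+osxsave,+xsave,+tsc-deadline,+pdcm,+xtpr,"
    "+tm2,+est,+vmx,+ds_cpl,+monitor,+dtes64,+pclmuldq,+pbe,+tm,+ht,+ss,"
    "+acpi,+ds,+vme -enable-kvm -m 500"
)

model, enabled, disabled = parse_cpu_option(cmdline)
print("model:", model)
print("avx requested:", "avx" in enabled)
print("xsave requested:", "xsave" in enabled)
```

So avx and xsave are both requested explicitly on top of Westmere; the SandyBridge model name never appears.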
Eduardo, can you comment here?
Interesting, your SandyBridge machine is likely missing the x2apic feature. If it had that feature, libvirt would report SandyBridge instead of Westmere + avx. Could you try changing the XML to contain

  <cpu mode='custom' match='exact'>
    <model fallback='forbid'>SandyBridge</model>
    <feature policy='disable' name='x2apic'/>
  </cpu>

and see if that makes any difference?
Yes, this works:

  <cpu mode='custom' match='exact'>
    <model fallback='forbid'>SandyBridge</model>
    <feature policy='disable' name='x2apic'/>
  </cpu>

Host /proc/cpuinfo is below. It is indeed missing x2apic.

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz
stepping : 7
microcode : 0x28
cpu MHz : 1600.000
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 6822.32
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz
stepping : 7
microcode : 0x28
cpu MHz : 1600.000
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 1
cpu cores : 4
apicid : 2
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 6822.32
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz
stepping : 7
microcode : 0x28
cpu MHz : 1600.000
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 2
cpu cores : 4
apicid : 4
initial apicid : 4
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 6822.32
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz
stepping : 7
microcode : 0x28
cpu MHz : 1600.000
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 6
initial apicid : 6
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 6822.32
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 4
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz
stepping : 7
microcode : 0x28
cpu MHz : 1600.000
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 1
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 6822.32
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 5
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz
stepping : 7
microcode : 0x28
cpu MHz : 1600.000
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 1
cpu cores : 4
apicid : 3
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 6822.32
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 6
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz
stepping : 7
microcode : 0x28
cpu MHz : 1600.000
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 2
cpu cores : 4
apicid : 5
initial apicid : 5
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 6822.32
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

processor : 7
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz
stepping : 7
microcode : 0x28
cpu MHz : 1600.000
cache size : 8192 KB
physical id : 0
siblings : 8
core id : 3
cpu cores : 4
apicid : 7
initial apicid : 7
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 6822.32
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
Thanks, what about:

  <cpu mode='custom' match='exact'>
    <model fallback='forbid'>SandyBridge</model>
    <feature policy='disable' name='x2apic'/>
    <feature policy='require' name='osxsave'/>
    <feature policy='require' name='pdcm'/>
    <feature policy='require' name='xtpr'/>
    <feature policy='require' name='tm2'/>
    <feature policy='require' name='est'/>
    <feature policy='require' name='vmx'/>
    <feature policy='require' name='ds_cpl'/>
    <feature policy='require' name='monitor'/>
    <feature policy='require' name='dtes64'/>
    <feature policy='require' name='pbe'/>
    <feature policy='require' name='tm'/>
    <feature policy='require' name='ht'/>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='acpi'/>
    <feature policy='require' name='ds'/>
    <feature policy='require' name='vme'/>
  </cpu>

that should give you the same (feature-wise) CPU but using SandyBridge rather than Westmere model.

And BTW, it looks like we have a bug since

  <cpu mode='custom' match='exact'>
    <model fallback='forbid'>SandyBridge</model>
    <feature policy='force' name='x2apic'/>
  </cpu>

should work even if the host CPU does not support x2apic (and AFAIK x2apic is one of the features that QEMU will emulate) but I tried that and libvirt is complaining that x2apic is not supported by host CPU.
This works:

  <cpu mode='custom' match='exact'>
    <model fallback='forbid'>SandyBridge</model>
    <feature policy='disable' name='x2apic'/>
    <feature policy='require' name='osxsave'/>
    <feature policy='require' name='pdcm'/>
    <feature policy='require' name='xtpr'/>
    <feature policy='require' name='tm2'/>
    <feature policy='require' name='est'/>
    <feature policy='require' name='vmx'/>
    <feature policy='require' name='ds_cpl'/>
    <feature policy='require' name='monitor'/>
    <feature policy='require' name='dtes64'/>
    <feature policy='require' name='pbe'/>
    <feature policy='require' name='tm'/>
    <feature policy='require' name='ht'/>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='acpi'/>
    <feature policy='require' name='ds'/>
    <feature policy='require' name='vme'/>
  </cpu>

I also tried above plus:

    <feature policy='require' name='avx'/>

which *worked*. Is that expected?
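In case it helps anyone generating this kind of workaround XML from a script, here is a rough Python sketch that builds the same <cpu> element with the standard library. The helper name `build_cpu_element` is invented for this example; the element and attribute names are just the ones shown in the XML above, and nothing here talks to libvirt.

```python
import xml.etree.ElementTree as ET

def build_cpu_element(model, disable=(), require=()):
    """Build a <cpu mode='custom'> element like the one in this report.

    Hypothetical helper, not part of libvirt or libguestfs.
    """
    cpu = ET.Element("cpu", {"mode": "custom", "match": "exact"})
    model_el = ET.SubElement(cpu, "model", {"fallback": "forbid"})
    model_el.text = model
    for name in disable:
        ET.SubElement(cpu, "feature", {"policy": "disable", "name": name})
    for name in require:
        ET.SubElement(cpu, "feature", {"policy": "require", "name": name})
    return cpu

# Feature names copied from the XML that worked above.
required = ["osxsave", "pdcm", "xtpr", "tm2", "est", "vmx", "ds_cpl",
            "monitor", "dtes64", "pbe", "tm", "ht", "ss", "acpi", "ds", "vme"]

cpu = build_cpu_element("SandyBridge", disable=["x2apic"], require=required)

# Print one child element per line for readability.
for child in cpu:
    print(ET.tostring(child, encoding="unicode"))
```

The serialized `cpu` element can then be spliced into the domain XML wherever the <cpu mode="host-model"/> element used to be.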
The <feature policy='require' name='avx'/> element is redundant since avx is already required by the SandyBridge model; the guest OS should see exactly the same CPU regardless of this element.

Anyway, it's expected that SandyBridge works with avx since it explicitly has support for it. The fact that it doesn't work when it's added on top of Westmere is unfortunate but not entirely surprising. It's likely influenced by bits that are not covered by libvirt, such as cpu family, model, stepping and other stuff. We've seen this behaviour in the past. I think we need a new mode in addition to custom, host-model, and host-passthrough, that would be similar to host-model but would only use the bare CPU model without trying to add all features that are not included in the model but supported by the host CPU. The situation may also become a bit better once we have a better interface for CPU probing (bug 824989).

Eduardo, could you confirm that the kernel panic might be caused by libvirt using Westmere + avx and that this is an unsupported configuration? If so, we can move this bug to libvirt.
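To illustrate the behaviour described above, here is a deliberately simplified Python model of "pick the richest named model the host fully supports, then add the leftover host features". This is NOT libvirt's actual algorithm or CPU map: the `MODELS` table contains only a handful of features chosen for the example, and `host_model` is a made-up name.

```python
# Toy feature sets, for illustration only. Real model definitions are
# much larger; the only property relied on here is that SandyBridge
# includes x2apic and avx while Westmere does not include avx.
MODELS = {
    "Westmere": {"sse4_2", "popcnt", "aes"},
    "SandyBridge": {"sse4_2", "popcnt", "aes", "xsave", "avx",
                    "x2apic", "tsc-deadline"},
}

def host_model(host_features, models=MODELS):
    """Return (model, extras): the largest model whose features are all
    present on the host, plus the host features that model lacks."""
    candidates = {name: feats for name, feats in models.items()
                  if feats <= host_features}
    best = max(candidates, key=lambda name: len(candidates[name]))
    extras = sorted(host_features - candidates[best])
    return best, extras

# A host like the reporter's: SandyBridge-class, but without x2apic.
host = {"sse4_2", "popcnt", "aes", "xsave", "avx", "tsc-deadline"}
print(host_model(host))

# The same host with x2apic present.
print(host_model(host | {"x2apic"}))
```

Under these assumptions the first call falls back to Westmere with avx, tsc-deadline and xsave bolted on, while the second picks SandyBridge with nothing extra. The "bare model" mode suggested above would correspond to returning only the model name and dropping the extras.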
"-cpu Westmere,+avx" actually should enable the bit on CPUID if and only if KVM is able to handle the feature. When KVM can't handle the feature, it should be filtered out before the guest CPUID table is built. I still don't understand why exactly the guest got an invalid operation exception, as the instruction was supposed to be working. Maybe it's related to the "level" field and the xsave feature (that is required for AVX, as far as I recall), that needs level >= 0xD.

I don't know if the guest is really allowed to use the feature when the AVX bit is set but the necessary xsave bits are not present (if it is not, then this is a guest bug). If the guest was simply misled by the CPUID information, and correct in trying to use the instructions, it is a QEMU bug (QEMU should have disabled the feature, and aborted in case the "enforce" flag is set). In either case, it is not a libvirt bug to ask for "-cpu Westmere,+avx".

But it would be interesting if libvirt could treat some CPU features as "can be safely disabled". It would be much better if libvirt used "-cpu SandyBridge,-x2apic" on that host, instead of "-cpu Westmere,+<lots of flags>".
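On the xsave dependency: as far as I understand the Intel documentation, software is expected to check CPUID.1:ECX.OSXSAVE (bit 27) and CPUID.1:ECX.AVX (bit 28), and then confirm via XGETBV that XCR0 bits 1 and 2 are both set, before using AVX. VEX-encoded instructions raise #UD when CR4.OSXSAVE (bit 18) is clear. The Python sketch below just encodes those bit tests; the function names are made up, and the CR4 value is the one printed in the oops in the original report.

```python
# Bit positions as I understand them from the Intel SDM.
CPUID1_ECX_XSAVE = 1 << 26    # CPU supports XSAVE/XRSTOR
CPUID1_ECX_OSXSAVE = 1 << 27  # OS has set CR4.OSXSAVE
CPUID1_ECX_AVX = 1 << 28      # CPU supports AVX
XCR0_SSE = 1 << 1             # XMM state enabled in XCR0
XCR0_AVX = 1 << 2             # YMM state enabled in XCR0
CR4_OSXSAVE = 1 << 18

def avx_usable(cpuid1_ecx, xcr0):
    """Application-level check: AVX may be used only if the OS enabled
    XSAVE, the CPU reports AVX, and XCR0 enables both XMM and YMM state."""
    if not cpuid1_ecx & CPUID1_ECX_OSXSAVE:
        return False
    if not cpuid1_ecx & CPUID1_ECX_AVX:
        return False
    needed = XCR0_SSE | XCR0_AVX
    return (xcr0 & needed) == needed

def cr4_osxsave_enabled(cr4):
    """True if CR4.OSXSAVE is set in the given CR4 value."""
    return bool(cr4 & CR4_OSXSAVE)

# AVX bit set but OSXSAVE clear: AVX must not be used.
print(avx_usable(CPUID1_ECX_AVX, xcr0=0x7))
# AVX and OSXSAVE set, XCR0 = x87 | SSE | AVX: AVX is usable.
print(avx_usable(CPUID1_ECX_AVX | CPUID1_ECX_OSXSAVE, xcr0=0x7))
# CR4 as printed in the guest oops (00000000000007f0).
print(cr4_osxsave_enabled(0x7F0))
```

If I'm reading the register dump right, bit 18 is clear in the guest's CR4, which would be consistent with an invalid opcode on the first VEX-encoded instruction. That is an inference from the printed value, not something confirmed in this thread.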
Reassigning to libvirt based on above discussion
Do we have an update on this? I would really like to start using host-model.
This still happens with Fedora 19, libvirt-1.0.5.5-1.fc19.x86_64.

Loading the btrfs module, which loads the xor module, fails because it tries to run an AVX instruction:

modprobe btrfs
[ 1.804591] xor: automatically using best checksumming function:
[ 1.806020] invalid opcode: 0000 [#1] SMP
[ 1.806416] Modules linked in: xor(+) snd_pcsp snd_pcm snd_page_alloc ghash_clmulni_intel snd_timer microcode snd soundcore virtio_net virtio_scsi virtio_blk virtio_rng virtio_balloon virtio_mmio sparse_keymap rfkill sym53c8xx scsi_transport_spi crc8 crc_ccitt crc32 crc_itu_t crc32_pclmul crc32c_intel libcrc32c megaraid megaraid_sas megaraid_mbox megaraid_mm
[ 1.809709] CPU: 0 PID: 150 Comm: modprobe Not tainted 3.10.9-200.fc19.x86_64.debug #1
[ 1.810397] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 1.810931] task: ffff880019cf8000 ti: ffff880019ca4000 task.ti: ffff880019ca4000
[ 1.811597] RIP: 0010:[<ffffffffa0119da0>] [<ffffffffa0119da0>] xor_avx_2+0x50/0x230 [xor]
[ 1.812333] RSP: 0018:ffff880019ca5d08 EFLAGS: 00010202
[ 1.812811] RAX: 0000000000000007 RBX: ffff880019eb8000 RCX: ffff880019cf8000
[ 1.813429] RDX: ffff880019ca5fd8 RSI: 0000000000000000 RDI: 0000000000001000
[ 1.814071] RBP: ffff880019ca5d20 R08: 0000000000000002 R09: 0000000000000000
[ 1.814684] R10: 0000000000000001 R11: 0000000000000001 R12: ffff880019ebb000
[ 1.815320] R13: 0000000000000008 R14: ffff880019eb8000 R15: 00000000fffb732e
[ 1.815952] FS: 00007fb93b96a740(0000) GS:ffff88001f000000(0000) knlGS:0000000000000000
[ 1.816647] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.817170] CR2: 00007f8d740e9000 CR3: 0000000019e2b000 CR4: 00000000000007f0
[ 1.817785] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1.818424] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1.819047] Stack:
[ 1.819245] 0000000000000000 ffffffffa011c000 ffff880019ebb000 ffff880019ca5d60
[ 1.819929] ffffffffa00a7080 0000000000000005 ffff880019eb8000 ffff880019ebb000
[ 1.820651] ffffffffa011c110 0000000000000001 ffffffffa011c0c0 ffff880019ca5d80
[ 1.821354] Call Trace:
[ 1.821581] [<ffffffffa00a7080>] do_xor_speed+0x80/0xe0 [xor]
[ 1.822097] [<ffffffffa00a714b>] calibrate_xor_blocks+0x6b/0xf20 [xor]
[ 1.822683] [<ffffffffa00a70e0>] ? do_xor_speed+0xe0/0xe0 [xor]
[ 1.823214] [<ffffffff810020e2>] do_one_initcall+0xe2/0x1a0
[ 1.823723] [<ffffffff810e95e2>] load_module+0x1c62/0x27d0
[ 1.824217] [<ffffffff813749d0>] ? ddebug_proc_write+0xf0/0xf0
[ 1.824748] [<ffffffff810ea2e6>] SyS_finit_module+0x86/0xb0
[ 1.825247] [<ffffffff81720099>] system_call_fastpath+0x16/0x1b
[ 1.825779] Code: 01 00 00 65 48 8b 04 25 f0 c8 00 00 83 80 44 e0 ff ff 01 e8 23 7a f0 e0 4d 85 ed 49 8d 45 ff 0f 84 9b 01 00 00 66 0f 1f 44 00 00 <c4> c1 7d 6f 04 24 c5 fc 57 03 c5 fd 7f 03 c4 c1 7d 6f 4c 24 20
[ 1.828335] RIP [<ffffffffa0119da0>] xor_avx_2+0x50/0x230 [xor]
[ 1.828869] RSP <ffff880019ca5d08>
[ 1.829213] ---[ end trace 70ce68c981f09edb ]---

However using plain old -cpu host on the qemu command line works fine.
Since this bug has been around for almost *a year*, and it's extremely annoying, I'm trying to work out if this is a bug in the guest kernel, qemu, or libvirt. I'm not any closer to working that out. libvirt passes the following CPU/machine-related flags: -machine pc-i440fx-1.6,accel=kvm,usb=off -cpu Westmere,+rdtscp,+avx,+osxsave,+xsave,+tsc-deadline,+pcid,+pdcm,+xtpr,+tm2,+est,+vmx,+ds_cpl,+monitor,+dtes64,+pclmuldq,+pbe,+tm,+ht,+ss,+acpi,+ds,+vme -m 500 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 Host CPU flags are reported to be: flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid cpuid level : 13 Guest CPU flags are reported to be: flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc rep_good nopl pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes avx hypervisor lahf_lm cpuid level : 13 (In reply to Eduardo Habkost from comment #12) > "-cpu Westmere,+avx" actually should enable the bit on CPUID if and only if > KVM is able to handle the feature. When KVM can't handle the feature, it > should be filtered out before the guest CPUID table is built. I still don't > understand why exactly the guest got an invalid operation exception, as the > instruction was supposed to be working. Maybe it's related to the "level" > field and the xsave feature (that is required for AVX, as far as I recall), > that needs level >= 0xD. I see the xsave flag in the host CPU flags, and in the libvirt-generated qemu command line. 
I do NOT see the xsave flag in the guest flags. Not sure what that means. Assuming "level" means "cpuid level", then both report 13 == 0xD.

> I don't know if the guest is really allowed to use the feature when the AVX
> bit is set but the necessary xsave bits are not present (if it is not, then
> this is a guest bug).

As far as I can tell from the kernel code, cpu_has_avx just checks the avx feature flag. It doesn't check for xsave. The xor code which is throwing the invalid opcode is only checking cpu_has_avx, i.e. only checking for the avx flag.

According to the Intel PRM it does appear that you shouldn't use avx unless xsave is supported, although it doesn't appear to be an absolute requirement. I'm assuming it's something to do with those extra registers not being saved over a context switch, which doesn't sound like an invalid opcode situation to me (corrupt data OTOH).

Why would the xsave bit not be present in the guest?

> If the guest was simply misled by the CPUID
> information, and correct in trying to use the instructions, it is a QEMU bug
> (QEMU should have disabled the feature, and abort in case the "enforce" flag
> is set). In either case, it is not a libvirt bug to ask for "-cpu
> Westmere,+avx".
>
> But it would be interesting if libvirt could treat some CPU features as "can
> be safely disabled". It would be much better if libvirt used "-cpu
> SandyBridge,-x2apic" on that host, instead of "-cpu Westmere,+<lots of
> flags>".
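The mismatch described in this comment (avx advertised to the guest, xsave not) can be spotted mechanically from the /proc/cpuinfo flags. The following is only a sketch: the function names are made up for illustration, the guest flags are the ones reported in this comment, and the host string is just an excerpt of the host flags.

```python
def parse_flags(flags_line):
    """Split the value of a /proc/cpuinfo 'flags' line into a set of names."""
    return set(flags_line.split())


def avx_prerequisites_missing(flags):
    """Return the flags AVX depends on that are absent although 'avx' is set.

    Only 'xsave' is checked here, which is the flag discussed in this thread.
    """
    if "avx" not in flags:
        return []
    return sorted({"xsave"} - flags)


# Guest flags as reported in this comment.
GUEST_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 "
    "clflush mmx fxsr sse sse2 ss syscall nx rdtscp lm constant_tsc rep_good "
    "nopl pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 popcnt "
    "tsc_deadline_timer aes avx hypervisor lahf_lm"
)

# Excerpt (not the full list) of the host flags from the same comment.
HOST_FLAGS_EXCERPT = "sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm"

print("guest:", avx_prerequisites_missing(parse_flags(GUEST_FLAGS)))
print("host: ", avx_prerequisites_missing(parse_flags(HOST_FLAGS_EXCERPT)))
```

Run against the guest flags it reports xsave as missing; run against the host excerpt it reports nothing.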
Also to confirm, the instruction which fails is an AVX instruction (not xsave): 1c60: c4 c1 7d 6f 04 24 vmovdqa (%r12),%ymm0
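For anyone who wants to double-check the disassembly by hand, here is a rough sketch that pulls the relevant fields out of the 3-byte VEX prefix of those bytes. It assumes 64-bit code (where a leading 0xC4 byte is always a VEX prefix) and decodes only the fields needed here; decode_vex3 is an illustrative helper, not part of any existing tool.

```python
def decode_vex3(code):
    """Decode selected fields of a 3-byte VEX prefix (first byte 0xC4).

    Assumes 64-bit mode. Only the opcode map, vector length, implied
    prefix and opcode byte are extracted.
    """
    if len(code) < 4 or code[0] != 0xC4:
        raise ValueError("not a 3-byte VEX prefix")
    b1, b2 = code[1], code[2]
    return {
        "map": b1 & 0x1F,                          # 1 selects the 0F opcode map
        "vector_bits": 256 if b2 & 0x04 else 128,  # VEX.L: 1 means ymm registers
        "pp": b2 & 0x03,                           # 1 means an implied 66 prefix
        "opcode": code[3],
    }


# The faulting bytes from the oops and from the disassembly above.
insn = bytes.fromhex("c4 c1 7d 6f 04 24")
fields = decode_vex3(insn)
print(fields)
```

A VEX-encoded instruction with a 256-bit vector length and opcode 0x6f in the 0F map with an implied 66 prefix is consistent with the vmovdqa load into %ymm0 shown above.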
Still happening in Rawhide (albeit using the F19 kernel, because the Rawhide kernel has other issues): libvirt-1.1.2-1.fc21.x86_64 qemu-1.6.0-5.fc21.x86_64 kernel-3.10.10-200.fc19.x86_64
The following program compiled and ran fine on the host, so I guess that indicates that the host has no problem with AVX instructions:

        .text
        .globl main
main:
        movq $testdata,%r12
        vmovdqa (%r12),%ymm0
        /*movq (%r12),%r10*/
        movq $0,%rax
        ret

        .data
        .align 32
testdata:
        .float 1,2,3,4,5,6,7,8
From Boxes bug https://bugzilla.gnome.org/show_bug.cgi?id=720798 "When I run qemu with the '-cpu host' parameter instead, GNOME Shell starts correctly." Shouldn't libvirt just use that option of qemu for 'host-model' config?
Oops, didn't mean to remove the needinfo flag.
(In reply to Zeeshan Ali from comment #20)
> From Boxes bug https://bugzilla.gnome.org/show_bug.cgi?id=720798
>
> "When I run qemu with the '-cpu host' parameter instead, GNOME Shell starts
> correctly."
>
> Shouldn't libvirt just use that option of qemu for 'host-model' config?

host-passthrough maps to the qemu '-cpu host' parameter.

host-model is as convoluted as it is so that live migration can be supported. i.e. You start a guest on host A with host-model, then migrate it to host B. What you DON'T want to happen is that -cpu host is used on host B, since likely host A and host B have different CPUs, so the guest will see a sudden change in CPU capabilities and probably crash / get inconsistent results. Instead host-model tries to create a full description of host A's CPU, and after live migration the guest sees the same (host A) CPU features.

AIUI host-passthrough disables or prevents live migration in some way.
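To make the migration argument concrete, here is a toy sketch of the underlying check: a guest keeps working after migration only if every feature it was started with also exists on the destination. The feature sets below are hypothetical and the helper name is made up; this is not how libvirt implements the comparison.

```python
def missing_on_destination(guest_features, dest_host_features):
    """Features the running guest was promised that the destination lacks."""
    return sorted(set(guest_features) - set(dest_host_features))


# Hypothetical feature sets, for illustration only.
host_a = {"sse4_2", "aes", "xsave", "avx"}
host_b = {"sse4_2", "aes"}

# A guest started on host A sees host A's features.
guest_cpu = set(host_a)

print("A -> B loses:", missing_on_destination(guest_cpu, host_b))
print("A -> A loses:", missing_on_destination(guest_cpu, host_a))
```

With a fixed description of the guest CPU (which is what host-model aims for), this comparison can be made before migrating; with '-cpu host' on each side the guest would simply see whatever the new host has.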
(In reply to Richard W.M. Jones from comment #22)
> (In reply to Zeeshan Ali from comment #20)
> > From Boxes bug https://bugzilla.gnome.org/show_bug.cgi?id=720798
> >
> > "When I run qemu with the '-cpu host' parameter instead, GNOME Shell starts
> > correctly."
> >
> > Shouldn't libvirt just use that option of qemu for 'host-model' config?
>
> host-passthrough maps to the qemu '-cpu host' parameter.
>
> host-model is as convoluted as it is so that live migration can
> be supported. ie. You start a guest on host A with host-model,
> then migrate it to host B. What you DON'T want to happen is
> that -cpu host is used on host B, since likely host A and host B
> have different CPUs, so the guest will see a sudden change in CPU
> capabilities and probably crash / get inconsistent results.
> Instead host-model tries to create a full description of host A's
> CPU, and after live migration the guest sees the same (host A)
> CPU features.
>
> AIUI host-passthrough disables or prevents live migration in some way.

Ah, in that case I think we are actually better off using host-model in Boxes, at least for now. We don't have live migration support, and if the hardware changes, the user will lose any saved state on the VMs.
Moving to the upstream tracker: there's ongoing work in libvirt and qemu to sort this out, but it's unlikely to ever be backportable to a stable Fedora; it will require a rebase.
*** Bug 1084576 has been marked as a duplicate of this bug. ***
Any update on this? We moved from 'host-model' to 'host-passthrough' to avoid this in Boxes and only now realized that that breaks support of non-KVM CPUs. :(
(In reply to Zeeshan Ali from comment #26)
> Any update on this? We moved from 'host-model' to 'host-passthrough' to
> avoid this in Boxes and only now realized that that breaks support of
> non-KVM CPUs. :(

Still a bug. It only happens on certain hardware (which unfortunately I own) so it's rather hard to reproduce.

You should use host-passthrough, but disable it when the domain is <domain type="qemu">. This is the logic used by libguestfs:

https://github.com/libguestfs/libguestfs/blob/master/src/launch-libvirt.c#L1012

I would say your biggest problem with using host-passthrough is surely that live migration doesn't work? (Of course we don't care about that in libguestfs.)
(In reply to Richard W.M. Jones from comment #27)
> (In reply to Zeeshan Ali from comment #26)
> > Any update on this? We moved from 'host-model' to 'host-passthrough' to
> > avoid this in Boxes and only now realized that that breaks support of
> > non-KVM CPUs. :(
>
> Still a bug. It only happens on certain hardware (which
> unfortunately I own) so it's rather hard to reproduce.
>
> You should use host-passthrough, but disable it when the
> <domain type="qemu">. This is the logic used by libguestfs:
>
> https://github.com/libguestfs/libguestfs/blob/master/src/launch-libvirt.c#L1012

Hmm.. Ah thanks. So by 'disable' you mean you just leave the model to libvirt? I was thinking of following the advice in libvirt docs:

"Beware, due to the way libvirt detects host CPU and due to the fact libvirt does not talk to QEMU/KVM when creating the CPU model, CPU configuration created using host-model may not work as expected. The guest CPU may differ from the configuration and it may also confuse guest OS by using a combination of CPU features and other parameters (such as CPUID level) that don't work. Until these issues are fixed, it's a good idea to avoid using host-model and use custom mode with just the CPU model from host capabilities XML."

> I would say your biggest problem with using host-passthrough
> is surely that live migration doesn't work? (Of course we
> don't care about that in libguestfs)

It certainly would be nice to have 'live migration', since Boxes always suspends VMs on exit and if you change your CPU in between, your only choice is to lose the saved state (which could easily mean loss of important data), but that's not currently my issue. Boxes being unable to start the created VM on non-KVM hosts is. :)
By "disable", I mean, don't include the <cpu> element when !kvm. And also for ARM which doesn't have a concept of -cpu host. See the code I linked to.
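A minimal sketch of that rule, written with Python's xml.etree rather than copied from libguestfs (the linked C code is the authoritative version): emit <cpu mode="host-passthrough"/> only for KVM domains on non-ARM architectures. The function name, the ARM prefix test, and the bare-bones domain fragment are assumptions made for illustration; a real domain needs many more elements.

```python
import xml.etree.ElementTree as ET


def build_domain_fragment(name, domain_type, arch):
    """Build a skeletal libvirt <domain> fragment (illustrative only).

    The <cpu> element is added only when the domain type is "kvm" and the
    architecture is not ARM, following the rule described in this comment.
    """
    dom = ET.Element("domain", {"type": domain_type})
    ET.SubElement(dom, "name").text = name
    os_el = ET.SubElement(dom, "os")
    ET.SubElement(os_el, "type", {"arch": arch}).text = "hvm"

    # Assumption: ARM architectures are recognised by these name prefixes.
    is_arm = arch.startswith(("arm", "aarch64"))
    if domain_type == "kvm" and not is_arm:
        ET.SubElement(dom, "cpu", {"mode": "host-passthrough"})

    return ET.tostring(dom, encoding="unicode")


print(build_domain_fragment("guest", "kvm", "x86_64"))
print(build_domain_fragment("guest", "qemu", "x86_64"))
print(build_domain_fragment("guest", "kvm", "aarch64"))
```

Only the first call produces a <cpu> element; for the TCG ("qemu") and ARM cases the CPU choice is left to the defaults.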
The bug is that features xsave and avx need level=13, but Westmere has 11. QEMU should not expose xsave unless level=13. Does it work if you remove "+xsave" from the command line of comment 16?
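As a sketch of the check being proposed, the snippet below drops features whose required CPUID level exceeds the level of the chosen CPU model. The table contains only the two features and the level named in this comment, and the function name is hypothetical; it is not taken from QEMU.

```python
# Minimum CPUID level per feature, as stated in this comment (13 == 0xD).
MIN_CPUID_LEVEL = {"xsave": 0xD, "avx": 0xD}


def features_above_level(level, features):
    """Return the requested features that need a higher CPUID level."""
    return sorted(f for f in features if MIN_CPUID_LEVEL.get(f, 0) > level)


requested = ["rdtscp", "avx", "osxsave", "xsave", "pcid"]

# Westmere has level 11 according to this comment.
print("level 11:", features_above_level(11, requested))
print("level 13:", features_above_level(13, requested))
```

At level 11 both avx and xsave are flagged; at level 13 nothing is.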
I cannot reproduce this with 3.16.0-0.rc3.git2.1.fc21.x86_64. I suspect the kernel has been modified to work around the buggy CPUID.

Previously:

[ 5.150258] async_tx: api initialized (async)
[ 5.152459] xor: automatically using best checksumming function:
[ 5.153064] invalid opcode: 0000 [#1] SMP

Now:

[ 2.198939] async_tx: api initialized (async)
[ 2.201931] xor: measuring software checksum speed
[ 2.212021] prefetch64-sse: 16340.000 MB/sec
[ 2.222012] generic_sse: 15188.000 MB/sec
[ 2.222856] xor: using function: prefetch64-sse (16340.000 MB/sec)
[ 2.239949] md: raid6 personality registered for level 6
[ 2.241063] md: raid5 personality registered for level 5
[ 2.242168] md: raid4 personality registered for level 4
[ 2.244093] md/raid:md127: device sda operational as raid disk 0
[ 2.247075] md/raid:md127: allocated 0kB
[ 2.247939] md/raid:md127: raid level 5 active with 1 out of 2 devices, algorithm 2
[ 2.249726] md127: detected capacity change from 0 to 103809024
[ 2.253250] md: recovery of RAID array md127
[ 2.255054] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
[ 2.256554] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
[ 2.258584] md: using 128k window, over a total of 101376k.
mdadm: array /dev/md/md started.
*** Bug 1271157 has been marked as a duplicate of this bug. ***
I experience a similar issue with libvirt 1.2.20 on Ubuntu 15.04 (3.19.0-30-generic #34-Ubuntu x86_64).

If I use "Copy host CPU configuration", virt-manager ends up using a Westmere profile instead of Haswell (i7 4700MQ). More details here: https://bugzilla.redhat.com/show_bug.cgi?id=1271157

If I use the "Haswell" profile, I get the following error:

Error starting domain: unsupported configuration: guest and host CPU are not compatible: Host CPU does not provide required features: rtm, hle, x2apic; try using 'Haswell-noTSX' CPU model

If I use the "Haswell-noTSX" profile, I get the following error:

Error starting domain: unsupported configuration: guest and host CPU are not compatible: Host CPU does not provide required features: x2apic
*** Bug 1281971 has been marked as a duplicate of this bug. ***
This bug still crops up from time to time because the vagrant libvirt provider defaults to `host-model`. You can work around it by setting box.cpu_mode = 'host-passthrough' (or another compatible option) when you encounter this issue, where 'box' is the provider object.

Example:

LV_CPU_MODE = 'host-passthrough'
machine.vm.provider :libvirt do |lv, override|
  lv.default_prefix = TYPE_NAME
  lv.memory = MEM_SIZE
  lv.cpu_mode = LV_CPU_MODE
end
This should be finally fixed by (in combination with QEMU 2.9.0): commit 2a586b4402a7637e0bef9a2876d065c0ce6bfef1 Refs: v3.1.0-9-g2a586b440 Author: Jiri Denemark <jdenemar> AuthorDate: Mon Jan 30 16:10:22 2017 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Mar 3 19:57:56 2017 +0100 qemucapstest: Update test data for QEMU 2.9.0 Signed-off-by: Jiri Denemark <jdenemar> commit 0bde051f3de02b1be25ea4a4d9f062abfa3d1397 Refs: v3.1.0-10-g0bde051f3 Author: Jiri Denemark <jdenemar> AuthorDate: Mon Jan 30 16:10:49 2017 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Mar 3 19:57:56 2017 +0100 domaincapstest: Add test data for QEMU 2.9.0 Signed-off-by: Jiri Denemark <jdenemar> commit d2f8f3052d48f284d56e27c98ce7a2ce6c656e59 Refs: v3.1.0-11-gd2f8f3052 Author: Jiri Denemark <jdenemar> AuthorDate: Wed Feb 15 10:18:53 2017 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Mar 3 19:57:56 2017 +0100 docs: Update description of the host-model CPU mode Signed-off-by: Jiri Denemark <jdenemar> commit 4c0723a1d75b981e8939c4c5b6bde7607fc7301e Refs: v3.1.0-12-g4c0723a1d Author: Jiri Denemark <jdenemar> AuthorDate: Mon Jan 30 16:30:13 2017 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Mar 3 19:57:56 2017 +0100 qemu: Rename hostCPU/feature element in capabilities cache The element will be generalized in the following commits. 
Signed-off-by: Jiri Denemark <jdenemar> commit 03a34f6b84da009291e8651aba71df8a6761d081 Refs: v3.1.0-13-g03a34f6b8 Author: Jiri Denemark <jdenemar> AuthorDate: Wed Feb 22 15:46:47 2017 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Mar 3 19:57:56 2017 +0100 qemu: Prepare for more types in qemuMonitorCPUModelInfo Signed-off-by: Jiri Denemark <jdenemar> commit 2fc215dd2ad4b88c1054da804c4c45b3d4e5c2fa Refs: v3.1.0-14-g2fc215dd2 Author: Jiri Denemark <jdenemar> AuthorDate: Wed Feb 22 16:01:30 2017 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Mar 3 19:57:56 2017 +0100 qemu: Store more types in qemuMonitorCPUModelInfo While query-cpu-model-expansion returns only boolean features on s390, but x86_64 reports some integer and string properties which we are interested in. Signed-off-by: Jiri Denemark <jdenemar> commit d7f054a512a911a386d9bbeec51379e4bb843ca5 Refs: v3.1.0-15-gd7f054a51 Author: Jiri Denemark <jdenemar> AuthorDate: Wed Feb 22 16:51:50 2017 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Mar 3 19:57:57 2017 +0100 qemu: Probe "max" CPU model in TCG Querying "host" CPU model expansion only makes sense for KVM. QEMU 2.9.0 introduces a new "max" CPU model which can be used to ask QEMU what the best CPU it can provide to a TCG domain is. Signed-off-by: Jiri Denemark <jdenemar> commit f0138289920d5204c1654bc9b17115d1a315d62e Refs: v3.1.0-16-gf01382899 Author: Jiri Denemark <jdenemar> AuthorDate: Wed Jan 11 14:36:34 2017 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Mar 3 19:57:57 2017 +0100 qemu: Get host CPU model from QEMU on x86_64 Until now host-model CPU mode tried to enable all CPU features supported by the host CPU even if QEMU/KVM did not support them. This caused a number of issues and made host-model quite unreliable. Asking QEMU for the CPU it can provide and the current host makes host-model much more robust. 
This commit fixes the following bugs: https://bugzilla.redhat.com/show_bug.cgi?id=1018251 https://bugzilla.redhat.com/show_bug.cgi?id=1371617 https://bugzilla.redhat.com/show_bug.cgi?id=1372581 https://bugzilla.redhat.com/show_bug.cgi?id=1404627 https://bugzilla.redhat.com/show_bug.cgi?id=870071 In addition to that, the following bug should be mostly limited to cases when an unsupported feature is explicitly requested: https://bugzilla.redhat.com/show_bug.cgi?id=1335534 Signed-off-by: Jiri Denemark <jdenemar> commit be3d59754b1a1da174ff1796882a0ceb35e198e8 Refs: v3.1.0-17-gbe3d59754 Author: Jiri Denemark <jdenemar> AuthorDate: Tue Jan 31 13:44:00 2017 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Mar 3 19:57:57 2017 +0100 qemu: Use enum for CPU model expansion type Signed-off-by: Jiri Denemark <jdenemar> commit bb3363c90b5b19c37f8e5b8f512eb00014d2dae4 Refs: v3.1.0-18-gbb3363c90 Author: Jiri Denemark <jdenemar> AuthorDate: Thu Feb 23 13:53:51 2017 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Mar 3 19:57:57 2017 +0100 qemu: Use full CPU model expansion on x86 The static CPU model expansion is designed to return only canonical names of all CPU properties. To maintain backwards compatibility libvirt is stuck with different spelling of some of the features, but we need to use the full expansion to get the additional spellings. In addition to returning all spelling variants for all properties the full expansion will contain properties which are not guaranteed to be migration compatible. Thus, we need to combine both expansions. First we need to call the static expansion to limit the result to migratable properties. Then we can use the result of the static expansion as an input to the full expansion to get both canonical names and their aliases. 
Signed-off-by: Jiri Denemark <jdenemar> commit 2f882dbfa92c14d585a786a42d284b63ffdca4e3 Refs: v3.1.0-19-g2f882dbfa Author: Jiri Denemark <jdenemar> AuthorDate: Thu Feb 23 14:31:23 2017 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Mar 3 19:57:57 2017 +0100 qemu: Make virQEMUCapsInitCPUModel testable Signed-off-by: Jiri Denemark <jdenemar> commit d065934cd07c01fbb29f25bbb223eb4ce126a90e Refs: v3.1.0-20-gd065934cd Author: Jiri Denemark <jdenemar> AuthorDate: Wed Feb 1 17:48:41 2017 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Mar 3 19:57:57 2017 +0100 cputest: Switch host CPU data scripts to model expansion Instantiating "host" CPU and querying it using qom-get has been the only way of probing host CPU via QEMU until 2.9.0 implemented query-cpu-model-expansion for x86_64. Even though libvirt never really used the old way its result can be easily converted into the one produced by query-cpu-model-expansion. Thus we can reuse the original test data and possible get new data from hosts where QEMU does not support the new QMP command. Signed-off-by: Jiri Denemark <jdenemar> commit d46a1aa4d8caafe977cc41a80ef86af1d10e60b7 Refs: v3.1.0-21-gd46a1aa4d Author: Jiri Denemark <jdenemar> AuthorDate: Mon Feb 13 14:59:42 2017 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Mar 3 19:57:57 2017 +0100 cputest: Convert all json data files to query-cpu-model-expansion Converted by running the following command, renaming the files as *.new, and committing only the *.new files. 
(cd tests/cputestdata; ./cpu-convert.py *.json) Signed-off-by: Jiri Denemark <jdenemar> commit a19696b5924e7512dcca4f30d15147036708389e Refs: v3.1.0-22-ga19696b59 Author: Jiri Denemark <jdenemar> AuthorDate: Mon Feb 13 10:33:52 2017 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Mar 3 19:57:57 2017 +0100 cputest: Test virQEMUCapsInitCPUModel The original test didn't use family/model numbers to make better decisions about the CPU model and thus mis-detected the model in the two cases which are modified in this commit. The detected CPU models now match those obtained from raw CPUID data. Signed-off-by: Jiri Denemark <jdenemar> commit 5e4fc2ef993343643587f2b079b63f2c9f038e6f Refs: v3.1.0-23-g5e4fc2ef9 Author: Jiri Denemark <jdenemar> AuthorDate: Mon Feb 13 15:04:38 2017 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Mar 3 19:57:57 2017 +0100 cputest: Drop obsolete CPU test data files Signed-off-by: Jiri Denemark <jdenemar> commit 8907204cd83f0ca29c48d19bbf2778132d8578a2 Refs: v3.1.0-24-g8907204cd Author: Jiri Denemark <jdenemar> AuthorDate: Mon Feb 13 15:06:35 2017 +0100 Commit: Jiri Denemark <jdenemar> CommitDate: Fri Mar 3 19:57:57 2017 +0100 cputest: Drop .new suffix from CPU test data files Signed-off-by: Jiri Denemark <jdenemar>
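The two-step expansion described in the "Use full CPU model expansion on x86" commit can be sketched at the QMP message level as follows. This only builds the JSON requests; it does not talk to a QEMU process, the helper name is made up, and the static reply shown is a hypothetical placeholder rather than captured output.

```python
import json


def expansion_request(kind, model):
    """Build a query-cpu-model-expansion QMP command of the given type."""
    return {
        "execute": "query-cpu-model-expansion",
        "arguments": {"type": kind, "model": model},
    }


# Step 1: static expansion of "host", limiting the result to migratable
# properties with canonical names.
step1 = expansion_request("static", {"name": "host"})

# Hypothetical reply to step 1; a real reply lists many more properties.
static_reply = {"return": {"model": {"name": "base", "props": {"avx": True}}}}

# Step 2: feed the static result into a full expansion to also obtain the
# alternative spellings of the same properties.
step2 = expansion_request("full", static_reply["return"]["model"])

print(json.dumps(step1))
print(json.dumps(step2))
```

Chaining the calls this way keeps the full expansion restricted to what the static expansion returned, which is the combination the commit message describes.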
Note, I've filed bugs suggesting both virt-manager/virt-install and gnome-boxes should go to host-model and drop their workarounds now: https://bugzilla.redhat.com/show_bug.cgi?id=1468016 https://bugzilla.gnome.org/show_bug.cgi?id=784573