Bug 1371617 - Libvirt passes unsupported "arat" flag to QEMU when using host-model
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.0
Hardware: Unspecified  OS: Unspecified
Priority: high  Severity: high
Target Milestone: pre-dev-freeze
Target Release: ---
Assigned To: Jiri Denemark
QA Contact: Luyao Huang
Keywords: Upstream
Depends On:
Blocks: libvirtCPUconfig 1420851
Reported: 2016-08-30 11:57 EDT by Phil Sutter
Modified: 2017-09-23 17:36 EDT (History)

See Also:
Fixed In Version: libvirt-3.2.0-1.el7
Last Closed: 2017-08-01 13:14:13 EDT
Type: Bug


External Trackers
Tracker                             ID       Priority  Status  Summary  Last Updated
Red Hat Knowledge Base (Solution)   2758651  None      None    None     2016-11-14 22:52 EST

Description Phil Sutter 2016-08-30 11:57:12 EDT
Trying to duplicate the host's CPU features can cause issues if some of them are not supported by qemu. This happened with the 'arat' CPU flag in a new installation. Here's the relevant libvirt instance log:

-----------------------
2016-08-27 11:57:53.865+0000: starting up libvirt version: 2.0.0, package: 6.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2016-08-23-01:18:42, x86-039.build.eng.bos.redhat.com), qemu version: 2.3.0 (qemu-kvm-rhev-2.3.0-31.el7_2.21), hostname: wsfd-netdev11.ntdv.lab.eng.bos.redhat.com
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=instance-00000001,debug-threads=on -S -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off -cpu IvyBridge,+ds,+acpi,+ss,+ht,+tm,+pbe,+dtes64,+monitor,+ds_cpl,+vmx,+smx,+est,+tm2,+xtpr,+pdcm,+pcid,+dca,+osxsave,+arat,+xsaveopt,+pdpe1gb -m 8192 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid bbeea595-943c-4464-b446-560737af1828 -smbios 'type=1,manufacturer=Red Hat,product=OpenStack Nova,version=13.1.1-2.el7ost,serial=83b07993-8311-43c2-806e-c2d9b898186f,uuid=bbeea595-943c-4464-b446-560737af1828,family=Virtual Machine' -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-4-instance-00000001/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/nova/instances/bbeea595-943c-4464-b446-560737af1828/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=29,id=hostnet0,vhost=on,vhostfd=31 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:56:85:da,bus=pci.0,addr=0x3 -chardev file,id=charserial0,path=/var/lib/nova/instances/bbeea595-943c-4464-b446-560737af1828/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on

(process:28017): GLib-WARNING **: gmem.c:482: custom memory allocation vtable not supported
char device redirected to /dev/pts/1 (label charserial1)
2016-08-27T11:57:53.949252Z qemu-kvm: CPU feature arat not found
2016-08-27 11:57:54.136+0000: shutting down
---------------------
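
To see at a glance what was requested and what QEMU rejected in a log like the one above, the -cpu argument and the error line can be pulled out with a short script. This is only a sketch: the function names are made up for illustration, and the regular expressions assume the exact log format shown in this report.

```python
import re

# Error line as printed by QEMU in the instance log above.
NOT_FOUND_RE = re.compile(r"qemu-kvm: CPU feature (\S+) not found")
# The "-cpu MODEL,+feat1,+feat2" argument on the QEMU command line.
CPU_ARG_RE = re.compile(r"-cpu (\S+)")


def rejected_features(log_text):
    """Return the feature names QEMU reported as 'not found'."""
    return NOT_FOUND_RE.findall(log_text)


def requested_cpu(log_text):
    """Return (model, enabled_features) from the first -cpu argument, or None."""
    match = CPU_ARG_RE.search(log_text)
    if match is None:
        return None
    model, *flags = match.group(1).split(",")
    # Keep only explicitly enabled features and strip the leading '+'.
    enabled = [f[1:] for f in flags if f.startswith("+")]
    return model, enabled


if __name__ == "__main__":
    sample = (
        "/usr/libexec/qemu-kvm -S -cpu IvyBridge,+ds,+acpi,+arat,+xsaveopt -m 8192\n"
        "2016-08-27T11:57:53.949252Z qemu-kvm: CPU feature arat not found\n"
    )
    print(requested_cpu(sample))
    print(rejected_features(sample))
```

Running it against /var/log/libvirt/qemu/instance-*.log should show whether the rejected feature was one of those added on top of the base model.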

This is the cpuinfo entry for one of the hypervisor's cores:

------------------
processor	: 31
vendor_id	: GenuineIntel
cpu family	: 6
model		: 62
model name	: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
stepping	: 4
microcode	: 0x428
cpu MHz		: 1897.187
cache size	: 20480 KB
physical id	: 1
siblings	: 16
core id		: 7
cpu cores	: 8
apicid		: 47
initial apicid	: 47
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt
bogomips	: 5205.93
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 bits virtual
power management:
---------------

The list of allowed CPU flags can be extracted from the installed qemu binary by calling '/usr/libexec/qemu-kvm -cpu help'. In my case, 'arat' is not in that list. Using the flag list from the help output would allow building a dynamic whitelist when constructing the qemu command line for cpu_mode=host-model configurations. The parsing overhead should be acceptable given the version independence gained.
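
The whitelist idea could look roughly like the sketch below. It assumes the help output contains a "Recognized CPUID flags:" section followed by indented lines of space-separated flag names, which may not hold for every QEMU version. All function names are hypothetical, and this is not how libvirt builds the command line.

```python
import subprocess


def parse_recognized_flags(help_text):
    """Collect flag names listed after 'Recognized CPUID flags:'.

    Assumes the section consists of indented lines of space-separated
    names and ends at the first non-indented line.
    """
    flags = set()
    in_section = False
    for line in help_text.splitlines():
        if line.startswith("Recognized CPUID flags:"):
            in_section = True
            continue
        if in_section:
            if line and not line[0].isspace():
                break  # a new, non-indented section started
            flags.update(line.split())
    return flags


def split_supported(requested, supported):
    """Split requested features into (kept, dropped), preserving order."""
    kept = [f for f in requested if f in supported]
    dropped = [f for f in requested if f not in supported]
    return kept, dropped


def query_qemu_flags(binary="/usr/libexec/qemu-kvm"):
    """Run '<binary> -cpu help' and parse its output (needs QEMU installed)."""
    result = subprocess.run([binary, "-cpu", "help"], capture_output=True,
                            text=True, check=True)
    return parse_recognized_flags(result.stdout)
```

One caveat: flag spellings in /proc/cpuinfo, in libvirt and in QEMU may not always match, so a real implementation would probably need a name mapping before comparing.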
Comment 1 Daniel Berrange 2016-09-05 06:34:34 EDT
This is not an openstack bug - libvirt shouldn't be passing a feature that is not supported by qemu.
Comment 3 Jiri Denemark 2016-09-06 07:20:11 EDT
Well, currently libvirt doesn't check whether a requested CPU feature is supported by QEMU before starting a domain (QEMU does not provide any command which we could use for this). We could work with QEMU folks to fix this.

However, you are mixing libvirt from 7.3 and qemu-kvm-rhev from 7.2. The version of qemu-kvm-rhev (2.6.*) from 7.3 supports "arat" CPU feature and you won't see the problem there.

Moreover, copying host's CPU definition from capabilities XML is not really the best thing to do. Either use a host-passthrough or a host-model CPU in domain XML. "host-model" won't fix the problem right now, but in the (hopefully near) future it will only use CPU features that are supported on the current host/kernel/QEMU combination.

That said, even if we start checking supported CPU features, the domain will still fail to start if an unsupported feature is explicitly requested in domain XML. Libvirt will just provide a better error message. I'll take this bug as a request to do so since we have enough bugs that already cover the "host-model" issue.
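
For reference, switching a domain definition from a copied host CPU to a mode-only <cpu> element could be sketched as follows. The helper names are hypothetical; only the <cpu mode='host-model'/> and <cpu mode='host-passthrough'/> forms come from the libvirt domain XML format.

```python
import xml.etree.ElementTree as ET


def cpu_element(mode):
    """Build a mode-only <cpu> element for a libvirt domain definition."""
    if mode not in ("host-model", "host-passthrough"):
        raise ValueError("unexpected CPU mode: %s" % mode)
    return ET.Element("cpu", {"mode": mode})


def set_cpu_mode(domain_xml, mode):
    """Replace any existing <cpu> element in domain_xml with a mode-only one."""
    root = ET.fromstring(domain_xml)
    for old in root.findall("cpu"):
        root.remove(old)  # drops the explicit model and feature list
    root.append(cpu_element(mode))
    return ET.tostring(root, encoding="unicode")
```

This leaves the choice of model and features to libvirt instead of hard-coding a feature list copied from the capabilities XML.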
Comment 4 Phil Sutter 2016-09-06 09:33:40 EDT
Hi,

I stumbled upon this issue when trying to set up OpenStack from RDO on a RHEL7.3 machine from beaker. There is no package qemu-kvm-rhev installed at all. Instead I have qemu-kvm-1.5.3-122.el7.x86_64 which also provides /usr/libexec/qemu-kvm used by OpenStack. All installed libvirt packages (apart from libvirt-python) are of version 2.0.0-6.el7. All these packages come from the beaker-server repository.

To my surprise, passing the unsupported arat CPU feature does not seem to be a fatal error. In my current setup, instances are running (and reachable) despite the error message, which is found in /var/log/libvirt/qemu/instance-*.log. So it seems there was another issue that prevented the machines from starting up.

Thanks, Phil
Comment 5 Daniel Berrange 2016-09-06 09:45:26 EDT
Use of qemu-kvm in combination with openstack is *not* supported - only qemu-kvm-rhev is permitted. We've got a bug open that will use RPM requires to prevent this mistake in future, by mandating qemu-kvm-rhev at the RPM level.
Comment 6 Dr. David Alan Gilbert 2016-10-24 06:11:42 EDT
I'm seeing this on a gss box where they're trying to debug a different problem with rhos:

qemu-kvm-rhev-2.3.0-31.el7_2.21.x86_64
libvirt-daemon-2.0.0-10.el7.x86_64

from cpuinfo:
model name	: Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc

LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=instance-00000081,debug-threads=on -S -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off -cpu Haswell-noTSX,+vme,+ds,+acpi,+ss,+ht,+tm,+pbe,+dtes64,+monitor,+ds_cpl,+vmx,+smx,+est,+tm2,+xtpr,+pdcm,+dca,+osxsave,+f16c,+rdrand,+arat,+tsc_adjust,+xsaveopt,+pdpe1gb,+abm -m 4096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid e8eac58c-19b4-4d87-8ab8-0c8a214a6ee1 -smbios 'type=1,manufacturer=Red Hat,product=OpenStack Compute,version=2015.1.4-13.el7ost,serial=f37fc981-aa7c-4ef0-afa9-afdba8cb1469,uuid=e8eac58c-19b4-4d87-8ab8-0c8a214a6ee1' -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-64-instance-00000081/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive 'file=rbd:vms/e8eac58c-19b4-4d87-8ab8-0c8a214a6ee1_disk:id=cinder:key=AQBCQzdW/pXyGxAAsiM5BFFqIiGlLMBBEE+0OA==:auth_supported=cephx\;none:mon_host=10.74.128.30\:6789\;10.74.128.31\:6789\;10.74.128.32\:6789,format=raw,if=none,id=drive-virtio-disk0,cache=writeback' -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=32,id=hostnet0,vhost=on,vhostfd=34 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:ee:94:11,bus=pci.0,addr=0x3 -chardev file,id=charserial0,path=/var/lib/nova/instances/e8eac58c-19b4-4d87-8ab8-0c8a214a6ee1/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on

2016-10-24T04:18:34.105911Z qemu-kvm: CPU feature arat not found
Comment 7 Dr. David Alan Gilbert 2016-10-24 06:56:26 EDT
I *think* openstack is using host-model for the cpu type by default, which I guess is the problem?
Comment 8 Jiri Denemark 2016-10-24 08:23:58 EDT
Partially. It's a combination of host-model and using libvirtd from 7.3 with qemu-kvm-rhev from 7.2. The old qemu-kvm-rhev does not support arat.
Comment 10 Jeremy 2016-11-04 13:05:37 EDT
Seeing this issue as well:

instance id: 4e117abe-c875-4821-8acf-3e64c142bcbb


###nova-conductor.log
2016-11-04 11:16:28.615 23729 ERROR nova.scheduler.utils [req-8997e7bb-777e-4109-bbca-3af1421f38a9 612df114ddc5410aa09c709769f6d7c8 1b9ba84140514e9eaa4fd4d8ebcacb02 - - -] [instance: 4e117abe-c875-4821-8acf-3e64c142bcbb] Error from last host: ucs-c (node ucs-c.lwr04.cisco.com): [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1905, in _do_build_and_run_instance\n    filter_properties)\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2082, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance 4e117abe-c875-4821-8acf-3e64c142bcbb was re-scheduled: internal error: process exited while connecting to monitor: \n(process:15864): GLib-WARNING **: gmem.c:482: custom memory allocation vtable not supported\n2016-11-04T15:16:28.209775Z qemu-kvm: CPU feature arat not found\n']
2016-11-04 11:16:31.269 23703 ERROR nova.scheduler.utils [req-8997e7bb-777e-4109-bbca-3af1421f38a9 612df114ddc5410aa09c709769f6d7c8 1b9ba84140514e9eaa4fd4d8ebcacb02 - - -] [instance: 4e117abe-c875-4821-8acf-3e64c142bcbb] Error from last host: ucs-d (node ucs-d.lwr04.cisco.com): [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1905, in _do_build_and_run_instance\n    filter_properties)\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2082, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance 4e117abe-c875-4821-8acf-3e64c142bcbb was re-scheduled: internal error: qemu unexpectedly closed the monitor: \n(process:14773): GLib-WARNING **: gmem.c:482: custom memory allocation vtable not supported\n2016-11-04T15:16:30.582633Z qemu-kvm: CPU feature arat not found\n']
2016-11-04 11:16:31.273 23703 INFO oslo.messaging._drivers.impl_rabbit [req-8997e7bb-777e-4109-bbca-3af1421f38a9 612df114ddc5410aa09c709769f6d7c8 1b9ba84140514e9eaa4fd4d8ebcacb02 - - -] Connecting to AMQP server on 10.90.54.160:5672
2016-11-04 11:16:31.282 23703 INFO oslo.messaging._drivers.impl_rabbit [req-8997e7bb-777e-4109-bbca-3af1421f38a9 612df114ddc5410aa09c709769f6d7c8 1b9ba84140514e9eaa4fd4d8ebcacb02 - - -] Connected to AMQP server on 10.90.54.160:5672
2016-11-04 11:16:34.055 23765 ERROR nova.scheduler.utils [req-8997e7bb-777e-4109-bbca-3af1421f38a9 612df114ddc5410aa09c709769f6d7c8 1b9ba84140514e9eaa4fd4d8ebcacb02 - - -] [instance: 4e117abe-c875-4821-8acf-3e64c142bcbb] Error from last host: ucs-e (node ucs-e.lwr04.cisco.com): [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1905, in _do_build_and_run_instance\n    filter_properties)\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2082, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance 4e117abe-c875-4821-8acf-3e64c142bcbb was re-scheduled: internal error: process exited while connecting to monitor: \n(process:14071): GLib-WARNING **: gmem.c:482: custom memory allocation vtable not supported\n2016-11-04T15:16:33.504622Z qemu-kvm: CPU feature arat not found\n']
2016-11-04 11:16:34.056 23765 WARNING nova.scheduler.utils [req-8997e7bb-777e-4109-bbca-3af1421f38a9 612df114ddc5410aa09c709769f6d7c8 1b9ba84140514e9eaa4fd4d8ebcacb02 - - -] Failed to compute_task_build_instances: Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance 4e117abe-c875-4821-8acf-3e64c142bcbb. Last exception: [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1905, in _do_build_and_run_instance\n    filter_properties)\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2082, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance 4e117abe-c875-4821-8acf-3e64c142bcbb was re-scheduled: internal error: process exited while connecting to monitor: \n(process:14071): GLib-WARNING **: gmem.c:482: custom memory allocation vtable not supported\n2016-11-04T15:16:33.504622Z qemu-kvm: CPU feature arat not found\n']
2016-11-04 11:16:34.057 23765 WARNING nova.scheduler.utils [req-8997e7bb-777e-4109-bbca-3af1421f38a9 612df114ddc5410aa09c709769f6d7c8 1b9ba84140514e9eaa4fd4d8ebcacb02 - - -] [instance: 4e117abe-c875-4821-8acf-3e64c142bcbb] Setting instance to ERROR state.


###from compute node, it has arat flag...
vendor_id       : GenuineIntel
cpu family      : 6
model           : 62
model name      : Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz
stepping        : 4
microcode       : 0x428
cpu MHz         : 2248.382
cache size      : 30720 KB
physical id     : 0
siblings        : 24
core id         : 0
cpu cores       : 12
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt


###compute node /var/log/libvirt/qemu/instance-00000005.log 
2016-11-04 15:16:33.426+0000: starting up libvirt version: 2.0.0, package: 10.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2016-09-21-10:15:26, x86-038.build.eng.bos.redhat.com), qemu version: 2.3.0 (qemu-kvm-rhev-2.3.0-31.el7_2.21), hostname: ucs-e.lwr04.cisco.com
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=instance-00000005,debug-threads=on -S -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off -cpu IvyBridge,+ds,+acpi,+ss,+ht,+tm,+pbe,+dtes64,+monitor,+ds_cpl,+vmx,+smx,+est,+tm2,+xtpr,+pdcm,+pcid,+dca,+osxsave,+arat,+xsaveopt,+pdpe1gb -m 512 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 4e117abe-c875-4821-8acf-3e64c142bcbb -smbios 'type=1,manufacturer=Red Hat,product=OpenStack Compute,version=12.0.4-8.el7ost,serial=fbe41075-92fc-486a-91b4-7086104ed812,uuid=4e117abe-c875-4821-8acf-3e64c142bcbb,family=Virtual Machine' -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-5-instance-00000005/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/nova/instances/4e117abe-c875-4821-8acf-3e64c142bcbb/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=29,id=hostnet0,vhost=on,vhostfd=31 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:11:9f:fc,bus=pci.0,addr=0x3 -chardev file,id=charserial0,path=/var/lib/nova/instances/4e117abe-c875-4821-8acf-3e64c142bcbb/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on

(process:14071): GLib-WARNING **: gmem.c:482: custom memory allocation vtable not supported
char device redirected to /dev/pts/0 (label charserial1)
2016-11-04T15:16:33.504622Z qemu-kvm: CPU feature arat not found
2016-11-04 15:16:33.570+0000: shutting down
Comment 13 Shanmugavel Balu 2016-11-06 21:45:40 EST
I am seeing this issue too. Is there any workaround available?

Instance Log:
(process:5076): GLib-WARNING **: gmem.c:482: custom memory allocation vtable not supported
char device redirected to /dev/pts/1 (label charserial1)
2016-11-07T02:32:04.022357Z qemu-kvm: CPU feature arat not found
2016-11-07 02:32:04.101+0000: shutting down

Conductor:
2016-11-06 21:32:04.461 5713 ERROR nova.scheduler.utils [req-cefdd8bd-cb5c-4334-883b-57fb742984bd 2ec0f13b4e494d49b2901281fc640d71 c900d0d4a8d24a8ab29fdb06b5f81d99 - - -] [instance: d7fac7ba-7e30-42fe-beb0-e2c605f8bc1b] Error from last host: chennai-compute1 (node chennai-compute1): [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1905, in _do_build_and_run_instance\n    filter_properties)\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2082, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance d7fac7ba-7e30-42fe-beb0-e2c605f8bc1b was re-scheduled: internal error: process exited while connecting to monitor: \n(process:5076): GLib-WARNING **: gmem.c:482: custom memory allocation vtable not supported\n2016-11-07T02:32:04.022357Z qemu-kvm: CPU feature arat not found\n']
Comment 20 Jiri Denemark 2016-11-07 17:27:33 EST
As explained above qemu-kvm-rhev package needs to be installed since openstack is not supported with qemu-kvm. And since running qemu-kvm-rhev-2.3.0 (from 7.2) is unsupported on RHEL 7.3, you need to install qemu-kvm-rhev-2.6.0 (from 7.3). Once installed, everything should work fine with libvirt from 7.3.

In addition to this, there are a few possible ways to work around this issue:

1) tell openstack not to use host-model by modifying cpu_mode and cpu_model in nova.conf

2) comment out <feature name='arat'>...</feature> and <model name='Skylake-Client'>...</model> in /usr/share/libvirt/cpu_map.xml
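
Workaround 1 could be scripted along these lines. This is a minimal sketch, assuming the options live in the [libvirt] section of nova.conf and that "custom" mode with an explicit model is wanted; the function name is hypothetical and IvyBridge is just an example value taken from the logs above.

```python
import configparser
import io


def set_custom_cpu(nova_conf_text, model):
    """Return nova.conf text with [libvirt] cpu_mode/cpu_model set."""
    # interpolation=None so '%' characters in other options are left alone.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(nova_conf_text)
    if not parser.has_section("libvirt"):
        parser.add_section("libvirt")
    parser.set("libvirt", "cpu_mode", "custom")
    parser.set("libvirt", "cpu_model", model)
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    original = "[libvirt]\nvirt_type = kvm\ncpu_mode = host-model\n"
    print(set_custom_cpu(original, "IvyBridge"))
```

Note that configparser drops comments and keeps only the last value of a repeated option when it rewrites the file, so for a real nova.conf editing the two lines by hand is probably safer. nova-compute needs a restart afterwards.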


In the future, libvirt's host-model CPU mode will be fixed to avoid features unknown to QEMU. But this requires changes to both libvirt and QEMU and we're working on them upstream.

I also asked for a QEMU interface which would allow us to check what CPU features it supports, but since it's a new interface, it won't really help in this situation (which is caused by using new libvirt with old QEMU).
Comment 22 Shanmugavel Balu 2016-11-07 18:23:04 EST
Hello, I upgraded to qemu-kvm-rhev-2.6.0-27.el7.x86_64 and no longer see the above issue. However, after the upgrade I restarted nova compute and libvirtd, and I now see the permission issue below. I tried multiple options and nothing worked.

2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [req-17b73d5b-a406-439e-8e4d-9a40c04a1f9a 2ec0f13b4e494d49b2901281fc640d71 c900d0d4a8d24a8ab29fdb06b5f81d99 - - -] [instance: 9730e067-3c4f-4241-9bae-693d95f53e00] Instance failed to spawn
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00] Traceback (most recent call last):
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2180, in _build_resources
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     yield resources
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2033, in _build_and_run_instance
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     block_device_info=block_device_info)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2596, in spawn
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     block_device_info=block_device_info)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4719, in _create_domain_and_network
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     xml, pause=pause, power_on=power_on)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4649, in _create_domain
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     guest.launch(pause=pause)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 142, in launch
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     self._encoded_xml, errors='ignore')
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 204, in __exit__
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     six.reraise(self.type_, self.value, self.tb)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 137, in launch
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     return self._domain.createWithFlags(flags)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     result = proxy_call(self._autowrap, f, *args, **kwargs)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     rv = execute(f, *args, **kwargs)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     six.reraise(c, e, tb)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     rv = meth(*args, **kwargs)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1065, in createWithFlags
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00] libvirtError: Unable to open file: /var/lib/nova/instances/9730e067-3c4f-4241-9bae-693d95f53e00/console.log: Permission denied
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00] 
2016-11-07 18:06:34.811 19880 INFO nova.compute.manager [req-17b73d5b-a406-439e-8e4d-9a40c04a1f9a 2ec0f13b4e494d49b2901281fc640d71 c900d0d4a8d24a8ab29fdb06b5f81d99 - - -] [instance: 9730e067-3c4f-4241-9bae-693d95f53e00] Terminating instance
Comment 24 Jiri Denemark 2016-11-08 03:52:30 EST
(In reply to Shanmugavel Balu from comment #22)
> 2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance:
> 9730e067-3c4f-4241-9bae-693d95f53e00] libvirtError: Unable to open file:
> /var/lib/nova/instances/9730e067-3c4f-4241-9bae-693d95f53e00/console.log:
> Permission denied

This is of course a completely different issue, and it really doesn't belong to this bz. Anyway, it's most likely caused by a "regression", which was inevitable, see bug 1371125. The required changes are described in https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Deployment_and_Administration_Guide/sect-Manipulating_the_domain_xml-Devices.html#sect-Devices-Host_physical_machine_interface

In short, the directories to which nova redirects domains' console output need to be labeled with virt_log_t.
Comment 27 Jiri Denemark 2016-11-08 07:32:01 EST
Nice, so this is actually similar to bug 1365500, except that unlike CMT, ARAT is supported by some QEMU versions. Since QEMU has no interface to give us a list of supported CPU features (I asked for it and Eduardo is working on it upstream), we should just treat it as unsupported and ignore it when creating host-model CPU. This would mean even new QEMU won't get the feature turned on with host-model CPU, but that shouldn't be a big issue. The current host-model implementation is just wrong anyway and it's pure luck that it ever worked, so removing one more CPU feature from the guest CPU doesn't make things any worse.
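
The "treat it as unsupported and ignore it" approach amounts to filtering the feature list before the -cpu argument is built. The sketch below only illustrates that idea; the names are hypothetical, the ignore set holds just the two features named in this report and in bug 1365500, and none of it is libvirt's actual code.

```python
# Features mentioned in this report and in bug 1365500; the set itself is
# an illustrative assumption, not libvirt's actual list.
IGNORED_FEATURES = frozenset({"arat", "cmt"})


def host_model_cpu_arg(model, extra_features, ignored=IGNORED_FEATURES):
    """Build a '-cpu' value, skipping features on the ignore list."""
    parts = [model]
    parts.extend("+" + f for f in extra_features if f not in ignored)
    return ",".join(parts)


if __name__ == "__main__":
    print(host_model_cpu_arg("IvyBridge",
                             ["osxsave", "arat", "xsaveopt", "pdpe1gb"]))
```

The trade-off is the one described above: a QEMU that does know the feature would no longer get it enabled through host-model.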
Comment 28 Dr. David Alan Gilbert 2016-11-08 13:39:16 EST
It looks like the new qemu has just hit the RHOS repos:

 qemu-kvm-rhev  x86_64  10:2.6.0-27.el7   rhel-7-server-openstack-9-rpms  2.4 M

so if you've got customers who've hit this then I guess that should fix it for them.
Comment 29 Shanmugavel Balu 2016-11-08 14:47:50 EST
Thanks, stdio_handler = "file" in qemu.conf worked.

Current pkg version on my setup:
qemu-kvm-rhev-2.6.0-27.el7.x86_64
libvirt-daemon-kvm-2.0.0-10.el7.x86_64
Comment 34 Stephen Gordon 2016-12-13 23:29:01 EST
(In reply to Jiri Denemark from comment #20)
> As explained above qemu-kvm-rhev package needs to be installed since
> openstack is not supported with qemu-kvm. And since running
> qemu-kvm-rhev-2.3.0 (from 7.2) is unsupported on RHEL 7.3, you need to
> install qemu-kvm-rhev-2.6.0 (from 7.3). Once installed, everything should
> work fine with libvirt from 7.3.

RDO CI and OpenStack CI folks who are running qemu without kvm enabled (as they have to run in public cloud environments where nested virt isn't enabled) are reporting that they still hit this issue even with qemu-kvm-(rh)ev 2.6.0 and libvirt-2.0.0.

They are likely going to end up not using host-model by modifying cpu_mode and cpu_model in nova.conf for now but it seems like the above is possibly another variation of the same root issue with regards to the introduction of a new flag.
Comment 35 Jiri Denemark 2016-12-14 04:01:57 EST
There should be no difference between KVM and TCG mode. If QEMU is new enough it will recognize the feature anyway. It certainly may not be able to use the feature, but this would only cause a warning.

Are you sure the QEMU version used is really 2.6.0?
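
One quick way to answer that is to read the version from the "starting up" line libvirt writes to the domain log. A minimal sketch, assuming the "qemu version: X.Y.Z" wording seen in the logs earlier in this report; the function names are hypothetical.

```python
import re

QEMU_VERSION_RE = re.compile(r"qemu version: (\d+)\.(\d+)\.(\d+)")


def qemu_version_from_log(log_text):
    """Return the QEMU version from a libvirt domain log as a tuple, or None."""
    match = QEMU_VERSION_RE.search(log_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def is_at_least(version, minimum=(2, 6, 0)):
    """True if version (a tuple of ints) is known and >= minimum."""
    return version is not None and version >= minimum


if __name__ == "__main__":
    header = ("2016-11-04 15:16:33.426+0000: starting up libvirt version: 2.0.0, "
              "package: 10.el7, qemu version: 2.3.0 "
              "(qemu-kvm-rhev-2.3.0-31.el7_2.21), hostname: ucs-e")
    version = qemu_version_from_log(header)
    print(version, is_at_least(version))
```

The default minimum of 2.6.0 reflects the version named in this thread as supporting "arat".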
Comment 36 Javier Peña 2016-12-14 05:03:40 EST
Since the cpu_mode issue in nova.conf might be a different one, I have created https://bugzilla.redhat.com/show_bug.cgi?id=1404627 with a detailed description.
Comment 37 chhu 2016-12-15 00:30:55 EST
(In reply to Jiri Denemark from comment #35)
> There should be no difference between KVM and TCG mode. If QEMU is new
> enough it will recognize the feature anyway. It certainly may not be able to
> use the feature, but this would only cause a warning.
> 
> Are you sure the QEMU version used is really 2.6.0?

And please use a libvirt newer than 2.0.8, as there is a fix for Bug 1365500 - CPU feature cmt not found with 2.0.0-1.
Comment 38 chhu 2016-12-15 00:32:24 EST
(In reply to chhu from comment #37)
> (In reply to Jiri Denemark from comment #35)
> > There should be no difference between KVM and TCG mode. If QEMU is new
> > enough it will recognize the feature anyway. It certainly may not be able to
> > use the feature, but this would only cause a warning.
> > 
> > Are you sure the QEMU version used is really 2.6.0?
> 
And please use a libvirt newer than libvirt-2.0.0-8.el7, as there is a fix for Bug 1365500 - CPU feature cmt not found with 2.0.0-1.
Comment 39 Jiri Denemark 2017-03-03 14:27:32 EST
This should be finally fixed by (in combination with QEMU 2.9.0):

commit 2a586b4402a7637e0bef9a2876d065c0ce6bfef1
Refs: v3.1.0-9-g2a586b440
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Mon Jan 30 16:10:22 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    qemucapstest: Update test data for QEMU 2.9.0

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit 0bde051f3de02b1be25ea4a4d9f062abfa3d1397
Refs: v3.1.0-10-g0bde051f3
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Mon Jan 30 16:10:49 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    domaincapstest: Add test data for QEMU 2.9.0

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit d2f8f3052d48f284d56e27c98ce7a2ce6c656e59
Refs: v3.1.0-11-gd2f8f3052
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Wed Feb 15 10:18:53 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    docs: Update description of the host-model CPU mode

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit 4c0723a1d75b981e8939c4c5b6bde7607fc7301e
Refs: v3.1.0-12-g4c0723a1d
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Mon Jan 30 16:30:13 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    qemu: Rename hostCPU/feature element in capabilities cache

    The element will be generalized in the following commits.

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit 03a34f6b84da009291e8651aba71df8a6761d081
Refs: v3.1.0-13-g03a34f6b8
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Wed Feb 22 15:46:47 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    qemu: Prepare for more types in qemuMonitorCPUModelInfo

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit 2fc215dd2ad4b88c1054da804c4c45b3d4e5c2fa
Refs: v3.1.0-14-g2fc215dd2
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Wed Feb 22 16:01:30 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    qemu: Store more types in qemuMonitorCPUModelInfo

    While query-cpu-model-expansion returns only boolean features on s390,
    but x86_64 reports some integer and string properties which we are
    interested in.

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit d7f054a512a911a386d9bbeec51379e4bb843ca5
Refs: v3.1.0-15-gd7f054a51
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Wed Feb 22 16:51:50 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Probe "max" CPU model in TCG

    Querying "host" CPU model expansion only makes sense for KVM. QEMU 2.9.0
    introduces a new "max" CPU model which can be used to ask QEMU what the
    best CPU it can provide to a TCG domain is.

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit f0138289920d5204c1654bc9b17115d1a315d62e
Refs: v3.1.0-16-gf01382899
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Wed Jan 11 14:36:34 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Get host CPU model from QEMU on x86_64

    Until now host-model CPU mode tried to enable all CPU features supported
    by the host CPU even if QEMU/KVM did not support them. This caused a
    number of issues and made host-model quite unreliable. Asking QEMU for
    the CPU it can provide and the current host makes host-model much more
    robust.

    This commit fixes the following bugs:

        https://bugzilla.redhat.com/show_bug.cgi?id=1018251
        https://bugzilla.redhat.com/show_bug.cgi?id=1371617
        https://bugzilla.redhat.com/show_bug.cgi?id=1372581
        https://bugzilla.redhat.com/show_bug.cgi?id=1404627
        https://bugzilla.redhat.com/show_bug.cgi?id=870071

    In addition to that, the following bug should be mostly limited to cases
    when an unsupported feature is explicitly requested:

       	https://bugzilla.redhat.com/show_bug.cgi?id=1335534

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit be3d59754b1a1da174ff1796882a0ceb35e198e8
Refs: v3.1.0-17-gbe3d59754
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Tue Jan 31 13:44:00 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Use enum for CPU model expansion type

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit bb3363c90b5b19c37f8e5b8f512eb00014d2dae4
Refs: v3.1.0-18-gbb3363c90
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Thu Feb 23 13:53:51 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Use full CPU model expansion on x86

    The static CPU model expansion is designed to return only canonical
    names of all CPU properties. To maintain backwards compatibility libvirt
    is stuck with different spelling of some of the features, but we need to
    use the full expansion to get the additional spellings. In addition to
    returning all spelling variants for all properties the full expansion
    will contain properties which are not guaranteed to be migration
    compatible. Thus, we need to combine both expansions. First we need to
    call the static expansion to limit the result to migratable properties.
    Then we can use the result of the static expansion as an input to the
    full expansion to get both canonical names and their aliases.

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit 2f882dbfa92c14d585a786a42d284b63ffdca4e3
Refs: v3.1.0-19-g2f882dbfa
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Thu Feb 23 14:31:23 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Make virQEMUCapsInitCPUModel testable

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit d065934cd07c01fbb29f25bbb223eb4ce126a90e
Refs: v3.1.0-20-gd065934cd
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Wed Feb 1 17:48:41 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Switch host CPU data scripts to model expansion

    Instantiating "host" CPU and querying it using qom-get has been the only
    way of probing host CPU via QEMU until 2.9.0 implemented
    query-cpu-model-expansion for x86_64. Even though libvirt never really
    used the old way its result can be easily converted into the one
    produced by query-cpu-model-expansion. Thus we can reuse the original
    test data and possible get new data from hosts where QEMU does not
    support the new QMP command.

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit d46a1aa4d8caafe977cc41a80ef86af1d10e60b7
Refs: v3.1.0-21-gd46a1aa4d
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Mon Feb 13 14:59:42 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Convert all json data files to query-cpu-model-expansion

    Converted by running the following command, renaming the files as
    *.new, and committing only the *.new files.

        (cd tests/cputestdata; ./cpu-convert.py *.json)

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit a19696b5924e7512dcca4f30d15147036708389e
Refs: v3.1.0-22-ga19696b59
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Mon Feb 13 10:33:52 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Test virQEMUCapsInitCPUModel

    The original test didn't use family/model numbers to make better
    decisions about the CPU model and thus mis-detected the model in the two
    cases which are modified in this commit. The detected CPU models now
    match those obtained from raw CPUID data.

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit 5e4fc2ef993343643587f2b079b63f2c9f038e6f
Refs: v3.1.0-23-g5e4fc2ef9
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Mon Feb 13 15:04:38 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Drop obsolete CPU test data files

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit 8907204cd83f0ca29c48d19bbf2778132d8578a2
Refs: v3.1.0-24-g8907204cd
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Mon Feb 13 15:06:35 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Drop .new suffix from CPU test data files

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
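
The two-step probing described in the "Use full CPU model expansion on x86" commit above (static expansion first, then full expansion of the static result) can be sketched roughly as follows. This is only an illustration, not libvirt's code: the `send` transport is hypothetical, and the reply shape ({"return": {"model": {"name": ..., "props": ...}}}) is an assumption based on how QMP replies to query-cpu-model-expansion usually look. For a TCG domain the commit log above says "max" would be probed instead of "host".

```python
import json


def expansion_command(expansion_type: str, model: dict) -> str:
    """Build a query-cpu-model-expansion QMP command as a JSON string."""
    return json.dumps({
        "execute": "query-cpu-model-expansion",
        "arguments": {"type": expansion_type, "model": model},
    })


def model_from_reply(reply: str) -> dict:
    """Extract the expanded model (name plus props) from a QMP reply."""
    return json.loads(reply)["return"]["model"]


def combined_expansion(send, model_name: str = "host") -> dict:
    """Run static expansion, then feed its result into full expansion.

    `send` is a caller-supplied function (hypothetical transport) that
    takes a QMP command string and returns QEMU's reply string.
    """
    # Step 1: static expansion limits the result to migratable properties.
    static_model = model_from_reply(
        send(expansion_command("static", {"name": model_name})))
    # Step 2: full expansion of that result adds the alias spellings.
    return model_from_reply(send(expansion_command("full", static_model)))
```

Passing the transport in as a function keeps the sketch independent of how the monitor socket is actually reached.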
Comment 41 Joby James 2017-04-07 13:38:38 EDT
Hi All,

I am hitting this issue on RHEL 7.3 with qemu-kvm-common-rhev-2.4.0-0.el7_2 and libvirt-daemon-driver-qemu-2.0.0*.
I understand that I need to upgrade to qemu-kvm-rhev-2.6.0-27.el7.x86_64 and libvirt-daemon-2.0.0-10.el7.x86_64.

Where can I download these packages? I have RHEL subscriptions for Red Hat OpenStack Platform.

Instance Log:
(process:5076): GLib-WARNING **: gmem.c:482: custom memory allocation vtable not supported
char device redirected to /dev/pts/1 (label charserial1)
2016-11-07T02:32:04.022357Z qemu-kvm: CPU feature arat not found
2016-11-07 02:32:04.101+0000: shutting down

Conductor:
2016-11-06 21:32:04.461 5713 ERROR nova.scheduler.utils [req-cefdd8bd-cb5c-4334-883b-57fb742984bd 2ec0f13b4e494d49b2901281fc640d71 c900d0d4a8d24a8ab29fdb06b5f81d99 - - -] [instance: d7fac7ba-7e30-42fe-
Comment 42 Kashyap Chamarthy 2017-04-07 14:39:04 EDT
(In reply to Joby James from comment #41)
> Hi All,
> 
> I am hitting this issue with RHEL 7.3 with qemu
> qemu-kvm-common-rhev-2.4.0-0.el7_2 and libvirt-daemon-driver-qemu-2.0.0*
> I understand that I need to upgrade to qemu-kvm-rhev-2.6.0-27.el7.x86_64,
> libvirt-daemon-2.0.0-10.el7.x86_64.
> 
> From where I can download these packages? I have RHEL subscriptions for Red
> Hat OpenStack Platform

Hi Joby,

If you have the right subscription for RHOS, and the said channels are enabled on your RHEL machine, you should get the right versions.

From my quick verification, if you're using RHOS-9 (which you are, from your description -- "OpenStack Nova,version=13.1.1-2.el7ost", which means "Mitaka" release, which means RHOS-9), or RHOS-10, this is the latest qemu-kvm-rhev version that should be available to you:

    qemu-kvm-rhev-2.6.0-28.el7_3.6

It should contain the necessary fixes.

> 
> 
> Instance Log:
> (process:5076): GLib-WARNING **: gmem.c:482: custom memory allocation vtable
> not supported
> char device redirected to /dev/pts/1 (label charserial1)
> 2016-11-07T02:32:04.022357Z qemu-kvm: CPU feature arat not found
> 2016-11-07 02:32:04.101+0000: shutting down
> 
> Conductor:
> 2016-11-06 21:32:04.461 5713 ERROR nova.scheduler.utils
> [req-cefdd8bd-cb5c-4334-883b-57fb742984bd 2ec0f13b4e494d49b2901281fc640d71
> c900d0d4a8d24a8ab29fdb06b5f81d99 - - -] [instance: d7fac7ba-7e30-42fe-
Comment 43 Joby James 2017-04-07 14:46:47 EDT
(In reply to Kashyap Chamarthy from comment #42)
> (In reply to Joby James from comment #41)
> > Hi All,
> > 
> > I am hitting this issue with RHEL 7.3 with qemu
> > qemu-kvm-common-rhev-2.4.0-0.el7_2 and libvirt-daemon-driver-qemu-2.0.0*
> > I understand that I need to upgrade to qemu-kvm-rhev-2.6.0-27.el7.x86_64,
> > libvirt-daemon-2.0.0-10.el7.x86_64.
> > 
> > From where I can download these packages? I have RHEL subscriptions for Red
> > Hat OpenStack Platform
> 
> Hi Joby,
> 
> If you have the right subscription for RHOS, and the said channels are
> enabled on your RHEL machine, you should get the right versions.
> 
> From my quick verification, if you're using RHOS-9 (which you are, from your
> description -- "OpenStack Nova,version=13.1.1-2.el7ost", which means
> "Mitaka" release, which means RHOS-9), or RHOS-10, this is the latest
> qemu-kvm-rhev version should be available to you: 
> 
>     qemu-kvm-rhev-2.6.0-28.el7_3.6 
> 
> Which should contain the necessary fixes
> 
> > 
> > 
> > Instance Log:
> > (process:5076): GLib-WARNING **: gmem.c:482: custom memory allocation vtable
> > not supported
> > char device redirected to /dev/pts/1 (label charserial1)
> > 2016-11-07T02:32:04.022357Z qemu-kvm: CPU feature arat not found
> > 2016-11-07 02:32:04.101+0000: shutting down
> > 
> > Conductor:
> > 2016-11-06 21:32:04.461 5713 ERROR nova.scheduler.utils
> > [req-cefdd8bd-cb5c-4334-883b-57fb742984bd 2ec0f13b4e494d49b2901281fc640d71
> > c900d0d4a8d24a8ab29fdb06b5f81d99 - - -] [instance: d7fac7ba-7e30-42fe-


Hey Kashyap Chamarthy,

Thanks a lot for your reply. I am using OpenStack Liberty. My nova version is 12.0.3.

Is there any way to get qemu 2.6 on Liberty?
Comment 44 Kashyap Chamarthy 2017-04-10 07:01:42 EDT
(In reply to Joby James from comment #43)
> (In reply to Kashyap Chamarthy from comment #42)
> > (In reply to Joby James from comment #41)
> > > Hi All,
> > > 
> > > I am hitting this issue with RHEL 7.3 with qemu
> > > qemu-kvm-common-rhev-2.4.0-0.el7_2 and libvirt-daemon-driver-qemu-2.0.0*
> > > I understand that I need to upgrade to qemu-kvm-rhev-2.6.0-27.el7.x86_64,
> > > libvirt-daemon-2.0.0-10.el7.x86_64.
> > > 
> > > From where I can download these packages? I have RHEL subscriptions for Red
> > > Hat OpenStack Platform
> > 
> > Hi Joby,
> > 
> > If you have the right subscription for RHOS, and the said channels are
> > enabled on your RHEL machine, you should get the right versions.
> > 
> > From my quick verification, if you're using RHOS-9 (which you are, from your
> > description -- "OpenStack Nova,version=13.1.1-2.el7ost", which means
> > "Mitaka" release, which means RHOS-9), or RHOS-10, this is the latest
> > qemu-kvm-rhev version should be available to you: 
> > 
> >     qemu-kvm-rhev-2.6.0-28.el7_3.6 

[...]

> Hey Kashyap Chamarthy,
> 
> Thanks a lot for your reply. I am using Openstack liberty. My nova verion is
> 12.0.3
> 
> Is there any way to get qemu 2.6 on liberty.

When you say "liberty", I am assuming you're on the equivalent supported RHOS release, which is RHOS-8.  In that case, yes, 'qemu-kvm-rhev-2.6.0-28.el7_3.6' should be available there, too.
Comment 48 Luyao Huang 2017-06-14 00:00:15 EDT
Tested with qemu-kvm-1.5.3-141.el7.x86_64 and libvirt-3.2.0-9.el7.x86_64, and found that libvirt still forcibly adds the arat flag when building the qemu command line (even when the host, guest and qemu do not support it). For more info, see bug 1018251 comment 9.
Comment 49 Jiri Denemark 2017-06-16 11:53:29 EDT
Oh well, as I already commented in bug 1018251, there is a small bug in the libvirt code which checks what features were disabled by QEMU. It just disables all features it finds in filtered-features. However, the old QEMU does not know anything about the CPU feature 'arat' and naturally cannot list it in filtered-features. Thus libvirt should also disable features which are not mentioned in feature-words.
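
The check described above can be sketched as plain set logic. This is a simplification and not the actual libvirt code: it assumes feature-words and filtered-features have already been decoded into sets of feature names, and the function name is made up for illustration.

```python
def features_to_disable(requested, feature_words, filtered_features):
    """Return the requested features that QEMU did not actually enable.

    requested:         feature names libvirt asked QEMU for
    feature_words:     names QEMU reports as enabled in the guest CPU
    filtered_features: names QEMU reports it had to filter out
    (all three are assumed to be already decoded into names)
    """
    requested = set(requested)
    # What the buggy check already handles: explicitly filtered features.
    explicitly_filtered = requested & set(filtered_features)
    # The missing case: a feature QEMU has never heard of shows up in
    # neither list, so it must be treated as disabled as well.
    unknown_to_qemu = requested - set(feature_words) - set(filtered_features)
    return sorted(explicitly_filtered | unknown_to_qemu)
```

With requested = arat, xsaveopt, pcid, feature-words = pcid and filtered-features = xsaveopt, the first term alone yields only xsaveopt; adding the second term also catches arat.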
Comment 50 Luyao Huang 2017-06-18 21:40:58 EDT
(In reply to Jiri Denemark from comment #49)
> Oh well as I already commented in bug 1018251, there is a small bug in
> libvirt in the code which checks what features were disabled by QEMU. It
> just disables all features it finds in the filtered-features. However, the
> old QEMU does not know anything about CPU feature 'arat' and it naturally
> cannot list arat in filtered-features. Thus libvirt should also disable
> features which are not mentioned in feature-words.

Thanks for your reply. According to bug 1018251 comment 15, I am moving this bug back to assigned.

I am also copying the problem description from that bug:

1. prepare an Intel host:

# lscpu 
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 58
Model name:            Intel(R) Xeon(R) CPU E3-1220 V2 @ 3.10GHz
Stepping:              9
CPU MHz:               2734.296
CPU max MHz:           3500.0000
CPU min MHz:           1600.0000
BogoMIPS:              6185.99
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              8192K
NUMA node0 CPU(s):     0-3
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts

virsh # capabilities 
<capabilities>

  <host>
    <uuid>a6989909-d9a7-11e2-9275-9296ddfd6bef</uuid>
    <cpu>
      <arch>x86_64</arch>
      <model>IvyBridge</model>
      <vendor>Intel</vendor>
      <topology sockets='1' cores='4' threads='1'/>
      <feature name='ds'/>
      <feature name='acpi'/>
      <feature name='ss'/>
      <feature name='ht'/>
      <feature name='tm'/>
      <feature name='pbe'/>
      <feature name='dtes64'/>
      <feature name='monitor'/>
      <feature name='ds_cpl'/>
      <feature name='vmx'/>
      <feature name='smx'/>
      <feature name='est'/>
      <feature name='tm2'/>
      <feature name='xtpr'/>
      <feature name='pdcm'/>
      <feature name='pcid'/>
      <feature name='osxsave'/>
      <feature name='arat'/>
      <feature name='xsaveopt'/>
      <feature name='invtsc'/>

2. install an old qemu for testing (since old qemu does not support emulating all cpu flags):

# rpm -q qemu-kvm
qemu-kvm-1.5.3-141.el7.x86_64

3. start a guest with host-model:

  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>

4. check guest xml:

  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>IvyBridge</model>
    <vendor>Intel</vendor>
    <feature policy='disable' name='ds'/>
    <feature policy='disable' name='acpi'/>
    <feature policy='require' name='ss'/>
    <feature policy='disable' name='ht'/>
    <feature policy='disable' name='tm'/>
    <feature policy='disable' name='pbe'/>
    <feature policy='disable' name='dtes64'/>
    <feature policy='disable' name='monitor'/>
    <feature policy='disable' name='ds_cpl'/>
    <feature policy='disable' name='vmx'/>
    <feature policy='disable' name='smx'/>
    <feature policy='disable' name='est'/>
    <feature policy='disable' name='tm2'/>
    <feature policy='disable' name='xtpr'/>
    <feature policy='disable' name='pdcm'/>
    <feature policy='require' name='pcid'/>
    <feature policy='disable' name='osxsave'/>
    <feature policy='require' name='arat'/>    <<------ not valid flags
    <feature policy='require' name='xsaveopt'/>
    <feature policy='require' name='hypervisor'/>


qemu log:

CPU feature arat not found
CPU feature arat not found
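
When triaging logs like the one above, the rejected feature names can be pulled out mechanically. A small sketch follows; it only knows the two message formats that appear in this report ("CPU feature X not found" and "can't apply global ...-cpu.X=...: Property '.X' not found"), and the function name is made up for illustration.

```python
import re

# Message format seen in the qemu log above.
_NOT_FOUND = re.compile(r"CPU feature (\S+) not found")
# Message format seen elsewhere in this report for an unknown CPU property.
_NO_PROPERTY = re.compile(r"-cpu\.([^=\s]+)=\S*: Property '\.[^']+' not found")


def unknown_cpu_features(log_text: str) -> list:
    """Return CPU feature names QEMU rejected, in first-seen order."""
    names = []
    for line in log_text.splitlines():
        for pattern in (_NOT_FOUND, _NO_PROPERTY):
            match = pattern.search(line)
            # The message is often repeated, so skip names already seen.
            if match and match.group(1) not in names:
                names.append(match.group(1))
    return names
```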
Comment 51 Jiri Denemark 2017-06-19 09:10:23 EDT
(In reply to Jiri Denemark from comment #49)
> there is a small bug in libvirt in the code which checks what features were
> disabled by QEMU. It just disables all features it finds in the filtered-features.

So this bug is real, but very minor and fixing it won't really help much with old QEMU. This bug requires both new libvirt and new QEMU (>= 2.9.0) to be completely fixed. With older QEMU libvirt will be able to detect unsupported CPU feature only once QEMU starts. Thus the "arat" feature will not be shown in the domain XML, but libvirt will still ask for it when starting QEMU. And if QEMU is new enough, we won't ask for unsupported features and thus the bug in the code which checks for disabled features won't ever show up.

I sent the additional patch upstream for review: https://www.redhat.com/archives/libvir-list/2017-June/msg00778.html
Comment 52 Dr. David Alan Gilbert 2017-06-19 09:15:06 EDT
(In reply to Jiri Denemark from comment #51)
> (In reply to Jiri Denemark from comment #49)
> > there is a small bug in libvirt in the code which checks what features were
> > disabled by QEMU. It just disables all features it finds in the filtered-features.
> 
> So this bug is real, but very minor and fixing it won't really help much
> with old QEMU. This bug requires both new libvirt and new QEMU (>= 2.9.0) to
> be completely fixed. With older QEMU libvirt will be able to detect
> unsupported CPU feature only once QEMU starts. Thus the "arat" feature will
> not be shown in the domain XML, but libvirt will still ask for it when
> starting QEMU. And if QEMU is new enough, we won't ask for unsupported
> features and thus the bug in the code which checks for disabled features
> won't ever show up.
> 
> I sent the additional patch upstream for review:
> https://www.redhat.com/archives/libvir-list/2017-June/msg00778.html

Doesn't that break starting a VM on RHEL then?
Comment 53 Jiri Denemark 2017-06-19 09:42:17 EDT
It has no real effect (either positive or negative) on RHEL, unknown CPU features are not fatal for QEMU 1.5.3 from RHEL.
Comment 55 Luyao Huang 2017-06-20 05:30:53 EDT
Hi Jirka,

I am trying to verify this bug. According to comment 0, the problem is that libvirt adds a flag which qemu does not support to the qemu command line when host-model is used, and this causes qemu to fail to start. The patches in this bug only work with qemu >= 2.9.0, so the conditions to hit a problem like this are:

1. libvirt recognizes cpu flag X and host-model Y contains flag X
2. qemu cannot recognize cpu flag X
3. qemu supports query-cpu-model-expansion and query-cpu-definitions
4. start a guest which uses host-model on a host whose host-model is Y

However, I cannot find such X and Y in our current test environment, so I modified cpu_map.xml to meet the 4 conditions:

1. find a new flag on the host which cannot be recognized by libvirt and qemu:

2. add cpb to the cpu_map.xml:
    <feature name='invtsc' migratable='no'>
      <cpuid eax_in='0x80000007' edx='0x00000100'/>
    </feature>
+    <feature name='cpb'>
+      <cpuid eax_in='0x80000007' edx='0x00000200'/>
+    </feature>

    <model name='Opteron_G5'>
      <signature family='21' model='2'/>
      <vendor name='AMD'/>
      <feature name='3dnowprefetch'/>
...
      <feature name='mmx'/>
      <feature name='msr'/>
      <feature name='mtrr'/>
      <feature name='nx'/>
      <feature name='pae'/>
+      <feature name='cpb'/>

3. restart libvirtd and recheck the domcapabilities output:

virsh # domcapabilities 
...
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>Opteron_G5</model>
      <vendor>AMD</vendor>
      <feature policy='require' name='vme'/>
      <feature policy='require' name='x2apic'/>
      <feature policy='require' name='tsc-deadline'/>
      <feature policy='require' name='hypervisor'/>
      <feature policy='require' name='arat'/>
      <feature policy='require' name='tsc_adjust'/>
      <feature policy='require' name='bmi1'/>
      <feature policy='require' name='mmxext'/>
      <feature policy='require' name='fxsr_opt'/>
      <feature policy='require' name='cmp_legacy'/>
      <feature policy='require' name='cr8legacy'/>
      <feature policy='require' name='osvw'/>
      <feature policy='require' name='invtsc'/>
      <feature policy='disable' name='rdtscp'/>
      <feature policy='disable' name='svm'/>
      <feature policy='disable' name='cpb'/>
    </mode>
...

4. start a guest which uses host-model; the guest will fail to start for a similar reason:

guest config xml:

  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>

virsh # start r7
error: Failed to start domain r7
error: internal error: qemu unexpectedly closed the monitor: 2017-06-20T09:06:26.976920Z qemu-kvm: -chardev pty,id=charserial0: char device redirected to /dev/pts/2 (label charserial0)
2017-06-20T09:06:26.977445Z qemu-kvm: -chardev pty,id=charredir0: char device redirected to /dev/pts/3 (label charredir0)
2017-06-20T09:06:26.990180Z qemu-kvm: warning: CPU(s) not present in any NUMA nodes: 6 7 8 9
2017-06-20T09:06:26.990388Z qemu-kvm: warning: All CPU(s) up to maxcpus should be described in NUMA config
2017-06-20T09:06:26.991107Z qemu-kvm: can't apply global Opteron_G5-x86_64-cpu.cpb=off: Property '.cpb' not found

(I checked that query-cpu-model-expansion can show all the cpu flags qemu supports.)

But as I said before, this is an invalid testing scenario; we usually don't change cpu_map.xml.

Could you please help to check:
1. Is this kind of problem (the one I described in this comment) one of the problems you want to fix with the patches mentioned in comment 39?
2. Can I use the steps in bug 822148 comment 28 to verify this bug?

Thanks a lot for your reply.
Comment 56 Jiri Denemark 2017-06-20 07:26:05 EDT
(In reply to Luyao Huang from comment #55)
> Could you please help to check:
> 1. Is this kind of problem (i mentioned in this comment) is one of the
> problems you want to fix with patch mentioned in comment 39 ?

No, this is an artificial problem. CPU models in cpu_map.xml are copied from QEMU so they will never contain features which QEMU doesn't recognize. It can only happen for new CPU models and old QEMU, but in this case QEMU won't advertise such CPU models as supported and thus libvirt won't try to use them.

> 2. Can i use the steps in the bug 822148 comment 28 to verify this bug ?

Yes (good job), but remove the new 'cpb' feature from the Opteron_G5 CPU model. Then you should be able to see the feature in virsh capabilities, but virsh domcapabilities should not list it and starting the domain should succeed. With QEMU >= 2.9.0 of course. If you do this with older QEMU or libvirt, libvirt should try to set 'cpb'.
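
The expected result above (feature visible in virsh capabilities, absent from the host-model part of virsh domcapabilities) could be checked with a short script along these lines. This is a sketch under the assumption that both XML documents have the shape shown in this report (host/cpu/feature under the capabilities root, cpu/mode[@name='host-model']/feature under the domain capabilities root); the function names are made up.

```python
import xml.etree.ElementTree as ET


def host_features(capabilities_xml: str) -> set:
    """Feature names listed under <host><cpu> in `virsh capabilities`."""
    root = ET.fromstring(capabilities_xml)
    return {f.get("name") for f in root.findall("./host/cpu/feature")}


def host_model_features(domcapabilities_xml: str) -> dict:
    """Map feature name to policy for host-model in `virsh domcapabilities`."""
    root = ET.fromstring(domcapabilities_xml)
    mode = root.find("./cpu/mode[@name='host-model']")
    if mode is None:
        return {}
    return {f.get("name"): f.get("policy") for f in mode.findall("feature")}


def feature_hidden_from_guest(name, capabilities_xml, domcapabilities_xml):
    """True if the host has the feature but host-model does not mention it."""
    return (name in host_features(capabilities_xml)
            and name not in host_model_features(domcapabilities_xml))
```

Feeding it the output of the two virsh commands and the name 'cpb' should give True on a fixed setup.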
Comment 57 Luyao Huang 2017-06-21 03:52:18 EDT
(In reply to Jiri Denemark from comment #56)
> (In reply to Luyao Huang from comment #55)
> > Could you please help to check:
> > 1. Is this kind of problem (i mentioned in this comment) is one of the
> > problems you want to fix with patch mentioned in comment 39 ?
> 
> No, this is an artificial problem. CPU models in cpu_map.xml are copied from
> QEMU so they will never contain features which QEMU doesn't recognize. It
> can only happen for new CPU models and old QEMU, but in this case QEMU won't
> advertise such CPU models as supported and thus libvirt won't try to use
> them.
> 

Okay, that makes sense. I forgot that QEMU won't advertise such CPU models as supported.

> > 2. Can i use the steps in the bug 822148 comment 28 to verify this bug ?
> 
> Yes (good job), but remove the new 'cpb' feature from Opteron_G5 CPU model.
> Then you should be able to see the feature in virsh capabilities, but virsh
> domcapabilities should not list it and starting the domain should succeed.
> With QEMU >= 2.9.0 of course. If you do this with older QEMU or libvirt,
> libvirt should try to set 'cpb'.

Yeah, definitely. Actually I removed it right after adding comment 55 so that I wouldn't forget :)

Thanks a lot for your reply
Comment 58 Luyao Huang 2017-06-21 06:00:34 EDT
Test with libvirt-3.2.0-11.el7.x86_64:

1. check domcapabilities:

virsh # domcapabilities 
...
  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>SandyBridge</model>
      <vendor>Intel</vendor>
      <feature policy='require' name='vme'/>
      <feature policy='require' name='ss'/>
      <feature policy='require' name='pcid'/>
      <feature policy='require' name='hypervisor'/>
      <feature policy='require' name='arat'/>
      <feature policy='require' name='tsc_adjust'/>
      <feature policy='require' name='pdpe1gb'/>
      <feature policy='require' name='invtsc'/>
      <feature policy='disable' name='xsave'/>
      <feature policy='disable' name='avx'/>
    </mode>
    <mode name='custom' supported='yes'>
      <model usable='yes'>qemu64</model>
      <model usable='yes'>qemu32</model>
      <model usable='no'>phenom</model>
      <model usable='yes'>pentium3</model>
      <model usable='yes'>pentium2</model>
      <model usable='yes'>pentium</model>
      <model usable='no'>n270</model>
      <model usable='yes'>kvm64</model>
      <model usable='yes'>kvm32</model>
      <model usable='no'>cpu64-rhel6</model>
      <model usable='yes'>coreduo</model>
      <model usable='yes'>core2duo</model>
      <model usable='no'>athlon</model>
      <model usable='yes'>Westmere</model>
      <model usable='no'>Skylake-Client</model>
      <model usable='no'>SandyBridge</model>
      <model usable='yes'>Penryn</model>
      <model usable='no'>Opteron_G5</model>
      <model usable='no'>Opteron_G4</model>
      <model usable='no'>Opteron_G3</model>
      <model usable='yes'>Opteron_G2</model>
      <model usable='yes'>Opteron_G1</model>
      <model usable='yes'>Nehalem</model>
      <model usable='no'>IvyBridge</model>
      <model usable='no'>Haswell</model>
      <model usable='no'>Haswell-noTSX</model>
      <model usable='yes'>Conroe</model>
      <model usable='no'>Broadwell</model>
      <model usable='no'>Broadwell-noTSX</model>
      <model usable='yes'>486</model>
    </mode>
  </cpu>
...

S1: start a guest with host-model + check full:

1. guest xml

  <cpu mode='host-model' check='full'>
    <model fallback='allow'/>

2. start guest, libvirt will report that xsaveopt is missing:

virsh # start r7
error: Failed to start domain r7
error: operation failed: guest CPU doesn't match specification: missing features: xsaveopt

3. modify xml to disable xsaveopt:

  <cpu mode='host-model' check='full'>
    <model fallback='allow'/>
    <feature policy='disable' name='xsaveopt'/>

4. start guest:

virsh # start r7
Domain r7 started

5. recheck the guest live xml:

  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>SandyBridge</model>
    <vendor>Intel</vendor>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='pcid'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='arat'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='pdpe1gb'/>
    <feature policy='disable' name='xsave'/>
    <feature policy='disable' name='avx'/>
    <feature policy='disable' name='xsaveopt'/>

6. login guest and check cpu:

IN GUEST:

# lscpu
...
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes hypervisor lahf_lm tsc_adjust arat

7. check the qemu log, there is no warning.

S2: start a guest with host-model + check is partial

1. modify guest xml to:

  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>

2. start guest:

# virsh start r7
Domain r7 started


3. check the guest xml; xsaveopt is displayed as disabled:

  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>SandyBridge</model>
    <vendor>Intel</vendor>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='ss'/>
    <feature policy='require' name='pcid'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='arat'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='pdpe1gb'/>
    <feature policy='disable' name='xsave'/>
    <feature policy='disable' name='avx'/>
    <feature policy='disable' name='xsaveopt'/>

4. check the guest log: 

warning: host doesn't support requested feature: CPUID.0DH:EAX.xsaveopt [bit 0]
warning: host doesn't support requested feature: CPUID.0DH:EAX.xsaveopt [bit 0]
warning: host doesn't support requested feature: CPUID.0DH:EAX.xsaveopt [bit 0]
warning: host doesn't support requested feature: CPUID.0DH:EAX.xsaveopt [bit 0]
warning: host doesn't support requested feature: CPUID.0DH:EAX.xsaveopt [bit 0]


S3: host-model + check is none:

the same result as with check='partial'
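
The outcomes observed in S1 to S3 can be summarized in a few lines. This only models what was seen in this test (check='full' refuses to start when a requested feature is missing, check='partial' and check='none' start the guest and show the feature as disabled); it is not libvirt's implementation, and the comma separator used for several missing features is an assumption.

```python
def start_outcome(check: str, requested: set, provided: set) -> dict:
    """Model the observed result of starting a host-model guest.

    requested: features the guest CPU definition asks for
    provided:  features QEMU actually gave the guest
    """
    missing = sorted(requested - provided)
    if check == "full" and missing:
        # Matches the error text seen in S1 step 2.
        return {
            "started": False,
            "error": "guest CPU doesn't match specification: "
                     "missing features: " + ",".join(missing),
            "disabled": [],
        }
    # check='partial' and check='none' behaved the same in S2 and S3:
    # the guest starts and missing features show up as policy='disable'.
    return {"started": True, "error": None, "disabled": missing}
```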
Comment 59 Luyao Huang 2017-06-21 06:42:12 EDT
Hi Jirka,

I found a (possible) problem while trying to verify this bug. As you can see in the domcapabilities output in comment 58, there is no element to disable xsaveopt in the cpu mode part. I also checked the query-cpu-definitions output and found that qemu already reports that the host is missing this feature:

 {"name":"SandyBridge","typename":"SandyBridge-x86_64-cpu","unavailable-features":["xsaveopt"],"static":false,"migration-safe":true}

but libvirt didn't filter out xsaveopt in the domcapabilities output. I also couldn't find any documentation saying that the cpu xml in domcapabilities is guaranteed to be a correct xml (i mean ).
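
The check described above can be repeated on any saved reply. This is a minimal sketch, assuming the text is the JSON reply of the QMP command query-cpu-definitions (a "return" list of model entries, as in the line quoted above); the helper name is made up for illustration.

```python
import json


def unavailable_by_model(reply_text):
    """Map each CPU model name to the features QEMU reports as unavailable.

    Accepts either the full QMP reply ({"return": [...]}) or the bare list.
    Models without unavailable features are left out of the result.
    """
    reply = json.loads(reply_text)
    models = reply["return"] if isinstance(reply, dict) else reply
    return {
        model["name"]: model["unavailable-features"]
        for model in models
        if model.get("unavailable-features")
    }


if __name__ == "__main__":
    # The single entry below is the one quoted in this comment.
    sample = (
        '{"return": [{"name":"SandyBridge",'
        '"typename":"SandyBridge-x86_64-cpu",'
        '"unavailable-features":["xsaveopt"],'
        '"static":false,"migration-safe":true}]}'
    )
    print(unavailable_by_model(sample))
```

Any feature listed for the model that the domcapabilities output does not disable is a candidate for the mismatch reported here.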

Could you please check whether this result is expected? Thanks a lot for your reply, and sorry to bother you again.
Comment 60 Jiri Denemark 2017-06-21 06:56:41 EDT
Hmm, this is really strange. With new QEMU host-model is supposed to be usable with check='full'. From the results it looks as if QEMU reported xsaveopt as supported in a reply to query-cpu-model-expansion, but complained about it and disabled it when we actually tried to start QEMU without disabling xsaveopt. 

Could you attach full QEMU log and debug logs from libvirtd (generated after removing /var/cache/libvirt/qemu/* and restarting the daemon)?
Comment 61 Luyao Huang 2017-06-22 01:54:30 EDT
(In reply to Jiri Denemark from comment #60)
> Hmm, this is really strange. With new QEMU host-model is supposed to be
> usable with check='full'. From the results it looks as if QEMU reported
> xsaveopt as supported in a reply to query-cpu-model-expansion, but
> complained about it and disabled it when we actually tried to start QEMU
> without disabling xsaveopt. 
> 
> Could you attach full QEMU log and debug logs from libvirtd (generated after
> removing /var/cache/libvirt/qemu/* and restarting the daemon)?

No problem, I will attach the logs in another comment.
Comment 64 Luyao Huang 2017-06-22 02:01:14 EDT
Hi Jirka,

I attached the libvirtd and qemu logs, please check them.
Comment 65 Jiri Denemark 2017-06-23 03:44:56 EDT
Thanks for the logs. So it appears host-model CPU with check='full' does not always work. Specifically, it doesn't work when libvirt chooses a specific CPU model based on cpu_map.xml, but the CPU model in QEMU contains more features and some of the additional features cannot be enabled. Libvirt doesn't explicitly disable such features and then it complains that QEMU disabled them.

The good thing is that host-model with check='partial' works, and once such a domain is started, its live CPU definition with check='full' is correct.
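
The mismatch described above can be illustrated with plain set arithmetic. This is only a sketch of the reasoning, not libvirt's actual algorithm; the function names are hypothetical and the feature sets in the example are made up and heavily abbreviated.

```python
def explicit_disables(libvirt_model, host_usable):
    """Features in libvirt's model definition that the host cannot provide.

    These are the ones libvirt knows about and can disable explicitly.
    """
    return sorted(set(libvirt_model) - set(host_usable))


def silently_dropped(libvirt_model, qemu_model, host_usable):
    """Features only QEMU's model definition contains and cannot enable.

    Libvirt does not disable these, so QEMU drops them on its own and a
    check='full' comparison then sees an unexpected difference.
    """
    extra_in_qemu = set(qemu_model) - set(libvirt_model)
    return sorted(extra_in_qemu - set(host_usable))


if __name__ == "__main__":
    # Illustrative sets only; real model definitions are much larger.
    libvirt_model = {"sse2", "aes", "xsave", "avx"}
    qemu_model = libvirt_model | {"xsaveopt"}
    host_usable = {"sse2", "aes"}

    print(explicit_disables(libvirt_model, host_usable))
    print(silently_dropped(libvirt_model, qemu_model, host_usable))
```

With these assumed sets, the second list is non-empty, which is the situation where check='full' fails while check='partial' still starts the domain.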

So, could you please file a new BZ for this issue? And please, attach the output of http://libvirt.org/git/?p=libvirt.git;a=blob_plain;f=tests/cputestdata/cpu-gather.sh script run on the affected host to the new BZ.
Comment 66 Luyao Huang 2017-06-23 05:28:23 EDT
(In reply to Jiri Denemark from comment #65)
> Thanks for the logs. So it appears host-model CPU with check='full' does not
> always work. Specifically, it doesn't work when libvirt chooses a specific
> CPU model based on cpu_map.xml, but the CPU model in QEMU contains more
> features and some of the additional features cannot be enabled. Libvirt
> doesn't explicitly disable such features and then it complains that QEMU
> disabled them.
> 
> Good thing is host-model with check='partial' works and once such domain is
> started, its live CPU definition with check='full' is correct.
> 
> So, could you please file a new BZ for this issue? And please, attach the
> output of
> http://libvirt.org/git/?p=libvirt.git;a=blob_plain;f=tests/cputestdata/cpu-gather.sh
> script run on the affected host to the new BZ.

Thanks a lot for your quick reply. I will file a new bug for this problem.

Verified this bug on libvirt-3.2.0-14.el7.x86_64 and qemu-kvm-rhev-2.9.0-14.el7.x86_64:

Since qemu 2.9 already supports arat, I will use other flags that qemu does not support for testing:


1. prepare a host which has cpu flags that qemu cannot emulate:

# lscpu |grep -E "rdtscp|svm"
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc art rep_good nopl nonstop_tsc extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 popcnt aes xsave avx f16c lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs xop skinit wdt lwp fma4 tce nodeid_msr tbm topoext perfctr_core perfctr_nb cpb hw_pstate bmi1 arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

2. check domcapabilities

virsh # domcapabilities 

  <cpu>
    <mode name='host-passthrough' supported='yes'/>
    <mode name='host-model' supported='yes'>
      <model fallback='forbid'>Opteron_G5</model>
      <vendor>AMD</vendor>
      <feature policy='require' name='vme'/>
      <feature policy='require' name='x2apic'/>
      <feature policy='require' name='tsc-deadline'/>
      <feature policy='require' name='hypervisor'/>
      <feature policy='require' name='arat'/>
      <feature policy='require' name='tsc_adjust'/>
      <feature policy='require' name='bmi1'/>
      <feature policy='require' name='mmxext'/>
      <feature policy='require' name='fxsr_opt'/>
      <feature policy='require' name='cmp_legacy'/>
      <feature policy='require' name='cr8legacy'/>
      <feature policy='require' name='osvw'/>
      <feature policy='require' name='invtsc'/>
      <feature policy='disable' name='rdtscp'/>
      <feature policy='disable' name='svm'/>
    </mode>

3. start a guest with host-model:

guest xml:

  <cpu mode='host-model' check='partial'>
    <model fallback='allow'/>


# virsh start r7
Domain r7 started

4. recheck guest live xml:

  <cpu mode='custom' match='exact' check='full'>
    <model fallback='forbid'>Opteron_G5</model>
    <vendor>AMD</vendor>
    <feature policy='require' name='vme'/>
    <feature policy='require' name='x2apic'/>
    <feature policy='require' name='tsc-deadline'/>
    <feature policy='require' name='hypervisor'/>
    <feature policy='require' name='arat'/>
    <feature policy='require' name='tsc_adjust'/>
    <feature policy='require' name='bmi1'/>
    <feature policy='require' name='mmxext'/>
    <feature policy='require' name='fxsr_opt'/>
    <feature policy='require' name='cmp_legacy'/>
    <feature policy='require' name='cr8legacy'/>
    <feature policy='require' name='osvw'/>
    <feature policy='disable' name='rdtscp'/>
    <feature policy='disable' name='svm'/>

5. check qemu command line:

# ps aux|grep qemu
... -cpu Opteron_G5,vme=on,x2apic=on,tsc-deadline=on,hypervisor=on,arat=on,tsc_adjust=on,bmi1=on,mmxext=on,fxsr_opt=on,cmp_legacy=on,cr8legacy=on,osvw=on,rdtscp=off,svm=off

6. log in to the guest and check:

# lscpu
...
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt pdpe1gb lm art rep_good nopl extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw xop fma4 tbm tsc_adjust bmi1 arat
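
The relation between the live xml in step 4 and the -cpu argument in step 5 can be sketched as follows. This is an illustration of the mapping visible in this comment (require becomes =on, disable becomes =off), not libvirt's implementation; the function name is hypothetical, the feature list is abbreviated, and the closing </cpu> tag is added only so the snippet parses.

```python
import xml.etree.ElementTree as ET

# Only the two policies that appear in this comment are handled; any other
# policy value raises KeyError in this sketch.
POLICY_TO_STATE = {"require": "on", "disable": "off"}


def cpu_argument(cpu_xml):
    """Build a '-cpu' value from a <cpu mode='custom'> element."""
    cpu = ET.fromstring(cpu_xml.strip())
    parts = [cpu.findtext("model").strip()]
    for feature in cpu.findall("feature"):
        state = POLICY_TO_STATE[feature.get("policy")]
        parts.append(f"{feature.get('name')}={state}")
    return ",".join(parts)


if __name__ == "__main__":
    # Abbreviated copy of the live xml from step 4.
    live_cpu_xml = """
    <cpu mode='custom' match='exact' check='full'>
      <model fallback='forbid'>Opteron_G5</model>
      <vendor>AMD</vendor>
      <feature policy='require' name='vme'/>
      <feature policy='require' name='arat'/>
      <feature policy='disable' name='rdtscp'/>
      <feature policy='disable' name='svm'/>
    </cpu>
    """
    print(cpu_argument(live_cpu_xml))
    # prints: Opteron_G5,vme=on,arat=on,rdtscp=off,svm=off
```

Comparing the generated string with the ps output is a quick way to confirm that every disabled feature from the live xml reached the qemu command line.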


For more test scenarios, please see bug 822148 comment 28.
Comment 67 Luyao Huang 2017-06-25 22:15:03 EDT
(In reply to Jiri Denemark from comment #65)
> Thanks for the logs. So it appears host-model CPU with check='full' does not
> always work. Specifically, it doesn't work when libvirt chooses a specific
> CPU model based on cpu_map.xml, but the CPU model in QEMU contains more
> features and some of the additional features cannot be enabled. Libvirt
> doesn't explicitly disable such features and then it complains that QEMU
> disabled them.
> 
> Good thing is host-model with check='partial' works and once such domain is
> started, its live CPU definition with check='full' is correct.
> 
> So, could you please file a new BZ for this issue? And please, attach the
> output of
> http://libvirt.org/git/?p=libvirt.git;a=blob_plain;f=tests/cputestdata/cpu-gather.sh
> script run on the affected host to the new BZ.

The new bug is bug 1464832.
Comment 68 errata-xmlrpc 2017-08-01 13:14:13 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:1846
