Bug 1371617 - Libvirt passes unsupported "arat" flag to QEMU when using host-model
Status: ASSIGNED
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.0
Hardware: Unspecified, OS: Unspecified
Priority: high, Severity: high
Target Milestone: pre-dev-freeze
Target Release: ---
Assigned To: Jiri Denemark
QA Contact: chhu
Keywords: Upstream
Depends On:
Blocks: libvirtCPUconfig 1420851
Reported: 2016-08-30 11:57 EDT by Phil Sutter
Modified: 2017-03-08 13:08 EST (History)

Type: Bug
External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 2758651 None None None 2016-11-14 22:52 EST

Description Phil Sutter 2016-08-30 11:57:12 EDT
Trying to duplicate the host's CPU features can cause issues if some of them are not supported by qemu. This happened with the 'arat' CPU flag in a new installation. Here's the relevant libvirt instance log:

-----------------------
2016-08-27 11:57:53.865+0000: starting up libvirt version: 2.0.0, package: 6.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2016-08-23-01:18:42, x86-039.build.eng.bos.redhat.com), qemu version: 2.3.0 (qemu-kvm-rhev-2.3.0-31.el7_2.21), hostname: wsfd-netdev11.ntdv.lab.eng.bos.redhat.com
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=instance-00000001,debug-threads=on -S -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off -cpu IvyBridge,+ds,+acpi,+ss,+ht,+tm,+pbe,+dtes64,+monitor,+ds_cpl,+vmx,+smx,+est,+tm2,+xtpr,+pdcm,+pcid,+dca,+osxsave,+arat,+xsaveopt,+pdpe1gb -m 8192 -realtime mlock=off -smp 4,sockets=4,cores=1,threads=1 -uuid bbeea595-943c-4464-b446-560737af1828 -smbios 'type=1,manufacturer=Red Hat,product=OpenStack Nova,version=13.1.1-2.el7ost,serial=83b07993-8311-43c2-806e-c2d9b898186f,uuid=bbeea595-943c-4464-b446-560737af1828,family=Virtual Machine' -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-4-instance-00000001/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/nova/instances/bbeea595-943c-4464-b446-560737af1828/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=29,id=hostnet0,vhost=on,vhostfd=31 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:56:85:da,bus=pci.0,addr=0x3 -chardev file,id=charserial0,path=/var/lib/nova/instances/bbeea595-943c-4464-b446-560737af1828/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on

(process:28017): GLib-WARNING **: gmem.c:482: custom memory allocation vtable not supported
char device redirected to /dev/pts/1 (label charserial1)
2016-08-27T11:57:53.949252Z qemu-kvm: CPU feature arat not found
2016-08-27 11:57:54.136+0000: shutting down
---------------------

This is the cpuinfo entry for one of the hypervisor's cores:

------------------
processor	: 31
vendor_id	: GenuineIntel
cpu family	: 6
model		: 62
model name	: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
stepping	: 4
microcode	: 0x428
cpu MHz		: 1897.187
cache size	: 20480 KB
physical id	: 1
siblings	: 16
core id		: 7
cpu cores	: 8
apicid		: 47
initial apicid	: 47
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt
bogomips	: 5205.93
clflush size	: 64
cache_alignment	: 64
address sizes	: 46 bits physical, 48 bits virtual
power management:
---------------

The list of CPU flags that QEMU accepts can be extracted from the installed binary by calling '/usr/libexec/qemu-kvm -cpu help'. In my case, 'arat' is not in that list. The flag list from the help output could serve as a dynamic whitelist when building the qemu command line for cpu_mode=host-model configurations. The parsing overhead should be acceptable given the version independence gained.
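
The whitelist idea above can be sketched roughly as follows. This is a minimal illustration in Python, not libvirt code. It assumes that the '-cpu help' output contains a 'Recognized CPUID flags:' header followed by indented lines of space-separated flag names; that format is not a stable interface and may differ between QEMU builds. The helper names parse_supported_flags and unsupported_features are hypothetical, and the sample help text is shortened and made up for the example.

```python
def parse_supported_flags(help_text):
    """Collect flag names listed after the 'Recognized CPUID flags:' header."""
    flags = set()
    in_flags = False
    for line in help_text.splitlines():
        if line.strip().startswith("Recognized CPUID flags"):
            in_flags = True
            continue
        if in_flags:
            if line.strip() and not line.startswith((" ", "\t")):
                # An unindented line starts a new section: stop collecting.
                in_flags = False
                continue
            flags.update(line.split())
    return flags


def unsupported_features(cpu_arg, supported):
    """Return the '+feature' entries of a -cpu argument that QEMU does not list."""
    parts = cpu_arg.split(",")
    requested = [p[1:] for p in parts[1:] if p.startswith("+")]
    return [f for f in requested if f not in supported]


# Shortened, made-up sample of the help output (assumed format).
SAMPLE_HELP = """\
x86 IvyBridge  Intel Xeon E3-12xx v2 (Ivy Bridge)

Recognized CPUID flags:
  pbe tm ht ss acpi ds
  xsaveopt pdpe1gb osxsave
"""

supported = parse_supported_flags(SAMPLE_HELP)
missing = unsupported_features("IvyBridge,+ds,+acpi,+arat,+xsaveopt", supported)
print(missing)
```

With a check like this, any feature reported as missing could be dropped (or reported with a clear error) before QEMU is started.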
Comment 1 Daniel Berrange 2016-09-05 06:34:34 EDT
This is not an openstack bug - libvirt shouldn't be passing a feature that is not supported by qemu.
Comment 3 Jiri Denemark 2016-09-06 07:20:11 EDT
Well, currently libvirt doesn't check whether a requested CPU feature is supported by QEMU before starting a domain (QEMU does not provide any command which we could use for this). We could work with QEMU folks to fix this.

However, you are mixing libvirt from 7.3 and qemu-kvm-rhev from 7.2. The version of qemu-kvm-rhev from 7.3 (2.6.*) supports the "arat" CPU feature, so you won't see the problem there.

Moreover, copying host's CPU definition from capabilities XML is not really the best thing to do. Either use a host-passthrough or a host-model CPU in domain XML. "host-model" won't fix the problem right now, but in the (hopefully near) future it will only use CPU features that are supported on the current host/kernel/QEMU combination.

That said, even if we start checking supported CPU features, the domain will still fail to start if an unsupported feature is explicitly requested in domain XML. Libvirt will just provide a better error message. I'll take this bug as a request to do so since we have enough bugs that already cover the "host-model" issue.
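
For reference, the two CPU modes recommended above are selected with the mode attribute of the <cpu> element in the domain XML. The snippet below is only a small sketch in Python that builds such an element; the helper name cpu_element is hypothetical, and a real domain definition contains much more than this one element.

```python
import xml.etree.ElementTree as ET


def cpu_element(mode):
    """Build a <cpu> element that lets libvirt derive the CPU from the host."""
    if mode not in ("host-model", "host-passthrough"):
        raise ValueError("unexpected CPU mode: %s" % mode)
    return ET.Element("cpu", {"mode": mode})


# Serialize the element as it would appear inside <domain>...</domain>.
xml_text = ET.tostring(cpu_element("host-model"), encoding="unicode")
print(xml_text)
```

Unlike a CPU definition copied from the capabilities XML, such an element names no individual features, so the feature list is left to libvirt.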
Comment 4 Phil Sutter 2016-09-06 09:33:40 EDT
Hi,

I stumbled upon this issue when trying to set up OpenStack from RDO on a RHEL7.3 machine from beaker. There is no package qemu-kvm-rhev installed at all. Instead I have qemu-kvm-1.5.3-122.el7.x86_64 which also provides /usr/libexec/qemu-kvm used by OpenStack. All installed libvirt packages (apart from libvirt-python) are of version 2.0.0-6.el7. All these packages come from the beaker-server repository.

To my surprise, passing the unsupported arat CPU feature does not seem to be a fatal error. In my current setup, instances are running (and reachable) despite the error message found in /var/log/libvirt/qemu/instance-*.log. So it seems there was another issue that prevented the machines from starting up.

Thanks, Phil
Comment 5 Daniel Berrange 2016-09-06 09:45:26 EDT
Use of qemu-kvm in combination with openstack is *not* supported - only qemu-kvm-rhev is permitted. We've got a bug open that will use RPM requires to prevent this mistake in future, by mandating qemu-kvm-rhev at the RPM level.
Comment 6 Dr. David Alan Gilbert 2016-10-24 06:11:42 EDT
I'm seeing this on a GSS box on which they're trying to debug a different problem with RHOS:

qemu-kvm-rhev-2.3.0-31.el7_2.21.x86_64
libvirt-daemon-2.0.0-10.el7.x86_64

from cpuinfo:
model name	: Intel(R) Xeon(R) CPU E5-2609 v3 @ 1.90GHz
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc

LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=instance-00000081,debug-threads=on -S -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off -cpu Haswell-noTSX,+vme,+ds,+acpi,+ss,+ht,+tm,+pbe,+dtes64,+monitor,+ds_cpl,+vmx,+smx,+est,+tm2,+xtpr,+pdcm,+dca,+osxsave,+f16c,+rdrand,+arat,+tsc_adjust,+xsaveopt,+pdpe1gb,+abm -m 4096 -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -uuid e8eac58c-19b4-4d87-8ab8-0c8a214a6ee1 -smbios 'type=1,manufacturer=Red Hat,product=OpenStack Compute,version=2015.1.4-13.el7ost,serial=f37fc981-aa7c-4ef0-afa9-afdba8cb1469,uuid=e8eac58c-19b4-4d87-8ab8-0c8a214a6ee1' -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-64-instance-00000081/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive 'file=rbd:vms/e8eac58c-19b4-4d87-8ab8-0c8a214a6ee1_disk:id=cinder:key=AQBCQzdW/pXyGxAAsiM5BFFqIiGlLMBBEE+0OA==:auth_supported=cephx\;none:mon_host=10.74.128.30\:6789\;10.74.128.31\:6789\;10.74.128.32\:6789,format=raw,if=none,id=drive-virtio-disk0,cache=writeback' -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=32,id=hostnet0,vhost=on,vhostfd=34 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:ee:94:11,bus=pci.0,addr=0x3 -chardev file,id=charserial0,path=/var/lib/nova/instances/e8eac58c-19b4-4d87-8ab8-0c8a214a6ee1/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on

2016-10-24T04:18:34.105911Z qemu-kvm: CPU feature arat not found
Comment 7 Dr. David Alan Gilbert 2016-10-24 06:56:26 EDT
I *think* openstack is using host-model for the cpu type by default, which I guess is the problem?
Comment 8 Jiri Denemark 2016-10-24 08:23:58 EDT
Partially. It's a combination of host-model and using libvirtd from 7.3 with qemu-kvm-rhev from 7.2. The old qemu-kvm-rhev does not support arat.
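
The version mismatch described here can be spotted directly in the first line of the domain log. The following Python sketch extracts both versions from such a line; the regular expression is written against the log lines quoted in this bug and is an assumption about their format, and the (2, 6) threshold only reflects what this thread states (2.3.0 lacks "arat", 2.6.* has it), not a verified first version.

```python
import re

# Matches the header libvirt writes to the domain log when it starts a guest.
HEADER_RE = re.compile(
    r"libvirt version: (?P<libvirt>\d+(?:\.\d+)*),"
    r".*?qemu version: (?P<qemu>\d+(?:\.\d+)*)"
)

# Assumption taken from this thread: 2.3.0 lacks "arat", 2.6.* has it.
ARAT_MIN_QEMU = (2, 6)


def as_tuple(version):
    """Turn '2.3.0' into (2, 3, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def parse_versions(log_line):
    """Return (libvirt, qemu) version tuples, or None if the line differs."""
    match = HEADER_RE.search(log_line)
    if match is None:
        return None
    return as_tuple(match.group("libvirt")), as_tuple(match.group("qemu"))


LOG_LINE = (
    "2016-08-27 11:57:53.865+0000: starting up libvirt version: 2.0.0, "
    "package: 6.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, "
    "2016-08-23-01:18:42, x86-039.build.eng.bos.redhat.com), "
    "qemu version: 2.3.0 (qemu-kvm-rhev-2.3.0-31.el7_2.21)"
)

libvirt_version, qemu_version = parse_versions(LOG_LINE)
qemu_knows_arat = qemu_version >= ARAT_MIN_QEMU
print(libvirt_version, qemu_version, qemu_knows_arat)
```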
Comment 10 Jeremy 2016-11-04 13:05:37 EDT
seeing this issue as well:


instance id: 4e117abe-c875-4821-8acf-3e64c142bcbb


###nova-conductor.log
2016-11-04 11:16:28.615 23729 ERROR nova.scheduler.utils [req-8997e7bb-777e-4109-bbca-3af1421f38a9 612df114ddc5410aa09c709769f6d7c8 1b9ba84140514e9eaa4fd4d8ebcacb02 - - -] [instance: 4e117abe-c875-4821-8acf-3e64c142bcbb] Error from last host: ucs-c (node ucs-c.lwr04.cisco.com): [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1905, in _do_build_and_run_instance\n    filter_properties)\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2082, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance 4e117abe-c875-4821-8acf-3e64c142bcbb was re-scheduled: internal error: process exited while connecting to monitor: \n(process:15864): GLib-WARNING **: gmem.c:482: custom memory allocation vtable not supported\n2016-11-04T15:16:28.209775Z qemu-kvm: CPU feature arat not found\n']
2016-11-04 11:16:31.269 23703 ERROR nova.scheduler.utils [req-8997e7bb-777e-4109-bbca-3af1421f38a9 612df114ddc5410aa09c709769f6d7c8 1b9ba84140514e9eaa4fd4d8ebcacb02 - - -] [instance: 4e117abe-c875-4821-8acf-3e64c142bcbb] Error from last host: ucs-d (node ucs-d.lwr04.cisco.com): [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1905, in _do_build_and_run_instance\n    filter_properties)\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2082, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance 4e117abe-c875-4821-8acf-3e64c142bcbb was re-scheduled: internal error: qemu unexpectedly closed the monitor: \n(process:14773): GLib-WARNING **: gmem.c:482: custom memory allocation vtable not supported\n2016-11-04T15:16:30.582633Z qemu-kvm: CPU feature arat not found\n']
2016-11-04 11:16:31.273 23703 INFO oslo.messaging._drivers.impl_rabbit [req-8997e7bb-777e-4109-bbca-3af1421f38a9 612df114ddc5410aa09c709769f6d7c8 1b9ba84140514e9eaa4fd4d8ebcacb02 - - -] Connecting to AMQP server on 10.90.54.160:5672
2016-11-04 11:16:31.282 23703 INFO oslo.messaging._drivers.impl_rabbit [req-8997e7bb-777e-4109-bbca-3af1421f38a9 612df114ddc5410aa09c709769f6d7c8 1b9ba84140514e9eaa4fd4d8ebcacb02 - - -] Connected to AMQP server on 10.90.54.160:5672
2016-11-04 11:16:34.055 23765 ERROR nova.scheduler.utils [req-8997e7bb-777e-4109-bbca-3af1421f38a9 612df114ddc5410aa09c709769f6d7c8 1b9ba84140514e9eaa4fd4d8ebcacb02 - - -] [instance: 4e117abe-c875-4821-8acf-3e64c142bcbb] Error from last host: ucs-e (node ucs-e.lwr04.cisco.com): [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1905, in _do_build_and_run_instance\n    filter_properties)\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2082, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance 4e117abe-c875-4821-8acf-3e64c142bcbb was re-scheduled: internal error: process exited while connecting to monitor: \n(process:14071): GLib-WARNING **: gmem.c:482: custom memory allocation vtable not supported\n2016-11-04T15:16:33.504622Z qemu-kvm: CPU feature arat not found\n']
2016-11-04 11:16:34.056 23765 WARNING nova.scheduler.utils [req-8997e7bb-777e-4109-bbca-3af1421f38a9 612df114ddc5410aa09c709769f6d7c8 1b9ba84140514e9eaa4fd4d8ebcacb02 - - -] Failed to compute_task_build_instances: Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance 4e117abe-c875-4821-8acf-3e64c142bcbb. Last exception: [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1905, in _do_build_and_run_instance\n    filter_properties)\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2082, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance 4e117abe-c875-4821-8acf-3e64c142bcbb was re-scheduled: internal error: process exited while connecting to monitor: \n(process:14071): GLib-WARNING **: gmem.c:482: custom memory allocation vtable not supported\n2016-11-04T15:16:33.504622Z qemu-kvm: CPU feature arat not found\n']
2016-11-04 11:16:34.057 23765 WARNING nova.scheduler.utils [req-8997e7bb-777e-4109-bbca-3af1421f38a9 612df114ddc5410aa09c709769f6d7c8 1b9ba84140514e9eaa4fd4d8ebcacb02 - - -] [instance: 4e117abe-c875-4821-8acf-3e64c142bcbb] Setting instance to ERROR state.


###from compute node, it has arat flag...
vendor_id       : GenuineIntel
cpu family      : 6
model           : 62
model name      : Intel(R) Xeon(R) CPU E5-2697 v2 @ 2.70GHz
stepping        : 4
microcode       : 0x428
cpu MHz         : 2248.382
cache size      : 30720 KB
physical id     : 0
siblings        : 24
core id         : 0
cpu cores       : 12
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt


###compute node /var/log/libvirt/qemu/instance-00000005.log 
2016-11-04 15:16:33.426+0000: starting up libvirt version: 2.0.0, package: 10.el7 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2016-09-21-10:15:26, x86-038.build.eng.bos.redhat.com), qemu version: 2.3.0 (qemu-kvm-rhev-2.3.0-31.el7_2.21), hostname: ucs-e.lwr04.cisco.com
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name guest=instance-00000005,debug-threads=on -S -machine pc-i440fx-rhel7.2.0,accel=kvm,usb=off -cpu IvyBridge,+ds,+acpi,+ss,+ht,+tm,+pbe,+dtes64,+monitor,+ds_cpl,+vmx,+smx,+est,+tm2,+xtpr,+pdcm,+pcid,+dca,+osxsave,+arat,+xsaveopt,+pdpe1gb -m 512 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1 -uuid 4e117abe-c875-4821-8acf-3e64c142bcbb -smbios 'type=1,manufacturer=Red Hat,product=OpenStack Compute,version=12.0.4-8.el7ost,serial=fbe41075-92fc-486a-91b4-7086104ed812,uuid=4e117abe-c875-4821-8acf-3e64c142bcbb,family=Virtual Machine' -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-5-instance-00000005/monitor.sock,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=discard -no-hpet -no-shutdown -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/var/lib/nova/instances/4e117abe-c875-4821-8acf-3e64c142bcbb/disk,format=qcow2,if=none,id=drive-virtio-disk0,cache=none -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -netdev tap,fd=29,id=hostnet0,vhost=on,vhostfd=31 -device virtio-net-pci,netdev=hostnet0,id=net0,mac=fa:16:3e:11:9f:fc,bus=pci.0,addr=0x3 -chardev file,id=charserial0,path=/var/lib/nova/instances/4e117abe-c875-4821-8acf-3e64c142bcbb/console.log -device isa-serial,chardev=charserial0,id=serial0 -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 -device usb-tablet,id=input0,bus=usb.0,port=1 -vnc 0.0.0.0:0 -k en-us -device cirrus-vga,id=video0,bus=pci.0,addr=0x2 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5 -msg timestamp=on

(process:14071): GLib-WARNING **: gmem.c:482: custom memory allocation vtable not supported
char device redirected to /dev/pts/0 (label charserial1)
2016-11-04T15:16:33.504622Z qemu-kvm: CPU feature arat not found
2016-11-04 15:16:33.570+0000: shutting down
Comment 13 Shanmugavel Balu 2016-11-06 21:45:40 EST
I am seeing this issue too. Is there any workaround available?

Instance Log:
(process:5076): GLib-WARNING **: gmem.c:482: custom memory allocation vtable not supported
char device redirected to /dev/pts/1 (label charserial1)
2016-11-07T02:32:04.022357Z qemu-kvm: CPU feature arat not found
2016-11-07 02:32:04.101+0000: shutting down

Conductor:
2016-11-06 21:32:04.461 5713 ERROR nova.scheduler.utils [req-cefdd8bd-cb5c-4334-883b-57fb742984bd 2ec0f13b4e494d49b2901281fc640d71 c900d0d4a8d24a8ab29fdb06b5f81d99 - - -] [instance: d7fac7ba-7e30-42fe-beb0-e2c605f8bc1b] Error from last host: chennai-compute1 (node chennai-compute1): [u'Traceback (most recent call last):\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 1905, in _do_build_and_run_instance\n    filter_properties)\n', u'  File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2082, in _build_and_run_instance\n    instance_uuid=instance.uuid, reason=six.text_type(e))\n', u'RescheduledException: Build of instance d7fac7ba-7e30-42fe-beb0-e2c605f8bc1b was re-scheduled: internal error: process exited while connecting to monitor: \n(process:5076): GLib-WARNING **: gmem.c:482: custom memory allocation vtable not supported\n2016-11-07T02:32:04.022357Z qemu-kvm: CPU feature arat not found\n']
Comment 20 Jiri Denemark 2016-11-07 17:27:33 EST
As explained above qemu-kvm-rhev package needs to be installed since openstack is not supported with qemu-kvm. And since running qemu-kvm-rhev-2.3.0 (from 7.2) is unsupported on RHEL 7.3, you need to install qemu-kvm-rhev-2.6.0 (from 7.3). Once installed, everything should work fine with libvirt from 7.3.

In addition to this, there are a few possible ways to work around this issue:

1) tell openstack not to use host-model by modifying cpu_mode and cpu_model in nova.conf

2) comment out <feature name='arat'>...</feature> and <model name='Skylake-Client'>...</model> in /usr/share/libvirt/cpu_map.xml
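
Workaround 1 amounts to changing two options in the [libvirt] section of nova.conf. The sketch below shows the edit on an in-memory sample using Python's configparser; the helper name set_cpu_mode and the sample file content are made up for illustration, and "IvyBridge" is only an example model chosen because it is the base model libvirt picked in the logs above. Note that configparser drops comments and rejects duplicate option names, so this is not a drop-in tool for a production nova.conf.

```python
import configparser
import io


def set_cpu_mode(conf_text, mode, model=None):
    """Return conf_text with [libvirt] cpu_mode (and optionally cpu_model) set."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    if not parser.has_section("libvirt"):
        parser.add_section("libvirt")
    parser.set("libvirt", "cpu_mode", mode)
    if model is not None:
        # cpu_model is only meaningful together with cpu_mode = custom.
        parser.set("libvirt", "cpu_model", model)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Made-up minimal configuration for the example.
SAMPLE_CONF = """\
[DEFAULT]
debug = false

[libvirt]
virt_type = kvm
cpu_mode = host-model
"""

new_conf = set_cpu_mode(SAMPLE_CONF, "custom", "IvyBridge")
print(new_conf)
```

A named model without extra features avoids the "+arat" flag entirely, at the cost of hiding some host features from the guest.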


In the future, libvirt's host-model CPU mode will be fixed to avoid features unknown to QEMU. But this requires changes to both libvirt and QEMU and we're working on them upstream.

I also asked for a QEMU interface which would allow us to check what CPU features it supports, but since it's a new interface, it won't really help in this situation (which is caused by using new libvirt with old QEMU).
Comment 22 Shanmugavel Balu 2016-11-07 18:23:04 EST
Hello, I upgraded to qemu-kvm-rhev-2.6.0-27.el7.x86_64 and no longer see the above issue. However, after the upgrade I restarted nova-compute and libvirtd, and I am now seeing the permission issue below. I tried multiple options and nothing worked.

2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [req-17b73d5b-a406-439e-8e4d-9a40c04a1f9a 2ec0f13b4e494d49b2901281fc640d71 c900d0d4a8d24a8ab29fdb06b5f81d99 - - -] [instance: 9730e067-3c4f-4241-9bae-693d95f53e00] Instance failed to spawn
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00] Traceback (most recent call last):
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2180, in _build_resources
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     yield resources
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2033, in _build_and_run_instance
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     block_device_info=block_device_info)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2596, in spawn
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     block_device_info=block_device_info)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4719, in _create_domain_and_network
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     xml, pause=pause, power_on=power_on)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4649, in _create_domain
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     guest.launch(pause=pause)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 142, in launch
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     self._encoded_xml, errors='ignore')
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 204, in __exit__
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     six.reraise(self.type_, self.value, self.tb)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/guest.py", line 137, in launch
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     return self._domain.createWithFlags(flags)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 183, in doit
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     result = proxy_call(self._autowrap, f, *args, **kwargs)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 141, in proxy_call
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     rv = execute(f, *args, **kwargs)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 122, in execute
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     six.reraise(c, e, tb)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib/python2.7/site-packages/eventlet/tpool.py", line 80, in tworker
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     rv = meth(*args, **kwargs)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1065, in createWithFlags
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00]     if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00] libvirtError: Unable to open file: /var/lib/nova/instances/9730e067-3c4f-4241-9bae-693d95f53e00/console.log: Permission denied
2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance: 9730e067-3c4f-4241-9bae-693d95f53e00] 
2016-11-07 18:06:34.811 19880 INFO nova.compute.manager [req-17b73d5b-a406-439e-8e4d-9a40c04a1f9a 2ec0f13b4e494d49b2901281fc640d71 c900d0d4a8d24a8ab29fdb06b5f81d99 - - -] [instance: 9730e067-3c4f-4241-9bae-693d95f53e00] Terminating instance
Comment 24 Jiri Denemark 2016-11-08 03:52:30 EST
(In reply to Shanmugavel Balu from comment #22)
> 2016-11-07 18:06:34.810 19880 ERROR nova.compute.manager [instance:
> 9730e067-3c4f-4241-9bae-693d95f53e00] libvirtError: Unable to open file:
> /var/lib/nova/instances/9730e067-3c4f-4241-9bae-693d95f53e00/console.log:
> Permission denied

This is of course a completely different issue, and it really doesn't belong to this bz. Anyway, it's most likely caused by a "regression", which was inevitable, see bug 1371125. The required changes are described in https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Virtualization_Deployment_and_Administration_Guide/sect-Manipulating_the_domain_xml-Devices.html#sect-Devices-Host_physical_machine_interface

In short, the directories to which nova redirects domains' console output need to be labeled with virt_log_t.
Comment 27 Jiri Denemark 2016-11-08 07:32:01 EST
Nice, so this is actually similar to bug 1365500, except that, unlike CMT, ARAT is supported by some QEMU versions. Since QEMU has no interface to give us a list of supported CPU features (I asked for it and Eduardo is working on it upstream), we should just treat it as unsupported and ignore it when creating host-model CPU. This would mean even new QEMU won't get the feature turned on with host-model CPU, but that shouldn't be a big issue. The current host-model implementation is just wrong anyway and it's pure luck that it ever worked, so removing one more CPU feature from the guest CPU doesn't make things any worse.
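
The approach described above boils down to filtering a fixed set of feature names out of the host-model CPU. A minimal Python sketch of that idea follows; it is not libvirt's implementation, the names IGNORED_FEATURES and host_model_features are hypothetical, and the set contains only the two features mentioned in this thread.

```python
# Features dropped from the host-model CPU because not every QEMU knows them.
# "cmt" and "arat" come from this thread; the set itself is illustrative.
IGNORED_FEATURES = frozenset({"cmt", "arat"})


def host_model_features(host_features, ignored=IGNORED_FEATURES):
    """Return the host features minus those treated as unsupported."""
    return [f for f in host_features if f not in ignored]


# Subset of the extra features from the failing command line in this bug.
host = ["ds", "acpi", "ss", "ht", "osxsave", "arat", "xsaveopt", "pdpe1gb"]
guest = host_model_features(host)
print(guest)
```

The trade-off is the one noted in the comment: a QEMU that does support the feature no longer gets it enabled through host-model.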
Comment 28 Dr. David Alan Gilbert 2016-11-08 13:39:16 EST
It looks like the new qemu has just hit the RHOS repos:

 qemu-kvm-rhev  x86_64  10:2.6.0-27.el7   rhel-7-server-openstack-9-rpms  2.4 M

so if you've got customers who've hit this then I guess that should fix it for them.
Comment 29 Shanmugavel Balu 2016-11-08 14:47:50 EST
Thanks, stdio_handler = "file" in qemu.conf worked.

Current pkg version on my setup:
qemu-kvm-rhev-2.6.0-27.el7.x86_64
libvirt-daemon-kvm-2.0.0-10.el7.x86_64
Comment 34 Stephen Gordon 2016-12-13 23:29:01 EST
(In reply to Jiri Denemark from comment #20)
> As explained above qemu-kvm-rhev package needs to be installed since
> openstack is not supported with qemu-kvm. And since running
> qemu-kvm-rhev-2.3.0 (from 7.2) is unsupported on RHEL 7.3, you need to
> install qemu-kvm-rhev-2.6.0 (from 7.3). Once installed, everything should
> work fine with libvirt from 7.3.

RDO CI and OpenStack CI folks who are running qemu without kvm enabled (as they have to run in public cloud environments where nested virt isn't enabled) are reporting that they still hit this issue even with qemu-kvm-(rh)ev 2.6.0 and libvirt-2.0.0.

They are likely going to end up not using host-model by modifying cpu_mode and cpu_model in nova.conf for now but it seems like the above is possibly another variation of the same root issue with regards to the introduction of a new flag.
Comment 35 Jiri Denemark 2016-12-14 04:01:57 EST
There should be no difference between KVM and TCG mode. If QEMU is new enough it will recognize the feature anyway. It certainly may not be able to use the feature, but this would only cause a warning.

Are you sure the QEMU version used is really 2.6.0?
Comment 36 Javier Peña 2016-12-14 05:03:40 EST
Since the cpu_mode issue in nova.conf might be a different one, I have created https://bugzilla.redhat.com/show_bug.cgi?id=1404627 with a detailed description.
Comment 37 chhu 2016-12-15 00:30:55 EST
(In reply to Jiri Denemark from comment #35)
> There should be no difference between KVM and TCG mode. If QEMU is new
> enough it will recognize the feature anyway. It certainly may not be able to
> use the feature, but this would only cause a warning.
> 
> Are you sure the QEMU version used is really 2.6.0?

And, please use libvirt newer than 2.0.8, as there is a fix for Bug 1365500 - CPU feature cmt not found with 2.0.0-1.
Comment 38 chhu 2016-12-15 00:32:24 EST
(In reply to chhu from comment #37)
> (In reply to Jiri Denemark from comment #35)
> > There should be no difference between KVM and TCG mode. If QEMU is new
> > enough it will recognize the feature anyway. It certainly may not be able to
> > use the feature, but this would only cause a warning.
> > 
> > Are you sure the QEMU version used is really 2.6.0?
> 
And, please use libvirt newer than libvirt-2.0.0-8.el7, as there is a fix for Bug 1365500 - CPU feature cmt not found with 2.0.0-1.
Comment 39 Jiri Denemark 2017-03-03 14:27:32 EST
This should be finally fixed by (in combination with QEMU 2.9.0):

commit 2a586b4402a7637e0bef9a2876d065c0ce6bfef1
Refs: v3.1.0-9-g2a586b440
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Mon Jan 30 16:10:22 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    qemucapstest: Update test data for QEMU 2.9.0

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit 0bde051f3de02b1be25ea4a4d9f062abfa3d1397
Refs: v3.1.0-10-g0bde051f3
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Mon Jan 30 16:10:49 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    domaincapstest: Add test data for QEMU 2.9.0

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit d2f8f3052d48f284d56e27c98ce7a2ce6c656e59
Refs: v3.1.0-11-gd2f8f3052
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Wed Feb 15 10:18:53 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    docs: Update description of the host-model CPU mode

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit 4c0723a1d75b981e8939c4c5b6bde7607fc7301e
Refs: v3.1.0-12-g4c0723a1d
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Mon Jan 30 16:30:13 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    qemu: Rename hostCPU/feature element in capabilities cache

    The element will be generalized in the following commits.

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit 03a34f6b84da009291e8651aba71df8a6761d081
Refs: v3.1.0-13-g03a34f6b8
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Wed Feb 22 15:46:47 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    qemu: Prepare for more types in qemuMonitorCPUModelInfo

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit 2fc215dd2ad4b88c1054da804c4c45b3d4e5c2fa
Refs: v3.1.0-14-g2fc215dd2
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Wed Feb 22 16:01:30 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:56 2017 +0100

    qemu: Store more types in qemuMonitorCPUModelInfo

    query-cpu-model-expansion returns only boolean features on s390, but
    x86_64 reports some integer and string properties which we are
    interested in.

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit d7f054a512a911a386d9bbeec51379e4bb843ca5
Refs: v3.1.0-15-gd7f054a51
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Wed Feb 22 16:51:50 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Probe "max" CPU model in TCG

    Querying "host" CPU model expansion only makes sense for KVM. QEMU 2.9.0
    introduces a new "max" CPU model which can be used to ask QEMU what the
    best CPU it can provide to a TCG domain is.

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
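
For illustration, the model choice described in the commit above can be sketched as a small Python helper that builds the QMP request. The command name query-cpu-model-expansion and the "host" and "max" model names come from the commit message; the helper name build_expansion_query and the "kvm"/"tcg" strings are assumptions made for this sketch and are not libvirt code.

```python
import json


def build_expansion_query(virt_type, expansion_type="static"):
    """Build a QMP query-cpu-model-expansion request (hypothetical helper).

    "host" is only meaningful with KVM; for TCG the "max" model
    introduced in QEMU 2.9.0 is queried instead.
    """
    if virt_type not in ("kvm", "tcg"):
        raise ValueError("unsupported virt type: %s" % virt_type)
    model_name = "host" if virt_type == "kvm" else "max"
    return {
        "execute": "query-cpu-model-expansion",
        "arguments": {
            "type": expansion_type,
            "model": {"name": model_name},
        },
    }


if __name__ == "__main__":
    # Print the request that would be sent for a TCG domain.
    print(json.dumps(build_expansion_query("tcg"), indent=2))
```

The resulting dictionary would still have to be sent over a QMP monitor connection, which is left out here.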

commit f0138289920d5204c1654bc9b17115d1a315d62e
Refs: v3.1.0-16-gf01382899
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Wed Jan 11 14:36:34 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Get host CPU model from QEMU on x86_64

    Until now host-model CPU mode tried to enable all CPU features supported
    by the host CPU even if QEMU/KVM did not support them. This caused a
    number of issues and made host-model quite unreliable. Asking QEMU for
    the CPU it can provide on the current host makes host-model much more
    robust.

    This commit fixes the following bugs:

        https://bugzilla.redhat.com/show_bug.cgi?id=1018251
        https://bugzilla.redhat.com/show_bug.cgi?id=1371617
        https://bugzilla.redhat.com/show_bug.cgi?id=1372581
        https://bugzilla.redhat.com/show_bug.cgi?id=1404627
        https://bugzilla.redhat.com/show_bug.cgi?id=870071

    In addition to that, the following bug should be mostly limited to cases
    when an unsupported feature is explicitly requested:

        https://bugzilla.redhat.com/show_bug.cgi?id=1335534

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
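
One way to picture the effect of the commit above is a filter that keeps only the host features QEMU itself reports as enabled. This is a simplified sketch and not libvirt's actual algorithm; the function name usable_features and the sample feature data are made up for illustration.

```python
def usable_features(host_features, qemu_props):
    """Return the host features that QEMU reports as enabled.

    host_features: feature names detected on the host CPU.
    qemu_props: property map as it might appear in a CPU model
                expansion reply (name -> value); sample data only.
    """
    return [name for name in host_features if qemu_props.get(name) is True]


if __name__ == "__main__":
    # Made-up data: the host has "arat", but QEMU does not report it.
    host = ["vmx", "arat", "pcid"]
    qemu = {"vmx": True, "pcid": True}
    print(usable_features(host, qemu))
```

With this kind of filtering, a flag such as "arat" that the host has but QEMU does not report would never reach the -cpu command line.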

commit be3d59754b1a1da174ff1796882a0ceb35e198e8
Refs: v3.1.0-17-gbe3d59754
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Tue Jan 31 13:44:00 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Use enum for CPU model expansion type

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit bb3363c90b5b19c37f8e5b8f512eb00014d2dae4
Refs: v3.1.0-18-gbb3363c90
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Thu Feb 23 13:53:51 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Use full CPU model expansion on x86

    The static CPU model expansion is designed to return only canonical
    names of all CPU properties. To maintain backwards compatibility libvirt
    is stuck with different spelling of some of the features, but we need to
    use the full expansion to get the additional spellings. In addition to
    returning all spelling variants for all properties the full expansion
    will contain properties which are not guaranteed to be migration
    compatible. Thus, we need to combine both expansions. First we need to
    call the static expansion to limit the result to migratable properties.
    Then we can use the result of the static expansion as an input to the
    full expansion to get both canonical names and their aliases.

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
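
The two-step expansion described in the commit above can be sketched roughly as follows. The QMP command and the "static" and "full" expansion types are taken from the commit message; the function expand_cpu_model and the send callable (standing in for whatever transport delivers a QMP command and returns the decoded reply) are assumptions for this sketch, not libvirt code.

```python
def expand_cpu_model(send, model_name="host"):
    """Run a static expansion, then feed its result into a full expansion.

    send: hypothetical callable that takes a QMP command dict and
          returns the decoded reply dict.
    """
    # Step 1: static expansion limits the result to migratable properties.
    static_reply = send({
        "execute": "query-cpu-model-expansion",
        "arguments": {"type": "static", "model": {"name": model_name}},
    })
    static_model = static_reply["return"]["model"]

    # Step 2: full expansion of the static result adds the alias spellings.
    full_reply = send({
        "execute": "query-cpu-model-expansion",
        "arguments": {"type": "full", "model": static_model},
    })
    return full_reply["return"]["model"]
```

Passing the whole model returned by the static step (name plus props) into the full step is what keeps the final result restricted to the migratable set.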

commit 2f882dbfa92c14d585a786a42d284b63ffdca4e3
Refs: v3.1.0-19-g2f882dbfa
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Thu Feb 23 14:31:23 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    qemu: Make virQEMUCapsInitCPUModel testable

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit d065934cd07c01fbb29f25bbb223eb4ce126a90e
Refs: v3.1.0-20-gd065934cd
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Wed Feb 1 17:48:41 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Switch host CPU data scripts to model expansion

    Instantiating "host" CPU and querying it using qom-get has been the only
    way of probing host CPU via QEMU until 2.9.0 implemented
    query-cpu-model-expansion for x86_64. Even though libvirt never really
    used the old way its result can be easily converted into the one
    produced by query-cpu-model-expansion. Thus we can reuse the original
    test data and possibly get new data from hosts where QEMU does not
    support the new QMP command.

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit d46a1aa4d8caafe977cc41a80ef86af1d10e60b7
Refs: v3.1.0-21-gd46a1aa4d
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Mon Feb 13 14:59:42 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Convert all json data files to query-cpu-model-expansion

    Converted by running the following command, renaming the files as
    *.new, and committing only the *.new files.

        (cd tests/cputestdata; ./cpu-convert.py *.json)

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit a19696b5924e7512dcca4f30d15147036708389e
Refs: v3.1.0-22-ga19696b59
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Mon Feb 13 10:33:52 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Test virQEMUCapsInitCPUModel

    The original test didn't use family/model numbers to make better
    decisions about the CPU model and thus mis-detected the model in the two
    cases which are modified in this commit. The detected CPU models now
    match those obtained from raw CPUID data.

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit 5e4fc2ef993343643587f2b079b63f2c9f038e6f
Refs: v3.1.0-23-g5e4fc2ef9
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Mon Feb 13 15:04:38 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Drop obsolete CPU test data files

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>

commit 8907204cd83f0ca29c48d19bbf2778132d8578a2
Refs: v3.1.0-24-g8907204cd
Author:     Jiri Denemark <jdenemar@redhat.com>
AuthorDate: Mon Feb 13 15:06:35 2017 +0100
Commit:     Jiri Denemark <jdenemar@redhat.com>
CommitDate: Fri Mar 3 19:57:57 2017 +0100

    cputest: Drop .new suffix from CPU test data files

    Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
