Description of problem:
Failed to migrate the HE VM from a RHEL host to a RHVH host.

Version-Release number of selected component (if applicable):
Engine: 4.2.4.1-0.1.el7

Hosts:
1. RHEL - 7.5 - 8.el7
Kernel Version: 3.10.0 - 862.3.2.el7.x86_64
KVM Version: 2.10.0 - 21.el7_5.3
LIBVIRT Version: libvirt-3.9.0-14.el7_5.5
VDSM Version: vdsm-4.20.29-1.el7ev

2. RHV-H
OS Version: RHEL - 7.5 - 3.1.el7
OS Description: Red Hat Virtualization Host 4.2.3 (el7.5)
Kernel Version: 3.10.0 - 862.3.2.el7.x86_64
KVM Version: 2.10.0 - 21.el7_5.3
LIBVIRT Version: libvirt-3.9.0-14.el7_5.5
VDSM Version: vdsm-4.20.27.2-1.el7ev

Steps to Reproduce:
Migrate the HE VM from the RHEL host to the RHVH host.

vdsm log:

2018-06-05 05:50:46,803+0300 ERROR (migsrc/1f41e617) [virt.vm] (vmId='1f41e617-5e95-4086-aa86-d93205bf482e') operation failed: guest CPU doesn't match specification: missing features: spec-ctrl (migration:290)
2018-06-05 05:50:46,916+0300 INFO (jsonrpc/4) [jsonrpc.JsonRpcServer] RPC call Host.ping2 succeeded in 0.00 seconds (__init__:573)
2018-06-05 05:50:46,920+0300 DEBUG (jsonrpc/3) [storage.TaskManager.Task] (Task='47035432-9b48-4896-90f5-76b123d9fe17') moving from state finished -> state preparing (task:602)
2018-06-05 05:50:46,920+0300 INFO (jsonrpc/3) [vdsm.api] START repoStats(domains=['a2a6f15e-b73b-4f3d-81bb-e5ccb5a5376b']) from=::1,52926, task_id=47035432-9b48-4896-90f5-76b123d9fe17 (api:46)
2018-06-05 05:50:46,920+0300 INFO (jsonrpc/3) [vdsm.api] FINISH repoStats return={'a2a6f15e-b73b-4f3d-81bb-e5ccb5a5376b': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.000288341', 'lastCheck': '0.5', 'valid': True}} from=::1,52926, task_id=47035432-9b48-4896-90f5-76b123d9fe17 (api:52)
2018-06-05 05:50:46,920+0300 DEBUG (jsonrpc/3) [storage.TaskManager.Task] (Task='47035432-9b48-4896-90f5-76b123d9fe17') finished: {'a2a6f15e-b73b-4f3d-81bb-e5ccb5a5376b': {'code': 0, 'actual': True, 'version': 4, 'acquired': True, 'delay': '0.000288341', 'lastCheck': '0.5', 'valid': True}} (task:1201)
2018-06-05 05:50:46,920+0300 DEBUG (jsonrpc/3) [storage.TaskManager.Task] (Task='47035432-9b48-4896-90f5-76b123d9fe17') moving from state finished -> state finished (task:602)
2018-06-05 05:50:46,920+0300 DEBUG (jsonrpc/3) [storage.ResourceManager.Owner] Owner.releaseAll requests {} resources {} (resourceManager:910)
2018-06-05 05:50:46,921+0300 DEBUG (jsonrpc/3) [storage.ResourceManager.Owner] Owner.cancelAll requests {} (resourceManager:947)
2018-06-05 05:50:46,921+0300 DEBUG (jsonrpc/3) [storage.TaskManager.Task] (Task='47035432-9b48-4896-90f5-76b123d9fe17') ref 0 aborting False (task:1002)
2018-06-05 05:50:46,921+0300 INFO (jsonrpc/3) [jsonrpc.JsonRpcServer] RPC call Host.getStorageRepoStats succeeded in 0.00 seconds (__init__:573)
2018-06-05 05:50:46,930+0300 INFO (jsonrpc/7) [jsonrpc.JsonRpcServer] RPC call Host.ping2 succeeded in 0.00 seconds (__init__:573)
2018-06-05 05:50:47,893+0300 ERROR (migsrc/1f41e617) [virt.vm] (vmId='1f41e617-5e95-4086-aa86-d93205bf482e') Failed to migrate (migration:455)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 437, in _regular_run
    self._startUnderlyingMigration(time.time())
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 511, in _startUnderlyingMigration
    self._perform_with_downtime_thread(duri, muri)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 580, in _perform_with_downtime_thread
    self._perform_migration(duri, muri)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 529, in _perform_migration
    self._migration_flags)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 98, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1746, in migrateToURI3
    if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self)
libvirtError: operation failed: guest CPU doesn't match specification: missing features: spec-ctrl

engine log (correlation-id: vms_syncAction_3d82143b-da9f-4d63):

2018-06-05 05:46:19,570+03 INFO [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (default task-19) [vms_syncAction_3d82143b-da9f-4d63] Lock Acquired to object 'EngineLock:{exclusiveLocks='[1f41e617-5e95-4086-aa86-d93205bf482e=VM]', sharedLocks=''}'
2018-06-05 05:46:19,982+03 INFO [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (default task-19) [vms_syncAction_3d82143b-da9f-4d63] Running command: MigrateVmToServerCommand internal: false. Entities affected : ID: 1f41e617-5e95-4086-aa86-d93205bf482e Type: VMAction group MIGRATE_VM with role type USER
2018-06-05 05:46:20,084+03 INFO [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-19) [vms_syncAction_3d82143b-da9f-4d63] START, MigrateVDSCommand( MigrateVDSCommandParameters:{hostId='75a05865-47f1-4b6f-bcca-350e3369931e', vmId='1f41e617-5e95-4086-aa86-d93205bf482e', srcHost='lynx16.lab.eng.tlv2.redhat.com', dstVdsId='896e262b-9f5b-411e-8320-10e22689101e', dstHost='lynx17.lab.eng.tlv2.redhat.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='false', migrateCompressed='false', consoleAddress='null', maxBandwidth='5000', enableGuestEvents='false', maxIncomingMigrations='2', maxOutgoingMigrations='2', convergenceSchedule='null', dstQemu='10.46.16.32'}), log id: 5e1758bf
2018-06-05 05:46:20,087+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (default task-19) [vms_syncAction_3d82143b-da9f-4d63] START, MigrateBrokerVDSCommand(HostName = host_mixed_2, MigrateVDSCommandParameters:{hostId='75a05865-47f1-4b6f-bcca-350e3369931e', vmId='1f41e617-5e95-4086-aa86-d93205bf482e', srcHost='lynx16.lab.eng.tlv2.redhat.com', dstVdsId='896e262b-9f5b-411e-8320-10e22689101e', dstHost='lynx17.lab.eng.tlv2.redhat.com:54321', migrationMethod='ONLINE', tunnelMigration='false', migrationDowntime='0', autoConverge='false', migrateCompressed='false', consoleAddress='null', maxBandwidth='5000', enableGuestEvents='false', maxIncomingMigrations='2', maxOutgoingMigrations='2', convergenceSchedule='null', dstQemu='10.46.16.32'}), log id: 24987c0b
2018-06-05 05:46:21,094+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (default task-19) [vms_syncAction_3d82143b-da9f-4d63] FINISH, MigrateBrokerVDSCommand, log id: 24987c0b
2018-06-05 05:46:21,099+03 INFO [org.ovirt.engine.core.vdsbroker.MigrateVDSCommand] (default task-19) [vms_syncAction_3d82143b-da9f-4d63] FINISH, MigrateVDSCommand, return: MigratingFrom, log id: 5e1758bf
2018-06-05 05:46:21,114+03 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-19) [vms_syncAction_3d82143b-da9f-4d63] EVENT_ID: VM_MIGRATION_START(62), Migration started (VM: HostedEngine, Source: host_mixed_2, Destination: host_mixed_3, User: admin@internal-authz).
2018-06-05 05:46:23,058+03 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-8) [] VM '1f41e617-5e95-4086-aa86-d93205bf482e' was reported as Down on VDS '896e262b-9f5b-411e-8320-10e22689101e'(host_mixed_3)
2018-06-05 05:46:23,060+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-8) [] START, DestroyVDSCommand(HostName = host_mixed_3, DestroyVmVDSCommandParameters:{hostId='896e262b-9f5b-411e-8320-10e22689101e', vmId='1f41e617-5e95-4086-aa86-d93205bf482e', secondsToWait='0', gracefully='false', reason='', ignoreNoVm='true'}), log id: 46797286
2018-06-05 05:46:24,108+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-8) [] Failed to destroy VM '1f41e617-5e95-4086-aa86-d93205bf482e' because VM does not exist, ignoring
2018-06-05 05:46:24,108+03 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (ForkJoinPool-1-worker-8) [] FINISH, DestroyVDSCommand, log id: 46797286
2018-06-05 05:46:24,108+03 INFO [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-8) [] VM '1f41e617-5e95-4086-aa86-d93205bf482e'(HostedEngine) was unexpectedly detected as 'Down' on VDS '896e262b-9f5b-411e-8320-10e22689101e'(host_mixed_3) (expected on '75a05865-47f1-4b6f-bcca-350e3369931e')
2018-06-05 05:46:24,108+03 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-8) [] Migration of VM 'HostedEngine' to host 'host_mixed_3' failed: VM destroyed during the startup.
Additional info: Host info CPU and capabilities (source rhel host) #cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 62 model name : Intel(R) Xeon(R) CPU E5-2603 v2 @ 1.80GHz stepping : 4 microcode : 0x42c cpu MHz : 1800.219 cache size : 10240 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 4 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt ibpb ibrs stibp dtherm arat pln pts spec_ctrl intel_stibp bogomips : 3599.99 clflush size : 64 cache_alignment : 64 address sizes : 46 bits physical, 48 bits virtual power management: virsh -r capabilities <capabilities> <host> <uuid>c322929e-3a01-4af0-aba3-45e4af67c073</uuid> <cpu> <arch>x86_64</arch> <model>IvyBridge-IBRS</model> <vendor>Intel</vendor> <microcode version='1068'/> <topology sockets='1' cores='4' threads='1'/> <feature name='ds'/> <feature name='acpi'/> <feature name='ss'/> <feature name='ht'/> <feature name='tm'/> <feature name='pbe'/> <feature name='dtes64'/> <feature name='monitor'/> <feature name='ds_cpl'/> <feature name='vmx'/> <feature name='smx'/> <feature name='est'/> <feature name='tm2'/> <feature name='xtpr'/> <feature name='pdcm'/> <feature name='pcid'/> <feature name='dca'/> <feature name='osxsave'/> <feature name='arat'/> <feature name='stibp'/> <feature name='xsaveopt'/> <feature name='pdpe1gb'/> <feature name='invtsc'/> <pages unit='KiB' size='4'/> <pages unit='KiB' size='1048576'/> </cpu> <power_management> <suspend_mem/> <suspend_disk/> 
<suspend_hybrid/> </power_management> <migration_features> <live/> <uri_transports> <uri_transport>tcp</uri_transport> <uri_transport>rdma</uri_transport> </uri_transports> </migration_features> <topology> <cells num='2'> <cell id='0'> <memory unit='KiB'>33503860</memory> <pages unit='KiB' size='4'>8375965</pages> <pages unit='KiB' size='1048576'>4</pages> <distances> <sibling id='0' value='10'/> <sibling id='1' value='21'/> </distances> <cpus num='4'> <cpu id='0' socket_id='0' core_id='0' siblings='0'/> <cpu id='1' socket_id='0' core_id='1' siblings='1'/> <cpu id='2' socket_id='0' core_id='2' siblings='2'/> <cpu id='3' socket_id='0' core_id='3' siblings='3'/> </cpus> </cell> <cell id='1'> <pages unit='KiB' size='4'>0</pages> <distances> <sibling id='0' value='21'/> <sibling id='1' value='10'/> </distances> <cpus num='4'> <cpu id='4' socket_id='1' core_id='0' siblings='4'/> <cpu id='5' socket_id='1' core_id='1' siblings='5'/> <cpu id='6' socket_id='1' core_id='2' siblings='6'/> <cpu id='7' socket_id='1' core_id='3' siblings='7'/> </cpus> </cell> </cells> </topology> <cache> <bank id='0' level='3' type='both' size='10' unit='MiB' cpus='0-3'/> <bank id='1' level='3' type='both' size='10' unit='MiB' cpus='4-7'/> </cache> <secmodel> <model>selinux</model> <doi>0</doi> <baselabel type='kvm'>system_u:system_r:svirt_t:s0</baselabel> <baselabel type='qemu'>system_u:system_r:svirt_tcg_t:s0</baselabel> </secmodel> <secmodel> <model>dac</model> <doi>0</doi> <baselabel type='kvm'>+107:+107</baselabel> <baselabel type='qemu'>+107:+107</baselabel> </secmodel> </host> <guest> <os_type>hvm</os_type> <arch name='i686'> <wordsize>32</wordsize> <emulator>/usr/libexec/qemu-kvm</emulator> <machine maxCpus='240'>pc-i440fx-rhel7.5.0</machine> <machine canonical='pc-i440fx-rhel7.5.0' maxCpus='240'>pc</machine> <machine maxCpus='240'>pc-i440fx-rhel7.0.0</machine> <machine maxCpus='240'>rhel6.3.0</machine> <machine maxCpus='240'>rhel6.4.0</machine> <machine maxCpus='240'>rhel6.0.0</machine> 
<machine maxCpus='240'>pc-i440fx-rhel7.1.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.2.0</machine> <machine maxCpus='255'>pc-q35-rhel7.3.0</machine> <machine maxCpus='240'>rhel6.5.0</machine> <machine maxCpus='384'>pc-q35-rhel7.4.0</machine> <machine maxCpus='240'>rhel6.6.0</machine> <machine maxCpus='240'>rhel6.1.0</machine> <machine maxCpus='240'>rhel6.2.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.3.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.4.0</machine> <machine maxCpus='384'>pc-q35-rhel7.5.0</machine> <machine canonical='pc-q35-rhel7.5.0' maxCpus='384'>q35</machine> <domain type='qemu'/> <domain type='kvm'> <emulator>/usr/libexec/qemu-kvm</emulator> </domain> </arch> <features> <cpuselection/> <deviceboot/> <disksnapshot default='on' toggle='no'/> <acpi default='on' toggle='yes'/> <apic default='on' toggle='no'/> <pae/> <nonpae/> </features> </guest> <guest> <os_type>hvm</os_type> <arch name='x86_64'> <wordsize>64</wordsize> <emulator>/usr/libexec/qemu-kvm</emulator> <machine maxCpus='240'>pc-i440fx-rhel7.5.0</machine> <machine canonical='pc-i440fx-rhel7.5.0' maxCpus='240'>pc</machine> <machine maxCpus='240'>pc-i440fx-rhel7.0.0</machine> <machine maxCpus='240'>rhel6.3.0</machine> <machine maxCpus='240'>rhel6.4.0</machine> <machine maxCpus='240'>rhel6.0.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.1.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.2.0</machine> <machine maxCpus='255'>pc-q35-rhel7.3.0</machine> <machine maxCpus='240'>rhel6.5.0</machine> <machine maxCpus='384'>pc-q35-rhel7.4.0</machine> <machine maxCpus='240'>rhel6.6.0</machine> <machine maxCpus='240'>rhel6.1.0</machine> <machine maxCpus='240'>rhel6.2.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.3.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.4.0</machine> <machine maxCpus='384'>pc-q35-rhel7.5.0</machine> <machine canonical='pc-q35-rhel7.5.0' maxCpus='384'>q35</machine> <domain type='qemu'/> <domain type='kvm'> <emulator>/usr/libexec/qemu-kvm</emulator> 
</domain> </arch> <features> <cpuselection/> <deviceboot/> <disksnapshot default='on' toggle='no'/> <acpi default='on' toggle='yes'/> <apic default='on' toggle='no'/> </features> </guest> </capabilities> Host info CPU and capabilities (destination rhvh host) # cat /proc/cpuinfo | more processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 62 model name : Intel(R) Xeon(R) CPU E5-2603 v2 @ 1.80GHz stepping : 4 microcode : 0x428 cpu MHz : 1800.000 cache size : 10240 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 4 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm arat pln pts bogomips : 3600.02 clflush size : 64 cache_alignment : 64 address sizes : 46 bits physical, 48 bits virtual power management: # virsh -r capabilities <capabilities> <host> <uuid>705895e1-e886-4358-9c25-d84c1bfc9f47</uuid> <cpu> <arch>x86_64</arch> <model>IvyBridge</model> <vendor>Intel</vendor> <microcode version='1064'/> <topology sockets='1' cores='4' threads='1'/> <feature name='ds'/> <feature name='acpi'/> <feature name='ss'/> <feature name='ht'/> <feature name='tm'/> <feature name='pbe'/> <feature name='dtes64'/> <feature name='monitor'/> <feature name='ds_cpl'/> <feature name='vmx'/> <feature name='smx'/> <feature name='est'/> <feature name='tm2'/> <feature name='xtpr'/> <feature name='pdcm'/> <feature name='pcid'/> <feature name='dca'/> <feature name='osxsave'/> <feature name='arat'/> <feature name='xsaveopt'/> <feature name='pdpe1gb'/> <feature 
name='invtsc'/> <pages unit='KiB' size='4'/> <pages unit='KiB' size='2048'/> <pages unit='KiB' size='1048576'/> </cpu> <power_management> <suspend_mem/> <suspend_disk/> <suspend_hybrid/> </power_management> <migration_features> <live/> <uri_transports> <uri_transport>tcp</uri_transport> <uri_transport>rdma</uri_transport> </uri_transports> </migration_features> <topology> <cells num='2'> <cell id='0'> <memory unit='KiB'>33503860</memory> <pages unit='KiB' size='4'>8375965</pages> <pages unit='KiB' size='2048'>0</pages> <pages unit='KiB' size='1048576'>4</pages> <distances> <sibling id='0' value='10'/> <sibling id='1' value='21'/> </distances> <cpus num='4'> <cpu id='0' socket_id='0' core_id='0' siblings='0'/> <cpu id='1' socket_id='0' core_id='1' siblings='1'/> <cpu id='2' socket_id='0' core_id='2' siblings='2'/> <cpu id='3' socket_id='0' core_id='3' siblings='3'/> </cpus> </cell> <cell id='1'> <pages unit='KiB' size='4'>0</pages> <distances> <sibling id='0' value='21'/> <sibling id='1' value='10'/> </distances> <cpus num='4'> <cpu id='4' socket_id='1' core_id='0' siblings='4'/> <cpu id='5' socket_id='1' core_id='1' siblings='5'/> <cpu id='6' socket_id='1' core_id='2' siblings='6'/> <cpu id='7' socket_id='1' core_id='3' siblings='7'/> </cpus> </cell> </cells> </topology> <cache> <bank id='0' level='3' type='both' size='10' unit='MiB' cpus='0-3'/> <bank id='1' level='3' type='both' size='10' unit='MiB' cpus='4-7'/> </cache> <secmodel> <model>selinux</model> <doi>0</doi> <baselabel type='kvm'>system_u:system_r:svirt_t:s0</baselabel> <baselabel type='qemu'>system_u:system_r:svirt_tcg_t:s0</baselabel> </secmodel> <secmodel> <model>dac</model> <doi>0</doi> <baselabel type='kvm'>+107:+107</baselabel> <baselabel type='qemu'>+107:+107</baselabel> </secmodel> </host> <guest> <os_type>hvm</os_type> <arch name='i686'> <wordsize>32</wordsize> <emulator>/usr/libexec/qemu-kvm</emulator> <machine maxCpus='240'>pc-i440fx-rhel7.5.0</machine> <machine canonical='pc-i440fx-rhel7.5.0' 
maxCpus='240'>pc</machine> <machine maxCpus='240'>pc-i440fx-rhel7.0.0</machine> <machine maxCpus='240'>rhel6.3.0</machine> <machine maxCpus='240'>rhel6.4.0</machine> <machine maxCpus='240'>rhel6.0.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.1.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.2.0</machine> <machine maxCpus='255'>pc-q35-rhel7.3.0</machine> <machine maxCpus='240'>rhel6.5.0</machine> <machine maxCpus='384'>pc-q35-rhel7.4.0</machine> <machine maxCpus='240'>rhel6.6.0</machine> <machine maxCpus='240'>rhel6.1.0</machine> <machine maxCpus='240'>rhel6.2.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.3.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.4.0</machine> <machine maxCpus='384'>pc-q35-rhel7.5.0</machine> <machine canonical='pc-q35-rhel7.5.0' maxCpus='384'>q35</machine> <domain type='qemu'/> <domain type='kvm'> <emulator>/usr/libexec/qemu-kvm</emulator> </domain> </arch> <features> <cpuselection/> <deviceboot/> <disksnapshot default='on' toggle='no'/> <acpi default='on' toggle='yes'/> <apic default='on' toggle='no'/> <pae/> <nonpae/> </features> </guest> <guest> <os_type>hvm</os_type> <arch name='x86_64'> <wordsize>64</wordsize> <emulator>/usr/libexec/qemu-kvm</emulator> <machine maxCpus='240'>pc-i440fx-rhel7.5.0</machine> <machine canonical='pc-i440fx-rhel7.5.0' maxCpus='240'>pc</machine> <machine maxCpus='240'>pc-i440fx-rhel7.0.0</machine> <machine maxCpus='240'>rhel6.3.0</machine> <machine maxCpus='240'>rhel6.4.0</machine> <machine maxCpus='240'>rhel6.0.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.1.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.2.0</machine> <machine maxCpus='255'>pc-q35-rhel7.3.0</machine> <machine maxCpus='240'>rhel6.5.0</machine> <machine maxCpus='384'>pc-q35-rhel7.4.0</machine> <machine maxCpus='240'>rhel6.6.0</machine> <machine maxCpus='240'>rhel6.1.0</machine> <machine maxCpus='240'>rhel6.2.0</machine> <machine maxCpus='240'>pc-i440fx-rhel7.3.0</machine> <machine 
maxCpus='240'>pc-i440fx-rhel7.4.0</machine> <machine maxCpus='384'>pc-q35-rhel7.5.0</machine> <machine canonical='pc-q35-rhel7.5.0' maxCpus='384'>q35</machine> <domain type='qemu'/> <domain type='kvm'> <emulator>/usr/libexec/qemu-kvm</emulator> </domain> </arch> <features> <cpuselection/> <deviceboot/> <disksnapshot default='on' toggle='no'/> <acpi default='on' toggle='yes'/> <apic default='on' toggle='no'/> </features> </guest> </capabilities>
Created attachment 1447761 [details] engine log
Created attachment 1447762 [details] source host log
For more info see: https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-runner-tier1/151/
Log: https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/rhv-4.2-ge-runner-tier1/151/artifact/
Israel, has the destination CPU been patched with IBRS support?
See the output of 'virsh -r capabilities' for both hosts:
The source host is with IBRS: <model>IvyBridge-IBRS</model>
The destination host (the RHEVH) is without IBRS: <model>IvyBridge</model>
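The model comparison can be done programmatically against the two capabilities dumps. A minimal sketch (the helper name and the trimmed-down XML samples are illustrative, not taken verbatim from the hosts):

```python
import xml.etree.ElementTree as ET

def host_cpu_model(capabilities_xml):
    """Extract <host><cpu><model> from `virsh -r capabilities` output."""
    return ET.fromstring(capabilities_xml).findtext("./host/cpu/model")

# Trimmed-down stand-ins for the full capabilities XML quoted in this bug:
src = "<capabilities><host><cpu><model>IvyBridge-IBRS</model></cpu></host></capabilities>"
dst = "<capabilities><host><cpu><model>IvyBridge</model></cpu></host></capabilities>"
print(host_cpu_model(src))  # IvyBridge-IBRS
print(host_cpu_model(dst))  # IvyBridge
```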
(In reply to Israel Pinto from comment #5) > See output of 'virsh -r capabilities' for both host: > The source host is with IBRS: <model>IvyBridge-IBRS</model> > The destination host (the RHEVH) is without IBRS: <model>IvyBridge</model> Cluster is 'Intel Westmere Family'
(In reply to Israel Pinto from comment #5)
> See output of 'virsh -r capabilities' for both host:
> The source host is with IBRS: <model>IvyBridge-IBRS</model>
> The destination host (the RHEVH) is without IBRS: <model>IvyBridge</model>

Yes, that was clear from the CPU flags:

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt ibpb ibrs stibp dtherm arat pln pts spec_ctrl intel_stibp

vs.:

flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm epb tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm arat pln pts

(note the last flags in each list). But the issue is with the HE VM and how it is configured. Can you check the shared configuration?
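The difference between the two flag lists can be computed directly. A quick sketch using just the tails of the two flag lines above (truncated for brevity):

```python
# Tails of the source and destination `flags` lines quoted above (truncated):
src_flags = set("xsaveopt ibpb ibrs stibp dtherm arat pln pts spec_ctrl intel_stibp".split())
dst_flags = set("xsaveopt dtherm arat pln pts".split())

# Flags the source host exposes that the destination host lacks:
missing = sorted(src_flags - dst_flags)
print(missing)  # ['ibpb', 'ibrs', 'intel_stibp', 'spec_ctrl', 'stibp']
```

Feeding it the full flag lines gives the same answer: the destination is missing exactly the Spectre-mitigation flags, matching the libvirt error about spec-ctrl.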
From the guest I see that 'spec_ctrl' is set and the model is like the host's, but in the UI the Guest CPU Type is: Intel Westmere Family (is this also a BZ?)

[root@hosted-engine-02 ~]# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel Xeon E312xx (Sandy Bridge)
stepping : 1
microcode : 0x1
cpu MHz : 1799.999
cache size : 16384 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology eagerfpu pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx hypervisor lahf_lm xsaveopt ibpb ibrs arat spec_ctrl
bogomips : 3599.99
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
[root@lynx14 ~]# cat /run/ovirt-hosted-engine-ha/vm.conf # Editing the hosted engine VM is only possible via the manager UI\API # This file was generated at Tue Jun 5 17:04:36 2018 cpuType=Westmere emulatedMachine=pc-i440fx-rhel7.5.0 vmId=1f41e617-5e95-4086-aa86-d93205bf482e smp=4 memSize=16384 maxVCpus=64 spiceSecureChannels=smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir xmlBase64=PD94bWwgdmVyc2lvbj0nMS4wJyBlbmNvZGluZz0nVVRGLTgnPz4KPGRvbWFpbiB4bWxuczpvdmlydC10dW5lPSJodHRwOi8vb3ZpcnQub3JnL3ZtL3R1bmUvMS4wIiB4bWxuczpvdmlydC12bT0iaHR0cDovL292aXJ0Lm9yZy92bS8xLjAiIHR5cGU9Imt2bSI+PG5hbWU+SG9zdGVkRW5naW5lPC9uYW1lPjx1dWlkPjFmNDFlNjE3LTVlOTUtNDA4Ni1hYTg2LWQ5MzIwNWJmNDgyZTwvdXVpZD48bWVtb3J5PjE2Nzc3MjE2PC9tZW1vcnk+PGN1cnJlbnRNZW1vcnk+MTY3NzcyMTY8L2N1cnJlbnRNZW1vcnk+PG1heE1lbW9yeSBzbG90cz0iMTYiPjY3MTA4ODY0PC9tYXhNZW1vcnk+PHZjcHUgY3VycmVudD0iNCI+NjQ8L3ZjcHU+PHN5c2luZm8gdHlwZT0ic21iaW9zIj48c3lzdGVtPjxlbnRyeSBuYW1lPSJtYW51ZmFjdHVyZXIiPm9WaXJ0PC9lbnRyeT48ZW50cnkgbmFtZT0icHJvZHVjdCI+T1MtTkFNRTo8L2VudHJ5PjxlbnRyeSBuYW1lPSJ2ZXJzaW9uIj5PUy1WRVJTSU9OOjwvZW50cnk+PGVudHJ5IG5hbWU9InNlcmlhbCI+SE9TVC1TRVJJQUw6PC9lbnRyeT48ZW50cnkgbmFtZT0idXVpZCI+MWY0MWU2MTctNWU5NS00MDg2LWFhODYtZDkzMjA1YmY0ODJlPC9lbnRyeT48L3N5c3RlbT48L3N5c2luZm8+PGNsb2NrIG9mZnNldD0idmFyaWFibGUiIGFkanVzdG1lbnQ9IjAiPjx0aW1lciBuYW1lPSJydGMiIHRpY2twb2xpY3k9ImNhdGNodXAiLz48dGltZXIgbmFtZT0icGl0IiB0aWNrcG9saWN5PSJkZWxheSIvPjx0aW1lciBuYW1lPSJocGV0IiBwcmVzZW50PSJubyIvPjwvY2xvY2s+PGZlYXR1cmVzPjxhY3BpLz48dm1jb3JlaW5mby8+PC9mZWF0dXJlcz48Y3B1IG1hdGNoPSJleGFjdCI+PG1vZGVsPldlc3RtZXJlPC9tb2RlbD48dG9wb2xvZ3kgY29yZXM9IjQiIHRocmVhZHM9IjEiIHNvY2tldHM9IjE2Ii8+PG51bWE+PGNlbGwgaWQ9IjAiIGNwdXM9IjAsMSwyLDMiIG1lbW9yeT0iMTY3NzcyMTYiLz48L251bWE+PC9jcHU+PGNwdXR1bmUvPjxkZXZpY2VzPjxpbnB1dCB0eXBlPSJtb3VzZSIgYnVzPSJwczIiLz48Y2hhbm5lbCB0eXBlPSJ1bml4Ij48dGFyZ2V0IHR5cGU9InZpcnRpbyIgbmFtZT0ib3ZpcnQtZ3Vlc3QtYWdlbnQuMCIvPjxzb3VyY2UgbW9kZT0iYmluZCIgcGF0aD0iL3Zhci9saWIvbGlidmlydC9xZW11L2NoYW5uZWxzLzFmNDFlNjE3LTVlOTUtNDA4Ni1hYTg2LWQ5MzIwNWJmNDgy
ZS5vdmlydC1ndWVzdC1hZ2VudC4wIi8+PC9jaGFubmVsPjxjaGFubmVsIHR5cGU9InVuaXgiPjx0YXJnZXQgdHlwZT0idmlydGlvIiBuYW1lPSJvcmcucWVtdS5ndWVzdF9hZ2VudC4wIi8+PHNvdXJjZSBtb2RlPSJiaW5kIiBwYXRoPSIvdmFyL2xpYi9saWJ2aXJ0L3FlbXUvY2hhbm5lbHMvMWY0MWU2MTctNWU5NS00MDg2LWFhODYtZDkzMjA1YmY0ODJlLm9yZy5xZW11Lmd1ZXN0X2FnZW50LjAiLz48L2NoYW5uZWw+PGNvbnRyb2xsZXIgdHlwZT0ic2NzaSIgbW9kZWw9InZpcnRpby1zY3NpIiBpbmRleD0iMCI+PGFsaWFzIG5hbWU9InVhLTAzYmM3ZWRiLTJiYTQtNDhhYy04YzVkLWU4Njk5NGRiZWI2NyIvPjxhZGRyZXNzIGJ1cz0iMHgwMCIgZG9tYWluPSIweDAwMDAiIGZ1bmN0aW9uPSIweDAiIHNsb3Q9IjB4MDUiIHR5cGU9InBjaSIvPjwvY29udHJvbGxlcj48Z3JhcGhpY3MgdHlwZT0ic3BpY2UiIHBvcnQ9Ii0xIiBhdXRvcG9ydD0ieWVzIiBwYXNzd2Q9IioqKioqIiBwYXNzd2RWYWxpZFRvPSIxOTcwLTAxLTAxVDAwOjAwOjAxIiB0bHNQb3J0PSItMSI+PGNoYW5uZWwgbmFtZT0ibWFpbiIgbW9kZT0ic2VjdXJlIi8+PGNoYW5uZWwgbmFtZT0iaW5wdXRzIiBtb2RlPSJzZWN1cmUiLz48Y2hhbm5lbCBuYW1lPSJjdXJzb3IiIG1vZGU9InNlY3VyZSIvPjxjaGFubmVsIG5hbWU9InBsYXliYWNrIiBtb2RlPSJzZWN1cmUiLz48Y2hhbm5lbCBuYW1lPSJyZWNvcmQiIG1vZGU9InNlY3VyZSIvPjxjaGFubmVsIG5hbWU9ImRpc3BsYXkiIG1vZGU9InNlY3VyZSIvPjxjaGFubmVsIG5hbWU9InNtYXJ0Y2FyZCIgbW9kZT0ic2VjdXJlIi8+PGNoYW5uZWwgbmFtZT0idXNicmVkaXIiIG1vZGU9InNlY3VyZSIvPjxsaXN0ZW4gdHlwZT0ibmV0d29yayIgbmV0d29yaz0idmRzbS1vdmlydG1nbXQiLz48L2dyYXBoaWNzPjxjb250cm9sbGVyIHR5cGU9ImlkZSIgaW5kZXg9IjAiPjxhZGRyZXNzIGJ1cz0iMHgwMCIgZG9tYWluPSIweDAwMDAiIGZ1bmN0aW9uPSIweDEiIHNsb3Q9IjB4MDEiIHR5cGU9InBjaSIvPjwvY29udHJvbGxlcj48Y29udHJvbGxlciB0eXBlPSJ2aXJ0aW8tc2VyaWFsIiBpbmRleD0iMCIgcG9ydHM9IjE2Ij48YWxpYXMgbmFtZT0idWEtN2I1NTg0NjMtZTAyZC00Y2JlLWI5NGUtYzM4ZWMzZGJjYTI0Ii8+PGFkZHJlc3MgYnVzPSIweDAwIiBkb21haW49IjB4MDAwMCIgZnVuY3Rpb249IjB4MCIgc2xvdD0iMHgwNiIgdHlwZT0icGNpIi8+PC9jb250cm9sbGVyPjxncmFwaGljcyB0eXBlPSJ2bmMiIHBvcnQ9Ii0xIiBhdXRvcG9ydD0ieWVzIiBwYXNzd2Q9IioqKioqIiBwYXNzd2RWYWxpZFRvPSIxOTcwLTAxLTAxVDAwOjAwOjAxIiBrZXltYXA9ImVuLXVzIj48bGlzdGVuIHR5cGU9Im5ldHdvcmsiIG5ldHdvcms9InZkc20tb3ZpcnRtZ210Ii8+PC9ncmFwaGljcz48cm5nIG1vZGVsPSJ2aXJ0aW8iPjxiYWNrZW5kIG1vZGVsPSJyYW5kb20iPi9kZXYvdXJhbmRvbTwvYmFja2VuZD48YWxpYXMgbmFtZT0idWEtODIyNTgwOGItZjQx
Ny00YWEyLWIwNjAtYzkyZWJhOGZiMDc4Ii8+PC9ybmc+PHNvdW5kIG1vZGVsPSJpY2g2Ij48YWxpYXMgbmFtZT0idWEtODJjMzlkOTEtNmM0YS00NTUxLWFmMmMtODk2ODg2M2M1MmM0Ii8+PGFkZHJlc3MgYnVzPSIweDAwIiBkb21haW49IjB4MDAwMCIgZnVuY3Rpb249IjB4MCIgc2xvdD0iMHgwNCIgdHlwZT0icGNpIi8+PC9zb3VuZD48Y29udHJvbGxlciB0eXBlPSJ1c2IiIG1vZGVsPSJwaWl4My11aGNpIiBpbmRleD0iMCI+PGFkZHJlc3MgYnVzPSIweDAwIiBkb21haW49IjB4MDAwMCIgZnVuY3Rpb249IjB4MiIgc2xvdD0iMHgwMSIgdHlwZT0icGNpIi8+PC9jb250cm9sbGVyPjx2aWRlbz48bW9kZWwgdHlwZT0icXhsIiB2cmFtPSIzMjc2OCIgaGVhZHM9IjEiIHJhbT0iNjU1MzYiIHZnYW1lbT0iMTYzODQiLz48YWxpYXMgbmFtZT0idWEtYTkzYzNkMjYtYzMwZi00YzA4LWJjNDMtYjAzODdiZGZmNDgyIi8+PGFkZHJlc3MgYnVzPSIweDAwIiBkb21haW49IjB4MDAwMCIgZnVuY3Rpb249IjB4MCIgc2xvdD0iMHgwMiIgdHlwZT0icGNpIi8+PC92aWRlbz48bWVtYmFsbG9vbiBtb2RlbD0idmlydGlvIj48c3RhdHMgcGVyaW9kPSI1Ii8+PGFsaWFzIG5hbWU9InVhLWRmZjdkYjcyLWJjNGEtNGM2Mi1hZGY2LTQwZDUxOTgwNWEzMCIvPjxhZGRyZXNzIGJ1cz0iMHgwMCIgZG9tYWluPSIweDAwMDAiIGZ1bmN0aW9uPSIweDAiIHNsb3Q9IjB4MDgiIHR5cGU9InBjaSIvPjwvbWVtYmFsbG9vbj48Y2hhbm5lbCB0eXBlPSJzcGljZXZtYyI+PHRhcmdldCB0eXBlPSJ2aXJ0aW8iIG5hbWU9ImNvbS5yZWRoYXQuc3BpY2UuMCIvPjwvY2hhbm5lbD48aW50ZXJmYWNlIHR5cGU9ImJyaWRnZSI+PG1vZGVsIHR5cGU9InZpcnRpbyIvPjxsaW5rIHN0YXRlPSJ1cCIvPjxzb3VyY2UgYnJpZGdlPSJvdmlydG1nbXQiLz48YWxpYXMgbmFtZT0idWEtNWQzOGNkYzEtOTAyMy00ODA4LTg1MjktZTk1ZTY5MmZlYjViIi8+PGFkZHJlc3MgYnVzPSIweDAwIiBkb21haW49IjB4MDAwMCIgZnVuY3Rpb249IjB4MCIgc2xvdD0iMHgwMyIgdHlwZT0icGNpIi8+PG1hYyBhZGRyZXNzPSIwMDoxNjozZTo3YjplMDowNyIvPjxmaWx0ZXJyZWYgZmlsdGVyPSJ2ZHNtLW5vLW1hYy1zcG9vZmluZyIvPjxiYW5kd2lkdGgvPjwvaW50ZXJmYWNlPjxkaXNrIHR5cGU9ImZpbGUiIGRldmljZT0iY2Ryb20iIHNuYXBzaG90PSJubyI+PGRyaXZlciBuYW1lPSJxZW11IiB0eXBlPSJyYXciIGVycm9yX3BvbGljeT0icmVwb3J0Ii8+PHNvdXJjZSBmaWxlPSIiIHN0YXJ0dXBQb2xpY3k9Im9wdGlvbmFsIi8+PHRhcmdldCBkZXY9ImhkYyIgYnVzPSJpZGUiLz48cmVhZG9ubHkvPjxhbGlhcyBuYW1lPSJ1YS1jY2NiODIyNS1mZWZlLTQzMGEtODhlMS0wNjMyODgwZWY5ZmQiLz48YWRkcmVzcyBidXM9IjEiIGNvbnRyb2xsZXI9IjAiIHVuaXQ9IjAiIHR5cGU9ImRyaXZlIiB0YXJnZXQ9IjAiLz48L2Rpc2s+PGRpc2sgc25hcHNob3Q9Im5vIiB0eXBlPSJmaWxlIiBkZXZpY2U9ImRpc2siPjx0
YXJnZXQgZGV2PSJ2ZGEiIGJ1cz0idmlydGlvIi8+PHNvdXJjZSBmaWxlPSIvcmhldi9kYXRhLWNlbnRlci8wMDAwMDAwMC0wMDAwLTAwMDAtMDAwMC0wMDAwMDAwMDAwMDAvYTJhNmYxNWUtYjczYi00ZjNkLTgxYmItZTVjY2I1YTUzNzZiL2ltYWdlcy84NGFmYjBlNS1iNGExLTQ5MjYtODI3NC0yNmJmYmE4ZjUwNmYvNThmNTQ5ZDAtNTQ2Yy00Mzg5LWEwODUtYTcyODc1NTBiZDcxIi8+PGRyaXZlciBuYW1lPSJxZW11IiBpbz0idGhyZWFkcyIgdHlwZT0icmF3IiBlcnJvcl9wb2xpY3k9InN0b3AiIGNhY2hlPSJub25lIi8+PGFsaWFzIG5hbWU9InVhLTg0YWZiMGU1LWI0YTEtNDkyNi04Mjc0LTI2YmZiYThmNTA2ZiIvPjxhZGRyZXNzIGJ1cz0iMHgwMCIgZG9tYWluPSIweDAwMDAiIGZ1bmN0aW9uPSIweDAiIHNsb3Q9IjB4MDciIHR5cGU9InBjaSIvPjxzZXJpYWw+ODRhZmIwZTUtYjRhMS00OTI2LTgyNzQtMjZiZmJhOGY1MDZmPC9zZXJpYWw+PC9kaXNrPjxsZWFzZT48a2V5PjU4ZjU0OWQwLTU0NmMtNDM4OS1hMDg1LWE3Mjg3NTUwYmQ3MTwva2V5Pjxsb2Nrc3BhY2U+YTJhNmYxNWUtYjczYi00ZjNkLTgxYmItZTVjY2I1YTUzNzZiPC9sb2Nrc3BhY2U+PHRhcmdldCBvZmZzZXQ9IkxFQVNFLU9GRlNFVDo1OGY1NDlkMC01NDZjLTQzODktYTA4NS1hNzI4NzU1MGJkNzE6YTJhNmYxNWUtYjczYi00ZjNkLTgxYmItZTVjY2I1YTUzNzZiIiBwYXRoPSJMRUFTRS1QQVRIOjU4ZjU0OWQwLTU0NmMtNDM4OS1hMDg1LWE3Mjg3NTUwYmQ3MTphMmE2ZjE1ZS1iNzNiLTRmM2QtODFiYi1lNWNjYjVhNTM3NmIiLz48L2xlYXNlPjwvZGV2aWNlcz48cG0+PHN1c3BlbmQtdG8tZGlzayBlbmFibGVkPSJubyIvPjxzdXNwZW5kLXRvLW1lbSBlbmFibGVkPSJubyIvPjwvcG0+PG9zPjx0eXBlIGFyY2g9Ing4Nl82NCIgbWFjaGluZT0icGMtaTQ0MGZ4LXJoZWw3LjUuMCI+aHZtPC90eXBlPjxzbWJpb3MgbW9kZT0ic3lzaW5mbyIvPjwvb3M+PG1ldGFkYXRhPjxvdmlydC10dW5lOnFvcy8+PG92aXJ0LXZtOnZtPjxtaW5HdWFyYW50ZWVkTWVtb3J5TWIgdHlwZT0iaW50Ij4xMDI0PC9taW5HdWFyYW50ZWVkTWVtb3J5TWI+PGNsdXN0ZXJWZXJzaW9uPjQuMjwvY2x1c3RlclZlcnNpb24+PG92aXJ0LXZtOmN1c3RvbS8+PG92aXJ0LXZtOmRldmljZSBtYWNfYWRkcmVzcz0iMDA6MTY6M2U6N2I6ZTA6MDciPjxvdmlydC12bTpjdXN0b20vPjwvb3ZpcnQtdm06ZGV2aWNlPjxvdmlydC12bTpkZXZpY2UgZGV2dHlwZT0iZGlzayIgbmFtZT0idmRhIj48b3ZpcnQtdm06cG9vbElEPjAwMDAwMDAwLTAwMDAtMDAwMC0wMDAwLTAwMDAwMDAwMDAwMDwvb3ZpcnQtdm06cG9vbElEPjxvdmlydC12bTp2b2x1bWVJRD41OGY1NDlkMC01NDZjLTQzODktYTA4NS1hNzI4NzU1MGJkNzE8L292aXJ0LXZtOnZvbHVtZUlEPjxvdmlydC12bTpzaGFyZWQ+ZXhjbHVzaXZlPC9vdmlydC12bTpzaGFyZWQ+PG92aXJ0LXZtOmltYWdlSUQ+ODRhZmIwZTUtYjRhMS00OTI2LTgyNzQtMjZiZmJh
OGY1MDZmPC9vdmlydC12bTppbWFnZUlEPjxvdmlydC12bTpkb21haW5JRD5hMmE2ZjE1ZS1iNzNiLTRmM2QtODFiYi1lNWNjYjVhNTM3NmI8L292aXJ0LXZtOmRvbWFpbklEPjwvb3ZpcnQtdm06ZGV2aWNlPjxsYXVuY2hQYXVzZWQ+ZmFsc2U8L2xhdW5jaFBhdXNlZD48cmVzdW1lQmVoYXZpb3I+YXV0b19yZXN1bWU8L3Jlc3VtZUJlaGF2aW9yPjwvb3ZpcnQtdm06dm0+PC9tZXRhZGF0YT48L2RvbWFpbj4= vmName=HostedEngine display=qxl devices={index:0,iface:virtio,format:raw,bootOrder:1,address:{type:pci,slot:0x07,bus:0x00,domain:0x0000,function:0x0},volumeID:58f549d0-546c-4389-a085-a7287550bd71,imageID:84afb0e5-b4a1-4926-8274-26bfba8f506f,readonly:false,domainID:a2a6f15e-b73b-4f3d-81bb-e5ccb5a5376b,deviceId:84afb0e5-b4a1-4926-8274-26bfba8f506f,poolID:00000000-0000-0000-0000-000000000000,device:disk,shared:exclusive,propagateErrors:off,type:disk} devices={nicModel:pv,macAddr:00:16:3e:7b:e0:07,linkActive:true,network:ovirtmgmt,deviceId:5d38cdc1-9023-4808-8529-e95e692feb5b,address:{type:pci,slot:0x03,bus:0x00,domain:0x0000,function:0x0},device:bridge,type:interface} devices={alias:video0,specParams:{vram:32768,vgamem:16384,heads:1,ram:65536},deviceId:a93c3d26-c30f-4c08-bc43-b0387bdff482,address:{type:pci,slot:0x02,bus:0x00,domain:0x0000,function:0x0},device:qxl,type:video} devices={device:spice,type:graphics,deviceId:2705e98a-7610-4069-aae7-6a88ef47738b,address:None} devices={device:vnc,type:graphics,deviceId:7eb9ded3-473f-488e-a676-73c17ec62f57,address:None} devices={index:2,iface:ide,shared:false,readonly:true,deviceId:8c3179ac-b322-4f5c-9449-c52e3665e0ae,address:{controller:0,target:0,unit:0,bus:1,type:drive},device:cdrom,path:,type:disk} devices={device:ide,specParams:{index:0},type:controller,deviceId:279a65b6-3489-44c0-abc8-111828062e34,address:{type:pci,slot:0x01,bus:0x00,domain:0x0000,function:0x1}} devices={alias:ua-8225808b-f417-4aa2-b060-c92eba8fb078,specParams:{source:urandom},deviceId:8225808b-f417-4aa2-b060-c92eba8fb078,address:{type:pci,slot:0x09,bus:0x00,domain:0x0000,function:0x0},device:virtio,model:virtio,type:rng} 
devices={device:usb,specParams:{index:0,model:piix3-uhci},type:controller,deviceId:91f25294-76be-4082-b544-58be835d9638,address:{type:pci,slot:0x01,bus:0x00,domain:0x0000,function:0x2}} devices={device:scsi,model:virtio-scsi,type:controller,deviceId:03bc7edb-2ba4-48ac-8c5d-e86994dbeb67,address:{type:pci,slot:0x05,bus:0x00,domain:0x0000,function:0x0}} devices={device:virtio-serial,type:controller,deviceId:7b558463-e02d-4cbe-b94e-c38ec3dbca24,address:{type:pci,slot:0x06,bus:0x00,domain:0x0000,function:0x0}} devices={device:console,type:console}
Simone, how do we launch the HE? Where did it get the extra feature? (BTW, we should probably close it as a documentation item, but I'd like to understand the details first)
With up-to-date 4.2 rpms on the host, we directly launch it with the XML for libvirt generated by the engine and saved by the engine in the OVF_STORE volumes. That XML is base64 encoded and saved in vm.conf in xmlBase64 field. 4.1 hosts are instead consuming vm.conf ignoring xmlBase64 field. So in the XML from xmlBase64 field on https://bugzilla.redhat.com/show_bug.cgi?id=1585986#c9 I read: <cpu match="exact"> <model>Westmere</model> <topology cores="4" threads="1" sockets="16" /> <numa> <cell id="0" cpus="0,1,2,3" memory="16777216" /> </numa> </cpu>
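The decoded XML is what matters for debugging this; a minimal sketch for inspecting it on a host (the vm.conf path and the xmlBase64 field name are the ones described in this comment):

```shell
# Extract the base64-encoded libvirt domain XML that the engine saved in
# vm.conf and decode it, so the <cpu> element the VM was actually started
# with can be inspected.
grep '^xmlBase64=' /run/ovirt-hosted-engine-ha/vm.conf \
  | cut -d= -f2- \
  | base64 -d
```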
Moving back to virt team since on hosted engine we don't touch such settings. Are the 2 hosts on the same cluster?
1. Both hosts are on the same cluster. 2. The cluster CPU family type is older than the hosts: hosts are IvyBridge, cluster is Westmere
logs only contain HostedEngine VM as incoming migration, and it's coming with SandyBridge already. Please attach logs with HE parameters when it originally starts
I see the first record of the HE VM being started on lynx14; the agent and broker on this host are started after it: MainThread::INFO::2018-06-04 22:40:26,567::hosted_engine::491::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_monitoring_loop) Current state EngineUp (score: 3400) In the vm.conf.fallback you can see that the cpu-type is SandyBridge-IBRS -rw-r--r--. 1 vdsm kvm 1391 Jun 5 17:03 /run/ovirt-hosted-engine-ha/vm.conf.fallback vmId=1f41e617-5e95-4086-aa86-d93205bf482e memSize=16384 display=vnc devices={index:2,iface:ide,address:{ controller:0, target:0,unit:0, bus:1, type:drive},specParams:{},readonly:true,deviceId:dff26b50-1111-4c60-bde8-7d017e60d99c,path:,device:cdrom,shared:false,type:disk} devices={index:0,iface:virtio,format:raw,poolID:00000000-0000-0000-0000-000000000000,volumeID:58f549d0-546c-4389-a085-a7287550bd71,imageID:84afb0e5-b4a1-4926-8274-26bfba8f506f,specParams:{},readonly:false,domainID:a2a6f15e-b73b-4f3d-81bb-e5ccb5a5376b,optional:false,deviceId:58f549d0-546c-4389-a085-a7287550bd71,address:{bus:0x00, slot:0x06, domain:0x0000, type:pci, function:0x0},device:disk,shared:exclusive,propagateErrors:off,type:disk,bootOrder:1} devices={device:scsi,model:virtio-scsi,type:controller} devices={nicModel:pv,macAddr:00:16:3e:7b:e0:07,linkActive:true,network:ovirtmgmt,specParams:{},deviceId:bf42fa4a-13eb-4ec1-b864-1e4f792b8693,address:{bus:0x00, slot:0x03, domain:0x0000, type:pci, function:0x0},device:bridge,type:interface} devices={device:console,type:console} devices={device:vga,alias:video0,type:video} devices={device:vnc,type:graphics} vmName=HostedEngine spiceSecureChannels=smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir smp=4 maxVCpus=4 cpuType=SandyBridge-IBRS emulatedMachine=rhel6.5.0 devices={device:virtio,specParams:{source:urandom},model:virtio,type:rng} vdsm log: 2018-06-05 18:01:11,895+0300 INFO (jsonrpc/6) [jsonrpc.JsonRpcServer] RPC call Host.ping2 succeeded in 0.00 seconds (__init__:573) 2018-06-05
18:01:11,899+0300 INFO (jsonrpc/0) [api.virt] START getStats() from=::1,41404, vmId=1f41e617-5e95-4086-aa86-d93205bf482e (api:46) 2018-06-05 18:01:11,901+0300 INFO (jsonrpc/0) [api.virt] FINISH getStats return={'status': {'message': 'Done', 'code': 0}, 'statsList': [{'vcpuCount': '4', 'memUsage': '25', 'acpiEnable': 'true', 'displayInfo': [{'tlsPort': '5901', 'ipAddress': '10.46.16.29', 'type': 'spice', 'port': '5900'}, {'tlsPort': '-1', 'ipAddress': '10.46.16.29', 'type': 'vnc', 'port': '5902'}], 'guestFQDN': u'hosted-engine-02.lab.eng.tlv2.redhat.com', 'vmId': '1f41e617-5e95-4086-aa86-d93205bf482e', 'session': 'Unknown', 'vmType': 'kvm', 'timeOffset': '-1', 'balloonInfo': {'balloon_max': '16777216', 'balloon_min': '1048576', 'balloon_target': '16777216', 'balloon_cur': '16777216'}, 'disksUsage': [{u'path': u'/', u'total': '7638876160', u'fs': u'xfs', u'used': '3544621056'}, {u'path': u'/boot', u'total': '1063256064', u'fs': u'xfs', u'used': '178348032'}, {u'path': u'/home', u'total': '1063256064', u'fs': u'xfs', u'used': '33783808'}, {u'path': u'/var', u'total': '21464350720', u'fs': u'xfs', u'used': '414855168'}, {u'path': u'/var/log', u'total': '10726932480', u'fs': u'xfs', u'used': '66744320'}, {u'path': u'/var/log/audit', u'total': '1063256064', u'fs': u'xfs', u'used': '34189312'}], 'network': {'vnet0': {'macAddr': '00:16:3e:7b:e0:07', 'rxDropped': '0', 'tx': '246041090', 'rxErrors': '0', 'txDropped': '0', 'rx': '125043285', 'txErrors': '0', 'state': 'unknown', 'sampleTime': 4367026.95, 'speed': '1000', 'name': 'vnet0'}}, 'vmJobs': {}, 'cpuUser': '5.40', 'elapsedTime': '72790', 'memoryStats': {'swap_out': 0, 'majflt': 0, 'minflt': 186, 'mem_cached': '583088', 'mem_free': '11905848', 'mem_buffers': '0', 'swap_in': 0, 'pageflt': 186, 'mem_total': '16258532', 'mem_unused': '11905848'}, 'cpuSys': '1.20', 'appsList': (u'kernel-3.10.0-862.3.2.el7', u'ovirt-guest-agent-common-1.0.14-3.el7ev', u'cloud-init-0.7.9-24.el7'), 'guestOs': u'3.10.0-862.3.2.el7.x86_64', 
'vmName': 'HostedEngine', 'status': 'Up', 'clientIp': '', 'hash': '-4550103827432548493', 'guestCPUCount': 4, 'cpuUsage': '1619630000000', 'vcpuPeriod': 100000L, 'guestTimezone': {u'zone': u'Asia/Jerusalem', u'offset': 120}, 'vcpuQuota': '-1', 'statusTime': '4367026950', 'kvmEnable': 'true', 'disks': {'vda': {'readLatency': '0', 'flushLatency': '84263', 'readRate': '0.0', 'writeRate': '39526.4', 'writtenBytes': '1435334656', 'truesize': '4398194688', 'apparentsize': '62277025792', 'readOps': '7162', 'writeLatency': '601236', 'imageID': '84afb0e5-b4a1-4926-8274-26bfba8f506f', 'readBytes': '175622656', 'writeOps': '119159'}, 'hdc': {'readLatency': '0', 'flushLatency': '0', 'readRate': '0.0', 'writeRate': '0.0', 'writtenBytes': '0', 'truesize': '0', 'apparentsize': '0', 'readOps': '0', 'writeLatency': '0', 'readBytes': '0', 'writeOps': '0'}}, 'monitorResponse': '0', 'guestOsInfo': {u'kernel': u'3.10.0-862.3.2.el7.x86_64', u'arch': u'x86_64', u'version': u'7.5', u'distribution': u'Red Hat Enterprise Linux Server', u'type': u'linux', u'codename': u'Maipo'}, 'username': u'None', 'guestName': u'hosted-engine-02.lab.eng.tlv2.redhat.com', 'lastLogin': 1528208020.854908, 'guestIPs': u'10.46.16.197', 'guestContainers': [], 'netIfaces': [{u'inet6': [], u'hw': u'00:16:3e:7b:e0:07', u'inet': [u'10.46.16.197'], u'name': u'eth0'}]}]} from=::1,41404, vmId=1f41e617-5e95-4086-aa86-d93205bf482e (api:52)
Created attachment 1448712 [details] vdsm log HE VM first started
(In reply to Israel Pinto from comment #16) > Created attachment 1448712 [details] > vdsm log HE VM first started I do not see any VM start in that attached log file. Please double-check.
If I want to reproduce it, I need to start the HE VM first on the host with IBRS, and then migrate it to the host without IBRS?
(In reply to Israel Pinto from comment #19) > If i want to reproduce it, i need to start the HE VM first on the host with > the IBRS. And them migration it to host without IBRS? I do not know, you opened the bug :) From your description it indeed sounds like a way to reproduce it.
Israel, in the original report you said the type is Westmere, and Simone pointed out the configuration looks like that. But in comment #15 you said the configuration contains SandyBridge-IBRS (which corresponds to the actual type the VM is running with) - so... if you started with that type and later changed the configuration and didn't restart it, then it is still running with the original CPU, which is not going to migrate to a "lower" host
(In reply to Michal Skrivanek from comment #21) > Israel, > in the original report you said the type is Westmere, and Simone pointed out > the configuration looks like that. But in comment #15 you said the > configuration contains SandyBridge-IBRS (which corresponds to the actual > type the VM is running with) - so...if you started with that type and later > changed configuration and didn't restart it then it is still running with > the original CPU which is not going to migrate to a "lower" host Yes, the cluster is set with Westmere, two hosts are set with SandyBridge-IBRS (the RHEL hosts), and one host is IvyBridge. See virsh -r capabilities in https://bugzilla.redhat.com/show_bug.cgi?id=1585986#c0 The HE VM was migrated from the SandyBridge-IBRS host to the IvyBridge host and failed. I understand it, given the mis-configuration of the hosts and cluster, but is it really a problem we need to handle, or is the error here only because the setting is not right? If so, then it just needs to be documented (if it is not already).
It indeed is a problem; I just do not understand how you got to that state. Please describe exactly how to reproduce this
Deployed SHE over NFS on IBRS capable host rose07 and this is what I see in vm.conf on host: rose07 ~]# cat /run/ovirt-hosted-engine-ha/vm.conf |grep spec cpuType=SandyBridge,+spec-ctrl devices={alias:video0,specParams:{vram:32768,vgamem:16384,heads:1,ram:65536},deviceId:20d306f4-c9f3-42f3-b5f6-5da1e3cca6e2,address:None,device:qxl,type:video} devices={device:usb,specParams:{index:0,model:piix3-uhci},type:controller,deviceId:aba6e8f6-a48a-4054-9c2b-f45d973ef243,address:None} devices={alias:None,specParams:{source:urandom},deviceId:8a6dc0d9-17c0-41f0-b9d3-b1a98d8b618d,address:None,device:virtio,model:virtio,type:rng} rose07 ~]# virsh -r capabilities | head <capabilities> <host> <uuid>a8821478-30a0-4281-8552-7d98a34b74bf</uuid> <cpu> <arch>x86_64</arch> <model>IvyBridge-IBRS</model> <vendor>Intel</vendor> <microcode version='31'/> <topology sockets='1' cores='4' threads='2'/> rose07 ~]# cat /sys/kernel/debug/x86/ibrs_enabled 0 [root@rose07 ~]# cat /sys/kernel/debug/x86/pti_enabled 1 [root@rose07 ~]# cat /sys/kernel/debug/x86/ibpb_enabled 1 On SHE-VM on rose07: he-1 ~]# cat /sys/kernel/debug/x86/ibrs_enabled 0 [root@nsednev-he-1 ~]# cat /sys/kernel/debug/x86/pti_enabled 1 [root@nsednev-he-1 ~]# cat /sys/kernel/debug/x86/ibpb_enabled 1 This is puma18, host without IBRS capability and this is what I see on in vm.conf of SHE VM that was deployed on it: puma18 ~]# virsh -r capabilities | headcat /sys/kernel/debug/x86/ibrs_enabled -bash: headcat: command not found puma18 ~]# cat /sys/kernel/debug/x86/ibrs_enabled 0 [root@puma18 ~]# cat /sys/kernel/debug/x86/pti_enabled 1 [root@puma18 ~]# cat /sys/kernel/debug/x86/ibpb_enabled 0 puma18 ~]# cat /run/ovirt-hosted-engine-ha/vm.conf | grep spec devices={alias:video0,specParams:{vram:32768,vgamem:16384,heads:1,ram:65536},deviceId:bac7b5d8-d0f0-4c54-8f3f-b8cd03beb72c,address:{type:pci,slot:0x02,bus:0x00,domain:0x0000,function:0x0},device:qxl,type:video} 
devices={device:ide,specParams:{index:0},type:controller,deviceId:4a0c0494-7957-4e7b-b30b-e50d0a3ebae3,address:{type:pci,slot:0x01,bus:0x00,domain:0x0000,function:0x1}} devices={alias:ua-43a67516-6c0a-422e-a677-d0700038bcc0,specParams:{source:urandom},deviceId:43a67516-6c0a-422e-a677-d0700038bcc0,address:{type:pci,slot:0x09,bus:0x00,domain:0x0000,function:0x0},device:virtio,model:virtio,type:rng} devices={device:usb,specParams:{index:0,model:piix3-uhci},type:controller,deviceId:d8b569fd-54a4-4333-8510-11970a053700,address:{type:pci,slot:0x01,bus:0x00,domain:0x0000,function:0x2}} You can clearly see that for the SHE VM running on rose07 (the IBRS-capable host) there is a +spec-ctrl capability flag within the SHE VM configuration.
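A quick way to compare two hosts like rose07 and puma18 is to look for the IBRS-related flags directly in /proc/cpuinfo; a hedged sketch (on these RHEL kernels the flags the kernel reports as ibrs/ibpb correspond to what QEMU exposes as spec-ctrl, as the capability pastes above suggest):

```shell
# Check whether this host's CPU exposes the flags behind QEMU's spec-ctrl
# feature; run on both hosts and compare the output.
for f in ibrs ibpb; do
    if grep -qw "$f" /proc/cpuinfo; then
        echo "host exposes $f"
    else
        echo "host lacks $f"
    fi
done
```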
Nikolai, what is the question? What's the cluster's CPU setting? If there is a mismatch between HE configuration and the Cluster setting then this needs to be solved at the HE VM configuration level, Simone. First I'd like a confirmation that it's indeed the case, and how to get to that state.
(In reply to Michal Skrivanek from comment #25) > Nikolai, what is the question? > > What's the cluster's CPU setting? > If there is a mismatch between HE configuration and the Cluster setting then > this needs to be solved at the HE VM configuration level, Simone. > First I'd like a confirmation that it's indeed the case, and how to get to > that state. CPU type in the UI is "Intel SandyBridge IBRS Family", CPU Architecture is x86_64 - that is, if I'm thinking correctly about the expected answer to your question. I've set the needinfo to ask Francesco to take a look at what was possibly the root cause of the failed migration.
Tested on these components: ovirt-engine-4.2.4.1-0.1.el7.noarch ovirt-hosted-engine-ha-2.2.13-1.el7ev.noarch ovirt-hosted-engine-setup-2.2.22-1.el7ev.noarch rhvm-appliance-4.2-20180601.0.el7.noarch Linux 3.10.0-862.6.1.el7.x86_64 #1 SMP Mon Jun 4 15:33:25 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux Red Hat Enterprise Linux Server release 7.5 (Maipo)
How did you deploy it on puma18 with a SandyBridge-IBRS CPU when that host is not IBRS capable - how is that host operational in a SandyBridge-IBRS cluster?
(In reply to Michal Skrivanek from comment #25) > What's the cluster's CPU setting? > If there is a mismatch between HE configuration and the Cluster setting then > this needs to be solved at the HE VM configuration level, Simone. Hosted-engine-setup isn't going to explicitly set cluster CPU type or VM CPU type. It simply adds the first host letting the engine implicitly choose the rest from there.
I reproduced it: 1. Deploy HE on an XXX-IBRS host (in our case Intel SandyBridge IBRS). The VM config file has the spec-ctrl cpu flag: cat /run/ovirt-hosted-engine-ha/vm.conf |grep spec cpuType=SandyBridge,+spec-ctrl After deployment the cluster is XXX IBRS Family, in our case: Intel SandyBridge IBRS Family 2. Update cluster CPU type to a non-IBRS type (in our case Intel Conroe Family) 3. Deploy a host which is not XXX-IBRS 4. Migrate the VM from the XXX-IBRS host to the non-IBRS host. Migration failed: vdsm source host log: 2018-06-11 15:41:57,634+0300 ERROR (migsrc/64875e38) [virt.vm] (vmId='64875e38-0adb-4f43-a2d2-82e0d7372efe') operation failed: guest CPU doesn't match specification: missing features: xsave,avx,spec-ctrl,xsaveopt (migration:290) 2018-06-11 15:41:58,551+0300 ERROR (migsrc/64875e38) [virt.vm] (vmId='64875e38-0adb-4f43-a2d2-82e0d7372efe') Failed to migrate (migration:455) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 437, in _regular_run self._startUnderlyingMigration(time.time()) File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 509, in _startUnderlyingMigration self._perform_with_conv_schedule(duri, muri) File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 587, in _perform_with_conv_schedule self._perform_migration(duri, muri) File "/usr/lib/python2.7/site-packages/vdsm/virt/migration.py", line 529, in _perform_migration self._migration_flags) File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 98, in f ret = attr(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper ret = f(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper return func(inst, *args, **kwargs) File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1746, in migrateToURI3 if ret == -1: raise libvirtError ('virDomainMigrateToURI3() failed', dom=self) libvirtError: operation failed: guest
CPU doesn't match specification: missing features: xsave,avx,spec-ctrl,xsaveopt
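A pre-migration sanity check is possible with the same virsh call quoted earlier in this report; a sketch (the sed expression is only an illustration for pulling out the first <model> element from the capabilities XML):

```shell
# Print the host CPU model from libvirt capabilities; run on both the
# source and the destination and compare. A destination model without the
# -IBRS suffix cannot satisfy a guest started with +spec-ctrl.
virsh -r capabilities | sed -n 's|.*<model>\(.*\)</model>.*|\1|p' | head -1
```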
Created attachment 1450033 [details] logs_11_6_2018
(In reply to Simone Tiraboschi from comment #29) > (In reply to Michal Skrivanek from comment #25) > > What's the cluster's CPU setting? > > If there is a mismatch between HE configuration and the Cluster setting then > > this needs to be solved at the HE VM configuration level, Simone. > > Hosted-engine-setup isn't going to explicitly set cluster CPU type or VM CPU > type. It simply adds the first host letting the engine implicitly choose the > rest from there. It either needs to follow the desired Cluster CPU picked at deployment time, or the Cluster created during the deployment needs to match the first host. IIUC that works fine and the cluster is created with a correct CPU type. Now if that Cluster configuration is later changed to a "lower" CPU, then the HE VM needs to be reconfigured accordingly. This is HE-specific logic - there's a different code path for the HE VM skipping the Cluster update checks - and a solution suitable for the HE VM needs to be implemented. Deferring to the Integration team as to how.
(In reply to Michal Skrivanek from comment #32) > It either need to follow desired Cluster CPU picked at deployment time or > the Cluster created during the deployment need to match the first host. IIUC > that works fine and the cluster is created with a correct CPU type. > Now if that Cluster configuration is later changed to a "lower" CPU then the > HE VM need to be reconfigured accordingly. This is a HE specific logic - > there's a different code path for HE VM skipping the Cluster update checks - > and a solution suitable for HE VM need to be implemented. Deferring to > Integration team as to how. We can implement all the upfront checks we want, but they are not going to solve this bug or prevent it: if the user can lower the cluster specs on the engine side with no checks and no impact on the running hosted-engine VM, this will still happen regardless of any initial check.
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
Some updates: I tried the following to see if it solves the CPU flag problem. 1. After adding the new host with the lower CPU type, wait 1 hour for the HE VM to be updated with the CPU flags. Results: checking the CPU flags on the VM after 1 hour gets the same flags, no change. 2. Stop the HE VM and run it on the host with the lower CPU. Results: the VM failed to run; vdsm log, flags issue: 2018-06-21 15:51:27,582+0300 ERROR (vm/8e2ce9a7) [virt.vm] (vmId='8e2ce9a7-d1d7-477b-8b91-f6e6ee48dc62') The vm start process failed (vm:943) Traceback (most recent call last): File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 872, in _startUnderlyingVm self._run() File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2876, in _run dom.createWithFlags(flags) File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper ret = f(*args, **kwargs) File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper return func(inst, *args, **kwargs) File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1099, in createWithFlags if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self) libvirtError: the CPU is incompatible with host CPU: Host CPU does not provide required features: xsave, avx, spec-ctrl 2018-06-21 15:51:27,583+0300 INFO (vm/8e2ce9a7) [virt.vm] (vmId='8e2ce9a7-d1d7-477b-8b91-f6e6ee48dc62') Changed state to Down: the CPU is incompatible with host CPU: Host CPU does not provide required features: xsave, avx, spec-ctrl (code=1) (vm:1683)
I think the issue is simply that the generation id of the VM is not bumped and so the OVF is not recomputed.
I do not see an issue with 1h refresh, you do not move your critical component to different CPU models very often. I'd be fine with closing as WONTFIX, but if you have a patch already, fine as well...
Not identified as blocker for 4.2.7, moving to 4.2.8
Bhushan, can you please execute sudo -u postgres scl enable rh-postgresql95 -- psql -d engine -c "SELECT vm_name, cluster_name, cluster_cpu_name, cpu_name, custom_cpu_name FROM vms WHERE origin=6" on the engine VM and share its output?
Bhushan, I can only suggest trying to edit the hosted-engine VM description or change the number of cores or the memory amount, just to try forcing a quick regeneration of the OVF store disks. And run on the engine VM grep "UploadStreamCommand.*3ee26ff5-afb1-412a-89ef-a289ac109fed" /var/log/ovirt-engine/engine.log until you see something new.
Another option is to try manually forcing an OVF_STORE update from the storage domains tab: see the attached screenshot.
Created attachment 1546714 [details] trigger OVF_STORE update
I opened a more specific bug here: https://bugzilla.redhat.com/1691562
I'm not sure that the other bug is valid. The updates are continuing to run. Here's the latest successful one for the hosted engine, but there are 5 days of updates before this (with a regeneration at 13:35 and one at 14:35), and days after with no update for Hosted Engine. Other VMs are updated as normal, though. 2019-03-06 14:35:25,923+01 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-25) [11b12d15] START, SetVolumeDescriptionVDSCommand( SetVolumeDescriptionVDSCommandParameters:{storagePoolId='b74b6e90-2fa7-11e9-90a8-00163e1f0044', ignoreFailoverLimit='false', storageDomainId='3ee26ff5-afb1-412a-89ef-a289ac109fed', imageGroupId='556be753-69bd-413c-87e2-e170d3fca9db', imageId='5cd0085a-094d-4806-a251-1af759241b4d'}), log id: 7669d4bc 2019-03-06 14:35:25,923+01 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-25) [11b12d15] -- executeIrsBrokerCommand: calling 'setVolumeDescription', parameters: 2019-03-06 14:35:25,923+01 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-25) [11b12d15] ++ spUUID=b74b6e90-2fa7-11e9-90a8-00163e1f0044 2019-03-06 14:35:25,923+01 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-25) [11b12d15] ++ sdUUID=3ee26ff5-afb1-412a-89ef-a289ac109fed 2019-03-06 14:35:25,923+01 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-25) [11b12d15] ++ imageGroupGUID=556be753-69bd-413c-87e2-e170d3fca9db 2019-03-06 14:35:25,923+01 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-25) [11b12d15] ++ volUUID=5cd0085a-094d-4806-a251-1af759241b4d 2019-03-06 14:35:25,923+01 INFO 
[org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-25) [11b12d15] ++ description={"Updated":true,"Size":30720,"Last Updated":"Wed Mar 06 14:35:25 CET 2019","Storage Domains":[{"uuid":"3ee26ff5-afb1-412a-89ef-a289ac109fed"}],"Disk Description":"OVF_STORE"} 2019-03-06 14:35:25,971+01 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.SetVolumeDescriptionVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-25) [11b12d15] FINISH, SetVolumeDescriptionVDSCommand, log id: 7669d4bc 2019-03-06 14:35:25,979+01 INFO [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-25) [11b12d15] Lock freed to object 'EngineLock:{exclusiveLocks='[3ee26ff5-afb1-412a-89ef-a289ac109fed=STORAGE]', sharedLocks='[b74b6e90-2fa7-11e9-90a8-00163e1f0044=OVF_UPDATE]'}' 2019-03-06 14:35:27,075+01 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (EE-ManagedThreadFactory-engineScheduled-Thread-76) [] Polling and updating Async Tasks: 4 tasks, 4 tasks to poll now 2019-03-06 14:35:27,081+01 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedThreadFactory-engineScheduled-Thread-76) [] SPMAsyncTask::PollTask: Polling task '0e05 However, from an overview of the code, I'm not sure we actually update the timer every hour. I could be wrong here. Instead, it appears to be updated as part of UpdateVmCommand, and I don't use a USER_UPDATE_VM command for HostedEngine. The cluster level change to sandybridge is also before the log starts... It is possible that HostedEngine is an exception to these updates, since it's intended to be explicitly managed, and not to violate normal validations. Bhushan, have you tried comment#53? Andrej, please keep me honest here.
(In reply to Ryan Barry from comment #57) > It is possible that HostedEngine is an exception to these updates, since > it's intended to be explicitly managed, and not to violate normal > validations. Please take care that, although we don't recommend it, nothing is really preventing the user from creating regular VMs on the hosted-engine storage domain so it should absolutely behave as a regular storage domain from this point of view.
The OVFs in hosted engine storage domain are updated every hour, same as any other domain. The OVF update is also triggered whenever the HE VM is updated by UpdateVmCommand. Looking at the engine logs, the OVF update is triggering every hour: cat engine.log | grep -i 'Successfully updated VM OVFs in Data Center' 2019-03-21 06:52:57,263+01 INFO [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-46) [422a9d00] Successfully updated VM OVFs in Data Center 'Default' 2019-03-21 07:52:57,278+01 INFO [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-19) [2eb148ea] Successfully updated VM OVFs in Data Center 'Default' 2019-03-21 08:52:57,294+01 INFO [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-30) [d63f4e6] Successfully updated VM OVFs in Data Center 'Default' 2019-03-21 09:52:57,310+01 INFO [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-12) [58e79de8] Successfully updated VM OVFs in Data Center 'Default' In the DB dump, the HE OVF also contains the incorrect CPU name: <CustomCpuName>Skylake-Server,+spec-ctrl,+ssbd</CustomCpuName> The problem seems to be that updating the cluster CPU did not increase the VM generation number, so a new OVF is not generated. As a workaround, the HE VM generation number can be increased by editing the HE VM somehow, as mentioned in comment#53.
(In reply to Andrej Krejcir from comment #59) > The problem seems to be that updating the cluster CPU did not increase the > VM generation number, so a new OVF is not generated. Yes, because HE is explicitly skipped during UpdateClusterCommand right here: https://github.com/oVirt/ovirt-engine/blob/6bae27af75ddf187d1b3306cffb847244327c452/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/UpdateClusterCommand.java#L157
Re-targeting to 4.3.6 not being identified as blocker for 4.3.5.
Moving to 4.4 since it depends on bug #1691562 which is targeted to 4.4
sync2jira
comment #59 is still relevant, I believe. without that the HE VM is *not* updated when editing a cluster.
So 2 things to work on: - can we get UpdateVmCommand triggered on the VMs running in a cluster when the cluster is updated? This will ensure VMs have the correct values after the cluster update. - seems like we have a bit of confusion on what's the next step; there's also https://bugzilla.redhat.com/show_bug.cgi?id=1691562 which at this point seems a duplicate.
Not quite a duplicate, both must be done, but it's already properly in depends on
This bug will be fixed by patches in Bug 1691562. However, it is no longer possible to decrease the compatibility version of existing Data Centers through the UI. This was changed in Bug 1753628. So on new deployments, the API has to be used to decrease the version of the DC where the HE VM is running.
This bug is in modified and targeting 4.4.2. Can we re-target to 4.4.0 and move to qe?
Deployed the latest HE on an IBRS host and attached a regular non-IBRS ha-host to the environment. Attached 2 additional IBRS ha-hosts. Tested with Software Version: 4.4.1.2-0.10.el8ev Tried to migrate from the IBRS host to the non-IBRS ha-host ocelot01.qa.lab.tlv.redhat.com; the engine disconnected me during migration, then it failed to start on ocelot01. --== Host ocelot01.qa.lab.tlv.redhat.com (id: 2) status ==-- Host ID : 2 Host timestamp : 5166 Score : 3400 Engine status : {"vm": "up", "health": "bad", "detail": "Up", "reason": "failed liveliness check"} Hostname : ocelot01.qa.lab.tlv.redhat.com Local maintenance : False stopped : False crc32 : f4602df7 conf_on_shared_storage : True local_conf_timestamp : 5166 Status up-to-date : True Extra metadata (valid at timestamp): metadata_parse_version=1 metadata_feature_version=1 timestamp=5166 (Tue Jun 23 13:52:22 2020) host-id=2 score=3400 vm_conf_refresh_time=5166 (Tue Jun 23 13:52:22 2020) conf_on_shared_storage=True maintenance=False state=EngineStarting stopped=False Some CPU details about the environment: Engine CPU appeared as follows: nsednev-he-1 ~]# cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 58 model name : Intel Xeon E3-12xx v2 (Ivy Bridge) stepping : 9 microcode : 0x1 cpu MHz : 3292.522 cache size : 16384 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 4 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology cpuid tsc_known_freq pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm cpuid_fault pti fsgsbase smep erms xsaveopt arat bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit bogomips : 6585.04 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 48
bits virtual power management: IBRS ha-host's CPUs appeared as follows: 1. alma07 ~]# cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 58 model name : Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz stepping : 9 microcode : 0x21 cpu MHz : 1760.423 cache size : 8192 KB physical id : 0 siblings : 8 core id : 0 cpu cores : 4 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm arat pln pts md_clear flush_l1d bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit srbds bogomips : 6585.56 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 48 bits virtual power management: 2. 
alma04 ~]# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 62
model name      : Intel(R) Xeon(R) CPU E5-2603 v2 @ 1.80GHz
stepping        : 4
microcode       : 0x42e
cpu MHz         : 1799.888
cache size      : 10240 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm arat pln pts md_clear flush_l1d
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips        : 3599.79
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

3.
alma03 ~]# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 62
model name      : Intel(R) Xeon(R) CPU E5-2603 v2 @ 1.80GHz
stepping        : 4
microcode       : 0x42e
cpu MHz         : 1799.983
cache size      : 10240 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm arat pln pts md_clear flush_l1d
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit
bogomips        : 3599.64
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

The non-IBRS host was ocelot01:

ocelot01 ~]# virsh -r capabilities | head
<capabilities>
  <host>
    <uuid>e602cd31-f7b3-4843-b13e-c5553cce84b2</uuid>
    <cpu>
      <arch>x86_64</arch>
      <model>Skylake-Server-IBRS</model>
      <vendor>Intel</vendor>
      <microcode version='33581318'/>
      <counter name='tsc' frequency='2099999000' scaling='yes'/>

ocelot01 ~]# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 85
model name      : Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
stepping        : 4
microcode       : 0x2006906
cpu MHz         : 1394.002
cache size      : 22528 KB
physical id     : 0
siblings        : 32
core id         : 0
cpu cores       : 16
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 22
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke md_clear flush_l1d
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa itlb_multihit
bogomips        : 4200.00
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

Neither the engine nor the ocelot ha-host had IBRS flags on their CPUs. From the ha-broker on ocelot I saw:

MainThread::ERROR::2020-06-23 13:00:21,457::hosted_engine::564::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_broker) Failed to start necessary monitors
MainThread::ERROR::2020-06-23 13:00:21,460::agent::143::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 85, in start_monitor
    response = self._proxy.start_monitor(type, options)
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1112, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1452, in __request
    verbose=self.__verbose
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1154, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1166, in single_request
    http_conn = self.send_request(host, handler, request_body, verbose)
  File "/usr/lib64/python3.6/xmlrpc/client.py",
line 1279, in send_request
    self.send_content(connection, request_body)
  File "/usr/lib64/python3.6/xmlrpc/client.py", line 1309, in send_content
    connection.endheaders(request_body)
  File "/usr/lib64/python3.6/http/client.py", line 1249, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib64/python3.6/http/client.py", line 1036, in _send_output
    self.send(msg)
  File "/usr/lib64/python3.6/http/client.py", line 974, in send
    self.connect()
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/unixrpc.py", line 74, in connect
    self.sock.connect(base64.b16decode(self.host))
FileNotFoundError: [Errno 2] No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 131, in _run_agent
    return action(he)
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/agent.py", line 55, in action_proper
    return he.start_monitoring()
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 437, in start_monitoring
    self._initialize_broker()
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/agent/hosted_engine.py", line 561, in _initialize_broker
    m.get('options', {}))
  File "/usr/lib/python3.6/site-packages/ovirt_hosted_engine_ha/lib/brokerlink.py", line 91, in start_monitor
    ).format(t=type, o=options, e=e)
ovirt_hosted_engine_ha.lib.exceptions.RequestError: brokerlink - failed to start monitor via ovirt-ha-broker: [Errno 2] No such file or directory, [monitor: 'network', options: {'addr': '10.35.95.254', 'network_test': 'dns', 'tcp_t_address': '', 'tcp_t_port': ''}]
MainThread::ERROR::2020-06-23 13:00:21,460::agent::144::ovirt_hosted_engine_ha.agent.agent.Agent::(_run_agent) Trying to restart agent
MainThread::INFO::2020-06-23 13:00:21,460::agent::89::ovirt_hosted_engine_ha.agent.agent.Agent::(run) Agent shutting down

The services were up and running:

ocelot01 ~]# systemctl status ovirt-ha-agent -l
● ovirt-ha-agent.service - oVirt Hosted Engine High Availability Monitoring Agent
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2020-06-23 13:00:43 IDT; 55min ago
 Main PID: 12886 (ovirt-ha-agent)
    Tasks: 2 (limit: 788464)
   Memory: 50.3M
   CGroup: /system.slice/ovirt-ha-agent.service
           └─12886 /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-agent
Jun 23 13:00:43 ocelot01.qa.lab.tlv.redhat.com systemd[1]: Started oVirt Hosted Engine High Availability Monitoring A>

ocelot01 ~]# systemctl status ovirt-ha-broker
● ovirt-ha-broker.service - oVirt Hosted Engine High Availability Communications Broker
   Loaded: loaded (/usr/lib/systemd/system/ovirt-ha-broker.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2020-06-23 13:00:21 IDT; 56min ago
 Main PID: 12597 (ovirt-ha-broker)
    Tasks: 14 (limit: 788464)
   Memory: 66.7M
   CGroup: /system.slice/ovirt-ha-broker.service
           ├─12597 /usr/libexec/platform-python /usr/share/ovirt-hosted-engine-ha/ovirt-ha-broker
           ├─29511 /bin/sh /usr/sbin/hosted-engine --check-liveliness
           └─29512 /usr/bin/python3 -m ovirt_hosted_engine_setup.check_liveliness
Jun 23 13:00:21 ocelot01.qa.lab.tlv.redhat.com systemd[1]: Started oVirt Hosted Engine High Availability Communicatio>
Jun 23 13:00:54 ocelot01.qa.lab.tlv.redhat.com ovirt-ha-broker[12597]: ovirt-ha-broker mgmt_bridge.MgmtBridge ERROR F>
Jun 23 13:00:59 ocelot01.qa.lab.tlv.redhat.com ovirt-ha-broker[12597]: ovirt-ha-broker mgmt_bridge.MgmtBridge ERROR F>
Jun 23 13:01:46 ocelot01.qa.lab.tlv.redhat.com ovirt-ha-broker[12597]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.>
    Traceback (most recent call last):
      File "/usr/lib/python3.6/site-packages/ovirt>
        timeout=float(cfg["smtp-timeout"]))
      File "/usr/lib64/python3.6/smtplib.py", line>
        (code, msg) = self.connect(host, port)
      File "/usr/lib64/python3.6/smtplib.py", line>
        self.sock = self._get_socket(host, port, s>
      File "/usr/lib64/python3.6/smtplib.py", line>
        self.source_address)
      File "/usr/lib64/python3.6/socket.py", line >
        raise err
      File "/usr/lib64/python3.6/socket.py", line >
        sock.connect(sa)
    ConnectionRefusedError: [Errno 111] Connection>
Jun 23 13:01:56 ocelot01.qa.lab.tlv.redhat.com ovirt-ha-broker[12597]: ovirt-ha-broker ovirt_hosted_engine_ha.broker.>
    Traceback (most recent call last):
      File "/usr/lib/python3.6/site-packages/ovirt>

Network configurations were as follows on the source and destination hosts:

Source alma07.qa.lab.tlv.redhat.com:
ovirtmgmt: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 10.35.92.7 netmask 255.255.252.0 broadcast 10.35.95.255
        inet6 2620:52:0:235c:92e2:baff:fe7d:3638 prefixlen 64 scopeid 0x0<global>
        inet6 fe80::92e2:baff:fe7d:3638 prefixlen 64 scopeid 0x20<link>
        ether 90:e2:ba:7d:36:38 txqueuelen 1000 (Ethernet)
        RX packets 6449423 bytes 51361455338 (47.8 GiB)
        RX errors 0 dropped 79 overruns 0 frame 0
        TX packets 4972215 bytes 70838937200 (65.9 GiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

Destination ocelot01.qa.lab.tlv.redhat.com:
ovirtmgmt: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 10.35.30.1 netmask 255.255.255.0 broadcast 10.35.30.255
        inet6 2620:52:0:231e:ae1f:6bff:fe57:ae82 prefixlen 64 scopeid 0x0<global>
        inet6 fe80::ae1f:6bff:fe57:ae82 prefixlen 64 scopeid 0x20<link>
        ether ac:1f:6b:57:ae:82 txqueuelen 1000 (Ethernet)
        RX packets 968471 bytes 10410120624 (9.6 GiB)
        RX errors 0 dropped 5 overruns 0 frame 0
        TX packets 841870 bytes 91202373 (86.9 MiB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

Migration failed not because of the IBRS flag, but because the source and destination ha-hosts are on different subnets (inet 10.35.92.7 netmask 255.255.252.0 broadcast 10.35.95.255 on the source vs. inet 10.35.30.1 netmask 255.255.255.0 broadcast 10.35.30.255 on the destination).
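The subnet mismatch above can be checked mechanically; a minimal sketch using Python's standard ipaddress module (the addresses and netmasks are taken verbatim from the ifconfig output above; the helper name is illustrative, not part of any RHV tooling):

```python
import ipaddress

def same_subnet(addr_a, mask_a, addr_b, mask_b):
    """Return True if both interface addresses fall inside the same network."""
    # strict=False lets us pass a host address rather than a network address
    net_a = ipaddress.ip_network(f"{addr_a}/{mask_a}", strict=False)
    net_b = ipaddress.ip_network(f"{addr_b}/{mask_b}", strict=False)
    return net_a == net_b

# Source alma07 vs. destination ocelot01, as reported above
print(same_subnet("10.35.92.7", "255.255.252.0",
                  "10.35.30.1", "255.255.255.0"))  # → False
```

10.35.92.7/255.255.252.0 resolves to network 10.35.92.0/22 while 10.35.30.1/255.255.255.0 resolves to 10.35.30.0/24, which is why the two hosts cannot share the ovirtmgmt broadcast domain.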
After the unsuccessful migration attempt, the engine was automatically started on alma04, the ha-host with the best available score. Moving this bug to VERIFIED, since the engine no longer has the IBRS flag on its CPU and does not inherit the IBRS flag from the IBRS-capable ha-host on which it is deployed. Please feel free to reopen if it still doesn't work for you.

Tested on:
ovirt-hosted-engine-setup-2.4.4-1.el8ev.noarch
ovirt-hosted-engine-ha-2.4.3-1.el8ev.noarch
Red Hat Enterprise Linux release 8.2 (Ootpa)
Linux 4.18.0-193.10.1.el8_2.x86_64 #1 SMP Fri Jun 19 15:31:45 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
rhvm-appliance.x86_64 2:4.4-20200604.0.el8ev @rhv-4.4.1
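The verification above hinges on which flags appear in /proc/cpuinfo on the engine VM and the ha-hosts. A minimal sketch of that check (the helper name and the abbreviated sample string are illustrative only; on a real system one would read /proc/cpuinfo itself):

```python
def cpu_flags(cpuinfo_text):
    """Collect the CPU feature flags from /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # "flags : fpu vme de ..." -> everything after the first colon
            flags.update(line.split(":", 1)[1].split())
    return flags

# Abbreviated sample modeled on the engine VM's output above (no ibrs/spec-ctrl)
engine_sample = "flags\t\t: fpu vme de pse tsc msr pae mce pti fsgsbase smep erms xsaveopt arat"
print("ibrs" in cpu_flags(engine_sample))  # → False

# On a live host one would read the real file instead:
#   with open("/proc/cpuinfo") as f:
#       flags = cpu_flags(f.read())
```

The same check against the alma07/alma04/alma03 output above would report ibrs as present, matching the "missing features: spec-ctrl" error in the original vdsm log when migrating toward a host without the feature.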
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: RHV Manager (ovirt-engine) 4.4 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:3247