Description of problem:
I'm opening this bug for both clean install and upgrade, though I assume we'll need to split it later.
We need to ensure that:
1. A clean installation on a host with IBRS-compatible fixes (CPU + kernel + qemu-kvm + libvirt) can select an IBRS-enabled vCPU type for the SHE VM.
2. There is a procedure for switching to such a CPU type on upgrade (already doable today?).
I've tried to deploy IBRS CPU type over Gluster and failed with: [ ERROR ] fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_hosts": [{"address": "panther09.qa.lab.tlv.redhat.com", "affinity_labels": [], "auto_numa_status": "disable", "certificate": {"organization": "qa.lab.tlv.redhat.com", "subject": "O=qa.lab.tlv.redhat.com,CN=panther09.qa.lab.tlv.redhat.com"}, "cluster": {"href": "/ovirt-engine/api/clusters/21dcb67a-1f8b-11e8-bc70-00163eeeeeee", "id": "21dcb67a-1f8b-11e8-bc70-00163eeeeeee"}, "comment": "", "cpu": {"name": "Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz", "speed": 3021.0, "topology": {"cores": 8, "sockets": 1, "threads": 2}}, "device_passthrough": {"enabled": false}, "devices": [], "external_network_provider_configurations": [], "external_status": "ok", "hardware_information": {"manufacturer": "Dell Inc.", "product_name": "PowerEdge FC430", "serial_number": "4D50CB2", "supported_rng_sources": ["hwrng", "random"], "uuid": "4C4C4544-0044-3510-8030-B4C04F434232"}, "hooks": [], "href": "/ovirt-engine/api/hosts/cbc265b3-1b2b-484c-9f4b-36a71ab5181b", "id": "cbc265b3-1b2b-484c-9f4b-36a71ab5181b", "iscsi": {"initiator": "iqn.1994-05.com.redhat:a515d04e7e2f"}, "katello_errata": [], "kdump_status": "disabled", "ksm": {"enabled": false}, "libvirt_version": {"build": 0, "full_version": "libvirt-3.9.0-13.el7", "major": 3, "minor": 9, "revision": 0}, "max_scheduling_memory": 66862448640, "memory": 67267198976, "name": "panther09.qa.lab.tlv.redhat.com", "network_attachments": [], "nics": [], "numa_nodes": [], "numa_supported": false, "os": {"custom_kernel_cmdline": "", "reported_kernel_cmdline": "BOOT_IMAGE=/vmlinuz-3.10.0-858.el7.x86_64 root=/dev/mapper/vg0-lv_root ro rhgb quiet crashkernel=auto rd.lvm.lv=vg0/lv_root rd.lvm.lv=vg0/lv_swap console=ttyS1,115200n8 LANG=en_US.UTF-8", "type": "RHEL", "version": {"full_version": "7.5 - 8.el7", "major": 7, "minor": 5}}, "permissions": [], "port": 54321, "power_management": {"automatic_pm_enabled": true, "enabled": 
false, "kdump_detection": true, "pm_proxies": []}, "protocol": "stomp", "se_linux": {"mode": "enforcing"}, "spm": {"priority": 5, "status": "none"}, "ssh": {"fingerprint": "SHA256:pvf83fk8qaHH3w0mjHAVEDPOs6cOnGH42tPf7xKf/gk", "port": 22}, "statistics": [], "status": "non_responsive", "storage_connection_extensions": [], "summary": {"active": 1, "migrating": 0, "total": 1}, "tags": [], "transparent_huge_pages": {"enabled": true}, "type": "rhel", "unmanaged_networks": [], "update_available": false, "version": {"build": 19, "full_version": "vdsm-4.20.19-1.el7ev", "major": 4, "minor": 20, "revision": 0}}]}, "attempts": 50, "changed": false} Please specify the storage you would like to use (glusterfs, iscsi, fc, nfs)[nfs]: Components on host: ovirt-hosted-engine-setup-2.2.11-1.el7ev.noarch ovirt-hosted-engine-ha-2.2.6-1.el7ev.noarch Linux 3.10.0-858.el7.x86_64 #1 SMP Tue Feb 27 08:59:23 EST 2018 x86_64 x86_64 x86_64 GNU/Linux Red Hat Enterprise Linux Server release 7.5 (Maipo) Moving back to assigned.
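The failure above ends after 50 attempts with the host still reported as "non_responsive". A minimal sketch of pulling that status out of such a payload, assuming the result has been saved as JSON; the helper name host_statuses is hypothetical, and treating "up" as the healthy status is an assumption not stated in this report:

```python
import json


def host_statuses(result):
    """Map host name -> status from an ovirt_hosts facts payload."""
    hosts = result.get("ansible_facts", {}).get("ovirt_hosts", [])
    return {host.get("name"): host.get("status") for host in hosts}


# Trimmed-down sample carrying only fields that appear in the error above.
sample = json.loads("""
{"ansible_facts": {"ovirt_hosts": [
  {"name": "panther09.qa.lab.tlv.redhat.com", "status": "non_responsive"}
]}, "attempts": 50, "changed": false}
""")

statuses = host_statuses(sample)
# Assumption: "up" is the status a successfully added host would report.
not_up = [name for name, status in statuses.items() if status != "up"]

print(statuses)
print(not_up)
```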
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.
Created attachment 1403702 [details] sosreport from host
Are you sure it was IBRS enabled? What did VDSM say?
(In reply to Yaniv Kaul from comment #5) > Are you sure it was IBRS enabled? What did VDSM say? It was an IBRS-capable host. The VDSM log is attached within the sosreport. Are there any additional actions required for IBRS to work?
(In reply to Nikolai Sednev from comment #6) > (In reply to Yaniv Kaul from comment #5) > > Are you sure it was IBRS enabled? What did VDSM say? > > It was the IBRS capable host. > Vdsm log was attached within the sosreport. From VDSM log which you've attached: kernelFeatures': {u'IBRS': 0, u'PTI': 1, u'IBPB': 0}, > Is there any additional actions required for IBRS to work? I think IBRS should be equal to 1.
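The kernelFeatures excerpt quoted above can be checked mechanically. This is only a sketch, assuming the VDSM log line contains the mapping in the same Python-repr form as shown; the helper names parse_kernel_features and disabled_features are hypothetical:

```python
import ast
import re


def parse_kernel_features(log_text):
    """Extract the kernelFeatures mapping from a log excerpt, or None."""
    match = re.search(r"kernelFeatures'?:\s*(\{[^}]*\})", log_text)
    if match is None:
        return None
    # literal_eval accepts the u'...' string prefixes used in the log.
    return ast.literal_eval(match.group(1))


def disabled_features(features):
    """Names of features whose reported value is 0."""
    return sorted(name for name, value in features.items() if not value)


line = "kernelFeatures': {u'IBRS': 0, u'PTI': 1, u'IBPB': 0},"
features = parse_kernel_features(line)

print(features)
print(disabled_features(features))
```

For the excerpt in this comment, the disabled list contains IBPB and IBRS, which matches the conclusion that IBRS is not active on the host.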
https://access.redhat.com/articles/3311301

These three debugfs tunables can be enabled or disabled on the kernel command line at boot, or at runtime via debugfs controls. The tunables control Page Table Isolation (pti), Indirect Branch Restricted Speculation (ibrs), and Indirect Branch Prediction Barriers (ibpb). Red Hat enables each of these features by default as needed to protect the architecture detected at boot.

Architectural Defaults
By default, each of the 3 tunables that apply to an architecture will be enabled automatically at boot time, based upon the architecture detected.

Intel Defaults:
pti 1 ibrs 1 ibpb 1 -> fix variant#1 #2 #3
pti 1 ibrs 0 ibpb 0 -> fix variant#1 #3 (for older Intel systems with no microcode update available)

panther09 ~]# cat /sys/kernel/debug/x86/ibrs_enabled
0

I see that the defaults in the documentation somehow differ from those on a real host that was cleanly reprovisioned to RHEL 7.5. This might partially explain the deployment failure.

panther09 ~]# systemctl status microcode -l
● microcode.service - Load CPU microcode update
   Loaded: loaded (/usr/lib/systemd/system/microcode.service; enabled; vendor preset: enabled)
   Active: inactive (dead) since Sun 2018-03-04 15:54:55 IST; 4min 15s ago
  Process: 888 ExecStart=/usr/bin/bash -c grep -l GenuineIntel /proc/cpuinfo | xargs grep -l -E "model[[:space:]]*: 79$" > /dev/null || echo 1 > /sys/devices/system/cpu/microcode/reload (code=exited, status=0/SUCCESS)
 Main PID: 888 (code=exited, status=0/SUCCESS)

Mar 04 15:54:55 panther09.qa.lab.tlv.redhat.com systemd[1]: Starting Load CPU microcode update...
Mar 04 15:54:55 panther09.qa.lab.tlv.redhat.com systemd[1]: Started Load CPU microcode update.
panther09 ~]# dmesg | grep microcode
[ 0.000000] microcode: microcode updated early to revision 0x3a, date = 2017-01-30
[ 2.065211] microcode: CPU0 sig=0x306f2, pf=0x1, revision=0x3a
[ 2.065222] microcode: CPU1 sig=0x306f2, pf=0x1, revision=0x3a
[ 2.065234] microcode: CPU2 sig=0x306f2, pf=0x1, revision=0x3a
[ 2.065246] microcode: CPU3 sig=0x306f2, pf=0x1, revision=0x3a
[ 2.065257] microcode: CPU4 sig=0x306f2, pf=0x1, revision=0x3a
[ 2.065269] microcode: CPU5 sig=0x306f2, pf=0x1, revision=0x3a
[ 2.065281] microcode: CPU6 sig=0x306f2, pf=0x1, revision=0x3a
[ 2.065291] microcode: CPU7 sig=0x306f2, pf=0x1, revision=0x3a
[ 2.065302] microcode: CPU8 sig=0x306f2, pf=0x1, revision=0x3a
[ 2.065315] microcode: CPU9 sig=0x306f2, pf=0x1, revision=0x3a
[ 2.065338] microcode: CPU10 sig=0x306f2, pf=0x1, revision=0x3a
[ 2.065348] microcode: CPU11 sig=0x306f2, pf=0x1, revision=0x3a
[ 2.065370] microcode: CPU12 sig=0x306f2, pf=0x1, revision=0x3a
[ 2.065381] microcode: CPU13 sig=0x306f2, pf=0x1, revision=0x3a
[ 2.065392] microcode: CPU14 sig=0x306f2, pf=0x1, revision=0x3a
[ 2.065402] microcode: CPU15 sig=0x306f2, pf=0x1, revision=0x3a
[ 2.065470] microcode: Microcode Update Driver: v2.01 <tigran.co.uk>, Peter Oruba

It might be that the host has to go through a BIOS FW upgrade to get ibrs_enabled. Please provide your input.
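To compare a host against the Intel defaults quoted from the article, the tunables can be read and looked up against those two rows. This is a sketch only: the ibrs_enabled path comes from this comment, while the pti_enabled and ibpb_enabled file names are assumed by analogy, and the function names are hypothetical. The demo uses a temporary directory instead of debugfs, with values chosen to match what this host reported (ibrs 0 here, PTI 1 and IBPB 0 in the VDSM log):

```python
import tempfile
from pathlib import Path

TUNABLES = ("pti", "ibrs", "ibpb")

# Only the two Intel rows quoted from the article; any other combination
# (for example ibrs set to 2) is deliberately left unclassified.
INTEL_DEFAULTS = {
    (1, 1, 1): (1, 2, 3),
    (1, 0, 0): (1, 3),
}


def read_tunables(base="/sys/kernel/debug/x86"):
    """Return {tunable: int or None}; None means missing or unreadable."""
    values = {}
    for name in TUNABLES:
        # Assumption: each tunable lives in a file named <name>_enabled.
        path = Path(base) / f"{name}_enabled"
        try:
            values[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            values[name] = None
    return values


def covered_variants(values):
    """Variants fixed according to the quoted table, or None if not listed."""
    key = tuple(values.get(name) for name in TUNABLES)
    return INTEL_DEFAULTS.get(key)


# Demo against a throwaway directory so no root access or debugfs is needed.
with tempfile.TemporaryDirectory() as tmp:
    for name, value in (("pti", 1), ("ibrs", 0), ("ibpb", 0)):
        (Path(tmp) / f"{name}_enabled").write_text(f"{value}\n")
    demo = read_tunables(tmp)

print(demo)
print(covered_variants(demo))
```

On a real host, calling read_tunables() with its default path would typically require root, since debugfs is not readable by ordinary users.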
Might be - it's your server - I don't know if you have an updated BIOS FW - or the latest Microcode from Intel for this CPU. But the fact is, you don't have IBRS right now. So: 1. Please test with a host we know has IBRS. 2. Were you asked at any point in time to choose such a CPU? If not, how did we get to this situation in the 1st place (the failure)?
(In reply to Yaniv Kaul from comment #9) > Might be - it's your server - I don't know if you have an updated BIOS FW - > or the latest Microcode from Intel for this CPU. > > But the fact is, you don't have IBRS right now. > > So: > 1. Please test with a host we know has IBRS. > 2. Were you asked in any point in time to choose such CPU? If not, how did > we get to this situation in the 1st place (the failure) ? Regarding the first topic, the host is IBRS capable; I received it specifically from one of the QA teams to verify this bug, although it appears to have an unpatched BIOS, so I've already opened a ticket for that. Can you please rephrase your second topic? I don't quite get the point.
(In reply to Nikolai Sednev from comment #10) > (In reply to Yaniv Kaul from comment #9) > > Might be - it's your server - I don't know if you have an updated BIOS FW - > > or the latest Microcode from Intel for this CPU. > > > > But the fact is, you don't have IBRS right now. > > > > So: > > 1. Please test with a host we know has IBRS. > > 2. Were you asked in any point in time to choose such CPU? If not, how did > > we get to this situation in the 1st place (the failure) ? > > Regarding first topic, the host is IBRS capable, I've received it especially > from one of QA teams to verify this bug, although it appears to be with > unpatched BIOS and so I've already opened a ticket for that matter. > > Can you please rephrase your second topic? I don't quite getting the point. When using the Cockpit wizard, you should not be asked about the CPU type. Where were you asked? As part of command-line interactive setup / otopi / answer file?
(In reply to Yaniv Kaul from comment #11) > (In reply to Nikolai Sednev from comment #10) > > (In reply to Yaniv Kaul from comment #9) > > > Might be - it's your server - I don't know if you have an updated BIOS FW - > > > or the latest Microcode from Intel for this CPU. > > > > > > But the fact is, you don't have IBRS right now. > > > > > > So: > > > 1. Please test with a host we know has IBRS. > > > 2. Were you asked in any point in time to choose such CPU? If not, how did > > > we get to this situation in the 1st place (the failure) ? > > > > Regarding first topic, the host is IBRS capable, I've received it especially > > from one of QA teams to verify this bug, although it appears to be with > > unpatched BIOS and so I've already opened a ticket for that matter. > > > > Can you please rephrase your second topic? I don't quite getting the point. > > When using the Cockpit wizard, you should not be asked about the CPU type. > Where were you asked? As part of command-line interactive setup / otopi / > answer file? Ah...I was running not from the Cockpit, but via CLI.
(In reply to Nikolai Sednev from comment #12) > Ah...I was running not from the Cockpit, but via CLI. We ask for the cluster CPU type only in the vintage (--noansible) flow, since in that case hosted-engine-setup has to start the engine VM as we want it to be imported by the auto-import process. In the new flow we simply let the engine choose by itself according to the host capabilities.
I see some strange behavior on the host. On the one hand, IBRS should be supported on the host; on the other hand, it looks like the capability is turned off. By default, IBRS should be on if the host's architecture supports it: https://access.redhat.com/articles/3311301

[root@panther09 ~]# cat /sys/kernel/debug/x86/ibrs_enabled
0
[root@panther09 ~]# virsh -r capabilities | head
<capabilities>
  <host>
    <uuid>4c4c4544-0044-3510-8030-b4c04f434232</uuid>
    <cpu>
      <arch>x86_64</arch>
      <model>Haswell-noTSX-IBRS</model>
      <vendor>Intel</vendor>
      <microcode version='60'/>
      <topology sockets='1' cores='8' threads='2'/>

Any ideas?

Components on host:
ovirt-hosted-engine-ha-2.2.6-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.12-1.el7ev.noarch
Red Hat Enterprise Linux Server release 7.5 (Maipo)
Linux 3.10.0-858.el7.x86_64 #1 SMP Tue Feb 27 08:59:23 EST 2018 x86_64 x86_64 x86_64 GNU/Linux

The host's BIOS had been recently upgraded.
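The CPU model and microcode version that libvirt reports can be extracted from the capabilities XML for comparison with the kernel tunable. A minimal sketch, assuming the full output is well-formed XML; the sample below is the head output from this comment with closing tags added so that it parses, and the helper name host_cpu_summary is hypothetical:

```python
import xml.etree.ElementTree as ET

# The head output from the comment, with closing tags added
# (an assumption) so the fragment is well-formed.
CAPS = """<capabilities>
  <host>
    <uuid>4c4c4544-0044-3510-8030-b4c04f434232</uuid>
    <cpu>
      <arch>x86_64</arch>
      <model>Haswell-noTSX-IBRS</model>
      <vendor>Intel</vendor>
      <microcode version='60'/>
      <topology sockets='1' cores='8' threads='2'/>
    </cpu>
  </host>
</capabilities>"""


def host_cpu_summary(caps_xml):
    """Return the host CPU model, microcode version and an IBRS-model flag."""
    cpu = ET.fromstring(caps_xml).find("./host/cpu")
    model = cpu.findtext("model")
    microcode = cpu.find("microcode")
    version = int(microcode.get("version")) if microcode is not None else None
    return {
        "model": model,
        "microcode": version,
        # Heuristic only: the model name carries an -IBRS suffix.
        "ibrs_model": model is not None and model.endswith("-IBRS"),
    }


summary = host_cpu_summary(CAPS)
print(summary)
print(hex(summary["microcode"]))
```

The microcode version 60 is decimal; in hexadecimal it is 0x3c, which is newer than the 0x3a revision shown in the earlier dmesg output, consistent with the BIOS upgrade mentioned here.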
I've manually run "echo 2 > /sys/kernel/debug/x86/ibrs_enabled" and then tested with the script:

panther09 ~]# ./spectre-meltdown-checker.sh
Spectre and Meltdown mitigation detection tool v0.35

Checking for vulnerabilities on current system
Kernel is Linux 3.10.0-858.el7.x86_64 #1 SMP Tue Feb 27 08:59:23 EST 2018 x86_64
CPU is Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz

Hardware check
* Hardware support (CPU microcode) for mitigation techniques
  * Indirect Branch Restricted Speculation (IBRS)
    * SPEC_CTRL MSR is available: YES
    * CPU indicates IBRS capability: YES (SPEC_CTRL feature bit)
  * Indirect Branch Prediction Barrier (IBPB)
    * PRED_CMD MSR is available: YES
    * CPU indicates IBPB capability: YES (SPEC_CTRL feature bit)
  * Single Thread Indirect Branch Predictors (STIBP)
    * SPEC_CTRL MSR is available: YES
    * CPU indicates STIBP capability: YES
  * Enhanced IBRS (IBRS_ALL)
    * CPU indicates ARCH_CAPABILITIES MSR availability: NO
    * ARCH_CAPABILITIES MSR advertises IBRS_ALL capability: NO
  * CPU explicitly indicates not being vulnerable to Meltdown (RDCL_NO): NO
  * CPU microcode is known to cause stability problems: NO (model 63 stepping 2 ucode 0x3c)
* CPU vulnerability to the three speculative execution attacks variants
  * Vulnerable to Variant 1: YES
  * Vulnerable to Variant 2: YES
  * Vulnerable to Variant 3: YES

CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
* Mitigated according to the /sys interface: YES (kernel confirms that the mitigation is active)
* Kernel has array_index_mask_nospec: NO
* Kernel has the Red Hat/Ubuntu patch: YES
> STATUS: NOT VULNERABLE (Mitigation: Load fences)

CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
* Mitigated according to the /sys interface: YES (kernel confirms that the mitigation is active)
* Mitigation 1
  * Kernel is compiled with IBRS/IBPB support: YES
  * Currently enabled features
    * IBRS enabled for Kernel space: YES
    * IBRS enabled for User space: YES
    * IBPB enabled: YES
* Mitigation 2
  * Kernel compiled with retpoline option: YES
  * Kernel compiled with a retpoline-aware compiler: UNKNOWN
> STATUS: NOT VULNERABLE (Mitigation: IBRS (kernel and user space))

CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
* Mitigated according to the /sys interface: YES (kernel confirms that the mitigation is active)
* Kernel supports Page Table Isolation (PTI): YES
* PTI enabled and active: YES
* Running as a Xen PV DomU: NO
> STATUS: NOT VULNERABLE (Mitigation: PTI)

A false sense of security is worse than no security at all, see --disclaimer

With IBRS manually enabled, I still can't deploy SHE, as I'm blocked by:
https://bugzilla.redhat.com/show_bug.cgi?id=1551289
https://bugzilla.redhat.com/show_bug.cgi?id=1551291
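The per-CVE verdicts in the checker report above can be collected into a small table. This is a sketch that assumes the report is available as multi-line text with each CVE heading and each "> STATUS:" entry on its own line; the function name cve_statuses is hypothetical and the sample is abbreviated from the output in this comment:

```python
import re


def cve_statuses(report):
    """Map each CVE id to the verdict on the STATUS line that follows it."""
    statuses = {}
    current = None
    for line in report.splitlines():
        cve = re.match(r"\s*(CVE-\d{4}-\d+)", line)
        if cve:
            current = cve.group(1)
            continue
        # Capture the verdict up to the first parenthesis or end of line.
        status = re.match(r"\s*>\s*STATUS:\s*(.+?)\s*(?:\(|$)", line)
        if status and current:
            statuses[current] = status.group(1)
    return statuses


# Abbreviated sample: only the heading and STATUS lines from the report.
REPORT = """\
CVE-2017-5753 [bounds check bypass] aka 'Spectre Variant 1'
> STATUS: NOT VULNERABLE (Mitigation: Load fences)
CVE-2017-5715 [branch target injection] aka 'Spectre Variant 2'
> STATUS: NOT VULNERABLE (Mitigation: IBRS (kernel and user space))
CVE-2017-5754 [rogue data cache load] aka 'Meltdown' aka 'Variant 3'
> STATUS: NOT VULNERABLE (Mitigation: PTI)
"""

results = cve_statuses(REPORT)
for cve_id, verdict in results.items():
    print(cve_id, verdict)
```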
I've manually run "echo 2 > /sys/kernel/debug/x86/ibrs_enabled" and deployed Node 0 over NFS:
[ INFO ] Hosted Engine successfully deployed

Components on host:
ovirt-hosted-engine-ha-2.2.7-1.el7ev.noarch
ovirt-hosted-engine-setup-2.2.13-1.el7ev.noarch
rhvm-appliance-4.2-20180202.0.el7.noarch
Linux 3.10.0-861.el7.x86_64 #1 SMP Wed Mar 14 10:21:01 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.5 (Maipo)

Works for me, moving to verified.
This bugzilla is included in oVirt 4.2.2 release, published on March 28th 2018. Since the problem described in this bug report should be resolved in oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.