Bug 1877420 - f33 guest with cpu mode=host-model fails, mode=host-passthrough works, on AMD EPYC [NEEDINFO]
Summary: f33 guest with cpu mode=host-model fails, mode=host-passthrough works, on AMD...
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: libvirt
Version: 33
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Libvirt Maintainers
QA Contact: Fedora Extras Quality Assurance
Duplicates: 1891885 1993349
Reported: 2020-09-09 15:17 UTC by Artem
Modified: 2021-08-16 09:20 UTC (History)

Type: Bug
dgilbert: needinfo? (jamartis)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1869046 0 unspecified CLOSED Click on CPUs, Error refreshing hardware page: attrib must be dict, not NoneType 2021-02-22 00:41:40 UTC

Description Artem 2020-09-09 15:17:32 UTC
## Description of problem:
When I click "Install Fedora", nothing happens: Anaconda fails to start because of an error.


## Version-Release number of selected component (if applicable):
anaconda-33.25.2-2.fc33


## How reproducible:
Run Anaconda from the GUI or from a terminal:

  sudo anaconda


## Actual results:
Starting installer, one moment...
anaconda 33.25.2-1.fc33 for anaconda bluesky (pre-release) started.
 * installation log files are stored in /tmp during the installation
 * shell is available on TTY2 and in second TMUX pane (ctrl+b, then press 2)
 * when reporting a bug add logs from /tmp as separate text/plain attachments
Anaconda received signal 04!.
/usr/lib64/python3.9/site-packages/pyanaconda/_isys.so(+0x13c7)[0x7fe7fa0d73c7]
/lib64/libc.so.6(+0x3dc50)[0x7fe8090d7c50]
/lib64/libgmp.so.10(__gmpn_sqr_basecase_zen+0xa0)[0x7fe7f967db00]
[New LWP 2739]
[New LWP 2740]
[New LWP 2834]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007fe809167d0f in wait4 () from /lib64/libc.so.6
Saved corefile /tmp/anaconda.core.2730
[Inferior 1 (process 2730) detached]
Killed


## Additional info:
Testing image https://kojipkgs.fedoraproject.org/compose/branched/latest-Fedora-33/compose/Workstation/x86_64/iso/Fedora-Workstation-Live-x86_64-33-20200909.n.0.iso

Tested in virt-manager with default options for new VM.

Comment 1 Artem 2020-09-10 08:39:51 UTC
I suspect this is related to a bug in virt-manager itself with the new Python 3.9 in f33: https://bugzilla.redhat.com/show_bug.cgi?id=1869046
On older, already existing machines Anaconda starts without this issue.

Comment 2 Vladimír Slávik 2020-09-11 12:55:29 UTC
I can't reproduce using the latest stable F33 Live compose, and your link gives me 404. The Anaconda I have is anaconda-33.25.2-1.fc33 so that differs only in -1/-2. No idea how to proceed.

https://kojipkgs.fedoraproject.org/compose/branched/Fedora-33-20200910.n.0/compose/Workstation/x86_64/iso/Fedora-Workstation-Live-x86_64-33-20200910.n.0.iso

Comment 3 Artem 2020-09-11 16:00:47 UTC
I've also installed it successfully in gnome-boxes-3.37.90-1.fc33, but I tried many times to do the same in virt-manager and always got the same error. Very weird. Should we reassign this to virt-manager?

Still, this looks very strange to me: I tried installing the older f31 in virt-manager and had no issues with Anaconda.

Comment 4 Vladimír Slávik 2020-09-14 09:53:33 UTC
I agree, if that's the common part of the problem, switching to virt-manager...

Comment 5 Cole Robinson 2020-09-14 16:14:15 UTC
It probably has nothing to do with bug 1869046, which is just a virt-manager UI issue. That shouldn't trigger any different guest behavior.

But if it works in gnome-boxes and not in virt-manager it could be down to how we configure the VMs differently.

Please provide:

gnome-boxes VM XML: virsh --connect qemu:///session dumpxml $VMNAME1 
virt-manager VM XML: virsh --connect qemu:///system dumpxml $VMNAME2
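
Once both dumps are saved, the CPU definitions can be compared side by side. The following is a minimal sketch, assuming the XML text has already been obtained with the virsh commands above. The helper name cpu_summary and the two trimmed sample documents are hypothetical and are not the reporter's actual configs.

```python
import xml.etree.ElementTree as ET


def cpu_summary(domain_xml: str) -> dict:
    """Return the mode, model and feature policies of a domain's <cpu> element."""
    root = ET.fromstring(domain_xml)
    cpu = root.find("cpu")
    if cpu is None:
        return {"mode": None, "model": None, "features": {}}
    model = cpu.find("model")
    return {
        "mode": cpu.get("mode"),
        "model": model.text if model is not None else None,
        "features": {f.get("name"): f.get("policy") for f in cpu.findall("feature")},
    }


# Hypothetical, heavily trimmed dumps used only to exercise the helper.
SAMPLE_A = "<domain><name>vm1</name><cpu mode='host-passthrough'/></domain>"
SAMPLE_B = (
    "<domain><name>vm2</name><cpu mode='custom'>"
    "<model>EPYC</model><feature policy='disable' name='adx'/>"
    "</cpu></domain>"
)

for label, xml_text in (("VMNAME1", SAMPLE_A), ("VMNAME2", SAMPLE_B)):
    print(label, cpu_summary(xml_text))
```

Any difference in mode, model or feature policy between the two summaries is a candidate explanation for the differing guest behavior.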

Comment 6 Artem 2020-09-14 17:21:50 UTC
Here is XML dump of gnome-boxes config: https://atim.fedorapeople.org/gnome-boxes.xml

I'm now downloading the daily f33 build at 17KB/s, so this will take some time...

Comment 7 Artem 2020-09-15 08:05:03 UTC
Finally downloaded. I am still able to reproduce this with the latest f33 image in virt-manager.

GNOME Boxes config: https://atim.fedorapeople.org/gnome-boxes.xml
Virt Manager config: https://atim.fedorapeople.org/virt-manager.xml

Comment 8 Cole Robinson 2020-09-15 21:36:18 UTC
(In reply to Artem from comment #7)
> Downloaded finally. I am still able to reproduce this with latest f33 image
> in virt-manager.
> 
> GNOME Boxes config: https://atim.fedorapeople.org/gnome-boxes.xml
> Virt Manager config: https://atim.fedorapeople.org/virt-manager.xml

My only guess is the CPU model. Can you try reproducing with virt-manager
again, but at the end of the New VM dialog, choose the 'Customize' option,
navigate to the CPU page, uncheck 'copy host CPU', enter 'host-passthrough'
into the text field, click apply, and then kick off the install?
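
For anyone who prefers to edit the domain XML rather than use the dialog, the same change amounts to replacing the <cpu> element with one whose mode is host-passthrough. This is a rough sketch under that assumption. The function name set_cpu_mode and the sample document are hypothetical, and the result would still need to be loaded back into libvirt by whatever means you normally use.

```python
import xml.etree.ElementTree as ET


def set_cpu_mode(domain_xml: str, mode: str = "host-passthrough") -> str:
    """Replace any existing <cpu> element with a bare <cpu mode=...> element.

    Note: child elements of the old <cpu> (model, features, topology) are dropped.
    """
    root = ET.fromstring(domain_xml)
    old_cpu = root.find("cpu")
    if old_cpu is not None:
        root.remove(old_cpu)
    ET.SubElement(root, "cpu", {"mode": mode})
    return ET.tostring(root, encoding="unicode")


# Hypothetical minimal domain used only for illustration.
SAMPLE_DOMAIN = "<domain><name>f33</name><cpu mode='host-model'/></domain>"
print(set_cpu_mode(SAMPLE_DOMAIN))
```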

Comment 9 Artem 2020-09-16 06:59:55 UTC
Yep, changing the CPU model resolves this issue. With 'host-passthrough' I get a "Permission denied" error, but with 'kvm64' the settings were saved fine and Anaconda started in f33 without issue. Bug 1869046 is still annoying when changing CPU settings for a new machine. :)

Comment 10 Cole Robinson 2020-09-16 14:03:10 UTC
Okay thanks for checking. For reference here is the expanded virt-manager host-model CPU:

<cpu mode="custom" match="exact" check="full">
  <model fallback="forbid">EPYC</model>
  <vendor>AMD</vendor>
  <feature policy="require" name="x2apic"/>
  <feature policy="require" name="tsc-deadline"/>
  <feature policy="require" name="hypervisor"/>
  <feature policy="require" name="tsc_adjust"/>
  <feature policy="require" name="arch-capabilities"/>
  <feature policy="require" name="cmp_legacy"/>
  <feature policy="require" name="xop"/>
  <feature policy="require" name="fma4"/>
  <feature policy="require" name="tbm"/>
  <feature policy="require" name="perfctr_core"/>
  <feature policy="require" name="virt-ssbd"/>
  <feature policy="disable" name="npt"/>
  <feature policy="disable" name="nrip-save"/>
  <feature policy="require" name="rdctl-no"/>
  <feature policy="require" name="skip-l1dfl-vmentry"/>
  <feature policy="require" name="mds-no"/>
  <feature policy="require" name="pschange-mc-no"/>
  <feature policy="disable" name="monitor"/>
  <feature policy="disable" name="rdrand"/>
  <feature policy="disable" name="rdseed"/>
  <feature policy="disable" name="adx"/>
  <feature policy="disable" name="smap"/>
  <feature policy="disable" name="clflushopt"/>
  <feature policy="disable" name="sha-ni"/>
  <feature policy="disable" name="xsavec"/>
  <feature policy="disable" name="xgetbv1"/>
  <feature policy="disable" name="svm"/>
  <feature policy="require" name="topoext"/>
</cpu>
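
To see at a glance which features the guest loses relative to the named model, the element above can be filtered by policy. This is a small sketch that uses a trimmed copy of the XML from this comment. The helper name features_with_policy is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <cpu> element above (only a few features kept).
CPU_XML = """
<cpu mode="custom" match="exact" check="full">
  <model fallback="forbid">EPYC</model>
  <vendor>AMD</vendor>
  <feature policy="require" name="xop"/>
  <feature policy="require" name="fma4"/>
  <feature policy="disable" name="rdseed"/>
  <feature policy="disable" name="adx"/>
  <feature policy="disable" name="sha-ni"/>
</cpu>
"""


def features_with_policy(cpu_xml: str, policy: str) -> list:
    """List feature names carrying the given policy, in document order."""
    cpu = ET.fromstring(cpu_xml)
    return [f.get("name") for f in cpu.findall("feature") if f.get("policy") == policy]


print("model:", ET.fromstring(CPU_XML).findtext("model"))
print("disabled:", features_with_policy(CPU_XML, "disable"))
print("required:", features_with_policy(CPU_XML, "require"))
```

The point of interest is that the guest is told it is an EPYC while several features that model normally has are switched off.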

Comment 11 Cole Robinson 2020-09-16 14:04:47 UTC
Artem, you mentioned a 'permission denied' error. Can you provide the full error? I'm not sure what permission issue we should be hitting here.

Also, bug 1869046 should be fixed with the latest virt-manager build pushed yesterday. It probably won't affect this issue though.

Comment 12 Artem 2020-09-16 18:50:26 UTC
(In reply to Cole Robinson from comment #11)
> Artem you mentioned a 'permission denied' error. Can you provide the full
> error? I'm not sure what permission issue should be hitting here

Sure. What I can tell at this moment: I tried again with 'host-passthrough' and there was no permission denied issue. Anaconda also starts OK with 'host-passthrough'. I haven't changed any settings since this morning, but there was a system update, and the kernel was updated as well.

I'll keep watching, and if the permission denied issue appears again I'll try to capture the output and report back. It is unlikely to be an SELinux issue, though: there are no errors in 'ausearch'.

> Also bug 1869046 should be fixed with the latest virt-manager build pushed
> yesterday. Probably won't affect this issue though

Great, I'll try this update soon.

Comment 13 Artem 2020-09-18 15:57:06 UTC
I've tested this case with the new Virt Manager 3.0: I created a new machine and tried to start Anaconda in f33. It won't start with the default configuration, but with 'host-passthrough' Anaconda works fine. The weird permission denied issue is also gone.

Here is the new XML dump from virt-manager 3.0: https://atim.fedorapeople.org/virt-manager-2.xml

Comment 14 Vladimír Slávik 2020-09-21 09:56:06 UTC
Is it just Anaconda? If you install the system, does it work and then break when you change back to the default configuration?

Comment 15 Cole Robinson 2020-09-21 20:45:13 UTC
Artem, can you provide /var/log/libvirt/qemu/$VMNAME.log as well?

I wonder if this is the same as https://bugzilla.redhat.com/show_bug.cgi?id=1854045,
which is a kernel issue affecting something close to your CPU.
Interested in trying kernel 5.9.0 from rawhide? dnf --enablerepo=\*rawhide update kernel\*

Comment 16 Artem 2020-09-25 15:08:11 UTC
(In reply to Cole Robinson from comment #15)
> Artem can you provide /var/log/libvirt/qemu/$VMNAME.log as well?

Log: https://atim.fedorapeople.org/fedora-2.log

> Which is a kernel issue affecting something close to your CPU.

To clarify, since I don't know whether it is OK that my CPU was identified as EPYC: my CPU was an AMD Athlon X4 845.

I've now upgraded to a Ryzen 3300X and everything works in virt-manager out of the box (same host Fedora system and installation). This CPU is identified automatically as 'EPYC-IBPB'. Anaconda works. Tested with this ISO: https://kojipkgs.fedoraproject.org/compose/branched/Fedora-33-20200925.n.0/compose/Workstation/x86_64/iso/Fedora-Workstation-Live-x86_64-33-20200925.n.0.iso

(In reply to Vladimír Slávik from comment #14)
> Is it just Anaconda? If you install the system, will it work but break if
> you change to default configuration?

Great idea, but I can't test this now since I've upgraded to a new CPU.

I think the problem only affects this AMD Athlon X4 845 CPU.

Comment 17 Dr. David Alan Gilbert 2020-09-25 17:41:48 UTC
Artem: As well as what Cole was asking for, can you show the 'lscpu' output from both your host Athlon X4 and a guest running on it?

Comment 18 Dr. David Alan Gilbert 2020-09-25 17:43:26 UTC
(My reading is that gmp uses the mulx and adcx instructions, which need the BMI2 and ADX CPU flags respectively.)
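
A quick way to check for those two flags in saved 'lscpu' output is sketched below. The helper names and the sample excerpt are hypothetical. The excerpt is not real output from the reporter's host or guest.

```python
def parse_lscpu_flags(lscpu_output: str) -> set:
    """Return the set of CPU flags from the 'Flags:' line of lscpu output."""
    for line in lscpu_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "flags":
            return set(value.split())
    return set()


def missing_flags(lscpu_output: str, needed=("adx", "bmi2")) -> list:
    """List the needed flags that the CPU does not advertise."""
    flags = parse_lscpu_flags(lscpu_output)
    return [name for name in needed if name not in flags]


# Hypothetical excerpt for illustration only.
SAMPLE_LSCPU = """\
Architecture:        x86_64
Model name:          AMD EPYC Processor
Flags:               fpu vme sse2 avx avx2 bmi1 bmi2
"""

print("missing:", missing_flags(SAMPLE_LSCPU))
```

Running it against output from both the host and the guest would show whether either one lacks a flag that the crashing code path relies on.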

Comment 19 Dr. David Alan Gilbert 2020-09-25 17:52:28 UTC
although hmm, I suspect the code might be basing its choice of instructions on cpu model/family rather than flags; which then breaks when it gets told it's an EPYC without those flags.
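
The suspected failure mode can be illustrated with a toy comparison of the two selection strategies. This is not gmp's code: both selector functions and the "zen"/"generic" labels are hypothetical stand-ins, and the guest's flag set is an assumption apart from adx being disabled.

```python
# Illustrative only: this is NOT gmp's code. Both selectors and the
# "zen"/"generic" labels are hypothetical stand-ins for optimized code paths.

def pick_by_model(model_name: str) -> str:
    """Trust the advertised CPU model and ignore the feature flags."""
    return "zen" if model_name.startswith("EPYC") else "generic"


def pick_by_flags(flags: set) -> str:
    """Use the optimized path only if the needed flags are really present."""
    return "zen" if {"adx", "bmi2"} <= flags else "generic"


# Guest as described in this bug: model EPYC with adx disabled.
# The presence of bmi2 and avx2 in the guest is an assumption.
guest_model = "EPYC"
guest_flags = {"bmi2", "avx2"}

print("by model:", pick_by_model(guest_model))
print("by flags:", pick_by_flags(guest_flags))
```

If selection works like the first function, a guest advertised as EPYC but lacking adx would be sent down a path containing instructions it cannot execute, which would be consistent with signal 04 (illegal instruction) in __gmpn_sqr_basecase_zen.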

Comment 20 Dr. David Alan Gilbert 2020-09-25 17:54:31 UTC
Jakub:
  I had a quick scan of gmp's code; am I right in thinking gmp decides which CPU optimisation to use based on cpu family/model rather than checking the flags themselves?

Comment 21 Artem 2020-09-25 18:25:42 UTC
(In reply to Dr. David Alan Gilbert from comment #17)
> Artem: As well as what Cole was asking for, can you show an   'lscpu' from
> both your host Athlon X4 and a guest running on it?

It is really hard to do this right now for this old Athlon X4. :( Maybe we can find this info in the Fedora kernel tests (https://apps.fedoraproject.org/kerneltest/stats), at least for the host? I ran many tests on the Athlon X4 CPU.

But I also found some of my old notes about the instructions available on this CPU:

gcc -march=native -E -v - </dev/null 2>&1 | grep cc1

 /usr/libexec/gcc/x86_64-redhat-linux/8/cc1 -E -quiet -v - -march=bdver4 -mmmx -mno-3dnow -msse -msse2 -msse3 -mssse3 -msse4a -mcx16 -msahf -mmovbe -maes -mno-sha -mpclmul -mpopcnt -mabm -mlwp -mfma -mfma4 -mxop -mbmi -mno-sgx -mbmi2 -mno-pconfig -mno-wbnoinvd -mtbm -mavx -mavx2 -msse4.2 -msse4.1 -mlzcnt -mno-rtm -mno-hle -mrdrnd -mf16c -mfsgsbase -mno-rdseed -mprfchw -mno-adx -mfxsr -mxsave -mxsaveopt -mno-avx512f -mno-avx512er -mno-avx512cd -mno-avx512pf -mno-prefetchwt1 -mno-clflushopt -mno-xsavec -mno-xsaves -mno-avx512dq -mno-avx512bw -mno-avx512vl -mno-avx512ifma -mno-avx512vbmi -mno-avx5124fmaps -mno-avx5124vnniw -mno-clwb -mmwaitx -mno-clzero -mno-pku -mno-rdpid -mno-gfni -mno-shstk -mno-avx512vbmi2 -mno-avx512vnni -mno-vaes -mno-vpclmulqdq -mno-avx512bitalg -mno-movdiri -mno-movdir64b --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=1024 -mtune=bdver4
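
The cc1 line above is long, so here is a minimal sketch for pulling out the enabled and disabled instruction-set options. The function name is made up for illustration, and the sample string is a short excerpt of the line above rather than the full output.

```python
def parse_gcc_isa_flags(cc1_line: str):
    """Split -mFOO / -mno-FOO options into (enabled, disabled) sets.

    Options with a value, such as -march=... or -mtune=..., are skipped.
    """
    enabled, disabled = set(), set()
    for token in cc1_line.split():
        if "=" in token:
            continue
        if token.startswith("-mno-"):
            disabled.add(token[len("-mno-"):])
        elif token.startswith("-m"):
            enabled.add(token[len("-m"):])
    return enabled, disabled


# Short excerpt of the cc1 line above.
SAMPLE_CC1 = "-march=bdver4 -mbmi -mbmi2 -mrdrnd -mno-rdseed -mno-adx -mno-sha -mtune=bdver4"

enabled, disabled = parse_gcc_isa_flags(SAMPLE_CC1)
print("bmi2 enabled:", "bmi2" in enabled)
print("adx disabled:", "adx" in disabled)
```

According to these notes the host supports BMI2 but not ADX, so the adcx instruction would not be available on it.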

Comment 22 Cole Robinson 2020-11-14 20:26:38 UTC
*** Bug 1891885 has been marked as a duplicate of this bug. ***

Comment 23 Phil Seymour 2021-04-23 22:32:39 UTC
This is also happening on Fedora Silverblue 34 installed as a libvirt qemu/kvm guest.
Host - RHEL 7.9
Install image - Fedora-Silverblue-ostree-x86_64-34-1.2.iso
Kernel - 5.11.12-300.fc34.x86_64
Host system - CPU: AMD A6-9200 RADEON R4, 5 COMPUTE CORES 2C+3G @ 2x 2GHz

Selecting host cpu pass-through gets around the problem though.

Please advise if you need any more info.

Comment 24 Vendula Poncova 2021-08-16 09:20:43 UTC
*** Bug 1993349 has been marked as a duplicate of this bug. ***
