The internal error below appears on a RHEL-8.3 aarch64 KVM host.

Sep 07 21:52:08 hp-moonshot-03-c03.lab.eng.rdu2.redhat.com libvirtd[32318]: hostname: hp-moonshot-03-c03.lab.eng.rdu2.redhat.com
Sep 07 21:52:08 hp-moonshot-03-c03.lab.eng.rdu2.redhat.com libvirtd[32318]: this function is not supported by the connection driver: cannot detect host CPU model for aarch64 architecture
Sep 07 21:52:08 hp-moonshot-03-c03.lab.eng.rdu2.redhat.com libvirtd[32318]: this function is not supported by the connection driver: cannot detect host CPU model for aarch64 architecture
Sep 07 21:52:08 hp-moonshot-03-c03.lab.eng.rdu2.redhat.com libvirtd[32318]: internal error: Invalid unsigned integer value '-1' in file '/sys/devices/system/cpu/cpu0/topology/die_id'
Sep 07 21:52:08 hp-moonshot-03-c03.lab.eng.rdu2.redhat.com libvirtd[32318]: Failed to query host NUMA topology, faking single NUMA node
Sep 07 21:52:15 hp-moonshot-03-c03.lab.eng.rdu2.redhat.com systemd[1]: Listening on Virtual machine log manager socket.

[root@hp-moonshot-03-c03 os_tests_result]# lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  1
Core(s) per socket:  2
Socket(s):           4
NUMA node(s):        1
Vendor ID:           APM
Model:               1
Model name:          X-Gene
Stepping:            0x0
BogoMIPS:            100.00
NUMA node0 CPU(s):   0-7
Flags:               fp asimd evtstrm cpuid

[root@hp-moonshot-03-c03 os_tests_result]# rpm -qa|grep libvirt-6
python3-libvirt-6.0.0-1.module+el8.3.0+6423+e4cb6418.aarch64
libvirt-6.0.0-27.module+el8.3.0+7602+4b93512e.aarch64

The patch below may be required to fix this on the aarch64 platform.

commit 0137bf0dab2738d5443e2f407239856e2aa25bb3
Author: Daniel Henrique Barboza <danielhb413>
Date:   Mon Mar 16 21:01:34 2020 -0300

    virhostcpu.c: fix 'die_id' parsing for Power hosts

    Commit 7b79ee2f78 makes assumptions about die_id parsing in
    the sysfs that aren't true for Power hosts. In both Power8 and
    Power9, running 5.6 and 4.18 kernel respectively, 'die_id' is
    set to -1:

    $ cat /sys/devices/system/cpu/cpu0/topology/die_id
    -1

    This breaks virHostCPUGetDie() parsing because it is trying to
    retrieve an unsigned integer, causing problems during VM start:

    virFileReadValueUint:4128 : internal error: Invalid unsigned integer
    value '-1' in file '/sys/devices/system/cpu/cpu0/topology/die_id'

    This isn't necessarily a PowerPC only behavior. Linux kernel commit
    0e344d8c70 added in the former Documentation/cputopology.txt, now
    Documentation/admin-guide/cputopology.rst, that:

      To be consistent on all architectures, include/linux/topology.h
      provides default definitions for any of the above macros that are
      not defined by include/asm-XXX/topology.h:

      1) topology_physical_package_id: -1
      2) topology_die_id: -1
      (...)

    This means that it might be expected that an architecture that
    does not implement the die_id element will mark it as -1 in
    sysfs.

    It is not required to change die_id implementation from uInt to
    Int because of that. Instead, let's change the parsing of the
    die_id in virHostCPUGetDie() to read an integer value and, in
    case it's -1, default it to zero like in case of file not found.
    This is enough to solve the issue Power hosts are experiencing.

    Fixes: 7b79ee2f78bbf2af76df2f6466919e19ae05aeeb
    Signed-off-by: Daniel Henrique Barboza <danielhb413>
    Reviewed-by: Michal Privoznik <mprivozn>

Version-Release number of selected components (if applicable):
RHEL Version: RHEL-8.3 (4.18.0-235.el8.aarch64)

How reproducible:
100%

Steps to Reproduce:
1. Start a RHEL-8.3 aarch64 host.
2. Install and enable libvirtd.
3. Install a RHEL guest and check the journal log.

Actual results:
An internal error is logged by libvirtd.

Expected results:
No such error from libvirtd.

Additional info:
- N/A
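The parsing change described in the commit message can be illustrated with a small sketch. This is not libvirt's C code; it is a Python illustration of the same idea, assuming a hypothetical helper named read_die_id that takes the sysfs file path as an argument. Treating a missing file and a value of -1 as die 0 follows the commit message; rejecting other negative values is an assumption made here for the example.

```python
import os
import tempfile


def read_die_id(path):
    """Read a die ID as a signed integer and map "not implemented" to 0.

    Hypothetical helper for illustration only, not a libvirt API.
    """
    try:
        with open(path) as f:
            # Parse as a signed integer so that "-1" is accepted.
            value = int(f.read().strip())
    except FileNotFoundError:
        # No die_id file at all: default to die 0.
        return 0
    if value == -1:
        # The kernel default for architectures without a die ID.
        return 0
    if value < 0:
        # Assumption of this sketch: other negative values are invalid.
        raise ValueError(f"unexpected die_id {value} in {path}")
    return value


def demo():
    """Write "-1" to a temporary file and parse it like a sysfs die_id."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "die_id")
        with open(path, "w") as f:
            f.write("-1\n")
        return read_die_id(path)


if __name__ == "__main__":
    print(demo())
```

On a real host the path would be /sys/devices/system/cpu/cpu0/topology/die_id, as shown in the log above; the demo uses a temporary file so it runs anywhere.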
We don't have a die ID on AArch64, so I agree with the proposed solution in comment 0.
Daniel, would you please backport your upstream patch mentioned in comment 0? It seems this is a non-x86_64 problem. Thanks.
Hi,

(In reply to Jaroslav Suchanek from comment #3)
> Daniel, would you please backport your upstream patch mentioned in comment 0?
> Seems like this is non-x86_64 problem.
>
> Thanks.

A backport of upstream 0137bf0dab2738d544 to RHEL-8.3.0 was posted downstream.
Xiao,

Would you take this as the QA contact and help verify it?

Actually, I just realized that you reproduced this issue on RHEL-8.3 non-AV. For aarch64, only AV is supported, for a single customer.

Maybe it's good to have the fix anyway, but would you help check whether AV has this issue (probably not, given the date of the upstream commit...)?

Thanks!
(In reply to Luiz Capitulino from comment #8)
> Maybe it's good to have the fix anyways, but would you help
> checking if AV has this issue (probably not, due to the date
> of the upstream commit...)
>

The commit is already present in the rhel-av-8.3.0 tree. The issue shouldn't be reproduced with AV.

Thanks,

DHB
(In reply to Luiz Capitulino from comment #8)
> Xiao,
>
> Would you take this as the QA contact and help verify it?

yes.

>
> Actually, I just realized that you reproduced this issue on
> RHEL-8.3 non-AV. For aarch64, only AV is supported for a single
> customer.
>
> Maybe it's good to have the fix anyways, but would you help
> checking if AV has this issue (probably not, due to the date
> of the upstream commit...)
>

AV does not have this issue.

# rpm -qa|grep libvirt-6
python3-libvirt-6.0.0-1.module+el8.3.0+6423+e4cb6418.aarch64
libvirt-6.6.0-6.module+el8.3.0+8125+aefcf088.aarch64

# journalctl -u libvirtd
-- Logs begin at Thu 2020-10-15 22:10:05 EDT, end at Fri 2020-10-16 02:10:24 EDT. --
Oct 16 02:10:19 ampere-hr330a-05.khw4.lab.eng.bos.redhat.com systemd[1]: Starting Virtualization daemon...
Oct 16 02:10:19 ampere-hr330a-05.khw4.lab.eng.bos.redhat.com systemd[1]: Started Virtualization daemon.
Oct 16 02:10:20 ampere-hr330a-05.khw4.lab.eng.bos.redhat.com dnsmasq[1965]: listening on virbr0(#3): 192.168.122.1
Oct 16 02:10:20 ampere-hr330a-05.khw4.lab.eng.bos.redhat.com dnsmasq[1984]: started, version 2.79 cachesize 150
Oct 16 02:10:20 ampere-hr330a-05.khw4.lab.eng.bos.redhat.com dnsmasq[1984]: compile time options: IPv6 GNU-getopt DBus no-i18n IDN2 DHCP DHCPv6 no-Lua TFTP no-conntrack ipse>
Oct 16 02:10:20 ampere-hr330a-05.khw4.lab.eng.bos.redhat.com dnsmasq-dhcp[1984]: DHCP, IP range 192.168.122.2 -- 192.168.122.254, lease time 1h
Oct 16 02:10:20 ampere-hr330a-05.khw4.lab.eng.bos.redhat.com dnsmasq-dhcp[1984]: DHCP, sockets bound exclusively to interface virbr0
Oct 16 02:10:20 ampere-hr330a-05.khw4.lab.eng.bos.redhat.com dnsmasq[1984]: reading /etc/resolv.conf
Oct 16 02:10:20 ampere-hr330a-05.khw4.lab.eng.bos.redhat.com dnsmasq[1984]: using nameserver 10.19.42.41#53
Oct 16 02:10:20 ampere-hr330a-05.khw4.lab.eng.bos.redhat.com dnsmasq[1984]: using nameserver 10.11.5.19#53
Oct 16 02:10:20 ampere-hr330a-05.khw4.lab.eng.bos.redhat.com dnsmasq[1984]: using nameserver 10.5.30.160#53
Oct 16 02:10:20 ampere-hr330a-05.khw4.lab.eng.bos.redhat.com dnsmasq[1984]: read /etc/hosts - 2 addresses
Oct 16 02:10:20 ampere-hr330a-05.khw4.lab.eng.bos.redhat.com dnsmasq[1984]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Oct 16 02:10:20 ampere-hr330a-05.khw4.lab.eng.bos.redhat.com dnsmasq-dhcp[1984]: read /var/lib/libvirt/dnsmasq/default.hostsfile
Oct 15 22:12:26 ampere-hr330a-05.khw4.lab.eng.bos.redhat.com systemd[1]: libvirtd.service: Succeeded.
Test passed on RHEL-8.4.0-20201222.n.0, so moving this to 'VERIFIED'. The die_id file still reads -1, but libvirtd no longer logs the internal error.

# rpm -qa |grep libvirt-6
python3-libvirt-6.0.0-1.module+el8.3.0+6423+e4cb6418.aarch64
libvirt-6.0.0-32.module+el8.4.0+9172+b707c649.aarch64

# uname -r
4.18.0-265.el8.aarch64

# cat /sys/devices/system/cpu/cpu0/topology/die_id
-1

Dec 23 03:19:25 ampere-hr330a-04.khw4.lab.eng.bos.redhat.com libvirtd[22765]: libvirt version: 6.0.0, package: 32.module+el8.4.0+9172+b707c649 (Red Hat, Inc. <http://bugz>
Dec 23 03:19:25 ampere-hr330a-04.khw4.lab.eng.bos.redhat.com libvirtd[22765]: hostname: ampere-hr330a-04.khw4.lab.eng.bos.redhat.com
Dec 23 03:19:25 ampere-hr330a-04.khw4.lab.eng.bos.redhat.com libvirtd[22765]: this function is not supported by the connection driver: cannot detect host CPU model for aa>
Dec 23 03:19:25 ampere-hr330a-04.khw4.lab.eng.bos.redhat.com libvirtd[22765]: this function is not supported by the connection driver: cannot detect host CPU model for aa>
Dec 23 03:19:26 ampere-hr330a-04.khw4.lab.eng.bos.redhat.com systemd[1]: Listening on Virtual machine log manager socket.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:1762