Bug 1806857

Summary: "An error occurred, but the cause is unknown" raised when starting a VM with a nonexistent NUMA node in <numatune>
Product: Red Hat Enterprise Linux 8
Reporter: jiyan <jiyan>
Component: libvirt
Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA
QA Contact: Jing Qi <jinqi>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 8.2
CC: dyuan, jdenemar, jsuchane, lcong, lmen, mprivozn, pkrempa, virt-maint, xuzhang, yalzhang
Target Milestone: rc
Keywords: Reopened, Triaged, Upstream
Target Release: 8.0
Flags: pm-rhel: mirror+
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: libvirt-7.8.0-1.module+el8.6.0+12978+7d7a0321
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2001390
Environment:
Last Closed: 2022-05-10 13:18:34 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version: 6.8.0
Embargoed:
Bug Depends On: 1724866
Bug Blocks: 2001390

Description jiyan 2020-02-25 07:58:36 UTC
Description of problem:
"An error occurred, but the cause is unknown" is raised when starting a VM with a nonexistent NUMA node in <numatune>.

Version-Release number of selected component (if applicable):
kernel-4.18.0-167.el8.x86_64
qemu-kvm-2.12.0-99.module+el8.2.0+5827+8c39933c.x86_64
libvirt-4.5.0-40.module+el8.2.0+5761+d16d25e7.x86_64

How reproducible:
100%

Steps to Reproduce:
# virsh domstate avocado-vt-vm1
shut off

# virsh numatune avocado-vt-vm1
numa_mode      : strict
numa_nodeset   : 0,41

# virsh start avocado-vt-vm1
error: Failed to start domain avocado-vt-vm1
error: An error occurred, but the cause is unknown

# numactl --hard
available: 4 nodes (0-3)
node 0 cpus: 0 4 8 12 16 20 24 28 32 36 40 44 48 52 56 60
node 0 size: 31714 MB
node 0 free: 29560 MB
node 1 cpus: 1 5 9 13 17 21 25 29 33 37 41 45 49 53 57 61
node 1 size: 32196 MB
node 1 free: 31555 MB
node 2 cpus: 2 6 10 14 18 22 26 30 34 38 42 46 50 54 58 62
node 2 size: 0 MB
node 2 free: 0 MB
node 3 cpus: 3 7 11 15 19 23 27 31 35 39 43 47 51 55 59 63
node 3 size: 0 MB
node 3 free: 0 MB
node distances:
node   0   1   2   3 
  0:  10  16  16  16 
  1:  16  10  16  16 
  2:  16  16  10  16 
  3:  16  16  16  10 

Actual results:
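The error message is uninformative: it does not say that NUMA node 41 in the nodeset does not exist on the host (which has nodes 0-3).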

Expected results:
libvirt should report a meaningful error for this scenario, for example one that names the unavailable NUMA node.

Additional info:
For RHEL-820fast, the error message is already meaningful:
# rpm -qa libvirt 
libvirt-6.0.0-6.module+el8.2.0+5821+109ee33c.x86_64

# virsh start test82-1 
error: Failed to start domain test82-1
error: unsupported configuration: NUMA node 40 is unavailable
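
The check behind the expected behaviour can be sketched as follows. This is only an illustration in Python, not libvirt's code: the function names parse_nodeset and check_nodeset are hypothetical, the parser handles only comma-separated numbers and ranges (not the "^" exclusion syntax), and the message text simply mirrors the libvirt 6.0.0 output shown above.

```python
from typing import List


def parse_nodeset(spec: str) -> List[int]:
    """Expand a nodeset string such as "0,41" or "0-3" into sorted node IDs.

    Hypothetical helper: only plain numbers and "a-b" ranges are handled.
    """
    nodes = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            nodes.update(range(int(low), int(high) + 1))
        else:
            nodes.add(int(part))
    return sorted(nodes)


def check_nodeset(requested: str, available: str) -> None:
    """Raise ValueError naming the first requested node the host lacks."""
    host_nodes = set(parse_nodeset(available))
    for node in parse_nodeset(requested):
        if node not in host_nodes:
            # Message modelled on the libvirt 6.0.0 output quoted in this bug.
            raise ValueError(f"NUMA node {node} is unavailable")


if __name__ == "__main__":
    try:
        # Nodeset from the reproducer against the 4-node host (0-3).
        check_nodeset("0,41", "0-3")
    except ValueError as err:
        print(f"error: {err}")
```

With the reproducer's values, the sketch rejects the configuration because node 41 is outside the host's range 0-3, which is the kind of message the reporter asks for.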

Comment 2 Michal Privoznik 2021-02-15 12:55:19 UTC
Merged upstream as:

9e0d4b9240 virnuma: Report error when NUMA -> CPUs translation fails

v6.7.0-86-g9e0d4b9240
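
The commit title refers to translating NUMA nodes into CPU lists and reporting an error when that translation fails. A rough sketch of the idea in Python is given here; it is not libvirt's C implementation, the function names parse_node_cpus and nodes_to_cpus are hypothetical, and the node-to-CPU map is taken from numactl-style text purely for illustration. The error wording follows the "NUMA node 8 is not available" message quoted later in this bug.

```python
import re
from typing import Dict, Iterable, List


def parse_node_cpus(numactl_output: str) -> Dict[int, List[int]]:
    """Build {node: [cpus]} from "node N cpus: ..." lines of numactl output."""
    mapping: Dict[int, List[int]] = {}
    for line in numactl_output.splitlines():
        match = re.match(r"node\s+(\d+)\s+cpus:(.*)", line.strip())
        if match:
            mapping[int(match.group(1))] = [int(c) for c in match.group(2).split()]
    return mapping


def nodes_to_cpus(nodes: Iterable[int], node_cpus: Dict[int, List[int]]) -> List[int]:
    """Translate NUMA nodes to CPUs, failing loudly on an unknown node.

    The point of the sketch: a failed lookup produces an explicit error
    instead of a silent failure with no message set.
    """
    cpus: List[int] = []
    for node in nodes:
        if node not in node_cpus:
            raise LookupError(f"NUMA node {node} is not available")
        cpus.extend(node_cpus[node])
    return sorted(cpus)


if __name__ == "__main__":
    sample = "node 0 cpus: 0 1 16 17\nnode 1 cpus: 2 3 18 19\n"
    table = parse_node_cpus(sample)
    print(nodes_to_cpus([0, 1], table))
    try:
        nodes_to_cpus([0, 8], table)
    except LookupError as err:
        print(f"error: {err}")
```

The design choice illustrated is simply that the lookup function itself sets the error, so callers never have to report a failure whose cause is unknown.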

Comment 3 Jiri Denemark 2021-03-05 19:16:09 UTC
Oops, this is for RHEL. This was addressed in bug 1724866 for RHEL-AV 8.3.0.

Comment 5 RHEL Program Management 2021-08-25 07:27:06 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Comment 6 Michal Privoznik 2021-08-25 08:01:48 UTC
Le sigh. This bug is fixed and we're just waiting for RHEL to pick up the rebased version. There's no reason to close this bug!

Comment 8 Jing Qi 2021-10-18 02:14:25 UTC
Tested with libvirt-daemon-7.8.0-1.module+el8.6.0+12982+5e169f40.x86_64 &
qemu-kvm-6.1.0-3.module+el8.6.0+12982+5e169f40.x86_64

On a machine with 8 NUMA nodes:
available: 8 nodes (0-7)
node 0 cpus: 0 1 16 17
node 0 size: 15731 MB
node 0 free: 13377 MB
node 1 cpus: 2 3 18 19
node 1 size: 0 MB
node 1 free: 0 MB
node 2 cpus: 4 5 20 21
node 2 size: 0 MB
node 2 free: 0 MB
node 3 cpus: 6 7 22 23
node 3 size: 0 MB
node 3 free: 0 MB
node 4 cpus: 8 9 24 25
node 4 size: 16062 MB
node 4 free: 14859 MB
node 5 cpus: 10 11 26 27
node 5 size: 0 MB
node 5 free: 0 MB
node 6 cpus: 12 13 28 29
node 6 size: 0 MB
node 6 free: 0 MB
node 7 cpus: 14 15 30 31
node 7 size: 0 MB
node 7 free: 0 MB
node distances:
node   0   1   2   3   4   5   6   7 
  0:  10  16  16  16  32  32  32  32 
  1:  16  10  16  16  32  32  32  32 
  2:  16  16  10  16  32  32  32  32 
  3:  16  16  16  10  32  32  32  32 
  4:  32  32  32  32  10  16  16  16 
  5:  32  32  32  32  16  10  16  16 
  6:  32  32  32  32  16  16  10  16 
  7:  32  32  32  32  16  16  16  10 


1. Set the numatune nodeset to the nonexistent host NUMA node 8 (the first argument, 0, selects strict mode)
# virsh numatune avocado-vt-vm1 0 8

2. Try to start the VM; it fails with the following error message:
# virsh start avocado-vt-vm1
error: Failed to start domain 'avocado-vt-vm1'
error: operation failed: NUMA node 8 is not available

Comment 11 liang cong 2022-03-01 05:46:19 UTC
Tested on:
# rpm -q libvirt qemu-kvm
libvirt-8.0.0-5.module+el8.6.0+14344+04da0821.x86_64
qemu-kvm-6.2.0-8.module+el8.6.0+14324+050a5215.x86_64

On a machine with 1 NUMA node:
# numactl --hard
available: 1 nodes (0)
node 0 cpus: 0 1
node 0 size: 3732 MB
node 0 free: 912 MB
node distances:
node   0 
  0:  10 

Define a VM with a nonexistent NUMA node in <numatune>:
# virsh numatune qcow2_test
numa_mode      : strict
numa_nodeset   : 0,41


Start the VM; the error message changes to:
# virsh start qcow2_test
error: Failed to start domain 'qcow2_test'
error: Invalid value '0,41' for 'cpuset.mems': Invalid argument

Comment 13 errata-xmlrpc 2022-05-10 13:18:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1759