Bug 1265575 - Booting a guest with NUMA causes a warning at "arch/x86/kernel/smpboot.c" in the guest (run "abrt-cli list" to check)
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Eduardo Habkost
QA Contact: Virtualization Bugs
Depends On:
Blocks: 1266163
Reported: 2015-09-23 05:16 EDT by Yanan Fu
Modified: 2016-03-28 05:33 EDT (History)
CC List: 7 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Cloned To: 1266163
Environment:
Last Closed: 2015-09-23 18:39:36 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments

None
Description Yanan Fu 2015-09-23 05:16:00 EDT
Description of problem:
Boot a guest with the following NUMA configuration:
-cpu Opteron_G5,enforce \
-m 4G \
-smp 8,cores=8,threads=1,sockets=1 \
-object memory-backend-file,prealloc=yes,size=1024M,id=ram-node0,host-nodes=0,policy=bind,mem-path=/mnt/hugetlbfs \
-numa node,nodeid=0,cpus=0,cpus=1,memdev=ram-node0 \
-object memory-backend-file,prealloc=yes,size=1024M,id=ram-node1,host-nodes=1,policy=bind,mem-path=/mnt/hugetlbfs \
-numa node,nodeid=1,cpus=2,cpus=3,memdev=ram-node1 \
-object memory-backend-file,prealloc=yes,size=1024M,id=ram-node2,host-nodes=2,policy=bind,mem-path=/mnt/hugetlbfs \
-numa node,nodeid=2,cpus=4,cpus=5,memdev=ram-node2 \
-object memory-backend-file,prealloc=yes,size=1024M,id=ram-node3,host-nodes=3,policy=bind,mem-path=/mnt/hugetlbfs \
-numa node,nodeid=3,cpus=6,cpus=7,memdev=ram-node3 \

Log in to the guest as "root"; it then prints:

ABRT has detected 1 problem(s). For more info run: abrt-cli list --since 1442989197

# abrt-cli list
id 8ae0e3fceb82907c303ceb28a34a672c20bd8eb7
reason:         WARNING: at arch/x86/kernel/smpboot.c:283 topology_sane.isra.3+0x80/0x90()
time:           Wed 23 Sep 2015 02:24:26 PM CST
cmdline:        BOOT_IMAGE=/vmlinuz-3.10.0-316.el7.x86_64 root=/dev/mapper/rhel-root ro crashkernel=auto rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap rhgb quiet console=tty0 console=ttys0,115200 softlockup_panic=1
uid:            0
Directory:      /var/spool/abrt/oops-2015-09-23-14:24:26-894-0
Run 'abrt-cli report /var/spool/abrt/oops-2015-09-23-14:24:26-894-0' for creating a case in Red Hat Customer Portal
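
For reference, the full backtrace behind this ABRT entry can also be read from the guest's kernel log or from the problem directory listed above (a sketch; command availability assumed, exact output may differ by kernel build):

# dmesg | grep -B2 -A15 topology_sane
# abrt-cli info /var/spool/abrt/oops-2015-09-23-14:24:26-894-0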

Version-Release number of selected component (if applicable):
host:  kernel: 3.10.0-316.el7.x86_64
       qemu: qemu-kvm-rhev-2.3.0-24.el7.x86_64
guest: kernel: 3.10.0-316.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest with NUMA (command line above).
2. Log in to the guest as "root" (otherwise the message is not shown).
3. The guest prints "ABRT has detected 1 problem(s). For more info run: abrt-cli list --since 1442989197".

Actual results:
ABRT has detected 1 problem(s). For more info run: abrt-cli list --since 1442989197

Expected results:
No such warning.

Additional info:
Tests in other environments:
1. Change the host qemu version:
   host qemu: qemu-kvm-rhev-2.1.2-23.el7_1.9
   guest kernel: 3.10.0-316.el7.x86_64
      ----> same issue
2. Change the guest kernel version:
   host qemu: qemu-kvm-rhev-2.3.0-24.el7.x86_64
   guest kernel: 3.10.0-229.12.1.el7_1.1246514.x86_64
      ----> same issue
3. Boot the guest without NUMA: no warning.
   -cpu Opteron_G5,enforce \
   -m 4G \
   -smp 8,cores=8,threads=1,sockets=1 \
  
command line:
/usr/libexec/qemu-kvm -name rhel7 -machine pc,accel=kvm,usb=off -realtime mlock=off -cpu Opteron_G5,enforce -sandbox off -m 4G -smp 8,cores=8,threads=1,sockets=1 -object memory-backend-file,prealloc=yes,size=1024M,id=ram-node0,host-nodes=0,policy=bind,mem-path=/mnt/hugetlbfs -numa node,nodeid=0,cpus=0,cpus=1,memdev=ram-node0 -object memory-backend-file,prealloc=yes,size=1024M,id=ram-node1,host-nodes=1,policy=bind,mem-path=/mnt/hugetlbfs -numa node,nodeid=1,cpus=2,cpus=3,memdev=ram-node1 -object memory-backend-file,prealloc=yes,size=1024M,id=ram-node2,host-nodes=2,policy=bind,mem-path=/mnt/hugetlbfs -numa node,nodeid=2,cpus=4,cpus=5,memdev=ram-node2 -object memory-backend-file,prealloc=yes,size=1024M,id=ram-node3,host-nodes=3,policy=bind,mem-path=/mnt/hugetlbfs -numa node,nodeid=3,cpus=6,cpus=7,memdev=ram-node3 -boot order=c,menu=on,splash-time=3000,strict=on -device ich9-usb-ehci1,id=usb0,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb0.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb0.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb0.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-scsi-pci,id=scsi0,cmd_per_lun=234,bus=pci.0,addr=0x8 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x7 -drive file=/root/rhel-7.2.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,discard=unmap,werror=stop,rerror=stop,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x9,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -device usb-tablet,id=usb-tablet1 -monitor stdio -qmp tcp:0:4444,server,nowait -serial unix:/tmp/ttym,server,nowait -spice port=5901,addr=0.0.0.0,seamless-migration=on,disable-ticketing -k en-us -device qxl-vga,id=video0,ram_size=134217728,vram_size=67108864,vgamem_mb=16,bus=pci.0,addr=0x2 -device pci-bridge,bus=pci.0,id=bridge1,chassis_nr=1,addr=0xc -netdev tap,id=hostnet0,vhost=on,id=hostnet0,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=b6:af:42:b8:46:18,bus=bridge1,addr=0x14 -rtc base=utc,clock=host,driftfix=slew -device vfio-pci,host=47:04.0,id=vf-04-0
Comment 2 Eduardo Habkost 2015-09-23 18:39:36 EDT
This is just the guest kernel telling you that the NUMA topology you created does not make sense. Linux expects all cores and threads inside each socket to appear in the same NUMA node. In your configuration you have 1 8-core socket, but 4 NUMA nodes.
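
As a quick cross-check inside the guest (a sketch, assuming util-linux and numactl are installed), the mismatch is visible by comparing the socket count with the NUMA node count:

# lscpu | grep -E 'Socket|NUMA node'
# numactl --hardware

With the reported configuration this shows 1 socket but 4 NUMA nodes, which is exactly what topology_sane() in smpboot.c warns about.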
Comment 3 Eduardo Habkost 2015-09-23 18:52:32 EDT
This is the 5th time the same invalid configuration is reported as a bug. See:
* Bug 1101116
* Bug 1095203
* Bug 1066286
* Bug 756903

We need to update our test plans and documentation so we stop trying to boot VMs with NUMA configurations that don't make sense, and reporting invalid bugs when the guest (correctly) complains about it.
Comment 4 juzhang 2015-09-23 19:54:03 EDT
(In reply to Eduardo Habkost from comment #3)
> This is the 5th time the same invalid configuration is reported as a bug.
> See:
> * Bug 1101116
> * Bug 1095203
> * Bug 1066286
> * Bug 756903
> 
> We need to update our test plans and documentation so we stop trying to boot
> VMs with NUMA configurations that don't make sense, and reporting invalid
> bugs when the guest (correctly) complains about it.

Thanks Eduardo.

Hi Xiangchun,

Could you double-check our test case? Please update it if anything is wrong.

Best Regards,
Junyi
Comment 5 juzhang 2015-09-23 22:24:13 EDT
(In reply to Eduardo Habkost from comment #2)
> This is just the guest kernel telling you that the NUMA topology you created
> does not make sense. Linux expects all cores and threads inside each socket
> to appear in the same NUMA node. In your configuration you have 1 8-core
> socket, but 4 NUMA nodes.

Thanks Eduardo. QE can highlight in our test plan that all cores and threads inside a socket must be assigned to the same NUMA node and must not be split across nodes. But what about our customers? They might also assign cores from the same socket to different NUMA nodes; we know this is impossible on a physical host, but it can be done in a VM. We may need to state this clearly in the release notes. Or, is it possible to stop the VM from booting if cores and threads inside a socket are assigned to different NUMA nodes?

Best Regards,
Junyi
Comment 6 Yanan Fu 2015-09-23 22:52:49 EDT
(In reply to Eduardo Habkost from comment #2)
> This is just the guest kernel telling you that the NUMA topology you created
> does not make sense. Linux expects all cores and threads inside each socket
> to appear in the same NUMA node. In your configuration you have 1 8-core
> socket, but 4 NUMA nodes.

With the following command line, there is no warning:

-cpu Opteron_G5,enforce \
-m 4G \
-smp 8,cores=2,threads=1,sockets=4 \
-object memory-backend-file,prealloc=yes,size=1024M,id=ram-node0,host-nodes=0,policy=bind,mem-path=/mnt/hugetlbfs \
-numa node,nodeid=0,cpus=0,cpus=1,memdev=ram-node0 \
-object memory-backend-file,prealloc=yes,size=1024M,id=ram-node1,host-nodes=1,policy=bind,mem-path=/mnt/hugetlbfs \
-numa node,nodeid=1,cpus=2,cpus=3,memdev=ram-node1 \
-object memory-backend-file,prealloc=yes,size=1024M,id=ram-node2,host-nodes=2,policy=bind,mem-path=/mnt/hugetlbfs \
-numa node,nodeid=2,cpus=4,cpus=5,memdev=ram-node2 \
-object memory-backend-file,prealloc=yes,size=1024M,id=ram-node3,host-nodes=3,policy=bind,mem-path=/mnt/hugetlbfs \
-numa node,nodeid=3,cpus=6,cpus=7,memdev=ram-node3 \

i.e. four 2-core sockets, one per NUMA node.
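
As a quick verification (a sketch, assuming lscpu is available in the guest), the per-CPU mapping can be dumped to confirm that every socket now falls entirely inside one node:

# lscpu -e=CPU,SOCKET,NODE

With cores=2,sockets=4, vCPUs 0-1, 2-3, 4-5 and 6-7 each form one socket, matching the cpus= lists on the -numa options, so the smpboot.c warning no longer triggers.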
Comment 7 FuXiangChun 2015-09-24 03:30:06 EDT
To prevent testers from using a wrong NUMA topology on the qemu-kvm command line, I will highlight this tip in the NUMA test cases/test plan.
Comment 8 juzhang 2015-09-24 05:08:32 EDT
Hi Eduardo,

Could you please have a look at comment 5?

Best Regards,
Junyi
Comment 9 Eduardo Habkost 2015-09-24 11:56:04 EDT
(In reply to juzhang from comment #5)
> (In reply to Eduardo Habkost from comment #2)
> > This is just the guest kernel telling you that the NUMA topology you created
> > does not make sense. Linux expects all cores and threads inside each socket
> > to appear in the same NUMA node. In your configuration you have 1 8-core
> > socket, but 4 NUMA nodes.
> 
> Thanks Eduardo. QE can highlight in our test plan that all cores and
> threads inside a socket must be assigned to the same NUMA node and must not
> be split across nodes. But what about our customers? They might also assign
> cores from the same socket to different NUMA nodes; we know this is
> impossible on a physical host, but it can be done in a VM. We may need to
> state this clearly in the release notes.

We can document that, yes, but I believe the guest-side error message is very clear and will point the customer to the custom CPU+NUMA topology they have created:
    "sched: CPU #%d's %s-sibling CPU #%d is not on the same node! "
                "[node: %d != %d]. Ignoring dependency.\n",


> Or, is it possible to stop the VM from booting if cores and threads inside
> a socket are assigned to different NUMA nodes?

Preventing such a configuration is risky, because it may break existing VMs.

A Linux guest complaining about the NUMA topology is harmless, but a VM that suddenly cannot be migrated anywhere after an upgrade would cause real problems for customers.
Comment 10 juzhang 2015-09-24 21:30:42 EDT
Make sense and thanks for your explanation and creating doc bz.

Best Regards,
Junyi
