Bug 1200685 - RHEL6 64bit guest hangs during boot on 7.2 host when default VCPU->NUMA mapping is used
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm-rhev
Version: 7.2
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Assignee: Igor Mammedov
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-03-11 07:51 UTC by Yanhui Ma
Modified: 2015-12-04 16:31 UTC (History)
9 users

Fixed In Version: upstream 2.3
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-12-04 16:31:59 UTC
Target Upstream Version:


Attachments (Terms of Use)
dmesg (29.44 KB, text/plain)
2015-03-11 07:51 UTC, Yanhui Ma
no flags Details
host cpu info (35.53 KB, text/plain)
2015-03-11 07:57 UTC, Yanhui Ma
no flags Details
guest full console logs (49.26 KB, text/plain)
2015-03-12 02:36 UTC, Yanhui Ma
no flags Details
host full dmesg (85.10 KB, text/plain)
2015-03-12 02:38 UTC, Yanhui Ma
no flags Details
RHEL6.6 guest full console logs (56.38 KB, text/plain)
2015-03-13 03:47 UTC, Yanhui Ma
no flags Details
call trace info (15.40 KB, image/png)
2015-07-27 07:11 UTC, Yanhui Ma
no flags Details


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2015:2546 normal SHIPPED_LIVE qemu-kvm-rhev bug fix and enhancement update 2015-12-04 21:11:56 UTC

Description Yanhui Ma 2015-03-11 07:51:27 UTC
Created attachment 1000284 [details]
dmesg

Description of problem:
When I boot a RHEL6.7 64-bit guest on a 7.2 host and hotplug 17 CPUs in the guest (sometimes 25; to reproduce it you can increase the count), a call trace appears or the guest hangs, and there are only 4 CPUs in /proc/cpuinfo.

Version-Release number of selected component (if applicable):

qemu-kvm-rhev-2.2.0-5.el7.x86_64
3.10.0-230.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest with the following qemu command:
/usr/libexec/qemu-kvm -M pc-i440fx-rhel7.1.0 -m 4G \
-cpu Opteron_G3,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_time \
-smp 1,sockets=60,cores=4,threads=1,maxcpus=240 \
-monitor stdio -vga qxl -spice port=5900,disable-ticketing \
-drive file=/home/RHEL-Server-6.7-64-virtio.qcow2,if=none,id=drive-data-disk1,cache=none,format=qcow2,aio=threads,werror=stop,rerror=stop -device ide-drive,drive=drive-data-disk1,id=data-disk1 \
-netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device e1000,netdev=hostnet0,id=virtio-net-pci0,mac=00:24:21:7f:b6:11,bus=pci.0,addr=0x9 \
-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 \
-global PIIX4_PM.disable_s3=0 -global PIIX4_PM.disable_s4=0 \
-object memory-backend-ram,host-nodes=0,id=mem-0,policy=bind,prealloc=yes,size=2G -numa node,nodeid=0,memdev=mem-0 -object memory-backend-ram,host-nodes=0,id=mem-1,policy=bind,prealloc=yes,size=2G -numa node,nodeid=1,memdev=mem-1 \
-usb -device usb-tablet,id=input0 -qmp tcp:0:5559,server,nowait
2. Hotplug 17 vCPUs

Actual results:
A call trace appears, there are only 4 CPUs in /proc/cpuinfo, and the guest hangs.

Expected results:
No call trace; the guest works well and gets 17 CPUs.

Additional info:
The problem cannot be reproduced with qemu-kvm-rhev-2.1.2-23.el7 or qemu-kvm-1.5.3-86.el7.
There are 32 CPUs on the host. The host CPU info is attached.

Comment 1 Yanhui Ma 2015-03-11 07:57:12 UTC
Created attachment 1000288 [details]
host cpu info

Comment 3 FuXiangChun 2015-03-11 08:03:18 UTC
Since qemu-kvm-rhev-2.1.2-23.el7 did not hit this issue, I am setting this bug as a regression. If I am wrong, please remove Regression from the keywords.

Also, 7.1 works well.

Comment 4 FuXiangChun 2015-03-11 08:05:27 UTC
Sorry, a correction to comment 3: it should be 7.1 host / 7.1 guest.

Comment 6 Igor Mammedov 2015-03-11 12:57:23 UTC
FuXiangChun,

Please always attach full console/dmesg logs from the guest/host to the BZ.

Also, could you please reproduce the issue, leave it in the hung state, and provide access to the host where it happened?

Comment 7 Yanhui Ma 2015-03-12 02:36:57 UTC
Created attachment 1000749 [details]
guest full console logs

Comment 8 Yanhui Ma 2015-03-12 02:38:59 UTC
Created attachment 1000750 [details]
host full dmesg

Comment 10 Igor Mammedov 2015-03-12 16:28:39 UTC
Yanhui Ma,

Could you try to reproduce bug with RHEL6.6 guest kernel?

Comment 11 Yanhui Ma 2015-03-13 03:46:01 UTC
(In reply to Igor Mammedov from comment #10)
> Yanhui Ma,
> 
> Could you try to reproduce bug with RHEL6.6 guest kernel?

I have reproduced the bug with the RHEL6.6 guest kernel. The guest hang is not 100% reproducible, but the call trace and only 4 CPUs being successfully hotplugged can be reproduced 100% of the time. I have also attached the RHEL6.6 guest full console logs.
You can access the host to see it according to comment 9.
You can access the host to see it according to comment 9.

Comment 12 Yanhui Ma 2015-03-13 03:47:40 UTC
Created attachment 1001271 [details]
RHEL6.6 guest full console logs

Comment 13 Igor Mammedov 2015-03-17 16:36:47 UTC
So here goes my analysis:

 1. I wasn't able to reproduce the bug locally (probably due to lack of effort),
    but it's reliably reproducible on the host in comment 9.

 2. The issue has nothing to do with CPU hotplug; a regular boot is affected as well when the CPU count is 5 and there are 2 NUMA nodes. Here is a stripped-down reproducer:

 qemu-kvm -m 4G -smp 5,sockets=1,cores=4,threads=1,maxcpus=8 -numa node,nodeid=0 -numa node,nodeid=1 -drive file=/home/rhel66-64-virtio.qcow2,if=virtio

 3. CPU[0] hangs in smp_call_function_many() waiting on execution of a call on CPU[4], but CPU[4] loops in update_sd_lb_stats() due to incorrectly initialized
sched_groups: it spins on the condition "while (sg != sd->groups)" because the last 'sg->next', instead of pointing back to the head 'sd->groups', points to itself.

crash> bt -a
PID: 1      TASK: ffff88007f307500  CPU: 0   COMMAND: "swapper"
    [exception RIP: smp_call_function_many+482]
[...]
 #0 [ffff88007f309ce8] smp_call_function at ffffffff810c7c22
 #1 [ffff88007f309cf8] on_each_cpu at ffffffff8108e1fd
 #2 [ffff88007f309d28] do_tune_cpucache at ffffffff8118186b
 #3 [ffff88007f309d98] enable_cpucache at ffffffff81181d36
 #4 [ffff88007f309dc8] setup_cpu_cache at ffffffff8150fcc2
 #5 [ffff88007f309e08] kmem_cache_create at ffffffff81182232
 #6 [ffff88007f309eb8] shmem_init at ffffffff81c4f27e
 #7 [ffff88007f309ed8] kernel_init at ffffffff81c29ef6
 #8 [ffff88007f309f48] kernel_thread at ffffffff8100c1ca
[...]

PID: 27     TASK: ffff88013f28eae0  CPU: 4   COMMAND: "events/4"
[...]
 #0 [ffff88013f295a58] update_sd_lb_stats at ffffffff8106fc8d
 #1 [ffff88013f295b38] find_busiest_group at ffffffff8106fe7a
 #2 [ffff88013f295c18] load_balance_newidle at ffffffff81070a57
 #3 [ffff88013f295d08] idle_balance at ffffffff8107e38e
 #4 [ffff88013f295d88] schedule at ffffffff815304e0
 #5 [ffff88013f295e38] worker_thread at ffffffff810aa640
 #6 [ffff88013f295ee8] kthread at ffffffff810af4d0
 #7 [ffff88013f295f48] kernel_thread at ffffffff8100c1ca
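
The non-terminating traversal described in point 3 can be sketched as follows. This is an illustrative Python model, not kernel code: the names SchedGroup and walk_groups are hypothetical stand-ins for the kernel's circularly linked sched_group ring, whose "do { ... } while (sg != sd->groups)" loop only terminates if the last group's next pointer leads back to the head.

```python
# Illustrative model (assumption: simplified, not kernel source) of the
# circular sched_group traversal in update_sd_lb_stats().

class SchedGroup:
    def __init__(self, name):
        self.name = name
        self.next = None  # next group in the ring

def walk_groups(head, max_steps=100):
    """Walk the ring starting at head; return the visited group names,
    or None if the walk never returns to head (the kernel's unbounded
    loop would spin forever at that point)."""
    visited = []
    sg = head
    for _ in range(max_steps):
        visited.append(sg.name)
        sg = sg.next
        if sg is head:
            return visited
    return None  # corrupted ring: never got back to head

# Correct ring: a -> b -> a terminates normally.
a, b = SchedGroup("a"), SchedGroup("b")
a.next, b.next = b, a
assert walk_groups(a) == ["a", "b"]

# Corrupted ring as in point 3: the last group's next points to itself,
# so the traversal never reaches the head again (CPU[4]'s hang).
c, d = SchedGroup("c"), SchedGroup("d")
c.next, d.next = d, d  # d.next should be c, but points to d
assert walk_groups(c) is None
```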

 4. Issue is caused by commit:
     dd0247e0 pc: acpi: mark all possible CPUs as enabled in SRAT
    which makes the kernel actually use the QEMU-supplied NUMA mapping in SRAT for vCPUs; before that commit the guest kernel discarded the CPU-related SRAT info.

 5. The problem is that QEMU by default distributes vCPUs among NUMA nodes in round-robin order, which leads to an insane topology where vCPU threads from one socket (package) end up in different NUMA nodes.

Setting the CPU-to-node mapping manually with a sane topology (i.e. threads from the same socket are on the same node) makes the bug go away:
 -numa node,nodeid=0,cpus=0-3 -numa node,nodeid=1,cpus=4-7
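
The two mappings can be sketched as follows. This is an illustrative model, not QEMU source; the helper names round_robin_map and contiguous_map are assumptions for the sake of the example, applied to the reproducer's 8 vCPUs (2 sockets of 4 cores) and 2 NUMA nodes.

```python
# Sketch (assumption: simplified model, not QEMU code) of the two
# vCPU-to-node policies discussed above.

def round_robin_map(num_cpus, num_nodes):
    # QEMU's old default: vCPU i -> node (i % num_nodes)
    return {cpu: cpu % num_nodes for cpu in range(num_cpus)}

def contiguous_map(num_cpus, num_nodes, cores_per_socket):
    # Sane mapping: keep all threads of one socket on one node
    return {cpu: (cpu // cores_per_socket) % num_nodes
            for cpu in range(num_cpus)}

rr = round_robin_map(8, 2)
ok = contiguous_map(8, 2, 4)

# Round-robin splits socket 0 (vCPUs 0-3) across both nodes:
assert {rr[c] for c in range(4)} == {0, 1}
# The contiguous mapping keeps each socket on a single node, matching
# "-numa node,nodeid=0,cpus=0-3 -numa node,nodeid=1,cpus=4-7":
assert {ok[c] for c in range(4)} == {0}
assert {ok[c] for c in range(4, 8)} == {1}
```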

Comment 14 Igor Mammedov 2015-03-17 16:52:10 UTC
 Yanhui Ma,

1. Could you test if problem affects RHEL7, RHEL5 and Windows guests?

2. Also, after #1, try the following scratch build with the fix:
https://brewweb.devel.redhat.com/taskinfo?taskID=8861393

Comment 15 Igor Mammedov 2015-03-19 10:14:20 UTC
To reproduce on an Intel host, the following option has to be added to the reproducer from comment 13:
 -cpu Opteron_G3,vendor=AuthenticAMD

Comment 16 Igor Mammedov 2015-03-19 10:34:15 UTC
Upstream fix posted:
https://lists.gnu.org/archive/html/qemu-devel/2015-03/msg04008.html

Comment 18 Igor Mammedov 2015-03-19 13:04:08 UTC
Tested with different Windows versions; it's 90% reproducible with WS2012x64. I have also seen it with WS2012R2x64, but only once.

Windows fails to boot, going into a reboot cycle with a C4 error.

Comment 19 Yanhui Ma 2015-03-20 04:51:45 UTC
(In reply to Igor Mammedov from comment #14)
>  Yanhui Ma,
> 
> 1. Could you test if problem affects RHEL7, RHEL5 and Windows guests?
> 
Steps to Reproduce:
1. Boot a RHEL7.1/RHEL5.11/win-server-2008r2 guest with the following qemu command line:

/usr/libexec/qemu-kvm -M pc-i440fx-rhel7.1.0 -m 4G -cpu Opteron_G3,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_time -smp 5,sockets=1,cores=4,threads=1,maxcpus=8 -numa node,nodeid=0 -numa node,nodeid=1 \
-monitor stdio -vga qxl -spice port=5900,disable-ticketing \
-drive file=/home/rhel59-64-virtio.qcow2,if=none,id=drive-data-disk1,cache=none,format=qcow2,aio=threads,werror=stop,rerror=stop -device ide-drive,drive=drive-data-disk1,id=data-disk1 \
-netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device e1000,netdev=hostnet0,id=virtio-net-pci0,mac=00:24:21:7f:b6:11,bus=pci.0,addr=0x9 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -usb -device usb-tablet,id=input0


results:

RHEL5.11 and RHEL6.7 hit the issue; RHEL7.1 and win-server-2008r2 do not hit the issue.

> 2. Also after #1 try following scratch build with fix:
> https://brewweb.devel.redhat.com/taskinfo?taskID=8861393


Tested again with the above fixed build: RHEL5.11, RHEL6.7, RHEL7.1, and win-server-2008r2 do not hit the issue.

Comment 20 Igor Mammedov 2015-03-20 13:51:03 UTC
Fixed upstream in 2.3

fb43b73 pc: fix default VCPU to NUMA node mapping
57924bc numa: introduce machine callback for VCPU to node mapping

Please retest when qemu-kvm-rhev is rebased to 2.3 version.

Comment 23 Yanhui Ma 2015-07-27 07:09:45 UTC
Reproduced this issue.

host info:
qemu-kvm-rhev-2.2.0-8.el7.x86_64
3.10.0-230.el7.x86_64

Steps to Reproduce:
1. Boot a RHEL6.7/win2012r2x64 guest with the following qemu command line:

/usr/libexec/qemu-kvm -M pc-i440fx-rhel7.1.0 -m 4G -cpu Opteron_G3,hv_relaxed,hv_vapic,hv_spinlocks=0x1fff,hv_time -smp 5,sockets=1,cores=4,threads=1,maxcpus=8 -numa node,nodeid=0 -numa node,nodeid=1 \
-monitor stdio -vga qxl -spice port=5900,disable-ticketing \
-drive file=/home/RHEL-Server-6.7-64-virtio.qcow2,if=none,id=drive-data-disk1,cache=none,format=qcow2,aio=threads,werror=stop,rerror=stop -device ide-drive,drive=drive-data-disk1,id=data-disk1 \
-netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device e1000,netdev=hostnet0,id=virtio-net-pci0,mac=00:24:21:7f:b6:11,bus=pci.0,addr=0x9 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -usb -device usb-tablet,id=input0

results:
The RHEL6.7 guest hits a call trace during boot (see attachment); the win2012r2x64 guest does not hit the issue.

#########################################################

Verified this issue.
host info:
3.10.0-297.el7.x86_64
qemu-kvm-rhev-2.3.0-12.el7.x86_64

Steps are the same as above.

Result:
qemu prints out the following info:
cpu topology: error: sockets (1) * cores (4) * threads (1) < smp_cpus (5)

Hi Igor,
Is this the expected result? I remember the guest could boot up successfully with the scratch build in comment 14. If the qemu error is the expected result, then the bug has been fixed.

Comment 24 Yanhui Ma 2015-07-27 07:11:00 UTC
Created attachment 1056469 [details]
call trace info

Comment 25 Igor Mammedov 2015-07-27 12:16:48 UTC
(In reply to Yanhui Ma from comment #23)
> Reproduce this issue.
[...]
> Result:
> qemu prints out following info:
> cpu topology: error: sockets (1) * cores (4) * threads (1) < smp_cpus (5)
> 
> hi, Igor
> Is it the expected result? I remember guest can boot up successfully with
> scratch build in comment 14. If the qemu error is the expected result, then
> the bug has been fixed.

The bug was fixed upstream by commit
 fb43b73b "pc: fix default VCPU to NUMA node mapping"
and the error message was introduced upstream by
 ec2cbbdd8 "vl: Don't silently change topology when all -smp options were set"
after the 2.2 release, which the scratch build is based on.

So yes, I'd say exiting with an error is the expected result.
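
The check behind that error message can be sketched as follows. This is a simplified model, not QEMU source: the helper name check_smp is an assumption, and it mirrors only the consistency rule that sockets * cores * threads must be able to hold smp_cpus when all -smp sub-options are given.

```python
# Sketch (assumption: simplified model of the upstream ec2cbbdd8 check,
# not QEMU code) of the -smp topology validation.

def check_smp(smp_cpus, sockets, cores, threads):
    """Return the error string qemu would print, or None if the
    topology is consistent with smp_cpus."""
    if sockets * cores * threads < smp_cpus:
        return ("cpu topology: error: sockets (%d) * cores (%d) * "
                "threads (%d) < smp_cpus (%d)"
                % (sockets, cores, threads, smp_cpus))
    return None

# The reproducer's topology, -smp 5,sockets=1,cores=4,threads=1,
# can hold only 4 CPUs, so qemu now rejects it instead of silently
# adjusting the topology:
assert check_smp(5, 1, 4, 1) == ("cpu topology: error: sockets (1) * "
                                 "cores (4) * threads (1) < smp_cpus (5)")
# A consistent topology passes:
assert check_smp(4, 1, 4, 1) is None
```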

Comment 26 Yanhui Ma 2015-07-28 02:02:47 UTC
(In reply to Igor Mammedov from comment #25)
> (In reply to Yanhui Ma from comment #23)
> > Reproduce this issue.
> [...]
> > Result:
> > qemu prints out following info:
> > cpu topology: error: sockets (1) * cores (4) * threads (1) < smp_cpus (5)
> > 
> > hi, Igor
> > Is it the expected result? I remember guest can boot up successfully with
> > scratch build in comment 14. If the qemu error is the expected result, then
> > the bug has been fixed.
> 
> bug was fixed upstream by commit
>  fb43b73b "pc: fix default VCPU to NUMA node mapping"
> and error message was introduced upstream by 
>  ec2cbbdd8 "vl: Don't silently change topology when all -smp options were
> set"
> after 2.2 release which scratch build is based on.
> 
> So yes, I'd say exit with error is expected result.

Thanks. If so, I think the bug is fixed.

Comment 27 juzhang 2015-08-03 04:12:47 UTC
According to comments 23-26, setting this issue to VERIFIED.

Comment 29 errata-xmlrpc 2015-12-04 16:31:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2546.html

