Bug 1090861

Summary: Fail to boot up a guest with some vcpu configuration
Product: Red Hat Enterprise Linux 6 Reporter: Qunfang Zhang <qzhang>
Component: kernel Assignee: Andrew Jones <drjones>
Status: CLOSED DUPLICATE QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: high Docs Contact:
Priority: high    
Version: 6.6 CC: acathrow, bsarathy, drjones, juzhang, michen, mkenneth, virt-maint, xfu
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-04-29 14:52:20 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Qunfang Zhang 2014-04-24 10:20:28 UTC
Description of problem:
A guest with 160 vCPUs on a host with 160 physical CPUs fails to boot when the guest CPU topology specifies cores or threads values greater than 1.

Version-Release number of selected component (if applicable):
kernel-2.6.32-460.el6.x86_64
qemu-kvm-0.12.1.2-2.424.el6.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Boot a guest with "-smp 160,sockets=2,cores=40,threads=2" or "-smp 160,sockets=4,cores=20,threads=2" on a host with 160 CPUs.

2. The guest fails to boot, and qemu-kvm prints:

kvm_create_vcpu: Invalid argument
Failed to create vCPU. Check the -smp parameter.

3. Booting with "-smp 160,sockets=160,cores=1,threads=1" does not reproduce the problem.

Actual results:
Guest failed to boot up.

Expected results:
Guest should boot up. 

Additional info:
(1) qemu-kvm-424 + kernel-431: not reproducible
(2) qemu-kvm-424 + kernel-460: reproducible

However, Andrew said this is unlikely to be a kernel issue and that qemu is more likely doing something odd, so I did not add the "regression" keyword and filed this against the qemu-kvm component first.

Comment 2 Qunfang Zhang 2014-04-24 10:23:42 UTC
/usr/libexec/qemu-kvm \
  -M rhel6.5.0 \
  -cpu Nehalem \
  -m 4G \
  -smp 160,sockets=2,cores=40,threads=2 \
  -enable-kvm \
  -name RHEL-Server-6.6-64 \
  -uuid cca1433d-5bac-490f-a097-c5c80c1a083f \
  -nodefconfig \
  -nodefaults \
  -k en-us \
  -qmp tcp:0:5000,server,nowait \
  -boot order=c,menu=off \
  -vga qxl \
  -global qxl-vga.vram_size=67108864 \
  -spice port=6000,disable-ticketing \
  -drive file=/home/rhel6.6-64-virtio.qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=native,rerror=stop,werror=stop \
  -device virtio-blk-pci,scsi=off,drive=drive-virtio-disk0,id=os-disk,bootindex=1 \
  -chardev socket,id=log,path=/tmp/seabios,server,nowait \
  -device isa-debugcon,iobase=0x402,chardev=log \
  -device virtio-serial-pci,id=virtio-serial0,max_ports=16 \
  -chardev socket,id=qemu-ga0,path=/tmp/qemu-ga,server,nowait \
  -device virtserialport,chardev=qemu-ga0,name=org.qemu.guest_agent.0,bus=virtio-serial0.0,id=port2 &

Comment 3 Andrew Jones 2014-04-29 14:49:21 UTC
It looks like I messed up with bug 1010882. KVM_CREATE_VCPU needs to support vcpu_ids up to 255, even if we only support up to 160 vcpus, because the vcpu_id should match the apic_id, and topologies with greater than one socket can have gaps in the apic_id space.
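
To illustrate the gaps, here is a minimal sketch of how an APIC ID can be derived from a vCPU index. It assumes the usual x86 packing, where the thread and core fields each occupy a power-of-two-sized bit field; whether this qemu-kvm build uses exactly this scheme is an assumption, and the helper names (bit_width, apic_id, max_apic_id) are hypothetical.

```python
def bit_width(count):
    """Number of bits needed to encode the values 0..count-1."""
    return (count - 1).bit_length()


def apic_id(index, cores, threads):
    """Pack (socket, core, thread) for a vCPU index into bit fields.

    Assumption: each field is rounded up to a power-of-two width, so
    core counts such as 40 or 20 leave unused IDs in every socket.
    """
    thread = index % threads
    core = (index // threads) % cores
    socket = index // (threads * cores)
    t_bits = bit_width(threads)
    c_bits = bit_width(cores)
    return (socket << (t_bits + c_bits)) | (core << t_bits) | thread


def max_apic_id(vcpus, cores, threads):
    """Highest APIC ID used by a guest with the given topology."""
    return max(apic_id(i, cores, threads) for i in range(vcpus))


# The three -smp topologies from the description, all with 160 vCPUs.
for sockets, cores, threads in [(2, 40, 2), (4, 20, 2), (160, 1, 1)]:
    top = max_apic_id(160, cores, threads)
    print(f"sockets={sockets},cores={cores},threads={threads}: max id {top}")
```

Under these assumptions the two failing topologies need IDs above 159 while staying below 256, and the sockets=160 topology does not, which is consistent with the results reported in the description.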

Moving back to kernel to post the oneliner fix.

Comment 4 Andrew Jones 2014-04-29 14:51:30 UTC
Ah, bug 1010882 is still ON_QA, so I'll just FailQA it and then post a fix under the original BZ.

Comment 5 Andrew Jones 2014-04-29 14:52:20 UTC

*** This bug has been marked as a duplicate of bug 1010882 ***