Bug 1072373

Summary: Guest OS takes longer to boot as the number of vCPUs in the guest increases on a specific machine.
Product: Red Hat Enterprise Linux 6
Reporter: Andrew Jones <drjones>
Component: kernel
Assignee: Andrew Jones <drjones>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: urgent
Version: 6.5
CC: atheurer, bsarathy, chayang, chegu_vinod, dhoward, drjones, dyuan, honzhang, jiahu, juzhang, knoel, linuxdev-kernel-it, lwang, michen, mkenneth, msvoboda, mtosatti, mzhan, qzhang, rbalakri, scrandall, skito, virt-maint, xfu
Target Milestone: rc
Keywords: ZStream
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: kernel-2.6.32-453.el6
Doc Type: Bug Fix
Doc Text:
A previous change that introduced global clock updates caused guest machines to boot slowly when the host Time Stamp Counter (TSC) was marked as unstable. The slowdown increased with the number of vCPUs allocated. To resolve this problem, a patch has been applied to limit the rate of the global clock updates.
Story Points: ---
Clone Of: 1036457
Environment:
Last Closed: 2014-10-14 05:58:58 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1036457    
Bug Blocks: 994246, 1011600, 1078929, 1090750, 1102253    

Comment 1 Andrew Jones 2014-03-25 12:30:51 UTC
Bug had wrong component set. PM, QE please re-ack.

Thanks,
drew

Comment 3 RHEL Program Management 2014-03-26 00:11:52 UTC
This request was evaluated by Red Hat Product Management for
inclusion in a Red Hat Enterprise Linux release.  Product
Management has requested further review of this request by
Red Hat Engineering, for potential inclusion in a Red Hat
Enterprise Linux release for currently deployed products.
This request is not yet committed for inclusion in a release.

Comment 4 Rafael Aquini 2014-04-01 02:01:26 UTC
Patch(es) available on kernel-2.6.32-453.el6

Comment 6 Andrew Jones 2014-04-15 08:34:17 UTC
*** Bug 1087715 has been marked as a duplicate of this bug. ***

Comment 8 Qunfang Zhang 2014-04-24 09:43:05 UTC
This bug was reproduced on kernel-2.6.32-431.el6.x86_64 and verified as fixed on kernel-2.6.32-460.el6.x86_64.

Host hardware:
processor       : 159
vendor_id       : GenuineIntel
cpu family      : 6
model           : 47
model name      : Intel(R) Xeon(R) CPU E7- 8850  @ 2.00GHz
stepping        : 2
cpu MHz         : 1064.000
cache size      : 24576 KB
physical id     : 3
siblings        : 20
core id         : 25
cpu cores       : 10
apicid          : 243
initial apicid  : 243
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt aes lahf_lm ida arat epb dts tpr_shadow vnmi flexpriority ept vpid
bogomips        : 3989.93
clflush size    : 64
cache_alignment : 64
address sizes   : 44 bits physical, 48 bits virtual
power management:


Steps: 

Boot a guest with 80 vCPUs, then repeat with 160 vCPUs.
# sh boot.sh

# cat boot.sh 
#!/bin/sh

export SMP_PARA=80,sockets=2,cores=20,threads=2
sh cli
echo -e "start boot at:\t`date '+%D %T'`"
sleep 1

echo '{"execute": "guest-ping"}' | nc -U /tmp/qemu-ga
echo -e "finish boot at:\t`date '+%D %T'`"
----
# cat cli 
/usr/libexec/qemu-kvm \
  -M pc-i440fx-rhel7.0.0 \
  -cpu Nehalem \
  -m 4G \
  -smp ${SMP_PARA} \
  -enable-kvm \
  -name RHEL-Server-7.0-64 \
  -uuid cca1433d-5bac-490f-a097-c5c80c1a083f \
  -nodefconfig \
  -nodefaults \
  -k en-us \
  -qmp tcp:0:5000,server,nowait \
  -boot order=c,menu=off \
  -vga qxl \
  -global qxl-vga.vram_size=67108864 \
  -spice port=6000,disable-ticketing \
  -drive file=/home/xuhan/RHEL-Server-7.0-64.qcow2,if=none,id=drive-virtio-disk0,cache=none,aio=native,rerror=stop,werror=stop \
  -device virtio-blk-pci,scsi=off,drive=drive-virtio-disk0,id=os-disk,bootindex=1 \
  -chardev socket,id=log,path=/tmp/seabios,server,nowait \
  -device isa-debugcon,iobase=0x402,chardev=log \
  -device virtio-serial-pci,id=virtio-serial0,max_ports=16 \
  -chardev socket,id=qemu-ga0,path=/tmp/qemu-ga,server,nowait \
  -device virtserialport,chardev=qemu-ga0,name=org.qemu.guest_agent.0,bus=virtio-serial0.0,id=port2 &
----
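
A possible variant of boot.sh that prints the elapsed time directly, instead of two timestamps, is sketched below. The helper name time_cmd is illustrative and not part of the original test. It assumes the timed command blocks until the guest agent answers, which is how the nc invocation above is used.

```shell
#!/bin/sh
# time_cmd CMD [ARGS...]: run CMD and print the whole seconds it took.
# Hypothetical helper, not part of the original boot.sh.
time_cmd() {
    start=$(date +%s)          # seconds since the epoch before the command
    "$@" >/dev/null 2>&1       # run the command, discarding its output
    end=$(date +%s)            # seconds since the epoch after the command
    echo $((end - start))
}

# Stand-in command for demonstration; a real run would time the guest-ping.
time_cmd sleep 1
```

In the test above, the timed command would be the guest-ping pipeline started right after `sh cli`, for example `time_cmd sh -c "echo '{\"execute\": \"guest-ping\"}' | nc -U /tmp/qemu-ga"`.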


Result:

On the old kernel-2.6.32-431.el6.x86_64:

-smp 80:

start boot at:	04/24/14 16:46:40
finish boot at:	04/24/14 16:48:56  ===> More than 2 mins.

-smp 160:

start boot at:	04/24/14 16:50:53
finish boot at:	04/24/14 17:01:47  ===>  More than 10 mins.


On the fixed kernel-2.6.32-460.el6.x86_64:

-smp 80:

start boot at:	04/24/14 17:16:10
finish boot at:	04/24/14 17:16:40  ===>  30 seconds. 

-smp 160:

start boot at:	04/24/14 17:33:49
finish boot at:	04/24/14 17:34:29  ===> 40 seconds. 

===============

Based on the above, the issue is fixed.
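
The boot durations can be recomputed from the start and finish timestamps in the results. The sketch below assumes GNU date (for the -d option); the function name elapsed is illustrative.

```shell
#!/bin/sh
# elapsed START END: print the seconds between two "MM/DD/YY HH:MM:SS"
# timestamps. Assumes GNU date; -u avoids any daylight-saving offsets.
elapsed() {
    s=$(date -u -d "$1" +%s)
    e=$(date -u -d "$2" +%s)
    echo $((e - s))
}

elapsed "04/24/14 16:46:40" "04/24/14 16:48:56"   # old kernel, -smp 80   -> 136
elapsed "04/24/14 16:50:53" "04/24/14 17:01:47"   # old kernel, -smp 160  -> 654
elapsed "04/24/14 17:16:10" "04/24/14 17:16:40"   # fixed kernel, -smp 80  -> 30
elapsed "04/24/14 17:33:49" "04/24/14 17:34:29"   # fixed kernel, -smp 160 -> 40
```

By these numbers the fixed kernel boots roughly 4.5 times faster with 80 vCPUs (136 s vs 30 s) and roughly 16 times faster with 160 vCPUs (654 s vs 40 s).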

Comment 10 Qunfang Zhang 2014-05-21 08:44:25 UTC
Hi, Andrew, 

Besides the test scenario in comment 8, do we need to arrange any additional tests to verify this bug?

Thanks,
Qunfang

Comment 11 Andrew Jones 2014-05-21 09:44:06 UTC
(In reply to Qunfang Zhang from comment #10)
> Hi, Andrew, 
> 
> Besides the test scenario in comment 8, do we need to arrange any
> additional tests to verify this bug?
> 
> Thanks,
> Qunfang

comment 8 looks good to me. I consider it verified.

thanks,
drew

Comment 12 Qunfang Zhang 2014-05-22 03:13:56 UTC
(In reply to Andrew Jones from comment #11)

Thanks for the feedback.

Comment 14 Qunfang Zhang 2014-06-25 09:07:03 UTC
Setting to VERIFIED according to comment 8 and comment 11.

Comment 15 errata-xmlrpc 2014-10-14 05:58:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2014-1392.html