Bug 710319

Summary: RHEL 4.9 guest gets many "lost ticks" warnings and becomes very slow
Product: Red Hat Enterprise Linux 5 Reporter: Joy Pu <ypu>
Component: kvm Assignee: Virtualization Maintenance <virt-maint>
Status: CLOSED WONTFIX QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium Docs Contact:
Priority: medium    
Version: 5.7 CC: juzhang, mkenneth, tburke, virt-maint
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Last Closed: 2011-07-17 12:57:58 UTC Type: ---
Bug Blocks: 580948    

Description Joy Pu 2011-06-03 03:35:07 UTC
Description:
Boot an SMP RHEL 4.9 guest and dd a large file inside it. This can cause the guest to report "many lost ticks" warnings and become very slow:
2011-05-27 13:16:06: warning: many lost ticks.
2011-05-27 13:16:06: Your time source seems to be instable or some driver is hogging interupts
2011-05-27 13:16:06: rip __do_softirq+0x4d/0xd0
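
To tell how often the warning above fires during a run, the captured serial output can be searched for it. This is only a sketch: SERIAL_LOG is a hypothetical variable naming a file where the serial output was saved, and count_lost_ticks is an illustrative helper, not part of the test suite.

```shell
#!/bin/sh
# Count "many lost ticks" warnings in a saved copy of the guest's serial output.
count_lost_ticks() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps the helper usable under "set -e".
    grep -c 'warning: many lost ticks' "$1" || true
}

# SERIAL_LOG is an assumption: set it to wherever the serial output was captured.
if [ -n "${SERIAL_LOG:-}" ]; then
    count_lost_ticks "$SERIAL_LOG"
fi
```

A count that keeps growing while dd is running would match the behaviour described here.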

We hit this problem several times in the file_transfer and rebase tests, at the step that writes files with dd. The guest should be using the PIT, and the qemu command line includes -no-kvm-pit-reinjection.

Version-Release number of selected component (if applicable):
kernel: 
2.6.18-262.el5
kvm: 
# rpm -qa |grep kvm
etherboot-roms-kvm-5.4.4-13.el5
kvm-qemu-img-83-235.el5
etherboot-zroms-kvm-5.4.4-13.el5
kmod-kvm-83-235.el5
kvm-debuginfo-83-235.el5
kvm-83-235.el5
kvm-tools-83-235.el5
kmod-kvm-debug-83-235.el5

qemu:
# rpm -qa |grep qemu
kvm-qemu-img-83-235.el5
gpxe-roms-qemu-0.9.7-8.el5


How reproducible:
1/3

Steps to Reproduce:
1. Boot up a RHEL 4.9 guest.
#/usr/local/staf/test/RHEV/kvm-new/autotest/client/tests/kvm/qemu -name 'vm1' -monitor unix:'/tmp/monitor-humanmonitor1-20110528-105750-6LJ6',server,nowait -serial unix:'/tmp/serial-20110528-105750-6LJ6',server,nowait -drive file='/usr/local/staf/test/RHEV/kvm-new/autotest/client/tests/kvm/sn1.qcow2',index=0,if=virtio,media=disk,cache=none,boot=on,format=qcow2 -net nic,vlan=0,model=virtio,macaddr='9a:02:fa:ce:0f:59' -net tap,vlan=0,ifname='t0-105750-6LJ6',script='/usr/local/staf/test/RHEV/kvm-new/autotest/client/tests/kvm/scripts/qemu-ifup-switch',downscript='no' -m 4096 -smp 2,cores=1,threads=1,sockets=2 -cpu qemu64,+sse2 -soundhw ac97 -spice port=8000,disable-ticketing -qxl 1 -rtc-td-hack -M rhel5.6.0 -boot c  -usbdevice tablet -no-kvm-pit-reinjection
2. dd a big file inside the guest to keep the system busy with I/O.
3. Check the serial port output; you will see the warning.
# nc -U /tmp/serial-20110528-105750-6LJ6
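
The dd step is not spelled out in this report, so the following is only a sketch of one way to generate the I/O load inside the guest. The target path and the 4096 MiB size are assumptions, and io_load is a hypothetical helper name.

```shell
#!/bin/sh
# Write a large file with dd to keep the guest busy on disk I/O.
# Both defaults are assumptions: the report gives neither a path nor a size.
io_load() {
    out=${1:-/tmp/dd_load.img}   # hypothetical target file
    mb=${2:-4096}                # size in MiB (assumed value)
    # Write $mb blocks of 1 MiB of zeros, then flush dirty pages to disk.
    dd if=/dev/zero of="$out" bs=1M count="$mb" && sync
}
```

Running `io_load /tmp/dd_load.img 4096` in the guest while watching the serial socket from the host should show whether the warning appears.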

Actual results:
Guest becomes very slow and cannot finish the tests.
Expected results:
Guest should not be affected and should keep running normally.

Additional info:
1.  cpu info:
processor	: 1
vendor_id	: AuthenticAMD
cpu family	: 15
model		: 107
model name	: AMD Athlon(tm) Dual Core Processor 5400B
stepping	: 2
cpu MHz		: 1000.000
cache size	: 512 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy misalignsse
bogomips	: 2004.17
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc 100mhzsteps

Comment 1 Dor Laor 2011-06-27 06:46:13 UTC
Can you test that with raw format instead of qcow2?
qcow2 of a new snapshot might block and will be the cause for the lost ticks.

Comment 2 Joy Pu 2011-06-30 01:58:19 UTC
(In reply to comment #1)
> Can you test that with raw format instead of qcow2?
> qcow2 of a new snapshot might block and will be the cause for the lost ticks.

I submitted a job for this case using raw format. It has already run 100 times without hitting this problem.
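
For anyone repeating the raw-format comparison, a rough sketch follows. How the image was actually prepared for the job above is not stated, so this assumes the existing qcow2 image is converted with qemu-img while the guest is shut down. to_raw_image and raw_drive_opt are hypothetical helper names.

```shell
#!/bin/sh
# Convert a qcow2 image to raw; the output sits next to the source file.
# Assumes qemu-img is on PATH and the image is not in use by a running guest.
to_raw_image() {
    # -f names the source format, -O the output format.
    qemu-img convert -f qcow2 -O raw "$1" "${1%.qcow2}.raw"
}

# Rewrite a -drive option string so it points at the raw image.
raw_drive_opt() {
    printf '%s\n' "$1" | sed -e 's/\.qcow2/.raw/' -e 's/format=qcow2/format=raw/'
}
```

For example, `raw_drive_opt "file='sn1.qcow2',index=0,if=virtio,format=qcow2"` yields the same option string with `sn1.raw` and `format=raw`, which can then replace the -drive argument from step 1.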