Bug 823374 - RHEL6.3 guest got time drift after running 3 days on a stressed Intel host
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.3
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Gleb Natapov
QA Contact: Virtualization Bugs
Depends On:
Blocks:
Reported: 2012-05-21 01:00 EDT by Qunfang Zhang
Modified: 2013-12-08 19:57 EST
CC List: 12 users

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2012-05-28 23:07:55 EDT
Type: Bug
Attachments
time drift log of rhel6.3 guest (1.98 MB, text/plain)
2012-05-21 01:02 EDT, Qunfang Zhang
Description Qunfang Zhang 2012-05-21 01:00:52 EDT
Description of problem:
Similar to bug 823372, this issue also occurs on a RHEL 6.3 guest using kvm-clock. I am filing a separate bug for the RHEL guest; if the component is wrong, please correct it.

Note:
This issue occurs only on my Intel host; I have not hit it on the AMD host.

I booted multiple VMs on both the Intel and AMD hosts (overcommitted to 1.5x physical memory and 3x physical CPUs in total) and let them run over a weekend. When I checked them on Monday morning, the RHEL6.3-64 guest running on the Intel host showed a time drift of -15 seconds. No application was running inside the guests except a script that checks the system time in a loop every 15 seconds.


Here is the detailed information for the hosts and guests:
Intel:
Host: 12G mem and 16 CPUs
       For CPU information, see 'Additional info' below.
Guests: 20 VMs running (including win2k8r2 guests and rhel6.3-64 guests)
       16 VMs have 1G v-mem and 2 vCPUs
       4 VMs have 512M v-mem and 4 vCPUs
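
As a sanity check on the overcommit figures quoted above, the guest totals can be compared with the host's resources. This is only a minimal sketch: the function name `overcommit_ratio` is illustrative, and it assumes that "12G" means 12 * 1024 MB.

```python
# Host resources as listed above (assumption: "12G" means 12 * 1024 MB).
HOST_MEM_MB = 12 * 1024
HOST_CPUS = 16

# Guest groups as (number of VMs, memory per VM in MB, vCPUs per VM).
GUEST_GROUPS = [
    (16, 1024, 2),
    (4, 512, 4),
]


def overcommit_ratio(groups, host_mem_mb, host_cpus):
    """Return (memory ratio, CPU ratio) of guest totals to host resources."""
    total_mem = sum(count * mem for count, mem, _ in groups)
    total_vcpus = sum(count * vcpus for count, _, vcpus in groups)
    return total_mem / host_mem_mb, total_vcpus / host_cpus


mem_ratio, cpu_ratio = overcommit_ratio(GUEST_GROUPS, HOST_MEM_MB, HOST_CPUS)
print(f"memory overcommit: {mem_ratio:.2f}x, CPU overcommit: {cpu_ratio:.2f}x")
# prints: memory overcommit: 1.50x, CPU overcommit: 3.00x
```

The totals (18 GB of guest memory and 48 vCPUs) match the 1.5x memory and 3x CPU overcommit stated in the description.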

Version-Release number of selected component (if applicable):
kernel-2.6.32-272.el6.x86_64
qemu-kvm-0.12.1.2-2.294.el6.x86_64

Guest:
kernel-2.6.32-272.el6.x86_64

How reproducible:
1/1 (running 3 days so far)

Steps to Reproduce:
0. On the host, sync the time: # ntpdate clock.redhat.com
1. Boot multiple VMs on the host (1.5x memory and 3x CPU overcommit):
/usr/libexec/qemu-kvm -M rhel6.3.0 -cpu Westmere -m 1024 -smp 1,sockets=1,cores=1,threads=1 -enable-kvm -name intel-rhel-7 -uuid d7d34343-24db-4008-9ee5-22bbf7901112 -k en-us -rtc base=localtime,clock=host,driftfix=slew -usb -device usb-tablet,id=input0 -device virtio-serial-pci,id=virtio-serial0,max_ports=2,bus=pci.0,addr=0x3 -chardev file,id=charchannel0,path=/tmp/win-socket-7 -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.rhevm.vdsm -drive file=/home/rhel6.3-64-0516-1.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,cache=none,werror=stop,rerror=stop,aio=native -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,id=virtio-disk0 -drive if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw -device ide-drive,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0 -netdev tap,id=hostnet0,vhost=on,script=/etc/qemu-ifup -device virtio-net-pci,netdev=hostnet0,id=net0,mac=00:1a:4a:68:36:07,bus=pci.0,addr=0x5 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6 -chardev socket,id=111a,path=/tmp/monitor-win-7,server,nowait -mon chardev=111a,mode=readline -vnc :7

 
2. Sync the guest's time and run a script that checks the time every 15 seconds:
#ntpdate clock.redhat.com
#while true; do ntpdate -q clock.redhat.com >> time.txt; sleep 15; done
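
The log written by this loop could be reduced to a list of offsets with a small parser. The sketch below is an assumption about how one might do that, not the script actually used for this bug. The names `parse_offsets` and `SERVER_LINE` are hypothetical. The two sample lines are the `ntpdate -q` host readings quoted later in this report and are used here only to show the line format.

```python
import re

# Matches the per-server summary line printed by `ntpdate -q`, for example:
#   server 66.187.233.4, stratum 1, offset 0.543017, delay 0.31108
SERVER_LINE = re.compile(
    r"^server\s+\S+,\s+stratum\s+\d+,\s+offset\s+([-+]?\d+\.\d+),"
)


def parse_offsets(text):
    """Return the offsets (in seconds) from every 'server ...' line in text."""
    offsets = []
    for line in text.splitlines():
        match = SERVER_LINE.match(line.strip())
        if match:
            offsets.append(float(match.group(1)))
    return offsets


# Sample input in the format seen in this report. The "step time server"
# line is skipped because it does not start with "server".
SAMPLE = """\
server 66.187.233.4, stratum 1, offset 0.543017, delay 0.31108
21 May 03:28:39 ntpdate[26837]: step time server 66.187.233.4 offset 0.543017 sec
server 66.187.233.4, stratum 1, offset -17.750523, delay 0.31085
"""

offsets = parse_offsets(SAMPLE)
worst = max(offsets, key=abs)  # offset with the largest magnitude
print(offsets, worst)
```

To process the real log, the contents of `time.txt` would be passed to `parse_offsets` in place of `SAMPLE`.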


Actual results:
The guest showed a time drift of -15 seconds after running for 3 days.
This issue doesn't happen on my AMD host.
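
For scale, the reported drift can be expressed as an average rate. This is a rough sketch that assumes the drift accumulated evenly over exactly 3 days (72 hours), which the report does not confirm. The helper name `drift_ppm` is illustrative.

```python
SECONDS_PER_DAY = 86400


def drift_ppm(offset_seconds, elapsed_days):
    """Average drift rate in parts per million over the elapsed period."""
    return offset_seconds / (elapsed_days * SECONDS_PER_DAY) * 1e6


# -15 s over 3 days, the figures from "Actual results" above.
rate = drift_ppm(-15, 3)
print(f"average drift: {rate:.1f} ppm")
# prints: average drift: -57.9 ppm
```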

Expected results:


Additional info:
processor	: 15
vendor_id	: GenuineIntel
cpu family	: 6
model		: 44
model name	: Intel(R) Xeon(R) CPU           E5620  @ 2.40GHz
stepping	: 2
cpu MHz		: 2393.997
cache size	: 12288 KB
physical id	: 0
siblings	: 8
core id		: 10
cpu cores	: 4
apicid		: 21
initial apicid	: 21
fpu		: yes
fpu_exception	: yes
cpuid level	: 11
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt aes lahf_lm ida arat dts tpr_shadow vnmi flexpriority ept vpid
bogomips	: 4787.83
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

======================================
[root@intel-e5620-12-3 home]# kvm_stat -1
efer_reload                    0         0
exits                 2191583591     12590
fpu_reload             312380428      1351
halt_exits             254169489      1128
halt_wakeup            181869607       829
host_state_reload     1072809897      4722
hypercalls                     0         0
insn_emulation         335305296      1453
insn_emulation_fail            0         0
invlpg                         0         0
io_exits               810908435      3639
irq_exits               35029370       837
irq_injections         269139363      1539
irq_window               7627479        58
largepages                 38137         3
mmio_exits               7145550         0
mmu_cache_miss            132560         0
mmu_flooded                    0         0
mmu_pde_zapped                 0         0
mmu_pte_updated                0         0
mmu_pte_write              13000         0
mmu_recycled               31592         0
mmu_shadow_zapped         148397         0
mmu_unsync                     0         0
nmi_injections              7479         0
nmi_window                  7412         0
pf_fixed                21319346      1140
pf_guest                       0         0
remote_tlb_flush         2680410         9
request_irq                    0         0
signal_exits                  19         0
tlb_flush                  48655         0
Comment 1 Qunfang Zhang 2012-05-21 01:02:50 EDT
Created attachment 585715
time drift log of rhel6.3 guest
Comment 3 Qunfang Zhang 2012-05-21 03:47:02 EDT
Hi all,
I checked the host time and found that the Intel host itself had drifted after running for 3 days under stress, while the AMD host showed no obvious drift.
AMD:
[root@localhost qzhang-test]# ntpdate -q clock.redhat.com
server 66.187.233.4, stratum 1, offset 0.543017, delay 0.31108
21 May 03:28:39 ntpdate[26837]: step time server 66.187.233.4 offset 0.543017 sec

Intel: 
[root@intel-e5620-12-3 ~]# ntpdate -q clock.redhat.com
server 66.187.233.4, stratum 1, offset -17.750523, delay 0.31085
21 May 15:28:52 ntpdate[16581]: step time server 66.187.233.4 offset -17.750523 sec
Comment 4 Qunfang Zhang 2012-05-28 23:07:55 EDT
I re-tested: after syncing both the host and guest time and running multiple VMs on the host, there is no time drift on either the host or the guests. I'd like to close this bug. Thanks to everyone for their time on it.
