Bug 584740

Summary: Monotonic test fails on rhel55 guest with kvmclock
Product: Red Hat Enterprise Linux 6
Reporter: Amos Kong <akong>
Component: kernel
Assignee: Red Hat Kernel Manager <kernel-mgr>
Status: CLOSED DUPLICATE
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Priority: low
Version: 6.0
CC: ailan, gcosta, jasowang, mtosatti, tburke, virt-maint, ypu, zamsden
Target Milestone: rc
Keywords: RHELNAK
Target Release: ---
Hardware: All
OS: Linux
Doc Type: Bug Fix
Clone Of:
: 588997 613904
Last Closed: 2010-08-10 07:49:53 UTC
Bug Blocks: 562808, 613904, 654914

Description Amos Kong 2010-04-22 10:23:27 UTC
Description of problem:
Boot a RHEL 5.5 guest and run the monotonic time test on it. The test always fails.
The failure can be reproduced even on a machine with constant_tsc.

# ./time_test --duration 300 gtod -v
INFO: time_test: running gtod test on 2 cpus for 300 seconds
INFO: time_test: new gtod-warp maximum:        -1
INFO: time_test: new gtod-warp maximum:        -2
INFO: time_test: new gtod-warp maximum:        -3
INFO: time_test: new gtod-warp maximum:        -4
INFO: time_test: new gtod-warp maximum:        -5
INFO: time_test: new gtod-warp maximum:        -6
INFO: time_test: new gtod-warp maximum:        -7
INFO: time_test: new gtod-warp maximum:        -8
INFO: time_test: new gtod-warp maximum:        -9
INFO: time_test: new gtod-warp maximum:       -10
INFO: time_test: new gtod-warp maximum:       -11
INFO: time_test: new gtod-warp maximum:       -12
INFO: time_test: new gtod-warp maximum:       -13
INFO: time_test: new gtod-warp maximum:       -14
INFO: time_test: new gtod-warp maximum:       -15
INFO: time_test: new gtod-warp maximum:       -16
INFO: time_test: new gtod-warp maximum:       -17
INFO: time_test: new gtod-warp maximum:       -18
INFO: time_test: new gtod-warp maximum:       -19
INFO: time_test: new gtod-warp maximum:       -20
INFO: time_test: new gtod-warp maximum:       -21
INFO: time_test: new gtod-warp maximum:       -22
INFO: time_test: new gtod-warp maximum:       -23
FAIL: gtod-worst-warp=-23757493 \


Version-Release number of selected component (if applicable):
host kernel: 2.6.32-19.el6.x86_64
guest kernel: 2.6.18-194.el5
# rpm -qa |grep qemu
qemu-kvm-tools-0.12.1.2-2.39.el6.x86_64
qemu-kvm-0.12.1.2-2.39.el6.x86_64
qemu-kvm-debuginfo-0.12.1.2-2.39.el6.x86_64
gpxe-roms-qemu-0.9.7-6.2.el6.noarch
qemu-img-0.12.1.2-2.39.el6.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Boot a RHEL 5.5 guest
2. Log in to the guest
3. Run the monotonic test:
   # ./time_test --duration 300 gtod -v 

Actual results:
The monotonic test fails.

Expected results:
The monotonic test passes.
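
For readers unfamiliar with this kind of test, here is a rough sketch of what a gtod warp check does. This is not the source of time_test; the function names (`measure_worst_warp`, `gtod_us`) are made up for illustration, and Python threads are not pinned to CPUs the way the real test is. It only shows the idea: several threads read the clock in a globally ordered way, and any negative delta between successive reads counts as a warp.

```python
import threading
import time


def measure_worst_warp(read_clock, n_threads=2, duration_s=1.0):
    """Return the most negative delta seen between successive clock reads.

    A result of 0 means no thread observed the clock going backwards.
    """
    lock = threading.Lock()
    state = {"last": read_clock(), "worst": 0}
    deadline = time.monotonic() + duration_s

    def worker():
        while time.monotonic() < deadline:
            with lock:
                # Reading under the lock gives the reads a total order,
                # so a negative delta is a genuine backwards step.
                now = read_clock()
                delta = now - state["last"]
                if delta < state["worst"]:
                    state["worst"] = delta
                state["last"] = now

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["worst"]


def gtod_us():
    # Wall-clock time in microseconds, roughly what gettimeofday() returns.
    return time.time_ns() // 1000
```

Calling `measure_worst_warp(gtod_us, n_threads=2, duration_s=300)` would roughly mirror the 300-second run above; a nonzero result corresponds to the FAIL case.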


Additional info:
1. command line:
qemu-kvm -name vm100 -monitor stdio -drive file=/home/RHEL-Server-5.5-64.qcow2,if=ide,cache=none,boot=on -net nic,vlan=0,model=rtl8139,macaddr=00:2D:4B:53:8a:e9 -net tap,vlan=0,ifname=rtl8139_0_69,script=/usr/local/staf/test/RHEV/kvm-new/kvm-test/tests/kvm/scripts/qemu-ifup-switch,downscript=no -m 4096 -smp 2 -soundhw ac97 -usbdevice tablet -rtc-td-hack -no-hpet -cpu qemu64,+sse2 -no-kvm-pit-reinjection -vnc :10

2. cpuinfo of host:
# cat /proc/cpuinfo 
processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 67
model name      : Dual-Core AMD Opteron(tm) Processor 1216
stepping        : 3
cpu MHz         : 2400.000
cache size      : 1024 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
apicid          : 1
initial apicid  : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips        : 4822.65
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc

3. meminfo of host:
# cat /proc/meminfo 
MemTotal:        8192636 kB
MemFree:         7069632 kB
Buffers:            7952 kB
Cached:            61456 kB
SwapCached:            0 kB
Active:           777552 kB
Inactive:          47096 kB
Active(anon):     755448 kB
Inactive(anon):        0 kB
Active(file):      22104 kB
Inactive(file):    47096 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:      10305528 kB
SwapFree:       10305528 kB
Dirty:                16 kB
Writeback:             0 kB
AnonPages:        755268 kB
Mapped:            26420 kB
Shmem:               184 kB
Slab:             198940 kB
SReclaimable:      13096 kB
SUnreclaim:       185844 kB
KernelStack:        1264 kB
PageTables:         6536 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    14401844 kB
Committed_AS:    4510448 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       49256 kB
VmallocChunk:   34359635872 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:        9152 kB
DirectMap2M:     8378368 kB

Comment 2 RHEL Program Management 2010-04-22 12:32:38 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 3 Joy Pu 2010-06-02 04:45:02 UTC
With the new kernel the behavior has changed: the test passes with a single CPU but still fails with multiple CPUs. RHEL 5.5 also has this problem.
The gtod test passed, but the tsc test still failed:
10:46:34 INFO | Results (test, status, duration, info):
10:46:34 INFO | ('monotonic_time.gtod', 'GOOD', 305, 'completed successfully')
10:46:34 INFO | ('monotonic_time.clock', 'GOOD', 303, 'completed successfully')
10:46:34 INFO | ('monotonic_time.tsc', 'FAIL', 302, 'FAIL: tsc-worst-warp=-42584482535')

tsc test log:
./time_test --duration 300 tsc -v
INFO: time_test: running tsc test on 2 cpus for 300 seconds
INFO: time_test: new tsc-warp maximum: -76882236694
INFO: time_test: new tsc-warp maximum: -76882236859
INFO: time_test: new tsc-warp maximum: -76882236918
INFO: time_test: new tsc-warp maximum: -76882236962
INFO: time_test: new tsc-warp maximum: -76882236984
INFO: time_test: new tsc-warp maximum: -76882236990
INFO: time_test: new tsc-warp maximum: -76882237031
INFO: time_test: new tsc-warp maximum: -76882237044
INFO: time_test: new tsc-warp maximum: -76882237075
INFO: time_test: new tsc-warp maximum: -76882237086
INFO: time_test: new tsc-warp maximum: -76882237093
INFO: time_test: new tsc-warp maximum: -76882237106
INFO: time_test: new tsc-warp maximum: -76882237113
INFO: time_test: new tsc-warp maximum: -76882237119
INFO: time_test: new tsc-warp maximum: -76882237128
INFO: time_test: new tsc-warp maximum: -76882237131
INFO: time_test: new tsc-warp maximum: -76882237142
INFO: time_test: new tsc-warp maximum: -76882237143
INFO: time_test: new tsc-warp maximum: -76882237146
INFO: time_test: new tsc-warp maximum: -76882237156
INFO: time_test: new tsc-warp maximum: -76882237162
INFO: time_test: new tsc-warp maximum: -76882237169
FAIL: tsc-worst-warp=-76882237169
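
To compare runs, the summary line can be pulled out of a log like the one above with a few lines of Python. This is only a sketch: `parse_worst_warp` and `cycles_to_seconds` are hypothetical helpers, the regular expression is based solely on the FAIL lines quoted in this bug, and the conversion assumes the tsc test reports raw cycle counts and that the caller knows the TSC rate (neither is stated in the log).

```python
import re

# Matches summary lines such as "FAIL: tsc-worst-warp=-76882237169".
WARP_RE = re.compile(r"FAIL: (\w+)-worst-warp=(-?\d+)")


def parse_worst_warp(log_text):
    """Return (test_name, worst_warp) from the first FAIL summary, or None."""
    for line in log_text.splitlines():
        match = WARP_RE.search(line)
        if match:
            return match.group(1), int(match.group(2))
    return None


def cycles_to_seconds(cycles, tsc_hz):
    """Convert a cycle count to seconds for an assumed TSC frequency."""
    return cycles / tsc_hz
```

If the TSC on this host really runs at about 2.4 GHz (an assumption based on the bogomips value, not a measured figure), a warp of this size would be on the order of tens of seconds rather than a few cycles of skew.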


host kernel: 2.6.32-25.el6.x86_64
guest kernel: 2.6.32-31.el6.i686

command lines:
/root/autotest/client/tests/kvm/qemu -name vm1 -monitor tcp:0:6001,server,nowait -drive file=/root/autotest/client/tests/kvm/images/RHEL-Server-6.0-32.raw,if=ide,cache=writethrough,boot=on -net nic,vlan=0,model=e1000,macaddr=00:30:0D:20:1b:82 -net tap,vlan=0,ifname=e1000_0_6001,script=/root/autotest/client/tests/kvm/scripts/qemu-ifup-switch,downscript=no -m 4096 -smp 2 -soundhw ac97 -usbdevice tablet -rtc-td-hack -no-hpet -cpu qemu64,+sse2 -no-kvm-pit-reinjection -redir tcp:5000::22 -vnc :0 -serial unix:/tmp/serial-20100602-094724-MuYW,server,nowait

host cpu info:
processor       : 2
vendor_id       : AuthenticAMD
cpu family      : 16
model           : 2
model name      : AMD Phenom(tm) 8750 Triple-Core Processor
stepping        : 3
cpu MHz         : 1200.000
cache size      : 512 KB
physical id     : 0
siblings        : 3
core id         : 2
cpu cores       : 3
apicid          : 2
initial apicid  : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs
bogomips        : 4809.88
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate
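
Note that the flags line of this host lists constant_tsc and nonstop_tsc, while the Opteron 1216 host in the description lists neither. A quick way to check this across hosts is to filter the TSC-related flags out of /proc/cpuinfo. The helper below (`tsc_flags`, a name invented here) is a minimal sketch that works on the text of the file.

```python
def tsc_flags(cpuinfo_text):
    """Map each processor index to the TSC-related flags it advertises."""
    wanted = {"tsc", "constant_tsc", "nonstop_tsc", "rdtscp"}
    result = {}
    current = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processors
        key = key.strip()
        if key == "processor":
            current = int(value)
        elif key == "flags" and current is not None:
            result[current] = wanted & set(value.split())
    return result
```

On a live host this could be called as `tsc_flags(open("/proc/cpuinfo").read())`.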

Comment 4 Glauber Costa 2010-07-05 15:45:38 UTC
Is this still an issue with the current kernel?

Comment 5 Glauber Costa 2010-07-05 15:48:03 UTC
Also, kvmclock monotonicity is assured by the guest.

moving.
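
When retesting, it helps to confirm which clocksource the guest is actually using. On kernels that expose the clocksource sysfs interface, the files below report it (kvmclock shows up as `kvm-clock`). Older guest kernels may not have this directory at all, so the sketch returns None in that case. `read_clocksource` is a hypothetical helper name.

```python
from pathlib import Path

# Standard sysfs location on kernels that use the clocksource framework.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources, or None if not exposed."""
    base = Path(base)
    try:
        current = (base / "current_clocksource").read_text().strip()
        available = (base / "available_clocksource").read_text().split()
    except OSError:
        return None
    return current, available
```

Recording this alongside the time_test output makes it easier to tell a kvmclock failure from a plain TSC failure.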

Comment 7 jason wang 2010-08-10 06:30:58 UTC
Updated the product to RHEL6 according to comment #1, and also changed the QA Contact to virt-bugs.

Comment 9 RHEL Program Management 2010-08-10 06:58:10 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 10 jason wang 2010-08-10 07:49:53 UTC
According to comment #3, the kvmclock problem should be fixed, and the issue that remains is related to TSC rather than kvmclock. So we have:

TSC issue: bug 613904 Monotonic tsc test fails on rhel6 guest 
Kvmclock issue: bug 569603 kvm-clock use in guests results in wild jumps in time

Closing this bug as a duplicate.

*** This bug has been marked as a duplicate of bug 569603 ***