Description of problem:
RHEL5.4 VM running on RHEV-H host showing inconsistent time, hangs during script execution

Version-Release number of selected component (if applicable):

Host:
[root@beta-vdsa ~]# uname -a
Linux beta-vdsa.gss.lab.tlv.redhat.com 2.6.18-164.2.1.el5 #1 SMP Mon Sep 21 04:37:42 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
[root@beta-vdsa ~]# rpm -qa |grep -i kvm
kmod-kvm-83-105.el5_4.8
kvm-tools-83-105.el5_4.8
etherboot-zroms-kvm-5.4.4-10.el5
kvm-debuginfo-83-105.el5_4.8
kvm-qemu-img-83-105.el5_4.8
kvm-83-105.el5_4.8
[root@beta-vdsa ~]# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU E5420 @ 2.50GHz
stepping : 6
cpu MHz : 2493.750
cache size : 6144 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
bogomips : 4987.50
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:

processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU E5420 @ 2.50GHz
stepping : 6
cpu MHz : 2493.750
cache size : 6144 KB
physical id : 0
siblings : 4
core id : 1
cpu cores : 4
apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
bogomips : 4987.47
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:

processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU E5420 @ 2.50GHz
stepping : 6
cpu MHz : 2493.750
cache size : 6144 KB
physical id : 0
siblings : 4
core id : 2
cpu cores : 4
apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
bogomips : 4987.50
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:

processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Xeon(R) CPU E5420 @ 2.50GHz
stepping : 6
cpu MHz : 2493.750
cache size : 6144 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 10
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm constant_tsc pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
bogomips : 4987.49
clflush size : 64
cache_alignment : 64
address sizes : 38 bits physical, 48 bits virtual
power management:
_________________________________________

VM:
[root@localhost ~]# uname -a
Linux localhost.localdomain 2.6.18-164.2.1.el5 #1 SMP Mon Sep 21 04:37:42 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost ~]# dmesg|grep -i kvm
kvm-clock: cpu 0, msr 7eff:80433401, boot clock
kvm-clock: cpu 0, msr 0:1575401, primary cpu clock
kvm_get_tsc_khz: cpu 0, msr 0:1602001
kvm-clock: cpu 1, msr 0:157da81, secondary cpu clock
time.c: Using 1.193182 MHz WALL KVM GTOD KVM timer.

How reproducible:
This is exactly one minute of the script running; note that the output does not show the actual 60 seconds in stdout. Also, 5 seconds are lost.

[root@localhost ~]# while true; do sleep 1; date; done
Wed Oct 28 01:14:36 IST 2009
Wed Oct 28 01:14:43 IST 2009
Wed Oct 28 01:14:49 IST 2009
Wed Oct 28 01:14:56 IST 2009
Wed Oct 28 01:15:02 IST 2009
Wed Oct 28 01:15:09 IST 2009
Wed Oct 28 01:15:10 IST 2009
Wed Oct 28 01:15:17 IST 2009
Wed Oct 28 01:15:23 IST 2009
Wed Oct 28 01:15:24 IST 2009
Wed Oct 28 01:15:31 IST 2009

This is another run of the script; note the time jumping back:

[root@localhost ~]# while true; do sleep 1; date; done
Wed Oct 28 01:17:56 IST 2009
Wed Oct 28 01:18:02 IST 2009
Wed Oct 28 01:18:09 IST 2009
Wed Oct 28 01:18:15 IST 2009
Wed Oct 28 01:18:16 IST 2009
Wed Oct 28 01:18:12 IST 2009
Wed Oct 28 01:18:18 IST 2009
Wed Oct 28 01:18:25 IST 2009
Wed Oct 28 01:18:31 IST 2009
Wed Oct 28 01:18:32 IST 2009
Wed Oct 28 01:18:28 IST 2009
Wed Oct 28 01:18:34 IST 2009
Wed Oct 28 01:18:41 IST 2009
Wed Oct 28 01:18:47 IST 2009
Wed Oct 28 01:18:43 IST 2009

Actual results:
Adding clock=pmtmr divider=10 made things a bit better.

Expected results:
No time drift and no hangs of the running script; every second, or at least almost every second, should appear in stdout while running:
while true; do sleep 1; date; done

Additional info:
A VM exists on which it is reproducible. Ping me on #gss-rhev or via email if you require access.
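To make the backward jumps in the second run easier to spot, the `date` output can be checked mechanically. The following is only a sketch, not part of the original report: the helper names `parse_date_line` and `deltas` are hypothetical, and the sample data is copied from the second run above. It ignores the weekday and timezone fields and prints the step, in seconds, between consecutive samples.

```python
from datetime import datetime

def parse_date_line(line):
    """Parse one line of `date` output, ignoring the weekday and timezone fields."""
    _weekday, month, day, clock, _tz, year = line.split()
    return datetime.strptime(f"{month} {day} {clock} {year}", "%b %d %H:%M:%S %Y")

def deltas(lines):
    """Seconds between consecutive samples; a negative value is a backward step."""
    stamps = [parse_date_line(l) for l in lines if l.strip()]
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]

# Output of the second run, copied from the report.
SECOND_RUN = """\
Wed Oct 28 01:17:56 IST 2009
Wed Oct 28 01:18:02 IST 2009
Wed Oct 28 01:18:09 IST 2009
Wed Oct 28 01:18:15 IST 2009
Wed Oct 28 01:18:16 IST 2009
Wed Oct 28 01:18:12 IST 2009
Wed Oct 28 01:18:18 IST 2009
Wed Oct 28 01:18:25 IST 2009
Wed Oct 28 01:18:31 IST 2009
Wed Oct 28 01:18:32 IST 2009
Wed Oct 28 01:18:28 IST 2009
Wed Oct 28 01:18:34 IST 2009
Wed Oct 28 01:18:41 IST 2009
Wed Oct 28 01:18:47 IST 2009
Wed Oct 28 01:18:43 IST 2009
"""

if __name__ == "__main__":
    steps = deltas(SECOND_RUN.splitlines())
    print("steps between samples (s):", steps)
    print("backward steps (s):", [s for s in steps if s < 0])
```

With `sleep 1` between samples, every step would be expected to be close to 1 second; steps of 6 or 7 seconds and negative steps both point at the guest clock rather than at the script.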
After reboot the script ran for 60 seconds, showing every second. However, it counted 50 sec during an actual 60 sec.
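For reference, the observation above (50 seconds counted by the guest during 60 real seconds) can be expressed as a drift figure. This is a small arithmetic sketch; `drift_percent` is a hypothetical helper name, and the reference interval is assumed to come from a trusted clock outside the guest.

```python
def drift_percent(guest_elapsed, reference_elapsed):
    """Relative clock error in percent; negative means the guest clock runs slow."""
    return (guest_elapsed - reference_elapsed) / reference_elapsed * 100.0

if __name__ == "__main__":
    # 50 s counted in the guest during 60 s of real time.
    print("lost seconds:", 60 - 50)
    print("drift: %.2f%%" % drift_percent(50, 60))
```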
This is probably fixed by the last series I sent out. Dan, I gave you a kernel with the fix included. Can you please confirm that it fixes the issue for you? Thanks!
The VM with the new kernel is running, I'll test it again on Monday, and update the BZ with the results
Checked the VM today - looks like the problem is solved by the new kernel - no visible drift at all.
in kernel-2.6.18-176.el5 You can download this test kernel from http://people.redhat.com/dzickus/el5 Please do NOT transition this bugzilla state to VERIFIED until our QE team has sent specific instructions indicating when to do so. However feel free to provide a comment indicating that this fix has been verified.
*** Bug 521517 has been marked as a duplicate of this bug. ***
I ran a rhel5.5 guest on a rhel5.5 Intel host; time inside the guest went backwards after 4 hours. I ran the guest on an AMD host for a whole night and the problem did not occur there.

host: kernel-2.6.18-185.el5 kvm-83-154.el5
guest: kernel-2.6.18-185.el5 (rhel5.5-64bit)

Steps:
1. Boot a rhel5.5 guest
/usr/libexec/qemu-kvm -drive file=RHEL-Server-5.4-64-virtio.qcow2,if=virtio,boot=on -no-hpet -rtc-td-hack -usbdevice tablet -startdate now -smp 2 -m 2G -net nic,model=virtio,macaddr=20:20:20:11:23:5f,vlan=0 -net tap,vlan=0,script=/etc/qemu-ifup -cpu qemu64,+sse2 -name 64 -monitor stdio -vnc :8 -no-kvm-pit-reinjection
2. Run # ./gettimeofday inside the guest. ("gettimeofday" will be attached.)

Result:
Time went backwards after 4 hours.
[root@localhost ~]# ./gettimeofday
time went backwards: tv.tv_sec = 1264747084, tv.tv_usec = 940012
lasttv.tv_sec = 1264747084, lasttv.tv_usec = 940013

In guest:
[root@localhost ~]# dmesg | grep time.c
time.c: Using tsc for timekeeping HZ 1000
time.c: Using 1.193182 MHz WALL KVM GTOD KVM timer.
time.c: Detected 2826.230 MHz processor.

Host info (4 CPUs, only one listed here):
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz
stepping : 10
cpu MHz : 2826.231
cache size : 6144 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 4
apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
bogomips : 5652.50
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
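The kind of check this test performs can be sketched as follows. This is not the attached gettimeofday.c, whose source is not reproduced in this report; it is a minimal Python illustration under the assumption that the test simply polls the wall clock in a loop and reports the first sample that is earlier than the previous one. The function name `find_backward_step` and its parameters are hypothetical, and the clock is injectable so the logic can be exercised without a faulty guest.

```python
import time

def find_backward_step(clock=time.time_ns, max_samples=1_000_000):
    """Poll `clock` and return (previous, current) for the first sample that is
    earlier than the one before it, or None if no backward step is seen."""
    last = clock()
    for _ in range(max_samples):
        now = clock()
        if now < last:
            return last, now
        last = now
    return None

if __name__ == "__main__":
    # time.time_ns() reads the system wall clock in integer nanoseconds,
    # so a 1 microsecond step like the one reported is not lost to rounding.
    step = find_backward_step()
    if step is None:
        print("no backward step observed")
    else:
        print("time went backwards: last = %d ns, now = %d ns" % step)
```

On a healthy system the loop is expected to finish without reporting anything; in the failure above the step was a single microsecond, so the comparison has to be strict and done on unrounded values.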
Created attachment 387520 [details] gettimeofday
Created attachment 387521 [details] gettimeofday.c
Created attachment 387970 [details] time go backwards log.
Do you want to reopen the bug?
~~ Attention Customers and Partners - RHEL 5.5 Beta is now available on RHN ~~ RHEL 5.5 Beta has been released! There should be a fix present in this release that addresses your request. Please test and report back results here, by March 3rd 2010 (2010-03-03) or sooner. Upon successful verification of this request, post your results and update the Verified field in Bugzilla with the appropriate value. If you encounter any issues while testing, please describe them and set this bug into NEED_INFO. If you encounter new defects or have additional patch(es) to request for inclusion, please clone this bug per each request and escalate through your support representative.
(In reply to comment #24)
> Do you want to reopen the bug?

We had the guests running for 48 hours and have reproduced it again.
Both guest and host are installed from tree RHEL5.5-Server-20100217.0/
kernel-2.6.18-189.el5
kvm-83-157.el5

--------------------------------------------
guest          host     result
--------------------------------------------
64bitrhel5.5   amd*     pass--no time back
64bitrhel5.5   Intel*   failed
32bitrhel5.5   amd      failed
32bitrhel5.5   Intel    failed
--------------------------------------------

CLI:
/usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -smp 2 -m 2G -drive file=rhel5.5-64-virtio.qcow2,if=virtio,boot=on -net nic,vlan=0,macaddr=20:88:99:11:20:86 -net tap,vlan=0,script=/etc/qemu-ifup -uuid eb8a5d04-feae-480c-989c-edc431f3363f -cpu qemu64,+sse2 -vnc :10 -monitor stdio -notify all -name 64_Intel -startdate now

/usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -smp 2 -m 2G -drive file=rhel5.5-32-virtio.qcow2,if=virtio,boot=on -net nic,vlan=0,macaddr=20:88:99:11:20:59 -net tap,vlan=0,script=/etc/qemu-ifup -uuid babd64c0-cd07-46fd-bcfb-3f8f6015bbf3 -cpu qemu64,+sse2 -vnc :11 -monitor stdio -notify all -M rhel5.5.0 -startdate now -name 32_intel

/usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -smp 2 -m 2G -drive file=rhel5.5-64-virtio.qcow2,if=virtio,boot=on -net nic,vlan=0,macaddr=20:88:99:11:20:69 -net tap,vlan=0,script=/etc/qemu-ifup -uuid c431b0cd-ddba-4a5c-9aa2-edccf9048316 -cpu qemu64,+sse2 -vnc :10 -monitor stdio -notify all -name 64_amd -startdate now

/usr/libexec/qemu-kvm -no-hpet -usbdevice tablet -rtc-td-hack -smp 2 -m 2G -drive file=rhel5.5-32-virtio.qcow2,if=virtio,boot=on -net nic,vlan=0,macaddr=20:88:99:11:20:56 -net tap,vlan=0,script=/etc/qemu-ifup -uuid 91a76cfe-8df4-466a-a177-5f80ed7ecf91 -cpu qemu64,+sse2 -vnc :11 -monitor stdio -notify all -M rhel5.5.0 -startdate now -name 32_amd

host cpuinfo:
processor : 3
vendor_id : AuthenticAMD
cpu family : 16
model : 2
model name : AMD Phenom(tm) 9600B Quad-Core Processor
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm

processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 23
model name : Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
Created attachment 397265 [details] 32bitrhel5.5-AMD-timeback.txt
Created attachment 397266 [details] 32bitrhel5.5-Intel-timeback.txt
Created attachment 397267 [details] 64bitrhel5.5-Intel-timeback.txt
reopen
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2010-0178.html