Bug 640608
Summary: | Wrong value after counting two Time values | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Michal Fojtik <mfojtik> | ||||
Component: | kernel | Assignee: | Prarit Bhargava <prarit> | ||||
Status: | CLOSED WONTFIX | QA Contact: | Red Hat Kernel QE team <kernel-qe> | ||||
Severity: | medium | Docs Contact: | |||||
Priority: | low | ||||||
Version: | 5.5 | CC: | drjones, jwilson, vgrinco, xen-maint | ||||
Target Milestone: | rc | ||||||
Target Release: | --- | ||||||
Hardware: | All | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2013-11-04 16:51:44 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Thanks for the report. There is definitely a bug in there somewhere. I wonder if this is ruby-specific. I suspect Xen, not ruby. What does this print for you: perl '-MTime::HiRes "time"' -le '$t=time; sleep 1; print time - $t' If that shows the same problem, then please reassign the bug. If it does not, please run the following and attach the resulting "log" file: strace -o log ruby -e 't=Time.now; sleep 1; puts Time.now - t' (In reply to comment #1) > Thanks for the report. > There is definitely a bug in there somewhere. > I wonder if this is ruby-specific. I suspect Xen, not ruby. That's right, but weird is, that Perl or C doesn't suffer with this kind of problem. > What does this print for you: > > perl '-MTime::HiRes "time"' -le '$t=time; sleep 1; print time - $t' This one works fine: perl '-MTime::HiRes "time"' -le '$t=time; sleep 1; print time - $t' 1.00364589691162 > > If that shows the same problem, then please reassign the bug. > If it does not, please run the following and attach the resulting "log" file: > > strace -o log ruby -e 't=Time.now; sleep 1; puts Time.now - t' Log file attached. Created attachment 452057 [details]
strace output
Thanks for the info. Note that Perl does not use gettimeofday, but rather the superior "nanosleep" function. Contrast your output: gettimeofday({1286441633, 652023}, NULL) = 0 time(NULL) = 1286441633 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) time(NULL) = 1286441634 gettimeofday({1286441634, 651329}, NULL) = 0 with what I see on RHEL6: rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0 select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) It may be that RHEL5's gettimeofday isn't working quite right. This discussion suggests that your problem could be due to how your virtual system is configured: http://thread.gmane.org/gmane.comp.emulators.xen.user/55086/focus=55117 Here is a related bug. Make sure that you have that fix: https://bugzilla.redhat.com/show_bug.cgi?id=427588 (In reply to comment #4) > Thanks for the info. > Note that Perl does not use gettimeofday, but rather the superior "nanosleep" > function. > > Contrast your output: > > gettimeofday({1286441633, 652023}, NULL) = 0 > time(NULL) = 1286441633 > select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) > time(NULL) = 1286441634 > gettimeofday({1286441634, 651329}, NULL) = 0 > > with what I see on RHEL6: > > rt_sigprocmask(SIG_BLOCK, NULL, [], 8) = 0 > select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout) > > It may be that RHEL5's gettimeofday isn't working quite right. > > This discussion suggests that your problem could be due to how your virtual > system is configured: > > http://thread.gmane.org/gmane.comp.emulators.xen.user/55086/focus=55117 > > Here is a related bug. Make sure that you have that fix: > > https://bugzilla.redhat.com/show_bug.cgi?id=427588 Hi, we tried several combinations of kernel parameters and also kernel update but nothing works :-( So I thought this bug is not Ruby related as you said before but XEN. I'm changing component to XEN then. Please describe the host and guest completely; kernel revisions for both, type of guest (pv|fv), x86_64 or i386. Also take a look at dmesg for both host and guest and 'xm dmesg' on the host. If there's anything interesting, please attach it. Thanks, Drew Guest: * PV XEN guest * 2.6.18-194.17.1.el5xen * x86_64 * 1 VCPU Host: * RHEL5.5 * 2.6.18-194.11.3.el5xen * x86_64 * 8 CPU Do you need anything else? I captured some data for a variety of systems. ------------------------------------------------------------------------------- config mean standard max min diff deviation from 1 ------------------------------------------------------------------------------- dom0 0.997042 0.000655 1.005585 0.996048 0.002958 f12_bm 1.000866 0.000346 1.003212 1.000062 0.000866 rhel55-64pv 0.997907 0.000720 1.005739 0.993254 0.002093 rhel6-32pv 1.001077 0.000053 1.001859 1.000138 0.001077 rhel6-64fv 1.001599 0.000296 1.002114 1.001069 0.001599 rhel6-64pv 1.001071 0.000058 1.001999 1.000136 0.001071 The f12_bm is my f12 bare-metal notebook. dom0 is a latest rhel 5.6 candidate and was the dom0 used for all the other cases, which are xen guests. We see that bare-metal has the closest mean to 1, but it's standard deviation isn't the best. This indicates that we have error here even without virtualization. With respect to virtualization we see that the rhel5 systems (dom0 and rhel55-64pv) are generally less than 1, as was observed when this bug was opened, but not always. They also have the worst standard deviation and difference of the mean from 1. However, the rhel6 guests are all generally higher and have differences closer to that of the bare-metal, but with even less standard deviation. What all this tells us is that if there's a problem, and not just some expected error, then it's at least in the kernel, not the hypervisor, or it could be even higher up the stack. I'll move this bug over to the kernel for now to get comments from them. Note, I also quickly scanned the data by eye to see if it looked the results were increasing or decreasing -- indicating some cumulative error on time accounting. I didn't see any obvious patterns. This Bugzilla has been reviewed by Red Hat and is not planned on being addressed in Red Hat Enterprise Linux 5, and therefore is being closed. If this bug is critical to production systems, please contact your Red Hat support representative and provide a sufficient business justification in order to re-open it. |
Description of problem: Counting with time is broken somehow on XEN guest in Ruby, but it could be a system issue. Version-Release number of selected component (if applicable): Red Hat Enterprise Linux Server release 5.5 (Tikanga) ruby 1.8.7 (2009-12-24 patchlevel 248) [x86_64-linux] How reproducible: Run a simple Ruby script on Xen guest with RHEL5 Steps to Reproduce: 1. Install Ruby 2. Launch this Ruby script: current_time = Time.now sleep(1) actual_time = Time.now # Time should be >1 puts actual_time-current_time 3. Actual results: 0.994902 Expected results: >1