From private email from Cai: I ran the reproducer [prarit: see BZ 250708], and found it failed in one of boxes while doing online/offline CPUs with .98.el5 Kernel, [root@nec-em11 ~]# ./gtod_test TPstart_time=1221418633.824041s start time =1221418648.829208 end time =1221418648.829207 reverse -1000 nsec (-1.999999 ) It could be reproduced this way, Run gtod_test (the original reproducer, in attachment) in one terminal window. Then, run the following script in another window, cd /sys/devices/system/node/node0 while :; do echo 0 >cpu1/online echo 0 >cpu2/online sleep 2 echo 1 >cpu1/online echo 1 >cpu2/online sleep 2 done Sometimes, it takes a little while or needs to re-run the above to see the problem. I have tried the same on another machine (nec-em12.rhts.bos.redhat.com) which could reproduce the original bug with .92 Kernel, but looks like there was no such issue with .98 Kernel. Nec-em12 has a single dual-core physical processor, while nec-em11 has two single-core processors with HT enabled. Prarit: part of /proc/cpuinfo from nec-em11.rhts.bos.redhat.com: processor : 0 vendor_id : GenuineIntel cpu family : 15 model : 4 model name : Intel(R) Xeon(TM) CPU 3.20GHz stepping : 3 cpu MHz : 3192.078 cache size : 2048 KB physical id : 0 siblings : 2 core id : 0 cpu cores : 1 fpu : yes fpu_exception : yes cpuid level : 5 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall lm constant_tsc pn i monitor ds_cpl cid cx16 xtpr bogomips : 6386.56 clflush size : 64 cache_alignment : 128 address sizes : 36 bits physical, 48 bits virtual power management:
Please report the clocksource being used? Presumably TSC since this is specific to CPU disablement/enablement. Just want to confirm.
bmaly, time.c: Using 3.579545 MHz WALL PM GTOD PIT/TSC timer. time.c: Detected 3192.109 MHz processor. P.
I've extended the reservation for nec-em11.rhts for 99 hours. I've created a new gtod_test, gtod_test01.c which runs the gtod_test in a slightly different manner. In the new test it does the following: while (1){ if( gettimeofday(&tv_start, NULL) != 0) error(EXIT_FAILURE,errno,"cannot gettimeofday"); if( gettimeofday(&tv_end, NULL) != 0) error(EXIT_FAILURE,errno,"cannot gettimeofday"); if(tv_lt(tv_end,tv_start)){ print_difftime(tv_start,tv_end); break; } tv_start = tv_end ; } ie) compares back-to-back reads of gettimeofday() I'm currently running the test *WITHOUT* enabling and disabling cpus to see if the problem exists on bare-metal. If the test is still running tomorrow AM, I will reboot and run the test while enabling and disabling cpus. FYI, P.
The test ran fine overnight. I am going to enable Cai's script to enable and disable cpus and retest. P.
This doesn't appear to be a gettimeofday issue. It looks like CPU Hotplug is broken. [root@nec-em11 prarit]# cd /sys/devices/system/node/node0 [root@nec-em11 node0]# echo 0 >cpu1/online results in [ 179.894725] CPU 1 is now offline [root@nec-em11 node0]# echo 1 >cpu1/online results in *no* output. [root@nec-em11 node0]# echo 0 >cpu1/online results in (note the output below includes some debug output I've added) [ 183.565332] Initializing CPU#1 [ 183.565361] start_secondary: calling smp_callin [ 183.565365] CPU#1 (phys ID: 6) waiting for CALLOUT [ 183.566224] CALLIN, before setup_local_APIC(). [ 183.566230] smp_callin: calling calibrate_delay [ 183.625901] Calibrating delay using timer specific routine.. 6383.23 BogoMIPS (lpj=3191617) [ 183.625905] Stack at about ffff8100026cdef4 [ 183.626287] Intel(R) Xeon(TM) CPU 3.20GHz stepping 03 [ 183.626292] [ 183.626294] Call Trace: [ 183.626316] [<ffffffff800768c5>] start_secondary+0x3b/0x4b8 [ 183.626352] [ 183.626354] cpu 1: setting up apic clock [ 183.626906] cpu 1: enabling apic timer [ 186.866164] CPU 1 is now offline ie) the last "echo 0" results in the processor being brought online and then immediately offline. I will go speak to dzickus to see if any changes were recently made in CPU Hotplug that I'm unaware of.
After some additional testing it looks like the difference between gettimeofday() reads is not always 1 usec. It seems to vary anywhere from 10 usec to lt. 1 usec (but obviously not 0 ;) ). That footprint also indicates that this isn't the same issue encountered previously. I suspect that some value is not being updated when a cpu is brought back up via the sysfs interface. P.
example output showing gt. 1 usec difference between gettimeofday() reads: [root@nec-em11 prarit]# ./gtod_test01 TPstart_time=1221595467.243089s start time =1221595469.699872 end time =1221595469.699867 reverse -5000 nsec (-1.999995 ) P.
This is still a problem, however, it is a relatively minor problem that occurs only on bare-metal. Closing as WONTFIX. P.