Bug 700886 - RHEL5.6 TSC used as default clock source on multi-chassis system
Summary: RHEL5.6 TSC used as default clock source on multi-chassis system
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.6
Hardware: x86_64
OS: All
Priority: urgent
Severity: urgent
Target Milestone: rc
Assignee: Prarit Bhargava
QA Contact: Zhang Kexin
URL:
Whiteboard:
Depends On:
Blocks: 684940 690969 726799 758797
Reported: 2011-04-29 18:20 UTC by IBM Bug Proxy
Modified: 2018-11-26 18:29 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
With this update, IBM System x3850 X5 is now properly identified as a multi-chassis system by querying the system name and checking for multiple Chassis entries in the SMBIOS table. If multiple Chassis entries are found, the TSC is marked as unsynchronized. The side effect of this solution is that the kernel will attempt to synchronize the TSC on every CPU during system boot which will cause a small delay and error message to be displayed. For other multi-chassis systems, the "notsc" boot parameter can be used to disable the TSC.
Clone Of:
Environment:
Last Closed: 2012-02-21 03:47:04 UTC
Target Upstream Version:


Attachments
RHEL5 initial patch (3.45 KB, patch), 2011-05-24 13:28 UTC, Prarit Bhargava
Preliminary updated patch to detect IBM multi-chassis systems (3.81 KB, text/plain), 2011-05-25 17:30 UTC, IBM Bug Proxy
RHEL5 v2 (1.50 KB, patch), 2011-05-27 15:04 UTC, Prarit Bhargava
Patch to detect IBM multi-chassis, modified version of Prarit's earlier patch (1.64 KB, text/plain), 2011-05-27 19:50 UTC, IBM Bug Proxy
RHEL5 v3 (1.53 KB, patch), 2011-05-30 15:22 UTC, Prarit Bhargava
gettimeofday on 4 cpus which locates in 4 nodes individually (2.96 KB, text/plain), 2011-12-08 09:51 UTC, Zhang Kexin
gettimeofday.c (3.05 KB, text/plain), 2011-12-10 01:17 UTC, Zhang Kexin


Links
System | ID | Priority | Status | Summary | Last Updated
IBM Linux Technology Center | 71837 | None | None | None | Never
Red Hat Knowledge Base (Legacy) | 55940 | None | None | None | Never
Red Hat Product Errata | RHSA-2012:0150 | normal | SHIPPED_LIVE | Moderate: Red Hat Enterprise Linux 5.8 kernel update | 2012-02-21 07:35:24 UTC

Description IBM Bug Proxy 2011-04-29 18:20:45 UTC
The RHEL5.6 x86_64 kernel picks the TSC as a clock source on a two-node x3850 M2 system. The 
TSC is not a reliable clock source on this system because it cannot be synchronized across the two 
chassis. Using the TSC leads to clock skew and other time inconsistencies.

In RHEL5.5 and earlier the HPET was chosen as the default clock source. It is reliable on the 
multi-chassis system.

The change in behavior was introduced by a change to the apic_is_clustered_box() function. This 
function used to recognize the dual-node x3850 M2 system as a multi-chassis system and mark 
the TSC as unreliable. In RHEL5.6 this function identifies the TSC as reliable. This is not correct.

Problem also exists with RHEL5.6 on the x3850 X5.

1. Server architecture(s) (please list all affected) (x86/POWER6/Z/etc.): x86_64
2. Server type (9117-MMA/HS20/s390/etc.): x3850 X5, x3850 M2
3. Other components involved (ixgbe/java/emulex/etc.): kernel
4. Does the server have the latest GA firmware? Yes
5. Has the problem been shown to occur on more than one system? Yes
6. Collect "sosreport" from machine problem was found on, and attach to bug.
7. What is the latest official distro build on which this bug has been seen? RHEL 5.6

Comment 1 IBM Bug Proxy 2011-04-29 23:50:20 UTC
------- Comment From masbock@us.ibm.com 2011-04-29 19:48 EDT-------
The problem was introduced with the linux-2.6-x86_64-unify-apic-mapping-code.patch

The patch makes the assumption that systems setting the FORCE_APIC_PHYSICAL_DESTINATION_MODE
bit in the FADT cannot be multi-chassis. This assumption does not hold.

Comment 2 IBM Bug Proxy 2011-05-06 17:40:27 UTC
------- Comment From masbock@us.ibm.com 2011-05-06 13:31 EDT-------
This is a regression from RHEL5.5. Serious time skew problems due to this bug have been observed by a customer.
Therefore I raise the severity to ship issue.

Comment 5 IBM Bug Proxy 2011-05-09 17:40:48 UTC
------- Comment From masbock@us.ibm.com 2011-05-09 13:35 EDT-------
I was wrong when I said this was a regression. In fact, in RHEL5.5 the TSC is also selected as a clock source on the x3850 M2 dual-node system.
None of the multi-node tests (designed to mark the TSC as unstable) work for this system.

Comment 6 Prarit Bhargava 2011-05-11 12:26:05 UTC
I've removed the Regression flag from this BZ.

Max -- from your private email you said that RHEL6 correctly chooses the HPET.  I do know that the order of clocksource was changed between RHEL5 and RHEL6.  It is entirely possible that is why RHEL6 works.

I'll take a closer look at the code and specifically the decisions made based on the FADT table information.

P.

Comment 9 Prarit Bhargava 2011-05-13 19:18:34 UTC
(In reply to comment #6)
> I've removed the Regression flag from this BZ.
> 
> Max -- from your private email you said that RHEL6 correctly chooses the HPET. 
> I do know that the order of clocksource was changed between RHEL5 and RHEL6. 
> It is entirely possible that is why RHEL6 works.
> 
> I'll take a closer look at the code and specifically the decisions made based
> on the FADT table information.
> 
> P.

I don't see anything in the timer code that accesses the FADT.  Max, could you send me a dmesg output from a "good" boot and a "bad" boot?  I'd like to take a look ...

P.

Comment 10 IBM Bug Proxy 2011-05-16 16:50:24 UTC
------- Comment From masbock@us.ibm.com 2011-05-16 12:43 EDT-------
(In reply to comment #11)
> (In reply to comment #6)
> > I've removed the Regression flag from this BZ.
> >
> > Max -- from your private email you said that RHEL6 correctly chooses the HPET.
> > I do know that the order of clocksource was changed between RHEL5 and RHEL6.
> > It is entirely possible that is why RHEL6 works.
> >
> > I'll take a closer look at the code and specifically the decisions made based
> > on the FADT table information.
> >
> > P.
> I don't see anything in the timer code that accesses the FADT.  Max, could you
> send me a dmesg output from a "good" boot and a "bad" boot?  I'd like to take a
> look ...

Hi Prarit,

the fact that RHEL6 picks the HPET on the dual-node x3850 M2 is because there is a "time warp" check for TSCs on different CPUs. This check discovers a warp between the TSC on CPU 0 (chassis 1) and CPU 16 (chassis 2) and as a consequence removes the TSC from the list of available clock sources.

RHEL5: My earlier comment that the new code in  apic_is_clustered_box() broke the kernel's ability to detect x3850 M2 multi-node system is incorrect. These multi-chassis systems were never detected as such by RHEL5. (the dmi_check_multi() function applies to older IBM systems).

The fact remains that neither RHEL5 nor RHEL6 categorizes this dual-node system as a multi-chassis box.
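
For readers following the thread, the "time warp" idea mentioned above can be pictured with a small user-space sketch. It is only an illustration, not the kernel's implementation: the function name max_tsc_warp and the idea of passing pre-recorded readings in an array are assumptions made for this example. The readings are assumed to have been taken one at a time, alternating between CPUs, so with synchronized counters the sequence should never go backwards.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified model of a TSC warp check (hypothetical helper, not kernel code).
 * samples[] holds counter readings in the order they were taken, alternating
 * between CPUs. Returns the largest backwards jump seen, or 0 if the sequence
 * never went backwards.
 */
uint64_t max_tsc_warp(const uint64_t *samples, size_t n)
{
	uint64_t max_warp = 0;
	size_t i;

	for (i = 1; i < n; i++) {
		if (samples[i] < samples[i - 1]) {
			/* a later reading is smaller than an earlier one: a warp */
			uint64_t warp = samples[i - 1] - samples[i];

			if (warp > max_warp)
				max_warp = warp;
		}
	}
	return max_warp;
}
```

A non-zero result corresponds to the situation described for CPU 0 and CPU 16: the counters on the two chassis disagree, so the TSC should not be used as a system-wide clock source.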

Comment 11 Prarit Bhargava 2011-05-17 13:02:09 UTC
.
> 
> Hi Prarit,
> 
> the fact that RHEL6 picks the HPET on the dual-node x3850 M2 is because there
> is a "time warp" check for TSCs on different CPUs. This check discovers a warp
> between the TSC on CPU 0 (chassis 1) and CPU 16 (chassis 2) and as a
> consequence removes the TSC from the list of available clock sources.

Ah, I see.  So it's purely by accident that RHEL6 does the right thing.

> 
> RHEL5: My earlier comment that the new code in  apic_is_clustered_box() broke
> the kernel's ability to detect x3850 M2 multi-node system is incorrect. These
> multi-chassis systems were never detected as such by RHEL5. (the
> dmi_check_multi() function applies to older IBM systems).
> 
> The fact remains that neither RHEL5 nor RHEL6 categorizes this dual-node system
> as a multi-chassis box.

Hmm ... do you know if the system is correctly identified as multi-chassis upstream?  If that's broken then we should attempt to fix this problem there and move the code back into RHEL5 and RHEL6.

I wonder if the chassis type in the SMBIOS (Type3, "Type" field which should be 0x19) is correct on your system?

Can you do a 'dmidecode -t 3' and put the output in this BZ?

Thanks,

P.

Comment 12 IBM Bug Proxy 2011-05-20 20:01:31 UTC
------- Comment From masbock@us.ibm.com 2011-05-20 15:53 EDT-------
(In reply to comment #13)

> I wonder if the chassis type in the SMBIOS (Type3, "Type" field which should be
> 0x19) is correct on your system?
> Can you do a 'dmidecode -t 3' and put the output in this BZ?

# dmidecode -t 3
# dmidecode 2.11
SMBIOS 2.4 present.

Handle 0x003A, DMI type 3, 13 bytes
Chassis Information
Manufacturer: IBM
Type: Main Server Chassis
Lock: Not Present
Version: Not Specified
Serial Number: Not Specified
Asset Tag:
Boot-up State: Safe
Power Supply State: Unknown
Thermal State: Unknown
Security Status: Unknown

Handle 0x003B, DMI type 3, 13 bytes
Chassis Information
Manufacturer: IBM
Type: Main Server Chassis
Lock: Not Present
Version: Not Specified
Serial Number: Not Specified
Asset Tag:
Boot-up State: Safe
Power Supply State: Unknown
Thermal State: Unknown
Security Status: Unknown

Comment 13 Prarit Bhargava 2011-05-22 21:36:26 UTC
Okay, that seems correct (and what I wrote about earlier with 0x19 was actually incorrect).

Each chassis has its own Chassis structure, and you have two chassis, so there are two Chassis entries in the SMBIOS structs.

I think I can code around this scenario -- can you test out a kernel patch for me?

Thanks,

P.
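
As an illustration of what "counting Chassis entries" means, here is a rough user-space sketch that walks a raw SMBIOS structure table and counts structures of a given type. It is not the kernel patch attached to this bug; the function name smbios_count_type and the two constants are made up for the example. It relies only on the standard SMBIOS layout: each structure starts with a type byte, a length byte for the formatted area, and a two-byte handle, and is followed by a string set terminated by two NUL bytes.

```c
#include <stddef.h>
#include <stdint.h>

#define SMBIOS_TYPE_CHASSIS 3	/* "DMI type 3" in the dmidecode output */
#define SMBIOS_TYPE_END     127	/* end-of-table marker */

/*
 * Count structures of the given type in a raw SMBIOS structure table.
 * Hypothetical helper for illustration; stops at the first malformed header.
 */
size_t smbios_count_type(const uint8_t *table, size_t len, uint8_t type)
{
	size_t off = 0, count = 0;

	while (off + 4 <= len) {
		uint8_t cur_type = table[off];
		uint8_t cur_len = table[off + 1];

		if (cur_len < 4 || off + cur_len > len)
			break;		/* malformed header, stop walking */
		if (cur_type == type)
			count++;
		if (cur_type == SMBIOS_TYPE_END)
			break;

		/* skip the formatted area, then scan to the double-NUL terminator */
		off += cur_len;
		while (off + 1 < len && (table[off] || table[off + 1]))
			off++;
		off += 2;
	}
	return count;
}
```

For a table like the one dumped in comment 12, smbios_count_type(table, len, SMBIOS_TYPE_CHASSIS) would be expected to return 2, which is the signal the proposed fix keys on.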

Comment 14 IBM Bug Proxy 2011-05-23 19:30:29 UTC
------- Comment From lcm@us.ibm.com 2011-05-23 15:23 EDT-------
(In reply to comment #13)
> .
> > Hi Prarit,
> > the fact that RHEL6 picks the HPET on the dual-node x3850 M2 is because there
> > is a "time warp" check for TSCs on different CPUs. This check discovers a warp
> > between the TSC on CPU 0 (chassis 1) and CPU 16 (chassis 2) and as a
> > consequence removes the TSC from the list of available clock sources.
> Ah, I see.  So it's purely by accident that RHEL6 does the right thing.

Not necessarily by accident. In my opinion, RHEL6 (and current mainline) are using the most accurate mechanism for determining whether the CPU TSCs can be used as a global time source - checking for TSC time warp across sockets/buses/interconnects, after calling unsynchronized_tsc() fast path.

Multi-chassis platforms typically don't have synchronized TSCs across chassis boundaries. However, it would be perfectly reasonable to assume that a platform could implement logic that would keep the TSCs synchronous even across physical nodes. So, a generic 'if multi-chassis' check may not always apply.

I think the appropriate way to fix this for RHEL5 would be to incorporate the check tsc sync code from mainline (probably too invasive?) or include multi_dmi_table[] entries for the other affected multi-node servers. The caveat with simply including multi_dmi_table[] entries is that single node and multi node servers will have the same DMI information. So, an additional change that checks the number of nodes (chassis) or number of CPUs in the platform would also be required.

Comment 15 IBM Bug Proxy 2011-05-23 22:50:22 UTC
------- Comment From masbock@us.ibm.com 2011-05-23 18:49 EDT-------
For reference, the TSC time warp check went into 2.6.19 and is described in this article:
http://lwn.net/Articles/211051/

On the dual-node x3850 M2 with RHEL6 it is this code that detects a time warp between TSCs on different nodes.

Comment 16 Prarit Bhargava 2011-05-23 23:11:09 UTC
>I think the appropriate way to fix this for RHEL5 would be to incorporate the
>check tsc sync code from mainline (probably too invasive?) 

I spent my weekend reviewing the TSC Warp code and I agree that it is too invasive for this stage in RHEL5.

>or include
>multi_dmi_table[] entries for the other affected multi-node servers. The caveat
>with simply including multi_dmi_table[] entries is that single node and multi
>node servers will have the same DMI information. 

Well ... maybe we could figure something out for your specific system.  What we do know is that there are TWO (or more) SMBIOS Type 3 Chassis structures.

So maybe something like:

if (vendor == IBM && model == x3850 && num_chassis() > 1)
    notsc = true;

Of course, I would use the standard dmi code in the kernel for this ...

I realize this doesn't scale well, but I don't think the TSC Warp code would get into RHEL5.

P.

Comment 17 IBM Bug Proxy 2011-05-24 00:00:23 UTC
------- Comment From lcm@us.ibm.com 2011-05-23 19:51 EDT-------
(In reply to comment #19)
> >or include
> >multi_dmi_table[] entries for the other affected multi-node servers. The caveat
> >with simply including multi_dmi_table[] entries is that single node and multi
> >node servers will have the same DMI information.
>
> Well ... maybe we could figure something out for your specific system.  What we
> do know is that there are TWO (or more) SMBIOS Type 3 Chassis structures.
>
> So maybe something like:
>
> if (vendor == IBM && model == x3850 && num_chassis() > 1)
>     notsc = true;
>
> Of course, I would use the standard dmi code in the kernel for this ...
>
> I realize this doesn't scale well, but I don't think the TSC Warp code would
> get into RHEL5.
>

This works for me. Max and I can get you the appropriate DMI data for the pertinent servers. We have already had a couple of customer escalations related to this issue, and while there's a boot option workaround, it would be terrific if things just worked as expected out of the box. Thanks!

Comment 18 Prarit Bhargava 2011-05-24 13:28:43 UTC
Created attachment 500609 [details]
RHEL5 initial patch

lcm (sorry, I didn't catch your full name) and Max,

Can you please modify this patch with your DMI entries and test?  This patch will count the number of type 3 structures, and cause unsynchronized_tsc() to return 1.

Thanks,

P.

Comment 19 IBM Bug Proxy 2011-05-24 18:31:27 UTC
------- Comment From masbock@us.ibm.com 2011-05-24 14:20 EDT-------
(In reply to comment #21)
> Created an attachment (id=61710) [details]
> RHEL5 initial patch
>
>
> ------- Comment on attachment From prarit@redhat.com 2011-05-24 09:28:43
> EDT-------
>
>
> lcm (sorry, I didn't catch your full name) and Max,
>
> Can you please modify this patch with your DMI entries and test?  This patch
> will count the number of type 3 structures, and cause unsynchronized_tsc() to
> return 1.
>

I am collecting the DMI information and will test the patch.

Comment 20 IBM Bug Proxy 2011-05-25 17:30:32 UTC
Created attachment 500886 [details]
Preliminary updated patch to detect IBM multi-chassis systems


------- Comment on attachment From masbock@us.ibm.com 2011-05-25 13:22 EDT-------


Prarit, I updated your patch with DMI information that will match some of the systems for which we need to detect multi-chassis. I have tested the patch on an IBM x3850 M2 multi-chassis system. It correctly detects the multiple chassis and selects the HPET as clock source. I will have to do more extensive testing, including the case of the single-chassis system of the same type.

Comment 21 IBM Bug Proxy 2011-05-25 22:50:32 UTC
------- Comment From masbock@us.ibm.com 2011-05-25 18:48 EDT-------
The patch has a side effect. Due to the following code:
static void __cpuinit tsc_sync_wait(void)
{
	/*
	 * When the CPU has synchronized TSCs assume the BIOS
	 * or the hardware already synced.  Otherwise we could
	 * mess up a possible perfect synchronization with a
	 * not-quite-perfect algorithm.
	 */
	if (notscsync || !cpu_has_tsc || !unsynchronized_tsc())
		return;
	sync_tsc(0);
}

sync_tsc is now called on each CPU because unsynchronized_tsc() returns 1. This does no harm in this case, but it is unnecessary and noisy (and perhaps confusing: why sync the TSCs if we don't use them). Here is boot time dmesg output from one of the CPUs:

Booting processor 6/32 APIC 0x11
Initializing CPU#6
Calibrating delay using timer specific routine.. 5863.06 BogoMIPS (lpj=2931531)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 4096K
CPU 6/11 -> Node 0
CPU: Physical Processor ID: 4
CPU: Processor Core ID: 1
CPU6: Thermal monitoring enabled (TM1)
Intel(R) Xeon(R) CPU           X7350  @ 2.93GHz stepping 0b
APIC: IBM x3850 Multi Chassis detected
CPU 6: Syncing TSC to CPU 0.
CPU 6: synchronized TSC with CPU 0 (last diff -407 cycles, maxerr 4488 cycles)
SMP alternatives: switching to SMP code

Comment 22 Prarit Bhargava 2011-05-25 23:01:12 UTC
(In reply to comment #21)
> ------- Comment From masbock@us.ibm.com 2011-05-25 18:48 EDT-------
> The patch has a side effect. Due to the following code:
> static void __cpuinit tsc_sync_wait(void)
> {
> 	/*
> 	 * When the CPU has synchronized TSCs assume the BIOS
> 	 * or the hardware already synced.  Otherwise we could
> 	 * mess up a possible perfect synchronization with a
> 	 * not-quite-perfect algorithm.
> 	 */
> 	if (notscsync || !cpu_has_tsc || !unsynchronized_tsc())
> 		return;
> 	sync_tsc(0);
> }

Working around that may be difficult.  My vote is that we just live with it.  We know we're going to reject the TSC anyway and, like you said, it is harmless and just spits out a bit of extra (ignorable) info into dmesg.

P.

Comment 25 Prarit Bhargava 2011-05-27 15:04:12 UTC
Created attachment 501325 [details]
RHEL5 v2

Max, does this patch work for you?  It's a bit cleaner than the first patch...

P.

Comment 26 IBM Bug Proxy 2011-05-27 17:40:31 UTC
------- Comment From masbock@us.ibm.com 2011-05-27 13:31 EDT-------
(In reply to comment #26)
> Created an attachment (id=61802) [details]
> RHEL5 v2

>
> Max, does this patch work for you?  It's a bit cleaner than the first patch...
>

Prarit,

unfortunately this patch doesn't work. unsynchronized_tsc() is called from every secondary CPU. num_chassis gets incremented every time unsynchronized_tsc() is invoked. num_chassis ends up being (NUM_CPUS * NUM_CHASSIS).
Perhaps something like this would work:

__cpuinit int unsynchronized_tsc(void)
{
#ifdef CONFIG_SMP
+       /*
+        * RHEL5: Upstream the TSC Warp code should catch multi-chassis
+        * systems.  The code is too invasive for RHEL5.  Doing this
+        * check here is safe ...
+        */
+       if (dmi_check_system(multi_dmi_table)) {
+               if (num_chassis)      /* only walk once */
+                       dmi_walk(check_multi_chassis);
+               if (num_chassis > 1)
+                       return 1;
+       }
+

But this only works if we are sure unsynchronized_tsc is called sequentially on all CPUs.

- Max

Comment 27 IBM Bug Proxy 2011-05-27 17:50:28 UTC
------- Comment From masbock@us.ibm.com 2011-05-27 13:46 EDT-------
(In reply to comment #27)

Correction to my previous comment: it should really be "if (!num_chassis) /* walk only once */"

> Perhaps something like this would work:
>
>  __cpuinit int unsynchronized_tsc(void)
>  {
>  #ifdef CONFIG_SMP
> +       /*
> +        * RHEL5: Upstream the TSC Warp code should catch multi-chassis
> +        * systems.  The code is too invasive for RHEL5.  Doing this
> +        * check here is safe ...
> +        */
> +       if (dmi_check_system(multi_dmi_table)) {
+               if (!num_chassis)      /* walk only once */  <-- was wrong before
> +                       dmi_walk(check_multi_chassis);
> +               if (num_chassis > 1)
> +                       return 1;
> +       }
> +
>

Comment 28 IBM Bug Proxy 2011-05-27 19:50:35 UTC
Created attachment 501377 [details]
Patch to detect IBM multi-chassis, modified version of Prarit's earlier patch


------- Comment on attachment From masbock@us.ibm.com 2011-05-27 15:43 EDT-------


Updated version of Prarit's last patch. I modified it so that chassis are counted only once, based on my earlier comments.

Comment 29 Prarit Bhargava 2011-05-30 15:22:35 UTC
Created attachment 501833 [details]
RHEL5 v3

Oops -- good point Max :)  How 'bout this then?  This way we only actually run the chassis check once.

P.

Comment 30 IBM Bug Proxy 2011-06-02 00:00:51 UTC
------- Comment From masbock@us.ibm.com 2011-06-01 19:54 EDT-------
(In reply to comment #30)
> Created an attachment (id=61826) [details]
> RHEL5 v3
>
>
> ------- Comment on attachment From prarit@redhat.com 2011-05-30 11:22:35
> EDT-------
>
>
> Oops -- good point Max :)  How 'bout this then?  This way we only actually run
> the chassis check once.
>
> P.

__cpuinit int unsynchronized_tsc(void)
{
#ifdef CONFIG_SMP
+	/*
+	 * RHEL5: Upstream the TSC Warp code should catch multi-chassis
+	 * systems.  The code is too invasive for RHEL5.  Doing this
+	 * check here is safe ...
+	 */
+	if (num_chassis > 1)
+		return 1;
+
+	if (dmi_check_system(multi_dmi_table)) {
+		dmi_walk(check_multi_chassis);
+		if (num_chassis > 1)
+			return 1;
+	}
+

This doesn't work either because unsynchronized_tsc is called NR_CPUS times (on every non-boot cpu and in time_init()). On a single chassis system the first invocation sets num_chassis to 1. The second invocation gets past the num_chassis > 1 test and sets num_chassis to 2. The third invocation deems the TSC as unsynchronized.

The last patch I attached does the following:
__cpuinit int unsynchronized_tsc(void)
{
#ifdef CONFIG_SMP
+	/*
+	 * RHEL5: Upstream the TSC Warp code should catch multi-chassis
+	 * systems.  The code is too invasive for RHEL5.  Doing this
+	 * check here is safe ...
+	 */
+	if (dmi_check_system(multi_dmi_table)) {
+		if (!num_chassis) /* walk dmi only once to count chassis */
+			dmi_walk(check_multi_chassis);
+		if (num_chassis > 1)
+			return 1;
+	}
+

This seems to work. Tested on single and multi-chassis systems.

- Max
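
The counting problem discussed in comments 26 to 30 can be reproduced in a small user-space model. This is a sketch under stated assumptions, not the attached patches: struct chassis_sim and the sim_* functions are hypothetical stand-ins, the DMI table match is assumed to have succeeded, and sim_dmi_walk() simply adds one to the counter per real chassis, which is how the thread describes the behavior of dmi_walk(check_multi_chassis).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel state discussed above. */
struct chassis_sim {
	int real_chassis;	/* Type 3 entries actually present in SMBIOS */
	int num_chassis;	/* running counter, as in the patches */
};

/* Models dmi_walk(check_multi_chassis): adds one per Type 3 entry. */
static void sim_dmi_walk(struct chassis_sim *s)
{
	s->num_chassis += s->real_chassis;
}

/* v3-style logic: walks again whenever the counter is not yet above 1. */
bool sim_unsync_v3(struct chassis_sim *s)
{
	if (s->num_chassis > 1)
		return true;
	sim_dmi_walk(s);
	return s->num_chassis > 1;
}

/* Guarded logic: walk the table only while the counter is still zero. */
bool sim_unsync_guarded(struct chassis_sim *s)
{
	if (!s->num_chassis)
		sim_dmi_walk(s);
	return s->num_chassis > 1;
}
```

In this model, calling sim_unsync_v3() repeatedly on a state with real_chassis = 1 ends up reporting the TSC as unsynchronized because every extra call inflates the counter, while sim_unsync_guarded() keeps returning false for one chassis and true for two. The exact call on which the miscount surfaces in the real kernel depends on patch details not shown in this thread.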

Comment 32 Prarit Bhargava 2011-06-13 13:09:19 UTC
masbock,

I'm submitting

https://bugzilla.redhat.com/attachment.cgi?id=501377

for internal review this AM.

FYI ;)

P.

Comment 33 RHEL Program Management 2011-08-04 04:13:16 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 35 Jarod Wilson 2011-08-23 14:01:15 UTC
Patch(es) available in kernel-2.6.18-282.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5
Detailed testing feedback is always welcomed.

Comment 38 Dong Zhu 2011-12-05 10:40:30 UTC
Hi IBM,

1) How can this be reproduced, and how can I tell that the problem has been reproduced? What does the clock skew look like in practice?

2) Does this problem exist on single-chassis systems?

3) Does the x3950 M2 have the same problem?


I ran the following steps on an x3950 M2. Is this right?
(ibm-x3950m2-1.gsslab.rdu.redhat.com)

#uname -r
2.6.18-275.el5

#dmesg
time.c: Using 266.538728 MHz WALL PIT GTOD PIT/TSC timer.

The system uses the TSC.
----------------------------------------------------------------------------------

#uname -r
2.6.18-300.el5

#dmesg
time.c: Using 266.538728 MHz WALL PIT GTOD PIT/HPET timer.

Calibrating delay using timer specific routine.. 5330.02 BogoMIPS (lpj=2665013)
CPU: L1 I cache: 32K, L1 D cache: 32K
CPU: L2 cache: 3072K
CPU: L3 cache: 16384K
CPU 1/0 -> Node 0
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
Genuine Intel(R) CPU                  @ 2.66GHz stepping 01
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff -2040 cycles, maxerr 2550 cycles)

The system uses the HPET.

Comment 39 Zhang Kexin 2011-12-06 02:20:18 UTC
Hi Washer, could you please have a look at comment#38 ? Thanks.

Comment 40 Zhang Kexin 2011-12-08 09:51:45 UTC
Created attachment 542450 [details]
gettimeofday on 4 cpus which locates in 4 nodes individually

Comment 46 James Washer 2011-12-10 22:24:10 UTC
The original problem was reproduced by a simple program calling sleep and observing the actual sleep times. One such process bound to each processor. Much like the suggestion above.
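
The original reproducer was not attached to this bug, so the following is only a guess at its general shape: a helper that sleeps for a requested number of seconds and reports how far the elapsed time seen through gettimeofday() deviates from the request. The function names timeval_diff_us and sleep_error_us are invented for this sketch. Binding one instance to each processor is left to an external tool such as taskset.

```c
#include <stddef.h>
#include <sys/time.h>
#include <unistd.h>

/* Microseconds from a to b; negative if b is earlier than a. */
long long timeval_diff_us(struct timeval a, struct timeval b)
{
	return (long long)(b.tv_sec - a.tv_sec) * 1000000LL +
	       (long long)(b.tv_usec - a.tv_usec);
}

/*
 * Sleep for the given number of seconds and return the difference, in
 * microseconds, between the elapsed time reported by gettimeofday() and
 * the requested sleep time. Hypothetical helper, not the IBM reproducer.
 */
long long sleep_error_us(unsigned int seconds)
{
	struct timeval before, after;

	gettimeofday(&before, NULL);
	sleep(seconds);
	gettimeofday(&after, NULL);
	return timeval_diff_us(before, after) - (long long)seconds * 1000000LL;
}
```

On a system with a consistent clock source the returned error should stay small and non-negative. Large or negative values, especially ones that differ between processes bound to CPUs in different chassis, would point to the kind of skew described in this bug.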

Comment 54 Zhang Kexin 2011-12-17 10:11:42 UTC
(In reply to comment #46)
> The original problem was reproduced by a simple program calling sleep and
> observing the actual sleep times. One such process bound to each processor.
> Much like the suggestion above.

Hi James, could you please upload the reproducer? Because we are not sure how to reproduce it exactly. Thanks!

Comment 56 Martin Prpič 2012-02-02 12:12:22 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
With this update, IBM System x3850 X5 is now properly identified as a multi-chassis system by querying the system name and checking for multiple Chassis entries in the SMBIOS table. If multiple Chassis entries are found, the TSC is marked as unsynchronized. The side effect of this solution is that the kernel will attempt to synchronize the TSC on every CPU during system boot which will cause a small delay and error message to be displayed. For other multi-chassis systems, the "notsc" boot parameter can be used to disable the TSC.

Comment 57 errata-xmlrpc 2012-02-21 03:47:04 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0150.html

