Created attachment 747961
agent.log

Description of problem:
Erroneous "clocks not in sync" message when importing a large number (~4K) of resources. The "date" command clearly shows that both system clocks are within 1 second of each other.

2013-05-14 18:05:46,819 ERROR [RHQ Agent Ping Thread-1] (org.rhq.enterprise.agent.AgentMain)- {AgentMain.time-not-synced}The server and agent clocks are not in sync. Server=[1368569101312][May 14, 2013 6:05:01 PM EDT], Agent=[1368569146818][May 14, 2013 6:05:46 PM EDT]

Version-Release number of selected component (if applicable):
3.2 alpha 5

How reproducible:
100%

Steps to Reproduce:
1. Run perftest plugin with 10 servers, 200 services
2. Import all top-level servers

Actual results:

Expected results:

Additional info:
The only way for the agent to determine whether its clock is in sync with the server is to ask the server for its time and compare that with its own time. If the system is under high load and the message takes a long time to process, this clock difference calculation is skewed. So you should ignore any transient, one-off messages that warn about clock skew, especially if they occurred during high load (such as when importing a large number of resources, as is the case here). You should stop seeing these clock skew messages once the agent hits steady state and the clocks really are in sync. Only if you see this message consistently should you consider checking the clocks.
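To illustrate why a slow response looks like clock skew, here is a minimal Python sketch of the comparison described above. It is not the RHQ agent code; the function names and the simple model (the agent reads its own clock only after the server's timestamp arrives) are assumptions made for illustration. The two timestamps are the ones from the log line in this report.

```python
def apparent_skew_ms(server_reported_ms: int, agent_now_ms: int) -> int:
    """Difference the agent sees between its clock and the server's reported time."""
    return abs(agent_now_ms - server_reported_ms)


def simulated_check(true_offset_ms: int, response_delay_ms: int, server_stamp_ms: int) -> int:
    """Model one check (hypothetical): the server stamps its time, and the agent
    reads its own clock response_delay_ms later. true_offset_ms is the real
    difference between the two clocks."""
    agent_now_ms = server_stamp_ms + true_offset_ms + response_delay_ms
    return apparent_skew_ms(server_stamp_ms, agent_now_ms)


# Values taken from the log line in this report.
print(apparent_skew_ms(1368569101312, 1368569146818))  # 45506

# Clocks perfectly in sync, but the response is delayed by 45506 ms:
# the agent computes exactly the same apparent skew.
print(simulated_check(0, 45506, 1368569101312))  # 45506
```

In this model the agent cannot tell a real offset from processing delay, because both contribute to the same subtraction.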
As mazz explains, this is a rare case where the response to the agent was so slow that it appeared there was a sync issue. We could probably protect against this by capturing the round-trip time for the server request and, if it is greater than, say, 1s, not performing the check. But as far as I know this has not been a point of confusion for users. Pushing, and recommending we close this as WONTFIX.
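The round-trip guard suggested above could look roughly like the following Python sketch. This is only an illustration of the idea, not a patch against the agent: `check_clock_skew`, `get_server_time_ms`, and the injectable `now_ms` clock are hypothetical names, and the 1000 ms default is the "say, 1s" figure from the comment.

```python
import time
from typing import Callable, Optional


def _wall_clock_ms() -> int:
    """Current local time in milliseconds since the epoch."""
    return int(time.time() * 1000)


def check_clock_skew(
    get_server_time_ms: Callable[[], int],
    max_round_trip_ms: int = 1000,  # assumed threshold, per the "say, 1s" suggestion
    now_ms: Callable[[], int] = _wall_clock_ms,
) -> Optional[int]:
    """Return the apparent skew in ms, or None if the request was too slow
    for the comparison to be meaningful."""
    start = now_ms()
    server_ms = get_server_time_ms()  # hypothetical call that asks the server for its time
    end = now_ms()

    if end - start > max_round_trip_ms:
        # The reply took too long; any difference would mostly reflect latency.
        return None
    return abs(end - server_ms)
```

A caller would then warn only when the result is not `None` and exceeds whatever skew tolerance is configured, so a slow reply under load is skipped rather than reported as a clock problem.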
As per previous comments, closing this as WONTFIX. The clock skew warning captured in this bug report should be rare and would only occur under heavy load. In this particular case, the reported skew (1368569146818 - 1368569101312 = 45506 ms) indicates that this agent is running approximately 45 seconds behind due to load. Although this is not ideal, it should rarely happen, and when it does, it most likely indicates either that the clocks really are out of sync or that the agent is so heavily loaded that it behaves as if they were.