there is a timing issue in the plugin container's measurement manager. when you first start the agent/plugin container, sometimes you will notice that the plugin container misses the first metric collection (you'll see a debug message like "Measurement collection is falling behind... Missed requested time by [30040ms]") Sometimes this doesn't happen, other times it does. Its just a random timing issue. I saw this when I started with a fresh DB, nothing in inventory. I then imported the platform and I then went to the platform view once I saw the first availability go green. But I noticed that I did not collect any traits data, nor did I get any dynamic measurement data. This should have been collected within about 60s after the agent got all its plugins and the plugin container started.
(05:02:04 PM) mazz: ips: check this out. PC's MeasurementManager.scheduleCollection: (05:02:04 PM) mazz: long firstCollection = System.currentTimeMillis(); (05:02:04 PM) mazz: for (MeasurementScheduleRequest request : requests) { (05:02:04 PM) mazz: ScheduledMeasurementInfo info = new ScheduledMeasurementInfo(request, resourceId); (05:02:04 PM) mazz: info.setNextCollection(firstCollection); (05:02:04 PM) mazz: .... (05:02:24 PM) mazz: notice that it (from my reading of this) sets the next collection time to NOW (System.currentTimeMillis) (05:03:03 PM) mazz: so I'm not sure why the collection of metrics is delayed at startup sometimes. like I said, earlier today, I was seeing traits come up and other data pretty quick. but then the last two times, I did not see this. (05:04:25 PM) ips: hmm (05:06:31 PM) ips: mazz - myabe bc of the: if ((System.currentTimeMillis() - 30000L) > requests.iterator().next().getNextCollection()) block in MeasurementCollectorRunner (05:06:56 PM) ips: maybe by the time it actually gets scheduled 30s has passed (05:07:17 PM) ips: well, before it actually runs i mean (05:08:39 PM) mazz: yes.. that is exactly what I'm looking at now (05:08:45 PM) mazz: I do see those messags in my logs (05:08:49 PM) mazz: about falling behind (05:09:18 PM) mazz: that might be why I see it sometimes but not others? just a timing issue (05:10:16 PM) ips: ah, check out MeasurementManager.initialize() (05:10:26 PM) ips: the initial delay for the collector threadpool is 30s (05:10:38 PM) ips: also its max size is 5 (05:11:09 PM) ips: between those 2 things, i'm not suprised that if check in MeasurementCollectorRunner is often true (05:37:39 PM) ips: mazz - assuming that initial delay is what's causing issues, here's one idea: (05:38:05 PM) mazz: I think that is it (05:38:09 PM) ips: in MeasurementManager.initialize(): (05:38:10 PM) mazz: I'm sitting at a breatpoint right now (05:38:25 PM) ips: right after the line: this.collectorThreadPool.schedule(new MeasurementCollectionRequester(), collectionInitialDelaySecs, (05:38:25 PM) ips: TimeUnit.SECONDS); (05:38:27 PM) mazz: it hit that part where it thinks it is falling behind - and its all the platform metrics (05:39:25 PM) ips: add: this.initialCollectionStartTime = System.currentTimeMillis() + (collectionInitialDelaySecs * 1000); (05:40:27 PM) ips: and then in scheduleCollection() change: long firstCollection = System.currentTimeMillis(); (05:40:52 PM) ips: to: long firstCollection = Math.max(System.currentTimeMillis(), this.initialCollectionStartTime); (05:43:00 PM) mazz: yeah, that would delete the very first collecton out in the future by the initial delay amount (05:43:08 PM) mazz: I'll see if that works
that proposed solution doesn't work. metrics are still delayed, but I don't see the falling behind message. not sure what the problem is yet
fix was to always delay the first collection. git commit: master: 98ea586e555d3f07ba568b19b75f144872a8471a release_jon3.x: 61f2d882d8c614c5635f875fbe6c37e58032bb0b
note: do not expect metrics to come in immediately upon starting the agent or immediately upon importing a resource. it will take about 60 seconds for the data to come in - this includes dynamic metrics as well as traits.
*** Bug 536181 has been marked as a duplicate of this bug. ***
changing status of VERIFIED BZs for JON 2.4.2 and JON 3.0 to CLOSED/CURRENTRELEASE