+++ This bug was initially created as a clone of Bug #1833098 +++

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_installer/3561/pull-ci-openshift-installer-master-e2e-azure/540

https://search.apps.build01.ci.devcluster.openshift.com/?search=NodeClockNotSynchronising&maxAge=168h&context=1&type=bug%2Bjunit&name=azure&maxMatches=5&maxBytes=20971520&groupBy=job

Across 805 runs and 80 jobs (54.29% failed), matched 35.24% of failing runs and 13.75% of jobs

[sig-instrumentation][Late] Alerts shouldn't report any alerts in firing state apart from Watchdog and AlertmanagerReceiversNotConfigured [Suite:openshift/conformance/parallel] 1m30s

fail [github.com/openshift/origin/test/extended/util/prometheus/helpers.go:174]: Expected
    <map[string]error | len:1>: {
        "count_over_time(ALERTS{alertname!~\"Watchdog|AlertmanagerReceiversNotConfigured|KubeAPILatencyHigh\",alertstate=\"firing\",severity!=\"info\"}[2h]) >= 1": {
            s: "promQL query: count_over_time(ALERTS{alertname!~\"Watchdog|AlertmanagerReceiversNotConfigured|KubeAPILatencyHigh\",alertstate=\"firing\",severity!=\"info\"}[2h]) >= 1 had reported incorrect results:\n[{\"metric\":{\"alertname\":\"NodeClockNotSynchronising\",\"alertstate\":\"firing\",\"endpoint\":\"https\",\"instance\":\"ci-op-hy8z3bni-2dc90-xpt9z-master-0\"

Looks like in the run I linked, NodeClockNotSynchronising is firing on all three nodes because node_timex_sync_status is empty.

--- Additional comment from Colin Walters on 2020-05-07 19:45:01 UTC ---

This should be fixed since https://gitlab.cee.redhat.com/coreos/redhat-coreos/merge_requests/918 but rollout was stalled by https://bugzilla.redhat.com/show_bug.cgi?id=1781575

I think we've also only done the bump in 4.6 and need to backport it to 4.5.
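As a rough sketch of the check involved: the alert fires when the node_timex_sync_status series (exposed by node_exporter's timex collector) is absent or 0. The sample metrics line and logic below are illustrative, not taken from the failing run; on a real node you would fetch http://localhost:9100/metrics instead.

```shell
# Sample metrics line (illustrative); node_timex_sync_status is 1 when the
# kernel reports the clock as synchronised. In the failing runs the series
# was missing entirely, which is what made the alert fire.
metrics='node_timex_sync_status 1'
status=$(printf '%s\n' "$metrics" | awk '$1 == "node_timex_sync_status" {print $2}')
if [ -z "$status" ]; then
  echo "series missing: NodeClockNotSynchronising would fire"
elif [ "$status" = "1" ]; then
  echo "clock synchronised"
else
  echo "clock NOT synchronised"
fi
```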
https://gitlab.cee.redhat.com/coreos/redhat-coreos/merge_requests/946
The linked merge request was merged on May 15; we've had a number of successful RHCOS 4.5 builds since then. Marking as MODIFIED.
Validation Steps:

1. Launched an Azure cluster.
2. Confirmed that:
   - /run/systemd/generator/chronyd.service.d/coreos-platform-chrony.conf exists
   - /run/coreos-platform-chrony.conf exists
3. Confirmed that the run-time configuration is being used:

sh-4.4# systemctl status chronyd
● chronyd.service - NTP client/server
   Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled; vendor preset: enabled)
  Drop-In: /run/systemd/generator/chronyd.service.d
           └─coreos-platform-chrony.conf
   Active: active (running) since Wed 2020-06-17 16:34:54 UTC; 42min ago
     Docs: man:chronyd(8)
           man:chrony.conf(5)
  Process: 1300 ExecStartPost=/usr/libexec/chrony-helper update-daemon (code=exited, status=0/SUCCESS)
  Process: 1294 ExecStart=/usr/sbin/chronyd -f /run/coreos-platform-chrony.conf $OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 1298 (chronyd)
    Tasks: 1
   Memory: 1.8M
      CPU: 449ms
   CGroup: /system.slice/chronyd.service
           └─1298 /usr/sbin/chronyd -f /run/coreos-platform-chrony.conf

Verified.
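Step 3 above can be scripted. The snippet below parses the ExecStart command line (the string is copied from the transcript above, hard-coded here so the sketch is self-contained); on a live node you could instead read it with `systemctl show chronyd -p ExecStart`.

```shell
# Check that chronyd was launched with the generated platform-specific
# config rather than the default /etc/chrony.conf.
execstart='/usr/sbin/chronyd -f /run/coreos-platform-chrony.conf $OPTIONS'
case "$execstart" in
  *'-f /run/coreos-platform-chrony.conf'*) result="platform chrony config in use" ;;
  *) result="default chrony config" ;;
esac
echo "$result"
```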
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409