Description of problem:
I simulated an unreachable NTP server on one master host. The host turned Insufficient due to missing NTP sources. I expected the event message to state the reason the host became insufficient. Instead, the events described a machine connectivity issue:

Host master-0-0: updated status from "known" to "insufficient" (Host cannot be installed due to following failing validation(s): No connectivity to the majority of hosts in the cluster)

AND

Host master-0-0: updated status from "discovering" to "insufficient" (Host cannot be installed due to following failing validation(s): )

See attached examples.

Version-Release number of selected component (if applicable):
v1.0.14.2

Steps to Reproduce:
1. Block the default NTP server for one master node and make sure the NTP status is Unreachable.
2. Look at the events.

Actual results:
failing validation(s): No connectivity to the majority of hosts in the cluster
OR an empty validation list

Expected results:
failing validation(s): No NTP sources
Created attachment 1744343 Unreachable example
Created attachment 1744344 events
@mfilanov @ygoldber For some reason the empty validation list in the events doesn't reproduce (maybe I used an old cluster). It looks like NTP Unreachable doesn't log an event: the host was already insufficient due to connectivity and later remained insufficient due to NTP Unreachable, so we only have this event:

Host master-0-0: updated status from "discovering" to "insufficient" (Host cannot be installed due to following failing validation(s): No connectivity to the majority of hosts in the cluster)

So there is no event for NTP Unreachable.
Should be resolved by https://issues.redhat.com/browse/MGMT-3561
This is not related to https://issues.redhat.com/browse/MGMT-3561

The event you describe is part of a transition in the state machine that changes the host status from "known" to "insufficient". NTP probably wasn't failing at the moment of that transition. Once it started failing, the failure wasn't part of a transition, so you didn't get a new event (of the form "Host *: updated status from * to * ...").

Anyway, there are now other events regarding host/cluster validations that will show the issue to the user:
* Host *: validation * that used to succeed is now failing
* Cluster validation * that used to succeed is now failing
* Host *: validation * is now fixed
* Cluster validation * is now fixed

Closing it.
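The behavior described in this comment can be illustrated with a minimal sketch. This is not the service's actual code: the `Host` class, the `refresh` function, the `connectivity` validation ID, and the two-state status logic are all hypothetical simplifications. Only the `ntp-synced` ID and the message texts are taken from this report. The sketch shows why a validation that starts failing while the host is already "insufficient" produces no transition event, and how a per-validation event still surfaces it.

```python
from dataclasses import dataclass, field

# Message texts as they appear in the events quoted in this report.
# The "connectivity" ID is a made-up placeholder for illustration.
MESSAGES = {
    "connectivity": "No connectivity to the majority of hosts in the cluster",
    "ntp-synced": "Host couldn't synchronize with any NTP server",
}


@dataclass
class Host:
    name: str
    status: str = "discovering"
    # Maps validation ID -> True (passing) / False (failing).
    validations: dict = field(default_factory=dict)


def refresh(host, new_validations):
    """Return the events produced by one refresh of a host (simplified model)."""
    events = []

    # Validation-change events: one per validation whose result flipped.
    for vid, ok in new_validations.items():
        was_ok = host.validations.get(vid)
        if was_ok is True and not ok:
            events.append(
                f"Host {host.name}: validation '{vid}' that used to succeed is now failing"
            )
        elif was_ok is False and ok:
            events.append(f"Host {host.name}: validation '{vid}' is now fixed")

    # Transition event: emitted only when the status itself changes.
    failing = [MESSAGES[v] for v, ok in new_validations.items() if not ok]
    new_status = "insufficient" if failing else "known"
    if new_status != host.status:
        reason = ""
        if failing:
            reason = (
                " (Host cannot be installed due to following failing validation(s): "
                + " ; ".join(failing)
                + ")"
            )
        events.append(
            f'Host {host.name}: updated status from "{host.status}" to "{new_status}"{reason}'
        )

    host.status = new_status
    host.validations = dict(new_validations)
    return events


host = Host("master-0-0")
# First refresh: only connectivity fails, so the host moves to "insufficient".
first = refresh(host, {"connectivity": False, "ntp-synced": True})
# Second refresh: NTP now fails too, but the status stays "insufficient".
second = refresh(host, {"connectivity": False, "ntp-synced": False})
```

In this model, `first` holds a single transition event that mentions only connectivity, while `second` holds no transition event at all, only the "that used to succeed is now failing" event for `ntp-synced`.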
Events including NTP:

Host test-infra-cluster-4cef4a83-worker-1: validation 'ntp-synced' that used to succeed is now failing
Host test-infra-cluster-4cef4a83-worker-1: validation 'ntp-synced' is now fixed
Host test-infra-cluster-4cef4a83-worker-1: updated status from "discovering" to "insufficient" (Host cannot be installed due to following failing validation(s): Host couldn't synchronize with any NTP server ; No connectivity to the majority of hosts in the cluster)

Verified in Staging OCP-Metal-V1.0.18.1
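For anyone re-verifying this, a check along the following lines could confirm that a status-transition event lists the NTP failure. It is only a sketch: the regular expression is inferred from the event texts quoted in this report, not from any documented format, and `parse_transition` is a hypothetical helper name.

```python
import re

# Pattern inferred from the event strings quoted in this bug (an assumption,
# not a documented message format).
EVENT_RE = re.compile(
    r'^Host (?P<host>\S+): updated status from "(?P<old>[^"]+)" to "(?P<new>[^"]+)"'
    r"(?: \(Host cannot be installed due to following failing validation\(s\):"
    r"(?P<reasons>.*)\))?$"
)


def parse_transition(event):
    """Split a status-transition event into its parts, or return None."""
    m = EVENT_RE.match(event)
    if m is None:
        return None
    # Failing validations are separated by ";" in the observed messages.
    reasons = [r.strip() for r in (m.group("reasons") or "").split(";") if r.strip()]
    return {
        "host": m.group("host"),
        "old": m.group("old"),
        "new": m.group("new"),
        "failing": reasons,
    }


event = (
    'Host test-infra-cluster-4cef4a83-worker-1: updated status from "discovering" to "insufficient" '
    "(Host cannot be installed due to following failing validation(s): "
    "Host couldn't synchronize with any NTP server ; "
    "No connectivity to the majority of hosts in the cluster)"
)
parsed = parse_transition(event)

# The originally reported event with an empty validation list.
empty = parse_transition(
    'Host master-0-0: updated status from "discovering" to "insufficient" '
    "(Host cannot be installed due to following failing validation(s): )"
)
```

Here `parsed["failing"]` contains both the NTP message and the connectivity message, while `empty["failing"]` is an empty list, which matches the empty-list symptom from the original description.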
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.7.9 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:1365