Bug 1968175 - [4.8.0] Agent validation not including reason for being insufficient
Summary: [4.8.0] Agent validation not including reason for being insufficient
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: assisted-installer
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.8.0
Assignee: Fred Rolland
QA Contact: Yuri Obshansky
URL:
Whiteboard: AI-Team-Hive KNI-EDGE-4.8 KNI-EDGE-JU...
Depends On: 1968067
Blocks:
Reported: 2021-06-06 13:52 UTC by Fred Rolland
Modified: 2021-07-27 23:11 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of: 1968067
Environment:
Last Closed: 2021-07-27 23:11:38 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift assisted-service pull 1926 0 None open Bug 1968067: KubeAPI add all none-success validations to conditions 2021-06-06 13:52:04 UTC
Github openshift assisted-service pull 1927 0 None open [ocm-2.3] Bug 1968175: KubeAPI add all none-success validations to conditions 2021-06-08 10:20:51 UTC
Red Hat Bugzilla 1968067 1 high CLOSED [master] Agent validation not including reason for being insufficient 2021-07-27 23:11:55 UTC
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 23:11:55 UTC

Description Fred Rolland 2021-06-06 13:52:01 UTC
+++ This bug was initially created as a clone of Bug #1968067 +++

Created attachment 1789025
cluster resource from REST API

Description of problem:

When a cluster's validations are pending, but not failed, a related Agent's status can show the following condition:


    - lastTransitionTime: "2021-06-03T15:05:29Z"
      message: 'The agent''s validations are failing: '
      reason: ValidationsFailing
      status: "False"
      type: Validated


The lack of any reasons leaves the user unable to understand what's wrong or how to proceed.

Code that ignores pending validations: https://github.com/openshift/assisted-service/blob/80d9b5f/internal/controller/controllers/agent_controller.go#L439-L448

I'll attach the full Cluster representation.

Version-Release number of selected component (if applicable):
0.0.5-rc1


How reproducible:
always

Steps to Reproduce:
1. Create a ClusterDeployment and AgentClusterInstall such that validations are pending
2. Create an InfraEnv
3. Boot the ISO, approve the resulting Agent, and watch its conditions

--- Additional comment from mfilanov on 20210606T06:19:21

Pending validations are not handled.

--- Additional comment from frolland on 20210606T10:18:26

A new Reason, ValidationsUserPending, can be added for better clarity:

AgentClusterInstall:
Validated 	False 	ValidationsFailing 	The cluster's validations are failing: "summary of failed validations" 	If the cluster status is "insufficient"
Validated 	False 	ValidationsUserPending 	The cluster's validations are pending for user: "summary of failed validations" 	If the cluster status is "pending-for-input"

Agent:
Validated 	False 	ValidationsFailing 	The agent's validations are failing: "summary of failed validations" 	If the host status is "insufficient" or "pending-for-input"
Validated 	False 	ValidationsUserPending 	The agent's validations are pending for user: "summary of failed validations" 	If the host status is "pending-for-input"
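
The mapping proposed above can be sketched roughly as follows. This is an illustration only, not the code from the linked pull requests: the function name validatedCondition and the constants are hypothetical, and because "pending-for-input" appears in both Agent rows, the sketch assumes it maps to ValidationsUserPending while "insufficient" maps to ValidationsFailing.

```go
package main

import "fmt"

// Status and reason strings copied from the proposal table.
// The constant names themselves are hypothetical.
const (
	statusInsufficient    = "insufficient"
	statusPendingForInput = "pending-for-input"

	reasonFailing     = "ValidationsFailing"
	reasonUserPending = "ValidationsUserPending"
)

// validatedCondition returns the reason and message for a Validated=False
// condition. subject is "agent" or "cluster"; summary is the joined text of
// the not-succeeded validations. ok is false for statuses the table does
// not cover.
// Assumption: "pending-for-input" always yields ValidationsUserPending.
func validatedCondition(subject, status, summary string) (reason, message string, ok bool) {
	switch status {
	case statusPendingForInput:
		return reasonUserPending,
			fmt.Sprintf("The %s's validations are pending for user: %s", subject, summary), true
	case statusInsufficient:
		return reasonFailing,
			fmt.Sprintf("The %s's validations are failing: %s", subject, summary), true
	default:
		return "", "", false
	}
}

func main() {
	reason, msg, _ := validatedCondition("agent", statusPendingForInput,
		"Host couldn't synchronize with any NTP server")
	fmt.Println(reason)
	fmt.Println(msg)
}
```

Returning ok=false for other statuses keeps the sketch from guessing at conditions the proposal does not describe.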


Also, the summary will include all "not-succeeded" validations (ValidationFailure, ValidationPending, ValidationError), excluding ValidationSuccess and ValidationDisabled.
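
A minimal sketch of that filtering, for illustration: the validation type, the lowercase status strings, and the summarize helper are assumptions made here, not the assisted-service definitions. Messages are joined with a comma only because that matches the condition message quoted later in this bug.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical status values standing in for ValidationSuccess,
// ValidationFailure, ValidationPending, ValidationError and ValidationDisabled.
const (
	validationSuccess  = "success"
	validationFailure  = "failure"
	validationPending  = "pending"
	validationError    = "error"
	validationDisabled = "disabled"
)

// validation is a simplified, hypothetical stand-in for one validation result.
type validation struct {
	Status  string
	Message string
}

// summarize collects the messages of every validation that did not succeed
// and was not disabled, then joins them with commas.
func summarize(validations []validation) string {
	var msgs []string
	for _, v := range validations {
		switch v.Status {
		case validationSuccess, validationDisabled:
			continue // excluded from the summary
		default:
			msgs = append(msgs, v.Message)
		}
	}
	return strings.Join(msgs, ",")
}

func main() {
	fmt.Println(summarize([]validation{
		{validationSuccess, "Hostname is valid"},
		{validationPending, "Machine Network CIDR is undefined"},
		{validationFailure, "Host couldn't synchronize with any NTP server"},
		{validationDisabled, "Skipped check"},
	}))
}
```

With pending validations included, a host that has no outright failures still gets a non-empty message, which is the gap described in this bug.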

@atraeger WDYT?

--- Additional comment from atraeger on 20210606T10:54:13

@frolland sounds good to me

Comment 3 Chad Crum 2021-06-19 13:56:40 UTC
This has been validated:
- 2.3.0-DOWNSTREAM-2021-06-17-01-26-58
- Hub = 4.8.0-fc.7

Steps:
- Created CRs for an SNO-type cluster, except that the ACI requested multiple control plane agents:
  provisionRequirements:
    controlPlaneAgents: 7
- The machine started but did not pass validations because of the multiple control plane agents, and was left pending.
- The condition for the pending validations now shows the reason:
  - lastTransitionTime: "2021-06-19T13:52:21Z"
    message: 'The agent''s validations are pending for user: Machine Network CIDR
      is undefined; the Machine Network CIDR can be defined by setting either the
      API or Ingress virtual IPs,Missing inventory or machine network CIDR,Machine
      Network CIDR or Connectivity Majority Groups missing,Host couldn''t synchronize
      with any NTP server'

Comment 5 errata-xmlrpc 2021-07-27 23:11:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

