
Bug 1968067

Summary: [master] Agent validation not including reason for being insufficient
Product: OpenShift Container Platform Reporter: Michael Hrivnak <mhrivnak>
Component: assisted-installer Assignee: Fred Rolland <frolland>
assisted-installer sub component: assisted-service QA Contact: Yuri Obshansky <yobshans>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: high CC: aos-bugs, atraeger, frolland, mfilanov, nshidlin
Version: 4.8 Keywords: Triaged
Target Milestone: --- Flags: frolland: needinfo-
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: AI-Team-Hive KNI-EDGE-4.8 KNI-EDGE-JUKE-4.8
Fixed In Version: OCP-Metal-v1.0.21.3 Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
: 1968175 (view as bug list) Environment:
Last Closed: 2021-07-27 23:11:38 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1968175    
Attachments:
Description Flags
cluster resource from REST API none

Description Michael Hrivnak 2021-06-04 20:27:26 UTC
Created attachment 1789025
cluster resource from REST API

Description of problem:

When a cluster's validations are pending, but not failed, a related Agent's status can show the following condition:


    - lastTransitionTime: "2021-06-03T15:05:29Z"
      message: 'The agent''s validations are failing: '
      reason: ValidationsFailing
      status: "False"
      type: Validated


The lack of any reasons leaves the user unable to understand what's wrong or how to proceed.

Code that ignores pending validations: https://github.com/openshift/assisted-service/blob/80d9b5f/internal/controller/controllers/agent_controller.go#L439-L448
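
As a rough illustration of how the empty message can arise (a Python sketch, not the Go code linked above; the helper name, the status strings, and the sample validations are assumptions made for illustration): if the summary is built only from validations whose status is "failure", a host whose validations are all pending yields an empty summary after the prefix.

```python
def summarize_failures_only(validations):
    """Join the messages of validations whose status is 'failure'.

    Hypothetical helper: it mimics a summary that skips pending validations.
    """
    return ",".join(v["message"] for v in validations if v["status"] == "failure")


# Hypothetical validation results for a host that is waiting on user input.
validations = [
    {"status": "pending", "message": "Missing inventory or machine network CIDR"},
    {"status": "success", "message": "placeholder for a passing validation"},
]

# Nothing has status 'failure', so only the prefix survives.
message = "The agent's validations are failing: " + summarize_failures_only(validations)
print(repr(message))
```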

I'll attach the full Cluster representation.

Version-Release number of selected component (if applicable):
0.0.5-rc1


How reproducible:
always

Steps to Reproduce:
1. Create a ClusterDeployment and AgentClusterInstall such that validations are pending
2. Create an InfraEnv
3. Boot the ISO, approve the resulting Agent, and watch its conditions

Comment 1 Michael Filanov 2021-06-06 06:19:21 UTC
Pending validations are not handled.

Comment 2 Fred Rolland 2021-06-06 10:18:26 UTC
A new reason, ValidationsUserPending, can be added for better clarity:

AgentClusterInstall:
Validated 	False 	ValidationsFailing 	The cluster's validations are failing: "summary of failed validations" 	If the cluster status is "insufficient"
Validated 	False 	ValidationsUserPending 	The cluster's validations are pending for user: "summary of failed validations" 	If the cluster status is "pending-for-input"

Agent:
Validated 	False 	ValidationsFailing 	The agent's validations are failing: "summary of failed validations" 	If the host status is "insufficient"
Validated 	False 	ValidationsUserPending 	The agent's validations are pending for user: "summary of failed validations" 	If the host status is "pending-for-input"


Also, the summary will include all "not-succeeded" validations (ValidationFailure, ValidationPending, ValidationError) excluding ValidationSuccess & ValidationDisabled
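
The proposal for the Agent can be sketched roughly as follows. This is a Python illustration, not the assisted-service implementation: the function name, the dictionary layout, and the lowercase status strings are assumptions, and it assumes "pending-for-input" maps to ValidationsUserPending as the second Agent row states.

```python
# Validation statuses left out of the summary, per the proposal
# (ValidationSuccess and ValidationDisabled); string values are assumed.
EXCLUDED = {"success", "disabled"}

# Host status -> (condition reason, message prefix), following the Agent rows.
REASONS = {
    "insufficient": ("ValidationsFailing", "The agent's validations are failing: "),
    "pending-for-input": (
        "ValidationsUserPending",
        "The agent's validations are pending for user: ",
    ),
}


def validated_condition(host_status, validations):
    """Build a Validated=False condition whose message lists every non-succeeded validation."""
    reason, prefix = REASONS[host_status]
    summary = ",".join(
        v["message"] for v in validations if v["status"] not in EXCLUDED
    )
    return {
        "type": "Validated",
        "status": "False",
        "reason": reason,
        "message": prefix + summary,
    }


# Hypothetical input: one pending validation and one that succeeded.
cond = validated_condition(
    "pending-for-input",
    [
        {"status": "pending", "message": "Missing inventory or machine network CIDR"},
        {"status": "success", "message": "placeholder for a passing validation"},
    ],
)
print(cond["reason"], "-", cond["message"])
```

Because pending and errored validations are no longer filtered out, the message always carries at least one reason whenever the host is not ready.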

@atraeger WDYT?

Comment 3 Avishay Traeger 2021-06-06 10:54:13 UTC
@frolland sounds good to me

Comment 5 nshidlin 2021-06-08 09:32:21 UTC
Verified with:
assisted-service: quay.io/ocpmetal/assisted-service@sha256:2706a902016fdbda8ca61a69052f22275d51f9cbbc18e877fb34d83055949d82

    "lastTransitionTime": "2021-06-08T09:28:20Z",
    "message": "The agent's validations are pending for user: Machine Network CIDR is undefined; the Machine Network CIDR can be defined by setting either the API or Ingress virtual IPs,Missing inventory or machine network CIDR,Machine Network CIDR or Connectivity Majority Groups missing",
    "reason": "ValidationsUserPending",
    "status": "False",
    "type": "Validated"

    - lastTransitionTime: "2021-06-08T06:40:36Z"
      message: 'The agent''s validations are failing: Require at least 8 CPU cores for
        master role, found only 4,Require at least 32.00 GiB RAM for role master, found
        only 16.00 GiB,Hostname localhost is forbidden'
      reason: ValidationsFailing
      status: "False"
      type: Validated

Comment 8 errata-xmlrpc 2021-07-27 23:11:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438