Bug 1943376 - ingress-operator doesn't always send helpful error messages to install-log when it fails to come up
Keywords:
Status: CLOSED EOL
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Assignee: Ryan Fredette
QA Contact: Hongan Li
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-03-25 22:27 UTC by Chris Collins
Modified: 2024-06-14 01:02 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-11-04 15:19:03 UTC
Target Upstream Version:
Embargoed:



Description Chris Collins 2021-03-25 22:27:26 UTC
Description of problem:

Unsure of component to route to: ingress would be most appropriate, but I was unable to find that specifically.  Perhaps cloud provider would be next most appropriate.

An OSD ROSA cluster install with a BYOC VPC failed. The cause appears to trace back to a lack of available IPs in the AWS subnet.

> 42m        Warning   SyncLoadBalancerFailed   service/router-default   (combined from similar events): Error syncing load balancer: failed to ensure load balancer: InvalidSubnet: Not enough IP space available in subnet-xxxxxxxx. ELB requires at least 8 free IP addresses in each subnet.

This causes the ingress and console clusteroperators to fail.

Discussion within the team suggested filing this BZ with a request for a check of available IP space and a message printed clearly to the install log if not enough space is available.

Version-Release number of selected component (if applicable): OSD OCP 4.7.2 on AWS

How reproducible: Have not reproduced, but presumably would be possible with a subnet lacking available IPs

Steps to Reproduce: N/A

Actual results: Cluster failed to complete install. API server up and available, but ingress and console operators degraded.  External access to cluster unavailable.

Expected results: Identify lack of available IPs for ingress ELB and halt (or enter pending state) install, printing results to the log.
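
For illustration only, the requested pre-check could look roughly like the Go sketch below. It is not the installer's or the ingress operator's actual code. The `Subnet` type and the `subnetsShortOnIPs` helper are hypothetical names; in a real check the free-address count would presumably come from the `AvailableIpAddressCount` field that the EC2 DescribeSubnets API reports, which is assumed here rather than called. The threshold of 8 is taken from the ELB error quoted above.

```go
package main

import "fmt"

// minFreeIPsPerSubnet is the threshold quoted in the ELB error message above.
const minFreeIPsPerSubnet = 8

// Subnet is a hypothetical, simplified view of an AWS subnet. AvailableIPs is
// assumed to be filled in from the cloud provider before the check runs.
type Subnet struct {
	ID           string
	AvailableIPs int
}

// subnetsShortOnIPs returns one human-readable message for every subnet that
// has fewer than minFree free IP addresses. An empty result means the check passed.
func subnetsShortOnIPs(subnets []Subnet, minFree int) []string {
	var problems []string
	for _, s := range subnets {
		if s.AvailableIPs < minFree {
			problems = append(problems, fmt.Sprintf(
				"subnet %s has %d free IP addresses; the ingress ELB requires at least %d",
				s.ID, s.AvailableIPs, minFree))
		}
	}
	return problems
}

func main() {
	// Example data only; the subnet IDs are placeholders.
	subnets := []Subnet{
		{ID: "subnet-aaaa", AvailableIPs: 3},
		{ID: "subnet-bbbb", AvailableIPs: 120},
	}
	for _, p := range subnetsShortOnIPs(subnets, minFreeIPsPerSubnet) {
		// A real installer would write this to its install log and halt or wait.
		fmt.Println("pre-check failed: " + p)
	}
}
```

Returning messages instead of exiting directly leaves the caller free to halt the install or enter a pending state, as the expected results describe.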

Additional info:

Comment 1 Greg Sheremeta 2021-03-26 12:08:39 UTC
@mmasters @sgreene this is another case where we need this exact log message printed right in the openshift-install log.

Error syncing load balancer: failed to ensure load balancer: InvalidSubnet: Not enough IP space available in subnet-xxxxxxxx. ELB requires at least 8 free IP addresses in each subnet.

Because that is a message we want to 1. make super obvious to the users of ROSA and OpenShift Dedicated, 2. use in Hive to transform into a nice error code for both the users and Red Hat SRE.
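
As a rough sketch of point 2, a log consumer could map the raw message to a stable error code with a small pattern table. This is not Hive's actual mechanism; the `failurePattern` type, the `classify` function, and the code `IngressSubnetOutOfIPs` are hypothetical names chosen for illustration. Only the matched text comes from the message quoted above.

```go
package main

import (
	"fmt"
	"regexp"
)

// failurePattern pairs a hypothetical error code with a pattern that
// recognizes the corresponding raw log message.
type failurePattern struct {
	code    string
	pattern *regexp.Regexp
}

// knownFailures holds one entry, built from the message quoted in this bug.
// More entries could be appended as other failure messages are identified.
var knownFailures = []failurePattern{
	{
		code:    "IngressSubnetOutOfIPs",
		pattern: regexp.MustCompile(`InvalidSubnet: Not enough IP space available in`),
	},
}

// classify returns the code of the first pattern that matches the message,
// or "Unknown" when nothing matches.
func classify(message string) string {
	for _, f := range knownFailures {
		if f.pattern.MatchString(message) {
			return f.code
		}
	}
	return "Unknown"
}

func main() {
	msg := "Error syncing load balancer: failed to ensure load balancer: " +
		"InvalidSubnet: Not enough IP space available in subnet-xxxxxxxx. " +
		"ELB requires at least 8 free IP addresses in each subnet."
	fmt.Println(classify(msg))
}
```

This only works if the message reaches the install log in the first place, which is the point of this bug.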

Comment 5 Greg Sheremeta 2021-08-12 00:10:38 UTC
We had a similar problem today where we had to dig this out of a must-gather:
`failed to describe elb load balancers: InvalidClientTokenId: The security token included in the request is invalid\n\tstatus code: 403`

@mmasters @sgreene this is another case where we need this exact log message printed right in the openshift-install log.

Can we please prioritize this?

Comment 8 Miciah Dashiel Butler Masters 2022-01-27 17:20:08 UTC
Moving off of 4.10.0; we'll get this in a later release.

Comment 10 mfisher 2022-11-04 15:19:03 UTC
This issue is stale and has been closed because it has had no activity for a significant amount of time and was reported on a version that is no longer in maintenance.  If this issue should not be closed, please verify that the condition still exists on a supported release and submit an updated bug.

