Bug 2082604 - [IBMCloud][x86_64] IBM VPC does not properly support RHCOS Custom Image tagging
Summary: [IBMCloud][x86_64] IBM VPC does not properly support RHCOS Custom Image tagging
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.11
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Nobody
QA Contact: MayXu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-05-06 14:20 UTC by Christopher J Schaefer
Modified: 2022-08-10 11:11 UTC
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-10 11:10:37 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift installer pull 5869 | 0 | None | open | Bug 2082604: Revert "image: set os name to red-coreos-stable-amd64" | 2022-05-07 11:15:38 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 11:11:11 UTC

Description Christopher J Schaefer 2022-05-06 14:20:57 UTC
Version:
4.11


Platform:
ibmcloud

Please specify:
IPI

What happened?
IPI deployments on IBM Cloud (x86_64) fail because the bootstrap VSI does not start (it stays stuck in the Starting or Failed state).

This appears to be due to the recent change to the Operating System tag for the IBM Cloud VPC Custom Image used for deploying VPC VSIs.
https://github.com/openshift/installer/commit/9f339a3a6f34c0498bb137693f4941669945b7e9

From what I can tell, IBM Cloud VPC does not properly support RHCOS Custom Images, even though that support was supposed to have been added recently.


DEBUG ibm_is_instance.bootstrap_node: Still creating... [12m40s elapsed] 
DEBUG ibm_is_instance.bootstrap_node: Still creating... [12m50s elapsed] 
ERROR                                              
ERROR Error: Instance (0787_175bbfd5-1dc9-4695-ac19-4ffc7090a415) went into failed state during the operation  
ERROR  ([                                          
ERROR     {                                        
ERROR         "code": "cannot_start_compute",      
ERROR         "message": "Can't start instance because provisioning failed.", 
ERROR         "more_info": "https://cloud.ibm.com/docs/vpc?topic=vpc-instance-status-messages#cannot-start-compute" 
ERROR     },                                       
ERROR     {                                        
ERROR         "code": "cannot_start_compute",      
ERROR         "message": "Can't start instance because provisioning failed.", 
ERROR         "more_info": "https://cloud.ibm.com/docs/vpc?topic=vpc-instance-status-messages#cannot-start-compute" 
ERROR     }                                        
ERROR ])
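
For triage, the status reasons can be pulled out of installer output like the log above. The following is only a minimal sketch, assuming the log keeps the shape shown here (a level prefix such as ERROR on each line, and the reasons printed as a JSON array wrapped in parentheses). The function name extract_status_reasons and the SAMPLE_LOG excerpt are hypothetical and are not part of the installer.

```python
import json
import re

# Abbreviated excerpt of the log above, used only for illustration.
SAMPLE_LOG = """\
DEBUG ibm_is_instance.bootstrap_node: Still creating... [12m50s elapsed]
ERROR
ERROR Error: Instance (0787_175bbfd5-1dc9-4695-ac19-4ffc7090a415) went into failed state during the operation
ERROR  ([
ERROR     {
ERROR         "code": "cannot_start_compute",
ERROR         "message": "Can't start instance because provisioning failed."
ERROR     },
ERROR     {
ERROR         "code": "cannot_start_compute",
ERROR         "message": "Can't start instance because provisioning failed."
ERROR     }
ERROR ])
"""


def extract_status_reasons(log_text):
    """Return the unique (code, message) pairs found in the log, in order."""
    # Strip the leading log-level token and surrounding whitespace from each line.
    stripped = [
        re.sub(r"^\s*(ERROR|DEBUG|INFO)\s?", "", line).strip()
        for line in log_text.splitlines()
    ]
    body = "\n".join(stripped)

    # Assumption: the reasons appear as a JSON array wrapped in parentheses, "([ ... ])".
    start = body.find("([")
    end = body.rfind("])")
    if start == -1 or end == -1 or end < start:
        return []

    # Keep only the "[ ... ]" part and parse it as JSON.
    reasons = json.loads(body[start + 1 : end + 1])

    unique = []
    for reason in reasons:
        pair = (reason.get("code"), reason.get("message"))
        if pair not in unique:
            unique.append(pair)
    return unique


if __name__ == "__main__":
    for code, message in extract_status_reasons(SAMPLE_LOG):
        print(f"{code}: {message}")
```

On the excerpt above, the two identical entries collapse into a single cannot_start_compute reason, which makes it easier to see that the instance reports only one distinct failure.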



What did you expect to happen?
Successful IPI deployment


How to reproduce it (as minimally and precisely as possible)?
100%

Create a new IPI 4.11 cluster on IBM Cloud
1. openshift-install create cluster --dir my-ibm-cluster


Anything else we need to know?
IBM has already created a PR to revert the change that caused this and is working with IBM VPC development to determine the reason why RHCOS Custom Images appear not to work properly.

Comment 1 Christopher J Schaefer 2022-05-06 14:21:50 UTC
PR to revert the change that is causing this issue.

Comment 2 Christopher J Schaefer 2022-05-06 14:47:09 UTC
A similar issue has also been reported on 4.10, which does not have the OS patch: https://github.com/openshift/installer/commit/9f339a3a6f34c0498bb137693f4941669945b7e9

I will have to continue investigating, in case the 100% failure rate with the patch above, versus 100% success without it, merely coincides with a separate IBM Cloud VPC issue.

Comment 3 Christopher J Schaefer 2022-05-06 18:40:03 UTC
Local testing has confirmed that the issue affects 4.11 CI/nightly builds, which include the OS patch mentioned above (RHEL vs. Fedora CoreOS tag).

I have also confirmed that the latest release-4.10 build is not affected by this bug, as it does not have the OS patch.

So I believe this affects only 4.11, because of this OS patch. The PR to revert that change:
https://github.com/openshift/installer/pull/5869

Comment 4 MayXu 2022-05-10 02:05:53 UTC
Pre-merge test done.

Comment 6 MayXu 2022-05-11 02:45:55 UTC
registry.ci.openshift.org/ocp/release:4.11.0-0.ci-2022-05-10-210344 IPI install success

Comment 9 errata-xmlrpc 2022-08-10 11:10:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

