Bug 2025464

Summary: [aws] openshift-install gather bootstrap collects logs for bootstrap and only one master node
Product: OpenShift Container Platform
Component: Installer
Installer sub component: openshift-installer
Reporter: jima
Assignee: Matthew Staebler <mstaeble>
QA Contact: jima
CC: mstaeble
Status: CLOSED ERRATA
Severity: medium
Priority: high
Version: 4.10
Target Release: 4.10.0
Hardware: Unspecified
OS: Unspecified
Type: Bug
Last Closed: 2022-03-10 16:29:53 UTC
Bug Blocks: 2031606

Description jima 2021-11-22 09:56:52 UTC
Version:
4.9/4.10

Platform:
aws

Please specify:
IPI

What happened?
After creating an IPI cluster on AWS with the default configuration, only one master IP is present in cluster.tfvars.json, so only that master's logs are collected when running "openshift-install gather bootstrap".

$ ls log-bundle-20211122143620/control-plane/
10.0.143.20

In cluster.tfvars.json,
$ cat cluster.tfvars.json | jq -r
{
  "ami_id": "ami-0c2dbd95931008b1a",
  "control_plane_ips": [
    "10.0.143.20"
  ],
...
} 
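
A quick way to spot this symptom is to compare the number of entries in control_plane_ips with the number of control-plane nodes expected. The following is a minimal Python sketch, assuming the file has the shape shown above. The function names and the expected count of 3 are illustrative assumptions, not part of the installer.

```python
import json


def control_plane_ips(tfvars_text):
    """Return the control_plane_ips list from the text of cluster.tfvars.json."""
    data = json.loads(tfvars_text)
    # Treat a missing key as an empty list rather than failing.
    return list(data.get("control_plane_ips", []))


def missing_ip_count(tfvars_text, expected_masters=3):
    """How many control-plane IPs are missing relative to the expected count.

    expected_masters=3 is an assumption for this sketch; pass the real
    replica count of the cluster being checked.
    """
    return max(expected_masters - len(control_plane_ips(tfvars_text)), 0)


# Trimmed-down sample modelled on the output above.
sample = '{"ami_id": "ami-0c2dbd95931008b1a", "control_plane_ips": ["10.0.143.20"]}'
print(control_plane_ips(sample))   # ['10.0.143.20']
print(missing_ip_count(sample))    # 2
```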

In the code, only the first item of the list is emitted as the output.
https://github.com/openshift/installer/blob/master/data/data/aws/cluster/master/outputs.tf#L2

All master IPs should be emitted as Terraform outputs so that "openshift-install gather bootstrap" can consume them later.

What did you expect to happen?
All master IPs are stored in cluster.tfvars.json and the related logs are collected when running "openshift-install gather bootstrap".

How to reproduce it (as minimally and precisely as possible)?
Always reproducible on 4.9/4.10.

Comment 1 Matthew Staebler 2021-11-22 18:15:56 UTC
https://github.com/openshift/installer/blob/7fd358462f14d43f41d64a5d591c85adc2c122f4/data/data/aws/cluster/master/outputs.tf#L2 is only grabbing the first IP address in the list of IP addresses for all the masters, rather than the first IP address for each master.
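
The distinction can be modelled outside Terraform. The Python sketch below is illustrative only: it assumes each master exposes a list of private IPs, and it mimics "first IP of the combined list" versus "first IP of each master". It is not the installer's actual Terraform expression; the function names are hypothetical and the sample addresses are borrowed from elsewhere in this report.

```python
from itertools import chain


def first_ip_overall(per_master_ips):
    """Model of the reported behaviour: one IP taken from the combined list."""
    flattened = list(chain.from_iterable(per_master_ips))
    return flattened[:1]


def first_ip_per_master(per_master_ips):
    """Model of the intended behaviour: the first IP of every master."""
    return [ips[0] for ips in per_master_ips if ips]


# One inner list per master (assumed shape for this sketch).
masters = [["10.0.141.58"], ["10.0.176.42"], ["10.0.200.85"]]
print(first_ip_overall(masters))     # ['10.0.141.58']
print(first_ip_per_master(masters))  # ['10.0.141.58', '10.0.176.42', '10.0.200.85']
```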

Comment 4 jima 2021-12-13 02:14:28 UTC
Verified on 4.10.0-0.nightly-2021-10-20-193037. Logs from all control-plane nodes are collected, so moving the bug to VERIFIED.

$ ls -ltr log-bundle-20211213020240/control-plane/
total 12
drwxrwxr-x. 7 jima jima 4096 Dec 13  2021 10.0.141.58
drwxrwxr-x. 7 jima jima 4096 Dec 13  2021 10.0.176.42
drwxrwxr-x. 7 jima jima 4096 Dec 13  2021 10.0.200.85
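
The same check can be scripted. The following is a minimal Python sketch, assuming the bundle layout shown above (one directory per control-plane IP under control-plane/). The helper name and the temporary directory used for the demo are illustrative, not part of openshift-install.

```python
import os
import tempfile


def uncollected_masters(bundle_dir, expected_ips):
    """Return expected control-plane IPs with no directory in the log bundle."""
    control_plane = os.path.join(bundle_dir, "control-plane")
    collected = {
        name for name in os.listdir(control_plane)
        if os.path.isdir(os.path.join(control_plane, name))
    }
    return sorted(set(expected_ips) - collected)


# Demo against a throwaway directory that mimics the listing above,
# deliberately leaving one master out.
expected = ["10.0.141.58", "10.0.176.42", "10.0.200.85"]
with tempfile.TemporaryDirectory() as bundle:
    for ip in expected[:2]:
        os.makedirs(os.path.join(bundle, "control-plane", ip))
    print(uncollected_masters(bundle, expected))  # ['10.0.200.85']
```

An empty result means every expected master has a directory in the bundle.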

The issue also exists on 4.9 and needs a backport. How should this be handled? @mstaeble

Comment 5 jima 2021-12-13 02:45:06 UTC
Correction to comment 4: the nightly build used to verify the bug is 4.10.0-0.nightly-2021-12-12-184227, not 4.10.0-0.nightly-2021-10-20-193037.

Comment 6 Matthew Staebler 2021-12-13 04:01:23 UTC
(In reply to jima from comment #4)
> The issue also exists on 4.9 and needs a backport. How should this be handled?
> @mstaeble

https://github.com/openshift/installer/pull/5474

Comment 9 errata-xmlrpc 2022-03-10 16:29:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056