Bug 1958420

Summary: openshift-install 4.7.10 fails with segmentation error
Product: OpenShift Container Platform
Reporter: jrickard
Component: Installer
Assignee: Matthew Staebler <mstaeble>
Installer sub component: openshift-installer
QA Contact: Yunfei Jiang <yunjiang>
Status: CLOSED ERRATA
Docs Contact:
Severity: urgent
Priority: urgent
CC: mstaeble, wking
Version: 4.7
Keywords: Regression, TestBlocker
Target Milestone: ---
Target Release: 4.8.0
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text: This regression was not released.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-07-27 23:07:25 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1958518
Attachments:
openshift-install.log (flags: none)

Description jrickard 2021-05-07 21:06:09 UTC
Created attachment 1780871 [details]
openshift-install.log

Version:

$ openshift-install version

openshift-install 4.7.10
built from commit 3d157f47000c2a9963527ad1dc8c69b77053a4a6
release image quay.io/openshift-release-dev/ocp-release@sha256:24f0bcf67474e06ceb1091fc63bddd6010e1d13f5fe5604962a4579ee98b8e22

Platform:

aws (govcloud)
rhel8
attempted using following instances types:
t2.medium
m4.large 

Please specify:

IPI

What happened?

When running `openshift-install create cluster --dir=deploy --log-level=debug`, the installer panics with the following output:

DEBUG           Fetching Install Config...
DEBUG           Reusing previously-fetched Install Config
DEBUG         Generating Additional Trust Bundle Config...
DEBUG       Generating Infrastructure Config...
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0xabfaebb]

goroutine 1 [running]:
github.com/openshift/installer/pkg/asset/manifests.(*Infrastructure).Generate(0xc001ac5b00, 0xc000320840, 0xd9bd9d8, 0x12)
        /go/src/github.com/openshift/installer/pkg/asset/manifests/infrastructure.go:100 +0x6bb
github.com/openshift/installer/pkg/asset/store.(*storeImpl).fetch(0xc000320cf0, 0xed56420, 0xc001ac5b00, 0xc00064c8e8, 0x6, 0xc00064c8e8, 0x6)
        /go/src/github.com/openshift/installer/pkg/asset/store/store.go:227 +0x7dc
github.com/openshift/installer/pkg/asset/store.(*storeImpl).fetch(0xc000320cf0, 0xed564a0, 0xc0013e4f60, 0xc0019aa0a8, 0x4, 0xc0019aa0a8, 0x4)
        /go/src/github.com/openshift/installer/pkg/asset/store/store.go:221 +0x625
github.com/openshift/installer/pkg/asset/store.(*storeImpl).fetch(0xc000320cf0, 0xed55d60, 0xc00085ab20, 0xd962f64, 0x2, 0xd962f64, 0x2)
        /go/src/github.com/openshift/installer/pkg/asset/store/store.go:221 +0x625
github.com/openshift/installer/pkg/asset/store.(*storeImpl).fetch(0xc000320cf0, 0x7f91721169d8, 0x15c12508, 0x0, 0x0, 0x40b525, 0xc42a4c0)
        /go/src/github.com/openshift/installer/pkg/asset/store/store.go:221 +0x625
github.com/openshift/installer/pkg/asset/store.(*storeImpl).Fetch(0xc000320cf0, 0x7f91721169d8, 0x15c12508, 0x15be34e0, 0x8, 0x8, 0x7d00000000000000, 0xed8279fb3)
        /go/src/github.com/openshift/installer/pkg/asset/store/store.go:77 +0x4b
main.runTargetCmd.func1(0x7ffd3db1f6cc, 0x6, 0xc000c50320, 0xc0004b57a0)
        /go/src/github.com/openshift/installer/cmd/openshift-install/create.go:173 +0x135
main.runTargetCmd.func2(0x15bec7a0, 0xc000c500c0, 0x0, 0x2)
        /go/src/github.com/openshift/installer/cmd/openshift-install/create.go:200 +0xb5
github.com/spf13/cobra.(*Command).execute(0x15bec7a0, 0xc000c50080, 0x2, 0x2, 0x15bec7a0, 0xc000c50080)
        /go/src/github.com/openshift/installer/vendor/github.com/spf13/cobra/command.go:854 +0x2c2
github.com/spf13/cobra.(*Command).ExecuteC(0xc00038e840, 0xc000b03df8, 0x1, 0x1)
        /go/src/github.com/openshift/installer/vendor/github.com/spf13/cobra/command.go:958 +0x375
github.com/spf13/cobra.(*Command).Execute(...)
        /go/src/github.com/openshift/installer/vendor/github.com/spf13/cobra/command.go:895
main.installerMain()
        /go/src/github.com/openshift/installer/cmd/openshift-install/main.go:70 +0x2b8
main.main()
        /go/src/github.com/openshift/installer/cmd/openshift-install/main.go:50 +0x16f

What did you expect to happen?

The installer should start the cluster installation.

How to reproduce it (as minimally and precisely as possible)?

$ curl -LfO http://mirror.openshift.com/pub/openshift-v4/x86_64/clients/ocp/4.7.10/openshift-install-linux-4.7.10.tar.gz

$ sudo tar xvf openshift-install-linux-4.7.10.tar.gz -C /usr/local/bin/

$ openshift-install create cluster --dir=deploy --log-level=debug

Anything else we need to know?

When I download and install openshift-install 4.7.9 instead, the cluster deploys as expected.

Comment 1 Matthew Staebler 2021-05-08 01:05:52 UTC
Please share the install-config.yaml that you are using.

Comment 2 Matthew Staebler 2021-05-08 01:06:52 UTC
(In reply to Matthew Staebler from comment #1)
> Please share the install-config.yaml that you are using.

Never mind. I see the problem. Thanks.

Comment 3 Yunfei Jiang 2021-05-08 05:40:53 UTC
Reproduced the issue on OCP 4.7.10. The panic occurs when `serviceEndpoints` is set in the install-config.yaml.

This issue blocks all tests that require custom endpoints. For example, installing a cluster on AWS in a restricted network with STS support requires a regional STS endpoint instead of the global STS endpoint.

Adding the TestBlocker keyword.
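
For readers unfamiliar with this class of failure, the Go sketch below shows one common way a "nil pointer dereference" panic like the one in the description can arise: writing a field through a struct pointer that was never allocated. It is an illustration only. The types `PlatformStatus`, `AWSStatus`, and `ServiceEndpoint` and both setter functions are hypothetical stand-ins, not the installer's code, and this report does not state the actual root cause at infrastructure.go:100.

```go
package main

import "fmt"

// All types below are hypothetical stand-ins for illustration.
// They are NOT the installer's or the OpenShift API's real types.
type ServiceEndpoint struct {
	Name string
	URL  string
}

type AWSStatus struct {
	Region           string
	ServiceEndpoints []ServiceEndpoint
}

type PlatformStatus struct {
	AWS *AWSStatus // nil until something allocates it
}

// setEndpointsUnsafe writes through ps.AWS without checking it.
// If ps.AWS is nil, the write dereferences a nil pointer and panics.
func setEndpointsUnsafe(ps *PlatformStatus, eps []ServiceEndpoint) {
	ps.AWS.ServiceEndpoints = eps
}

// setEndpointsSafe allocates the nested struct before writing to it.
func setEndpointsSafe(ps *PlatformStatus, eps []ServiceEndpoint) {
	if ps.AWS == nil {
		ps.AWS = &AWSStatus{}
	}
	ps.AWS.ServiceEndpoints = eps
}

// panics reports whether calling f panics.
func panics(f func()) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	eps := []ServiceEndpoint{
		{Name: "ec2", URL: "https://ec2.us-gov-west-1.amazonaws.com"},
	}

	// The nested AWS pointer is nil here, so the unguarded write panics.
	fmt.Println("unsafe panics:", panics(func() {
		setEndpointsUnsafe(&PlatformStatus{}, eps)
	}))

	// The guarded version allocates first and succeeds.
	ps := &PlatformStatus{}
	setEndpointsSafe(ps, eps)
	fmt.Println("safe endpoints:", len(ps.AWS.ServiceEndpoints))
}
```

A failure of this shape would only appear when the optional endpoints are supplied, which would match the observation that installs without `serviceEndpoints` are unaffected. Whether the real defect has this shape is an assumption.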

Comment 5 jrickard 2021-05-08 18:06:38 UTC
Providing sanitized install-config.yaml for completeness:

apiVersion: v1
baseDomain: mydomain.com
credentialsMode: Passthrough
controlPlane:
  hyperthreading: Enabled
  name: master
  platform:
    aws:
      rootVolume:
        iops: 4000
        size: 500
        type: io1
      type: m5.xlarge
  replicas: 3
compute:
- hyperthreading: Enabled
  name: worker
  platform:
    aws:
      rootVolume:
        iops: 2000
        size: 500
        type: io1
      type: m5.xlarge
  replicas: 3
metadata:
  name: cluster
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.1.0/24
  - cidr: 10.0.2.0/24
  - cidr: 10.0.3.0/24
  - cidr: 10.0.4.0/24
  - cidr: 10.0.5.0/24
  - cidr: 10.0.0.0/24
  networkType: OpenShiftSDN
  serviceNetwork:
  - 172.30.0.0/16
platform:
  aws:
    serviceEndpoints:
    - name: ec2
      url: https://ec2.us-gov-west-1.amazonaws.com
    - name: elasticloadbalancing
      url: https://elasticloadbalancing.us-gov-west-1.amazonaws.com
    - name: route53
      url: https://route53.us-gov.amazonaws.com
    - name: tagging
      url: https://tagging.us-gov-west-1.amazonaws.com
    zones:
      - us-gov-west-1a
      - us-gov-west-1b
      - us-gov-west-1c
    region: us-gov-west-1
    subnets:
      - subnet-00604a66713f764f6
      - subnet-0244a3713645630b9
      - subnet-08cb5ff4d797e46e5
      - subnet-0e772b0602b61b47a
      - subnet-0b2246878076c46ff
      - subnet-06bd334f17a95aeec
    amiID: ami-0d993be65a4c274a8
pullSecret: '{"auths":}'
fips: false
sshKey: ssh-rsa AAAAB3
publish: Internal
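
Since the panic is tied to the `serviceEndpoints` stanza, a quick pre-flight check of those entries can help rule out malformed input when triaging a similar crash. The sketch below is a minimal, hypothetical checker in Go. The `endpoint` type, the `checkEndpoints` function, and the rules it applies (non-empty unique name, absolute `https` URL with a host) are assumptions made for illustration and are not the installer's own validation.

```go
package main

import (
	"fmt"
	"net/url"
)

// endpoint mirrors one entry of platform.aws.serviceEndpoints.
// Hypothetical type for illustration only.
type endpoint struct {
	Name string
	URL  string
}

// checkEndpoints returns one error per problem found. The rules are
// assumptions chosen for this sketch, not the installer's validation.
func checkEndpoints(eps []endpoint) []error {
	var errs []error
	seen := map[string]bool{}
	for i, e := range eps {
		if e.Name == "" {
			errs = append(errs, fmt.Errorf("entry %d: empty name", i))
		} else if seen[e.Name] {
			errs = append(errs, fmt.Errorf("entry %d: duplicate name %q", i, e.Name))
		}
		seen[e.Name] = true

		u, err := url.Parse(e.URL)
		if err != nil {
			errs = append(errs, fmt.Errorf("entry %d: %v", i, err))
			continue
		}
		if u.Scheme != "https" || u.Host == "" {
			errs = append(errs, fmt.Errorf("entry %d: %q is not an absolute https URL", i, e.URL))
		}
	}
	return errs
}

func main() {
	// The four endpoints from the install-config above.
	eps := []endpoint{
		{Name: "ec2", URL: "https://ec2.us-gov-west-1.amazonaws.com"},
		{Name: "elasticloadbalancing", URL: "https://elasticloadbalancing.us-gov-west-1.amazonaws.com"},
		{Name: "route53", URL: "https://route53.us-gov.amazonaws.com"},
		{Name: "tagging", URL: "https://tagging.us-gov-west-1.amazonaws.com"},
	}
	for _, err := range checkEndpoints(eps) {
		fmt.Println(err)
	}
}
```

The endpoints in this install-config pass these checks, which is consistent with the crash being an installer regression rather than a configuration error.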

Comment 6 Yunfei Jiang 2021-05-10 04:38:06 UTC
Verified. PASS.

OCP version: 4.8.0-0.nightly-2021-05-09-105430

Comment 10 errata-xmlrpc 2021-07-27 23:07:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438