Bug 1957809 - [OSP] Install with invalid platform.openstack.machinesSubnet results in runtime error
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.7
Hardware: Unspecified
OS: Unspecified
medium
low
Target Milestone: ---
Target Release: 4.8.0
Assignee: Adolfo Duarte
QA Contact: Udi Shkalim
URL:
Whiteboard:
Depends On:
Blocks: 1928761
Reported: 2021-05-06 14:04 UTC by Kevin Chung
Modified: 2021-12-01 14:07 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Previously, if the user assigned the ID of a subnet that does not exist to platform.openstack.machinesSubnet, the installer would fail with a panic: runtime error: invalid memory address or nil pointer dereference. Now, the installer fails gracefully, informing the user that the machinesSubnet value is invalid.
Clone Of:
Environment:
Last Closed: 2021-07-27 23:06:51 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github openshift installer pull 4917 0 None closed Bug 1957809: Validation of platform.openstack.machineSubnet 2021-06-10 19:38:33 UTC
Red Hat Product Errata RHSA-2021:2438 0 None None None 2021-07-27 23:07:14 UTC

Description Kevin Chung 2021-05-06 14:04:25 UTC
Version:

$ openshift-install version
openshift-install 4.7.5
built from commit e15f17c958b4a04e770c0cfe758ca69452874508
release image quay.io/openshift-release-dev/ocp-release@sha256:0a4c44daf1666f069258aa983a66afa2f3998b78ced79faa6174e0a0f438f0a5


Platform:
OpenStack with IPI

What happened?

In my install-config.yaml file, I populated platform.openstack.machinesSubnet with an incorrect UUID and the installer wasn't able to look it up. While the installer did return just enough information for me to conclude what the issue was, it'd be best if this produced a friendly error message rather than a nil pointer dereference / goroutine stack trace.

# openshift-install create cluster --dir=. --log-level=info
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x48 pc=0x23c0f41]

goroutine 1 [running]:
github.com/openshift/installer/pkg/asset/installconfig/openstack/validation.validateMachinesSubnet(0xc00013bc00, 0xc0007945a0, 0xc0009fcbd0, 0xc001224180, 0x1, 0xc001224180, 0xc00071dbe0)
        /go/src/github.com/openshift/installer/pkg/asset/installconfig/openstack/validation/platform.go:48 +0xc1
github.com/openshift/installer/pkg/asset/installconfig/openstack/validation.ValidatePlatform(0xc00013bc00, 0xc0007945a0, 0xc0009fcbd0, 0x0, 0x16, 0x1)
        /go/src/github.com/openshift/installer/pkg/asset/installconfig/openstack/validation/platform.go:21 +0xc7
github.com/openshift/installer/pkg/asset/installconfig/openstack.Validate(0xc0000fe480, 0x0, 0x0)
        /go/src/github.com/openshift/installer/pkg/asset/installconfig/openstack/validate.go:27 +0xc5
github.com/openshift/installer/pkg/asset/installconfig.(*InstallConfig).platformValidation(0xc0005e0a60, 0x0, 0x0)
        /go/src/github.com/openshift/installer/pkg/asset/installconfig/installconfig.go:196 +0x15b
github.com/openshift/installer/pkg/asset/installconfig.(*InstallConfig).finish(0xc0005e0a60, 0xd9c85f9, 0x13, 0x12c3, 0xd6e4be0)
        /go/src/github.com/openshift/installer/pkg/asset/installconfig/installconfig.go:156 +0x234
github.com/openshift/installer/pkg/asset/installconfig.(*InstallConfig).Load(0xc0005e0a60, 0xed1bc60, 0xc000590080, 0xee143a0, 0xc0005e0a60, 0xc0005e0a60)
        /go/src/github.com/openshift/installer/pkg/asset/installconfig/installconfig.go:133 +0x1f0
github.com/openshift/installer/pkg/asset/store.(*storeImpl).load(0xc0005fe840, 0xed49560, 0xc0005e0680, 0xc000aecdcc, 0x4, 0xc000aecdcc, 0x4, 0x0)
        /go/src/github.com/openshift/installer/pkg/asset/store/store.go:264 +0x455
github.com/openshift/installer/pkg/asset/store.(*storeImpl).load(0xc0005fe840, 0xed49520, 0xc0005e0600, 0xd958e24, 0x2, 0xd958e24, 0x2, 0x0)
        /go/src/github.com/openshift/installer/pkg/asset/store/store.go:247 +0x2d7
github.com/openshift/installer/pkg/asset/store.(*storeImpl).load(0xc0005fe840, 0x7f5506b235b8, 0x15c02480, 0x0, 0x0, 0x2, 0x2, 0xec90320)
        /go/src/github.com/openshift/installer/pkg/asset/store/store.go:247 +0x2d7
github.com/openshift/installer/pkg/asset/store.(*storeImpl).fetch(0xc0005fe840, 0x7f5506b235b8, 0x15c02480, 0x0, 0x0, 0x40b525, 0xc4210c0)
        /go/src/github.com/openshift/installer/pkg/asset/store/store.go:201 +0xa3a
github.com/openshift/installer/pkg/asset/store.(*storeImpl).Fetch(0xc0005fe840, 0x7f5506b235b8, 0x15c02480, 0x15bd34a0, 0x8, 0x8, 0xdd00000000000000, 0xed825ebc9)
        /go/src/github.com/openshift/installer/pkg/asset/store/store.go:77 +0x4b
main.runTargetCmd.func1(0x7ffeac021837, 0x1, 0xc0005e0560, 0xc0005fe660)
        /go/src/github.com/openshift/installer/cmd/openshift-install/create.go:173 +0x135
main.runTargetCmd.func2(0x15bdc740, 0xc0005e0380, 0x0, 0x2)
        /go/src/github.com/openshift/installer/cmd/openshift-install/create.go:200 +0xb5
github.com/spf13/cobra.(*Command).execute(0x15bdc740, 0xc0005e0340, 0x2, 0x2, 0x15bdc740, 0xc0005e0340)
        /go/src/github.com/openshift/installer/vendor/github.com/spf13/cobra/command.go:854 +0x2c2
github.com/spf13/cobra.(*Command).ExecuteC(0xc000c03b80, 0xc000b1fdf8, 0x1, 0x1)
        /go/src/github.com/openshift/installer/vendor/github.com/spf13/cobra/command.go:958 +0x375
github.com/spf13/cobra.(*Command).Execute(...)
        /go/src/github.com/openshift/installer/vendor/github.com/spf13/cobra/command.go:895
main.installerMain()
        /go/src/github.com/openshift/installer/cmd/openshift-install/main.go:70 +0x2b8
main.main()
        /go/src/github.com/openshift/installer/cmd/openshift-install/main.go:50 +0x16f


What did you expect to happen?

Produce an error message without a stacktrace

How to reproduce it (as minimally and precisely as possible)?

Populate the platform.openstack.machinesSubnet field in install-config.yaml with an invalid UUID

Comment 2 Adolfo Duarte 2021-05-07 20:30:19 UTC
@Kevin
Could you please copy and paste the relevant part of your install-config.yaml?

I did this test on 
[stack@standalone 4.7]$ ./openshift-install version
./openshift-install 4.7.9
built from commit fae650e24e7036b333b2b2d9dfb5a08a29cd07b1
release image registry.ci.openshift.org/ocp/release@sha256:dd75546170e65d7d17130de10a6ffeb425f960399640632cbc8426b9da338458

using this in my environment: 

platform:
  openstack:
    machinesSubnet: 12312323
    apiFloatingIP: 192.168.25.244
    apiVIP: 10.0.0.5
    cloud: openshift
    defaultMachinePlatform:
      type: m1.xlarge
    externalDNS: null
    externalNetwork: hostonly
    ingressVIP: 10.0.0.7
publish: External

and I get: 
[stack@standalone 4.7]$ ./openshift-install create cluster --log-level=debug
DEBUG OpenShift Installer 4.7.9                    
DEBUG Built from commit fae650e24e7036b333b2b2d9dfb5a08a29cd07b1 
DEBUG Fetching Metadata...                         
DEBUG Loading Metadata...                          
DEBUG   Loading Cluster ID...                      
DEBUG     Loading Install Config...                
DEBUG       Loading SSH Key...                     
DEBUG       Loading Base Domain...                 
DEBUG         Loading Platform...                  
DEBUG       Loading Cluster Name...                
DEBUG         Loading Base Domain...               
DEBUG         Loading Platform...                  
DEBUG       Loading Networking...                  
DEBUG         Loading Platform...                  
DEBUG       Loading Pull Secret...                 
DEBUG       Loading Platform...                    
FATAL failed to fetch Metadata: failed to load asset "Install Config": platform.openstack.machinesSubnet: Internal error: invalid subnet ID 
[stack@standalone 4.7]$

Comment 3 Adolfo Duarte 2021-05-07 20:32:30 UTC
Actually, I replicated it, thanks.
The value has to be a string in UUID format that does not exist on the OpenStack deployment.
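
A minimal sketch of why the two inputs behave differently, assuming the installer runs a UUID-format check before the subnet lookup (the "invalid subnet ID" message above suggests this, but it is not confirmed here). The helper name looksLikeUUID is hypothetical and is not taken from the installer code:

```go
package main

import (
	"fmt"
	"regexp"
)

// uuidPattern matches the canonical 8-4-4-4-12 hexadecimal UUID layout.
var uuidPattern = regexp.MustCompile(
	`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)

// looksLikeUUID is a hypothetical helper. It checks the format only and says
// nothing about whether a subnet with that ID exists in the cloud.
func looksLikeUUID(s string) bool {
	return uuidPattern.MatchString(s)
}

func main() {
	ids := []string{
		"12312323",                             // fails the format check
		"00a578f8-f68e-4ae0-aaea-266262033771", // well-formed, so it would reach the lookup
	}
	for _, id := range ids {
		fmt.Printf("%q well-formed=%v\n", id, looksLikeUUID(id))
	}
}
```

Under that assumption, a well-formed ID passes the format check, and only the later lookup can tell that the subnet is missing.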

Comment 4 Adolfo Duarte 2021-05-07 20:33:46 UTC
Example to reproduce, where subnet ID 00a578f8-f68e-4ae0-aaea-266262033771 does not exist:

use the following in install-config.yaml 
platform:
  openstack:
    machinesSubnet: 00a578f8-f68e-4ae0-aaea-266262033771
    apiFloatingIP: 192.168.25.244
    apiVIP: 10.0.0.5
    cloud: openshift
    defaultMachinePlatform:
      type: m1.xlarge
    externalDNS: null
    externalNetwork: hostonly
    ingressVIP: 10.0.0.7
publish: External

Comment 5 Adolfo Duarte 2021-05-07 20:48:40 UTC
The problem seems to be here: 

https://github.com/openshift/installer/blob/0887f5336fba1d3631a0c4be8575f887e273a4ed/pkg/asset/installconfig/openstack/validation/platform.go#L44

"if n.MachineNetwork[0].CIDR.String() != ci.MachinesSubnet.CIDR {" 

In the case of an invalid (non-existent) subnet ID, ci.MachinesSubnet.CIDR is a nil pointer.

Comment 6 Adolfo Duarte 2021-05-07 21:11:56 UTC
Or perhaps it is n.MachineNetwork[0].CIDR.String() that is a nil pointer.

Comment 8 Adolfo Duarte 2021-05-08 02:58:22 UTC
Attached a debugger, and in this case:
ci.MachinesSubnet = nil
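
A minimal sketch of the kind of guard that avoids this panic, not the actual patch. The Subnet and CloudInfo types are simplified, hypothetical stand-ins for the installer's types, and the error wording is illustrative only:

```go
package main

import (
	"errors"
	"fmt"
)

// Subnet and CloudInfo are hypothetical, simplified stand-ins; only the
// fields needed for the illustration are kept.
type Subnet struct {
	ID   string
	CIDR string
}

type CloudInfo struct {
	// MachinesSubnet is nil when the lookup found no subnet with the given ID.
	MachinesSubnet *Subnet
}

// validateMachinesSubnet checks the pointer before reading CIDR, so a missing
// subnet becomes an ordinary error instead of a nil pointer dereference.
func validateMachinesSubnet(subnetID, machineCIDR string, ci *CloudInfo) error {
	if subnetID == "" {
		return nil // field not set, nothing to validate
	}
	if ci == nil || ci.MachinesSubnet == nil {
		return fmt.Errorf("platform.openstack.machinesSubnet: Not found: %q", subnetID)
	}
	if ci.MachinesSubnet.CIDR != machineCIDR {
		return errors.New("platform.openstack.machinesSubnet: subnet CIDR does not match the machine network CIDR")
	}
	return nil
}

func main() {
	// Simulates a lookup that returned nothing for a non-existent subnet ID.
	ci := &CloudInfo{MachinesSubnet: nil}
	err := validateMachinesSubnet("00a578f8-f68e-4ae0-aaea-266262033771", "10.0.0.0/16", ci)
	fmt.Println(err)
}
```

The point of the sketch is the ordering: the nil check comes before any field access on the looked-up subnet, so the CIDR comparison is reached only when the subnet was actually found.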

Comment 9 Adolfo Duarte 2021-05-08 03:12:24 UTC
Patch submitted: https://github.com/openshift/installer/pull/4917

Comment 11 Udi Shkalim 2021-06-03 14:05:02 UTC
Verified on: 4.8.0-0.nightly-2021-06-03-055145

#################################################################################################################################

Working configuration with correct values:

(shiftstack) [stack@undercloud-0 ~]$ openstack subnet list
+--------------------------------------+----------------------+--------------------------------------+------------------+
| ID                                   | Name                 | Network                              | Subnet           |
+--------------------------------------+----------------------+--------------------------------------+------------------+
| 8d6446e9-634b-490e-a9f7-14868fada2e3 | provider-flat-subnet | 8a5f38b9-7f25-43ef-a083-a2bcf8d75c14 | 10.46.22.128/26  |

(shiftstack) [stack@undercloud-0 ~]$ grep machinesSubnet install-config.yaml
    machinesSubnet: 8d6446e9-634b-490e-a9f7-14868fada2e3

#################################################################################################################################
Subnet ID that does not exist on OpenStack

(shiftstack) [stack@undercloud-0 ~]$ grep machinesSubnet ostest/install-config.yaml
    machinesSubnet: 1d6446e9-634b-490e-a9f7-14868fada2e3

(shiftstack) [stack@undercloud-0 ~]$ openshift-install create cluster --dir ostest/
FATAL failed to fetch Metadata: failed to load asset "Install Config": platform.openstack.machinesSubnet: Not found: "1d6446e9-634b-490e-a9f7-14868fada2e3"



#################################################################################################################################
Malformed subnet ID (first character deleted)

(shiftstack) [stack@undercloud-0 ~]$ grep machinesSubnet ostest/install-config.yaml
    machinesSubnet: d6446e9-634b-490e-a9f7-14868fada2e3

(shiftstack) [stack@undercloud-0 ~]$ openshift-install create cluster --dir ostest/
FATAL failed to fetch Metadata: failed to load asset "Install Config": platform.openstack.machinesSubnet: Not found: "d6446e9-634b-490e-a9f7-14868fada2e3"

Comment 14 errata-xmlrpc 2021-07-27 23:06:51 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

