Bug 1741763 - [aws] Creating machine with invalid label results in panic
Summary: [aws] Creating machine with invalid label results in panic
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cloud Compute
Version: 4.2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.2.0
Assignee: Jan Chaloupka
QA Contact: sunzhaohua
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-08-16 06:00 UTC by sunzhaohua
Modified: 2019-10-16 06:36 UTC
CC List: 1 user

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-10-16 06:36:15 UTC
Target Upstream Version:
Embargoed:


Links:
Red Hat Product Errata RHBA-2019:2922, last updated 2019-10-16 06:36:26 UTC

Description sunzhaohua 2019-08-16 06:00:00 UTC
Description of problem:
Creating a machine with the invalid label "machine.openshift.io/cluster-api-cluster: zhsun3-8vcmx-invalid" causes the machine-controller log to show "panic: runtime error: invalid memory address or nil pointer dereference".
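
For context, here is a minimal Go sketch of this failure mode (hypothetical names, not the actual actuator code): when the machine's cluster label matches no Cluster object, the reconciler can end up handing the actuator a nil cluster pointer, and any unguarded field access on it produces exactly this panic.

package main

import "fmt"

// Cluster stands in for the cluster-api Cluster object (hypothetical type).
type Cluster struct {
    Name string
}

// update mimics an actuator Update that dereferences the cluster without a
// nil check; with an unmatched label the lookup yields nil and this panics
// with "invalid memory address or nil pointer dereference".
func update(cluster *Cluster) {
    fmt.Printf("updating machine for cluster %s\n", cluster.Name)
}

func main() {
    var cluster *Cluster // lookup by the invalid label found no Cluster
    update(cluster)
}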

Version-Release number of selected component (if applicable):
4.2.0-0.nightly-2019-08-15-033605

How reproducible:
Always

Steps to Reproduce:
1. Create a machine with an invalid cluster label, for example:
apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  labels:
    machine.openshift.io/cluster-api-cluster: xxia-0815-g492t-invalid
    machine.openshift.io/cluster-api-machine-role: worker
    machine.openshift.io/cluster-api-machine-type: worker
  name: xxia-0815-g492t-worker-ap-southeast-1a-a
  namespace: openshift-machine-api
spec:
  metadata:
    creationTimestamp: null
  providerSpec:
    value:
      ami:
        id: ami-020a6747c571d1ee5
      apiVersion: awsproviderconfig.openshift.io/v1beta1
      blockDevices:
      - ebs:
          iops: 0
          volumeSize: 120
          volumeType: gp2
      credentialsSecret:
        name: aws-cloud-credentials
      deviceIndex: 0
      iamInstanceProfile:
        id: xxia-0815-g492t-worker-profile
      instanceType: m4.large
      kind: AWSMachineProviderConfig
      metadata:
        creationTimestamp: null
      placement:
        availabilityZone: ap-southeast-1a
        region: ap-southeast-1
      publicIp: null
      securityGroups:
      - filters:
        - name: tag:Name
          values:
          - xxia-0815-g492t-worker-sg
      subnet:
        filters:
        - name: tag:Name
          values:
          - xxia-0815-g492t-private-ap-southeast-1a
      tags:
      - name: kubernetes.io/cluster/xxia-0815-g492t
        value: owned
      userDataSecret:
        name: worker-user-data          
2. Check machine, node and machine-controller logs

Actual results:
The machine was created, but no instance joined the cluster; the machine-controller log shows "panic: runtime error: invalid memory address or nil pointer dereference".

$ oc describe machine xxia-0815-g492t-worker-ap-southeast-1a-a
Status:
  Addresses:
    Address:     10.0.139.44
    Type:        InternalIP
    Address:     
    Type:        ExternalDNS
    Address:     ip-10-0-139-44.ap-southeast-1.compute.internal
    Type:        InternalDNS
  Last Updated:  2019-08-16T05:05:39Z
  Provider Status:
    API Version:  awsproviderconfig.openshift.io/v1beta1
    Conditions:
      Last Probe Time:       2019-08-16T05:05:19Z
      Last Transition Time:  2019-08-16T05:05:19Z
      Message:               machine successfully created
      Reason:                MachineCreationSucceeded
      Status:                True
      Type:                  MachineCreation
    Instance Id:             i-0d55cfbbbd8f2012e
    Instance State:          running
    Kind:                    AWSMachineProviderStatus

$ oc logs -f machine-api-controllers-5c75c64997-d4kdh -c machine-controller
I0816 05:05:39.598781       1 actuator.go:451] xxia-0815-g492t-worker-ap-southeast-1a-a: found 1 running instances for machine
E0816 05:05:39.606348       1 runtime.go:69] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/sigs.k8s.io/cluster-api-provider-aws/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:76
/go/src/sigs.k8s.io/cluster-api-provider-aws/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/go/src/sigs.k8s.io/cluster-api-provider-aws/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/asm_amd64.s:522
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/panic.go:513
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/panic.go:82
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/signal_unix.go:390
/go/src/sigs.k8s.io/cluster-api-provider-aws/pkg/actuators/machine/actuator.go:469
/go/src/sigs.k8s.io/cluster-api-provider-aws/vendor/github.com/openshift/cluster-api/pkg/controller/machine/controller.go:251
/go/src/sigs.k8s.io/cluster-api-provider-aws/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:210
/go/src/sigs.k8s.io/cluster-api-provider-aws/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158
/go/src/sigs.k8s.io/cluster-api-provider-aws/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152
/go/src/sigs.k8s.io/cluster-api-provider-aws/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153
/go/src/sigs.k8s.io/cluster-api-provider-aws/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
/opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/asm_amd64.s:1333
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x138406e]

goroutine 256 [running]:
sigs.k8s.io/cluster-api-provider-aws/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
        /go/src/sigs.k8s.io/cluster-api-provider-aws/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x108
panic(0x15a3b40, 0x28df610)
        /opt/rh/go-toolset-1.11/root/usr/lib/go-toolset-1.11-golang/src/runtime/panic.go:513 +0x1b9
sigs.k8s.io/cluster-api-provider-aws/pkg/actuators/machine.(*Actuator).Update(0xc000522840, 0x19a09c0, 0xc000040118, 0x0, 0xc0005d4b00, 0x1, 0x0)
        /go/src/sigs.k8s.io/cluster-api-provider-aws/pkg/actuators/machine/actuator.go:469 +0x9ae
sigs.k8s.io/cluster-api-provider-aws/vendor/github.com/openshift/cluster-api/pkg/controller/machine.(*ReconcileMachine).Reconcile(0xc0000b02d0, 0xc000a98220, 0x15, 0xc00096c420, 0x28, 0x28f4b00, 0x0, 0x0, 0x0)
        /go/src/sigs.k8s.io/cluster-api-provider-aws/vendor/github.com/openshift/cluster-api/pkg/controller/machine/controller.go:251 +0x8e8
sigs.k8s.io/cluster-api-provider-aws/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0000fc0a0, 0x0)
        /go/src/sigs.k8s.io/cluster-api-provider-aws/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:210 +0x17d
sigs.k8s.io/cluster-api-provider-aws/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1()
        /go/src/sigs.k8s.io/cluster-api-provider-aws/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:158 +0x36
sigs.k8s.io/cluster-api-provider-aws/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc00001f820)
        /go/src/sigs.k8s.io/cluster-api-provider-aws/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x54
sigs.k8s.io/cluster-api-provider-aws/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc00001f820, 0x3b9aca00, 0x0, 0x1844c01, 0xc0001be120)
        /go/src/sigs.k8s.io/cluster-api-provider-aws/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xbe
sigs.k8s.io/cluster-api-provider-aws/vendor/k8s.io/apimachinery/pkg/util/wait.Until(0xc00001f820, 0x3b9aca00, 0xc0001be120)
        /go/src/sigs.k8s.io/cluster-api-provider-aws/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88 +0x4d
created by sigs.k8s.io/cluster-api-provider-aws/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start
        /go/src/sigs.k8s.io/cluster-api-provider-aws/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:157 +0x32a
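
The trace ends in (*Actuator).Update at actuator.go:469, and the 0x0 among the Update arguments suggests a nil cluster pointer (an inference from the trace, not confirmed here). A defensive guard of roughly the following shape, shown only as a sketch under that assumption and not as the actual patch, avoids the crash by tolerating a machine whose cluster label matches no Cluster object:

package main

import (
    "errors"
    "log"
)

// Cluster stands in for the cluster-api Cluster object (hypothetical type).
type Cluster struct {
    Name string
}

// update returns an error instead of dereferencing a nil cluster.
func update(cluster *Cluster) error {
    if cluster == nil {
        return errors.New("no matching Cluster for machine; skipping cluster-scoped logic")
    }
    log.Printf("updating machine for cluster %s", cluster.Name)
    return nil
}

func main() {
    if err := update(nil); err != nil {
        log.Printf("update handled gracefully: %v", err)
    }
}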

Expected results:
The machine controller should handle the machine with the unmatched cluster label gracefully (for example, by logging an error) instead of panicking.

Comment 2 sunzhaohua 2019-08-19 07:52:20 UTC
Verified on 4.2.0-0.nightly-2019-08-18-222019.

Created a machine with an invalid label and checked the machine-controller logs; the controller no longer panics:

$ oc logs -f machine-api-controllers-cf9954644-mbpmr -c machine-controller
I0819 07:22:10.577543       1 controller.go:141] Reconciling Machine "zhsun-7558q-worker-us-east-2a-invalid"
I0819 07:22:10.577678       1 controller.go:310] Machine "zhsun-7558q-worker-us-east-2a-invalid" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
I0819 07:22:10.586231       1 controller.go:141] Reconciling Machine "zhsun-7558q-worker-us-east-2a-invalid"
I0819 07:22:10.586263       1 controller.go:310] Machine "zhsun-7558q-worker-us-east-2a-invalid" in namespace "openshift-machine-api" doesn't specify "cluster.k8s.io/cluster-name" label, assuming nil cluster
I0819 07:22:10.586277       1 actuator.go:481] zhsun-7558q-worker-us-east-2a-invalid: Checking if machine exists
I0819 07:22:10.629424       1 actuator.go:489] zhsun-7558q-worker-us-east-2a-invalid: Instance does not exist
I0819 07:22:10.629461       1 controller.go:259] Reconciling machine object zhsun-7558q-worker-us-east-2a-invalid triggers idempotent create.
I0819 07:22:10.629475       1 actuator.go:113] zhsun-7558q-worker-us-east-2a-invalid: creating machine
E0819 07:22:10.629677       1 utils.go:191] NodeRef not found in machine zhsun-7558q-worker-us-east-2a-invalid
I0819 07:22:10.639547       1 instances.go:44] No stopped instances found for machine zhsun-7558q-worker-us-east-2a-invalid
I0819 07:22:10.639584       1 instances.go:142] Using AMI ami-06c85f9d106577272
I0819 07:22:10.639592       1 instances.go:74] Describing security groups based on filters
I0819 07:22:10.828290       1 instances.go:119] Describing subnets based on filters
I0819 07:22:12.408425       1 actuator.go:199] zhsun-7558q-worker-us-east-2a-invalid: ProviderID updated at machine spec: aws:///us-east-2a/i-03673861c88d0fe21
I0819 07:22:12.416440       1 actuator.go:579] zhsun-7558q-worker-us-east-2a-invalid: Updating status
I0819 07:22:12.416465       1 actuator.go:623] zhsun-7558q-worker-us-east-2a-invalid: finished calculating AWS status
I0819 07:22:12.416516       1 actuator.go:231] zhsun-7558q-worker-us-east-2a-invalid: machine status has changed, updating
I0819 07:22:12.426455       1 actuator.go:641] zhsun-7558q-worker-us-east-2a-invalid: Instance state still pending, returning an error to requeue
W0819 07:22:12.426489       1 controller.go:261] Failed to create machine "zhsun-7558q-worker-us-east-2a-invalid": requeue in: 20s
I0819 07:22:12.426515       1 controller.go:364] Actuator returned requeue-after error: requeue in: 20s

Comment 3 errata-xmlrpc 2019-10-16 06:36:15 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2922

