Bug 1707323 - Elasticsearch operator - panic: assignment to entry in nil map
Summary: Elasticsearch operator - panic: assignment to entry in nil map
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 4.1.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.1.0
Assignee: Josef Karasek
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-05-07 09:38 UTC by Josef Karasek
Modified: 2019-06-04 10:48 UTC (History)
CC List: 4 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:48:34 UTC
Target Upstream Version:




Links
System                  ID                                         Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2019:0758                             None      None    None     2019-06-04 10:48:41 UTC
Github                  openshift elasticsearch-operator pull 133  None      None    None     2019-05-07 09:45:16 UTC

Description Josef Karasek 2019-05-07 09:38:40 UTC
Description of problem:

The operator may attempt to write to an uninitialized (nil) map.

This happens when the managed resource (deployment or statefulset) has no Container.Resources.Limits or Container.Resources.Requests.
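
For reference, a minimal sketch of the failure mode and a nil guard. This is not the operator's code: resourceList is a simplified stand-in for the Kubernetes resource map, and setResource and unsafeWrite are hypothetical helper names used only for illustration.

```go
package main

import "fmt"

// resourceList is a simplified, hypothetical stand-in for a Kubernetes
// resource map (resource name -> quantity).
type resourceList map[string]string

// setResource writes a value into the map, allocating the map first if it
// is nil. Without this guard, writing to a nil map panics at run time.
func setResource(list resourceList, name, quantity string) resourceList {
	if list == nil {
		list = resourceList{}
	}
	list[name] = quantity
	return list
}

// unsafeWrite reproduces the failure: it writes to a nil map and returns
// the message of the recovered panic.
func unsafeWrite() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	var limits resourceList // declared but never initialized, so nil
	limits["memory"] = "2Gi"
	return "no panic"
}

func main() {
	fmt.Println(unsafeWrite())

	var limits resourceList
	limits = setResource(limits, "memory", "2Gi")
	fmt.Println(limits["memory"])
}
```

Reading from a nil map is safe in Go; only the write panics, which is why the crash shows up only when the operator tries to repair the missing entries.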

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Deploy the Elasticsearch operator (EO) and an elasticsearch CR
2. Delete the Container.Resources.Limits (and/or Requests) entry from the managed resource
3. The operator tries to repair the managed resources, but crashes while doing so

Actual results:
elasticsearch-operator/pkg/k8shandler/deployment.go:567
elasticsearch-operator/pkg/k8shandler/deployment.go:84
elasticsearch-operator/pkg/k8shandler/cluster.go:63
elasticsearch-operator/pkg/stub/handler.go:67
...
/usr/lib/golang/src/runtime/asm_amd64.s:2361
panic: assignment to entry in nil map [recovered]
	panic: assignment to entry in nil map

Expected results:
The managed resource is assigned default values when the user supplies none.
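
A rough sketch of the expected behaviour, under stated assumptions: quantityMap and requirements are simplified stand-ins for the Kubernetes types, withDefaults and mergeInto are hypothetical helpers, and the default values are illustrative only. It is not the fix from the linked pull request.

```go
package main

import "fmt"

// quantityMap and requirements are simplified, hypothetical stand-ins for
// the Kubernetes resource list and resource requirements types.
type quantityMap map[string]string

type requirements struct {
	Limits   quantityMap
	Requests quantityMap
}

// mergeInto returns a new map holding the defaults, overridden by any
// entries present in actual. Ranging over a nil map is safe, so a missing
// Limits or Requests section needs no special handling.
func mergeInto(actual, defaults quantityMap) quantityMap {
	merged := make(quantityMap, len(defaults))
	for name, quantity := range defaults {
		merged[name] = quantity
	}
	for name, quantity := range actual {
		merged[name] = quantity
	}
	return merged
}

// withDefaults fills every missing entry in actual from defaults and never
// writes into the caller's (possibly nil) maps.
func withDefaults(actual, defaults requirements) requirements {
	return requirements{
		Limits:   mergeInto(actual.Limits, defaults.Limits),
		Requests: mergeInto(actual.Requests, defaults.Requests),
	}
}

func main() {
	// Illustrative defaults; the real operator defaults may differ.
	defaults := requirements{
		Limits:   quantityMap{"cpu": "1", "memory": "2Gi"},
		Requests: quantityMap{"cpu": "1", "memory": "2Gi"},
	}
	// A container whose Limits and Requests were both deleted.
	fixed := withDefaults(requirements{}, defaults)
	fmt.Println(fixed.Limits["memory"], fixed.Requests["cpu"])
}
```

Building a fresh map instead of mutating the input avoids the nil-map write entirely and leaves user-supplied values untouched.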

Additional info:

Comment 2 Anping Li 2019-05-08 08:56:09 UTC
The resources are added back to CRD/elasticsearch after I delete them, but the elasticsearch deployment could not be created.
Steps:
1. Deploy cluster logging.
2. Disable the cluster-logging-operator by setting --replicas=0:
oc scale deployment cluster-logging-operator --replicas=0
3. Remove the resource items in elasticsearch/elasticsearch.
Wait a while, then check the result:
  No elasticsearch is deployed.
  Error messages are reported [1].
  The elasticsearch resource is shown in [2].



[1] [anli@preserve-anli-slave 41]$ oc logs elasticsearch-operator-796f97d775-qwjbz
time="2019-05-08T08:48:16Z" level=info msg="Go Version: go1.10.8"
time="2019-05-08T08:48:16Z" level=info msg="Go OS/Arch: linux/amd64"
time="2019-05-08T08:48:16Z" level=info msg="operator-sdk Version: 0.0.7"
time="2019-05-08T08:48:16Z" level=info msg="Watching logging.openshift.io/v1, Elasticsearch, , 5000000000"
time="2019-05-08T08:48:23Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:48:32Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:48:41Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:48:49Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:48:58Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:49:07Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"
time="2019-05-08T08:49:16Z" level=error msg="error syncing key (openshift-logging/elasticsearch): Failed to reconcile Elasticsearch deployment spec: Could not create node resource: Deployment.apps \"elasticsearch-cdm-1izqe0zq-1\" is invalid: [spec.selector: Required value, spec.template.metadata.labels: Invalid value: map[string]string(nil): `selector` does not match template `labels`, spec.template.spec.containers: Required value]"

[2]
[anli@preserve-anli-slave 41]$ oc get elasticsearch elasticsearch
NAME            AGE
elasticsearch   11m
[anli@preserve-anli-slave 41]$ oc get elasticsearch elasticsearch -o yaml
apiVersion: logging.openshift.io/v1
kind: Elasticsearch
metadata:
  creationTimestamp: 2019-05-08T08:39:29Z
  generation: 13
  name: elasticsearch
  namespace: openshift-logging
  ownerReferences:
  - apiVersion: logging.openshift.io/v1
    controller: true
    kind: ClusterLogging
    name: instance
    uid: 622741e8-7140-11e9-ba10-0a71760f4148
  resourceVersion: "860820"
  selfLink: /apis/logging.openshift.io/v1/namespaces/openshift-logging/elasticsearches/elasticsearch
  uid: cb4f393a-716c-11e9-be9d-06c585732894
spec:
  managementState: Managed
  nodeSpec:
    image: image-registry.openshift-image-registry.svc:5000/openshift/ose-logging-elasticsearch5:v4.1.0-201905070632
    resources:
      limits:
        cpu: "1"
        memory: 2Gi
      requests:
        cpu: "1"
        memory: 2Gi
  nodes:
  - genUUID: 1izqe0zq
    nodeCount: 1
    resources: {}
    roles:
    - client
    - data
    - master
    storage: {}
  redundancyPolicy: ZeroRedundancy
status:
  clusterHealth: cluster health unknown
  conditions: []
  nodes:
  - conditions:
    - lastTransitionTime: 2019-05-08T08:44:20Z
      message: '0/6 nodes are available: 3 Insufficient cpu, 3 node(s) had taints
        that the pod didn''t tolerate.'
      reason: Unschedulable
      status: "True"
      type: Unschedulable
    deploymentName: elasticsearch-cdm-1izqe0zq-1
    upgradeStatus: {}
  pods:
    client:
      failed: []
      notReady: []
      ready: []
    data:
      failed: []
      notReady: []
      ready: []
    master:
      failed: []
      notReady: []
      ready: []
  shardAllocationEnabled: shard allocation unknown

Comment 3 Anping Li 2019-05-08 09:02:52 UTC
The image is openshift/ose-elasticsearch-operator:v4.1.0-201905071832

Comment 4 Anping Li 2019-05-08 15:03:43 UTC
Comment 2 seems to be a separate issue caused by deployment deletion. Following Josef's steps, the assigned resource sizes are added back after they are deleted. So I will close this one and file a new one.

Comment 6 errata-xmlrpc 2019-06-04 10:48:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

