Bug 1627690 - etcd-operator fails to manage clusters in all namespaces
Summary: etcd-operator fails to manage clusters in all namespaces
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Etcd
Version: 3.11.0
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.11.z
Assignee: Sam Batschelet
QA Contact: ge liu
URL:
Whiteboard:
Depends On:
Blocks: 1659875
Reported: 2018-09-11 09:11 UTC by ge liu
Modified: 2019-10-23 02:52 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1659875 (view as bug list)
Environment:
Last Closed: 2019-04-03 20:48:21 UTC
Target Upstream Version:
Embargoed:



Description ge liu 2018-09-11 09:11:15 UTC
Description of problem:
I tried to enable etcd-operator to manage clusters in all namespaces by adding the following annotation to etcd-cluster.yaml:

annotations:
    etcd.database.coreos.com/scope: clusterwide

After creating the etcd cluster from this file, the etcd cluster pods were not created.

With an etcd-cluster.yaml that omits this annotation, the etcd cluster pods started successfully, as expected.


openshift v3.11.0-0.32.0

How reproducible:
Always

Steps to Reproduce:
1. Create etcd subscription in project: operator-lifecycle-manager

2. Create etcd cluster with file:

apiVersion: "etcd.database.coreos.com/v1beta2"
kind: "EtcdCluster"
metadata:
  name: "example-etcd-cluster"
  annotations:
    etcd.database.coreos.com/scope: clusterwide
spec:
  size: 3
  version: "3.2.13"

3. # oc create -f etcd-cluster.yaml
etcdcluster.etcd.database.coreos.com/example-etcd-cluster created

4. Check that no etcd cluster pods have been started:
# oc get pods
NAME                                READY     STATUS    RESTARTS   AGE
alm-operator-798c765f5c-npn56       1/1       Running   0          45m
catalog-operator-548958ff7f-45z7w   1/1       Running   0          44m
etcd-operator-7b49974f5b-cq899      3/3       Running   0          2m

5. Check the etcd operator pods and logs:

# oc describe pods etcd-operator-7b49974f5b-cq899
Name:               etcd-operator-7b49974f5b-cq899
Namespace:          operator-lifecycle-manager
Priority:           0
PriorityClassName:  <none>
Node:               qe-juzhao-311-gce-1-master-etcd-1/10.240.0.12
Start Time:         Tue, 11 Sep 2018 08:20:21 +0000
Labels:             name=etcd-operator-alm-owned
                    pod-template-hash=3605530916
Annotations:        openshift.io/scc=restricted
Status:             Running
IP:                 10.128.0.16
Controlled By:      ReplicaSet/etcd-operator-7b49974f5b
Containers:
  etcd-operator:
    Container ID:  docker://5c10b99b9c6f401462836621763a2c298dfa5d3b620a5c4a85871e5fafaa1d24
    Image:         quay.io/coreos/etcd-operator@sha256:c0301e4686c3ed4206e370b42de5a3bd2229b9fb4906cf85f3f30650424abec2
    Image ID:      docker-pullable://quay.io/coreos/etcd-operator@sha256:c0301e4686c3ed4206e370b42de5a3bd2229b9fb4906cf85f3f30650424abec2
    Port:          <none>
    Host Port:     <none>
    Command:
      etcd-operator
      --create-crd=false
    State:          Running
      Started:      Tue, 11 Sep 2018 08:20:24 +0000
    Ready:          True
    Restart Count:  0
    Environment:
      MY_POD_NAMESPACE:  operator-lifecycle-manager (v1:metadata.namespace)
      MY_POD_NAME:       etcd-operator-7b49974f5b-cq899 (v1:metadata.name)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from etcd-operator-token-68b99 (ro)
  etcd-backup-operator:
    Container ID:  docker://9cf63973559c77410d6571a7e914c55e6d4174be87807b32ac47e4b94d560648
    Image:         quay.io/coreos/etcd-operator@sha256:c0301e4686c3ed4206e370b42de5a3bd2229b9fb4906cf85f3f30650424abec2
    Image ID:      docker-pullable://quay.io/coreos/etcd-operator@sha256:c0301e4686c3ed4206e370b42de5a3bd2229b9fb4906cf85f3f30650424abec2
    Port:          <none>
    Host Port:     <none>
    Command:
      etcd-backup-operator
      --create-crd=false
    State:          Running
      Started:      Tue, 11 Sep 2018 08:20:24 +0000
    Ready:          True
    Restart Count:  0
    Environment:
      MY_POD_NAMESPACE:  operator-lifecycle-manager (v1:metadata.namespace)
      MY_POD_NAME:       etcd-operator-7b49974f5b-cq899 (v1:metadata.name)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from etcd-operator-token-68b99 (ro)
  etcd-restore-operator:
    Container ID:  docker://82f39a6a7ad91917c3a8bf9a3d80267fd1216e3f06cad44cc89531ab3d55fe2a
    Image:         quay.io/coreos/etcd-operator@sha256:c0301e4686c3ed4206e370b42de5a3bd2229b9fb4906cf85f3f30650424abec2
    Image ID:      docker-pullable://quay.io/coreos/etcd-operator@sha256:c0301e4686c3ed4206e370b42de5a3bd2229b9fb4906cf85f3f30650424abec2
    Port:          <none>
    Host Port:     <none>
    Command:
      etcd-restore-operator
      --create-crd=false
    State:          Running
      Started:      Tue, 11 Sep 2018 08:20:24 +0000
    Ready:          True
    Restart Count:  0
    Environment:
      MY_POD_NAMESPACE:  operator-lifecycle-manager (v1:metadata.namespace)
      MY_POD_NAME:       etcd-operator-7b49974f5b-cq899 (v1:metadata.name)
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from etcd-operator-token-68b99 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  etcd-operator-token-68b99:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  etcd-operator-token-68b99
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  Type    Reason     Age   From                                        Message
  ----    ------     ----  ----                                        -------
  Normal  Scheduled  3m    default-scheduler                           Successfully assigned operator-lifecycle-manager/etcd-operator-7b49974f5b-cq899 to qe-juzhao-311-gce-1-master-etcd-1
  Normal  Pulled     3m    kubelet, qe-juzhao-311-gce-1-master-etcd-1  Container image "quay.io/coreos/etcd-operator@sha256:c0301e4686c3ed4206e370b42de5a3bd2229b9fb4906cf85f3f30650424abec2" already present on machine
  Normal  Created    3m    kubelet, qe-juzhao-311-gce-1-master-etcd-1  Created container
  Normal  Started    3m    kubelet, qe-juzhao-311-gce-1-master-etcd-1  Started container
  Normal  Pulled     3m    kubelet, qe-juzhao-311-gce-1-master-etcd-1  Container image "quay.io/coreos/etcd-operator@sha256:c0301e4686c3ed4206e370b42de5a3bd2229b9fb4906cf85f3f30650424abec2" already present on machine
  Normal  Created    3m    kubelet, qe-juzhao-311-gce-1-master-etcd-1  Created container
  Normal  Started    3m    kubelet, qe-juzhao-311-gce-1-master-etcd-1  Started container
  Normal  Pulled     3m    kubelet, qe-juzhao-311-gce-1-master-etcd-1  Container image "quay.io/coreos/etcd-operator@sha256:c0301e4686c3ed4206e370b42de5a3bd2229b9fb4906cf85f3f30650424abec2" already present on machine
  Normal  Created    3m    kubelet, qe-juzhao-311-gce-1-master-etcd-1  Created container
  Normal  Started    3m    kubelet, qe-juzhao-311-gce-1-master-etcd-1  Started container

# oc logs etcd-operator-7b49974f5b-cq899 -c etcd-operator
time="2018-09-11T08:20:24Z" level=info msg="etcd-operator Version: 0.9.2"
time="2018-09-11T08:20:24Z" level=info msg="Git SHA: a0032c1f"
time="2018-09-11T08:20:24Z" level=info msg="Go Version: go1.10"
time="2018-09-11T08:20:24Z" level=info msg="Go OS/Arch: linux/amd64"
time="2018-09-11T08:20:41Z" level=info msg="Event(v1.ObjectReference{Kind:\"Endpoints\", Namespace:\"operator-lifecycle-manager\", Name:\"etcd-operator\", UID:\"3ffff76d-b59b-11e8-b1e8-42010af0000c\", APIVersion:\"v1\", ResourceVersion:\"81741\", FieldPath:\"\"}): type: 'Normal' reason: 'LeaderElection' etcd-operator-7b49974f5b-cq899 became leader"
[root@qe-juzhao-311-gce-1-master-etcd-1 tmp]# oc logs etcd-operator-7b49974f5b-cq899 -c etcd-operator -f
time="2018-09-11T08:20:24Z" level=info msg="etcd-operator Version: 0.9.2"
time="2018-09-11T08:20:24Z" level=info msg="Git SHA: a0032c1f"
time="2018-09-11T08:20:24Z" level=info msg="Go Version: go1.10"
time="2018-09-11T08:20:24Z" level=info msg="Go OS/Arch: linux/amd64"
time="2018-09-11T08:20:41Z" level=info msg="Event(v1.ObjectReference{Kind:\"Endpoints\", Namespace:\"operator-lifecycle-manager\", Name:\"etcd-operator\", UID:\"3ffff76d-b59b-11e8-b1e8-42010af0000c\", APIVersion:\"v1\", ResourceVersion:\"81741\", FieldPath:\"\"}): type: 'Normal' reason: 'LeaderElection' etcd-operator-7b49974f5b-cq899 became leader"
^C
[root@qe-juzhao-311-gce-1-master-etcd-1 tmp]# oc logs etcd-operator-7b49974f5b-cq899 -c etcd-operator -f -c etcd-backup-operator
time="2018-09-11T08:20:24Z" level=info msg="Go Version: go1.10"
time="2018-09-11T08:20:24Z" level=info msg="Go OS/Arch: linux/amd64"
time="2018-09-11T08:20:24Z" level=info msg="etcd-backup-operator Version: 0.9.2"
time="2018-09-11T08:20:24Z" level=info msg="Git SHA: a0032c1f"
time="2018-09-11T08:20:41Z" level=info msg="Event(v1.ObjectReference{Kind:\"Endpoints\", Namespace:\"operator-lifecycle-manager\", Name:\"etcd-backup-operator\", UID:\"494f0e20-b59a-11e8-b1e8-42010af0000c\", APIVersion:\"v1\", ResourceVersion:\"81744\", FieldPath:\"\"}): type: 'Normal' reason: 'LeaderElection' etcd-operator-7b49974f5b-cq899 became leader"
time="2018-09-11T08:20:41Z" level=info msg="starting backup controller" pkg=controller
^C
[root@qe-juzhao-311-gce-1-master-etcd-1 tmp]# oc logs etcd-operator-7b49974f5b-cq899 -c etcd-operator -f -c etcd-restore-operator
time="2018-09-11T08:20:24Z" level=info msg="Go Version: go1.10"
time="2018-09-11T08:20:24Z" level=info msg="Go OS/Arch: linux/amd64"
time="2018-09-11T08:20:24Z" level=info msg="etcd-restore-operator Version: 0.9.2"
time="2018-09-11T08:20:24Z" level=info msg="Git SHA: a0032c1f"
time="2018-09-11T08:20:42Z" level=info msg="listening on 0.0.0.0:19999"
time="2018-09-11T08:20:42Z" level=info msg="Event(v1.ObjectReference{Kind:\"Endpoints\", Namespace:\"operator-lifecycle-manager\", Name:\"etcd-restore-operator\", UID:\"4985b50b-b59a-11e8-b1e8-42010af0000c\", APIVersion:\"v1\", ResourceVersion:\"81747\", FieldPath:\"\"}): type: 'Normal' reason: 'LeaderElection' etcd-operator-7b49974f5b-cq899 became leader"
time="2018-09-11T08:20:42Z" level=info msg="starting restore controller" pkg=controller



6. Try again with an etcd-cluster.yaml that does not set the 'clusterwide' annotation; it works as expected.


Actual results:

As title

Expected results:

The etcd cluster pods should start when the 'clusterwide' annotation is set in etcd-cluster.yaml.
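
A possible explanation, not confirmed anywhere in this report: the upstream etcd-operator documentation describes clusterwide mode as needing two things, the `etcd.database.coreos.com/scope: clusterwide` annotation on the EtcdCluster and a `-cluster-wide` argument on the operator itself. The `oc describe` output in step 5 shows the operator started with only `--create-crd=false`. The Python sketch below models that matching rule as an assumption; `has_cluster_wide_flag` and `is_managed` are hypothetical helper names and are not part of etcd-operator.

```python
# Assumed model of how an operator decides whether to manage an EtcdCluster.
# This is an illustration of the documented pairing rule, not operator code.

SCOPE_ANNOTATION = "etcd.database.coreos.com/scope"
CLUSTER_WIDE = "clusterwide"


def has_cluster_wide_flag(command):
    """Return True if the container command carries a bare cluster-wide flag."""
    return any(arg.lstrip("-") == "cluster-wide" for arg in command)


def is_managed(annotations, operator_cluster_wide):
    """Assumed rule: a clusterwide operator manages only annotated clusters,
    and a namespaced operator manages only clusters without the annotation."""
    if operator_cluster_wide:
        return annotations.get(SCOPE_ANNOTATION) == CLUSTER_WIDE
    return SCOPE_ANNOTATION not in annotations


# Command copied from the `oc describe` output in step 5.
operator_command = ["etcd-operator", "--create-crd=false"]
wide = has_cluster_wide_flag(operator_command)

annotated = {SCOPE_ANNOTATION: CLUSTER_WIDE}
print(is_managed(annotated, wide))  # annotated cluster: False under this model
print(is_managed({}, wide))         # unannotated cluster: True under this model
```

Under this assumed rule, the two outcomes match what steps 4 and 6 observed: the annotated cluster is ignored and the unannotated one is reconciled. Whether this is what actually happened here would need to be checked against the operator deployment.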

Comment 3 ge liu 2019-04-01 06:59:20 UTC
@Sam Batschelet, yes, I tried this feature and the bug is fixed in cluster version 4.0.0-0.nightly-2019-03-28-030453. This bug can be verified and closed.

