Bug 1396366 - logging-deployer pod fails with error 'deploymentconfigs "logging-es-x" not found'
Summary: logging-deployer pod fails with error 'deploymentconfigs "logging-es-x" not found'
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: ewolinet
QA Contact: Gaoyun Pei
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-11-18 08:04 UTC by Gaoyun Pei
Modified: 2017-03-08 18:43 UTC
CC List: 6 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Piping the output of oc process into oc volume no longer created the deployment configuration (DC) as it previously did.
Consequence: The deployer reported that the DC it was about to generate did not exist, and the deployment failed.
Fix: The deployer now pipes the output of oc volume to oc create.
Result: When the deployer attaches a PVC to Elasticsearch at creation time, the DC is created with the PVC mount and the deployer no longer fails.
Clone Of:
Environment:
Last Closed: 2017-01-18 12:55:44 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
Red Hat Product Errata RHBA-2017:0066 (normal, SHIPPED_LIVE): Red Hat OpenShift Container Platform 3.4 RPM Release Advisory, last updated 2017-01-18 17:23:26 UTC

Description Gaoyun Pei 2016-11-18 08:04:09 UTC
Description of problem:
Enable logging deployment with a dynamic PV in the Ansible inventory and run the installer playbook. The logging-deployer pod ends up in an error state with: deploymentconfigs "logging-es-j16gv78u" not found

The same error also occurs when running with an NFS volume.


Version-Release number of selected component (if applicable):
brew-pulp-docker01.x.com:8888/openshift3/logging-deployer      3.4.0               acad3da7b4ad        14 hours ago        762.7 MB



How reproducible:
Always

Steps to Reproduce:
1. Set the following logging deployment options in the inventory file and run the playbook:
openshift_hosted_logging_deploy=true
openshift_hosted_logging_deployer_prefix=brew-pulp-docker01.x.com:8888/openshift3
openshift_hosted_logging_deployer_version=3.4.0
openshift_hosted_logging_storage_kind=dynamic

2. The installer's logging deployment log and the logging-deployer pod log can be found in the attachments.


Actual results:
[root@openshift-137 ~]# oc get pod
NAME                     READY     STATUS    RESTARTS   AGE
logging-deployer-ea3tj   0/1       Error     0          26m

[root@openshift-137 ~]# oc logs --tail=3 logging-deployer-ea3tj
+ oc process logging-es-template
+ oc volume -f - --add --overwrite --name=elasticsearch-storage --type=persistentVolumeClaim --claim-name=logging-es1
error: deploymentconfigs "logging-es-j16gv78u" not found


Expected results:
The logging deployment completes successfully.

Additional info:

Comment 3 Gaoyun Pei 2016-11-18 09:47:47 UTC
Found es-ops-pvc-dynamic=true and ENABLE_OPS_CLUSTER=false both set in the "log of how installer deploy logging" attachment when deploying logging, so this may be an incorrect configuration generated by the openshift-ansible installer. Changing the component to installer to see whether we could avoid this in openshift-ansible.

Comment 4 Scott Dodson 2016-11-18 21:34:11 UTC
I've reproduced this on GCE, though I see no reason for it. Assigning to logging for their review. Feel free to pass this back to me if you see something that's misconfigured.

[root@instance-1 ~]# oc describe -n logging pod logging-deployer-qzx39
Name:                   logging-deployer-qzx39
Namespace:              logging
Security Policy:        anyuid
Node:                   instance-1.c.openshift-gce-devel.internal/10.240.0.17
Start Time:             Fri, 18 Nov 2016 21:24:48 +0000
Labels:                 app=logging-deployer-template
                        logging-infra=deployer
                        provider=openshift
Status:                 Failed
IP:                     10.1.0.19
Controllers:            <none>
Containers:
  deployer:
    Container ID:       docker://9db48a791acdf33c40e8322fee37eb2ecbe717af78a86414c527445ed69bd017
    Image:              registry.ops.openshift.com/openshift3/logging-deployer:3.4.0
    Image ID:           docker-pullable://registry.ops.openshift.com/openshift3/logging-deployer@sha256:3557aceece2f72caa343759d20a1bbc8f9395bdcae58460c19233a7601d24c85
    Port:
    State:              Terminated
      Reason:           Error
      Exit Code:        1
      Started:          Fri, 18 Nov 2016 21:24:51 +0000
      Finished:         Fri, 18 Nov 2016 21:25:14 +0000
    Ready:              False
    Restart Count:      0
    Volume Mounts:
      /etc/deploy from empty (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from logging-deployer-token-i5xhr (ro)
    Environment Variables:
      PROJECT:                          logging (v1:metadata.namespace)
      IMAGE_PREFIX:                     registry.ops.openshift.com/openshift3/
      IMAGE_VERSION:                    3.4.0
      IMAGE_PULL_SECRET:
      INSECURE_REGISTRY:                false
      ENABLE_OPS_CLUSTER:               false
      KIBANA_HOSTNAME:                  kibana.example.com
      KIBANA_OPS_HOSTNAME:              kibana-ops.example.com
      PUBLIC_MASTER_URL:                https://localhost:8443
      MASTER_URL:                       https://kubernetes.default.svc.cluster.local
      ES_INSTANCE_RAM:                  8G
      ES_PVC_SIZE:
      ES_PVC_PREFIX:                    logging-es-
      ES_PVC_DYNAMIC:
      ES_CLUSTER_SIZE:                  1
      ES_NODE_QUORUM:
      ES_RECOVER_AFTER_NODES:
      ES_RECOVER_EXPECTED_NODES:
      ES_RECOVER_AFTER_TIME:            5m
      ES_OPS_INSTANCE_RAM:              8G
      ES_OPS_PVC_SIZE:
      ES_OPS_PVC_PREFIX:                logging-es-ops-
      ES_OPS_PVC_DYNAMIC:
      ES_OPS_CLUSTER_SIZE:
      ES_OPS_NODE_QUORUM:
      ES_OPS_RECOVER_AFTER_NODES:
      ES_OPS_RECOVER_EXPECTED_NODES:
      ES_OPS_RECOVER_AFTER_TIME:        5m
      FLUENTD_NODESELECTOR:             logging-infra-fluentd=true
      ES_NODESELECTOR:
      ES_OPS_NODESELECTOR:
      KIBANA_NODESELECTOR:
      KIBANA_OPS_NODESELECTOR:
      CURATOR_NODESELECTOR:
      CURATOR_OPS_NODESELECTOR:
      MODE:                             install
Conditions:
  Type          Status
  Initialized   True 
  Ready         False 
  PodScheduled  True 
Volumes:
  empty:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  logging-deployer-token-i5xhr:
    Type:       Secret (a volume populated by a Secret)
    SecretName: logging-deployer-token-i5xhr
QoS Class:      BestEffort
Tolerations:    <none>
Events:
  FirstSeen     LastSeen        Count   From                                                    SubobjectPath                   Type            Reason          Message
  ---------     --------        -----   ----                                                    -------------                   --------        ------          -------
  6m            6m              1       {default-scheduler }                                                                    Normal          Scheduled       Successfully assigned logging-deployer-qzx39 to instance-1.c.openshift-gce-devel.internal
  6m            6m              1       {kubelet instance-1.c.openshift-gce-devel.internal}     spec.containers{deployer}       Normal          Pulling         pulling image "registry.ops.openshift.com/openshift3/logging-deployer:3.4.0"
  6m            6m              1       {kubelet instance-1.c.openshift-gce-devel.internal}     spec.containers{deployer}       Normal          Pulled          Successfully pulled image "registry.ops.openshift.com/openshift3/logging-deployer:3.4.0"
  6m            6m              1       {kubelet instance-1.c.openshift-gce-devel.internal}     spec.containers{deployer}       Normal          Created         Created container with docker id 9db48a791acd; Security:[seccomp=unconfined]
  6m            6m              1       {kubelet instance-1.c.openshift-gce-devel.internal}     spec.containers{deployer}       Normal          Started         Started container with docker id 9db48a791acd

Comment 7 Gaoyun Pei 2016-11-22 02:56:49 UTC
Verified this bug with openshift3/logging-deployer:3.4.0, image ID bc7270d46669.

Enabled logging deployment with a dynamic PV in the Ansible inventory; all logging-related pods are running well after installation.
[root@x-nfs-1 ~]# oc get pod
NAME                          READY     STATUS      RESTARTS   AGE
logging-curator-1-nhtz8       1/1       Running     0          4m
logging-deployer-2mnsv        0/1       Completed   0          5m
logging-es-am2jqiw6-1-jwpoq   1/1       Running     0          4m
logging-fluentd-wkatf         1/1       Running     0          2m
logging-fluentd-xpvj6         1/1       Running     0          2m
logging-kibana-1-u3lc9        2/2       Running     0          4m

Comment 8 Xia Zhao 2016-11-22 06:33:06 UTC
@ewolinet Just want to learn more about the original issue -- how do you think this can be manually reproduced/verified outside of ansible? I'm willing to give it a try. Thanks!

Comment 9 ewolinet 2016-11-22 15:03:32 UTC
@Xia,

Sure, this was actually a bug in the deployer and was not specific to Ansible. 

To recreate this you can do the following:

  oc process logging-es-template | oc volume -f - --add --overwrite \
    --name=elasticsearch-storage --type=persistentVolumeClaim \
    --claim-name={some_pvc}
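
With an affected deployer image this fails with the same error as in the original report (deploymentconfigs "logging-es-..." not found), presumably because oc volume tries to look up and modify the DC on the server rather than operating on the piped object, and the DC has not been created yet.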

Comment 10 ewolinet 2016-12-12 15:55:58 UTC
The deployer was trying to pipe from oc process to oc volume and have a DC created; we resolved this by piping the output from oc volume to oc create within the deployer when adding a PVC.
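
For reference, a minimal sketch of the two pipelines described above ({some_pvc} is a placeholder as in comment 9; the -o yaml output flag on oc volume is an assumption about how the modified object is captured, and the actual deployer script may differ):

  # Old approach: oc volume tries to update the DC on the server,
  # which does not exist yet, so the deployer fails
  oc process logging-es-template | \
    oc volume -f - --add --overwrite --name=elasticsearch-storage \
      --type=persistentVolumeClaim --claim-name={some_pvc}

  # Fixed approach: have oc volume print the modified object instead of
  # applying it (assuming an output flag such as -o yaml is available),
  # then pipe that object to oc create so the DC is created with the PVC mount
  oc process logging-es-template | \
    oc volume -f - --add --overwrite --name=elasticsearch-storage \
      --type=persistentVolumeClaim --claim-name={some_pvc} -o yaml | \
    oc create -f -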

Comment 12 errata-xmlrpc 2017-01-18 12:55:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0066

Comment 13 Louis Santillan 2017-02-16 01:16:36 UTC
Two questions: any idea how far back this bug has been present?

Is there a 3.3 ERRATA with a similar solution?

The customer is installing 3.3.1.11 and the Elasticsearch node is looking for the `elasticsearch-storage` PVC, but `oc new-app logging-deployer-template` (and by extension, the Ansible installer) is creating `logging-es-1` (`/etc/ansible/hosts` has the content below; a quick check of this mismatch is sketched after the inventory). Also, I'm in the process of creating a new ticket and validating the above solution.


# Currently logging deployment is disabled by default, enable it by setting this 
openshift_hosted_logging_deploy=true 

# Option B - External NFS Host 
# NFS volume must already exist with path "nfs_directory/_volume_name" on 
# the storage_host. For example, the remote volume path using these 
# options would be "nfs.example.com:/exports/logging" 
openshift_hosted_logging_storage_kind=nfs 
openshift_hosted_logging_storage_access_modes=['ReadWriteOnce'] 
openshift_hosted_logging_storage_host=10.139.63.13 
openshift_hosted_logging_storage_nfs_directory=/testnfs4_NFS_volume/logging 
openshift_hosted_logging_storage_volume_name=logging 
openshift_hosted_logging_storage_volume_size=105Gi 


# Configure loggingPublicURL in the master config for aggregate logging, defaults 
# to https://kibana.{{ openshift_master_default_subdomain }} 
openshift_master_logging_public_url=https://kibana.apps.ocppoc.woodmen.net 

# Configure the number of elastic search nodes, unless you're using dynamic provisioning 
# this value must be 1 
#openshift_hosted_logging_elasticsearch_cluster_size=1 
openshift_hosted_logging_hostname=logging.apps.ocppoc.woodmen.net 

# Configure the prefix and version for the deployer image 
#openshift_hosted_logging_deployer_prefix=registry.example.com:8888/openshift3/ 
#openshift_hosted_logging_deployer_version=3.3.0
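
As a quick way to inspect the mismatch described above, something like the following could be used (illustrative commands; the namespace is assumed to be logging):

  # List the PVCs the installer actually created
  oc get pvc -n logging

  # Show which claimName each logging DC references
  oc get dc -n logging -o yaml | grep -B1 -A1 claimName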

