Bug 1487573 - Deploy logging 3.7 via ansible, it failed at "Invalid version specified for Elasticsearch".
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.7.0
Assigned To: Jeff Cantrill
QA Contact: Junqi Zhao
Keywords: TestBlocker
Depends On:
Blocks:
Reported: 2017-09-01 06:04 EDT by Junqi Zhao
Modified: 2017-11-28 17:09 EST

See Also:
Fixed In Version:
Doc Type: No Doc Update
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-11-28 17:09:17 EST
Type: Bug


Attachments
ansilbe running log (664.71 KB, text/plain)
2017-09-01 06:04 EDT, Junqi Zhao
es pod log (8.09 KB, text/plain)
2017-09-12 03:00 EDT, Junqi Zhao


External Trackers
Tracker ID Priority Status Summary Last Updated
Github openshift/openshift-ansible/pull/5289 None None None 2017-09-01 15:26 EDT
Red Hat Product Errata RHSA-2017:3188 normal SHIPPED_LIVE Moderate: Red Hat OpenShift Container Platform 3.7 security, bug, and enhancement update 2017-11-28 21:34:54 EST

Description Junqi Zhao 2017-09-01 06:04:51 EDT
Created attachment 1320886
ansilbe running log

Description of problem:
Deploying logging 3.7 via Ansible with openshift_logging_image_version=v3.7 fails with "Invalid version specified for Elasticsearch".

The fact "es_version": "3_7" may be the root cause.
This issue blocks the whole installation.

TASK [openshift_logging_elasticsearch : set_fact] *****************************************************************************************************************************************************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_logging_elasticsearch/tasks/determine_version.yaml:14
ok: [qe-juzhao-37-gce-master-container-etcd-nfs-1.0831-gwf.qe.rhcloud.com] => {
    "ansible_facts": {
        "es_version": "3_7"
    }, 
    "changed": false
}

TASK [openshift_logging_elasticsearch : fail] *********************************************************************************************************************************************************************
task path: /usr/share/ansible/openshift-ansible/roles/openshift_logging_elasticsearch/tasks/determine_version.yaml:17
fatal: [qe-juzhao-37-gce-master-container-etcd-nfs-1.0831-gwf.qe.rhcloud.com]: FAILED! => {
    "changed": false, 
    "failed": true
}

MSG:

Invalid version specified for Elasticsearch
    to retry, use: --limit @/usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-logging.retry

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Deploy logging 3.7 via Ansible, using the inventory file shown under "Additional info".

Actual results:
Deployment fails.

Expected results:
Deployment should be successful.

Additional info:
Inventory file:
[OSEv3:children]
masters

[masters]
${MASTER_URL} openshift_public_hostname=${MASTER_URL}

[OSEv3:vars]
ansible_ssh_user=root
ansible_ssh_private_key_file="~/libra.pem"
deployment_type=openshift-enterprise

# Logging
openshift_logging_install_logging=true
openshift_logging_kibana_hostname=kibana.${SUB_DOMAIN}
public_master_url=https://${MASTER_URL}:8443
openshift_logging_image_prefix=${IMAGE_PREFIX}
openshift_logging_image_version=v3.7
openshift_logging_namespace=logging
Comment 1 Junqi Zhao 2017-09-01 06:06:35 EDT
Version
# rpm -qa | grep openshift-ansible
openshift-ansible-lookup-plugins-3.7.0-0.123.0.git.0.248cba6.el7.noarch
openshift-ansible-3.7.0-0.123.0.git.0.248cba6.el7.noarch
openshift-ansible-callback-plugins-3.7.0-0.123.0.git.0.248cba6.el7.noarch
openshift-ansible-filter-plugins-3.7.0-0.123.0.git.0.248cba6.el7.noarch
openshift-ansible-playbooks-3.7.0-0.123.0.git.0.248cba6.el7.noarch
openshift-ansible-docs-3.7.0-0.123.0.git.0.248cba6.el7.noarch
openshift-ansible-roles-3.7.0-0.123.0.git.0.248cba6.el7.noarch


This issue blocks the whole installation.
Comment 2 Jan Wozniak 2017-09-05 03:10:59 EDT
The list of allowed versions specified only 3.5 and 3.6, but given the master is going to become 3.7, I updated the allowed versions. Perhaps in future releases, when we branch out of master and start working on a new version, we can have a "branch out" script that would automate these chores.

https://github.com/openshift/openshift-ansible/pull/5297
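
The failing check can be pictured roughly as follows. This is a minimal Python sketch, not the actual Ansible role: the function names and the regular expression are assumptions made for illustration. Only the "3_7" fact, the error message, and the 3.5/3.6 allowed list come from this report.

```python
import re

# Allowed lists as described in this report; the underscore form mirrors
# the "es_version": "3_7" fact shown in the Ansible output.
ALLOWED_BEFORE_FIX = ["3_5", "3_6"]
ALLOWED_AFTER_FIX = ["3_5", "3_6", "3_7"]


def normalize_version(image_version):
    """Reduce a tag such as 'v3.7' to a 'major_minor' key such as '3_7'.

    Hypothetical helper: the real role may derive the value differently.
    """
    match = re.match(r"v?(\d+)\.(\d+)", image_version)
    if match is None:
        raise ValueError("Unrecognized image version: %r" % image_version)
    return "%s_%s" % (match.group(1), match.group(2))


def check_es_version(image_version, allowed):
    """Return the normalized version, or raise with the message from this bug."""
    es_version = normalize_version(image_version)
    if es_version not in allowed:
        raise ValueError("Invalid version specified for Elasticsearch")
    return es_version
```

With the pre-fix list, check_es_version("v3.7", ALLOWED_BEFORE_FIX) raises; with "3_7" added to the list, the same call returns the normalized key.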
Comment 3 Jan Wozniak 2017-09-05 07:58:17 EDT
It seems Jeff has already resolved this; I missed his comment.
Comment 4 Xia Zhao 2017-09-08 02:13:17 EDT
The bug fix is not in openshift-ansible-playbooks-3.7.0-0.125.0.git.0.91043b6.el7.noarch
Comment 5 openshift-github-bot 2017-09-08 11:16:08 EDT
Commit pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/d2bf958251e4092ba90218cd3cc20621483b3057
bug 1487573. Bump the allowed ES versions
Comment 6 Junqi Zhao 2017-09-10 21:31:54 EDT
The fix is in openshift-ansible-3.7.0-0.125.1, but the latest available openshift-ansible version is openshift-ansible-3.7.0-0.125.0. I will verify this defect once we get openshift-ansible packages that contain the fix.
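
To check whether an installed package is new enough to carry the fix, the numeric release fields can be compared. The sketch below is a simplification under stated assumptions: it compares only dotted numeric releases such as "0.125.0" and does not implement full RPM version comparison (suffixes like ".git.0.91043b6.el7" must be stripped first). The function names are hypothetical.

```python
def release_key(release):
    """Split a numeric release string such as '0.125.1' into a tuple of ints."""
    return tuple(int(part) for part in release.split("."))


def contains_fix(installed_release, fixed_release):
    """True when the installed release is at or beyond the release with the fix."""
    return release_key(installed_release) >= release_key(fixed_release)
```

Here contains_fix("0.125.0", "0.125.1") is false, which matches Comment 4, while "0.126.0" passes.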
Comment 7 Junqi Zhao 2017-09-12 02:56:47 EDT
Tested with openshift-ansible-3.7.0-0.126.0 and 
logging-elasticsearch:v3.7.0-0.125.0.0

It no longer fails at "Invalid version specified for Elasticsearch", but the ES pod failed to start due to java.lang.IllegalArgumentException:
# oc logs logging-es-data-master-hm7cbr4f-1-tp76j
[2017-09-12 06:53:54,431][INFO ][container.run            ] Begin Elasticsearch startup script
[2017-09-12 06:53:54,546][INFO ][container.run            ] Comparing the specified RAM to the maximum recommended for Elasticsearch...
[2017-09-12 06:53:54,548][INFO ][container.run            ] Inspecting the maximum RAM available...
[2017-09-12 06:53:54,595][INFO ][container.run            ] ES_HEAP_SIZE: '512m'
[2017-09-12 06:53:54,597][INFO ][container.run            ] Setting heap dump location /elasticsearch/persistent/heapdump.hprof
[2017-09-12 06:53:54,602][INFO ][container.run            ] Checking if Elasticsearch is ready on https://localhost:9200
Exception in thread "main" java.lang.IllegalArgumentException: Unknown Discovery type [kubernetes]
	at org.elasticsearch.discovery.DiscoveryModule.configure(DiscoveryModule.java:100)
	at <<<guice>>>
	at org.elasticsearch.node.Node.<init>(Node.java:213)
	at org.elasticsearch.node.Node.<init>(Node.java:140)
	at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:143)
	at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:194)
	at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:286)
	at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:45)
Refer to the log for complete error details.
Comment 8 Junqi Zhao 2017-09-12 03:00 EDT
Created attachment 1324725
es pod log
Comment 9 Jan Wozniak 2017-09-12 03:20:46 EDT
3.7 should include the new discovery mechanism for ES pods; it was merged into master on Thursday and Friday.

brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-elasticsearch   latest              9e932136598b        23 hours ago        434.4 MB
brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-elasticsearch   v3.7                9e29edfea4b3        6 days ago          434.4 MB

From what I see in the brew registry, the 'latest' image already has it, the 'v3.7' does not but ansible 'master' branch requires it. 

We should update the elasticsearch:v3.7 image
Comment 11 Jeff Cantrill 2017-09-13 11:49:11 EDT
Moving back to ON_QA since the 'new' issue in comment 7 is separate from the reported issue.  Please use older 3.7 images to validate that the ansible fix resolves the problem.
Comment 12 Jan Wozniak 2017-09-13 12:14:41 EDT
Jeff, on the contrary, using an older 3.7 image doesn't help because it is already too old. We need either a new 3.7 image matching the 3.7 openshift-ansible, or QE to temporarily set this in openshift-ansible:

roles/openshift_logging_elasticsearch/templates/elasticsearch.yml.j2

cloud:
  kubernetes:
    service: ${SERVICE_DNS}
Comment 13 Junqi Zhao 2017-09-14 00:21:04 EDT
Closing this defect as VERIFIED since the original issue is fixed (see Comment 7). The new issue in Comment 7 is currently blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1491171; I will open a separate defect if the error still exists once BZ #1491171 is fixed.
Comment 15 Jeff Cantrill 2017-09-18 17:04:17 EDT
This most likely is a result of the images being updated from under the deployment without re-running ansible.  This can occur if the pull policy for images is set to always pull and the inventory does not explicitly set logging image versions.  

If the logging-elasticsearch configmap does not have this section:

    cloud:
      kubernetes:
        pod_label: ${POD_LABEL}
        pod_port: 9300
        namespace: ${NAMESPACE}

then most likely the problem is that they have old configs but a new readiness probe.  They should be able to work around it by updating the configmap to:


cloud:
  kubernetes:
    service: ${SERVICE_DNS}
    namespace: ${NAMESPACE}
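
A quick way to tell which form a deployment has is to inspect the parsed configuration for the pod-based keys. The following is a minimal sketch assuming the elasticsearch.yml content has already been loaded into a Python dict (for example with a YAML parser); the function name and the constant are hypothetical, and only the key names come from the snippets above.

```python
# Keys of the pod-based discovery section quoted in this comment.
POD_DISCOVERY_KEYS = ("pod_label", "pod_port", "namespace")


def has_pod_discovery_section(es_config):
    """True when cloud.kubernetes carries all of the pod-based discovery keys."""
    cloud = es_config.get("cloud") or {}
    kubernetes = cloud.get("kubernetes") or {}
    return all(key in kubernetes for key in POD_DISCOVERY_KEYS)
```

A config that carries only service and namespace under cloud.kubernetes returns False here, which is the situation this comment describes.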
Comment 19 errata-xmlrpc 2017-11-28 17:09:17 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188
