Bug 1487573
Summary: | Deploy logging 3.7 via ansible, it failed at "Invalid version specified for Elasticsearch". | |
---|---|---|---
Product: | OpenShift Container Platform | Reporter: | Junqi Zhao <juzhao>
Component: | Logging | Assignee: | Jeff Cantrill <jcantril>
Status: | CLOSED ERRATA | QA Contact: | Junqi Zhao <juzhao>
Severity: | high | Docs Contact: |
Priority: | high | |
Version: | 3.7.0 | CC: | aos-bugs, erich, javier.ramirez, jwozniak, rmeggins, wsun, xiazhao, xtian
Target Milestone: | --- | Keywords: | TestBlocker
Target Release: | 3.7.0 | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | No Doc Update
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2017-11-28 22:09:17 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Attachments: | | |

Description
Junqi Zhao
2017-09-01 10:04:51 UTC
Version

    # rpm -qa | grep openshift-ansible
    openshift-ansible-lookup-plugins-3.7.0-0.123.0.git.0.248cba6.el7.noarch
    openshift-ansible-3.7.0-0.123.0.git.0.248cba6.el7.noarch
    openshift-ansible-callback-plugins-3.7.0-0.123.0.git.0.248cba6.el7.noarch
    openshift-ansible-filter-plugins-3.7.0-0.123.0.git.0.248cba6.el7.noarch
    openshift-ansible-playbooks-3.7.0-0.123.0.git.0.248cba6.el7.noarch
    openshift-ansible-docs-3.7.0-0.123.0.git.0.248cba6.el7.noarch
    openshift-ansible-roles-3.7.0-0.123.0.git.0.248cba6.el7.noarch

This issue blocks the whole installation.

The list of allowed versions specified only 3.5 and 3.6, but given that master is going to become 3.7, I updated the allowed versions. Perhaps in future releases, when we branch out of master and start working on a new version, we can have a "branch out" script that would automate these chores.
https://github.com/openshift/openshift-ansible/pull/5297

It seems to have been resolved by Jeff already; I missed his comment.

The bug fix is not in openshift-ansible-playbooks-3.7.0-0.125.0.git.0.91043b6.el7.noarch.

Commit pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/d2bf958251e4092ba90218cd3cc20621483b3057
bug 1487573. Bump the allowed ES versions

Fixed in openshift-ansible-3.7.0-0.125.1. The latest openshift-ansible version is openshift-ansible-3.7.0-0.125.0; I will verify this defect when we get the fixed openshift-ansible packages.

Tested with openshift-ansible-3.7.0-0.126.0 and logging-elasticsearch:v3.7.0-0.125.0.0. It no longer fails at "Invalid version specified for Elasticsearch", but the es pod failed to start up due to java.lang.IllegalArgumentException:

    # oc logs logging-es-data-master-hm7cbr4f-1-tp76j
    [2017-09-12 06:53:54,431][INFO ][container.run ] Begin Elasticsearch startup script
    [2017-09-12 06:53:54,546][INFO ][container.run ] Comparing the specified RAM to the maximum recommended for Elasticsearch...
    [2017-09-12 06:53:54,548][INFO ][container.run ] Inspecting the maximum RAM available...
    [2017-09-12 06:53:54,595][INFO ][container.run ] ES_HEAP_SIZE: '512m'
    [2017-09-12 06:53:54,597][INFO ][container.run ] Setting heap dump location /elasticsearch/persistent/heapdump.hprof
    [2017-09-12 06:53:54,602][INFO ][container.run ] Checking if Elasticsearch is ready on https://localhost:9200
    Exception in thread "main" java.lang.IllegalArgumentException: Unknown Discovery type [kubernetes]
        at org.elasticsearch.discovery.DiscoveryModule.configure(DiscoveryModule.java:100)
        at <<<guice>>>
        at org.elasticsearch.node.Node.<init>(Node.java:213)
        at org.elasticsearch.node.Node.<init>(Node.java:140)
        at org.elasticsearch.node.NodeBuilder.build(NodeBuilder.java:143)
        at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:194)
        at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:286)
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:45)
    Refer to the log for complete error details.

Created attachment 1324725
es pod log
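
For readers unfamiliar with the original failure, the "Invalid version specified for Elasticsearch" message is the kind of error an allow-list check produces when the list has not been bumped for a new release. The following is only a minimal sketch of that idea in Python. The names `ALLOWED_ES_VERSIONS`, `major_minor`, and `check_es_version` are hypothetical and are not taken from openshift-ansible, whose actual check may be structured differently.

```python
# Hypothetical sketch: these names are illustrative and do not come from
# openshift-ansible. The tuple mirrors the list after 3.7 was added.
ALLOWED_ES_VERSIONS = ("3.5", "3.6", "3.7")


def major_minor(version: str) -> str:
    """Reduce strings such as 'v3.7.0' or '3.7' to 'major.minor'."""
    parts = version.strip().lstrip("v").split(".")
    if len(parts) < 2 or not all(p.isdigit() for p in parts[:2]):
        raise ValueError(f"Unparseable version: {version!r}")
    return ".".join(parts[:2])


def check_es_version(version: str, allowed=ALLOWED_ES_VERSIONS) -> str:
    """Return the normalized version, or raise if it is not on the allow-list."""
    normalized = major_minor(version)
    if normalized not in allowed:
        raise ValueError(
            f"Invalid version specified for Elasticsearch: {normalized} "
            f"(allowed: {', '.join(allowed)})"
        )
    return normalized
```

With the pre-fix list, `check_es_version("3.7", allowed=("3.5", "3.6"))` raises `ValueError`, which is the shape of the failure reported here. Adding "3.7" to the list lets the same call pass.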
3.7 should include the new discovery mechanism for ES pods; it was merged into master on Thursday and Friday.

    brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-elasticsearch   latest   9e932136598b   23 hours ago   434.4 MB
    brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/logging-elasticsearch   v3.7     9e29edfea4b3   6 days ago     434.4 MB

From what I see in the brew registry, the 'latest' image already has it; the 'v3.7' image does not, but the ansible 'master' branch requires it. We should update the elasticsearch:v3.7 image.

Moving back to ON_QA since the 'new' issue in comment#7 is separate from the reported image. Please use older 3.7 images to validate that the ansible fix resolves the problem.

Jeff, on the contrary, using an older 3.7 image doesn't help because it is already too old. We either need a new 3.7 image matching 3.7 openshift-ansible, or QE needs to temporarily set this in openshift-ansible roles/openshift_logging_elasticsearch/templates/elasticsearch.yml.j2:

    cloud:
      kubernetes:
        service: ${SERVICE_DNS}

Closing this defect as VERIFIED, since the original issue was fixed (see Comment 7). The new issue in Comment 7 is currently blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1491171; I will open a new defect once BZ #1491171 is fixed if the error still exists.

This is most likely a result of the images being updated from under the deployment without re-running ansible. This can occur if the pull policy for images is set to always pull and the inventory does not explicitly set logging image versions. If the logging-elasticsearch configmap does not have this section:

    cloud:
      kubernetes:
        pod_label: ${POD_LABEL}
        pod_port: 9300
        namespace: ${NAMESPACE}

then most likely the problem is that they have old configs but a new readiness probe. They should be able to work around it by updating the configmap to:

    cloud:
      kubernetes:
        service: ${SERVICE_DNS}
        namespace: ${NAMESPACE}

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2017:3188
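
The configmap inspection described in the comments above could be sketched roughly as follows. This assumes the `elasticsearch.yml` content from the logging-elasticsearch configmap has already been parsed into a Python dict by whatever YAML loader is at hand. The function name `discovery_style` and its return labels are hypothetical and exist only for illustration. The sketch reports which `cloud.kubernetes` keys are present and does not decide which form a given image expects.

```python
# Hypothetical helper: the function name and return labels are illustrative.
def discovery_style(es_config: dict) -> str:
    """Classify the cloud.kubernetes section of a parsed elasticsearch.yml.

    Returns "pod_label" when pod_label, pod_port and namespace are all set,
    "service" when a service key is set, and "missing" otherwise.
    """
    k8s = (es_config.get("cloud") or {}).get("kubernetes") or {}
    if all(key in k8s for key in ("pod_label", "pod_port", "namespace")):
        return "pod_label"
    if "service" in k8s:
        return "service"
    return "missing"
```

For example, a dict equivalent to the workaround section (`service` plus `namespace`) is classified as `"service"`, while an empty or unrelated config is classified as `"missing"`.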