Bug 1503563
Summary: | Logging upgrade from 3.5 to 3.6 fails with "Exception in thread "main" java.lang.IllegalArgumentException: Unknown Discovery type [kubernetes]" | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Peter Portante <pportant> |
Component: | Logging | Assignee: | Jan Wozniak <jwozniak> |
Status: | CLOSED ERRATA | QA Contact: | Anping Li <anli> |
Severity: | high | Docs Contact: | |
Priority: | high | ||
Version: | 3.6.0 | CC: | aos-bugs, jcantril, jwozniak, pdwyer, rmeggins, rromerom, stwalter, tatanaka, xtian |
Target Milestone: | --- | Keywords: | OpsBlocker |
Target Release: | 3.6.z | ||
Hardware: | All | ||
OS: | All | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | No Doc Update | |
Doc Text: | |
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2017-12-07 07:13:19 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Peter Portante
2017-10-18 12:01:10 UTC
Jan, I know you responded in the email, but can you provide the information here? Is this a result of new images being pulled without being officially deployed via ansible? Is there some other scenario that can lead us to this situation?

This occurs when trying to deploy an outdated ES image with the newest ansible (after label discovery and the readiness probe were merged). At the time of writing, there are still a couple of images in our registries that deserve a rebuild. The label discovery and readiness probe were merged in the second half of September; they should be contained in 3.6 and will be contained in 3.7 once it is released.
1) https://access.redhat.com/containers/?tab=tags#/registry.access.redhat.com/openshift3/logging-elasticsearch Here the 3.6 tag and latest have not been rebuilt in two months.
2) https://hub.docker.com/r/openshift/origin-logging-elasticsearch/tags/ Here latest contains the proper library, but 3.6 has lacked an update for three months.
A fast fix could be either to get an ES image built after mid-September, when the feature was merged, or to remove the readiness probe from ES. https://github.com/openshift/openshift-ansible/issues/5497#issuecomment-331372471

If the openshift-ansible logging tasks are designed to work with a certain version of the logging images, why aren't those tasks requiring that minimum version be used?

I am not sure how to require a 3.6 image built after mid-September. The readiness probe was requested to be backported to 3.6, but images containing this functionality haven't been rebuilt yet.

*** Bug 1505860 has been marked as a duplicate of this bug. ***

While it is certainly good to make a short-term release to address this problem, the long-term problem is that, for any number of valid reasons, the openshift-ansible playbooks can be told to install a version of OpenShift with which those playbooks are not compatible. This fact appears to be the core problem. We need to engineer a way for the playbooks and images to work together to avoid these kinds of problems. Please find a way to track this need via another BZ, an upstream issue in the repos, or a Trello card.

From the test result, the readiness probe will not be added to ES. The following scenarios pass:
1) openshift-ansible:v3.6.173.0.49 deploy v3.6.173.0.5
2) openshift-ansible:v3.6.173.0.49 deploy v3.6.173.0.49
3) openshift-ansible:v3.6.173.0.49 upgrade logging from 3.5.0 to v3.6.173.0.49
4) openshift-ansible:v3.6.173.0.49 upgrade logging from the current latest release images (v3.6.173.0.5: elasticsearch, v3.6.173.0.21: fluentd/kibana/auth-proxy) to v3.6.173.0.49

Please ignore comment 9; I used the image openshift3/ose-ansible:v3.6.173.0.49. I found that openshift3/ose-ansible:v3.6.173.0.49 is built with openshift-ansible-v3.6.173.0.5. The following scenarios pass testing with openshift-ansible-v3.6.173.0.59, so I am moving the bug to verified.
Scenario 1) Deploy Logging v3.6.173.0.49 on OCP v3.6.173.0.49
Scenario 2) Upgrade Logging 3.5.0 deployed by openshift-ansible-3.5.132 to OCP v3.6.173.0.49 on OCP v3.6.173.0.49
By the way, if you want to deploy Elasticsearch:v3.6.173.0.5, you must use openshift-ansible-3.7.0.21 or earlier.

This will provide a partial solution until we have a better way: https://github.com/openshift/origin-aggregated-logging/pull/758

Relevant information on how to revert the discovery mechanism or disable the probe: https://github.com/openshift/openshift-ansible/issues/5497#issuecomment-331372471

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:3389

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days
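
The question raised in this thread, why the logging tasks do not require a minimum image version, comes down to comparing version tags numerically before deploying. Below is a minimal sketch in Python, assuming tags shaped like `v3.6.173.0.49`. The function names and the threshold value are hypothetical; they are not taken from openshift-ansible, and the thread does not state which image build first contains the discovery plugin.

```python
import re


def parse_version(tag):
    """Turn a tag such as 'v3.6.173.0.49' into a tuple of ints."""
    match = re.fullmatch(r"v?(\d+(?:\.\d+)*)", tag.strip())
    if match is None:
        raise ValueError(f"unrecognized version tag: {tag!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def image_is_new_enough(image_tag, minimum_tag):
    """Return True when image_tag is at or above minimum_tag."""
    return parse_version(image_tag) >= parse_version(minimum_tag)


# HYPOTHETICAL threshold, for illustration only; the real minimum image
# build is not given in this bug report.
ASSUMED_MINIMUM = "v3.6.173.0.49"

for tag in ("v3.6.173.0.5", "v3.6.173.0.49"):
    status = "ok" if image_is_new_enough(tag, ASSUMED_MINIMUM) else "too old"
    print(tag, status)
```

A plain string comparison would rank `v3.6.173.0.5` above `v3.6.173.0.49`, which is why the tags are converted to integer tuples first.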