Bug 1506736
Summary: | Misleading error messages in Heapster log when namespace deleted - Hawkular returns 500 to Heapster | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Mike Fiedler <mifiedle> |
Component: | Hawkular | Assignee: | Jan Martiska <jmartisk> |
Status: | CLOSED ERRATA | QA Contact: | Mike Fiedler <mifiedle> |
Severity: | low | Docs Contact: | |
Priority: | low | ||
Version: | 3.7.0 | CC: | aos-bugs, jmartisk, jsanda, juzhao, lmeyer, mifiedle, mwringe, rvargasp, wsun |
Target Milestone: | --- | ||
Target Release: | 3.11.z | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2019-02-20 14:11:01 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Mike Fiedler
2017-10-26 17:01:03 UTC
This issue only happens if you are a cluster admin or reader and you are requesting metrics for a project which does not exist. Hawkular Metrics see that OpenShift has authorised this user but then cannot find the corresponding project and throws the error message. Its being tracked with https://issues.jboss.org/browse/HWKMETRICS-741 but its considered a low priority issue. Actually, wait, the Heapster issue is something else. In this case Heapster is trying to write metrics to Hawkular Metrics for a project which no longer exists. Is this something you have seen before? Or only after trying Micke test container? I've seen it before with non-patched 3.7. Not sure about 3.6. I think it is definitely when heapster tries to write metrics for the deleted project the first time. It does not continue indefinitely. Maybe a timing window while project is in Terminating state. Low priority in any case. Mike, The changes for HWMETRICS-741 went into OCP 3.9. Do you know if this is still a problem? In OpenShift 3.10.0-0.53.0 I see a pair of messages in the heapster log for each deleted namespace: W0529 21:23:05.204029 1 namespace_based_enricher.go:67] Namespace doesn't exist: clusterproject16 E0529 21:23:05.673293 1 client.go:254] Hawkular returned status code 404, error message: Reply could not be parsed: invalid character '<' looking for beginning of value In the Hawkular log I just see: 2018-05-29 21:23:05,370 WARN [org.hawkular.openshift.auth.org.hawkular.openshift.namespace.NamespaceListener] (default I/O-8) Error getting project metadata. Code 404: Not Found I no longer see the 500. John, What do you think? All that will be done here? If so, go ahead and put this ON_QA. I think the parse error in the Heapster log is ugly, but not a big deal. I agree that the parser error is ugly. Let's leave it open for now. There are some other tickets we have to tackle first, but the changes involved to clean this up some should be minor and low risk. John, I will take this if you don't mind. PR for 0.31.x which is used in OpenShift 3.11: https://github.com/hawkular/hawkular-metrics/pull/994 If needed, I can backport to older versions too... Hi Mike Please help check if this bug could be verified. Thanks! On OCP 3.11.51 I still see the messages: W1205 15:55:05.073712 1 namespace_based_enricher.go:67] Namespace doesn't exist: clusterproject7 E1205 15:55:05.268897 1 client.go:254] Hawkular returned status code 404, error message: Reply could not be parsed: invalid character '<' looking for beginning of value The hawkular version in this build is: Hawkular Metrics version: 0.31.0.Final-redhat-1+72a93bbbf5 @jmartisk Do you expect the fix to be in ^^ level of hawkular? No, this is now merged in 0.31 branch but hasn't been part of any actual release yet, I suppose it will be in 0.31.1.Final. Not sure if MODIFIED is the right state for this as it seems to trigger some automatic transitions?! Should this be in POST? I think MODIFIED is correct since the PR has merged. POST is usually when the PR has been created but not yet merged. On OCP 3.11.59 I still see the messages: E1221 05:17:35.775110 1 client.go:254] Hawkular returned status code 404, error message: Reply could not be parsed: invalid character '<' looking for beginning of value W1221 05:18:35.073278 1 namespace_based_enricher.go:67] Namespace doesn't exist: p6 all metrics images version: v3.11.59-2 Hawkular Metrics version: 0.31.0.Final-redhat-1 Move back to MODIFIED I suggest to remove this defect from errata if we don't want to fix it on time, since it is not a critical issue. This is fixed in image brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/metrics-hawkular-metrics:v3.11.70 (Hawkular Metrics 0.31.1.Final-redhat-1) Marking verified on hawkular v0.31.0.Final-redhat-1+72a93bbbf5 as installed from OpenShift 3.11.69 INFO 2019-01-18 14:42:06,591 [main] org.hawkular.metrics.schema.Installer:logVersion:110 - Hawkular Metrics Schema Installer v0.31.0.Final-redhat-1+72a93bbbf5 registry.reg-aws.openshift.com:443/openshift3/metrics-cassandra v3.11.69 7d57819ca4f9 44 hours ago 596 MB registry.reg-aws.openshift.com:443/openshift3/metrics-heapster v3.11.69 486ca24c0e16 44 hours ago 284 MB registry.reg-aws.openshift.com:443/openshift3/metrics-hawkular-metrics v3.11.69 d5689ada5ca7 44 hours ago 1.6 GB After namespace deletion, heapster logs now contain a reasonable pair of messages: W0118 16:08:35.062221 1 namespace_based_enricher.go:67] Namespace doesn't exist: clusterproject10 E0118 16:08:35.668171 1 client.go:254] Hawkular returned status code 404, error message: Could not determine a namespace id for namespace clusterproject10 Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:0326 |