"Cluster upgrade should maintain a functioning cluster" failed because the Insights operator reported degraded status: "Unable to report: gateway server reported unexpected error code: 415 (request=4dbe8c44218f43c7ad32be69309fd976): ". https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/5976
From the Insights operator pod logs:
I0821 17:59:56.637562 1 insightsclient.go:111] Uploading application/vnd.redhat.openshift.periodic to https://cloud.redhat.com/api/ingress/v1/upload
I0821 18:00:01.266355 1 insightsuploader.go:132] Unable to upload report after 4.62s: gateway server reported unexpected error code: 415 (request=0abffba15e904f23babad5f1b2725ee0):
Looks like the current code to set Media-Type [1] may be insufficient? Or https://cloud.redhat.com/api/ingress/v1/upload is being too picky about what it accepts?
[1]: https://github.com/openshift/insights-operator/blob/915a77d65a9862fa2411fac208e5b477e0f57924/pkg/insights/insightsclient/insightsclient.go#L90
s/Media-Type/Content-Type/
"Unknown" is a better holding component than "Installer"
Bumped to Urgent because this is shutting down 4.2 upgrade CI: https://ci-search-ci-search-next.svc.ci.openshift.org/chart?name=release-.*-upgrade$&search=gateway%20server%20reported%20unexpected%20error%20code:%20415
Filling in here, the upstream server is complaining with logs like:
{"level":"error","ts":1566410401.2626693,"caller":"upload/upload.go:76","msg":"Unable to find file or upload parts","error":"multipart: NextPart: EOF","request_id":"0abffba15e904f23babad5f1b2725ee0"}
The timing of the outage roughly corresponds to [1], although we don't understand how that could be leading to the 415s yet. We're trying to work out the disconnect between the receiving code and the apparently fast uploads from the client:
$ curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/5976/artifacts/e2e-aws-upgrade/must-gather/namespaces/openshift-insights/pods/insights-operator-b9466f584-tj2bb/operator/operator/logs/current.log | grep 'Uploading ' | tail -n2
2019-08-21T18:21:56.106817612Z I0821 18:21:56.106759 1 insightsuploader.go:126] Uploading latest report since 0001-01-01T00:00:00Z
2019-08-21T18:21:56.106899903Z I0821 18:21:56.106858 1 insightsclient.go:111] Uploading application/vnd.redhat.openshift.periodic to https://cloud.redhat.com/api/ingress/v1/upload
[1]: https://github.com/RedHatInsights/uhc-auth-proxy/commit/400b13527667056e403e96fbb8a97fc825598d9e
The client is expected to set the "file" form part at [1]. The server code that chokes on the request and returns the 415 is at [2].
[1]: https://github.com/openshift/insights-operator/blob/915a77d65a9862fa2411fac208e5b477e0f57924/pkg/insights/insightsclient/insightsclient.go#L93
[2]: https://github.com/RedHatInsights/insights-ingress-go/blob/06e05176c610f2b8fe0acb039ad11d0f1765274d/upload/upload.go#L82-L83
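For reference, a minimal sketch of what a multipart upload with a "file" part and a per-part Content-Type looks like in Go. This is not the operator's actual code: buildUpload, firstPart, and the payload.tar.gz filename are hypothetical names chosen for illustration. Only the "file" field name and the application/vnd.redhat.openshift.periodic type come from the logs and comments above.

```go
package main

import (
	"bytes"
	"fmt"
	"mime"
	"mime/multipart"
	"net/textproto"
)

// buildUpload (hypothetical helper) writes payload into a multipart body
// as a single part named "file" carrying the given part Content-Type.
// It returns the body and the Content-Type header value for the request.
func buildUpload(payload []byte, partType string) (*bytes.Buffer, string, error) {
	body := &bytes.Buffer{}
	mw := multipart.NewWriter(body)

	h := make(textproto.MIMEHeader)
	// The filename is an assumption made for this sketch.
	h.Set("Content-Disposition", `form-data; name="file"; filename="payload.tar.gz"`)
	h.Set("Content-Type", partType)

	part, err := mw.CreatePart(h)
	if err != nil {
		return nil, "", err
	}
	if _, err := part.Write(payload); err != nil {
		return nil, "", err
	}
	// Close writes the terminating boundary; without it the body is invalid.
	if err := mw.Close(); err != nil {
		return nil, "", err
	}
	return body, mw.FormDataContentType(), nil
}

// firstPart (hypothetical helper) reads the body back the way a receiver
// would, returning the form name and Content-Type of the first part.
func firstPart(body *bytes.Buffer, contentType string) (string, string, error) {
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return "", "", err
	}
	mr := multipart.NewReader(body, params["boundary"])
	p, err := mr.NextPart()
	if err != nil {
		return "", "", err
	}
	return p.FormName(), p.Header.Get("Content-Type"), nil
}

func main() {
	body, ct, err := buildUpload([]byte("report"), "application/vnd.redhat.openshift.periodic")
	if err != nil {
		panic(err)
	}
	name, partType, err := firstPart(body, ct)
	if err != nil {
		panic(err)
	}
	fmt.Println(name, partType)
}
```

If the body arrives intact, the receiver finds the "file" part and its type, so a 415 on such a request points at something between the client and the receiving code rather than at how the part is built.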
The PR that landed just unblocks CI; it does not fix the underlying problem.
The underlying problem was a change in Akamai handling that led to payload removal from upload requests smaller than ~8 KiB (for example, see this test [1]). Jesse Jaggars is continuing to work on resolving the Akamai issue. Getting that issue resolved so we can revert #7 is still a 4.2 release blocker.
[1]: https://github.com/openshift/insights-operator/pull/9#issuecomment-524419565
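A minimal sketch of why a stripped payload would surface as the "multipart: NextPart: EOF" error seen in the server logs, assuming the receiver uses Go's mime/multipart reader. The readFirstPart and sampleBody helpers and the boundary string are hypothetical; this is not the insights-ingress-go code.

```go
package main

import (
	"fmt"
	"mime/multipart"
	"strings"
)

// sampleBody (hypothetical helper) returns a well-formed multipart body
// with a single "file" part, using the given boundary.
func sampleBody(boundary string) string {
	return "--" + boundary + "\r\n" +
		"Content-Disposition: form-data; name=\"file\"\r\n\r\n" +
		"report\r\n" +
		"--" + boundary + "--\r\n"
}

// readFirstPart (hypothetical helper) mimics a receiver asking for the
// first part of a multipart body and returns whatever error that yields.
func readFirstPart(body, boundary string) error {
	mr := multipart.NewReader(strings.NewReader(body), boundary)
	_, err := mr.NextPart()
	return err
}

func main() {
	const boundary = "example-boundary" // assumption for this sketch

	// Intact body: the first part is found and no error is returned.
	fmt.Println(readFirstPart(sampleBody(boundary), boundary))

	// Body removed in transit while the headers still announce multipart:
	// the reader reaches end of input before seeing any boundary line.
	fmt.Println(readFirstPart("", boundary))
}
```

With an empty body the reader fails before any part is seen, which is consistent with the server logging "Unable to find file or upload parts" and answering 415 even though the client built the request correctly.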
The Akamai config has been fixed, and by 2019-08-27T15:04Z the UploadFailed degradations had all gone away [1]. I've filed [2] to revert the earlier workaround.
[1]: count(cluster_operator_conditions{name="insights",condition="Degraded",reason="UploadFailed"})
[2]: https://github.com/openshift/insights-operator/pull/12
Verified on 4.2.0-0.ci-2019-09-19-043318. Reports are uploaded correctly.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:2922