Description of problem:
SmartState analysis for the OpenShift provider fails.

Version-Release number of selected component (if applicable):
5.6.2.1

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
I have processed the CFME logs and isolated the process logs for the two items identified in the case description.

From the logs associated with dz2lrcfp405.divbiz.net:
the refresh worker process 30183 => HTTP error:
Unexpected Exception during refresh: HTTP status code 403, User "system:serviceaccount:management-infra:management-admin" cannot list all componentstatuses in the cluster
and two smartstate worker processes, 29011 and 29019, both encountering errors, but not while processing any smartstate message.

===================================

From the logs associated with dz2lrcfp404.divbiz.net:
no EMS refresh error in this log set. Several smartstate scanning errors were captured in the worker process logs for PIDs 11690, 11698, 11706, and 11714.

====================================

From the logs associated with dz2lrcfp403.divbiz.net:
the refresh worker processes 25983, 11986, 14085 (and others) => HTTP error, and several smartstate scanning errors were captured in the worker process logs for PIDs 11730 and 11714.

I have collected the OpenShift logs into the location http://file.rdu.redhat.com/~thenness/SF-01720000-CFME-OpenShift/ under the directory "OpenShift Materials", and the CFME logs with the extracted processes noted above in the same http://file.rdu.redhat.com/~thenness/SF-01720000-CFME-OpenShift/ location under the directory "CFME Materials".
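For anyone repeating the per-process log isolation described above, a minimal Python sketch follows. The evm.log line layout in the comment is an assumption (it is not copied from the attached logs), and `split_by_pid` is a hypothetical helper name, not part of CFME.

```python
import re
from collections import defaultdict

# Assumed evm.log line shape (an assumption, not taken from the attached logs):
# [----] E, [2017-04-03T10:15:42.123456 #30183:3fd1234] ERROR -- : message
LINE_RE = re.compile(
    r"\[[^\]]*#(?P<pid>\d+):[0-9a-f]+\]\s+(?P<level>[A-Z]+)\s+-- :"
)


def split_by_pid(lines, pids):
    """Group log lines by worker PID, keeping only the PIDs of interest.

    Lines that do not match the assumed header format are skipped.
    """
    wanted = {str(p) for p in pids}
    grouped = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("pid") in wanted:
            grouped[match.group("pid")].append(line.rstrip("\n"))
    return dict(grouped)
```

Calling `split_by_pid(open("evm.log"), [30183, 29011, 29019])` would then return one list of lines per worker process. Continuation lines without a header (for example backtraces) are dropped by this sketch and would need extra handling.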
Erez,

Thank you for responding. I don't know if you missed the reference in the original text:

I have collected the OpenShift logs into the location http://file.rdu.redhat.com/~thenness/SF-01720000-CFME-OpenShift/ under the directory "OpenShift Materials", and the CFME logs with the extracted processes noted above in the same http://file.rdu.redhat.com/~thenness/SF-01720000-CFME-OpenShift/ location under the directory "CFME Materials".

There are three CFME instances in this environment.

As I mentioned before, I have zero experience and/or training with OpenShift, so when you say *you* need more information, *I* need to know exactly what *you* need, since my role in this is only to pass along to the customer what *you* need to resolve this issue.

Please advise.

Tom Hennessy
Erez, any update from your side? Do you need any other information from the customer?
Erez, please find the logs attached.
Martin, I have created a PR [1] to fix this problem, but I am still not sure how to replicate it on my own. I would appreciate it if you could explain exactly how this proxy is defined.

[1] https://github.com/ManageIQ/manageiq/pull/14578
I have tested the modified scan job from the PR in our environment successfully.
Hi Erez! I am looking for a way to replicate and check this issue. From what I understand, the problem is related to the specific customer's proxy settings, which change the error messages coming from OpenShift and so cause a problem with the healthz status poll of the image inspector pod. Given that, do you have any idea how I could replicate this problem here?
I never succeeded in replicating it myself. I would have tried by creating a proxy between ManageIQ and OpenShift that changes 404 messages to 500 (HTTPBadRequest). Also note that the fix was backported to Fine, so to reproduce it you will have to use an early Fine version.
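A rough Python sketch of the kind of status-rewriting proxy suggested above, for anyone who wants to attempt a reproduction. It is untested against OpenShift or ManageIQ; the `UPSTREAM` address, the listening port, and the names `rewrite_status`, `RewritingProxy`, and `serve` are all hypothetical. It handles plain-HTTP GET requests only and does not forward request headers or credentials, so a real cluster would need TLS and authentication added.

```python
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical upstream address; replace with the API endpoint under test.
UPSTREAM = "http://127.0.0.1:8001"


def rewrite_status(status):
    """Map an upstream 404 to 500; pass every other status through."""
    return 500 if status == 404 else status


class RewritingProxy(BaseHTTPRequestHandler):
    """Forward GET requests upstream and rewrite the response status."""

    def do_GET(self):
        try:
            with urllib.request.urlopen(UPSTREAM + self.path) as resp:
                status, body = resp.status, resp.read()
        except urllib.error.HTTPError as err:
            # urllib raises on 4xx/5xx; keep the upstream code and body.
            status, body = err.code, err.read()
        self.send_response(rewrite_status(status))
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the proxy on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), RewritingProxy).serve_forever()
```

Calling `serve()` and pointing the provider at `127.0.0.1:8080` would, under these assumptions, make every "not found" answer from the upstream arrive as a 500.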
Verified. After discussion with dev+PM, this BZ cannot be reproduced. Should the issue reappear, this BZ can be reopened.