Description of problem:
Smart State Analysis for an OpenShift container image fails with the following error reported:

[----] E, [2016-08-30T13:14:07.329208 #11710:5c5990] ERROR -- : Unexpected Exception during refresh: HTTP status code 403, User "system:serviceaccount:management-infra:management-admin" cannot list all componentstatuses in the cluster
[----] I, [2016-08-30T13:14:07.329418 #11710:5c5990] INFO -- : MIQ(ManageIQ::Providers::OpenshiftEnterprise::ContainerManager#with_provider_connection) Connecting through ManageIQ::Providers::OpenshiftEnterprise::ContainerManager: [OSE]
[----] I, [2016-08-30T13:14:08.488554 #2471:5c5990] INFO -- : Q-task_id([bc991032-6ea2-11e6-8bea-005056957282]) MIQ(ManageIQ::Providers::Kubernetes::ContainerManager::Scanning::Job#pod_wait) unknown access error to pod management-infra/manageiq-img-scan-bc991: #<Net::HTTPForbidden:0x00000010b409a0>
[----] I, [2016-08-30T13:14:08.503177 #2471:5c5990] INFO -- : Q-task_id([bc991032-6ea2-11e6-8bea-005056957282]) MIQ(MiqQueue.put) Message id: [99000004706424], id: [], Zone: [default], Role: [smartstate], Server: [], Ident: [generic], Target id: [], Instance id: [99000000000750], Task id: [bc991032-6ea2-11e6-8bea-005056957282], Command: [Job.signal], Timeout: [600], Priority: [20], State: [ready], Deliver On: [], Data: [], Args: [:abort_job, "unknown access error to pod management-infra/manageiq-img-scan-bc991: #<Net::HTTPForbidden:0x00000010b409a0>", "error"]
[----] I, [2016-08-30T13:14:08.503461 #2471:5c5990] INFO -- : Q-task_id([bc991032-6ea2-11e6-8bea-005056957282]) MIQ(MiqQueue#delivered) Message id: [99000004706258], State: [ok], Delivered in [40.791559192] seconds
[----] I, [2016-08-30T13:14:08.506426 #2471:5c5990] INFO -- : Q-task_id([bc991032-6ea2-11e6-8bea-005056957282]) MIQ(MiqQueue#m_callback) Message id: [99000004706258], Invoking Callback with args: ["ok", "Message delivered successfully", "#<MiqQueue id: 99000004706424, target_id: nil, priority: 20, method_name: \"signal\", state: \"ready\", created_on: \"2016-08-30 11:14:08\", updated_on: \"2016-08-30 11:14:08\", lock_version: 0, task_id: \"bc991032-6ea2-11e6-8bea-005056957282\", deliver_on: nil, queue_name: \"generic\", class_name: \"Job\", instance_id: 99000000000750, args: [:abort_job, \"unknown access error to pod management-infra/manageiq-img-scan-bc991: #<Net::HTTPForbidden:0x00000010b409a0>\", \"error\"], miq_callback: {:class_name=>\"Manage..."]
[----] I, [2016-08-30T13:14:10.276051 #2281:5c5990] INFO -- : MIQ(MiqServer#populate_queue_messages) Fetched 14 miq_queue rows for queue_name=ems_metrics_processor, wcount=2, priority=200
[----] I, [2016-08-30T13:14:10.278278 #2281:5c5990] INFO -- : MIQ(MiqServer#populate_queue_messages) Fetched 1 miq_queue rows for queue_name=generic, wcount=4, priority=200
[----] I, [2016-08-30T13:14:10.539479 #2471:5c5990] INFO -- : MIQ(MiqPriorityWorker::Runner#get_message_via_drb) Message id: [99000004706424], MiqWorker id: [99000000072537], Zone: [default], Role: [smartstate], Server: [], Ident: [generic], Target id: [], Instance id: [99000000000750], Task id: [bc991032-6ea2-11e6-8bea-005056957282], Command: [Job.signal], Timeout: [600], Priority: [20], State: [dequeue], Deliver On: [], Data: [], Args: [:abort_job, "unknown access error to pod management-infra/manageiq-img-scan-bc991: #<Net::HTTPForbidden:0x00000010b409a0>", "error"], Dequeued in: [2.045585294] seconds

We have verified that the service account was created as per the instructions given in:
https://access.redhat.com/documentation/en/red-hat-cloudforms/version-4.0/managing-providers/#configuring_service_accounts

Version-Release number of selected component (if applicable):
cfme 5.6.1.2 (5.6.1.2-20160810181333_8ba817b)

How reproducible:
Always, at the customer's end

Steps to Reproduce:
1. Add an OpenShift provider
2. Select a container image and initiate Smart State Analysis

Actual results:
Smart State Analysis fails with an error (HTTPForbidden)

Expected results:
Smart State Analysis should work

Additional info:
evm.log
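For anyone triaging a similar evm.log, the access errors can be pulled out of the noise with a small filter. The following is only a sketch: the regular expression is inferred from the excerpt above rather than from any ManageIQ logging specification, and the helper names (`parse_line`, `forbidden_entries`) are made up for illustration.

```python
import re

# Line layout inferred from the excerpt in this report (an assumption,
# not an official format): "[----] E, [<timestamp> #<pid>:<tid>] ERROR -- : <message>"
LINE_RE = re.compile(
    r"^\[----\] (?P<level>\w), \[(?P<timestamp>\S+) #(?P<pid>\d+):(?P<tid>\w+)\]"
    r"\s+(?P<severity>\w+) -- : (?P<message>.*)$"
)

# Markers for the two kinds of 403 seen in this report.
FORBIDDEN_MARKERS = ("HTTP status code 403", "Net::HTTPForbidden")


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def forbidden_entries(lines):
    """Return parsed entries whose message mentions an HTTP 403."""
    entries = []
    for line in lines:
        entry = parse_line(line)
        if entry and any(marker in entry["message"] for marker in FORBIDDEN_MARKERS):
            entries.append(entry)
    return entries


# Three lines copied from the excerpt above.
SAMPLE = [
    '[----] E, [2016-08-30T13:14:07.329208 #11710:5c5990] ERROR -- : Unexpected Exception during refresh: HTTP status code 403, User "system:serviceaccount:management-infra:management-admin" cannot list all componentstatuses in the cluster',
    '[----] I, [2016-08-30T13:14:07.329418 #11710:5c5990] INFO -- : MIQ(ManageIQ::Providers::OpenshiftEnterprise::ContainerManager#with_provider_connection) Connecting through ManageIQ::Providers::OpenshiftEnterprise::ContainerManager: [OSE]',
    '[----] I, [2016-08-30T13:14:08.488554 #2471:5c5990] INFO -- : Q-task_id([bc991032-6ea2-11e6-8bea-005056957282]) MIQ(ManageIQ::Providers::Kubernetes::ContainerManager::Scanning::Job#pod_wait) unknown access error to pod management-infra/manageiq-img-scan-bc991: #<Net::HTTPForbidden:0x00000010b409a0>',
]

for entry in forbidden_entries(SAMPLE):
    print(entry["timestamp"], entry["pid"], entry["severity"], entry["message"][:80])
```

Note that the two hits come from different worker processes (#11710 for the refresh, #2471 for the scanning job), which is a quick way to tell the refresh error apart from the scan failure.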
Erez, Mooli, can you look at this? I have the feeling that some permissions may be missing in the above cluster.
FYI, the following error is informational only; it is not critical, nor is it related to SmartState:
[----] E, [2016-08-30T13:14:07.329208 #11710:5c5990] ERROR -- : Unexpected Exception during refresh: HTTP status code 403, User "system:serviceaccount:management-infra:management-admin" cannot list all componentstatuses in the cluster
Parasad,
I would like to confirm that we have cluster-reader permissions on the cluster (per Federico's comment that this might be a missing-permissions problem).
Can you access the OpenShift cluster? How was the manageiq account set up?
Is management-infra/management-admin in the output of this command?
oc get clusterrolebinding cluster-readers
Do we have the image-auditor / image-puller permissions?
oc get clusterrolebinding system:image-auditor
oc get clusterrolebinding system:image-puller
(Do they contain management-infra/inspector-admin and management-infra/management-admin?)
Thank you
Hello Erez,
Here is the output of the commands:

# oc get clusterrolebinding cluster-readers -o yaml
apiVersion: v1
groupNames:
- system:cluster-readers
kind: ClusterRoleBinding
metadata:
  creationTimestamp: 2016-04-04T14:53:17Z
  name: cluster-readers
  resourceVersion: "6193113"
  selfLink: /oapi/v1/clusterrolebindings/cluster-readers
  uid: f6ead8e0-fa74-11e5-b316-005056950deb
roleRef:
  name: cluster-reader
subjects:
- kind: ServiceAccount
  name: management-admin
  namespace: management-infra
- kind: ServiceAccount
  name: heapster
  namespace: openshift-infra
- kind: SystemUser
  name: system:serviceacccount:openshift-infra:heapster
- kind: ServiceAccount
  name: aggregated-logging-fluentd
  namespace: testmn
- kind: User
  name: U443666
- kind: User
  name: U410976
- kind: ServiceAccount
  name: cfme
  namespace: management-infra
- kind: SystemGroup
  name: system:cluster-readers
userNames:
- system:serviceaccount:management-infra:management-admin
- system:serviceaccount:openshift-infra:heapster
- system:serviceacccount:openshift-infra:heapster
- system:serviceaccount:testmn:aggregated-logging-fluentd
- U443666
- U410976
- system:serviceaccount:management-infra:cfme
-------------------------------------------------------------------------------------
# oc get clusterrolebinding system:image-auditor -o yaml
apiVersion: v1
groupNames: null
kind: ClusterRoleBinding
metadata:
  creationTimestamp: 2016-07-12T09:37:39Z
  name: system:image-auditor
  resourceVersion: "6193234"
  selfLink: /oapi/v1/clusterrolebindings/system:image-auditor
  uid: 4627884e-4814-11e6-8554-005056950deb
roleRef:
  name: system:image-auditor
subjects:
- kind: ServiceAccount
  name: management-admin
  namespace: management-infra
userNames:
- system:serviceaccount:management-infra:management-admin
-------------------------------------------------------------------------------------
# oc get clusterrolebinding system:image-puller -o yaml
apiVersion: v1
groupNames:
- system:authenticated
kind: ClusterRoleBinding
metadata:
  creationTimestamp: 2016-04-06T12:08:47Z
  name: system:image-puller
  resourceVersion: "6193236"
  selfLink: /oapi/v1/clusterrolebindings/system:image-puller
  uid: 50d63054-fbf0-11e5-9719-005056950deb
roleRef:
  name: system:image-puller
subjects:
- kind: ServiceAccount
  name: inspector-admin
  namespace: management-infra
- kind: SystemGroup
  name: system:authenticated
userNames:
- system:serviceaccount:management-infra:inspector-admin

The service account was created by following this documentation:
https://access.redhat.com/documentation/en/red-hat-cloudforms/4.0/managing-providers/chapter-3-containers-providers

Let us know if any additional information is needed.
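The membership question asked earlier ("is management-infra/management-admin in the output?") can also be checked mechanically. The sketch below assumes the binding was fetched with `-o json` instead of `-o yaml` so that it can be parsed with the standard library; the function name `service_account_bound` is hypothetical, and the embedded data is a trimmed copy of the system:image-puller output above. It only looks at direct entries under `subjects` and `userNames`; access granted through a group such as system:authenticated is deliberately not evaluated.

```python
import json


def service_account_bound(binding, namespace, name):
    """Return True if the service account is listed directly in the binding.

    Group-based access (groupNames / SystemGroup subjects) is not evaluated.
    """
    full_name = "system:serviceaccount:{}:{}".format(namespace, name)
    if full_name in (binding.get("userNames") or []):
        return True
    for subject in binding.get("subjects") or []:
        if (
            subject.get("kind") == "ServiceAccount"
            and subject.get("namespace") == namespace
            and subject.get("name") == name
        ):
            return True
    return False


# Trimmed JSON rendering of the system:image-puller binding shown above.
IMAGE_PULLER = json.loads("""
{
  "kind": "ClusterRoleBinding",
  "metadata": {"name": "system:image-puller"},
  "groupNames": ["system:authenticated"],
  "roleRef": {"name": "system:image-puller"},
  "subjects": [
    {"kind": "ServiceAccount", "name": "inspector-admin", "namespace": "management-infra"},
    {"kind": "SystemGroup", "name": "system:authenticated"}
  ],
  "userNames": ["system:serviceaccount:management-infra:inspector-admin"]
}
""")

for account in ("inspector-admin", "management-admin"):
    bound = service_account_bound(IMAGE_PULLER, "management-infra", account)
    print("management-infra/{}: directly bound = {}".format(account, bound))
```

On the data above, inspector-admin is bound directly to system:image-puller while management-admin is not; management-admin would only get that role through the system:authenticated group.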
Can we get the logs, please? Was this working at some point and then stopped, or has it never worked?
No, it's not missing. We've already checked that.

# oc get clusterrole management-infra-admin -o yaml
apiVersion: v1
kind: ClusterRole
metadata:
  creationTimestamp: 2016-04-05T07:06:38Z
  name: management-infra-admin
  resourceVersion: "50"
  selfLink: /oapi/v1/clusterroles/management-infra-admin
  uid: f119c760-fafc-11e5-8ab9-00505695658d
rules:
- apiGroups: null
  attributeRestrictions: null
  resources:
  - pods/proxy
  verbs:
  - '*'
Created attachment 1204775 master-config
This is most likely a proxy misconfiguration. Please read:
https://docs.openshift.com/container-platform/3.4/install_config/install/advanced_install.html#advanced-install-configuring-global-proxy
The relevant configuration file is /etc/sysconfig/atomic-openshift-master, where you can configure the NO_PROXY parameter so that the nodes are reached directly rather than through the proxy.
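To illustrate why NO_PROXY matters here, the following is a rough sketch of how a NO_PROXY list is commonly interpreted: a request bypasses the proxy when the target host equals an entry or falls under an entry's domain. This is a simplified suffix match written for illustration only; the actual matching is done by the OpenShift binaries and may differ (for example in how IP ranges are handled). The function name and the example.com hostnames are hypothetical.

```python
def bypasses_proxy(host, no_proxy):
    """Return True if `host` matches an entry in a comma-separated NO_PROXY value.

    Simplified rules (an assumption, not OpenShift's exact algorithm):
    "*" matches everything; otherwise an entry matches the host itself
    and any subdomain of it. A leading dot on the entry is ignored.
    """
    host = host.lower()
    for entry in no_proxy.split(","):
        entry = entry.strip().lower()
        if not entry:
            continue
        if entry == "*":
            return True
        suffix = entry.lstrip(".")
        if host == suffix or host.endswith("." + suffix):
            return True
    return False


# Hypothetical value and node names, for illustration only.
NO_PROXY = ".cluster.example.com,localhost,127.0.0.1"

for node in ("node1.cluster.example.com", "node1.other.example.org"):
    print(node, "-> direct" if bypasses_proxy(node, NO_PROXY) else "-> via proxy")
```

If the node hostnames are not covered by NO_PROXY, requests from the master to the nodes (such as the pod proxy calls used by the image scan) would be sent to the HTTP proxy, which could plausibly answer with a 403.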
Please let us know if comment 37 addressed the issue.
Sachin,
Using input from a similar BZ, I made a PR [1] that I think might also fix your problem.
[1] https://github.com/ManageIQ/manageiq/pull/14578
Erez, should this be moved to POST?
Mooli,
I assumed that this issue is caused by something similar to [1], which is fixed by [2]. We need more details to be sure that this is indeed solved now.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1384629
[2] https://github.com/ManageIQ/manageiq/pull/14578
Restoring the needinfo that was removed by mistake (see comment 38).
We should consider closing this one for insufficient data, as it has had an unanswered needinfo since 2017-02-28.