1371803 – unknown access error to pod management-infra/manageiq-img-scan: #<Net::HTTPForbidden:>

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Red Hat CloudForms Management Engine
Classification:	Red Hat
Component:	Providers
Sub Component:
Version:	5.6.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	GA
Target Release:	5.8.0
Assignee:	Erez Freiberger
QA Contact:	Einat Pacifici
Docs Contact:
URL:
Whiteboard:	container
Depends On:
Blocks:	1372412

Reported:	2016-08-31 07:19 UTC by Prasad Mukhedkar
Modified:	2021-09-09 11:55 UTC
CC List:	13 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1372412
Environment:
Last Closed:	2017-04-24 19:57:01 UTC
Category:	---
Cloudforms Team:	Container Management
Target Upstream Version:
Embargoed:

Attachments
master-config (4.60 KB, text/plain) 2016-09-26 10:38 UTC, Prasad Mukhedkar	no flags

Description Prasad Mukhedkar 2016-08-31 07:19:39 UTC

Description of problem:

SmartState Analysis of an OpenShift container image fails with the
following error:


[----] E, [2016-08-30T13:14:07.329208 #11710:5c5990] ERROR -- : Unexpected Exception during refresh: HTTP status code 403, User "system:serviceaccount:management-infra:management-admin" cannot list all componentstatuses in the cluster


[----] I, [2016-08-30T13:14:07.329418 #11710:5c5990]  INFO -- : MIQ(ManageIQ::Providers::OpenshiftEnterprise::ContainerManager#with_provider_connection) Connecting through ManageIQ::Providers::OpenshiftEnterprise::ContainerManager: [OSE]
[----] I, [2016-08-30T13:14:08.488554 #2471:5c5990]  INFO -- : Q-task_id([bc991032-6ea2-11e6-8bea-005056957282]) MIQ(ManageIQ::Providers::Kubernetes::ContainerManager::Scanning::Job#pod_wait) unknown access error to pod management-infra/manageiq-img-scan-bc991: #<Net::HTTPForbidden:0x00000010b409a0>
[----] I, [2016-08-30T13:14:08.503177 #2471:5c5990]  INFO -- : Q-task_id([bc991032-6ea2-11e6-8bea-005056957282]) MIQ(MiqQueue.put) Message id: [99000004706424],  id: [], Zone: [default], Role: [smartstate], Server: [], Ident: [generic], Target id: [], Instance id: [99000000000750], Task id: [bc991032-6ea2-11e6-8bea-005056957282], Command: [Job.signal], Timeout: [600], Priority: [20], State: [ready], Deliver On: [], Data: [], Args: [:abort_job, "unknown access error to pod management-infra/manageiq-img-scan-bc991: #<Net::HTTPForbidden:0x00000010b409a0>", "error"]
[----] I, [2016-08-30T13:14:08.503461 #2471:5c5990]  INFO -- : Q-task_id([bc991032-6ea2-11e6-8bea-005056957282]) MIQ(MiqQueue#delivered) Message id: [99000004706258], State: [ok], Delivered in [40.791559192] seconds
[----] I, [2016-08-30T13:14:08.506426 #2471:5c5990]  INFO -- : Q-task_id([bc991032-6ea2-11e6-8bea-005056957282]) MIQ(MiqQueue#m_callback) Message id: [99000004706258], Invoking Callback with args: ["ok", "Message delivered successfully", "#<MiqQueue id: 99000004706424, target_id: nil, priority: 20, method_name: \"signal\", state: \"ready\", created_on: \"2016-08-30 11:14:08\", updated_on: \"2016-08-30 11:14:08\", lock_version: 0, task_id: \"bc991032-6ea2-11e6-8bea-005056957282\", deliver_on: nil, queue_name: \"generic\", class_name: \"Job\", instance_id: 99000000000750, args: [:abort_job, \"unknown access error to pod management-infra/manageiq-img-scan-bc991: #<Net::HTTPForbidden:0x00000010b409a0>\", \"error\"], miq_callback: {:class_name=>\"Manage..."]
[----] I, [2016-08-30T13:14:10.276051 #2281:5c5990]  INFO -- : MIQ(MiqServer#populate_queue_messages) Fetched 14 miq_queue rows for queue_name=ems_metrics_processor, wcount=2, priority=200
[----] I, [2016-08-30T13:14:10.278278 #2281:5c5990]  INFO -- : MIQ(MiqServer#populate_queue_messages) Fetched 1 miq_queue rows for queue_name=generic, wcount=4, priority=200
[----] I, [2016-08-30T13:14:10.539479 #2471:5c5990]  INFO -- : MIQ(MiqPriorityWorker::Runner#get_message_via_drb) Message id: [99000004706424], MiqWorker id: [99000000072537], Zone: [default], Role: [smartstate], Server: [], Ident: [generic], Target id: [], Instance id: [99000000000750], Task id: [bc991032-6ea2-11e6-8bea-005056957282], Command: [Job.signal], Timeout: [600], Priority: [20], State: [dequeue], Deliver On: [], Data: [], Args: [:abort_job, "unknown access error to pod management-infra/manageiq-img-scan-bc991: #<Net::HTTPForbidden:0x00000010b409a0>", "error"], Dequeued in: [2.045585294] seconds
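
For triage, the failing pod name and the HTTP response class can be pulled out of evm.log lines like the ones above. This is a minimal Python sketch that assumes the message format seen in this log; the function name is hypothetical and is not part of ManageIQ.

```python
import re

# Matches the scanning job's message as it appears in the evm.log excerpt above.
ACCESS_ERROR = re.compile(r"unknown access error to pod (\S+): #<Net::(\w+)")

def find_access_error(line):
    """Return (pod, response_class) for a matching log line, else None."""
    match = ACCESS_ERROR.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)

line = (
    "MIQ(ManageIQ::Providers::Kubernetes::ContainerManager::Scanning::Job#pod_wait) "
    "unknown access error to pod management-infra/manageiq-img-scan-bc991: "
    "#<Net::HTTPForbidden:0x00000010b409a0>"
)
print(find_access_error(line))
```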


We have verified that the service account was created per the instructions
given in:

https://access.redhat.com/documentation/en/red-hat-cloudforms/version-4.0/managing-providers/#configuring_service_accounts


Version-Release number of selected component (if applicable):
cfme 5.6.1.2 (5.6.1.2-20160810181333_8ba817b)

How reproducible:
Always, at the customer's site

Steps to Reproduce:
1. Add an OpenShift provider
2. Select a container image and initiate SmartState Analysis

Actual results:

SmartState Analysis fails with an HTTPForbidden error

Expected results:

SmartState Analysis should complete successfully

Additional info:
evm.log

Comment 2 Federico Simoncelli 2016-09-01 14:12:00 UTC

Erez, Mooli, can you look at this? I have the feeling that some permissions may be missing in the above cluster.


FYI, this error is informational only; it is not critical and is not related to SmartState:

[----] E, [2016-08-30T13:14:07.329208 #11710:5c5990] ERROR -- : Unexpected Exception during refresh: HTTP status code 403, User "system:serviceaccount:management-infra:management-admin" cannot list all componentstatuses in the cluster

Comment 3 Erez Freiberger 2016-09-01 15:25:53 UTC

Prasad, I would like to confirm that we have cluster-reader permissions on the cluster (per Federico's comment that this might be a missing-permissions problem). Can you access the OpenShift cluster?
How was the manageiq account set up?

Is management-infra/management-admin in the output of the command:
oc get clusterrolebinding cluster-readers

Do we have the image-auditor / image-puller permissions?
oc get clusterrolebinding system:image-auditor
oc get clusterrolebinding system:image-puller
(Do they contain management-infra/inspector-admin and management-infra/management-admin?)

Thank you
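
The membership checks asked for above can also be scripted. This is a minimal Python sketch that assumes the binding has been fetched with `oc get clusterrolebinding <name> -o json` and parsed into a dict. It relies on the legacy OpenShift 3.x layout with `subjects` and `userNames` fields. The function name and the trimmed sample binding are hypothetical.

```python
def binding_includes_service_account(binding, namespace, name):
    """Return True if the binding lists the given service account."""
    # Legacy 3.x bindings carry a structured `subjects` list and a flat
    # `userNames` list; either one may be null, so default to an empty list.
    for subject in binding.get("subjects") or []:
        if (subject.get("kind") == "ServiceAccount"
                and subject.get("namespace") == namespace
                and subject.get("name") == name):
            return True
    full_name = "system:serviceaccount:%s:%s" % (namespace, name)
    return full_name in (binding.get("userNames") or [])

# Hypothetical, trimmed binding used only to exercise the function.
sample = {
    "kind": "ClusterRoleBinding",
    "metadata": {"name": "system:image-auditor"},
    "roleRef": {"name": "system:image-auditor"},
    "subjects": [
        {"kind": "ServiceAccount",
         "name": "management-admin",
         "namespace": "management-infra"},
    ],
    "userNames": ["system:serviceaccount:management-infra:management-admin"],
}

print(binding_includes_service_account(sample, "management-infra", "management-admin"))
print(binding_includes_service_account(sample, "management-infra", "inspector-admin"))
```

The check is a pure function over the parsed binding, so the same code can be run against saved command output without access to the cluster.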

Comment 5 Prasad Mukhedkar 2016-09-02 08:45:51 UTC

Hello Erez,

Here is the output of the commands:

# oc get clusterrolebinding cluster-readers -o yaml
apiVersion: v1
groupNames:
- system:cluster-readers
kind: ClusterRoleBinding
metadata:
  creationTimestamp: 2016-04-04T14:53:17Z
  name: cluster-readers
  resourceVersion: "6193113"
  selfLink: /oapi/v1/clusterrolebindings/cluster-readers
  uid: f6ead8e0-fa74-11e5-b316-005056950deb
roleRef:
  name: cluster-reader
subjects:
- kind: ServiceAccount
  name: management-admin
  namespace: management-infra
- kind: ServiceAccount
  name: heapster
  namespace: openshift-infra
- kind: SystemUser
  name: system:serviceacccount:openshift-infra:heapster
- kind: ServiceAccount
  name: aggregated-logging-fluentd
  namespace: testmn
- kind: User
  name: U443666
- kind: User
  name: U410976
- kind: ServiceAccount
  name: cfme
  namespace: management-infra
- kind: SystemGroup
  name: system:cluster-readers
userNames:
- system:serviceaccount:management-infra:management-admin
- system:serviceaccount:openshift-infra:heapster
- system:serviceacccount:openshift-infra:heapster
- system:serviceaccount:testmn:aggregated-logging-fluentd
- U443666
- U410976
- system:serviceaccount:management-infra:cfme

-------------------------------------------------------------------------------------

# oc get clusterrolebinding system:image-auditor -o yaml
apiVersion: v1
groupNames: null
kind: ClusterRoleBinding
metadata:
  creationTimestamp: 2016-07-12T09:37:39Z
  name: system:image-auditor
  resourceVersion: "6193234"
  selfLink: /oapi/v1/clusterrolebindings/system:image-auditor
  uid: 4627884e-4814-11e6-8554-005056950deb
roleRef:
  name: system:image-auditor
subjects:
- kind: ServiceAccount
  name: management-admin
  namespace: management-infra
userNames:
- system:serviceaccount:management-infra:management-admin

-------------------------------------------------------------------------------------

# oc get clusterrolebinding system:image-puller -o yaml
apiVersion: v1
groupNames:
- system:authenticated
kind: ClusterRoleBinding
metadata:
  creationTimestamp: 2016-04-06T12:08:47Z
  name: system:image-puller
  resourceVersion: "6193236"
  selfLink: /oapi/v1/clusterrolebindings/system:image-puller
  uid: 50d63054-fbf0-11e5-9719-005056950deb
roleRef:
  name: system:image-puller
subjects:
- kind: ServiceAccount
  name: inspector-admin
  namespace: management-infra
- kind: SystemGroup
  name: system:authenticated
userNames:
- system:serviceaccount:management-infra:inspector-admin

The service account was created by following this documentation:

https://access.redhat.com/documentation/en/red-hat-cloudforms/4.0/managing-providers/chapter-3-containers-providers

Let us know if any additional information is needed.

Comment 6 Mooli Tayer 2016-09-04 14:23:44 UTC

Can we get the logs, please?
Did this work at some point and then stop, or has it never worked?

Comment 15 Prasad Mukhedkar 2016-09-21 06:35:09 UTC

No, it's not missing; we've already checked that.

# oc get clusterrole management-infra-admin -o yaml
apiVersion: v1
kind: ClusterRole
metadata:
  creationTimestamp: 2016-04-05T07:06:38Z
  name: management-infra-admin
  resourceVersion: "50"
  selfLink: /oapi/v1/clusterroles/management-infra-admin
  uid: f119c760-fafc-11e5-8ab9-00505695658d
rules:
- apiGroups: null
  attributeRestrictions: null
  resources:
  - pods/proxy
  verbs:
  - '*'

Comment 18 Prasad Mukhedkar 2016-09-26 10:38:56 UTC

Created attachment 1204775 [details]
master-config

Comment 37 Federico Simoncelli 2017-02-28 15:01:21 UTC

This is most likely a proxy misconfiguration. Please read:

https://docs.openshift.com/container-platform/3.4/install_config/install/advanced_install.html#advanced-install-configuring-global-proxy

The relevant configuration file is:

/etc/sysconfig/atomic-openshift-master

There you can configure the NO_PROXY parameter to reach the nodes.
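
Before editing the file, it can help to check whether the node hostnames would actually match a candidate NO_PROXY value. This is a minimal Python sketch using simplified exact and domain-suffix matching. It is not OpenShift's own implementation, and real clients differ in how they interpret NO_PROXY (CIDR ranges and ports, for example). The function name and hostnames are hypothetical.

```python
def bypasses_proxy(host, no_proxy):
    """Return True if `host` matches an entry in a comma-separated NO_PROXY value."""
    host = host.lower()
    for entry in no_proxy.split(","):
        entry = entry.strip().lower()
        if not entry:
            continue
        if entry == "*":
            # A lone asterisk is commonly treated as "never use the proxy".
            return True
        # Treat ".example.com" and "example.com" alike: match the domain
        # itself or any host beneath it.
        suffix = entry.lstrip(".")
        if host == suffix or host.endswith("." + suffix):
            return True
    return False

# Hypothetical values for illustration only.
candidate = ".cluster.local,.example.com,localhost"
for node in ("node1.example.com", "node2.other.org"):
    print(node, bypasses_proxy(node, candidate))
```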

Comment 38 Federico Simoncelli 2017-02-28 15:01:57 UTC

Please let us know if comment 37 addressed the issue.

Comment 39 Erez Freiberger 2017-03-31 12:34:10 UTC

Sachin,
Using input from a similar BZ, I made a PR [1] that I think might also fix your problem.


[1] https://github.com/ManageIQ/manageiq/pull/14578

Comment 40 Mooli Tayer 2017-04-23 12:55:40 UTC

Erez, should this be moved to POST?

Comment 41 Erez Freiberger 2017-04-23 13:05:29 UTC

Mooli,
I assumed that this issue is caused by something similar to [1], which is fixed by [2]. We need more details to be sure that it is indeed solved now.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1384629
[2] https://github.com/ManageIQ/manageiq/pull/14578

Comment 42 Mooli Tayer 2017-04-23 15:38:12 UTC

Reverting the needinfo that was removed by mistake (see comment 38).

Comment 43 Mooli Tayer 2017-04-23 15:39:49 UTC

We should consider closing this one for insufficient data, as it has had an unanswered needinfo since 2017-02-28.
