1436915 – perf_capture_realtime method errors with "[RuntimeError]: Unsupported type ManageIQ::Providers::Kubernetes::ContainerManager::Container"

Keywords:
Status:	CLOSED DUPLICATE of bug 1426683
Alias:	None
Product:	Red Hat CloudForms Management Engine
Classification:	Red Hat
Component:	C&U Capacity and Utilization
Sub Component:
Version:	5.7.0
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	GA
Target Release:	cfme-future
Assignee:	Federico Simoncelli
QA Contact:	Einat Pacifici
Docs Contact:
URL:
Whiteboard:	container:c&u
Depends On:
Blocks:

Reported:	2017-03-29 02:23 UTC by Thomas Hennessy
Modified:	2020-06-11 13:29 UTC
CC List:	11 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2017-04-03 10:26:05 UTC
Category:	---
Cloudforms Team:	Container Management
Target Upstream Version:
Embargoed:

Attachments
Process 22215 from March 24 2017 logs of *6824 appliance with multiple errors for msgids: 10000221306827,10000221346529,10000221367114,10000221381043 and many others (3.03 KB, application/x-gzip) 2017-03-29 02:23 UTC, Thomas Hennessy

Description Thomas Hennessy 2017-03-29 02:23:22 UTC

Created attachment 1267211
Process 22215 from March 24 2017 logs of *6824 appliance with multiple errors for msgids: 10000221306827,10000221346529,10000221367114,10000221381043 and many others

Description of problem: Processing by method perf_capture_realtime encounters the reported error after successfully scheduling dozens of messages for other VMDB instance types.


Version-Release number of selected component (if applicable): 5.7.1.3


How reproducible: Seems to be associated with OpenShift providers.


Steps to Reproduce:
1.
2.
3.

Actual results: The method errors consistently. It never completes without error, and it never runs long enough to hit the timeout.


Expected results: Normal method termination after creating all performance capture messages.


Additional info:
Error log sequence follows:
=====
[----] I, [2017-03-24T09:51:05.975054 #22215:49113c]  INFO -- : MIQ(MiqPriorityWorker::Runner#get_message_via_drb) Message id: [10000221727923], MiqWorker id: [10000001072912], Zone: [CTC DMZ], Role: [ems_metrics_coordinator], Server: [], Ident: [generic], Target id: [], Instance id: [], Task id: [], Command: [Metric::Capture.perf_capture_timer], Timeout: [600], Priority: [20], State: [dequeue], Deliver On: [], Data: [], Args: [], Dequeued in: [4.352411201] seconds
[----] I, [2017-03-24T09:51:05.975459 #22215:49113c]  INFO -- : MIQ(Metric::Capture.perf_capture_timer) Queueing performance capture...
.....
[----] I, [2017-03-24T09:54:56.837117 #22215:49113c]  INFO -- : MIQ(MiqQueue.put) Message id: [10000221732431],  id: [], Zone: [CTC DMZ], Role: [ems_metrics_collector], Server: [], Ident: [openshift_enterprise], Target id: [], Instance id: [10000000003166], Task id: [], Command: [ManageIQ::Providers::Kubernetes::ContainerManager::Container.perf_capture_realtime], Timeout: [600], Priority: [100], State: [ready], Deliver On: [], Data: [], Args: [2017-03-24 11:09:40 UTC, 2017-03-24 14:54:56 UTC]
[----] E, [2017-03-24T09:55:00.459866 #22215:49113c] ERROR -- : MIQ(MiqQueue#deliver) Message id: [10000221727923], Error: [Unsupported type ManageIQ::Providers::Kubernetes::ContainerManager::Container (id: 10000000028453)]
[----] E, [2017-03-24T09:55:00.460078 #22215:49113c] ERROR -- : [RuntimeError]: Unsupported type ManageIQ::Providers::Kubernetes::ContainerManager::Container (id: 10000000028453)  Method:[rescue in deliver]
[----] E, [2017-03-24T09:55:00.460336 #22215:49113c] ERROR -- : /var/www/miq/vmdb/app/models/metric/ci_mixin/capture.rb:16:in `queue_name_for_metrics_collection'
/var/www/miq/vmdb/app/models/metric/ci_mixin/capture.rb:68:in `perf_capture_queue'
/var/www/miq/vmdb/app/models/metric/capture.rb:215:in `block in queue_captures'
/var/www/miq/vmdb/app/models/metric/capture.rb:210:in `each'
/var/www/miq/vmdb/app/models/metric/capture.rb:210:in `queue_captures'
/var/www/miq/vmdb/app/models/metric/capture.rb:51:in `perf_capture_timer'
/var/www/miq/vmdb/app/models/miq_queue.rb:347:in `block in deliver'
.....
=====
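
For triage across the many affected message ids, the failing deliveries can be pulled out of evm.log mechanically. The following is a minimal sketch, assuming the "MIQ(MiqQueue#deliver) Message id: [...], Error: [...]" line layout shown in the excerpt above; the constant and method names are hypothetical and are not part of ManageIQ.

```ruby
# Matches the ERROR line emitted by MiqQueue#deliver in the excerpt above.
# The layout is inferred from that excerpt only, not from the logger source.
DELIVER_ERROR = /MIQ\(MiqQueue#deliver\) Message id: \[(\d+)\], Error: \[(.*)\]\s*\z/

# Return [message_id, error_text] pairs for every delivery failure found
# in the given log lines; non-matching lines are skipped.
def delivery_errors(lines)
  lines.map do |line|
    match = DELIVER_ERROR.match(line.chomp)
    [match[1], match[2]] if match
  end.compact
end
```

Passing the lines of the attached log, for example `delivery_errors(File.readlines("evm.log"))`, would give one pair per failed delivery, which can then be grouped by error text.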

Comment 2 Mooli Tayer 2017-03-29 11:50:51 UTC

Thomas, can you please provide:

1. The output of the following, from the rails console:

ManageIQ::Providers::Kubernetes::ContainerManager::Container.find(10000000028453)

2. The result of this grep, run on all logs (including rolled ones):

grep "Disconnecting Container" evm.log|grep 10000000028453
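
Because rolled logs are often gzip-compressed, a plain grep on evm.log alone can miss older entries. A minimal sketch of the same search in Ruby follows, assuming rotated copies carry a ".gz" suffix; the helper names are hypothetical and the rotation naming is an assumption about the appliance setup.

```ruby
require "zlib"

# Yield every line of a log file, reading gzip-compressed copies
# transparently (the ".gz" suffix check is an assumption about rotation).
def each_log_line(path, &block)
  if path.end_with?(".gz")
    Zlib::GzipReader.open(path) { |gz| gz.each_line(&block) }
  else
    File.foreach(path, &block)
  end
end

# Return "path: line" strings for every line containing all of the needles,
# mirroring the two chained grep calls above.
def grep_logs(paths, *needles)
  paths.flat_map do |path|
    matches = []
    each_log_line(path) do |line|
      matches << "#{path}: #{line.chomp}" if needles.all? { |n| line.include?(n) }
    end
    matches
  end
end
```

For example, `puts grep_logs(Dir.glob("evm.log*").sort, "Disconnecting Container", "10000000028453")` run from the log directory would cover both the current and the rolled files.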

Quick reference points:
a. https://github.com/manageiq/manageiq/blob/4ad0054c6a92d2b6ee63437b4f1508fb0a6952e5/app/models/container.rb#L45

b. https://github.com/manageiq/manageiq/blob/7bd3090330bf4ee8076737492fbf0dec1001da9c/app/models/metric/ci_mixin/capture.rb#L8

Comment 3 Thomas Hennessy 2017-03-29 12:30:33 UTC

Mooli,
Thanks for looking at the case, but I'm afraid the short answer to your question is that I cannot comply. This particular problem is one of many that have surfaced at an active customer where many other issues are actively being worked, and for that reason I do not have direct access to the customer. I have opened this case because, as with others recently opened for this customer, while the current blocker is provider refresh, the end objective is to have a complete set of C&U reports once inventory is reliably gathered, and this case is one of several that will need to be addressed in order to allow complete C&U collections to proceed.

I can provide you access to the full evm.log file if that is of any assistance.

Comment 4 Barak 2017-03-29 13:02:27 UTC

Thomas,

The phenomenon described above (shown in the log) happens during C&U collection,
when the system cannot resolve the ext_management_system of the target object (it cannot tie this object to a specific provider).
One of the reasons this can happen is a failure of the inventory refresh, which is exactly what we know is happening in this case.

I am not sure this bug has its own justification.
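
The condition described above can be illustrated with a minimal, self-contained sketch. It assumes the queue name is taken from the owning provider and that a missing provider raises the "Unsupported type" error. `Provider`, `TrackedObject`, and `metrics_collector_queue_name` are hypothetical stand-ins, not the ManageIQ classes; only the method name `queue_name_for_metrics_collection` and the message format come from the traceback in the description.

```ruby
# Hypothetical stand-in for a provider (EMS) that owns a metrics queue.
Provider = Struct.new(:name) do
  def metrics_collector_queue_name
    name
  end
end

# Hypothetical stand-in for an inventory object such as a container.
class TrackedObject
  attr_reader :id, :ext_management_system

  def initialize(id, ext_management_system = nil)
    @id = id
    @ext_management_system = ext_management_system
  end

  # Pick the collection queue from the owning provider. With no provider
  # attached there is nothing to pick from, so a RuntimeError is raised.
  def queue_name_for_metrics_collection
    ems = ext_management_system
    raise "Unsupported type #{self.class.name} (id: #{id})" if ems.nil?
    ems.metrics_collector_queue_name
  end
end

attached = TrackedObject.new(1, Provider.new("openshift_enterprise"))
orphaned = TrackedObject.new(10000000028453)

puts attached.queue_name_for_metrics_collection
begin
  orphaned.queue_name_for_metrics_collection
rescue RuntimeError => e
  puts e.message # prints "Unsupported type TrackedObject (id: 10000000028453)"
end
```

In this sketch a single object without a provider is enough to raise, which matches the log, where the error surfaces partway through queueing after other captures were scheduled successfully.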

Comment 5 Thomas Hennessy 2017-03-29 13:15:16 UTC

So, based on your description above, you appear to be saying that the underlying cause of this error will be resolved once all of the EMS refresh activity has completed for all of the providers in this customer's database. Hopefully that state will soon be achieved.

Comment 6 Federico Simoncelli 2017-03-30 07:57:11 UTC

(In reply to Barak from comment #4)
> Thomas,
> 
> The phenomena described above (shown in the log) happens on C & U collection,
> when the system can not resolve the ext_management_system of the target
> object (can not reference this object to a specific provider).

From the error message reported in the description, this seems to be a duplicate of bug 1408968. In the end the issue was fixed in bug 1420721:

https://github.com/ManageIQ/manageiq/pull/13686

If indeed the issue here is a duplicate of the BZs above then an inventory refresh won't help.

Comment 7 Mooli Tayer 2017-04-02 08:22:07 UTC

Right, my mistake (was looking at the already fixed master branch - I'll make sure to pull the relevant one next time)

+1 for closing as duplicate of 1420721

Comment 8 Barak 2017-04-03 10:26:05 UTC

Per comments #6 and #7, moving this bug to CLOSED DUPLICATE.
The fix will be shipped as part of 5.7.2.

*** This bug has been marked as a duplicate of bug 1426683 ***