1458797 – Validation error: ems/core not defined while ContainerGroups in the "Pending" state

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat CloudForms Management Engine
Classification:	Red Hat
Component:	C&U Capacity and Utilization
Sub Component:
Version:	5.7.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	GA
Target Release:	5.9.0
Assignee:	Yaacov Zamir
QA Contact:	Einat Pacifici
Docs Contact:
URL:
Whiteboard:	container:c&u
Depends On:
Blocks:	1461522

Reported:	2017-06-05 13:35 UTC by Gellert Kis
Modified:	2020-08-13 09:18 UTC
CC List:	5 users
Fixed In Version:	5.9.0.1
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1461522
Environment:
Last Closed:	2018-03-06 15:47:50 UTC
Category:	---
Cloudforms Team:	Container Management
Target Upstream Version:
Embargoed:

Comment 2 Federico Simoncelli 2017-06-05 13:56:18 UTC

Yaacov, the Containers and ContainerGroups that are not yet associated with a node (e.g. still Pending) throw this exception at metrics collection time:

  Validation error: cores not defined


I think one possible solution would be to turn that error into a warning, with something like:


      if @target.respond_to?(:hardware)
        @node_hardware = @target.hardware
      else
        @node_hardware = @target.try(:container_node).try(:hardware)
      end

...

    def validate_target
      raise TargetValidationError, "Validation error: ems not defined" unless @ext_management_system
      raise TargetValidationWarning, "Warning: object not scheduled to a node yet" unless @node_hardware
      ...
    end


I am not sure that would be enough to make this less scary, though. Another option is to filter out these targets altogether when they have no "container_node" association, but I feel it would be riskier to have nothing in the logs explaining why metrics collection did not happen for an object.
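For illustration only, here is a minimal, self-contained sketch of the warning-versus-error idea described above. It is not the code merged upstream. `CaptureContextSketch`, `collect_metrics_sketch`, and the `Target` and `Node` structs are hypothetical stand-ins for the real ManageIQ classes, and the two exception classes are defined locally so that the snippet runs without Rails or ManageIQ loaded.

```ruby
require "logger"
require "stringio"

# Hypothetical exception classes, named after the ones in the proposal above.
class TargetValidationError < RuntimeError; end
class TargetValidationWarning < RuntimeError; end

# Hypothetical stand-ins for a Container/ContainerGroup record and its node.
Target = Struct.new(:name, :ext_management_system, :container_node)
Node   = Struct.new(:hardware)

class CaptureContextSketch
  def initialize(target)
    @target = target
    @ext_management_system = target.ext_management_system
    # Nodes expose hardware directly; containers and groups reach it through
    # their node. Safe navigation (&.) stands in for the chained `try` calls.
    @node_hardware =
      if target.respond_to?(:hardware)
        target.hardware
      else
        target.container_node&.hardware
      end
  end

  def validate_target
    raise TargetValidationError, "Validation error: ems not defined" unless @ext_management_system
    raise TargetValidationWarning, "Warning: object not scheduled to a node yet" unless @node_hardware
    true
  end
end

# Hypothetical caller: a warning is logged at WARN and the target is skipped,
# while a real validation error is still logged at ERROR.
def collect_metrics_sketch(target, logger)
  CaptureContextSketch.new(target).validate_target
  :collected
rescue TargetValidationWarning => e
  logger.warn("#{target.name}: #{e.message}")
  :skipped
rescue TargetValidationError => e
  logger.error("#{target.name} is not valid: #{e.message}")
  :failed
end
```

With this shape, a target that has an ems but no node yet is reported at WARN and skipped, so the log still records why no metrics were collected, while a target with no ems is still reported at ERROR.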

Comment 3 Federico Simoncelli 2017-06-05 14:05:28 UTC

Gellert, do you think it is acceptable to turn the error:

 ERROR -- : MIQ(ManageIQ::Providers::Kubernetes::ContainerManager::MetricsCapture#perf_collect_metrics) Container(25987) is not valid: Validation error: cores not defined

into a warning with a proper message?

 WARN -- : MIQ(ManageIQ::Providers::Kubernetes::ContainerManager::MetricsCapture#perf_collect_metrics) Container(25987) is not scheduled on any node yet


I am wary of removing these messages completely.

Comment 4 Yaacov Zamir 2017-06-05 14:20:33 UTC

submitted upstream

https://github.com/ManageIQ/manageiq-providers-kubernetes/pull/33

Comment 5 Yaacov Zamir 2017-06-05 16:00:25 UTC

in https://github.com/ManageIQ/manageiq-providers-kubernetes/pull/33

the warning format is:

[----] W, [2017-06-05T18:36:41.441675 #29072:2aec7bb4af7c] WARN -- : MIQ(ManageIQ::Providers::Kubernetes::ContainerManager::MetricsCapture#perf_collect_metrics) Container(68) has no hardware: State may be pending

Comment 6 Gellert Kis 2017-06-06 06:09:00 UTC

It is certainly worth keeping the message at a lower log level rather than removing it at all levels.

Comment 7 Yaacov Zamir 2017-06-08 11:23:50 UTC

merged upstream
https://github.com/ManageIQ/manageiq-providers-kubernetes/pull/33