Bug 1367473 - SmartState Analysis not working for container images
Summary: SmartState Analysis not working for container images
Keywords:
Status: CLOSED DUPLICATE of bug 1366143
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: SmartState Analysis
Version: 5.6.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: GA
Target Release: 5.7.0
Assignee: Rich Oliveri
QA Contact: Dave Johnson
URL:
Whiteboard: container
Depends On:
Blocks:
 
Reported: 2016-08-16 13:35 UTC by Prasad Mukhedkar
Modified: 2020-04-15 14:36 UTC
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-19 08:00:42 UTC
Category: ---
Cloudforms Team: ---
Target Upstream Version:



Description Prasad Mukhedkar 2016-08-16 13:35:37 UTC
SmartState Analysis for container images is not working. The fleecing task gets stuck in the "waiting_to_start" state indefinitely. In the log I see the following:

 [----] W, [2016-08-11T09:34:32.938155 #3020:b09988]  WARN -- : Q-task_id([job_dispatcher]) MIQ(JobProxyDispatcher#dispatch_to_ems) SKIPPING remaining Container Image scan jobs for Ext Management System [99000000000001] in dispatch since there are [3] active scans in zone [default]

This is what I see in the database:


vmdb_production=# select guid,state,status,message,name,dispatch_status from jobs where dispatch_status='active';
                 guid                 |      state      | status |      message      |           name           | dispatch_status 
--------------------------------------+-----------------+--------+-------------------+--------------------------+-----------------
 7e4d8d06-48de-11e6-9c8c-005056957282 | waiting_to_scan | ok     | process initiated | Container image analysis | active
 7e497702-48de-11e6-9c8c-005056957282 | waiting_to_scan | ok     | process initiated | Container image analysis | active
 7e4bbc2e-48de-11e6-9c8c-005056957282 | waiting_to_scan | ok     | process initiated | Container image analysis | active
(3 rows)


vmdb_production=# select guid,state,status,message,name,dispatch_status from jobs where dispatch_status!='active';
 00759e7c-5a5f-11e6-872e-005056957282 | waiting_to_start | ok     | process initiated | Container image analysis | pending
 29434312-5a60-11e6-872e-005056957282 | waiting_to_start | ok     | process initiated | Container image analysis | pending
(394 rows)

Other errors in the logs:

[----] I, [2016-08-12T06:27:35.161662 #8285:b09988]  INFO -- : MIQ(MiqGenericWorker::Runner) ID [99000000031743] PID [8285] GUID [7fcc106a-6041-11e6-872e-005056957282] Exit request received. Worker exiting.
------------------------

[----] I, [2016-08-11T08:26:11.311503 #25633:b09988]  INFO -- : MIQ(ManageIQ::Providers::OpenshiftEnterprise::ContainerManager::MetricsCollectorWorker::Runner) ID [99000000028351] PID [25633] GUID [6ece4a2c-5f8c-11e6-872e-005056957282] Exit request received. Worker exiting.


----------

[----] E, [2016-08-11T07:13:18.040908 #11818:b09988] ERROR -- : MIQ(Job.check_jobs_for_timeout) Couldn't find VmOrTemplate with 'id'=99000000000003
[----] I, [2016-08-11T07:14:10.374479 #11845:b09988]  INFO -- : MIQ(MiqQueue.put) Message id: [99000002812075],  id: [], Zone: [default], Role: [], Server: [], Ident: [generic], Target id: [], Instance id: [], Task id: [], Command: [Job.check_jobs_for_timeout], Timeout: [600], Priority: [90], State: [ready], Deliver On: [], Data: [], Args: []

Can we remove the jobs from the database? Will that help?
We don't have conclusive information in the logs to understand
why the execution of the active tasks is failing. I don't see
any timeout either.

Customer database restored on: 10.65.200.236 (root:smartvm)

Comment 2 Mooli Tayer 2016-08-17 09:04:41 UTC
Prasad, is this a clone of https://bugzilla.redhat.com/show_bug.cgi?id=1366143 ?

That happens if we have three failed jobs already stuck in the queue but their status isn't reported correctly.
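
To see which jobs the dispatcher counts as active, the same data can be pulled from the appliance Rails console (cd /var/www/miq/vmdb; source /etc/default/evm; bin/rails c). This is a minimal sketch that only uses the columns already shown in the SQL output above:

# List the jobs the dispatcher treats as "active"
Job.where(:dispatch_status => 'active').each do |job|
  puts [job.guid, job.state, job.status, job.message, job.name].join(' | ')
end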

Comment 4 Mooli Tayer 2016-08-17 12:32:35 UTC
Quick fix[1]:
cd /var/www/miq/vmdb/
source /etc/default/evm
bin/rails c

irb(main):016:0> Job.update(:state => 'finished')
irb(main):014:0> Job.destroy_all

[1] since only "finished" or "waiting_to_start" jobs can be deleted.

Comment 6 Mooli Tayer 2016-08-17 12:41:08 UTC
(In reply to Mooli Tayer from comment #4)
> Quick fix[1]:
> cd /var/www/miq/vmdb/
> source /etc/default/evm
> bin/rails c
> 
> irb(main):016:0> Job.update(:state => 'finished')
> irb(main):014:0> Job.destroy_all
> 
> [1] since only "finished" or "waiting_to_start" jobs can be deleted.

Actually, that's very bad. I copied it from what I provided to QE.

We don't want to delete all of a customer's job history.
Just update and delete the jobs that are stuck.
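
For reference, a more targeted variant (a sketch only; it assumes the stuck jobs are exactly the rows with dispatch_status 'active' and state 'waiting_to_scan' shown in the description, so adjust the conditions before running it):

# Update and delete only the stuck scan jobs, leaving the rest of the job history intact
stuck = Job.where(:dispatch_status => 'active', :state => 'waiting_to_scan').to_a
stuck.each do |job|
  job.update(:state => 'finished')  # only 'finished' or 'waiting_to_start' jobs can be deleted
  job.destroy
end

Once the stuck jobs are gone, the dispatcher should be able to start the pending 'waiting_to_start' scans again.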

