| Summary: | SmartState Analysis not working for container images | | |
|---|---|---|---|
| Product: | Red Hat CloudForms Management Engine | Reporter: | Prasad Mukhedkar <pmukhedk> |
| Component: | SmartState Analysis | Assignee: | Rich Oliveri <roliveri> |
| Status: | CLOSED DUPLICATE | QA Contact: | Dave Johnson <dajohnso> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 5.6.0 | CC: | cpelland, jhardy, mtayer, obarenbo, pmukhedk |
| Target Milestone: | GA | | |
| Target Release: | 5.7.0 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | container | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-08-19 08:00:42 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Prasad, is this a clone of https://bugzilla.redhat.com/show_bug.cgi?id=1366143 ?

That happens if we have three failed jobs already stuck in the queue but their status isn't reported correctly. Quick fix[1]:

```
cd /var/www/miq/vmdb/
source /etc/default/evm
bin/rails c

irb(main):016:0> Job.update(:state => 'finished')
irb(main):014:0> Job.destroy_all
```

[1] since only "finished" or "waiting_to_start" jobs can be deleted.

(In reply to Mooli Tayer from comment #4)
> Quick fix[1]:
> cd /var/www/miq/vmdb/
> source /etc/default/evm
> bin/rails c
>
> irb(main):016:0> Job.update(:state => 'finished')
> irb(main):014:0> Job.destroy_all
>
> [1] since only "finished" or "waiting_to_start" jobs can be deleted.

Actually that's very bad. I copied it from what I provided to QE. We don't want to delete all of a customer's job history. Just update and delete only the jobs that are stuck.
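A more targeted variant of that quick fix might look like the following sketch (assumptions: run in the same Rails console, the filter values are taken from the database dump in the description below, and `Job` records become deletable once their `state` is "finished"):

```ruby
# Collect only the scan jobs the dispatcher counts as stuck "active",
# instead of touching the whole jobs table. (Sketch only -- the name and
# dispatch_status values come from this bug's database dump.)
stuck_ids = Job.where(:name            => "Container image analysis",
                      :dispatch_status => "active").pluck(:id)

# Mark just those rows finished first, since only "finished" or
# "waiting_to_start" jobs can be deleted, then remove them.
Job.where(:id => stuck_ids).update_all(:state => "finished")
Job.where(:id => stuck_ids).destroy_all
```

Loading the ids up front keeps the scope fixed, so the `destroy_all` still matches the same rows after `update_all` has changed their state.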
SmartState Analysis for container images is not working. The fleecing task gets stuck in the "waiting_to_start" state indefinitely. In the log I see the following:

```
[----] W, [2016-08-11T09:34:32.938155 #3020:b09988] WARN -- : Q-task_id([job_dispatcher]) MIQ(JobProxyDispatcher#dispatch_to_ems) SKIPPING remaining Container Image scan jobs for Ext Management System [99000000000001] in dispatch since there are [3] active scans in zone [default]
```

This is what I see in the database:

```
vmdb_production=# select guid,state,status,message,name,dispatch_status from jobs where dispatch_status='active';
                 guid                 |      state      | status |      message      |           name           | dispatch_status
--------------------------------------+-----------------+--------+-------------------+--------------------------+-----------------
 7e4d8d06-48de-11e6-9c8c-005056957282 | waiting_to_scan | ok     | process initiated | Container image analysis | active
 7e497702-48de-11e6-9c8c-005056957282 | waiting_to_scan | ok     | process initiated | Container image analysis | active
 7e4bbc2e-48de-11e6-9c8c-005056957282 | waiting_to_scan | ok     | process initiated | Container image analysis | active
(3 rows)

vmdb_production=# select guid,state,status,message,name,dispatch_status from jobs where dispatch_status!='active';
 00759e7c-5a5f-11e6-872e-005056957282 | waiting_to_start | ok | process initiated | Container image analysis | pending
 29434312-5a60-11e6-872e-005056957282 | waiting_to_start | ok | process initiated | Container image analysis | pending
(394 rows)
```

Other errors in the logs:

```
[----] I, [2016-08-12T06:27:35.161662 #8285:b09988] INFO -- : MIQ(MiqGenericWorker::Runner) ID [99000000031743] PID [8285] GUID [7fcc106a-6041-11e6-872e-005056957282] Exit request received. Worker exiting.
------------------------
[----] I, [2016-08-11T08:26:11.311503 #25633:b09988] INFO -- : MIQ(ManageIQ::Providers::OpenshiftEnterprise::ContainerManager::MetricsCollectorWorker::Runner) ID [99000000028351] PID [25633] GUID [6ece4a2c-5f8c-11e6-872e-005056957282] Exit request received. Worker exiting.
----------
[----] E, [2016-08-11T07:13:18.040908 #11818:b09988] ERROR -- : MIQ(Job.check_jobs_for_timeout) Couldn't find VmOrTemplate with 'id'=99000000000003
[----] I, [2016-08-11T07:14:10.374479 #11845:b09988] INFO -- : MIQ(MiqQueue.put) Message id: [99000002812075], id: [], Zone: [default], Role: [], Server: [], Ident: [generic], Target id: [], Instance id: [], Task id: [], Command: [Job.check_jobs_for_timeout], Timeout: [600], Priority: [90], State: [ready], Deliver On: [], Data: [], Args: []
```

Can we remove the jobs from the database? Will that help? We don't have conclusive information in the logs to explain why execution of the active tasks is failing. I don't see any timeout either.

Customer database restored on: 10.65.200.236 (root:smartvm)
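For anyone reproducing this without direct psql access, the same inspection can presumably be done from the Rails console on the appliance; a minimal sketch, using the column names from the queries above:

```ruby
# List the scans the dispatcher counts as active in the zone, and count
# the pending jobs backed up behind them (sketch; mirrors the psql
# queries shown earlier in this report).
Job.where(:dispatch_status => "active")
   .pluck(:guid, :state, :status, :message, :name)
Job.where.not(:dispatch_status => "active").count
```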