Bug 1314113 - CFME 4.0 change in logging denies support ready and reliable access to important queue depth statistics previously available from any appliance in a region.
Summary: CFME 4.0 change in logging denies support ready and reliable access to impor...
Keywords:
Status: CLOSED DUPLICATE of bug 1296192
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Appliance
Version: 5.4.0
Hardware: All
OS: Linux
medium
medium
Target Milestone: GA
: 5.6.0
Assignee: Nick Carboni
QA Contact: Dave Johnson
URL:
Whiteboard: support
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-03-03 00:07 UTC by Thomas Hennessy
Modified: 2019-10-10 11:25 UTC (History)
8 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-03-15 20:36:21 UTC
Category: ---
Cloudforms Team: ---
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
current evm.log file from worker1 appliance containing usefult log lines (4.96 MB, application/zip)
2016-03-03 04:33 UTC, Thomas Hennessy
no flags Details
current UI worker appliance log showing no instances of log lines (5.41 MB, application/zip)
2016-03-03 04:35 UTC, Thomas Hennessy
no flags Details

Description Thomas Hennessy 2016-03-03 00:07:47 UTC
Description of problem: A change in logging of important queue state information has been exposed in a recent set of logs from a customer where different roles are active in appliances in different zones.

In a "worker" zone, we have access to the statistics generated into the evm.log file every 5 minutes showing queue counts by state for each zone in a region.

Neither of the logs from the 'db' zone appliance nor the 'ui' zone appliance have this information at all.

Historically this information is useful in determining all sorts of problems and since the 'db' zone and 'ui' zone appliance logs are typically much smaller they are routinely used as the source from which to gather this information.  Sample log lines are provided below.

The 'ask' is to restore these log lines to all appliances regardless of active roles at the same 5 minute frequency.


Version-Release number of selected component (if applicable):Version: 5.5.2.4


How reproducible:


Steps to Reproduce:
1. create a CFME 5.5.2.4 applaince and operate it with only the UI role active
2. grep the log for 'count for state' to gather log lines
3. result is zero log lines

Actual results: zero log lines (see attached customer logs from UI zone appliance)


Expected results: log lines detailing information only now available on 'worker' zone appliance (also attached for review)


Additional info:
sample log lines from customer worker1 zone applaince:
=====
[----] I, [2016-03-01T09:02:50.804294 #5156:122b98c]  INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2720)] MiqQueue count for state=["dequeue"] by zone and role: {"Worker"=>{nil=>1, "ems_metrics_coordinator"=>1}}
[----] I, [2016-03-01T09:02:50.804636 #5156:122b98c]  INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2720)] MiqQueue count for state=["error"] by zone and role: {"DB"=>{nil=>12}, "Worker"=>{nil=>4}, "UI"=>{nil=>3}}
[----] I, [2016-03-01T09:02:50.805155 #5156:122b98c]  INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2720)] MiqQueue count for state=["ready"] by zone and role: {"Worker"=>{nil=>8, "smartstate"=>1}, "default"=>{nil=>14, "event"=>1, "smartstate"=>1}, "DB"=>{"automate"=>17791, nil=>59, "smartstate"=>1, "ems_metrics_processor"=>248}, "UI"=>{"smartstate"=>1, "ems_metrics_processor"=>207, "ems_operations"=>95, nil=>58}}
[----] I, [2016-03-01T09:02:50.868936 #5156:122b98c]  INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2720)] Job count for state=["finished"] by zone and process_type: {"Worker"=>{"VmScan"=>20837}} 
=====
worker1 evm.log provided as an attachment

UI worker evm.log provided for counter example.

Comment 2 Thomas Hennessy 2016-03-03 04:33:38 UTC
Created attachment 1132624 [details]
current evm.log file from worker1 appliance containing usefult log lines

Comment 3 Thomas Hennessy 2016-03-03 04:35:51 UTC
Created attachment 1132626 [details]
current UI worker appliance log showing no instances of log lines

Comment 4 Nick Carboni 2016-03-15 18:43:41 UTC
I'm not sure if I'm misunderstanding the problem or if I was just not able to reproduce this.

I've set up two appliances both pointing to the same database, both in the default zone.

I set up one appliance which has the roles:
 Automation Engine
 Database Operations
 Event Monitor
 Provider Inventory
 Provider Operations
 Reporting
 Scheduler
 SmartState Analysis

This appliance is logging lines like the following:

[----] I, [2016-03-15T13:20:00.524641 #2977:5c3988]  INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2747)] MiqQueue count for state=["ready"] by zone and role: {"default"=>{nil=>7}}
[----] I, [2016-03-15T13:24:59.027060 #2980:da398c]  INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2747)] MiqQueue count for state=["dequeue"] by zone and role: {"default"=>{nil=>1}}
[----] I, [2016-03-15T13:24:59.027133 #2980:da398c]  INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2747)] MiqQueue count for state=["ready"] by zone and role: {"default"=>{nil=>2}}

The other appliance has only the roles:
 User Interface
 Web Services

This appliance is also logging lines like the following:

[----] I, [2016-03-15T13:18:43.041495 #2940:c9b990]  INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2733)] MiqQueue count for state=["dequeue"] by zone and role: {"default"=>{nil=>1}}
[----] I, [2016-03-15T13:18:43.041561 #2940:c9b990]  INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2733)] MiqQueue count for state=["ready"] by zone and role: {"default"=>{"smartstate"=>1}}

Even after creating a new zone and moving the UI only appliance into that one I'm still getting the log lines in question in evm.log:

[----] I, [2016-03-15T14:23:57.425074 #6335:f41988]  INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (6144)] MiqQueue count for state=["dequeue"] by zone and role: {"UI test zone"=>{nil=>1}}
[----] I, [2016-03-15T14:23:57.425286 #6335:f41988]  INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (6144)] MiqQueue count for state=["error"] by zone and role: {"UI test zone"=>{nil=>1}}
[----] I, [2016-03-15T14:23:57.425329 #6335:f41988]  INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (6144)] MiqQueue count for state=["ready"] by zone and role: {"UI test zone"=>{"smartstate"=>1, nil=>1, "automate"=>17}, "default"=>{nil=>2}}

Is this the type of situation that seemed to be causing the issue on the customer site or is there something that I have overlooked?

Thanks.

Comment 5 Nick Carboni 2016-03-15 20:36:21 UTC
This was happening because the MiqScheduleWorker was not running due to the issue described here https://bugzilla.redhat.com/show_bug.cgi?id=1296192

If the MiqScheduleWorker is running, these log lines should be present.

Closing this as a duplicate as the fix is identical to the one already implemented for https://bugzilla.redhat.com/show_bug.cgi?id=1296192

Whether this change will be applied to customers who upgrade should be investigated.

*** This bug has been marked as a duplicate of bug 1296192 ***


Note You need to log in before you can comment on or make changes to this bug.