| Summary: | CFME 4.0 change in logging denies support ready and reliable access to important queue depth statistics previously available from any appliance in a region. | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | Red Hat CloudForms Management Engine | Reporter: | Thomas Hennessy <thenness> | ||||||
| Component: | Appliance | Assignee: | Nick Carboni <ncarboni> | ||||||
| Status: | CLOSED DUPLICATE | QA Contact: | Dave Johnson <dajohnso> | ||||||
| Severity: | medium | Docs Contact: | |||||||
| Priority: | medium | ||||||||
| Version: | 5.4.0 | CC: | abellott, jdeubel, jhardy, jocarter, mfeifer, ncarboni, obarenbo, thenness | ||||||
| Target Milestone: | GA | ||||||||
| Target Release: | 5.6.0 | ||||||||
| Hardware: | All | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | support | ||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2016-03-15 20:36:21 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Attachments: |
|
||||||||
Created attachment 1132624 [details]
current evm.log file from worker1 appliance containing usefult log lines
Created attachment 1132626 [details]
current UI worker appliance log showing no instances of log lines
I'm not sure if I'm misunderstanding the problem or if I was just not able to reproduce this.
I've set up two appliances both pointing to the same database, both in the default zone.
I set up one appliance which has the roles:
Automation Engine
Database Operations
Event Monitor
Provider Inventory
Provider Operations
Reporting
Scheduler
SmartState Analysis
This appliance is logging lines like the following:
[----] I, [2016-03-15T13:20:00.524641 #2977:5c3988] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2747)] MiqQueue count for state=["ready"] by zone and role: {"default"=>{nil=>7}}
[----] I, [2016-03-15T13:24:59.027060 #2980:da398c] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2747)] MiqQueue count for state=["dequeue"] by zone and role: {"default"=>{nil=>1}}
[----] I, [2016-03-15T13:24:59.027133 #2980:da398c] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2747)] MiqQueue count for state=["ready"] by zone and role: {"default"=>{nil=>2}}
The other appliance has only the roles:
User Interface
Web Services
This appliance is also logging lines like the following:
[----] I, [2016-03-15T13:18:43.041495 #2940:c9b990] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2733)] MiqQueue count for state=["dequeue"] by zone and role: {"default"=>{nil=>1}}
[----] I, [2016-03-15T13:18:43.041561 #2940:c9b990] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2733)] MiqQueue count for state=["ready"] by zone and role: {"default"=>{"smartstate"=>1}}
Even after creating a new zone and moving the UI only appliance into that one I'm still getting the log lines in question in evm.log:
[----] I, [2016-03-15T14:23:57.425074 #6335:f41988] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (6144)] MiqQueue count for state=["dequeue"] by zone and role: {"UI test zone"=>{nil=>1}}
[----] I, [2016-03-15T14:23:57.425286 #6335:f41988] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (6144)] MiqQueue count for state=["error"] by zone and role: {"UI test zone"=>{nil=>1}}
[----] I, [2016-03-15T14:23:57.425329 #6335:f41988] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (6144)] MiqQueue count for state=["ready"] by zone and role: {"UI test zone"=>{"smartstate"=>1, nil=>1, "automate"=>17}, "default"=>{nil=>2}}
Is this the type of situation that seemed to be causing the issue on the customer site or is there something that I have overlooked?
Thanks.
This was happening because the MiqScheduleWorker was not running due to the issue described here https://bugzilla.redhat.com/show_bug.cgi?id=1296192 If the MiqScheduleWorker is running, these log lines should be present. Closing this as a duplicate as the fix is identical to the one already implemented for https://bugzilla.redhat.com/show_bug.cgi?id=1296192 Whether this change will be applied to customers who upgrade should be investigated. *** This bug has been marked as a duplicate of bug 1296192 *** |
Description of problem: A change in logging of important queue state information has been exposed in a recent set of logs from a customer where different roles are active in appliances in different zones. In a "worker" zone, we have access to the statistics generated into the evm.log file every 5 minutes showing queue counts by state for each zone in a region. Neither of the logs from the 'db' zone appliance nor the 'ui' zone appliance have this information at all. Historically this information is useful in determining all sorts of problems and since the 'db' zone and 'ui' zone appliance logs are typically much smaller they are routinely used as the source from which to gather this information. Sample log lines are provided below. The 'ask' is to restore these log lines to all appliances regardless of active roles at the same 5 minute frequency. Version-Release number of selected component (if applicable):Version: 5.5.2.4 How reproducible: Steps to Reproduce: 1. create a CFME 5.5.2.4 applaince and operate it with only the UI role active 2. grep the log for 'count for state' to gather log lines 3. result is zero log lines Actual results: zero log lines (see attached customer logs from UI zone appliance) Expected results: log lines detailing information only now available on 'worker' zone appliance (also attached for review) Additional info: sample log lines from customer worker1 zone applaince: ===== [----] I, [2016-03-01T09:02:50.804294 #5156:122b98c] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2720)] MiqQueue count for state=["dequeue"] by zone and role: {"Worker"=>{nil=>1, "ems_metrics_coordinator"=>1}} [----] I, [2016-03-01T09:02:50.804636 #5156:122b98c] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2720)] MiqQueue count for state=["error"] by zone and role: {"DB"=>{nil=>12}, "Worker"=>{nil=>4}, "UI"=>{nil=>3}} [----] I, [2016-03-01T09:02:50.805155 #5156:122b98c] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2720)] MiqQueue count for state=["ready"] by zone and role: {"Worker"=>{nil=>8, "smartstate"=>1}, "default"=>{nil=>14, "event"=>1, "smartstate"=>1}, "DB"=>{"automate"=>17791, nil=>59, "smartstate"=>1, "ems_metrics_processor"=>248}, "UI"=>{"smartstate"=>1, "ems_metrics_processor"=>207, "ems_operations"=>95, nil=>58}} [----] I, [2016-03-01T09:02:50.868936 #5156:122b98c] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2720)] Job count for state=["finished"] by zone and process_type: {"Worker"=>{"VmScan"=>20837}} ===== worker1 evm.log provided as an attachment UI worker evm.log provided for counter example.