Description of problem: A change in logging of important queue state information has been exposed in a recent set of logs from a customer where different roles are active in appliances in different zones. In a "worker" zone, we have access to the statistics generated into the evm.log file every 5 minutes showing queue counts by state for each zone in a region. Neither of the logs from the 'db' zone appliance nor the 'ui' zone appliance have this information at all. Historically this information is useful in determining all sorts of problems and since the 'db' zone and 'ui' zone appliance logs are typically much smaller they are routinely used as the source from which to gather this information. Sample log lines are provided below. The 'ask' is to restore these log lines to all appliances regardless of active roles at the same 5 minute frequency. Version-Release number of selected component (if applicable):Version: 5.5.2.4 How reproducible: Steps to Reproduce: 1. create a CFME 5.5.2.4 applaince and operate it with only the UI role active 2. grep the log for 'count for state' to gather log lines 3. result is zero log lines Actual results: zero log lines (see attached customer logs from UI zone appliance) Expected results: log lines detailing information only now available on 'worker' zone appliance (also attached for review) Additional info: sample log lines from customer worker1 zone applaince: ===== [----] I, [2016-03-01T09:02:50.804294 #5156:122b98c] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2720)] MiqQueue count for state=["dequeue"] by zone and role: {"Worker"=>{nil=>1, "ems_metrics_coordinator"=>1}} [----] I, [2016-03-01T09:02:50.804636 #5156:122b98c] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2720)] MiqQueue count for state=["error"] by zone and role: {"DB"=>{nil=>12}, "Worker"=>{nil=>4}, "UI"=>{nil=>3}} [----] I, [2016-03-01T09:02:50.805155 #5156:122b98c] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2720)] MiqQueue count for state=["ready"] by zone and role: {"Worker"=>{nil=>8, "smartstate"=>1}, "default"=>{nil=>14, "event"=>1, "smartstate"=>1}, "DB"=>{"automate"=>17791, nil=>59, "smartstate"=>1, "ems_metrics_processor"=>248}, "UI"=>{"smartstate"=>1, "ems_metrics_processor"=>207, "ems_operations"=>95, nil=>58}} [----] I, [2016-03-01T09:02:50.868936 #5156:122b98c] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2720)] Job count for state=["finished"] by zone and process_type: {"Worker"=>{"VmScan"=>20837}} ===== worker1 evm.log provided as an attachment UI worker evm.log provided for counter example.
Created attachment 1132624 [details] current evm.log file from worker1 appliance containing usefult log lines
Created attachment 1132626 [details] current UI worker appliance log showing no instances of log lines
I'm not sure if I'm misunderstanding the problem or if I was just not able to reproduce this. I've set up two appliances both pointing to the same database, both in the default zone. I set up one appliance which has the roles: Automation Engine Database Operations Event Monitor Provider Inventory Provider Operations Reporting Scheduler SmartState Analysis This appliance is logging lines like the following: [----] I, [2016-03-15T13:20:00.524641 #2977:5c3988] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2747)] MiqQueue count for state=["ready"] by zone and role: {"default"=>{nil=>7}} [----] I, [2016-03-15T13:24:59.027060 #2980:da398c] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2747)] MiqQueue count for state=["dequeue"] by zone and role: {"default"=>{nil=>1}} [----] I, [2016-03-15T13:24:59.027133 #2980:da398c] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2747)] MiqQueue count for state=["ready"] by zone and role: {"default"=>{nil=>2}} The other appliance has only the roles: User Interface Web Services This appliance is also logging lines like the following: [----] I, [2016-03-15T13:18:43.041495 #2940:c9b990] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2733)] MiqQueue count for state=["dequeue"] by zone and role: {"default"=>{nil=>1}} [----] I, [2016-03-15T13:18:43.041561 #2940:c9b990] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (2733)] MiqQueue count for state=["ready"] by zone and role: {"default"=>{"smartstate"=>1}} Even after creating a new zone and moving the UI only appliance into that one I'm still getting the log lines in question in evm.log: [----] I, [2016-03-15T14:23:57.425074 #6335:f41988] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (6144)] MiqQueue count for state=["dequeue"] by zone and role: {"UI test zone"=>{nil=>1}} [----] I, [2016-03-15T14:23:57.425286 #6335:f41988] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (6144)] MiqQueue count for state=["error"] by zone and role: {"UI test zone"=>{nil=>1}} [----] I, [2016-03-15T14:23:57.425329 #6335:f41988] INFO -- : Q-task_id([log_status]) MIQ(MiqServer.log_system_status) [EVM Server (6144)] MiqQueue count for state=["ready"] by zone and role: {"UI test zone"=>{"smartstate"=>1, nil=>1, "automate"=>17}, "default"=>{nil=>2}} Is this the type of situation that seemed to be causing the issue on the customer site or is there something that I have overlooked? Thanks.
This was happening because the MiqScheduleWorker was not running due to the issue described here https://bugzilla.redhat.com/show_bug.cgi?id=1296192 If the MiqScheduleWorker is running, these log lines should be present. Closing this as a duplicate as the fix is identical to the one already implemented for https://bugzilla.redhat.com/show_bug.cgi?id=1296192 Whether this change will be applied to customers who upgrade should be investigated. *** This bug has been marked as a duplicate of bug 1296192 ***