Bug 1271049

Summary: UI crashes when smartproxy role enabled
Product: Red Hat CloudForms Management Engine Reporter: Dave Johnson <dajohnso>
Component: ApplianceAssignee: Joe Rafaniello <jrafanie>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Vadim Rutkovsky <vrutkovs>
Severity: medium Docs Contact:
Priority: high    
Version: 5.5.0CC: abellott, dajohnso, gtanzill, jhardy, jprause, obarenbo, vrutkovs
Target Milestone: GA   
Target Release: 5.6.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: appliance:server_role:smartstate
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1291873 (view as bug list) Environment:
Last Closed: 2016-03-01 23:26:47 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1291873    

Description Dave Johnson 2015-10-13 03:46:07 UTC
Description of problem:
==========================
SSA test automation was having issues configuring appliances for SSA testing and it came down to the front end seeming to crash momentarily when the smartproxy role is enabled.  


[----] I, [2015-10-12T23:39:14.341928 #4600:10d5998]  INFO -- : MiqSmartProxyWorker::Runner started. ID [23], PID [4600], GUID [f2272d96-715b-11e5-8125-001a4a2dd79b], Zone [default], Role [automate,database_operations,database_owner,ems_inventory,ems_operations,event,reporting,scheduler,smartproxy,smartstate,user_interface,web_services]
[----] I, [2015-10-12T23:39:14.342022 #4600:10d5998]  INFO -- : MIQ(MiqSmartProxyWorker::Runner#do_wait_for_worker_monitor) MIQ(MiqSmartProxyWorker::Runner) Checking that worker monitor has started before doing work
[----] I, [2015-10-12T23:39:14.346468 #4600:10d5998]  INFO -- : MIQ(MiqSmartProxyWorker::Runner#do_wait_for_worker_monitor) MIQ(MiqSmartProxyWorker::Runner) Starting work since worker monitor has started
[----] I, [2015-10-12T23:39:14.349368 #4600:10d5998]  INFO -- : MIQ(MiqSmartProxyWorker::Runner#worker_monitor_drb) MIQ(MiqSmartProxyWorker::Runner) Initializing DRb Connection to MiqServer with ID=[1], NAME=[EVM], PID=[3643], GUID=[2eb2bf38-715b-11e5-8d8e-001a4a2dd79b] DRb URI=[druby://127.0.0.1:35486]
[----] I, [2015-10-12T23:39:14.354193 #4600:10d5998]  INFO -- : MIQ(MiqSmartProxyWorker::Runner#message_sync_active_roles) MIQ(MiqSmartProxyWorker::Runner) Synchronizing active roles...
[----] I, [2015-10-12T23:39:14.354282 #4600:10d5998]  INFO -- : MIQ(MiqSmartProxyWorker::Runner#message_sync_active_roles) MIQ(MiqSmartProxyWorker::Runner) Synchronizing active roles complete...
[----] I, [2015-10-12T23:39:15.299737 #4473:6f7994]  INFO -- : MIQ(MiqGenericWorker::Runner#get_message_via_drb) Message id: [65], MiqWorker id: [15], Zone: [default], Role: [], Server: [2eb2bf38-715b-11e5-8d8e-001a4a2dd79b], Ident: [generic], Target id: [], Instance id: [], Task id: [], Command: [MiqDbConfig.log_statistics], Timeout: [600], Priority: [90], State: [dequeue], Deliver On: [], Data: [], Args: [], Dequeued in: [4.425438067] seconds
[----] I, [2015-10-12T23:39:15.299857 #4473:6f7994]  INFO -- : MIQ(MiqQueue#deliver) Message id: [65], Delivering...
[----] I, [2015-10-12T23:39:15.301989 #4476:faf988]  INFO -- : MIQ(MiqGenericWorker::Runner#get_message_via_drb) Message id: [66], MiqWorker id: [16], Zone: [default], Role: [], Server: [2eb2bf38-715b-11e5-8d8e-001a4a2dd79b], Ident: [generic], Target id: [], Instance id: [], Task id: [], Command: [MiqServer.queue_update_registration_status], Timeout: [600], Priority: [90], State: [dequeue], Deliver On: [], Data: [], Args: [], Dequeued in: [4.417689448] seconds
[----] I, [2015-10-12T23:39:15.302108 #4476:faf988]  INFO -- : MIQ(MiqQueue#deliver) Message id: [66], Delivering...
[----] I, [2015-10-12T23:39:15.304293 #4476:faf988]  INFO -- : MIQ(MiqQueue#delivered) Message id: [66], State: [ok], Delivered in [0.002183599] seconds
[----] I, [2015-10-12T23:39:15.319424 #4476:faf988]  INFO -- : MIQ(MiqGenericWorker::Runner#get_message_via_drb) Message id: [67], MiqWorker id: [16], Zone: [default], Role: [], Server: [], Ident: [generic], Target id: [], Instance id: [], Task id: [], Command: [Job.check_jobs_for_timeout], Timeout: [600], Priority: [90], State: [dequeue], Deliver On: [], Data: [], Args: [], Dequeued in: [4.423509325] seconds
[----] I, [2015-10-12T23:39:15.319605 #4476:faf988]  INFO -- : MIQ(MiqQueue#deliver) Message id: [67], Delivering...
[----] I, [2015-10-12T23:39:15.326593 #4476:faf988]  INFO -- : MIQ(MiqQueue#delivered) Message id: [67], State: [ok], Delivered in [0.006979168] seconds
[----] I, [2015-10-12T23:39:15.343026 #4476:faf988]  INFO -- : MIQ(MiqGenericWorker::Runner#get_message_via_drb) Message id: [68], MiqWorker id: [16], Zone: [default], Role: [], Server: [2eb2bf38-715b-11e5-8d8e-001a4a2dd79b], Ident: [generic], Target id: [], Instance id: [], Task id: [], Command: [Session.check_session_timeout], Timeout: [600], Priority: [90], State: [dequeue], Deliver On: [], Data: [], Args: [], Dequeued in: [4.433193558] seconds
[----] I, [2015-10-12T23:39:15.343151 #4476:faf988]  INFO -- : MIQ(MiqQueue#deliver) Message id: [68], Delivering...
[----] I, [2015-10-12T23:39:15.355603 #4476:faf988]  INFO -- : MIQ(MiqQueue#delivered) Message id: [68], State: [ok], Delivered in [0.012449427] seconds
[----] I, [2015-10-12T23:39:15.397878 #4473:6f7994]  INFO -- : MIQ(DbConfig.log_activity_statistics) <<-ACTIVITY_STATS_CSV
session_id,xact_start,last_request_start_time,command,task_state,login,application,request_id,net_address,host_name,client_port,wait_time_ms,blocked_by
4338,,2015-10-13 03:39:12 UTC,"SELECT  id, lock_version, priority, role FROM ""miq_queue"" WHERE (    state = 'ready'
    AND (zone IS NULL OR zone = 'default')
    AND queue_name = 'automate'
    AND (role IS NULL OR role IN ('automate','database_operations','database_owner','ems_inventory','ems_operations','event','reporting','scheduler','smartproxy','smartstate','user_interface','web_services'))
    AND (server_guid IS NULL OR server_guid = '2eb2bf38-715b-11e5-8d8e-001a4a2dd79b')
    AND (deliver_on IS NULL OR deliver_on <= '2015-10-13 03:39:12.694788')
    AND (priority <= 200)
)  ORDER BY ""miq_queue"".""priority"" ASC, ""miq_queue"".""id"" ASC LIMIT 200",false,root,/var/www/miq/vmdb/lib/workers/bin/evm_server.rb,16384,,,-1,2,
4340,,2015-10-13 03:39:08 UTC,"SELECT  ""zones"".* FROM ""zones"" WHERE ""zones"".""id"" = $1 LIMIT 1",false,root,/var/www/miq/vmdb/bin/rails,16384,,,-1,7,
4382,,2015-10-13 03:37:46 UTC,ROLLBACK,false,root,"",16384,10.3.113.67,,41076,88,
4414,,2015-10-13 03:39:10 UTC,"SELECT ""miq_schedules"".* FROM ""miq_schedules"" WHERE (updated_at > '2015-10-13 03:38:55.844186')",false,root,/var/www/miq/vmdb/lib/workers/bin/worker.rb,16384,,,-1,4,
4502,,2015-10-13 03:39:14 UTC,SELECT pg_backend_pid(),false,root,/var/www/miq/vmdb/lib/workers/bin/worker.rb,16384,,,-1,0,
4503,,2015-10-13 03:39:14 UTC,SELECT pg_backend_pid(),false,root,/var/www/miq/vmdb/lib/workers/bin/worker.rb,16384,,,-1,0,
4527,,2015-10-13 03:39:13 UTC,SELECT pg_backend_pid(),false,root,/var/www/miq/vmdb/lib/workers/bin/worker.rb,16384,,,-1,1,
4530,,2015-10-13 03:39:12 UTC,SELECT pg_backend_pid(),false,root,/var/www/miq/vmdb/lib/workers/bin/worker.rb,16384,,,-1,2,
4533,,2015-10-13 03:39:13 UTC,SELECT pg_backend_pid(),false,root,/var/www/miq/vmdb/lib/workers/bin/worker.rb,16384,,,-1,2,
4537,2015-10-13 03:39:15 UTC,2015-10-13 03:39:15 UTC,"SELECT ""pg_stat_activity"".* FROM ""pg_stat_activity"" WHERE ""pg_stat_activity"".""datname"" = $1",false,root,/var/www/miq/vmdb/lib/workers/bin/worker.rb,16384,,,-1,0,
4540,2015-10-13 03:39:15 UTC,2015-10-13 03:39:15 UTC,"UPDATE ""miq_queue"" SET ""state"" = 'dequeue', ""handler_id"" = 16, ""handler_type"" = 'MiqWorker', ""updated_on"" = '2015-10-13 03:39:15.313790', ""lock_version"" = 1 WHERE ""miq_queue"".""id"" = $1 AND ""miq_queue"".""lock_version"" = $2",false,root,/var/www/miq/vmdb/lib/workers/bin/worker.rb,16384,,,-1,0,
4543,,2015-10-13 03:39:12 UTC,SELECT pg_backend_pid(),false,root,/var/www/miq/vmdb/lib/workers/bin/worker.rb,16384,,,-1,2,
4544,,2015-10-13 03:39:12 UTC,SELECT pg_backend_pid(),false,root,/var/www/miq/vmdb/lib/workers/bin/worker.rb,16384,,,-1,2,
4610,,2015-10-13 03:39:13 UTC,SELECT pg_backend_pid(),false,root,/var/www/miq/vmdb/lib/workers/bin/worker.rb,16384,,,-1,1,
4615,,2015-10-13 03:39:14 UTC,SELECT pg_backend_pid(),false,root,/var/www/miq/vmdb/lib/workers/bin/worker.rb,16384,,,-1,1,
ACTIVITY_STATS_CSV
[----] I, [2015-10-12T23:39:15.398496 #4473:6f7994]  INFO -- : MIQ(MiqQueue#delivered) Message id: [65], State: [ok], Delivered in [0.098624798] seconds
[----] I, [2015-10-12T23:39:22.249993 #4084:118e4fc]  INFO -- : MIQ(MiqUiWorker::Runner#message_sync_config) MIQ(MiqUiWorker::Runner) Synchronizing configuration...
[----] E, [2015-10-12T23:39:22.253815 #4084:118e4fc] ERROR -- : MIQ(MiqUiWorker::Runner) ID [9] PID [4084] GUID [63e07be6-715b-11e5-8125-001a4a2dd79b] Error in update_worker_record_at_exit: PG::UnableToSend: no connection to the server
: SELECT  "miq_workers".* FROM "miq_workers" WHERE "miq_workers"."type" IN ('MiqUiWorker') AND "miq_workers"."id" = $1 LIMIT 1
[----] E, [2015-10-12T23:39:22.253975 #4084:118e4fc] ERROR -- : MIQ(MiqUiWorker::Runner) ID [9] PID [4084] GUID [63e07be6-715b-11e5-8125-001a4a2dd79b] Error heartbeating because ActiveRecord::StatementInvalid: PG::UnableToSend: server closed the connection unexpectedly
	This probably means the server terminated abnormally
	before or while processing the request.
: SELECT "server_roles".* FROM "server_roles" INNER JOIN "assigned_server_roles" ON "server_roles"."id" = "assigned_server_roles"."server_role_id" WHERE "assigned_server_roles"."miq_server_id" = $1
/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/connection_adapters/postgresql_adapter.rb:606:in `exec_prepared'\n/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/connection_adapters/postgresql_adapter.rb:606:in `block in exec_cache'\n/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/connection_adapters/abstract_adapter.rb:473:in `block in log'\n/opt/rh/cfme-gemset/gems/activesupport-4.2.4/lib/active_support/notifications/instrumenter.rb:20:in `instrument'\n/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/connection_adapters/abstract_adapter.rb:467:in `log'\n/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/connection_adapters/postgresql_adapter.rb:605:in `exec_cache'\n/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/connection_adapters/postgresql_adapter.rb:589:in `execute_and_clear'\n/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/connection_adapters/postgresql/database_statements.rb:160:in `exec_query'\n/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/connection_adapters/abstract/database_statements.rb:351:in `select'\n/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/connection_adapters/abstract/database_statements.rb:32:in `select_all'\n/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/connection_adapters/abstract/query_cache.rb:70:in `select_all'\n/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/querying.rb:39:in `find_by_sql'\n/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/statement_cache.rb:107:in `execute'\n/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/associations/collection_association.rb:445:in `get_records'\n/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/associations/has_many_through_association.rb:224:in `find_target'\n/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/associations/collection_association.rb:392:in `load_target'\n/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/associations/collection_proxy.rb:45:in `load_target'\n/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/associations/collection_proxy.rb:945:in `to_ary'\n/opt/rh/cfme-gemset/gems/activerecord-4.2.4/lib/active_record/relation/delegation.rb:46:in `collect'\n/var/www/miq/vmdb/app/models/miq_server/role_management.rb:119:in `server_role_names'\n/var/www/miq/vmdb/app/models/miq_server/role_management.rb:125:in `role'\n/var/www/miq/vmdb/app/models/miq_server.rb:609:in `my_role'\n/var/www/miq/vmdb/app/models/miq_worker/runner.rb:278:in `sync_config'\n/var/www/miq/vmdb/app/models/miq_worker/runner.rb:268:in `message_sync_config'\n/var/www/miq/vmdb/app/models/miq_worker/runner.rb:412:in `process_message'\n/var/www/miq/vmdb/app/models/miq_worker/runner.rb:343:in `block in heartbeat'\n/var/www/miq/vmdb/app/models/miq_worker/runner.rb:343:in `each'\n/var/www/miq/vmdb/app/models/miq_worker/runner.rb:343:in `heartbeat'\n/var/www/miq/vmdb/app/models/miq_worker/runner.rb:322:in `block in do_work_loop'\n/var/www/miq/vmdb/app/models/miq_worker/runner.rb:320:in `loop'\n/var/www/miq/vmdb/app/models/miq_worker/runner.rb:320:in `do_work_loop'\n/var/www/miq/vmdb/app/models/miq_worker/runner.rb:142:in `run'\n/var/www/miq/vmdb/lib/vmdb/initializer.rb:46:in `block in init' Worker exiting.
[----] I, [2015-10-12T23:39:25.268846 #4093:dd2534]  INFO -- : MIQ(MiqWebServiceWorker::Runner#message_sync_config) MIQ(MiqWebServiceWorker::Runner) Synchronizing configuration...
[----] I, [2015-10-12T23:39:25.269617 #4093:dd2534]  INFO -- : MIQ(MiqWebServiceWorker::Runner#sync_config) ID [10], PID [4093], GUID [656b216e-715b-11e5-8125-001a4a2dd79b], Zone [default], Active Roles [automate,database_operations,database_owner,ems_inventory,ems_operations,event,reporting,scheduler,smartstate,user_interface,web_services], Assigned Roles [automate,database_operations,database_owner,ems_inventory,ems_operations,event,reporting,scheduler,smartstate,user_interface,web_services], Configuration:
[----] I, [2015-10-12T23:39:25.270200 #4093:dd2534]  INFO -- :   :count: 1
[----] I, [2015-10-12T23:39:25.270246 #4093:dd2534]  INFO -- :   :gc_interval: 900
[----] I, [2015-10-12T23:39:25.270282 #4093:dd2534]  INFO -- :   :heartbeat_freq: 60
[----] I, [2015-10-12T23:39:25.270316 #4093:dd2534]  INFO -- :   :heartbeat_timeout: 120
[----] I, [2015-10-12T23:39:25.270433 #4093:dd2534]  INFO -- :   :memory_threshold: 1073741824
[----] I, [2015-10-12T23:39:25.270477 #4093:dd2534]  INFO -- :   :nice_delta: 1
[----] I, [2015-10-12T23:39:25.270513 #4093:dd2534]  INFO -- :   :parent_time_threshold: 180
[----] I, [2015-10-12T23:39:25.270562 #4093:dd2534]  INFO -- :   :poll: 60
[----] I, [2015-10-12T23:39:25.270598 #4093:dd2534]  INFO -- :   :poll_escalate_max: 30
[----] I, [2015-10-12T23:39:25.270633 #4093:dd2534]  INFO -- :   :poll_method: :normal
[----] I, [2015-10-12T23:39:25.270666 #4093:dd2534]  INFO -- :   :restart_interval: 0
[----] I, [2015-10-12T23:39:25.270699 #4093:dd2534]  INFO -- :   :starting_timeout: 600
[----] I, [2015-10-12T23:39:25.270732 #4093:dd2534]  INFO -- :   :connection_pool_size: 5
[----] I, [2015-10-12T23:39:25.270763 #4093:dd2534]  INFO -- : ---
[----] I, [2015-10-12T23:39:25.270993 #4093:dd2534]  INFO -- :   :guid: 656b216e-715b-11e5-8125-001a4a2dd79b


Version-Release number of selected component (if applicable):
5.5.0.5 beta 1


How reproducible:
100%


Steps to Reproduce:
============================
1.  deploy 5.5.0.5
2.  configure database
3.  enable smartproxy role

Actual results:


Expected results:


Additional info:

Comment 3 Joe Rafaniello 2015-11-18 19:28:22 UTC
Sorry Dave, I dropped the ball on this one... is this still an issue?

It looks like the database was restarted while the workers were exiting.  I'm not sure how that can happen when you're initially setting up the appliance.  

I don't understand the workflow.

Comment 5 Joe Rafaniello 2016-03-01 20:57:03 UTC
QE, can this be recreated?  I'm not sure what circumstances can cause the server to close your connection.

Comment 6 Vadim Rutkovsky 2016-03-01 21:09:22 UTC
I've never seen this happening across many providers.

Lets close it until we find a reliable way to reproduce it.

Comment 7 Joe Rafaniello 2016-03-01 23:26:47 UTC
Closing due to lack of information.