Bug 1445928 - It is impossible to identify the source process/appliance for each connection in pg_stat_activity
Summary: It is impossible to identify the source process/appliance for each connection...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat CloudForms Management Engine
Classification: Red Hat
Component: Appliance
Version: 5.7.0
Hardware: Unspecified
OS: Unspecified
unspecified
unspecified
Target Milestone: GA
: 5.9.0
Assignee: Joe Rafaniello
QA Contact: Tasos Papaioannou
URL:
Whiteboard: appliance:database
Depends On:
Blocks: 1449389 1449390 1458339
TreeView+ depends on / blocked
 
Reported: 2017-04-26 19:59 UTC by Joe Rafaniello
Modified: 2018-03-06 15:32 UTC (History)
5 users (show)

Fixed In Version: 5.9.0.1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1449389 1449390 1458339 (view as bug list)
Environment:
Last Closed: 2018-03-06 15:32:31 UTC
Category: ---
Cloudforms Team: ---
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Joe Rafaniello 2017-04-26 19:59:03 UTC
Description of problem:  If you have several appliances and an issue occurs in postgresql, it's impossible to track postgresql connections back to the appliance and process that created that connection.  

pg_stat_activity looks something like this:

vmdb_production=# select pid, application_name from pg_stat_activity;
 pid   | application_name
 ------+--------------------------------------------------
 13770 | /var/www/miq/vmdb/lib/workers/bin/evm_server.rb
 13950 | /var/www/miq/vmdb/lib/workers/bin/evm_server.rb
 13957 | /var/www/miq/vmdb/lib/workers/bin/evm_server.rb
 13969 | /var/www/miq/vmdb/lib/workers/bin/evm_server.rb
 13975 | /var/www/miq/vmdb/lib/workers/bin/evm_server.rb
 13984 | /var/www/miq/vmdb/lib/workers/bin/evm_server.rb


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Start several appliances
2. Run this query: select pid, application_name from pg_stat_activity;
3. Cry

Actual results:  There are hundreds of connections with very little identifiable information as to the source other than a possible ip address.


Expected results:  You can customize the application_name in postgresql so each process can have a unique application_name which will make it much easier to find the source of database connection issues.


Additional info:

Comment 3 Joe Rafaniello 2017-04-26 20:00:57 UTC
When we need to do database lock or connection triaging, it's really hard to tell where each connection originated from because we don't specify an application_name. It looks something like this:

vmdb_production=# select pid, application_name from pg_stat_activity;
 pid   | application_name
 ------+--------------------------------------------------
 13770 | /var/www/miq/vmdb/lib/workers/bin/evm_server.rb
 13950 | /var/www/miq/vmdb/lib/workers/bin/evm_server.rb
 13957 | /var/www/miq/vmdb/lib/workers/bin/evm_server.rb
 13969 | /var/www/miq/vmdb/lib/workers/bin/evm_server.rb
 13975 | /var/www/miq/vmdb/lib/workers/bin/evm_server.rb
 13984 | /var/www/miq/vmdb/lib/workers/bin/evm_server.rb
The much debated format of the application_name now looks like this:

MIQ <worker pid> <worker class>[<worker id>], s[<server id>], <zone name>[<zone id>]
OR:
MIQ <server pid> Server[<server id>], <zone name>[<zone id>]

Note, the pg backend pid isn't needed in application_name since the pid column in pg_stat_activity is the backend pid. Plus, each of our processes can have more than one spid.

It now looks like this:

vmdb_development=# select pid, application_name from pg_stat_activity;
 pid  | application_name
------------------+------+-------------+----------------------
 5844 | MIQ 5835 Server[1r2], default[1r1]
 5888 | MIQ 5886 Generic[1r1111], s[1r2], default[1r1]
 5891 | MIQ 5889 Generic[1r1112], s[1r2], default[1r1]
 5894 | MIQ 5892 Priority[1r1113], s[1r2], default[1r1]
 5897 | MIQ 5895 Priority[1r1114], s[1r2], default[1r1]
 5900 | MIQ 5898 Schedule[1r1115], s[1r2], default[1r1]
 5928 | MIQ 5926 EventHandler[1r1116], s[1r2], default[1r1]
 5932 | MIQ 5929 Reporting[1r1117], s[1r2], default[1r1]
 5934 | MIQ 5931 Reporting[1r1118], s[1r2], default[1r1]
 5943 | MIQ 5941 Ui[1r1120], s[1r2], default[1r1]
 5940 | MIQ 5935 Websocket[1r1119], s[1r2], default[1r1]
 5946 | MIQ 5944 WebService[1r1121], s[1r2], default[1r1]
 5964 | MIQ 5935 Websocket[1r1119], s[1r2], default[1r1]
 5965 | MIQ 5941 Ui[1r1120], s[1r2], default[1r1]
 5966 | MIQ 5944 WebService[1r1121], s[1r2], default[1r1]

Comment 4 Joe Rafaniello 2017-04-26 20:42:01 UTC
#13856 and #14904 were manually backported to:
Euwe: https://github.com/ManageIQ/manageiq/issues/14909
Darga: https://github.com/ManageIQ/manageiq/issues/14911

Comment 8 Tasos Papaioannou 2017-10-09 15:35:31 UTC
Verified on 5.9.0.1.


Note You need to log in before you can comment on or make changes to this bug.