Bug 1546099

Summary: Drop access logs and add qpid logs
Product: Red Hat Satellite Reporter: Pavel Moravec <pmoravec>
Component: Foreman Debug Assignee: Chris Roberts <chrobert>
Status: CLOSED ERRATA QA Contact: Corey Welton <cwelton>
Severity: medium Docs Contact:
Priority: medium    
Version: 6.2.13 CC: bmbouter, chrobert, egolov, jentrena, ktordeur
Target Milestone: 6.4.0 Keywords: Triaged
Target Release: Unused   
Hardware: x86_64   
OS: Linux   
URL: https://projects.theforeman.org/issues/24312
Whiteboard:
Fixed In Version: katello-3.7.0-4.rc2 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-10-16 19:05:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description                              Flags
hammer ping command output               none
katello-service status command output    none

Description Pavel Moravec 2018-02-16 10:17:38 UTC
Description of problem:
I suggest the following changes to what foreman-debug (f-d) should and should not collect:

It should additionally collect:

- /var/log/httpd/katello-reverse-proxy_access_ssl.log*
- /var/log/httpd/katello-reverse-proxy_error_ssl.log*
    - both are the Capsule equivalents of foreman-ssl_*_ssl.log

- /var/log/foreman/dynflow_executor.output*
    - this log file has already shown a useful error several times (such as an unexpected "No executor available")

- /var/log/httpd/puppet_error_ssl.log*
    - (optional) it might show some Puppet-related problems, but it has been used very rarely; no problem if this isn't collected

- commands "katello-service status" and "hammer ping"
  - both can reveal basic issues with Satellite health

 
foreman-debug should no longer collect the command output stored in these files:
qpid_stat_queues
qpid_stat_subscriptions
qpid-stat-resource_manager
qpid-rpm-qa

The first two are duplicates of qpid-stat-c and qpid-stat-u.

The detailed resource_manager stats were added to deal with deadlocks in pulp/celery/kombu/python-qpid; I am not sure whether it still makes sense to collect them (bmbouter might know more).

qpid-rpm-qa is a subset of "installed_package", hence redundant as is
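
The proposed additions could be sketched roughly as below. This is only an illustration: collect_files and collect_cmd are hypothetical helpers written for this example, not foreman-debug's own functions, and the output file names for the two commands are assumptions. Skipping empty files is also a choice made for this sketch.

```shell
#!/bin/bash
# Hypothetical helpers for illustration; these are not foreman-debug's own functions.

# Copy each existing, non-empty file under DEST, keeping its original path.
collect_files() {
    local dest=$1 f
    shift
    for f in "$@"; do
        [ -s "$f" ] || continue   # also skips a glob that matched nothing
        mkdir -p "$dest/$(dirname "$f")"
        cp -p "$f" "$dest/$f"
    done
}

# Run a command and store its stdout and stderr in DEST/NAME, even if it fails.
collect_cmd() {
    local dest=$1 name=$2
    shift 2
    "$@" > "$dest/$name" 2>&1 || true
}

# Example wiring (output names are assumed; run on a Satellite or Capsule):
#   dest=$(mktemp -d)
#   collect_files "$dest" /var/log/httpd/katello-reverse-proxy_access_ssl.log* \
#                         /var/log/httpd/katello-reverse-proxy_error_ssl.log* \
#                         /var/log/foreman/dynflow_executor.output* \
#                         /var/log/httpd/puppet_error_ssl.log*
#   collect_cmd "$dest" katello_service_status katello-service status
#   collect_cmd "$dest" hammer_ping hammer ping
```

The globs are left unquoted at the call site so the shell expands them; a pattern with no match is passed through literally and then dropped by the -s test.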


Version-Release number of selected component (if applicable):
6.2.14


How reproducible:
100%


Steps to Reproduce:
1. Collect foreman-debug
2. Check whether the archive contains the files and command output listed above.
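
Step 2 can be scripted against an extracted archive. The following is a minimal sketch, assuming the archive has already been unpacked into a directory; check_debug_dir is a hypothetical name, and the patterns are the ones listed in the description. The output of "katello-service status" and "hammer ping" is not checked here because the file names used for it are not stated in this report.

```shell
#!/bin/bash
# Hypothetical checker for an extracted foreman-debug directory.
# Prints one line per pattern and returns 1 if any unwanted file is found.
check_debug_dir() {
    local dir=$1 pattern status=0

    # Files the description asks to be collected.
    for pattern in 'katello-reverse-proxy_access_ssl.log*' \
                   'katello-reverse-proxy_error_ssl.log*' \
                   'dynflow_executor.output*' \
                   'puppet_error_ssl.log*'; do
        # The pattern is quoted so that find, not the shell, expands it.
        if find "$dir" -name "$pattern" | grep -q .; then
            echo "present: $pattern"
        else
            echo "absent: $pattern"
        fi
    done

    # Command output the description asks to be dropped.
    for pattern in qpid_stat_queues qpid_stat_subscriptions \
                   qpid-stat-resource_manager qpid-rpm-qa; do
        if find "$dir" -name "$pattern" | grep -q .; then
            echo "unwanted: $pattern"
            status=1
        fi
    done
    return "$status"
}
```

Usage would be something like: check_debug_dir /var/tmp/foreman-debug-XXXXX (directory name is a placeholder). An "absent" line is only reported, not treated as a failure, since some of these logs exist only on Capsules.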


Actual results:
see Description


Expected results:
see Description


Additional info:

Comment 1 Pavel Moravec 2018-02-16 10:21:33 UTC
> The detailed resource_manager stats were added to deal with deadlocks in
> pulp/celery/kombu/python-qpid; I am not sure whether it still makes sense to
> collect them (bmbouter might know more).

Brian,
do you see value in collecting

qpid-stat -q resource_manager

with detailed stats about the queue? I see it as borderline: it can sometimes help with deadlock-like issues, but are such situations too rare to justify collecting it every time?

Comment 2 Lukas Zapletal 2018-02-16 13:15:30 UTC
Backlog only; if you need a backport into 6.3, let's fix it upstream and talk about this later.

Comment 6 Brad Buckingham 2018-02-20 18:11:07 UTC
Based on discussion in triage, CEE and PM would like to see the access logs remain in foreman-debug. The concern is that we might remove them and later find out that they are needed to debug an issue.

Comment 7 Chris Roberts 2018-03-21 20:28:51 UTC
*** Bug 1432977 has been marked as a duplicate of this bug. ***

Comment 8 Chris Brown 2018-07-12 13:53:27 UTC
Failed QA for Satellite 6.4. foreman-debug is still missing the following:

- katello-reverse-proxy_error_ssl.log*
- katello-reverse-proxy_access_ssl.log*
- puppet_error_ssl.log*

and still contains the following:
- qpid-stat-resource_manager


Output:
[root@ibm-x3550m3-10 foreman-debug-2cQxq]# find . -name qpid_stat_queues
[root@ibm-x3550m3-10 foreman-debug-2cQxq]# find . -name qpid_stat_subscriptions
[root@ibm-x3550m3-10 foreman-debug-2cQxq]# find . -name qpid-stat-resource_manager
./qpid-stat-resource_manager
[root@ibm-x3550m3-10 foreman-debug-2cQxq]# find . -name qpid-rpm-qa
[root@ibm-x3550m3-10 foreman-debug-2cQxq]# find . -name katello-reverse-proxy_error_ssl.log*
[root@ibm-x3550m3-10 foreman-debug-2cQxq]# find . -name katello-reverse-proxy_access_ssl.log*
[root@ibm-x3550m3-10 foreman-debug-2cQxq]# find . -name dynflow_executor.output*
./var/log/foreman/dynflow_executor.output
[root@ibm-x3550m3-10 foreman-debug-2cQxq]# find . -name puppet_error_ssl.log*

Comment 9 Chris Brown 2018-07-12 13:55:35 UTC
Created attachment 1458444 [details]
hammer ping command output

Comment 10 Chris Brown 2018-07-12 13:56:26 UTC
Created attachment 1458445 [details]
katello-service status command output

Comment 11 Chris Roberts 2018-07-19 15:39:59 UTC
These logs are only collected if they contain data; this prevents gathering empty files.

[root@capsule1 httpd]# pwd
/var/tmp/foreman-debug-zC8vW/var/log/httpd
[root@capsule1 httpd]# ll
total 492
-rw-r--r--. 1 root root   2409 Jul 15 03:23 error_log
-rw-r--r--. 1 root root  17168 Jun 24 03:42 error_log-20180624
-rw-r--r--. 1 root root  27504 Jul  1 03:33 error_log-20180701
-rw-r--r--. 1 root root   7976 Jul  8 03:32 error_log-20180708
-rw-r--r--. 1 root root   7976 Jul 15 03:23 error_log-20180715
-rw-r--r--. 1 root root   1442 Jul 19 11:23 katello-reverse-proxy_access_ssl.log
-rw-r--r--. 1 root root      7 Jul 19 11:24 katello-reverse-proxy_error_ssl.log
-rw-r--r--. 1 root root  81149 Jul 19 11:31 pulp-https_access_ssl.log
-rw-r--r--. 1 root root 304433 Jun 22 03:46 pulp-https_access_ssl.log-20180624
-rw-r--r--. 1 root root    620 Jun 29 10:48 pulp-https_access_ssl.log-20180701
-rw-r--r--. 1 root root    310 Jul  6 03:08 pulp-https_access_ssl.log-20180708
-rw-r--r--. 1 root root    310 Jul 13 03:26 pulp-https_access_ssl.log-20180715
-rw-r--r--. 1 root root    685 Jul 19 11:07 pulp-https_error_ssl.log
-rw-r--r--. 1 root root   2787 Jun 22 03:46 pulp-https_error_ssl.log-20180624
-rw-r--r--. 1 root root    932 Jul  1 03:33 pulp-https_error_ssl.log-20180701
-rw-r--r--. 1 root root    276 Jul  6 03:08 pulp-https_error_ssl.log-20180708
-rw-r--r--. 1 root root    469 Jul 15 03:23 pulp-https_error_ssl.log-20180715
-rw-r--r--. 1 root root      7 Jul 19 11:31 puppet_error_ssl.log

Added a new PR to remove the qpid resource manager command.
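
When a log is absent from the archive, it can help to check on the source host whether the file is simply empty. A minimal sketch follows; classify_logs is a hypothetical helper for this example and is not part of foreman-debug.

```shell
#!/bin/bash
# Hypothetical helper: classify each candidate log path so that a file
# absent from the archive can be explained.
classify_logs() {
    local f
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "not on host: $f"     # includes a glob that matched nothing
        elif [ -s "$f" ]; then
            echo "non-empty: $f"       # -s: exists and size is greater than zero
        else
            echo "empty: $f"
        fi
    done
}
```

For example: classify_logs /var/log/httpd/katello-reverse-proxy_*_ssl.log* /var/log/httpd/puppet_error_ssl.log*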

Comment 12 Chris Brown 2018-08-24 19:42:20 UTC
Verified in Satellite 6.4

# ll var/log/httpd/
total 4268
-rw-r--r--. 1 root root    9470 Aug 24 16:53 error_log
-rw-r--r--. 1 root root 4355082 Aug 24 21:34 foreman-ssl_access_ssl.log

Comment 14 Bryan Kearney 2018-10-16 19:05:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2927