Bug 1646745

Summary: Ansible processes might get killed when logrotate runs for smart_proxy_dynflow_core
Product: Red Hat Satellite Reporter: Mike McCune <mmccune>
Component: Remote Execution Assignee: satellite6-bugs <satellite6-bugs>
Status: CLOSED ERRATA QA Contact: Lukas Pramuk <lpramuk>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 6.4 CC: aruzicka, inecas, lpramuk, mmccune, mvanderw, ofalk, pcreech, sbadhwar, sghai, zhunting
Target Milestone: 6.4.1 Keywords: Triaged
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: tfm-rubygem-smart_proxy_dynflow_core-0.2.1-4 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1628505 Environment:
Last Closed: 2018-12-06 22:32:53 UTC Type: ---
Bug Blocks: 1628145    

Comment 1 Lukas Pramuk 2018-11-27 00:30:41 UTC
FailedQA.

@satellite-6.4.1-1.el7sat.noarch
tfm-rubygem-smart_proxy_dynflow_core-0.2.1-2.el7sat.noarch

# logrotate -f /etc/logrotate.conf -v
...
rotating pattern: /var/log/foreman-proxy/smart_proxy_dynflow_core.log  forced from command line (5 rotations)
empty log files are not rotated, old logs are removed
considering log /var/log/foreman-proxy/smart_proxy_dynflow_core.log
  log needs rotating
rotating log /var/log/foreman-proxy/smart_proxy_dynflow_core.log, log->rotateCount is 5
dateext suffix '-20181126'
glob pattern '-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
glob finding old rotated logs failed
fscreate context set to system_u:object_r:var_log_t:s0
renaming /var/log/foreman-proxy/smart_proxy_dynflow_core.log to /var/log/foreman-proxy/smart_proxy_dynflow_core.log-20181126
creating new /var/log/foreman-proxy/smart_proxy_dynflow_core.log mode = 0644 uid = 991 gid = 988
running postrotate script
error: error running shared postrotate script for '/var/log/foreman-proxy/smart_proxy_dynflow_core.log '
...

>>> the pkill pattern is too broad: it also kills the parent (logrotate-controlled) process
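
A minimal sketch of how an overly broad pattern can hit the invoking shell, assuming the postrotate script uses `pkill -f`, which matches an extended regular expression against each process's full command line. The report does not show the actual postrotate script, so both command lines and both patterns below are hypothetical, and `cmdline_matches` is an illustrative helper that only mimics the matching rule with `grep -E`.

```shell
#!/bin/sh
# Sketch only: mimic the `pkill -f` matching rule (ERE against the full
# command line) without touching any real process.

# Return 0 if PATTERN ($1) matches CMDLINE ($2).
cmdline_matches() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

# Hypothetical command line of the daemon that should receive the signal.
daemon_cmd='ruby /usr/bin/smart_proxy_dynflow_core -d'
# Hypothetical command line of the shell that logrotate spawns for the
# postrotate script; the script text itself is part of it.
wrapper_cmd='sh -c /usr/bin/pkill -f smart_proxy_dynflow_core'

# Hypothetical patterns: an unanchored one and an anchored one.
broad='smart_proxy_dynflow_core'
narrow='^ruby /usr/bin/smart_proxy_dynflow_core'

for pattern in "$broad" "$narrow"; do
    for cmd in "$daemon_cmd" "$wrapper_cmd"; do
        if cmdline_matches "$pattern" "$cmd"; then
            echo "pattern '$pattern' matches: $cmd"
        fi
    done
done
```

Under these assumptions the unanchored pattern matches both command lines, while the anchored one matches only the daemon. On a real system, `pgrep -f -a PATTERN` lists what a given pattern would hit before any signal is sent.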

The issue that failed QA for 6.4.0 persists. Did we miss the package version bump in the compose?
https://bugzilla.redhat.com/show_bug.cgi?id=1628505#c16

In 6.5.0 it is fixed:
https://bugzilla.redhat.com/show_bug.cgi?id=1628505#c22

Comment 2 Lukas Pramuk 2018-12-02 22:19:29 UTC
VERIFIED.

@satellite-6.4.1-1.el7sat.noarch
tfm-rubygem-smart_proxy_dynflow_core-0.2.1-4.el7sat.noarch

using the following manual reproducer:

0) Remove the offending file that prevents logrotate from rotating the relevant log
# rm -rf /var/log/foreman-proxy/smart_proxy_dynflow_core.log-`date +%Y%m%d`.gz

1) Schedule the Ansible command "sleep 1000" against a host and watch the process tree
# watch "ps -efH | grep ^foreman+"

2) Run logrotate
# logrotate -f /etc/logrotate.conf -v
...

rotating pattern: /var/log/foreman-proxy/smart_proxy_dynflow_core.log  forced from command line (5 rotations)
empty log files are not rotated, old logs are removed
considering log /var/log/foreman-proxy/smart_proxy_dynflow_core.log
  log needs rotating
rotating log /var/log/foreman-proxy/smart_proxy_dynflow_core.log, log->rotateCount is 5
dateext suffix '-20181202'
glob pattern '-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
glob finding old rotated logs failed
fscreate context set to system_u:object_r:var_log_t:s0
renaming /var/log/foreman-proxy/smart_proxy_dynflow_core.log to /var/log/foreman-proxy/smart_proxy_dynflow_core.log-20181202
creating new /var/log/foreman-proxy/smart_proxy_dynflow_core.log mode = 0644 uid = 991 gid = 988
running postrotate script
compressing log with: /bin/gzip
set default create context to system_u:object_r:var_log_t:s0
...

>>> relevant log rotated successfully (no errors)
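
The pass/fail check on the logrotate output could be scripted roughly as follows. This is a sketch that assumes the failure always surfaces as a line starting with `error: error running`, as in the FailedQA transcript in comment 1; `rotation_failed` is a hypothetical helper name, not part of logrotate.

```shell
#!/bin/sh
# Sketch only: read logrotate's verbose output on stdin and return 0 if it
# reports a failed pre/postrotate script.
rotation_failed() {
    grep -q '^error: error running'
}

# Hypothetical usage on a test system (not executed here):
#   logrotate -f /etc/logrotate.conf -v 2>&1 | rotation_failed \
#       && echo "postrotate script failed"

# Demonstration with lines copied from the two transcripts in this report.
failed_run="running postrotate script
error: error running shared postrotate script for '/var/log/foreman-proxy/smart_proxy_dynflow_core.log '"
fixed_run="running postrotate script
compressing log with: /bin/gzip"

printf '%s\n' "$failed_run" | rotation_failed && echo "0.2.1-2 transcript: failure detected"
printf '%s\n' "$fixed_run" | rotation_failed || echo "0.2.1-4 transcript: no failure"
```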

3) After logrotate, the jobs are still present in the ps tree and the REX job succeeded after 1000 seconds:

PLAY [all] *********************************************************************

TASK [Gathering Facts] *********************************************************

ok: [host1.example.com]

TASK [shell] *******************************************************************

changed: [host1.example.com]

TASK [debug] *******************************************************************

ok: [host1.example.com] => {
    "out": {
        "changed": true, 
        "cmd": "sleep 1000", 
        "delta": "0:16:40.006465", 
        "end": "2018-12-02 17:12:54.224481", 
        "failed": false, 
        "rc": 0, 
        "start": "2018-12-02 16:56:14.218016", 
        "stderr": "", 
        "stderr_lines": [], 
        "stdout": "", 
        "stdout_lines": []
    }
}

PLAY RECAP *********************************************************************
host1.example.com : ok=3    changed=1    unreachable=0    failed=0   


Exit status: 0
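
As a quick cross-check, the `delta` value in the output above corresponds to the full `sleep 1000` run. The sketch below converts it to whole seconds; it assumes the `H:MM:SS.ffffff` form shown here (durations of a day or more are not handled), and `delta_to_seconds` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: convert an Ansible "delta" of the form H:MM:SS.ffffff to
# whole seconds, dropping the fractional part.
delta_to_seconds() {
    # Split on ':' and '.', so the fields are hours, minutes, seconds, fraction.
    printf '%s\n' "$1" | awk -F'[:.]' '{ print $1 * 3600 + $2 * 60 + $3 }'
}

delta_to_seconds "0:16:40.006465"   # prints 1000
```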

Comment 4 errata-xmlrpc 2018-12-06 22:32:53 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3799