Bug 1628505
Summary: | Ansible processes might get killed when logrotate runs for smart_proxy_dynflow_core | |||
---|---|---|---|---|
Product: | Red Hat Satellite | Reporter: | Adam Ruzicka <aruzicka> | |
Component: | Remote Execution | Assignee: | Adam Ruzicka <aruzicka> | |
Status: | CLOSED ERRATA | QA Contact: | Lukas Pramuk <lpramuk> | |
Severity: | medium | Docs Contact: | ||
Priority: | unspecified | |||
Version: | 6.4 | CC: | aruzicka, egolov, inecas, ktordeur, mmccune, mvanderw, ofalk, pcreech, sghai | |
Target Milestone: | 6.5.0 | Keywords: | Triaged | |
Target Release: | Unused | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | tfm-rubygem-smart_proxy_dynflow_core-0.2.0-4 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1646745 (view as bug list) | Environment: | ||
Last Closed: | 2019-05-14 12:37:59 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1628145 |
Description
Adam Ruzicka
2018-09-13 09:29:08 UTC
Created redmine issue http://projects.theforeman.org/issues/24927 from this bug Upstream bug assigned to aruzicka Upstream bug assigned to aruzicka *** Bug 1628145 has been marked as a duplicate of this bug. *** snap 23, not 63 FailedQA. @satellite-6.4.0-15.el7sat.noarch tfm-rubygem-smart_proxy_dynflow_core-0.2.1-1.el7sat.noarch using manual reproducer described in comment #0: 0) Remove offending file preventing to logrotate relevant logs # rm -rf /var/log/foreman-proxy/smart_proxy_dynflow_core.log-20180926.gz 1) Schedule Ansible Command "sleep 1000" against a host and watch ps tree # watch "ps -efH | grep ^foreman+" 2) Run logrotate -f /etc/logrotate.conf -v ... rotating log /var/log/foreman-proxy/smart_proxy_dynflow_core.log, log->rotateCount is 5 dateext suffix '-20180926' glob pattern '-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]' fscreate context set to system_u:object_r:var_log_t:s0 renaming /var/log/foreman-proxy/smart_proxy_dynflow_core.log to /var/log/foreman-proxy/smart_proxy_dynflow_core.log-20180926 ... >>> relevant log rotated 3) upon logrotate ps tree jobs disappeared and REX job failed with IOError: --- 1: PLAY [all] ********************************************************************* 2: 3: TASK [Gathering Facts] ********************************************************* 4: ok: [host1.example.com] 5: 6: TASK [shell] ******************************************************************* 7: Runner error: IOError - closed stream 8: Exit status: EXCEPTION --- Upon logrotate ps tree jobs do not disappear and REX job succeeds. However, logrotate now reports error: --- rotating log /var/log/foreman-proxy/smart_proxy_dynflow_core.log, log->rotateCount is 5 dateext suffix '-20181003' glob pattern '-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]' glob finding old rotated logs failed fscreate context set to system_u:object_r:var_log_t:s0 renaming /var/log/foreman-proxy/smart_proxy_dynflow_core.log to /var/log/foreman-proxy/smart_proxy_dynflow_core.log-20181003 creating new /var/log/foreman-proxy/smart_proxy_dynflow_core.log mode = 0644 uid = 991 gid = 988 running postrotate script error: error running shared postrotate script for '/var/log/foreman-proxy/smart_proxy_dynflow_core.log ' --- So I'd like to failqa this again what do you mean? Lets consider using "pkill --signal SIGUSR1 --full ^/usr/bin/smart_proxy_dynflow_core" Killing ^/usr/bin/smart_proxy_dynflow_core ( instead of /usr/bin/smart_proxy_dynflow_core ) would avoid killing parent process having the same string in commandline: sh -c "pkill --signal SIGUSR1 --full /usr/bin/smart_proxy_dynflow_core ..." FailedQA. @satellite-6.4.0-15.el7sat.noarch tfm-rubygem-smart_proxy_dynflow_core-0.2.1-2.el7sat.noarch pkill command is too wide as it is killing also parent (logrotate controlled) proccess VERIFIED. @satellite-6.5.0-5.beta.el7sat.noarch tfm-rubygem-smart_proxy_dynflow_core-0.2.1-3.el7sat.noarch using manual reproducer described in comment #0: 0) Remove offending file preventing to logrotate relevant logs # rm -rf /var/log/foreman-proxy/smart_proxy_dynflow_core.log-`date +%Y%m%d`.gz 1) Schedule Ansible Command "sleep 1000" against a host and watch ps tree # watch "ps -efH | grep ^foreman+" 2) Run logrotate -f /etc/logrotate.conf -v ... rotating pattern: /var/log/foreman-proxy/smart_proxy_dynflow_core.log forced from command line (5 rotations) empty log files are not rotated, old logs are removed considering log /var/log/foreman-proxy/smart_proxy_dynflow_core.log log needs rotating rotating log /var/log/foreman-proxy/smart_proxy_dynflow_core.log, log->rotateCount is 5 dateext suffix '-20181126' glob pattern '-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]' glob finding old rotated logs failed fscreate context set to system_u:object_r:var_log_t:s0 renaming /var/log/foreman-proxy/smart_proxy_dynflow_core.log to /var/log/foreman-proxy/smart_proxy_dynflow_core.log-20181126 creating new /var/log/foreman-proxy/smart_proxy_dynflow_core.log mode = 0644 uid = 993 gid = 991 running postrotate script compressing log with: /bin/gzip set default create context to system_u:object_r:var_log_t:s0 ... >>> relevant log rotated successfully (no errors) 3) after logrotate ps tree jobs still present REX job succeded after 1000 secs: 1: 2:PLAY [all] ********************************************************************* 3: 4:TASK [Gathering Facts] ********************************************************* 5: 6:ok: [host1.example.com] 7: 8:TASK [command] ***************************************************************** 9: 10:changed: [host1.example.com] 11: 12:TASK [debug] ******************************************************************* 13:ok: [host1.example.com] => { 14: "out": { 15: "changed": true, 16: "cmd": "sleep 1000", 17: "delta": "0:16:40.005265", 18: "end": "2018-11-26 18:57:50.891947", 19: "failed": false, 20: "rc": 0, 21: "start": "2018-11-26 18:41:10.886682", 22: "stderr": "", 23: "stderr_lines": [], 24: "stdout": "", 25: "stdout_lines": [] 26: } 27:} 28: 29:PLAY RECAP ********************************************************************* 30:host1.example.com : ok=3 changed=1 unreachable=0 failed=0 31: 32: 33:Exit status: 0 Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2019:1222 |