Description of problem:
Running ovirt-log-collector fails with many errors about non-existent files in the abrt directory, e.g.:
/bin/ls: cannot open directory /var/tmp/abrt/Python-2018-10-30-16:58:35-12807/sos.UensHr/sosreport-lynx10-2018-10-30-mfcvaas/var: No such file or directory
Version-Release number of selected component (if applicable):
ovirt-log-collector-4.2.7-1.el7ev.noarch
How reproducible:
Happened twice
Steps to Reproduce:
1. Install abrt on the host
2. Run ovirt-log-collector on the engine
Actual results:
It fails with many errors about non-existent files
Expected results:
Maybe it should skip the abrt processing directory
Additional info:
When I look at the directory /var/tmp/abrt/Python-2018-10-30-16:58:35-12807/ on the host, there is no sos.UensHr directory. So it seems to me that abrt was still running when the log collector tried to collect information.
Moving to 4.3.2, as this was not identified as a blocker for 4.3.1.
Since the error is caused by collecting an sos report while abrt is running, I'm closing this as WONTFIX. An easy workaround is to re-run the log collector once abrt has finished collecting its data.
Reported to sos upstream: https://github.com/sosreport/sos/issues/1717
To ensure the fix lands downstream, I am reopening and reassigning to RHEL/sos. I will try running abrt reports and sosreport in a tight loop to see whether I can reproduce it. In any case, a specific reproducer, or at least more information about the timing between the start of the abrt report and the sosreport run, would be welcome.
I failed to reproduce the problem. What I did:
1) Made abrtd generate frequent reports:
   yum install abrt-addon-ccpp abrt-cli abrt-tui -y
   service abrt-ccpp restart
   service abrtd restart
   while true; do sleep 10 & sleep 1; kill -s SIGSEGV $!; sleep 2; done
2) Called sosreport with different configs/options in a loop.
However, all sosreport tarballs and build directories were created. Could you please provide reproducer steps or a machine?
I went through the log and checked exactly how abrt, sos and ovirt-log-collector work, and this is a problem in ovirt-log-collector. It gets the sosreport from the host properly, without any error, and processes it as expected. The problem occurs in this call:
2018-10-30 16:59:17::DEBUG::__main__::242::root:: calling(['/usr/bin/ssh', '-n', '-p', '22', '-i', '/etc/pki/ovirt-engine/keys/engine_id_rsa', '-oStrictHostKeyChecking=no', '-oServerAliveInterval=600', 'root@host1', '/bin/ls -lRZ /etc /var /rhev'])
This lists files recursively, including those under /var/tmp/abrt/, and sos removes its temporary directory in the meantime. I don't understand why this call is needed, so I can't say what the proper solution would be. Maybe just ignore errors from this call.
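To illustrate the "ignore errors from this call" idea, here is a minimal shell sketch. It is not the actual ovirt-log-collector code or its eventual fix; list_tree is a hypothetical helper name. It assumes GNU coreutils ls, which exits with status 1 for minor problems (such as a subdirectory that cannot be accessed) and 2 for serious trouble. The -Z option from the original call is left out here for brevity.

```shell
#!/bin/sh
# Hypothetical helper, not taken from ovirt-log-collector.
# Recursively lists the given trees and tolerates entries that
# disappear while the walk is in progress.
list_tree() {
    # Discard stderr so "No such file or directory" messages for
    # directories removed mid-walk do not reach the caller.
    ls -lR "$@" 2>/dev/null
    rc=$?
    # Assumption: GNU ls semantics. Status 1 means minor problems
    # (e.g. a subdirectory could not be accessed); status 2 means
    # serious trouble. Treat 0 and 1 as success.
    if [ "$rc" -le 1 ]; then
        return 0
    fi
    return "$rc"
}
```

On the host this would be called as `list_tree /etc /var /rhev`. The trade-off is that genuine permission errors inside the tree are also silenced, so a real fix might prefer to log them rather than drop them.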
Thanks for checking and for the root cause analysis. Reassigning back to ovirt-log-collector; it is up to them whether to fix it or close it.
Moving back to MODIFIED since ovirt-log-collector-4.4.2-2 is not included in 4.4.1.
Verified on:
ovirt-engine-4.4.2.3-0.6.el8ev.noarch
ovirt-log-collector-4.4.3-1.el8ev.noarch
Steps:
1. # yum install -y abrt abrt-addon-ccpp abrt-cli abrt-tui
2. # systemctl start abrtd
3. # systemctl start abrt-ccpp
4. On tmux: # while true; do sleep 10 & sleep 1; kill -s SIGSEGV $!; sleep 2; done
5. # ovirt-log-collector --local-tmp=/root/local_tmp
Results:
No error, everything ran as expected
This bugzilla is included in oVirt 4.4.2 release, published on September 17th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.2 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.