Bug 1644646 - ovirt-log-collector fails on non existing files/directories in abrt directory
Summary: ovirt-log-collector fails on non existing files/directories in abrt directory
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-log-collector
Classification: oVirt
Component: General
Version: 4.3.3
Hardware: Unspecified
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.4.2
Target Release: 4.4.3
Assignee: Douglas Schilling Landgraf
QA Contact: Guilherme Santos
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-10-31 10:15 UTC by Lucie Leistnerova
Modified: 2020-09-18 07:12 UTC (History)

Fixed In Version: ovirt-log-collector-4.4.3
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-09-18 07:12:39 UTC
oVirt Team: Integration
Embargoed:
sbonazzo: ovirt-4.4?
sbonazzo: planning_ack?
sbonazzo: devel_ack+
lleistne: testing_ack+



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github sosreport sos issues | 1717 | 0 | None | closed | sos fails on non existing files/directories in abrt directory if collection is done while abrt report is being generated | 2020-09-09 17:11:42 UTC
Github sosreport sos pull | 2069 | 0 | None | closed | [abrt] Add /var/tmp/abrt into the collection | 2020-09-09 17:11:41 UTC
oVirt gerrit | 109054 | 0 | master | ABANDONED | sos: Add abrt plugin | 2020-09-09 17:11:41 UTC
oVirt gerrit | 110030 | 0 | master | MERGED | Do not fail on 'ls -lRZ' errors | 2020-09-09 17:11:41 UTC

Description Lucie Leistnerova 2018-10-31 10:15:19 UTC
Description of problem:
Running ovirt-log-collector fails with many errors about nonexistent files in the abrt directory.
For example:

/bin/ls: cannot open directory /var/tmp/abrt/Python-2018-10-30-16:58:35-12807/sos.UensHr/sosreport-lynx10-2018-10-30-mfcvaas/var: No such file or directory


Version-Release number of selected component (if applicable):
ovirt-log-collector-4.2.7-1.el7ev.noarch

How reproducible: happened 2x


Steps to Reproduce:
1. Install abrt on the host.
2. Run ovirt-log-collector on the engine.

Actual results: it fails with many errors about nonexistent files.


Expected results: maybe it should skip the directory that abrt is processing.


Additional info:
When I look at the directory /var/tmp/abrt/Python-2018-10-30-16:58:35-12807/ on the host, there is no sos.UensHr directory. So it seems that abrt was still running when the log collector tried to collect the information.

Comment 2 Sandro Bonazzola 2019-02-18 07:54:49 UTC
Moving to 4.3.2, as this was not identified as a blocker for 4.3.1.

Comment 3 Sandro Bonazzola 2019-02-27 08:40:21 UTC
Since the error is caused by collecting the sos report while abrt is running, I'm closing this as wontfix. An easy workaround is to re-run the log collector once abrt has finished collecting its data.

Comment 6 Sandro Bonazzola 2019-07-02 08:42:28 UTC
Reported to sos upstream: https://github.com/sosreport/sos/issues/1717

Comment 7 Pavel Moravec 2019-07-02 11:17:33 UTC
To ensure the fix lands downstream, I am reopening and reassigning to RHEL/sos.

I will try running abrt reports and sosreport in a tight loop to see whether I can reproduce it. In any case, a specific reproducer, or at least more information about the timing between the start of the abrt report and the start of sosreport, would be welcome.

Comment 9 Pavel Moravec 2019-07-02 20:39:53 UTC
I failed to reproduce the problem. What I did:

1) Made abrtd generate frequent reports:

yum install abrt-addon-ccpp abrt-cli abrt-tui -y
service abrt-ccpp restart
service abrtd restart

while true; do sleep 10 & sleep 1; kill -s SIGSEGV $!; sleep 2; done


2) Called sosreport with different configs/options in a loop.


However, in every run the sosreport tarball or build directory was created.


Could you please provide reproducer steps or a machine where the problem occurs?
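
Step 2 above is described only in words. A minimal sketch of such a loop could look like the following; the helper name repeat_and_count_failures is hypothetical and is not part of sos or abrt. It runs a given command a fixed number of times and prints how many runs exited non-zero.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not part of sos or abrt):
# run a command repeatedly and print the number of runs that exited non-zero.
repeat_and_count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Output is discarded; only the exit status of each run is counted.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}
```

With the crash loop from step 1 running in another terminal, this could be invoked as, for example, repeat_and_count_failures 20 sosreport --batch. A result of 0 would match the outcome reported in this comment.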

Comment 10 Lucie Leistnerova 2019-07-04 10:22:39 UTC
I went through the log and checked exactly how abrt, sos and ovirt-log-collector work together; this is a problem in ovirt-log-collector.
It retrieves the sosreport from the host without any error and processes it as expected.
The problem occurs in this call:

2018-10-30 16:59:17::DEBUG::__main__::242::root:: calling(['/usr/bin/ssh', '-n', '-p', '22', '-i', '/etc/pki/ovirt-engine/keys/engine_id_rsa', '-oStrictHostKeyChecking=no', '-oServerAliveInterval=600', 'root@host1', '/bin/ls -lRZ /etc /var /rhev'])

This also lists the files under /var/tmp/abrt/ recursively, and sos removes its temporary directory while the listing is in progress.
I don't understand why this call is needed, so I can't say what the proper solution would be. Maybe just ignore errors from this call.
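
The "ignore errors" idea can be sketched in shell as follows. This is only an illustration under stated assumptions, not the actual ovirt-log-collector change: the function name list_tree and the ".err" file naming are hypothetical, and the -Z option from the original call is left out so the sketch does not depend on SELinux support in ls.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not ovirt-log-collector code):
# list directory trees recursively, but do not fail when ls reports errors,
# for example because a path disappeared while it was being walked.
list_tree() {
    out_file=$1
    shift
    # Keep stderr in a separate file so the error messages can be reviewed later.
    if ! ls -lR "$@" >"$out_file" 2>"$out_file.err"; then
        echo "warning: ls reported errors, see $out_file.err" >&2
    fi
    # Always report success; a partial listing is still usable.
    return 0
}
```

Called as list_tree /tmp/listing.txt /etc /var /rhev, this still writes whatever ls managed to list and only warns about paths that vanished in the meantime.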

Comment 11 Pavel Moravec 2019-07-04 12:10:54 UTC
Thanks for checking and for the root cause analysis.

Reassigning back to ovirt-log-collector; it is up to them whether to fix it or close it.

Comment 14 Sandro Bonazzola 2020-07-13 09:44:32 UTC
Moving back to MODIFIED, since ovirt-log-collector-4.4.2-2 is not included in 4.4.1.

Comment 15 Guilherme Santos 2020-08-27 09:31:42 UTC
Verified on:
ovirt-engine-4.4.2.3-0.6.el8ev.noarch
ovirt-log-collector-4.4.3-1.el8ev.noarch

Steps:
1. # yum install -y abrt abrt-addon-ccpp abrt-cli abrt-tui
2. # systemctl start abrtd
3. # systemctl start abrt-ccpp
4. On tmux: # while true; do sleep 10 & sleep 1; kill -s SIGSEGV $!; sleep 2; done
5. # ovirt-log-collector --local-tmp=/root/local_tmp

Results:
No errors; everything ran as expected.

Comment 16 Sandro Bonazzola 2020-09-18 07:12:39 UTC
This bugzilla is included in oVirt 4.4.2 release, published on September 17th 2020.

Since the problem described in this bug report should be resolved in oVirt 4.4.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.
