Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1548087

Summary: failed to collect logs with No such file or directory error
Product: [oVirt] ovirt-log-collector
Component: General
Version: 4.3.0
Hardware: x86_64
OS: Linux
Status: CLOSED WORKSFORME
Severity: medium
Priority: unspecified
Target Milestone: ---
Target Release: ---
Reporter: Dafna Ron <dron>
Assignee: Douglas Schilling Landgraf <dougsland>
QA Contact: Pavel Stehlik <pstehlik>
Docs Contact:
CC: bugs, sbonazzo
Flags: dron: planning_ack?, dron: devel_ack?, dron: testing_ack?
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-03-09 15:25:18 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Integration
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Dafna Ron 2018-02-22 16:38:31 UTC
We failed a test in OST: 003_00_metrics_bootstrap.metrics_and_log_collector.

The failure reason was "No such file or directory".

I don't know yet whether it reproduces 100% of the time or only randomly, but I will post further details as we have them.


Link to Job:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5829/

Link to all logs:
http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5829/artifacts



/var/tmp:
drwxr-x--x. root abrt system_u:object_r:abrt_var_cache_t:s0 abrt
-rw-------. root root unconfined_u:object_r:user_tmp_t:s0 rpm-tmp.aLitM7
-rw-------. root root unconfined_u:object_r:user_tmp_t:s0 rpm-tmp.G2r7IM
-rw-------. root root unconfined_u:object_r:user_tmp_t:s0 rpm-tmp.kVymZE
-rw-------. root root unconfined_u:object_r:user_tmp_t:s0 rpm-tmp.uPDvvU
drwx------. root root system_u:object_r:tmp_t:s0       systemd-private-cd49c74726d5463f8d6f6502380e5e12-chronyd.service-i1T5IE
drwx------. root root system_u:object_r:tmp_t:s0       systemd-private-cd49c74726d5463f8d6f6502380e5e12-systemd-timedated.service-lhoUsS

/var/tmp/abrt:
-rw-------. root root system_u:object_r:abrt_var_cache_t:s0 last-via-server

/var/tmp/systemd-private-cd49c74726d5463f8d6f6502380e5e12-chronyd.service-i1T5IE:
drwxrwxrwt. root root system_u:object_r:tmp_t:s0       tmp

/var/tmp/systemd-private-cd49c74726d5463f8d6f6502380e5e12-chronyd.service-i1T5IE/tmp:

/var/tmp/systemd-private-cd49c74726d5463f8d6f6502380e5e12-systemd-timedated.service-lhoUsS:
drwxrwxrwt. root root system_u:object_r:tmp_t:s0       tmp

/var/tmp/systemd-private-cd49c74726d5463f8d6f6502380e5e12-systemd-timedated.service-lhoUsS/tmp:

/var/yp:
)
2018-02-22 07:24:05::DEBUG::__main__::251::root:: STDERR(/bin/ls: cannot open directory /rhev/data-center/mnt/blockSD/6babba93-09c8-4846-9ccb-07728f72eecb/master/tasks/bd563276-5092-4d28-86c4-63aa6c0b4344.temp: No such file or directory
)
2018-02-22 07:24:05::ERROR::__main__::832::root:: Failed to collect logs from: lago-basic-suite-master-host-0; /bin/ls: cannot open directory /rhev/data-center/mnt/blockSD/6babba93-09c8-4846-9ccb-07728f72eecb/master/tasks/bd563276-5092-4d28-86c4-63aa6c0b4344.temp: No such file or directory
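The error above suggests the collector's bare `ls` races with SPM task cleanup: the `.temp` task directory exists when the listing begins but is removed before it can be opened. A minimal sketch of tolerating that transient state (the `list_tolerant` helper and the paths are illustrative, not the collector's actual code):

```shell
# Guard sketch: list a directory but tolerate it vanishing (ENOENT),
# as transient SPM task .temp dirs do while collection is running.
list_tolerant() {
    if ls "$1" 2>/dev/null; then
        echo "listed: $1"
    else
        echo "skipped transient dir: $1"
    fi
}

# Simulate the race: the task dir is removed before the listing runs.
base="/tmp/bz1548087.$$"                 # illustrative scratch path
mkdir -p "$base/tasks/bd563276.temp"
rm -rf "$base"                           # task finished; dir is gone
list_tolerant "$base/tasks/bd563276.temp"
```

With a guard like this, a finishing task between collection steps degrades to a skipped entry instead of failing the whole host's log collection.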

Comment 1 Douglas Schilling Landgraf 2018-02-23 00:00:29 UTC
Thanks Dafna, a reproducer would be appreciated. I see you mentioned some 'suspected patches' on the mailing list [1]. Why do you suspect those changes? Can you share how to set up such an environment on a local test machine?

[1] https://www.mail-archive.com/infra@ovirt.org/msg32099.html

Sandro, any ideas?

Comment 2 Dafna Ron 2018-02-23 13:22:02 UTC
As Yaniv mentioned on the list, it is probably a race.
The only thing I can think of that would reproduce it is to run OST locally several times and see if it happens randomly. Maybe add a short sleep in the code on create storage domain, for example, to try to delay the previous tests from finishing?

To run OST locally, you need to clone the ovirt-system-tests project and install lago.
If you run ./run_suite <suite name>, it will run the tests locally.

In the mailing list I reported the patch that failed the OST test.
The way the automation works is that it tests a bunch of changes and zeroes in on a single change that may be causing the issue. It does not, however, mean that change was at fault.

If you want to create the environment without it being deleted, you will need to install lago locally and run OST locally.
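One cheap way to widen the suspected race window along the lines suggested above is to wrap the storage-domain test step with a short delay. A sketch (the decorator, its name, and the placeholder step are mine, not OST code):

```python
import time
from functools import wraps

def widen_race(seconds=5):
    """Decorator sketch: sleep after a test step so earlier tests'
    cleanup (e.g. SPM task .temp dirs) overlaps the next step."""
    def deco(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)
            time.sleep(seconds)  # hold here to let the race surface
            return result
        return wrapper
    return deco

@widen_race(seconds=0)  # 0 for demo; try 5-30s when hunting the race
def create_master_storage_domain():
    # placeholder for the real OST storage-domain test step
    return "domain-created"
```

Applying this to the create-storage-domain step and re-running the suite a few times should make an intermittent cleanup race far more likely to fire.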

Comment 3 Douglas Schilling Landgraf 2018-02-23 22:09:07 UTC
(In reply to Dafna Ron from comment #2)
> As Yaniv mentioned on the list, it is probably a race. 
> the only thing I can think of that would reproduce it is to run OST locally
> several times and see if it happens randomly. maybe create a short sleep in
> the code on create storage domain for example to try and delay the previous
> tests from finishing? 

Thanks for the information. Yes, that would help, especially if we can't reproduce it.

> 
> to run ost locally you need to clone ovirt-system-tests project and install
> lago. 
> if you run ./run_suite <suite name> it would run the tests locally. 
> 
> In the mailing list I reported the patch  that failed the OST test. 
> The way the automation works is that it tests a bunch of changes and zero's
> in to a single change that may be causing the issue. it does not however
> mean the change was at fault. 
> 
> if you want to create the enviornment without it being deleted you will need
> to install lago locally and run ost locally.


$ rpm -qa | grep lago
python-lago-ovirt-0.6.0-1.fc27.noarch
python-lago-0.6.0-1.fc27.noarch
lago-ovirt-0.6.0-1.fc27.noarch
lago-0.6.0-1.fc27.noarch


<clone ovirt-system-tests>
$ ./run_suite.sh basic-suite-4.2
<snip>
+ lago init /home/douglas/ovirt-system-tests/deployment-basic-suite-4.2 /home/douglas/ovirt-system-tests/basic-suite-4.2/LagoInitFile --template-repo-path /home/douglas/ovirt-system-tests/basic-suite-4.2/template-repo.json
./run_suite.sh: line 84: lago: command not found

Should it be lagocli instead?
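Since run_suite.sh invokes a bare `lago` (that is what fails with "command not found" at its line 84), a quick PATH check shows which entry point the installed packages actually provide. A sketch; `check_cmd` is a hypothetical helper, not part of lago or OST:

```shell
# PATH sanity check (sketch): report whether a command is resolvable.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "available: $1"
    else
        echo "missing: $1"
    fi
}

check_cmd lago
check_cmd lagocli
```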

Comment 4 Dafna Ron 2018-02-26 10:43:59 UTC
(In reply to Douglas Schilling Landgraf from comment #3)
> (In reply to Dafna Ron from comment #2)
> > As Yaniv mentioned on the list, it is probably a race. 
> > the only thing I can think of that would reproduce it is to run OST locally
> > several times and see if it happens randomly. maybe create a short sleep in
> > the code on create storage domain for example to try and delay the previous
> > tests from finishing? 
> 
> Thanks for information.  Yes, that would help. Specially, if we can't
> reproduce it.
> 
> > 
> > to run ost locally you need to clone ovirt-system-tests project and install
> > lago. 
> > if you run ./run_suite <suite name> it would run the tests locally. 
> > 
> > In the mailing list I reported the patch  that failed the OST test. 
> > The way the automation works is that it tests a bunch of changes and zero's
> > in to a single change that may be causing the issue. it does not however
> > mean the change was at fault. 
> > 
> > if you want to create the enviornment without it being deleted you will need
> > to install lago locally and run ost locally.
> 
> 
> $ rpm -qa | grep lago
> python-lago-ovirt-0.6.0-1.fc27.noarch
> python-lago-0.6.0-1.fc27.noarch
> lago-ovirt-0.6.0-1.fc27.noarch
> lago-0.6.0-1.fc27.noarch
> 
> 
> <clone ovirt-system-tests>
> $ ./run_suite.sh basic-suite-4.2
> <snip>
> + lago init /home/douglas/ovirt-system-tests/deployment-basic-suite-4.2
> /home/douglas/ovirt-system-tests/basic-suite-4.2/LagoInitFile
> --template-repo-path
> /home/douglas/ovirt-system-tests/basic-suite-4.2/template-repo.json
> ./run_suite.sh: line 84: lago: command not found
> 
> should it be lagocli instead ?

No :) These are the packages I have:

[dron@dron ds-ovirt-system-tests]$ rpm -qa |grep lago
lago-ovirt-0.44.0-1.el7.centos.noarch
python-lago-0.42.0-1.el7.centos.noarch
python-lago-ovirt-0.44.0-1.el7.centos.noarch
lago-0.42.0-1.el7.centos.noarch
[dron@dron ds-ovirt-system-tests]$ 


But ping me if there is any issue running the tests.

Comment 5 Douglas Schilling Landgraf 2018-03-09 15:25:18 UTC
As we discussed, Jenkins only triggered this once and I can't reproduce it either.
For now, closing this report. Feel free to re-open, Dafna.

Thanks!