We failed a test in OST: 003_00_metrics_bootstrap.metrics_and_log_collector
The failure reason was "No such file or directory".
I don't know yet whether it reproduces 100% of the time or randomly, but I will post further details as we have them.

Link to Job: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5829/
Link to all logs: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/5829/artifacts

/var/tmp:
drwxr-x--x. root abrt system_u:object_r:abrt_var_cache_t:s0 abrt
-rw-------. root root unconfined_u:object_r:user_tmp_t:s0 rpm-tmp.aLitM7
-rw-------. root root unconfined_u:object_r:user_tmp_t:s0 rpm-tmp.G2r7IM
-rw-------. root root unconfined_u:object_r:user_tmp_t:s0 rpm-tmp.kVymZE
-rw-------. root root unconfined_u:object_r:user_tmp_t:s0 rpm-tmp.uPDvvU
drwx------. root root system_u:object_r:tmp_t:s0 systemd-private-cd49c74726d5463f8d6f6502380e5e12-chronyd.service-i1T5IE
drwx------. root root system_u:object_r:tmp_t:s0 systemd-private-cd49c74726d5463f8d6f6502380e5e12-systemd-timedated.service-lhoUsS

/var/tmp/abrt:
-rw-------. root root system_u:object_r:abrt_var_cache_t:s0 last-via-server

/var/tmp/systemd-private-cd49c74726d5463f8d6f6502380e5e12-chronyd.service-i1T5IE:
drwxrwxrwt. root root system_u:object_r:tmp_t:s0 tmp

/var/tmp/systemd-private-cd49c74726d5463f8d6f6502380e5e12-chronyd.service-i1T5IE/tmp:

/var/tmp/systemd-private-cd49c74726d5463f8d6f6502380e5e12-systemd-timedated.service-lhoUsS:
drwxrwxrwt. root root system_u:object_r:tmp_t:s0 tmp

/var/tmp/systemd-private-cd49c74726d5463f8d6f6502380e5e12-systemd-timedated.service-lhoUsS/tmp:

/var/yp:
)
2018-02-22 07:24:05::DEBUG::__main__::251::root:: STDERR(/bin/ls: cannot open directory /rhev/data-center/mnt/blockSD/6babba93-09c8-4846-9ccb-07728f72eecb/master/tasks/bd563276-5092-4d28-86c4-63aa6c0b4344.temp: No such file or directory
)
2018-02-22 07:24:05::ERROR::__main__::832::root:: Failed to collect logs from: lago-basic-suite-master-host-0; /bin/ls: cannot open directory /rhev/data-center/mnt/blockSD/6babba93-09c8-4846-9ccb-07728f72eecb/master/tasks/bd563276-5092-4d28-86c4-63aa6c0b4344.temp: No such file or directory
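
The error in the log is consistent with a directory (here a task's .temp directory) disappearing between the moment a recursive listing sees its name and the moment it tries to open it. This is only one possible reading of the log and has not been confirmed. As an illustration, here is a minimal Python sketch of a best-effort recursive listing that records vanished directories instead of failing. The function name list_tree is hypothetical; it is not taken from the log collector, which the log shows calling /bin/ls.

```python
import os


def list_tree(root):
    """Best-effort recursive listing that tolerates directories vanishing mid-walk.

    Hypothetical helper for illustration only; not the collector's code.
    Returns (listing, skipped), where skipped holds (path, reason) pairs
    for directories that could not be opened.
    """
    skipped = []

    def note_error(err):
        # os.walk passes the OSError here instead of raising it.
        skipped.append((err.filename, err.strerror))

    listing = []
    for dirpath, dirnames, filenames in os.walk(root, onerror=note_error):
        for name in sorted(dirnames + filenames):
            listing.append(os.path.join(dirpath, name))
    return listing, skipped
```

With this approach a directory removed during the walk ends up in skipped rather than aborting the whole collection. Whether skipping is acceptable for log collection is a design decision for the collector's maintainers.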
Thanks Dafna, a reproducer would be appreciated.

I see you mentioned some 'suspected patches' on the mailing list [1]. Why do you suspect these changes? Can you share how to set up such an environment on a local test machine?

[1] https://www.mail-archive.com/infra@ovirt.org/msg32099.html

Sandro, any ideas?
As Yaniv mentioned on the list, it is probably a race.
The only thing I can think of that would reproduce it is to run OST locally several times and see if it happens randomly. Maybe add a short sleep in the code, for example in create storage domain, to try to delay the previous tests from finishing?

To run OST locally you need to clone the ovirt-system-tests project and install lago.
If you run ./run_suite <suite name>, it will run the tests locally.

On the mailing list I reported the patch that failed the OST test.
The way the automation works is that it tests a batch of changes and zeroes in on a single change that may be causing the issue. That does not, however, mean the change was at fault.

If you want to create the environment without it being deleted, you will need to install lago locally and run OST locally.
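
The "run it several times and see if it happens randomly" idea can be sketched as a small loop. The following is a rough Python sketch, not part of OST or lago: the function name run_until_failure and its parameters are made up for illustration, and the command and signature string you pass in are up to you.

```python
import subprocess


def run_until_failure(cmd, signature, max_runs=10):
    """Run cmd repeatedly until it fails or its output contains signature.

    Hypothetical helper for illustration. Returns (attempt, output) for the
    first failing run, or (None, "") if every run passed.
    """
    for attempt in range(1, max_runs + 1):
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # fold stderr into the captured output
            text=True,
        )
        if result.returncode != 0 or signature in result.stdout:
            return attempt, result.stdout
    return None, ""


# Example call (assumes the current directory is an ovirt-system-tests
# checkout; replace the suite name with the one you want to run):
# run_until_failure(["./run_suite.sh", "<suite name>"],
#                   "No such file or directory", max_runs=20)
```

Note that any non-zero exit stops the loop, not only the error being hunted, so the returned output still has to be checked by hand.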
(In reply to Dafna Ron from comment #2)
> As Yaniv mentioned on the list, it is probably a race.
> the only thing I can think of that would reproduce it is to run OST locally
> several times and see if it happens randomly. maybe create a short sleep in
> the code on create storage domain for example to try and delay the previous
> tests from finishing?

Thanks for the information. Yes, that would help, especially if we can't reproduce it.

>
> to run ost locally you need to clone ovirt-system-tests project and install
> lago.
> if you run ./run_suite <suite name> it would run the tests locally.
>
> In the mailing list I reported the patch that failed the OST test.
> The way the automation works is that it tests a bunch of changes and zero's
> in to a single change that may be causing the issue. it does not however
> mean the change was at fault.
>
> if you want to create the enviornment without it being deleted you will need
> to install lago locally and run ost locally.

$ rpm -qa | grep lago
python-lago-ovirt-0.6.0-1.fc27.noarch
python-lago-0.6.0-1.fc27.noarch
lago-ovirt-0.6.0-1.fc27.noarch
lago-0.6.0-1.fc27.noarch

<clone ovirt-system-tests>
$ ./run_suite.sh basic-suite-4.2
<snip>
+ lago init /home/douglas/ovirt-system-tests/deployment-basic-suite-4.2 /home/douglas/ovirt-system-tests/basic-suite-4.2/LagoInitFile --template-repo-path /home/douglas/ovirt-system-tests/basic-suite-4.2/template-repo.json
./run_suite.sh: line 84: lago: command not found

Should it be lagocli instead?
(In reply to Douglas Schilling Landgraf from comment #3)
> (In reply to Dafna Ron from comment #2)
> > As Yaniv mentioned on the list, it is probably a race.
> > the only thing I can think of that would reproduce it is to run OST locally
> > several times and see if it happens randomly. maybe create a short sleep in
> > the code on create storage domain for example to try and delay the previous
> > tests from finishing?
>
> Thanks for information. Yes, that would help. Specially, if we can't
> reproduce it.
>
> >
> > to run ost locally you need to clone ovirt-system-tests project and install
> > lago.
> > if you run ./run_suite <suite name> it would run the tests locally.
> >
> > In the mailing list I reported the patch that failed the OST test.
> > The way the automation works is that it tests a bunch of changes and zero's
> > in to a single change that may be causing the issue. it does not however
> > mean the change was at fault.
> >
> > if you want to create the enviornment without it being deleted you will need
> > to install lago locally and run ost locally.
>
>
> $ rpm -qa | grep lago
> python-lago-ovirt-0.6.0-1.fc27.noarch
> python-lago-0.6.0-1.fc27.noarch
> lago-ovirt-0.6.0-1.fc27.noarch
> lago-0.6.0-1.fc27.noarch
>
>
> <clone ovirt-system-tests>
> $ ./run_suite.sh basic-suite-4.2
> <snip>
> + lago init /home/douglas/ovirt-system-tests/deployment-basic-suite-4.2
> /home/douglas/ovirt-system-tests/basic-suite-4.2/LagoInitFile
> --template-repo-path
> /home/douglas/ovirt-system-tests/basic-suite-4.2/template-repo.json
> ./run_suite.sh: line 84: lago: command not found
>
> should it be lagocli instead ?

No :)
These are the packages I have:

[dron@dron ds-ovirt-system-tests]$ rpm -qa |grep lago
lago-ovirt-0.44.0-1.el7.centos.noarch
python-lago-0.42.0-1.el7.centos.noarch
python-lago-ovirt-0.44.0-1.el7.centos.noarch
lago-0.42.0-1.el7.centos.noarch
[dron@dron ds-ovirt-system-tests]$

But ping me if there is any issue running the tests.
As we discussed, Jenkins only triggered this once and I can't reproduce it either. Closing this report for now. Feel free to re-open it, Dafna.

Thanks!