Bug 1574880 - failing 004_basic_sanity.verify_suspend_resume_vm0 [NEEDINFO]
Summary: failing 004_basic_sanity.verify_suspend_resume_vm0
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.2.0
Hardware: x86_64
OS: Linux
Target Milestone: ---
Assignee: Michal Skrivanek
QA Contact: meital avital
Depends On:
Reported: 2018-05-04 09:09 UTC by Dafna Ron
Modified: 2018-08-15 17:05 UTC (History)
4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2018-08-15 17:05:00 UTC
oVirt Team: Virt
dfediuck: needinfo? (dron)
dron: planning_ack?
dron: devel_ack?
dron: testing_ack?

Attachments

Description Dafna Ron 2018-05-04 09:09:01 UTC
We had a failure in OST that is not related to the change being tested.
This is also not the first time I have seen this failure.

This is the failed job: http://jenkins.ovirt.org/job/ovirt-master_change-queue-tester/7220/

From what I can see, the VM was suspended and then received a stop from an unknown process.

2018-05-03T04:11:55.774660Z qemu-kvm: terminating on signal 15 from pid 4782 (<unknown process>)

Grepping for the pid, I can see matching entries in the audit log:

lago-basic-suite-master-host-1/_var_log/audit/audit.log:type=VIRT_RESOURCE msg=audit(1525320727.733:5420): pid=4782 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 msg='virt=kvm resrc=rng reason=start vm="vm0" uuid=23e86d6b-ae2f-4e8e-b55e-339351c9a025 old-rng="?" new-rng="/dev/urandom" exe="/usr/sbin/libvirtd" hostname=? addr=? terminal=? res=success'
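For correlating the audit record with the qemu log line above, the epoch timestamp in `msg=audit(1525320727.733:5420)` can be converted to UTC; a minimal sketch (`audit_to_utc` is just a hypothetical helper name):

```python
import datetime

# The audit record carries epoch seconds: msg=audit(1525320727.733:5420)
AUDIT_TS = 1525320727.733

def audit_to_utc(ts):
    """Convert an audit-record epoch timestamp to an aware UTC datetime."""
    return datetime.datetime.fromtimestamp(ts, datetime.timezone.utc)

print(audit_to_utc(AUDIT_TS).isoformat())
```

This resolves to 2018-05-03T04:12:07 UTC, i.e. shortly after the 04:11:55 termination message above.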

I am not sure which process this pid belongs to, but in the host-deploy log I can see it appearing during the Ansible run:

lago-basic-suite-master-engine/_var_log/ovirt-engine/host-deploy/ovirt-host-deploy-20180503000218-lago-basic-suite-master-host-1-4a67ad42.log:D: create     100644  1 (   0,   0)  4782 /usr/lib/python2.7/site-packages/ansible/modules/cloud/amazon/ec2_customer_gateway_facts.py;5aea8956
lago-basic-suite-master-engine/_var_log/ovirt-engine/host-deploy/ovirt-host-deploy-20180503000218-lago-basic-suite-master-host-1-4a67ad42.log:D: create     100644  2 (   0,   0)  4782 /usr/lib/python2.7/site-packages/ansible/module_utils/network/dellos10/dellos10.pyo;5aea8956
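For reference, when the sending process is still alive, a pid can be mapped to a command name directly from /proc; a minimal Linux-only sketch (`pid_to_name` is a hypothetical helper, and by the time these logs were collected pid 4782 was already gone):

```python
import os

def pid_to_name(pid):
    """Best-effort lookup of a live pid's command name via /proc (Linux only)."""
    try:
        with open("/proc/%d/comm" % pid) as f:
            return f.read().strip()
    except OSError:
        return None  # the pid has already exited

print(pid_to_name(os.getpid()))
```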

Another issue that ykaul noticed is that we are silently failing virt-sparsify (which is hopefully unrelated and may need a separate bug): we do not actually check whether it succeeded, which is why the failure is silent.
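The missing check is essentially the pattern below; a minimal sketch of propagating a command's exit status instead of ignoring it (`run_checked` and `CmdError` are hypothetical names, not vdsm's actual cmdutils API):

```python
import subprocess
import sys

class CmdError(Exception):
    """Raised when a checked command exits non-zero."""
    def __init__(self, cmd, rc, err):
        super().__init__("Command %r failed with rc=%d err=%r" % (cmd, rc, err))
        self.rc = rc

def run_checked(cmd):
    """Run cmd and raise on failure so the error cannot pass silently."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        raise CmdError(cmd, proc.returncode, proc.stderr)
    return proc.stdout
```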

2018-05-03 00:12:36,070-0400 DEBUG (periodic/3) [virt.periodic] Looking for stale paused VMs (periodic:388)
2018-05-03 00:12:36,080-0400 DEBUG (periodic/0) [virt.sampling.VMBulkstatsMonitor] sampled timestamp 4295744.91 elapsed 0.020 acquired True domains all (sampling:447)
2018-05-03 00:12:37,179-0400 DEBUG (tasks/8) [root] FAILED: <err> = "virt-sparsify: error: libguestfs error: guestfs_launch failed.\nThis usually means the libguestfs appliance failed to start or crashed.\nDo:\n  export LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1\nand run the command again.  For further information, read:\n  http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs\nYou can also run 'libguestfs-test-tool' and post the *complete* output\ninto a bug report or message to the libguestfs mailing list.\n\nIf reporting bugs, run virt-sparsify with debugging enabled and include the \ncomplete output:\n\n  virt-sparsify -v -x [...]\n"; <rc> = 1 (commands:87)
2018-05-03 00:12:37,186-0400 INFO  (tasks/8) [storage.SANLock] Releasing Lease(name='4acdf7b8-2a3b-494b-b9db-aba65b78cbc6', path=u'/rhev/data-center/mnt/', offset=0) (clusterlock:435)
2018-05-03 00:12:37,192-0400 INFO  (tasks/8) [storage.SANLock] Successfully released Lease(name='4acdf7b8-2a3b-494b-b9db-aba65b78cbc6', path=u'/rhev/data-center/mnt/', offset=0) (clusterlock:444)
2018-05-03 00:12:37,193-0400 ERROR (tasks/8) [root] Job u'93d447b5-f5b5-45e2-9821-8e54dc04305d' failed (jobs:221)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/jobs.py", line 157, in run
  File "/usr/lib/python2.7/site-packages/vdsm/storage/sdm/api/sparsify_volume.py", line 56, in _run
  File "/usr/lib/python2.7/site-packages/vdsm/virtsparsify.py", line 71, in sparsify_inplace
    raise cmdutils.Error(cmd, rc, out, err)
Error: Command ['/usr/bin/virt-sparsify', '--machine-readable', '--in-place', u'/rhev/data-center/mnt/'] failed with rc=1 out=['3/12'] err=['virt-sparsify: error: libguestfs error: guestfs_launch failed.', 'This usually means the libguestfs appliance failed to start or crashed.', 'Do:', '  export LIBGUESTFS_DEBUG=1 LIBGUESTFS_TRACE=1', 'and run the command again.  For further information, read:', '  http://libguestfs.org/guestfs-faq.1.html#debugging-libguestfs', "You can also run 'libguestfs-test-tool' and post the *complete* output", 'into a bug report or message to the libguestfs mailing list.', '', 'If reporting bugs, run virt-sparsify with debugging enabled and include the ', 'complete output:', '', '  virt-sparsify -v -x [...]']
2018-05-03 00:12:37,194-0400 INFO  (tasks/8) [root] Job u'93d447b5-f5b5-45e2-9821-8e54dc04305d' will be deleted in 3600 seconds (jobs:249)

All the logs can be found in the job.

Comment 1 Doron Fediuck 2018-06-14 09:19:45 UTC
Is this still relevant?

Comment 2 Ryan Barry 2018-08-15 17:05:00 UTC
Closing since there's no response.

Please re-open if this is still relevant, Dafna.
