Created attachment 1216071 [details]
logs

Description of problem:
I can see this bug only in automation runs. The test starts the VM and, while the VM is in the "Powering Up" state, migrates it.

Version-Release number of selected component (if applicable):
vdsm-yajsonrpc-4.18.15.2-1.el7ev.noarch
vdsm-jsonrpc-4.18.15.2-1.el7ev.noarch
vdsm-hook-vmfex-dev-4.18.15.2-1.el7ev.noarch
vdsm-hook-openstacknet-4.18.15.2-1.el7ev.noarch
vdsm-api-4.18.15.2-1.el7ev.noarch
vdsm-infra-4.18.15.2-1.el7ev.noarch
vdsm-hook-vhostmd-4.18.15.2-1.el7ev.noarch
vdsm-python-4.18.15.2-1.el7ev.noarch
vdsm-cli-4.18.15.2-1.el7ev.noarch
vdsm-4.18.15.2-1.el7ev.x86_64
vdsm-hook-ethtool-options-4.18.15.2-1.el7ev.noarch
vdsm-xmlrpc-4.18.15.2-1.el7ev.noarch
vdsm-hook-fcoe-4.18.15.2-1.el7ev.noarch
ovirt-engine-4.0.5.4-0.1.el7ev.noarch

How reproducible:
10%, only under automation tests

Steps to Reproduce:
1. Start the VM
2. Migrate the VM while it is in the "Powering Up" state

Actual results:
The migration succeeds, but the VM appears in the paused state on the destination host.

Expected results:
The migration succeeds and the VM has state "Up".

Additional info:
I am not sure whether the bug relates to the engine or to vdsm.
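For illustration only, here is a minimal sketch of the reproduction flow using the oVirt Python SDK (ovirt-engine-sdk-python 4.x). The engine URL, credentials, and VM name are placeholders, not taken from the actual automation suite, which is not shown in this report.

    # Sketch of "start the VM, then migrate it while it is still Powering Up".
    # All connection details and the VM name below are placeholders.
    import time
    import ovirtsdk4 as sdk
    import ovirtsdk4.types as types

    connection = sdk.Connection(
        url='https://engine.example.com/ovirt-engine/api',  # placeholder engine URL
        username='admin@internal',
        password='secret',
        insecure=True,
    )
    vms_service = connection.system_service().vms_service()
    vm = vms_service.list(search='name=migration_test_vm')[0]  # placeholder VM name
    vm_service = vms_service.vm_service(vm.id)

    vm_service.start()

    # Wait until the VM reaches "Powering Up", then trigger the migration right away.
    while vm_service.get().status != types.VmStatus.POWERING_UP:
        time.sleep(1)
    vm_service.migrate()  # let the engine pick the destination host

    connection.close()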
Is this a regression? Did the same test pass with vdsm-v4.18.2? What happens if you manually continue the VM on the destination? Does the guest start running all right?
As I said, it happens only under automation runs (and not always in the same test case), so I cannot reproduce it manually. I believe it is a regression because the bug appears only in the latest automation runs, but again I am not 100% sure, because out of all the tests we run it happens only in a single test case.
Artyom, what is the most recent version of vdsm that passes this test 100%? Can you stop the test after the failure, in order to see what happens inside the guest and whether a manual "continue" solves things?
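As an illustration of the requested manual "continue", a minimal sketch using the libvirt Python bindings directly on the destination host (the domain name is a placeholder; in practice the same can be done with "virsh resume <domain>"):

    # Resume the paused domain on the destination host, assuming shell/root access to it.
    import libvirt

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('migration_test_vm')  # placeholder domain name
    if dom.info()[0] == libvirt.VIR_DOMAIN_PAUSED:
        dom.resume()  # equivalent to "virsh resume <domain>"
    conn.close()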
We do not have such a long history of runs under Jenkins, but from email I can see that 4.0.4-1 does not have tests failing because of this bug. I will try to catch the bug in my local automation environment and check the guest, but I am not sure I will succeed. Until then, can someone take a look at the logs?
Hmmm, I have not found anything too interesting in the logs. Could you please try to provide the qemu and libvirt logs from the destination host? Thanks.
I do not have the libvirt and qemu logs for the automation run. I tried to reproduce this problem locally, without result, over 100 iterations of:
1) Start the VM
2) Migrate the VM straight after start
3) Stop the VM
Maybe this problem relates to some problematic host. I think you can close this as insufficient data; if I see this error again I will attach the logs.
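For reference, a sketch of the local reproduction loop described above, again with the oVirt Python SDK and placeholder connection details (the real automation code is not part of this report):

    # 100 iterations of: start, migrate right after start, stop.
    import time
    import ovirtsdk4 as sdk
    import ovirtsdk4.types as types

    connection = sdk.Connection(
        url='https://engine.example.com/ovirt-engine/api',  # placeholder
        username='admin@internal',
        password='secret',
        insecure=True,
    )
    vms_service = connection.system_service().vms_service()
    vm = vms_service.list(search='name=migration_test_vm')[0]  # placeholder VM name
    vm_service = vms_service.vm_service(vm.id)

    def wait_for(status, timeout=300):
        """Poll the engine until the VM reaches the given status."""
        deadline = time.time() + timeout
        while vm_service.get().status != status:
            if time.time() > deadline:
                raise RuntimeError('timed out waiting for %s' % status)
            time.sleep(2)

    for i in range(100):
        vm_service.start()
        wait_for(types.VmStatus.POWERING_UP)
        vm_service.migrate()          # migrate straight after start
        wait_for(types.VmStatus.UP)   # would fail if the VM ends up paused on the destination
        vm_service.stop()
        wait_for(types.VmStatus.DOWN)

    connection.close()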
Thank you! Closing as insufficient data, since without the libvirt/qemu logs there is not much we can do. If it happens again and you have these logs, please reopen.