Bug 1478965 - Vdsm shows traceback when HE VM powered-off
Status: VERIFIED
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.20.0
Hardware: x86_64 Linux
Priority: medium
Severity: low
Target Milestone: ovirt-4.2.0
Target Release: ---
Assigned To: Francesco Romani
QA Contact: Artyom
Keywords: Triaged
Reported: 2017-08-07 11:15 EDT by Artyom
Modified: 2017-10-25 06:40 EDT

Type: Bug
oVirt Team: Network
rule-engine: ovirt‑4.2+


Attachments
vdsm log (14.57 MB, text/plain)
2017-08-07 11:15 EDT, Artyom
vdsm log (309.99 KB, text/plain)
2017-10-02 08:05 EDT, Artyom
vdsm log and VM dumpxml (53.72 KB, application/zip)
2017-10-03 04:59 EDT, Artyom
Another full Vdsm log of the same occurrence (956.17 KB, application/x-gzip)
2017-10-12 05:20 EDT, Francesco Romani


External Trackers
Tracker: oVirt gerrit
ID: 80393
Priority: master
Status: MERGED
Summary: virt net: Avoid removing the display network when not defined
Last Updated: 2017-08-23 03:09 EDT

Description Artyom 2017-08-07 11:15:47 EDT
Created attachment 1310159 [details]
vdsm log

Description of problem:
Vdsm shows traceback when HE VM powered-off
2017-08-07 18:09:32,164+0300 ERROR (jsonrpc/0) [virt.vm] (vmId='b58fdeda-45bb-43d2-b336-ef9953171347') Failed to tear down device vnc, device in inconsistent state (vm:2120)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2117, in _teardown_devices
    device.teardown()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vmdevices/graphics.py", line 75, in teardown
    self.vmid)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/libvirtnetwork.py", line 99, in delete_network
    removeNetwork(netname)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/libvirtnetwork.py", line 110, in removeNetwork
    netName = LIBVIRT_NET_PREFIX + network
TypeError: cannot concatenate 'str' and 'NoneType' objects


Version-Release number of selected component (if applicable):
vdsm-4.20.2-25.git7499b81.el7.centos.x86_64
libvirt-client-3.2.0-14.el7_4.2.x86_64
ovirt-engine-4.2.0-0.0.master.20170803140556.git1e7d0dd.el7.centos.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy hosted-engine
2. Add master storage domain
3. Wait for auto-import operation
4. Set global maintenance
5. Power off the HE VM via the HE CLI: # hosted-engine --vm-poweroff

Actual results:
The action succeeds, but the traceback above appears in the Vdsm log.

Expected results:
The action succeeds without any tracebacks in the Vdsm log.

Additional info:
Comment 1 Doron Fediuck 2017-08-08 05:31:37 EDT
Dan,
care to take a look?
By default, hosted engine does not do anything specific with the network, so this is new to me.
Comment 2 Dan Kenigsberg 2017-08-08 06:16:14 EDT
It smells as if Hosted Engine is relying on an ancient default, which Vdsm has recently dropped. Edy, would you care to look?

If my guess is correct, I'd rather we fix it in hosted engine.
Comment 3 Edward Haas 2017-08-08 07:00:02 EDT
If "displayNetwork" is not specified for a VM, we will encounter this problem.
When displayNetwork is not specified, the VM console can be reached from all existing interfaces, which is usually not safe.

We should fix this in both domains: Vdsm should not explode when displayNetwork is missing, and a VM (Engine) should not be defined without a displayNetwork.
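
For illustration only, the "do not explode" half could look roughly like the sketch below. It is not the actual Vdsm patch: the function names and the callback are hypothetical, and the value of LIBVIRT_NET_PREFIX is assumed. The idea matches the summary of the gerrit change listed above (avoid removing the display network when it is not defined).

```python
# Assumed value, for illustration only; the real constant lives in Vdsm.
LIBVIRT_NET_PREFIX = 'vdsm-'


def libvirt_net_name(network):
    # Same expression as the failing line in the traceback:
    # it raises TypeError when network is None.
    return LIBVIRT_NET_PREFIX + network


def teardown_display_network(display_network, remove_network):
    """Remove the display network only if one was defined.

    remove_network is a hypothetical callback standing in for the
    real libvirt network removal.
    Returns True if removal was attempted, False if it was skipped.
    """
    if display_network is None:
        # Nothing was set up for this VM, so there is nothing to remove.
        return False
    remove_network(libvirt_net_name(display_network))
    return True


removed = []
skipped = teardown_display_network(None, removed.append)
done = teardown_display_network('ovirtmgmt', removed.append)
```

With the guard in place, a VM without a display network is torn down without touching any libvirt network, while a VM with one behaves as before.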
Comment 4 Artyom 2017-09-25 03:54:36 EDT
Checked on vdsm-4.20.3-88.git6102334.el7.centos.x86_64
Now I have another error
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2180, in _teardown_devices
    device.teardown()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vmdevices/graphics.py", line 74, in teardown
    display_network = self.specParams['displayNetwork']
KeyError: 'displayNetwork'

Change "display_network = self.specParams['displayNetwork']" to "display_network = self.specParams.get('displayNetwork')"
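
A minimal sketch of the difference between the two lookups, using a plain dict as a stand-in for the device's specParams (the empty dict is a hypothetical example, not data from the logs):

```python
# Hypothetical specParams with no 'displayNetwork' entry.
spec_params = {}

# Subscript access raises KeyError when the key is absent.
try:
    spec_params['displayNetwork']
    lookup_failed = False
except KeyError:
    lookup_failed = True

# dict.get returns None instead of raising.
display_network = spec_params.get('displayNetwork')
```

Note that .get only moves the problem: the resulting None still has to be handled by the caller, for example by skipping the network removal.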
Comment 5 Edward Haas 2017-10-01 02:05:39 EDT
If the displayNetwork key is missing, I would expect to fail on setup already.

fromani, any idea what is going on?
Comment 6 Francesco Romani 2017-10-02 03:01:11 EDT
(In reply to Edward Haas from comment #5)
> If the displayNetwork key is missing, I would expect to fail on setup
> already.
> 
> fromani, any idea what is going on?

Most likely, with the switch to Engine XML, Engine stopped sending the 'displayNetwork' specParam. This should be easy to check in the Vdsm logs; I will do it later if no one else can check quickly.
Comment 7 Francesco Romani 2017-10-02 05:37:26 EDT
(In reply to Artyom from comment #4)
> Checked on vdsm-4.20.3-88.git6102334.el7.centos.x86_64
> Now I have another error
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2180, in
> _teardown_devices
>     device.teardown()
>   File "/usr/lib/python2.7/site-packages/vdsm/virt/vmdevices/graphics.py",
> line 74, in teardown
>     display_network = self.specParams['displayNetwork']
> KeyError: 'displayNetwork'
> 
> Change "display_network = self.specParams['displayNetwork']" to
> "display_network = self.specParams.get('displayNetwork')"

Please share the Vdsm logs including the VM startup and shutdown.
Comment 8 Artyom 2017-10-02 08:05 EDT
Created attachment 1333210 [details]
vdsm log

You can start checking from the last traceback.
Comment 9 Francesco Romani 2017-10-02 08:38:53 EDT
What exactly happened on that box? I took the liberty of logging in to it and looking at the logs, and it seems that Vdsm was upgraded while the VM was running.

I'm not sure, but it seems that Vdsm failed to store, or to recover, the specParams in the domain XML, hence the failure on teardown.

I need a clean reproducer with all the logs to be sure.
The log snippet should include:
1. VM started, dump of the creation parameters
2. VM shutdown, traceback

If possible, this will be much appreciated:
3. output of virsh -r dumpxml while the VM is running
Comment 10 Artyom 2017-10-03 04:59 EDT
Created attachment 1333564 [details]
vdsm log and VM dumpxml

I added a Vdsm log that covers the start and destroy of the HE VM; the dumpxml is also in the archive.
Comment 11 Dan Kenigsberg 2017-10-03 06:27:12 EDT
According to fromani, displayNetwork was missing on destroy() because it was not stored in the domxml metadata during startup, and was therefore lost when vdsmd was restarted.
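
To make the mechanism concrete, here is a simplified sketch of persisting a value in the domain XML and reading it back after a restart. It does not reproduce Vdsm's real metadata layout: the helper names and the bare <displayNetwork> element are hypothetical, and real libvirt metadata requires custom elements to be namespaced, which is omitted here for brevity.

```python
import xml.etree.ElementTree as ET


def store_display_network(domain_xml, display_network):
    """Return a copy of domain_xml with the value saved under <metadata>."""
    root = ET.fromstring(domain_xml)
    metadata = root.find('metadata')
    if metadata is None:
        metadata = ET.SubElement(root, 'metadata')
    elem = ET.SubElement(metadata, 'displayNetwork')
    elem.text = display_network
    return ET.tostring(root, encoding='unicode')


def recover_display_network(domain_xml):
    """Return the stored value, or None if it was never saved."""
    root = ET.fromstring(domain_xml)
    return root.findtext('metadata/displayNetwork')


plain_xml = '<domain><name>HostedEngine</name></domain>'
stored_xml = store_display_network(plain_xml, 'ovirtmgmt')
```

If the value is never written at startup, recovery yields None, which is the situation that then surfaced on destroy().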
Comment 12 Francesco Romani 2017-10-12 05:20 EDT
Created attachment 1337634 [details]
Another full Vdsm log of the same occurrence
Comment 13 Francesco Romani 2017-10-12 07:09:15 EDT
(In reply to Francesco Romani from comment #12)
> Created attachment 1337634 [details]
> Another full Vdsm log of the same occurrence

Still can't reproduce the bug.
Comment 14 Francesco Romani 2017-10-12 08:29:00 EDT
There is a high chance this bug was fixed by commit 70daa1566 in master branch.
Comment 15 Yaniv Kaul 2017-10-15 07:02:15 EDT
What is preventing this from moving to MODIFIED?
Comment 16 Francesco Romani 2017-10-16 03:24:37 EDT
(In reply to Francesco Romani from comment #14)
> There is a high chance this bug was fixed by commit 70daa1566 in master
> branch.

I'm confident this is the fix, moving to MODIFIED
Comment 17 Artyom 2017-10-17 04:11:07 EDT
Verified on vdsm-4.20.3-173.gitae7a215.el7.centos.x86_64
