
Bug 1409834

Summary: Exception on VDSM after host becomes non-responsive
Product: [oVirt] vdsm
Reporter: Arik <ahadas>
Component: General
Assignee: Milan Zamazal <mzamazal>
Status: CLOSED CURRENTRELEASE
QA Contact: sefi litmanovich <slitmano>
Severity: medium
Docs Contact:
Priority: unspecified
Version: ---
CC: ahadas, bugs, gklein, mgoldboi, tjelinek
Target Milestone: ovirt-4.1.0-beta
Flags: rule-engine: ovirt-4.1+
       rule-engine: planning_ack+
       tjelinek: devel_ack+
       mavital: testing_ack+
Target Release: 4.19.2
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-02-01 14:34:12 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
vdsm log none

Description Arik 2017-01-03 15:08:11 UTC
Created attachment 1236910
vdsm log

Description of problem:
After unplugging the network cable from the host, waiting for the host to become non-responsive, and then plugging the cable back in, VDSM is down with an error.

Version-Release number of selected component (if applicable):


How reproducible:


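Steps to Reproduce:
1. Unplug the network cable from the host.
2. Wait for the host to become non-responsive.
3. Plug the cable back in.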

Actual results:
vdsm jsonrpc.JsonRpcServer ERROR Internal server error
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/yajsonrpc/__init__.py", line 547, in _handle_request
    res = method(**params)
  File "/usr/lib/python2.7/site-packages/vdsm/rpc/Bridge.py", line 202, in _dynamicMethod
    result = fn(*methodArgs)
  File "/usr/share/vdsm/API.py", line 1410, in getAllVmIoTunePolicies
    io_tune_policies_dict = self._cif.getAllVmIoTunePolicies()
  File "/usr/share/vdsm/clientIF.py", line 447, in getAllVmIoTunePolicies
    vm_io_tune_policies[v.id] = {'policy': v.getIoTunePolicy(),
  File "/usr/share/vdsm/virt/vm.py", line 2730, in getIoTunePolicy
    io_tune = vmxml.find_first(qos, "ioTune", None)
  File "/usr/share/vdsm/virt/vmxml.py", line 110, in find_first
    return next(find_all(element, tag))
  File "/usr/share/vdsm/virt/vmxml.py", line 89, in find_all
    if tag(element) == tag_:
  File "/usr/share/vdsm/virt/vmxml.py", line 148, in tag
    return element.tag
AttributeError: 'NoneType' object has no attribute 'tag'

Expected results:
VDSM is up

Additional info:
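The traceback suggests that getIoTunePolicy passed a qos element that was None to vmxml.find_first, and that the None default there only covers a missing tag, not a missing element. Below is a minimal Python sketch of that failure mode and one possible guard. It is an assumption-laden illustration: tag, find_all and find_first are simplified, hypothetical stand-ins built on xml.etree.ElementTree, not the VDSM implementation, and the guard is not necessarily how VDSM addressed the problem.

```python
import xml.etree.ElementTree as ET

# Sentinel so that None can be passed as a real default value (hypothetical).
_MISSING = object()


def tag(element):
    # Raises AttributeError when element is None, as in the traceback.
    return element.tag


def find_all(element, tag_):
    """Yield element and every descendant whose tag equals tag_."""
    if tag(element) == tag_:
        yield element
    for child in element:
        for match in find_all(child, tag_):
            yield match


def find_first(element, tag_, default=_MISSING):
    """Return the first match, tolerating element=None if a default is given."""
    if element is None:
        # Guard: the parent element itself is absent.
        if default is _MISSING:
            raise LookupError("no element to search for %r" % (tag_,))
        return default
    try:
        return next(find_all(element, tag_))
    except StopIteration:
        if default is _MISSING:
            raise LookupError("no such element: %r" % (tag_,))
        return default


qos = ET.fromstring("<qos><ioTune><device name='vda'/></ioTune></qos>")
print(find_first(qos, "ioTune", None).tag)   # ioTune
print(find_first(None, "ioTune", None))      # None, instead of AttributeError
```

With the guard in place, a caller that has no qos element yet gets the default back and can report an empty policy instead of failing the whole request.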

Comment 1 Dan Kenigsberg 2017-01-03 15:17:59 UTC
which precise vdsm.rpm is this?

Comment 2 Arik 2017-01-03 20:18:47 UTC
(In reply to Dan Kenigsberg from comment #1)
> which precise vdsm.rpm is this?

Version     : 4.20.0
Release     : 38.git59c645a.fc24
From repo   : ovirt-master-snapshot

Comment 3 Moran Goldboim 2017-01-04 09:46:43 UTC
I assume you meant network cable?
wasn't sure about the impact here? is vdsm down afterwards and not coming up?

thanks.

Comment 4 Arik 2017-01-04 09:48:46 UTC
(In reply to Moran Goldboim from comment #3)
> I assume you meant network cable?

Yes :) 

> wasn't sure about the impact here? is vdsm down afterwards and not coming up?

Right, it doesn't come up

Comment 5 Milan Zamazal 2017-01-04 10:07:53 UTC
I can't reproduce the problem (my host's network is only virtual after all) but I understand it can happen under certain circumstances and I believe the posted patch fixes it.

Comment 6 Milan Zamazal 2017-01-04 16:14:09 UTC
Arik, could you please verify whether the patch (http://gerrit.ovirt.org/69555) fixes the problem?

Comment 7 Arik 2017-01-12 22:29:59 UTC
(In reply to Milan Zamazal from comment #6)
> Arik, could you please verify whether the patch
> (http://gerrit.ovirt.org/69555) fixes the problem?

Yes, done.

Comment 8 sefi litmanovich 2017-01-26 11:20:29 UTC
Verified with rhevm-4.1.0.2-0.2.el7 and host: vdsm-4.19.2-2.el7ev.x86_64.
The host is nested, so I unplugged the virtual cable. I hope this is enough; I don't see a reason why the behaviour should be different with a physical cable.
After plugging the network back in, the host returned to the 'up' state and the error from the description no longer appears.