Bug 1551971
| Summary: | [OVN] cannot start VM with ovn network | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | [oVirt] vdsm | Reporter: | Michael Burman <mburman> | ||||||
| Component: | General | Assignee: | Francesco Romani <fromani> | ||||||
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Michael Burman <mburman> | ||||||
| Severity: | urgent | Docs Contact: | |||||||
| Priority: | high | ||||||||
| Version: | 4.20.19 | CC: | ahadas, amusil, bugs, danken, fromani, lveyde, mburman, michal.skrivanek, mkalfon | ||||||
| Target Milestone: | ovirt-4.2.2 | Keywords: | Regression | ||||||
| Target Release: | --- | Flags: | rule-engine: ovirt-4.2+, rule-engine: blocker+ | ||||||
| Hardware: | x86_64 | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | vdsm v4.20.22 | Doc Type: | If docs needed, set a value | ||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2018-03-29 11:14:34 UTC | Type: | Bug | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Embargoed: | |||||||||
| Bug Depends On: | |||||||||
| Bug Blocks: | 1535006 | ||||||||
| Attachments: |
|
||||||||
Adding some logging to /usr/libexec/vdsm/hooks/before_device_create/ovirt_provider_ovn_hook shows that it is executed and does its job. It seems that vdsm ignores its output and uses the Engine-generated device XML instead. Thus I believe that this is a recent virt regression.

final iface
<interface type="bridge"><address bus="0x00" domain="0x0000" function="0x0" slot="0x03" type="pci"/><mac address="00:00:00:00:00:20"/><model type="virtio"/><source bridge="br-int"/><filterref filter="vdsm-no-mac-spoofing"/><boot order="2"/><alias name="ua-77b53de8-97d5-4c90-96ca-6ab12b34f96f"/><virtualport type="openvswitch"><parameters interfaceid="11572e42-47fb-4372-9b3d-7a86aa21081c"/></virtualport></interface>

The problem does not occur on hotplug.

This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

vmfex is affected as well; the wrong interface type is passed in the XML.

Patch https://gerrit.ovirt.org/#/c/88547/ is supposed to fix this. RPMs: http://jenkins.ovirt.org/job/vdsm_master_build-artifacts-on-demand-el7-x86_64/821/

(In reply to Francesco Romani from comment #4)
> Patch https://gerrit.ovirt.org/#/c/88547/ is supposed to fix this. RPMs:
> http://jenkins.ovirt.org/job/vdsm_master_build-artifacts-on-demand-el7-x86_64/821/

Hi Francesco, I have verified the OVN + vmfex flow with this patch (will give you +1).

Hi,
A VM with an SR-IOV vNIC can't start either. Is it the same issue/bug?
2018-03-15 10:27:05,788+0200 ERROR (vm/f9bd0e85) [virt.vm] (vmId='f9bd0e85-54e5-46ec-9ae7-5a61861120a9') The vm start process failed (vm:940)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 869, in _startUnderlyingVm
self._run()
File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2829, in _run
dom = self._connection.defineXML(domxml)
File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper
ret = f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper
return func(inst, *args, **kwargs)
File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3676, in defineXML
if ret is None:raise libvirtError('virDomainDefineXML() failed', conn=self)
libvirtError: XML error: non unique alias detected: ua-04c2decd-4e33-4023-84de-a2205c777af7
2018-03-15 10:27:05,789+0200 INFO (vm/f9bd0e85) [virt.vm] (vmId='f9bd0e85-54e5-46ec-9ae7-5a61861120a9') Changed state to Down: XML error: non unique alias detected: ua-04c2decd-4e33-4023-84de-a2205c777af7 (code=1) (vm:1677)
(In reply to Michael Burman from comment #6)
> Hi
>
> VM with SR-IOV vNIC can't start as well, does it the same issue/bug?
>
> 2018-03-15 10:27:05,788+0200 ERROR (vm/f9bd0e85) [virt.vm]
> (vmId='f9bd0e85-54e5-46ec-9ae7-5a61861120a9') The vm start process failed
> (vm:940)
> Traceback (most recent call last):
> File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 869, in
> _startUnderlyingVm
> self._run()
> File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2829, in _run
> dom = self._connection.defineXML(domxml)
> File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py",
> line 130, in wrapper
> ret = f(*args, **kwargs)
> File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92,
> in wrapper
> return func(inst, *args, **kwargs)
> File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3676, in
> defineXML
> if ret is None:raise libvirtError('virDomainDefineXML() failed',
> conn=self)
> libvirtError: XML error: non unique alias detected:
> ua-04c2decd-4e33-4023-84de-a2205c777af7
> 2018-03-15 10:27:05,789+0200 INFO (vm/f9bd0e85) [virt.vm]
> (vmId='f9bd0e85-54e5-46ec-9ae7-5a61861120a9') Changed state to Down: XML
> error: non unique alias detected: ua-04c2decd-4e33-4023-84de-a2205c777af7
> (code=1) (vm:1677)

This is likely caused by some recent changes in Engine about management of aliases. Let's ping Arik for more insights.

(In reply to Francesco Romani from comment #7)
> (In reply to Michael Burman from comment #6)
>
> This is likely caused by some recent changes in Engine about management of
> aliases. Let's ping Arik for more insights.

Michael, can you please attach the engine and vdsm logs of this failure?

(In reply to Arik from comment #8)
> (In reply to Francesco Romani from comment #7)
> > (In reply to Michael Burman from comment #6)
> >
> > This is likely caused by some recent changes in Engine about management of
> > aliases. Let's ping Arik for more insights.
>
> Michael, can you please attach the engine and vdsm logs of this failure?

Sure thing. Needless to say, the versions are the latest: 4.2.2.2-0.1.el7 and vdsm-4.20.20-1.el7ev.x86_64

Created attachment 1408384 [details]
sr-iov vm failed to run
(In reply to Michael Burman from comment #10)
> Created attachment 1408384 [details]
> sr-iov vm failed to run

Thanks.
The XML we generate seems valid.
The fix for the recent issue with user-aliases we reported to libvirt seems specific to unplugging and then plugging a device with the same user-alias, but its investigation led to a few other changes in that area in libvirt. Can we test this flow against a libvirt version that includes all those recent changes?

(In reply to Arik from comment #11)
> (In reply to Michael Burman from comment #10)
> > Created attachment 1408384 [details]
> > sr-iov vm failed to run
>
> Thanks.
> The XML we generate seems valid.
> The fix for the recent issue with user-aliases we reported to libvirt seems
> specific to unplugging and then plugging a device with the same user-alias,
> but its investigation led to a few other changes in that area in libvirt. Can
> we test this flow against a libvirt version that includes all those recent
> changes?

Do you mean that this bug is blocked on libvirt as well (I saw Francesco uploaded patches here)? Then we need to change the summary and wait for the libvirt fix. I'm wondering if we need a new bug for this specific issue. Neither this bug nor the new SR-IOV issue involves unplug/plug.

Anyhow, even if the new libvirt fixes this issue, it deserves a bug to track it. Reporting a new bug for the SR-IOV flow.

Arik, the issue reproduced with the new libvirt, libvirt-3.9.0-14.el7.x86_64:
2018-03-15 12:27:16,562+0200 ERROR (vm/f9bd0e85) [virt.vm] (vmId='f9bd0e85-54e5-46ec-9ae7-5a61861120a9') The vm start process failed (vm:940)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 869, in _startUnderlyingVm
self._run()
File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2829, in _run
dom = self._connection.defineXML(domxml)
File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper
ret = f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper
return func(inst, *args, **kwargs)
File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3676, in defineXML
if ret is None:raise libvirtError('virDomainDefineXML() failed', conn=self)
libvirtError: XML error: non unique alias detected: ua-04c2decd-4e33-4023-84de-a2205c777af7
2018-03-15 12:27:16,566+0200 INFO (vm/f9bd0e85) [virt.vm] (vmId='f9bd0e85-54e5-46ec-9ae7-5a61861120a9') Changed state to Down: XML error: non unique alias detected: ua-04c2decd-4e33-4023-84de-a2205c777af7 (code=1) (vm:1677)
I have reported a new bug, BZ 1556828.
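
For anyone triaging the "non unique alias detected" error from the tracebacks above, one way to check whether a given domain XML really carries a repeated alias is to count the name attribute of every alias element. The following is only a minimal sketch using the Python standard library: the function name and the sample XML (including the alias ua-dup-example) are made up for illustration and are not taken from vdsm, Engine, or the attached logs.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def find_duplicate_aliases(domain_xml):
    """Return the alias names that appear more than once in a domain XML string."""
    root = ET.fromstring(domain_xml)
    # Collect the name attribute of every <alias> element, at any depth.
    names = [a.get("name") for a in root.iter("alias") if a.get("name")]
    counts = Counter(names)
    return sorted(name for name, seen in counts.items() if seen > 1)


# Hypothetical, heavily trimmed domain XML; not copied from the bug's logs.
SAMPLE_DOMAIN_XML = """
<domain type="kvm">
  <devices>
    <interface type="bridge">
      <alias name="ua-dup-example"/>
    </interface>
    <interface type="bridge">
      <alias name="ua-dup-example"/>
    </interface>
    <disk type="file">
      <alias name="ua-disk-example"/>
    </disk>
  </devices>
</domain>
"""

if __name__ == "__main__":
    print(find_duplicate_aliases(SAMPLE_DOMAIN_XML))
```

Running such a check against the domxml that vdsm passes to defineXML (if it can be recovered from the vdsm log) may help tell whether the duplicate is already present in what Engine sent or is introduced later.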
Verified on 4.2.2.4-0.1.el7 and vdsm-4.20.22-1.el7ev.x86_64 with
libvirt-client-3.9.0-14.el7.x86_64
libvirt-daemon-3.9.0-14.el7.x86_64
OVN and vmfex flows are fixed now.

No doc_text required.

This bugzilla is included in oVirt 4.2.2 release, published on March 28th 2018. Since the problem described in this bug report should be resolved in oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.
Created attachment 1404699 [details]
Logs

Description of problem:
[OVN] - ovn is broken on latest d/s build - can't start VM with ovn network.
Trying to start a VM with an ovn network on 4.2.2.2-0.1.el7 fails with the generic error:

2018-03-06 11:32:54,015+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-4) [] EVENT_ID: VM_DOWN_ERROR(119), VM V3 is down with error. Exit message: Cannot get interface MTU on 'ovn_test_custom': No such device.

2018-03-06 11:32:52,011+0200 ERROR (vm/5cdbe981) [virt.vm] (vmId='5cdbe981-039c-44cf-95cd-84081e5bd688') The vm start process failed (vm:940)
Traceback (most recent call last):
File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 869, in _startUnderlyingVm
self._run()
File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2832, in _run
dom.createWithFlags(flags)
File "/usr/lib/python2.7/site-packages/vdsm/common/libvirtconnection.py", line 130, in wrapper
ret = f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/vdsm/common/function.py", line 92, in wrapper
return func(inst, *args, **kwargs)
File "/usr/lib64/python2.7/site-packages/libvirt.py", line 1099, in createWithFlags
if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
libvirtError: Cannot get interface MTU on 'ovn_test_custom': No such device

Version-Release number of selected component (if applicable):
4.2.2.2-0.1.el7

How reproducible:
100%

Steps to Reproduce:
1. Try to start VM with ovn network on latest d/s build

Actual results:
Failed

Expected results:
Must work
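
The "final iface" XML quoted in the comments suggests what the OVN hook is expected to leave in the interface definition: a source bridge of br-int and an openvswitch virtualport carrying an interfaceid. A rough way to test whether a given interface XML has that shape is sketched below. This is an assumption-laden illustration, not vdsm or hook code: the function name is invented, HOOK_OUTPUT_IFACE is a trimmed copy of the XML from this report, and PRE_HOOK_IFACE_GUESS is only a guess at what an unmodified definition might look like, based on the 'ovn_test_custom' name in the error message.

```python
import xml.etree.ElementTree as ET


def looks_ovn_ready(iface_xml, integration_bridge="br-int"):
    """Return True if the interface XML has the shape seen in the hook's output."""
    iface = ET.fromstring(iface_xml)
    source = iface.find("source")
    vport = iface.find("virtualport")
    params = vport.find("parameters") if vport is not None else None
    return (
        source is not None
        and source.get("bridge") == integration_bridge
        and vport is not None
        and vport.get("type") == "openvswitch"
        and params is not None
        and bool(params.get("interfaceid"))
    )


# Trimmed from the "final iface" XML quoted in this bug report.
HOOK_OUTPUT_IFACE = (
    '<interface type="bridge">'
    '<mac address="00:00:00:00:00:20"/>'
    '<model type="virtio"/>'
    '<source bridge="br-int"/>'
    '<virtualport type="openvswitch">'
    '<parameters interfaceid="11572e42-47fb-4372-9b3d-7a86aa21081c"/>'
    '</virtualport>'
    '</interface>'
)

# Assumption: a definition that still points at the logical network name.
PRE_HOOK_IFACE_GUESS = (
    '<interface type="bridge">'
    '<mac address="00:00:00:00:00:20"/>'
    '<model type="virtio"/>'
    '<source bridge="ovn_test_custom"/>'
    '</interface>'
)

if __name__ == "__main__":
    print(looks_ovn_ready(HOOK_OUTPUT_IFACE))
    print(looks_ovn_ready(PRE_HOOK_IFACE_GUESS))
```

If the interface element in the domain XML that reaches libvirt fails such a check while the hook's own logged output passes it, that would be consistent with the hook's result being dropped, as the comments suspect.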