Bug 1294747

Summary: Migration fails when the SRIOV PF is not online
Product: [Community] RDO
Reporter: Michael Liu <ztehypervisor>
Component: openstack-nova
Assignee: nlevinki <nlevinki>
Status: CLOSED EOL
QA Contact: Prasanth Anbalagan <panbalag>
Severity: medium
Priority: unspecified
Version: unspecified
CC: berrange, chris.brown, dasmith, dyuan, eglynn, fjin, laine, libvirt-maint, mzhan, rbryant, sbauza, sgordon, srevivo, vromanso, yafu, zpeng, ztehypervisor
Flags: ztehypervisor: needinfo-
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2020-01-03 09:35:45 UTC
Type: Bug

Description Michael Liu 2015-12-30 05:54:59 UTC
Description of problem:


Version-Release number of selected component (if applicable):
libvirt 1.2.21

How reproducible:


Steps to Reproduce:
1. Start the virtual machine with SRIOV VF macvtap devices.
2. Ensure the SRIOV PF is not online.
3. Migrate the virtual machine.

Actual results:
Migration operation failed.

ERROR nova.virt.libvirt.driver [-] [instance: 3d830dcd-b73e-44ca-83bc-786461709a49] Live Migration failure: internal error: Unable to configure VF 62 of PF 'efr' because the PF is not online. Please change host network config to put the PF online.
ERROR nova.virt.libvirt.driver [-] [instance: 3d830dcd-b73e-44ca-83bc-786461709a49] Migration operation has aborted


Expected results:


Additional info:

Comment 1 yafu 2016-01-28 09:09:49 UTC
(In reply to Michael Liu from comment #0)

Hi, I tried to reproduce the issue and have a question about step 2. Was the offline SRIOV PF on the source host or the target host? If the offline SRIOV PF is on the target host, then the migration fails by design; see https://bugzilla.redhat.com/show_bug.cgi?id=893738 .
If the offline SRIOV PF is on the source host, I cannot reproduce the issue.

Would you please check that? Thanks.

Comment 2 Michael Liu 2016-04-27 04:51:19 UTC
(In reply to yafu from comment #1)
> Hi, I tried to reproduce the issue and have some doubt about step2. Was the
> offline SRIOV PF on the source host or target host? If the offline SRIOV PF
> is on the target host, then the migration failed as design referring to
> https://bugzilla.redhat.com/show_bug.cgi?id=893738 .
> And if the offline SRIOV PF is on the source host, I can not reproduce the
> issue. 
> 
> Would you please check that? Thanks.

Hi, the offline SRIOV PF is on the target host. Thanks.

Comment 3 Christopher Brown 2017-06-17 19:51:34 UTC
Hi,

Is this still a problem?

I don't think this should be assigned to RDO / openstack-nova as this appears to be a libvirt issue....

Comment 4 Laine Stump 2017-06-19 16:16:42 UTC
This is not a libvirt issue. In order for a VF to function properly, its PF must be online (otherwise all the setup is successful, but traffic doesn't pass), so libvirt verifies that this is the case before starting the guest. But libvirt *intentionally* will not automatically set an offline PF online just because a guest is using one of the VFs associated with that PF. The reason we won't do that is that the mere act of setting the PF online will by default (i.e. with *no* config on the host) enable IPv6, potentially resulting in the host being opened up to incoming connections from likely unwanted sources *all without the approval or knowledge of the host administrator*. In other words, automatically setting an offline PF to online is a security risk.

So I consider this a host config problem that is beyond libvirt's authority to change - if a host will have its VFs used by guests, then the administrator should modify the host's net config for the PF so that it is online (with IPv6 enabled or disabled as desired).

If openstack nova controls the host's network configuration, then this is a nova issue. If not, then it is a host configuration issue. I'm moving it back to openstack-nova, where it can either be resolved by enhancing nova's host network config or (if nova doesn't manage host network configuration) by closing it as NOTABUG.
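
For illustration only, the kind of "is the PF online" check described above could be sketched on the target host as follows. This is not libvirt's or nova's code. It assumes that "online" means the interface's IFF_UP flag is set, and it reads that flag from the kernel's /sys/class/net/<interface>/flags file. The function names pf_is_online and offline_pfs and the sysfs_root parameter are hypothetical and exist only for this example.

```python
from pathlib import Path

IFF_UP = 0x1  # "administratively up" bit, as defined in linux/if.h


def pf_is_online(pf_name: str, sysfs_root: str = "/sys/class/net") -> bool:
    """Return True if the interface's IFF_UP flag is set.

    Reads <sysfs_root>/<pf_name>/flags, which the kernel exposes as a
    hexadecimal string such as "0x1003". Hypothetical helper.
    """
    flags_path = Path(sysfs_root) / pf_name / "flags"
    flags = int(flags_path.read_text().strip(), 16)
    return bool(flags & IFF_UP)


def offline_pfs(pf_names, sysfs_root: str = "/sys/class/net"):
    """Return the PFs from pf_names that are not online (hypothetical helper)."""
    return [name for name in pf_names if not pf_is_online(name, sysfs_root)]
```

On a host whose PF is named 'efr', as in the error message in this bug, offline_pfs(["efr"]) would return ["efr"] while the PF is down. The administrator could then decide whether to bring it up (for example with `ip link set dev efr up`) after settling the IPv6 configuration, which keeps that decision with the host administrator as argued above.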

Comment 5 Christopher Brown 2017-06-19 20:10:54 UTC
Laine, many thanks for the fulsome explanation.

Michael, can you first confirm whether this is still an issue for you? This is an old bug that I'm in the process of triaging.