Bug 1586019 - [SR-IOV] - VF leakage when shutting down a VM from powering UP state
Summary: [SR-IOV] - VF leakage when shutting down a VM from powering UP state
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: Backend.Core
Version: ---
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.2.5
Target Release: 4.2.5
Assignee: Ales Musil
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-06-05 10:34 UTC by Michael Burman
Modified: 2018-07-31 15:27 UTC (History)

Fixed In Version: ovirt-engine-4.2.5
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-07-31 15:27:01 UTC
oVirt Team: Network
Embargoed:
rule-engine: ovirt-4.2+
rule-engine: ovirt-4.3+


Attachments
logs (927.63 KB, application/x-gzip)
2018-06-05 10:34 UTC, Michael Burman


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 92280 0 master MERGED core: Fix SR-IOV virtual functions leak 2021-01-28 07:20:19 UTC
oVirt gerrit 92547 0 ovirt-engine-4.2 MERGED core: Fix SR-IOV virtual functions leak 2021-01-28 07:19:36 UTC

Description Michael Burman 2018-06-05 10:34:33 UTC
Created attachment 1447811
logs

Description of problem:
[SR-IOV] - VF leakage when shutting down a VM from powering UP state 

A VM can be shut down or powered off before it reaches the UP state, i.e. while it is still in the powering UP state. If this is done on a VM with an SR-IOV vNIC, the VF leaks and the engine considers it in use until we reboot the host.

'Cannot edit host NIC VFs configuration. The selected network interface enp8s0f0 has VFs that are in use.'

The engine thinks that the VF is taken, although the VM is down and the VF seems to be free (it appears on the host and in the UI). But we can't use this VF or change the number of VFs on the associated PF.

Version-Release number of selected component (if applicable):
vdsm-4.20.29-1.el7ev.x86_64

How reproducible:
Seems to be 100%

Steps to Reproduce:
1. Enable 1 VF on an SR-IOV capable host
2. Start a VM with an SR-IOV vNIC
3. Shut down the VM while it's in the 'powering UP' state (before it's UP)
4. Try to change the number of VFs back to zero

Actual results:
Cannot edit host NIC VFs configuration. The selected network interface enp8s0f0 has VFs that are in use.

The VF has leaked. It wasn't released properly.

Expected results:
Should work. If we allow shutting down the VM from the powering UP state, then we should handle the release of the VF(s) in that case.

Additional info:
Discovered during an automation run, at the teardown stage (we didn't wait for the VM to be fully UP before shutting it down).

Comment 1 Michael Burman 2018-06-05 10:36:20 UTC
2018-06-04 23:27:43,464+03 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-12) [] VM '6c4bc71c-565d-442d-8ae6-99c563840109'(golden_env_mixed_virtio_1_0) moved from 'PoweringUp' --> 'Down'
2018-06-04 23:27:43,550+03 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-12) [] EVENT_ID: VM_DOWN(61), VM golden_env_mixed_virtio_1_0 is down.

2018-06-04 23:27:54,227+03 WARN  [org.ovirt.engine.core.bll.network.host.UpdateHostNicVfsConfigCommand] (default task-18) [host_nics_syncAction_d7212f9e-a9b3-4] Validation of action 'UpdateHostNicVfsConfig' failed for user admin@internal-authz. Reasons: VAR__ACTION__UPDATE,VAR__TYPE__HOST_NIC_VFS_CONFIG,ACTION_TYPE_FAILED_NUM_OF_VFS_CANNOT_BE_CHANGED,$nicName enp3s0f1
2018-06-04 23:27:54,228+03 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-18) [] Operation Failed: [Cannot edit host NIC VFs configuration. The selected network interface enp3s0f1 has VFs that are in use.]
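
When scanning a larger engine log for the same pattern, the VM state transitions can be pulled out with a small script. This is only a minimal sketch, assuming the VmAnalyzer line format shown above; the names `TRANSITION_RE` and `find_transitions` are hypothetical and not part of any oVirt tooling.

```python
import re

# Matches VmAnalyzer lines of the form shown in this comment:
#   VM '<uuid>'(<name>) moved from '<old>' --> '<new>'
TRANSITION_RE = re.compile(
    r"VM '(?P<vm_id>[0-9a-f-]+)'\((?P<name>[^)]*)\) "
    r"moved from '(?P<old>\w+)' --> '(?P<new>\w+)'"
)


def find_transitions(lines):
    """Yield (vm_name, old_state, new_state) for each matching log line."""
    for line in lines:
        match = TRANSITION_RE.search(line)
        if match:
            yield match.group("name"), match.group("old"), match.group("new")


# Sample input: the VmAnalyzer line quoted above.
sample = [
    "2018-06-04 23:27:43,464+03 INFO  "
    "[org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] "
    "(ForkJoinPool-1-worker-12) [] "
    "VM '6c4bc71c-565d-442d-8ae6-99c563840109'(golden_env_mixed_virtio_1_0) "
    "moved from 'PoweringUp' --> 'Down'",
]
transitions = list(find_transitions(sample))
```

Filtering the result for `old == "PoweringUp"` and `new == "Down"` gives the VMs that went through the transition involved in this bug.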

Comment 2 Dan Kenigsberg 2018-06-11 11:15:20 UTC
I suspect that this might be an Engine bug, not a Vdsm one.

Is this new in 4.2? (VM startup code has changed considerably in Engine)
Does refresh capabilities (or Engine restart) reset the leak?

Comment 3 Michael Burman 2018-06-11 12:02:23 UTC
(In reply to Dan Kenigsberg from comment #2)
> I suspect that this might be an Engine bug, not a Vdsm one.
> 
> Is this new in 4.2? (VM startup code has changed considerably in Engine)
> Does refresh capabilities (or Engine restart) reset the leak?

I don't know if it's new. It's an edge case, I guess, but we do allow it, so it's a problem, and it's 100% reproducible.

Refresh caps doesn't reset the leak, engine restart does.
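
That an engine restart clears the leak while refresh caps does not is consistent with the "in use" marker being held in engine-side state rather than on the host. The toy model below illustrates that kind of bookkeeping bug. It is not the ovirt-engine code: the class `VfTracker`, its methods, and the `release_from` parameter are all hypothetical, and the actual root cause and fix are in the gerrit changes listed under Links.

```python
class VfTracker:
    """Hypothetical in-memory record of which VFs are believed to be in use."""

    def __init__(self):
        self.in_use = {}  # VF name -> name of the VM holding it

    def on_vm_start(self, vm, vf):
        # The VF is marked as taken as soon as the VM starts powering up.
        self.in_use[vf] = vm

    def on_vm_down(self, vm, previous_state, release_from=("Up",)):
        # Modelled bug: VFs are released only when the VM went down from
        # one of the states in release_from.
        if previous_state in release_from:
            self.in_use = {
                vf: owner for vf, owner in self.in_use.items() if owner != vm
            }

    def can_change_num_vfs(self):
        # Mirrors the validation: no change while any VF is marked in use.
        return not self.in_use


# Leaky variant: going down from PoweringUp does not release the VF.
leaky = VfTracker()
leaky.on_vm_start("vm1", "vf0")
leaky.on_vm_down("vm1", previous_state="PoweringUp")
leaky_ok = leaky.can_change_num_vfs()

# Expected behaviour: PoweringUp --> Down also releases the VF.
fixed = VfTracker()
fixed.on_vm_start("vm1", "vf0")
fixed.on_vm_down("vm1", "PoweringUp", release_from=("Up", "PoweringUp"))
fixed_ok = fixed.can_change_num_vfs()
```

In this model, creating a new `VfTracker` plays the role of an engine restart: the stale entry disappears because the state is rebuilt from scratch.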

Comment 4 Michael Burman 2018-07-03 14:09:27 UTC
Verified on - 4.2.5-0.1.el7ev with vdsm-4.20.33-1.el7ev.x86_64

Comment 5 Sandro Bonazzola 2018-07-31 15:27:01 UTC
This bugzilla is included in oVirt 4.2.5 release, published on July 30th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.5 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

