Bug 1563165

Summary: [SR-IOV] - vdsm no longer persisting and restoring the number of VFs after reboot
Product: vdsm (oVirt)
Component: Core
Version: 4.20.19
Hardware: x86_64
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: urgent
Keywords: Regression
Reporter: Michael Burman <mburman>
Assignee: Edward Haas <edwardh>
QA Contact: Michael Burman <mburman>
CC: bugs, danken, lveyde, mburman, ylavi
Target Milestone: ovirt-4.2.3
Target Release: ---
Fixed In Version: vdsm v4.20.27.1
Flags: rule-engine: ovirt-4.2+, rule-engine: blocker+
oVirt Team: Network
Type: Bug
Last Closed: 2018-05-10 06:28:56 UTC
Doc Type: If docs needed, set a value
Attachments:
  vdsm logs (flags: none)
  failedQA vdsm logs (flags: none)
  upgrade log (flags: none)
  upgrade log, ignore the first one (flags: none)

Description Michael Burman 2018-04-03 10:13:20 UTC
Created attachment 1416693 [details]
vdsm logs

Description of problem:
[SR-IOV] - vdsm no longer persisting and restoring the number of VFs after reboot.

vdsm should persist the number of enabled VFs on a PF and restore it after a host reboot, so the VF count survives reboots.
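
For context, the number of VFs on a PF is controlled through the kernel's sriov_numvfs sysfs attribute; that value is what vdsm is expected to persist and re-apply on boot. A minimal sketch of the underlying kernel interface (the PCI address 0000:05:00.0 is taken from the logs below; the prompt is illustrative):

[root@host ~]# cat /sys/bus/pci/devices/0000:05:00.0/sriov_totalvfs    # maximum VFs the PF supports
[root@host ~]# echo 0 > /sys/bus/pci/devices/0000:05:00.0/sriov_numvfs # the kernel requires 0 before changing to another non-zero value
[root@host ~]# echo 2 > /sys/bus/pci/devices/0000:05:00.0/sriov_numvfs # enable 2 VFs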

This is a regression from BZ 1301349

restore-net::INFO::2018-04-03 12:31:59,996::restore_net_config::94::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:05:00.1.
restore-net::INFO::2018-04-03 12:31:59,996::restore_net_config::94::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:05:00.0.

Version-Release number of selected component (if applicable):
vdsm-4.20.23-1.el7ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Enable a few VFs on an SR-IOV-capable server via the Setup Networks dialog
2. Reboot the server
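
One way to observe the regression from the shell (same PCI address as in the logs; prompts illustrative):

[root@host ~]# cat /sys/bus/pci/devices/0000:05:00.0/sriov_numvfs   # before reboot: matches the value set in the dialog
[root@host ~]# reboot
[root@host ~]# cat /sys/bus/pci/devices/0000:05:00.0/sriov_numvfs   # after reboot: 0, i.e. the VFs were not restored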

Actual results:
restore-net::INFO::2018-04-03 12:31:59,996::restore_net_config::94::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:05:00.1.
restore-net::INFO::2018-04-03 12:31:59,996::restore_net_config::94::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:05:00.0.
The VFs are gone after the reboot.

Expected results:
The number of enabled VFs is persisted and restored after the reboot.

Additional info:
Regression from BZ 1301349

Comment 1 Michael Burman 2018-04-03 10:14:54 UTC
This is the state before the reboot - 

[root@puma22 ~]# tree /var/lib/vdsm/persistence/netconf/virtual_functions/
/var/lib/vdsm/persistence/netconf/virtual_functions/
└── 0000:05:00.0

0 directories, 1 file
[root@puma22 ~]# tree /var/lib/vdsm/staging/netconf/virtual_functions/
/var/lib/vdsm/staging/netconf/virtual_functions/
└── 0000:05:00.0

0 directories, 1 file

Comment 2 Red Hat Bugzilla Rules Engine 2018-04-04 07:34:50 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 3 Michael Burman 2018-04-15 12:45:48 UTC
Note to self - make sure to enable VFs before the vdsm update, to test the upgrade scenario for this bug.

Comment 4 Michael Burman 2018-04-22 12:34:22 UTC
Edy, 

Although we did a pre-integration test for this bug, it is FailedQA ->

The scenario was - 
1) Enable 2 VFs on vdsm-4.20.25-1.el7ev.x86_64
2) Update to vdsm-4.20.26-1.el7ev.x86_64 - the VFs are still present on the host
3) Reboot the host - the VFs are gone from the host

Comment 5 Michael Burman 2018-04-22 12:35:51 UTC
Created attachment 1425324 [details]
failedQA vdsm logs

Comment 6 Michael Burman 2018-04-22 12:49:07 UTC
Edy, I think the problem here is in the upgrade: after the upgrade, /var/lib/vdsm/persistence/netconf/devices/ was empty although the host had 2 VFs enabled.

And that's why they were not restored on boot.

Comment 7 Dan Kenigsberg 2018-04-24 09:03:11 UTC
Michael, don't you have any upgrade.log on your machine?
Can you reproduce the non-persistence if you start with an old-style numvfs persistence and run

vdsm-tool --vvverbose --append --logfile=/tmp/upgrade.log upgrade-networks
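
For reference, the persisted state before and after running the upgrade tool can be compared with the following (directory names taken from the tree output pasted in this bug; the layout may differ between vdsm versions):

[root@host ~]# tree /var/lib/vdsm/persistence/netconf/virtual_functions/   # old-style numvfs persistence
[root@host ~]# tree /var/lib/vdsm/persistence/netconf/devices/             # new-style persistence, expected to be populated after the upgrade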

Comment 8 Michael Burman 2018-04-24 12:19:03 UTC
Yes

Comment 9 Michael Burman 2018-04-24 12:21:02 UTC
Created attachment 1425983 [details]
upgrade log

Comment 10 Michael Burman 2018-04-24 12:22:44 UTC
Created attachment 1425984 [details]
upgrade log, ignore the first one

Comment 11 Michael Burman 2018-04-24 12:25:01 UTC
This is after the update - 

[root@puma22 ~]# tree /var/lib/vdsm/persistence/netconf/devices/
/var/lib/vdsm/persistence/netconf/devices/

0 directories, 0 files

I did have 2 VFs enabled on the host prior to the upgrade, though - 
41: enp5s16f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 02:00:00:00:00:01 brd ff:ff:ff:ff:ff:ff
42: enp5s16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 02:00:00:00:00:01 brd ff:ff:ff:ff:ff:ff
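
For reference, the VFs belonging to a PF can also be enumerated through sysfs. enp5s16 and enp5s16f1 above are the VF netdevs; the PF interface name below is a hypothetical example:

[root@host ~]# ls -l /sys/class/net/enp5s0f0/device/virtfn*    # one virtfnN symlink per enabled VF
[root@host ~]# cat /sys/class/net/enp5s0f0/device/sriov_numvfs # current VF count on the PF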

Comment 12 Michael Burman 2018-05-03 16:21:20 UTC
Verified on - vdsm-4.20.27.1-1.el7ev.x86_64

Comment 13 Sandro Bonazzola 2018-05-10 06:28:56 UTC
This bugzilla is included in the oVirt 4.2.3 release, published on May 4th 2018.

Since the problem described in this bug report should be resolved in the oVirt 4.2.3 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.