Bug 1563165 - [SR-IOV] - vdsm no longer persisting and restoring the number of VFs after reboot
Summary: [SR-IOV] - vdsm no longer persisting and restoring the number of VFs after re...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.20.19
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: high
Target Milestone: ovirt-4.2.3
Assignee: Edward Haas
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-04-03 10:13 UTC by Michael Burman
Modified: 2018-05-10 06:28 UTC
CC: 5 users

Fixed In Version: vdsm v4.20.27.1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-05-10 06:28:56 UTC
oVirt Team: Network
Embargoed:
rule-engine: ovirt-4.2+
rule-engine: blocker+


Attachments (Terms of Use)
vdsm logs (517.71 KB, application/x-gzip)
2018-04-03 10:13 UTC, Michael Burman
no flags Details
failedQA vdsm logs (540.00 KB, application/x-gzip)
2018-04-22 12:35 UTC, Michael Burman
no flags Details
upgrade log (2.32 KB, text/plain)
2018-04-24 12:21 UTC, Michael Burman
no flags Details
upgrade log, ignore the first one (4.77 KB, text/plain)
2018-04-24 12:22 UTC, Michael Burman
no flags Details


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 90219 0 ovirt-4.2 MERGED net tests: Complement the persistence tests with bonds 2018-04-17 07:59:01 UTC
oVirt gerrit 90220 0 ovirt-4.2 MERGED net: Refactor SRIOV setup flow and its persistency 2018-04-17 07:59:04 UTC
oVirt gerrit 90732 0 master MERGED net: Fix sriov configuration upgrade 2018-04-29 11:59:41 UTC
oVirt gerrit 90733 0 ovirt-4.2 MERGED net: Fix sriov configuration upgrade 2018-04-29 12:00:53 UTC
oVirt gerrit 90747 0 ovirt-4.2.3 MERGED net: Fix sriov configuration upgrade 2018-04-30 14:30:24 UTC

Description Michael Burman 2018-04-03 10:13:20 UTC
Created attachment 1416693 [details]
vdsm logs

Description of problem:
[SR-IOV] - vdsm no longer persists and restores the number of VFs after reboot.

vdsm should persist the number of enabled VFs on a PF and restore it across host reboots, so the VF count survives a reboot.

This is a regression from BZ 1301349

restore-net::INFO::2018-04-03 12:31:59,996::restore_net_config::94::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:05:00.1.
restore-net::INFO::2018-04-03 12:31:59,996::restore_net_config::94::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:05:00.0.
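For context, the restore step that emits the log lines above can be sketched roughly as follows. This is a hypothetical illustration, not vdsm's actual code: the persistence path is taken from the listings in this bug, but the helper name and the assumption that the persisted file contains just the VF count are mine.

```python
import os

# Path taken from the persistence tree shown in comment 1 of this bug;
# the helper below is an illustrative sketch, not vdsm's actual API.
PERSISTENCE_DIR = '/var/lib/vdsm/persistence/netconf/virtual_functions'


def restore_sriov_numvfs(pci_addr):
    """Restore the persisted number of VFs for one PF, if a value was saved."""
    persisted = os.path.join(PERSISTENCE_DIR, pci_addr)
    if not os.path.exists(persisted):
        # This is the case the restore-net log lines above report:
        # no persisted VF count exists for this device, so nothing is restored.
        print('SRIOV network device which is not persisted '
              'found at: %s.' % pci_addr)
        return
    with open(persisted) as f:
        numvfs = f.read().strip()
    # sriov_numvfs is the standard kernel sysfs attribute for setting
    # the number of VFs on an SR-IOV capable PF.
    sysfs = '/sys/bus/pci/devices/%s/sriov_numvfs' % pci_addr
    with open(sysfs, 'w') as f:
        f.write(numvfs)
```

The bug is that after reboot the persisted file is missing (or, per comment 6, lost during upgrade), so the restore falls into the "not persisted" branch.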

Version-Release number of selected component (if applicable):
vdsm-4.20.23-1.el7ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Enable a few VFs on an SR-IOV-capable server via the Setup Networks dialog
2. Reboot the server

Actual results:
restore-net::INFO::2018-04-03 12:31:59,996::restore_net_config::94::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:05:00.1.
restore-net::INFO::2018-04-03 12:31:59,996::restore_net_config::94::root::(_restore_sriov_numvfs) SRIOV network device which is not persisted found at: 0000:05:00.0.

Expected results:
The number of VFs is persisted and restored after reboot. Instead, the VFs are gone after reboot.

Additional info:
Regression from BZ 1301349

Comment 1 Michael Burman 2018-04-03 10:14:54 UTC
This is the state before the reboot - 

[root@puma22 ~]# tree /var/lib/vdsm/persistence/netconf/virtual_functions/
/var/lib/vdsm/persistence/netconf/virtual_functions/
└── 0000:05:00.0

0 directories, 1 file
[root@puma22 ~]# tree /var/lib/vdsm/staging/netconf/virtual_functions/
/var/lib/vdsm/staging/netconf/virtual_functions/
└── 0000:05:00.0

0 directories, 1 file
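The persisted file shown above stores the enabled VF count for the PF. A quick hypothetical check comparing that value against the kernel's live sysfs value might look like this; the default paths come from the listings in this bug, while the function name, signature, and the plain-integer file format are assumptions:

```python
import os


def numvfs_matches_persisted(
        pci_addr,
        persistence_dir='/var/lib/vdsm/persistence/netconf/virtual_functions',
        sysfs_root='/sys/bus/pci/devices'):
    """Return True if the persisted VF count equals the current kernel value.

    Illustrative helper only, not vdsm API. Assumes the persisted file
    holds the VF count as a plain integer.
    """
    with open(os.path.join(persistence_dir, pci_addr)) as f:
        persisted = int(f.read().strip())
    # sriov_numvfs is the kernel's sysfs attribute for the current VF count.
    with open(os.path.join(sysfs_root, pci_addr, 'sriov_numvfs')) as f:
        current = int(f.read().strip())
    return persisted == current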

Comment 2 Red Hat Bugzilla Rules Engine 2018-04-04 07:34:50 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve it ASAP.

Comment 3 Michael Burman 2018-04-15 12:45:48 UTC
Note to self - make sure to enable VFs before the vdsm update, to test the upgrade scenario for this bug.

Comment 4 Michael Burman 2018-04-22 12:34:22 UTC
Edy, 

Although we did a pre-integration test for this bug, it failed QA ->

The scenario was - 
1) Enable 2 VFs on vdsm-4.20.25-1.el7ev.x86_64
2) Update to vdsm-4.20.26-1.el7ev.x86_64 - the VFs are still present on the host
3) Reboot the host - the VFs are gone from the host

Comment 5 Michael Burman 2018-04-22 12:35:51 UTC
Created attachment 1425324 [details]
failedQA vdsm logs

Comment 6 Michael Burman 2018-04-22 12:49:07 UTC
Edy, I think the problem here is in the upgrade: after the upgrade, /var/lib/vdsm/persistence/netconf/devices/ was empty although the host had 2 VFs enabled.

That is why they weren't restored on boot.
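The upgrade step that comment 6 suspects can be sketched as a migration from the old per-PF numvfs files to the new devices store. This is a hypothetical illustration of the kind of migration the later "Fix sriov configuration upgrade" patches address: the directory paths come from this bug's comments, but the on-disk format of the new store and the function name are assumptions, not vdsm source.

```python
import os

# Paths taken from this bug's comments; everything else is illustrative.
OLD_DIR = '/var/lib/vdsm/persistence/netconf/virtual_functions'
NEW_DIR = '/var/lib/vdsm/persistence/netconf/devices'


def upgrade_sriov_persistence(old_dir=OLD_DIR, new_dir=NEW_DIR):
    """Carry old-style per-PF numvfs files over to the new devices store.

    Returns the list of migrated PCI addresses. The on-disk format of the
    new store is assumed here (one file per device holding the VF count).
    """
    if not os.path.isdir(old_dir):
        # Nothing to migrate; an upgrade bug that hits this branch (or
        # fails before writing) would leave the new store empty, as seen
        # in comment 11.
        return []
    if not os.path.isdir(new_dir):
        os.makedirs(new_dir)
    migrated = []
    for pci_addr in sorted(os.listdir(old_dir)):
        with open(os.path.join(old_dir, pci_addr)) as f:
            numvfs = f.read().strip()
        with open(os.path.join(new_dir, pci_addr), 'w') as f:
            f.write(numvfs)
        migrated.append(pci_addr)
    return migrated
```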

Comment 7 Dan Kenigsberg 2018-04-24 09:03:11 UTC
Michael, don't you have any upgrade.log on your machine?
Can you reproduce the non-persistence if you start with an old-style numvfs persistence and run

vdsm-tool --vvverbose --append --logfile=/tmp/upgrade.log upgrade-networks

Comment 8 Michael Burman 2018-04-24 12:19:03 UTC
Yes

Comment 9 Michael Burman 2018-04-24 12:21:02 UTC
Created attachment 1425983 [details]
upgrade log

Comment 10 Michael Burman 2018-04-24 12:22:44 UTC
Created attachment 1425984 [details]
upgrade log, ignore the first one

Comment 11 Michael Burman 2018-04-24 12:25:01 UTC
This is after update

[root@puma22 ~]# tree /var/lib/vdsm/persistence/netconf/devices/
/var/lib/vdsm/persistence/netconf/devices/

0 directories, 0 files

I did have 2 VFs enabled on the host prior to the upgrade - 
41: enp5s16f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 02:00:00:00:00:01 brd ff:ff:ff:ff:ff:ff
42: enp5s16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP mode DEFAULT group default qlen 1000
    link/ether 02:00:00:00:00:01 brd ff:ff:ff:ff:ff:ff

Comment 12 Michael Burman 2018-05-03 16:21:20 UTC
Verified on - vdsm-4.20.27.1-1.el7ev.x86_64

Comment 13 Sandro Bonazzola 2018-05-10 06:28:56 UTC
This bugzilla is included in oVirt 4.2.3 release, published on May 4th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.3 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

