1188251 – After failure to setupNetworks: restore-nets with unified persistence does not restore pre-vdsm ifcfg

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Virtualization Manager
Classification:	Red Hat
Component:	vdsm
Sub Component:
Version:	3.5.0
Hardware:	x86_64
OS:	Linux
Priority:	urgent
Severity:	urgent
Target Milestone:	ovirt-3.6.0-rc
Target Release:	3.6.0
Assignee:	Petr Horáček
QA Contact:	Meni Yakove
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1197441 1197668

Reported:	2015-02-02 12:29 UTC by Petr Horáček
Modified:	2019-07-11 08:36 UTC (History)
CC List:	35 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	Previously, VDSM did not consume pre-defined ifcfg interfaces and did not consider them as belonging to it. When a setupNetworks command was issued and failed, VDSM failed to restore the original ifcfg file. Note that this bug only occurs when failing to set up a network on top of a pre-existing ifcfg file. Now, VDSM stores pre-defined ifcfg files before they are modified, even with unified persistence.
Clone Of:
Clones:	1197668 (view as bug list)
Environment:
Last Closed:	2016-03-09 19:30:31 UTC
oVirt Team:	Network
Target Upstream Version:
Embargoed:
Flags:	ylavi: Triaged+

Attachments
vdsm.log from my RHEV 3.5 upgrade last night (16.06 MB, text/plain) 2015-02-17 22:11 UTC, Greg Scott	no flags	Details
vdsm.log archive dated midnight. This should have the eth2 issues (786.74 KB, application/x-xz) 2015-02-17 22:45 UTC, Greg Scott	no flags	Details
supervdsm.log in case it's useful. (10.55 MB, text/plain) 2015-02-17 22:50 UTC, Greg Scott	no flags	Details
connectivity.log with some info from last night. (13.21 KB, text/plain) 2015-02-17 22:52 UTC, Greg Scott	no flags	Details
mom.log from last night (1.84 MB, text/plain) 2015-02-17 22:53 UTC, Greg Scott	no flags	Details
and upgrade.log showing the 3.5 upgrade I hope. (3.81 KB, text/plain) 2015-02-17 22:55 UTC, Greg Scott	no flags	Details
The whole /var/lib/vdsm tree (898 bytes, application/x-gzip) 2015-02-17 23:09 UTC, Greg Scott	no flags	Details
vdsClient -s 0 getVdsCaps on host rheva (11.21 KB, text/plain) 2015-02-18 12:23 UTC, Greg Scott	no flags	Details
vdsClient -s 0 getVdsCaps on host rhevb (11.18 KB, text/plain) 2015-02-18 12:24 UTC, Greg Scott	no flags	Details

Links
System	ID	Priority	Status	Summary	Last Updated
Red Hat Knowledge Base (Solution)	1346443	None	None	None	Never
Red Hat Product Errata	RHBA-2016:0362	normal	SHIPPED_LIVE	vdsm 3.6.0 bug fix and enhancement update	2016-03-09 23:49:32 UTC
oVirt gerrit	37453	master	MERGED	network: store non-Vdsm ifcfgs with unified persistence	Never
oVirt gerrit	38240	ovirt-3.5	MERGED	network: store non-Vdsm ifcfgs with unified persistence	Never

Description Petr Horáček 2015-02-02 12:29:36 UTC

Description of problem:
If we change non-VDSM network devices with VDSM and then call
`vdsm-tool restore-nets`, the non-VDSM devices are not restored.

Unified restoration first removes everything in the running config
and then tries to regenerate everything that is in the persisted config.
The problem is that the original devices are never persisted.
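
The two-step restoration described above can be illustrated with a toy model. This is only a sketch, not VDSM code: the function name `restore_nets`, the dictionaries, and the ifcfg contents are all made up for illustration, and it assumes that nothing has been persisted yet when the restore runs.

```python
def restore_nets(ifcfg_files, running_config, persisted_config):
    """Toy model of unified restoration (hypothetical, not VDSM's code).

    ifcfg_files:      device name -> ifcfg content currently on disk
    running_config:   device names recorded in the running config
    persisted_config: device name -> content recorded in the persisted config
    """
    files = dict(ifcfg_files)
    # Step 1: remove everything that is in the running config.
    for device in running_config:
        files.pop(device, None)
    # Step 2: regenerate everything that is in the persisted config.
    files.update(persisted_config)
    return files


# eth0 existed before VDSM touched it (illustrative content).
original_eth0 = "DEVICE=eth0\nONBOOT=yes\nBOOTPROTO=dhcp\n"

# State on disk after a setupNetworks call like the one in the steps to reproduce.
on_disk = {
    "eth0": "DEVICE=eth0\nMASTER=bond0\nSLAVE=yes\n",
    "bond0": "DEVICE=bond0\nBRIDGE=ovirtmgmt\n",
    "ovirtmgmt": "DEVICE=ovirtmgmt\nTYPE=Bridge\n",
}
running = ["eth0", "bond0", "ovirtmgmt"]
persisted = {}  # assumption: the original eth0 was never recorded anywhere

after_restore = restore_nets(on_disk, running, persisted)
print(sorted(after_restore))  # prints []
```

In this model eth0 disappears because step 1 deletes it and step 2 has nothing to rebuild it from. Passing `{"eth0": original_eth0}` as the persisted config brings the original file back.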


Version-Release number of selected component (if applicable):
Tested on VDSM 4.16.


How reproducible:
Always


Steps to Reproduce:
1. vdsClient -s 0 setupNetworks "networks={ovirtmgmt:{bonding:bond0,bridged:true,ipaddr:${IP},netmask:${NETMASK},gateway:${GATEWAY}}}" "bondings={bond0:{nics:${ETH_NIC_NAME}}}"
2. vdsm-tool restore-nets

Actual results:
VDSM devices are removed, ifcfg-eth0 is removed, but not restored.


Expected results:
We want ifcfg-eth0 in the original state.
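
One way to get ifcfg-eth0 back in its original state is to copy a pre-existing file aside before it is modified for the first time and to copy it back during restoration. The following is a minimal sketch of that idea only. It is not the merged patch: the function names, the backup directory layout, and the `OWNED_MARKER` string are hypothetical.

```python
import os
import shutil

# Hypothetical marker for files written by the management tool itself.
# How the real tool recognises its own files is not shown in this report.
OWNED_MARKER = "# written by the management tool (hypothetical marker)"


def backup_before_modify(path, backup_dir):
    """Save a copy of `path` once, if it exists and is not tool-owned.

    Returns True when a backup was made.
    """
    if not os.path.exists(path):
        return False
    with open(path) as f:
        if f.readline().startswith(OWNED_MARKER):
            return False  # the tool wrote this file, nothing to preserve
    os.makedirs(backup_dir, exist_ok=True)
    target = os.path.join(backup_dir, os.path.basename(path))
    if os.path.exists(target):
        return False  # keep the oldest copy: it is the pre-tool state
    shutil.copy2(path, target)
    return True


def restore_backups(backup_dir, conf_dir):
    """Copy every saved file back into `conf_dir`; return the restored names."""
    if not os.path.isdir(backup_dir):
        return []
    restored = []
    for name in sorted(os.listdir(backup_dir)):
        shutil.copy2(os.path.join(backup_dir, name),
                     os.path.join(conf_dir, name))
        restored.append(name)
    return restored
```

Calling `backup_before_modify` immediately before each write, and `restore_backups` after the tool's own devices have been removed, would leave the original file in place after a failed setup.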

Comment 1 Dan Kenigsberg 2015-02-05 14:27:40 UTC

As a workaround for pre-defined bonding ifcfg files, look at https://bugzilla.redhat.com/show_bug.cgi?id=1154399#c3

Comment 2 Lior Vernia 2015-02-09 13:48:06 UTC

In order to not make this more difficult for customers and GSS than it has to be, let's skip the linking to another bug and be explicit here.

To my understanding, it's best to ask vdsm to own the existing bonds prior to upgrading to 3.5. This can be done by running:

"persist /etc/sysconfig/network-scripts/ifcfg-<bond-name>"

Dan, please corroborate, and also does one need to specify a path for the persist script?

If this was not done prior to upgrade, then bonds will have to be re-constructed post upgrade. If management still has connectivity to the host, that can be performed via the GUI; otherwise (say if the management network was on such a bond) physical access to the host is needed, where the bonds can be created using the vdsm command line:

"vdsClient -s 0 setupNetworks bondings='{<bond-name>:{nics:<nic1-name>+<nic2-name>+...}}'"

Dan, what would be the following command to also configure the management network on the newly-created bond?
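
The nested braces in the `bondings=` argument are easy to get wrong when typed by hand on a console. Below is a small sketch that builds the argument for a single bond, using only the syntax that appears in the commands in this report. The helper name `bonding_arg` is made up for illustration.

```python
def bonding_arg(bond_name, nics):
    """Build a 'bondings={<bond>:{nics:<nic1>+<nic2>}}' argument for one bond."""
    if not nics:
        raise ValueError("a bond needs at least one NIC")
    return "bondings={%s:{nics:%s}}" % (bond_name, "+".join(nics))


# Argument list for the command; it is only built here, not executed.
argv = ["vdsClient", "-s", "0", "setupNetworks",
        bonding_arg("bond0", ["eth0", "eth1"])]
print(" ".join(argv))
```

Building the command as a list keeps the braces as one argument, so no shell quoting is involved if the list is later handed to something like `subprocess.run`.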

Comment 3 Greg Scott 2015-02-17 11:20:39 UTC

This bug may not be specific to bond interfaces.  I just ran into the same thing last night with a plain, ordinary, old-fashioned eth2 interface.   See

https://access.redhat.com/discussions/1350873

for details.  After an upgrade to 3.5 from 3.4, on each host, my eth0 interface connected to the rhevm network works as expected.  But no automation I can find will make interface eth2 come alive.  Interface eth2 is connected to a home-made network named storage.  I have to ifup eth2 by hand on each host to make it live.

I tried setting up eth2 again in the RHEVM GUI - no luck.  

I noticed ifcfg-eth2 on the hosts has "ONBOOT=no".  I tried editing it, then "persist ifcfg-eth2", and then a reboot.  Still no luck.  And ifcfg-eth2 went back to the way it was, even though I persisted it.

- Greg Scott

Comment 4 Krzysztof Mazurek 2015-02-17 12:49:19 UTC

I was experimenting with this problem and ended up in a situation where I had 2 bond interfaces with 8 VLANs attached to them, and RHEV bridges attached to those, all running (typed in) and persisted.
On every reboot, all of those interfaces were changed to "ONBOOT=no", no matter whether my changes were persisted or saved via the RHEV-M network configuration GUI.

I saw those interfaces starting up during boot, but after login I had no network connectivity and all of them had been reconfigured as described above.

Comment 5 Dan Kenigsberg 2015-02-17 17:32:12 UTC

Vdsm intentionally sets ONBOOT=no in order to disable ifcfg-based network persistence; vdsm is expected to start the interface a bit later, as vdsmd starts. Does it happen, Greg?

If not, could you supply your /var/lib/vdsm, and /var/log/vdsm/*log during the service startup process?

Either way, your report does not seem to be much related to this bug: unlike this bug, you do not refer to a pre-vdsm ifcfg file, do you?
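
When comparing hosts, it can help to read the ONBOOT setting out of an ifcfg file programmatically rather than by eye. The sketch below parses simple KEY=value lines only. It does not handle every shell quoting rule that an ifcfg file may use, and the function names and sample content are illustrative.

```python
def parse_ifcfg(text):
    """Parse simple KEY=value lines from ifcfg-style text into a dict."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip("'\"")
    return settings


def onboot_value(text):
    """Return the ONBOOT value as written, or None when it is not set."""
    return parse_ifcfg(text).get("ONBOOT")


sample = '# example file\nDEVICE=eth2\nONBOOT=no\nBOOTPROTO="none"\n'
print(onboot_value(sample))  # prints no
```

A value of `no` here is what this comment describes as intentional, so on its own it does not indicate a fault.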

Comment 6 Greg Scott 2015-02-17 21:56:18 UTC

In my case, interface eth2 does not start up until I do "ifup eth2" by hand. Both hosts in my RHEV 3.5 environment act this way.  I have a 16 MB monster vdsm.log and I'm looking for a place here in bugzilla to upload it.

- Greg

Comment 7 Greg Scott 2015-02-17 22:11:01 UTC

Created attachment 992928 [details]
vdsm.log from my RHEV 3.5 upgrade last night

Comment 8 Greg Scott 2015-02-17 22:35:04 UTC

Aw nuts.  Forget about that vdsm.log I just uploaded.  It rolled over and doesn't have anything from last night.  Hang on while I go find the right one.

- Greg

Comment 9 Greg Scott 2015-02-17 22:45:52 UTC

Created attachment 992933 [details]
vdsm.log archive dated midnight.  This should have the eth2 issues

GMT is 5 hours later than Central time.  Or 6 hours this time of year?  Look for entries around 9 PM Central time last night.

Comment 10 Greg Scott 2015-02-17 22:50:50 UTC

Created attachment 992934 [details]
supervdsm.log in case it's useful.

And I obsoleted the original vdsm.log.

Comment 11 Greg Scott 2015-02-17 22:52:21 UTC

Created attachment 992935 [details]
connectivity.log with some info from last night.

Comment 12 Greg Scott 2015-02-17 22:53:56 UTC

Created attachment 992936 [details]
mom.log from last night

Comment 13 Greg Scott 2015-02-17 22:55:43 UTC

Created attachment 992937 [details]
and upgrade.log showing the 3.5 upgrade I hope.

Comment 14 Greg Scott 2015-02-17 23:09:50 UTC

Created attachment 992939 [details]
The whole /var/lib/vdsm tree

tar -cvzf /tmp/vdsm.tgz vdsm

Comment 15 Dan Kenigsberg 2015-02-18 10:11:57 UTC

Greg, this is the wrong place to discuss your bug, as it has nothing to do with ifcfg files that have been created outside vdsm. Please open a new bug about your issue. In particular, please include the output of `vdsClient -s 0 getVdsCaps` after reboot, as supervdsm.log reports that eth2 has been ifup'ed when vdsmd started.

Comment 16 Greg Scott 2015-02-18 10:20:18 UTC

Why do you assume I created an ifcfg file outside VDSM?  

Until things broke and my RHEV-H host was dead in the water, I did everything via the RHEV-M GUI.  After....***AFTER***  I found host rhevb dead, I dug into the problem and found interface eth2 had no IP Address.  Then I looked at ifcfg-eth2 and noticed it said ONBOOT=no. I changed it by hand to ONBOOT=yes and persisted it.  Then I rebooted host rhevb and found it went back to the way it was.  Then I looked at host rheva - which was not yet updated to rhevh-6.6 - and found its ifcfg-eth2 was also set to ONBOOT=no.

At any rate, I did not mess with any config files by hand until ***AFTER***  running into this problem.  

If supervdsm.log says eth2 was ifup'ed when vdsm started then either (1) supervdsm.log is not telling the truth, or (2) something else during boot ifdown'ed it.  If I boot either of my hosts, the only way I will be able to get it back alive inside RHEV-M is to ifup eth2 by hand.

- Greg

Comment 17 Greg Scott 2015-02-18 12:23:21 UTC

Created attachment 993061 [details]
vdsClient -s 0 getVdsCaps on host rheva

Comment 18 Greg Scott 2015-02-18 12:24:05 UTC

Created attachment 993063 [details]
vdsClient -s 0 getVdsCaps on host rhevb

Comment 19 Eyal Edri 2015-02-25 08:40:05 UTC

3.5.1 is already full with bugs (over 80), and since none of these bugs were added as urgent for 3.5.1 release in the tracker bug, moving to 3.5.2

Comment 20 Dan Kenigsberg 2015-02-25 10:01:19 UTC

(In reply to Greg Scott from comment #16)
> Why do you assume I created an ifcfg file outside VDSM?  

Greg, I don't. But this bug was explicitly opened to address the problem of vdsm removing manually-written ifcfg files. You are using this bug to discuss another problem, which confused me and is certain to confuse future readers of this bug. Hence I asked you to discuss your issue elsewhere.

Comment 21 Greg Scott 2015-02-25 11:40:55 UTC

OK, now I get it.  Sorry for the confusion.  I found this bug report through some community posts and it looked like the same problem.  Do you still want me to open another bugzilla?   

- Greg

Comment 29 Dan Kenigsberg 2015-03-02 09:37:06 UTC

The backported patch is ready https://gerrit.ovirt.org/#/c/38240/; I'm waiting for qa_ack to clone.

Comment 32 Marina Kalinin 2015-03-17 20:44:58 UTC

Dan, Petr, please help.

1.
I am trying to understand the relation of this current bug to this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1197668

They both seem to talk about same problem, but one is closed errata and one is open.

2. If I understand correctly, this bug is related to this solution:
https://access.redhat.com/solutions/1346443
Please confirm.

3. Comment#2 on this bug by Lior is not clear to me.
Why do we need to run persist commands on a regular RHEL host?
Or is he referring to RHEV-H hosts, which can be affected by this problem as well, so that we need to update our solution mentioned in #2?

Bottom line: what I am trying to understand is whether we fixed the correct bug and when the fix will reach customers.

Comment 33 Michael Burman 2015-04-30 07:36:14 UTC

Verified on - 3.6.0-0.0.master.20150412172306.git55ba764.el6 with
vdsm-4.17.0-632.git19a83a2.el7.x86_64

Comment 36 errata-xmlrpc 2016-03-09 19:30:31 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0362.html
