Bug 1262431

Summary: The system loses the default gateway on the management bridge when upgrading from hosted-engine 3.5 to hosted-engine 3.6
Product: [oVirt] vdsm Reporter: Simone Tiraboschi <stirabos>
Component: General Assignee: Dan Kenigsberg <danken>
Status: CLOSED NOTABUG QA Contact: Aharon Canan <acanan>
Severity: high Docs Contact:
Priority: urgent    
Version: 4.17.5 CC: bugs, danken, ibarkan, nsednev, sbonazzo, stirabos, ylavi
Target Milestone: ovirt-3.6.0-rc3 Keywords: Regression
Target Release: --- Flags: sbonazzo: ovirt-3.6.0?
rule-engine: blocker?
sbonazzo: planning_ack?
sbonazzo: devel_ack?
sbonazzo: testing_ack?
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: integration
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-10-09 16:22:49 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
supervdsm logs none

Description Simone Tiraboschi 2015-09-11 16:04:23 UTC
Description of problem:
The system loses the default gateway on the management bridge when upgrading from 3.5 to 3.6.

In 3.5, VDSM had a heuristic that decided defaultRoute from the bridge name: a bridge named "ovirtmgmt" or "rhevm" was treated as the management network, so VDSM set defaultRoute=True on it.
That heuristic is no longer present in VDSM 4.17, but the network created by HE <= 3.5 did not know about the DEFROUTE parameter.
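
The difference between the two behaviours can be sketched as follows. This is only an illustration of the description above, not the actual VDSM code: the function names are hypothetical, and the rule that an explicit defaultRoute value takes precedence over the name check is an assumption.

```python
# Illustrative sketch only; NOT the real VDSM implementation.
# Bridge names are the ones quoted in this report.
MANAGEMENT_BRIDGE_NAMES = ("ovirtmgmt", "rhevm")


def legacy_default_route(network_name, attrs):
    """Mimic the 3.5-era heuristic: fall back to the bridge name.

    Assumption: an explicit defaultRoute value, if present, wins.
    """
    if "defaultRoute" in attrs:
        return bool(attrs["defaultRoute"])
    return network_name in MANAGEMENT_BRIDGE_NAMES


def strict_default_route(network_name, attrs):
    """Mimic the 4.17 behaviour: only an explicit defaultRoute counts."""
    return bool(attrs.get("defaultRoute", False))
```

With a network definition that lacks defaultRoute, the first function still marks "ovirtmgmt" as carrying the default route, while the second does not.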

https://bugzilla.redhat.com/1253939 fixes this in ovirt-hosted-engine-setup for new deployments, but ovirt-hosted-engine-setup is not run on upgrades.


Version-Release number of selected component (if applicable):
1.3.0

How reproducible:
100%

Steps to Reproduce:
1. Deploy HE from 3.5
2. Upgrade rpms to 3.6
3.

Actual results:
The host loses its default gateway
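
One way to confirm the symptom on a Linux host is to look for a default route in the kernel routing table. The sketch below parses text in the /proc/net/route format; the function name is hypothetical, and the gateway decoding assumes a little-endian host (for example x86_64).

```python
import socket
import struct

RTF_UP = 0x1
RTF_GATEWAY = 0x2


def default_gateways(route_table_text):
    """Return (iface, gateway_ip) pairs for default routes.

    Expects text in /proc/net/route format: a header line, then
    whitespace-separated columns Iface, Destination, Gateway, Flags,
    RefCnt, Use, Metric, Mask, ...
    """
    gateways = []
    for line in route_table_text.strip().splitlines()[1:]:  # skip header
        fields = line.split()
        if len(fields) < 8:
            continue
        iface, dest, gateway, mask = fields[0], fields[1], fields[2], fields[7]
        flags = int(fields[3], 16)
        is_default = dest == "00000000" and mask == "00000000"
        wanted = RTF_UP | RTF_GATEWAY
        if is_default and flags & wanted == wanted:
            # Assumes little-endian hex encoding of the address.
            packed = struct.pack("<L", int(gateway, 16))
            gateways.append((iface, socket.inet_ntoa(packed)))
    return gateways
```

On an affected host, `default_gateways(open("/proc/net/route").read())` would be expected to return an empty list after the upgrade.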

Expected results:
upgrade is smooth

Additional info:

Comment 1 Sandro Bonazzola 2015-09-30 08:00:27 UTC
Steps to reproduce:
- install Hosted Engine from 3.5 repo
- enable 3.6 repo
- yum update "vdsm*"

Comment 2 Simone Tiraboschi 2015-09-30 08:02:39 UTC
In the past, hosted-engine-setup did not set the defaultRoute attribute.
Now it breaks just by updating VDSM, because the network-name heuristic that set the default gateway is no longer there.

A partial solution is to manually use the setupNetworks verb via vdsClient to set
defaultRoute=True
but we still have to find a solution that fixes it just by upgrading VDSM.

Comment 3 Sandro Bonazzola 2015-10-02 14:39:11 UTC
Can we set defaultRoute in 3.5.6 so while upgrading from 3.5.6 to 3.6.z it won't have the same issue?

Comment 4 Red Hat Bugzilla Rules Engine 2015-10-02 14:39:12 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 5 Simone Tiraboschi 2015-10-02 17:12:00 UTC
Yes, we can, and that would solve it.

Comment 6 Simone Tiraboschi 2015-10-02 17:15:41 UTC
Sorry, it will solve it only for those who install hosted-engine directly from 3.5.6 and then upgrade to 3.6. It will not be enough for those who installed hosted-engine from 3.5.z < 3.5.6, then upgraded to 3.5.6, and finally moved to 3.6: on 3.5.z upgrades for hosted-engine we basically just update the agent rpm, and the user does not execute hosted-engine-setup again.

Comment 7 Yaniv Lavi 2015-10-07 09:35:29 UTC
Moved it back to integration, since we will need to resolve this in integration in cooperation with the network team.

Comment 8 Simone Tiraboschi 2015-10-07 09:47:32 UTC
Yaniv, I'm not that sure.
If I take a host from hosted-engine 3.5, add the 3.6 repo, and fetch only the newer vdsm rpms from there without getting a newer hosted-engine HA agent rpm, the system will still lose its default gateway.
So it's probably better to fix it on the vdsm side; otherwise we also have to make the HA agent from 3.5.z conflict with vdsm >= 4.17.

Comment 9 Yaniv Lavi 2015-10-07 11:13:12 UTC
Do you know why this changed and how we can work around it during upgrade?

Comment 10 Sandro Bonazzola 2015-10-07 13:35:20 UTC
(In reply to Simone Tiraboschi from comment #8)
> Yaniv, I'm not that sure.
> If I take a host from hosted-engine 3.5, add the 3.6 repo, and fetch only
> the newer vdsm rpms from there without getting a newer hosted-engine HA
> agent rpm, the system will still lose its default gateway.
> So it's probably better to fix it on the vdsm side; otherwise we also have
> to make the HA agent from 3.5.z conflict with vdsm >= 4.17.

This won't prevent existing 3.5.4 installations from being upgraded to 3.6.

Comment 11 Ido Barkan 2015-10-08 09:29:29 UTC
Simone, can you please attach the supervdsm.log of such a host? Can you also attach the contents of /var/lib/vdsm/persistence/netconf/* from such a host, taken before the upgrade?

Comment 12 Simone Tiraboschi 2015-10-09 16:20:35 UTC
Created attachment 1081388 [details]
supervdsm logs

Comment 13 Simone Tiraboschi 2015-10-09 16:22:49 UTC
I tried twice (DHCP and static addressing for the host) and I wasn't able to reproduce it.

In the static case, on 3.5 we had:
[root@c71heup20151009 ~]# cat /var/lib/vdsm/persistence/netconf/nets/ovirtmgmt 
{"nic": "eth0", "netmask": "255.255.255.0", "bootproto": "none", "ipaddr": "192.168.1.211", "gateway": "192.168.1.1"}

and it survived the 3.5 -> 3.6 upgrade so no issue there.
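
For anyone checking their own hosts, a small sketch of how such a persisted definition could be inspected is shown below. The helper name is hypothetical, and the assumption is that an explicit setting would appear under the key "defaultRoute", matching the attribute name used earlier in this report.

```python
import json


def describe_persisted_network(raw_json):
    """Summarise a persisted network definition.

    Assumption: an explicit default-route setting, when present,
    is stored under the key "defaultRoute".
    """
    attrs = json.loads(raw_json)
    return {
        "has_gateway": bool(attrs.get("gateway")),
        # None means the key is absent from the persisted definition.
        "default_route": attrs.get("defaultRoute"),
    }


# The definition quoted above, as found on the 3.5 host.
sample = (
    '{"nic": "eth0", "netmask": "255.255.255.0", "bootproto": "none", '
    '"ipaddr": "192.168.1.211", "gateway": "192.168.1.1"}'
)
summary = describe_persisted_network(sample)
```

For the definition above, the summary reports a gateway and no explicit defaultRoute key.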

I attached the supervdsm logs as a reference but I think we can close it.

Comment 14 Ido Barkan 2015-10-11 05:54:13 UTC
*** Bug 1262026 has been marked as a duplicate of this bug. ***