Bug 1184497 - SetupNetworks> Can't change network state from dhcp to static ip
Summary: SetupNetworks> Can't change network state from dhcp to static ip
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: General
Version: ---
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ovirt-3.6.0-rc3
Target Release: ---
Assignee: Ondřej Svoboda
QA Contact: Michael Burman
URL:
Whiteboard: network
Duplicates: 1192778
Depends On:
Blocks: 1112861 1242235 1279824
 
Reported: 2015-01-21 14:45 UTC by Michael Burman
Modified: 2016-02-10 19:15 UTC

Fixed In Version: vdsm-4.17.0-632.git19a83a2
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1279824
Environment:
Last Closed: 2015-11-04 12:58:59 UTC
oVirt Team: Network
Embargoed:
rule-engine: ovirt-3.6.0+
rule-engine: blocker+
ylavi: Triaged+
ylavi: planning_ack+
rule-engine: devel_ack+
rule-engine: testing_ack+


Attachments
logs (360.15 KB, application/x-gzip)
2015-01-21 14:45 UTC, Michael Burman
screenshot (124.60 KB, image/png)
2015-07-08 14:22 UTC, Michael Burman
vdsCaps_bootproto_update_failed (23.32 KB, text/plain)
2015-08-16 12:47 UTC, Eliraz Levi


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 37059 0 master MERGED Handle bridge reconfiguration Never
oVirt gerrit 45504 0 master MERGED netinfo: report DHCP on devices only when configured (like for networks) Never
oVirt gerrit 46247 0 ovirt-3.6 MERGED netinfo: report DHCP on devices only when configured (like for networks) Never
oVirt gerrit 46430 0 master MERGED netinfo: rework reporting of DHCPv4/6 on network devices Never
oVirt gerrit 48399 0 ovirt-3.5 ABANDONED netinfo: rework reporting of DHCPv4/6 on network devices Never
oVirt gerrit 48511 0 ovirt-3.6 MERGED netinfo: rename variables in get() Never
oVirt gerrit 48513 0 ovirt-3.6 MERGED netinfo: rework reporting of DHCPv4/6 on network devices Never

Description Michael Burman 2015-01-21 14:45:14 UTC
Created attachment 982352 [details]
logs

Description of problem:
SetupNetworks> Can't change 'ovirtmgmt' from dhcp to static ip.
When trying to change 'ovirtmgmt' from dhcp to static ip via Setup Networks (SN), I get the following in the event log:
Network changes were saved on host orchid-vds1.qa.lab.tlv.redhat.com

But, when checking -
cat /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt
BOOTPROTO=dhcp

And in SN, ovirtmgmt stays on dhcp, even though the event log shows that the network changes were saved on the host.
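
The check above can also be scripted. The following is only a sketch: parse_ifcfg and bootproto are hypothetical helper names made up for this example (they are not VDSM functions), and treating a missing BOOTPROTO as 'none' is an assumption.

```python
def parse_ifcfg(text):
    """Parse KEY=VALUE lines of an ifcfg file, skipping blanks and comments."""
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#') or '=' not in line:
            continue
        key, _, value = line.partition('=')
        # Values may be quoted in ifcfg files; drop surrounding quotes.
        cfg[key.strip()] = value.strip().strip('"\'')
    return cfg


def bootproto(text):
    # Assumption: a missing BOOTPROTO is treated like 'none' here.
    return parse_ifcfg(text).get('BOOTPROTO', 'none').lower()


if __name__ == '__main__':
    path = '/etc/sysconfig/network-scripts/ifcfg-ovirtmgmt'
    with open(path) as f:
        print(bootproto(f.read()))
```

Running it on the host before and after the SN operation would show whether the file was actually rewritten.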

Version-Release number of selected component (if applicable):
3.6.0-0.0.master.20150105124153.git669ddc1.el6
vdsm-4.17.0-304.gitf191666.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Hosts>SN> edit 'ovirtmgmt'(with pencil)
2. Change the Boot Protocol from dhcp to static ip
3. Approve operation

Actual results:
'ovirtmgmt' stays with dhcp BootProto, in SN and in ifcfg-ovirtmgmt

Expected results:
Changing 'ovirtmgmt' from dhcp to static ip should succeed

Comment 1 Michael Burman 2015-04-14 15:07:48 UTC
Verified on - 3.6.0-0.0.master.20150412172306.git55ba764.el6.noarch
and vdsm-4.17.0-633.git7ad88bc.el7.x86_64

Comment 2 Michael Burman 2015-07-08 14:20:52 UTC
I see this behavior again on 3.6.0-0.0.master.20150627185750.git6f063c1.el6 and
vdsm-4.17.0-1054.git562e711.el7.noarch

When trying to change a network's bootproto from DHCP to static ip via Setup Networks (not only for the ovirtmgmt network), the network configuration appears to be saved successfully, and the ifcfg file shows that bootproto changed to none, with a static ip, netmask and GW:

cat /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt 
# Generated by VDSM version 4.17.0-1054.git562e711.el7
DEVICE=ovirtmgmt
TYPE=Bridge
DELAY=0
STP=off
ONBOOT=yes
IPADDR=10.35.128.8
NETMASK=255.255.255.0
GATEWAY=10.35.128.254
BOOTPROTO=none
MTU=1500
DEFROUTE=yes
NM_CONTROLLED=no
IPV6INIT=no
HOTPLUG=no

But Setup Networks reports that the boot proto for this network is still DHCP.

vdsCaps reports:
'ovirtmgmt': {'addr': '10.35.128.8',
                                  'bridged': True,
                                  'cfg': {'BOOTPROTO': 'none',
                                          'DEFROUTE': 'yes',
                                          'DELAY': '0',
                                          'DEVICE': 'ovirtmgmt',
                                          'GATEWAY': '10.35.128.254',
                                          'HOTPLUG': 'no',
                                          'IPADDR': '10.35.128.8',
                                          'IPV6INIT': 'no',
                                          'MTU': '1500',
                                          'NETMASK': '255.255.255.0',
                                          'NM_CONTROLLED': 'no',
                                          'ONBOOT': 'yes',
                                          'STP': 'off',
                                          'TYPE': 'Bridge'},
                                  'dhcpv4': False,
                                  'dhcpv6': False,
                                  'gateway': '10.35.128.254',
                                  'iface': 'ovirtmgmt',
                                  'ipv4addrs': ['10.35.128.8/24'],
                                  'ipv6addrs': ['fe80::21d:9ff:fe68:71c1/64'],
                                  'ipv6gateway': '::',
                                  'mtu': '1500',
                                  'netmask': '255.255.255.0',
                                  'ports': ['eno1'],
                                  'stp': 'off'},

This is a bit different from the original report, because now ifcfg-'network' has been updated to static, but SN still reports dhcp. Only manual configuration will bring ifcfg-'network' back to dhcp; the Setup Networks command has no effect here.

So I'm reopening this as a regression. Thanks.

Comment 3 Michael Burman 2015-07-08 14:22:10 UTC
Created attachment 1049888 [details]
screenshot

Comment 4 Ido Barkan 2015-07-12 07:06:06 UTC
can you attach ovirtmgmt ifcfg file? I want to find out if this is a race.

Comment 5 Ido Barkan 2015-07-12 07:32:38 UTC
Never mind, I have found this in the logs:

MainProcess|Thread-20::DEBUG::2015-01-19 15:05:35,251::ifcfg::538::root::(writeConfFile) Writing to file /etc/sysconfig/network-scripts/ifcfg-ovirtmgmt configuration:
# Generated by VDSM version 4.16.8.1-5.el7ev
DEVICE=ovirtmgmt
TYPE=Bridge
DELAY=0
STP=off
ONBOOT=yes
BOOTPROTO=dhcp
MTU=1500
DEFROUTE=yes
NM_CONTROLLED=no
HOTPLUG=no

So this is indeed a race. It might be resolved either by having the engine pass blockingdhcp=True or by making that the default behaviour of vdsm.

Dan, do you see any reason not to make it the default behaviour in vdsm? If you do, the engine could supply it whenever it wants.

Comment 6 Ido Barkan 2015-07-12 07:54:04 UTC
This was my mistake. vdsm is fine here, as it reports the proper caps. This is probably an engine bug.

Comment 7 Dan Kenigsberg 2015-07-13 12:36:49 UTC
(In reply to Ido Barkan from comment #5)
> Dan, so you see any reason why not making it the default behaviour in vdsm?
> If you do, this might be supplied by the engine when it wants.

The historic reason is that a dhcp server can stall for several minutes, and Engine needs a quicker response (otherwise it would hit a timeout and attempt a revert).

Comment 8 Ori Gofen 2015-07-30 14:45:15 UTC
This bug reproduces on every network configured on the host, which breaks the ability to create iSCSI multipathing bonds. Furthermore, if such a bond is created and the network then reverts back to dhcp, storage becomes inaccessible (because of wrong masking).

Comment 9 Eliraz Levi 2015-08-16 12:47:19 UTC
I believe it is a vdsm bug.

The engine sends the host setup networks VDS command with the following arguments (taken from the engine log):
2015-08-16 14:59:19,559 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HostSetupNetworksVDSCommand] (default task-30) [799333a2] START, HostSetupNetworksVDSCommand(HostName = centos7-1, HostSetupNetworksVdsCommandParameters:{runAsync='true', hostId='d836d879-d9e9-4073-aa7b-b74804f0e133', vds='Host[centos7-1,d836d879-d9e9-4073-aa7b-b74804f0e133]', rollbackOnFailure='true', conectivityTimeout='120', hostNetworkQosSupported='true', networks='[HostNetwork:{defaultRoute='false', bonding='false', networkName='NET0', nicName='eth1', vlan='null', mtu='0', vmNetwork='true', stp='false', properties='[]', bootProtocol='STATIC_IP', address='192.168.0.1', netmask='255.255.0.0', gateway=''}]', removedNetworks='[]', bonds='[]', removedBonds='[]'}), log id: 6d595db7

When finished, the engine calls getCaps.
The output of vdsGetCaps is attached.
vdsm reports 'dhcpv4': True, which leads the engine to think the bootproto is DHCP.
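
To make the failure mode concrete, here is a rough sketch of how a consumer of the caps might derive a boot protocol from a reported entry. The engine's real logic is not shown in this report, so the mapping, the function name boot_protocol and the 'DHCP'/'NONE' labels are assumptions ('STATIC_IP' is the value seen in the log line above).

```python
def boot_protocol(caps_entry):
    """Guess a boot protocol from one caps entry (illustrative mapping only)."""
    if caps_entry.get('dhcpv4'):
        # A True flag wins, even if the address was configured statically.
        return 'DHCP'
    if caps_entry.get('addr'):
        return 'STATIC_IP'
    return 'NONE'
```

With such a mapping, a stale 'dhcpv4': True on the reported device is enough to hide the static configuration the engine has just requested.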

Comment 10 Eliraz Levi 2015-08-16 12:47:48 UTC
Created attachment 1063519 [details]
vdsCaps_bootproto_update_failed

Comment 11 Ondřej Svoboda 2015-08-18 09:19:15 UTC
I have to clarify here a bit.

While dhclient probably left behind an active DHCP lease (in /var/lib/dhclient/) that VDSM uses to decide the 'dhcpv4' property for devices (here it reports True for the _bridge_ NET0), there is special handling (workaround) for networks:

"If DHCP is not configured now, do not report it even though you found an active lease".

You can see that for the _network_ NET0 'dhcpv4' is False because of this. This behaviour was adopted not to confuse the engine (and the user).

We assumed that the engine used information from the network (and not the bridge) for display in "Edit Management Network". Is this assumption wrong?
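
The rule quoted above can be written down as a small sketch. This is not the VDSM code; report_dhcpv4 is a hypothetical name, and the lease lookup is reduced to a boolean argument.

```python
def report_dhcpv4(has_active_lease, cfg):
    """Report DHCPv4 only if a lease was found AND DHCP is configured now."""
    configured = cfg.get('BOOTPROTO', 'none').lower() == 'dhcp'
    return bool(has_active_lease) and configured
```

Applied to networks only, this leaves the bridge device reporting True from the stale lease, which is the discrepancy described in this comment.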

Comment 12 Dan Kenigsberg 2015-08-25 13:42:42 UTC
At the moment, Vdsm reports the new static IP address for the bridge, but claims that the bridge has an active dhcpv4 lease. This is misleading, as the reported IP address was not provided by a DHCP server.

I'm afraid that Vdsm should report dhcpv4==True only if there is a running dhclient process bound to the interface (AND has a valid lease).

Can you tweak the vdsm-side logic?
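
A minimal sketch of the condition proposed here, assuming Linux /proc and assuming that dhclient receives the interface name as a plain command-line argument. All function names are hypothetical, and lease validation is left to the caller.

```python
import os


def read_cmdlines(proc='/proc'):
    """Yield the argv list of every readable process under /proc."""
    for pid in os.listdir(proc):
        if not pid.isdigit():
            continue
        try:
            with open(os.path.join(proc, pid, 'cmdline'), 'rb') as f:
                raw = f.read()
        except OSError:
            continue  # the process exited or is not readable
        argv = [a.decode(errors='replace') for a in raw.split(b'\0') if a]
        if argv:
            yield argv


def has_dhclient_v4(iface, cmdlines):
    """True if a dhclient process without -6 names iface as an argument."""
    return any(
        os.path.basename(argv[0]) == 'dhclient'
        and '-6' not in argv
        and iface in argv[1:]
        for argv in cmdlines
    )


def dhcpv4_active(iface, cmdlines, has_valid_lease):
    # Both conditions must hold: a bound dhclient and a valid lease.
    return has_dhclient_v4(iface, cmdlines) and bool(has_valid_lease)
```

On a host this could be called as dhcpv4_active('ovirtmgmt', list(read_cmdlines()), lease_ok), where lease_ok comes from whatever lease check is already in place.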

Comment 13 Dan Kenigsberg 2015-09-16 11:45:16 UTC
*** Bug 1192778 has been marked as a duplicate of this bug. ***

Comment 14 Eyal Edri 2015-10-10 14:26:13 UTC
this fix is already in 3.6.0-15, moving to ON_QA.

Comment 15 Michael Burman 2015-10-11 07:54:33 UTC
This bug still exists:

'ovirtmgmt': {'addr': '10.35.128.8',
                                  'bridged': True,
                                  'cfg': {'BOOTPROTO': 'none',
                                          'DEFROUTE': 'yes',
                                          'DELAY': '0',
                                          'DEVICE': 'ovirtmgmt',
                                          'GATEWAY': '10.35.128.254',
                                          'HOTPLUG': 'no',
                                          'IPADDR': '10.35.128.8',
                                          'IPV6INIT': 'no',
                                          'MTU': '1500',
                                          'NETMASK': '255.255.255.0',
                                          'NM_CONTROLLED': 'no',
                                          'ONBOOT': 'yes',
                                          'STP': 'off',
                                          'TYPE': 'Bridge'},
                                  'dhcpv4': True,

The network is configured with a static ip, but dhcpv4 = True and the engine still reports the bootproto as dhcp.
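
The contradiction in this output can be detected mechanically. The sketch below assumes an entry shaped like the one pasted above (a 'cfg' dict plus a 'dhcpv4' flag); dhcp_mismatch is a hypothetical helper, not part of vdsm.

```python
def dhcp_mismatch(net):
    """True when DHCPv4 is reported although the ifcfg does not request it."""
    bootproto = net.get('cfg', {}).get('BOOTPROTO', 'none').lower()
    return bool(net.get('dhcpv4')) and bootproto != 'dhcp'


entry = {'addr': '10.35.128.8',
         'cfg': {'BOOTPROTO': 'none', 'IPADDR': '10.35.128.8'},
         'dhcpv4': True}
print(dhcp_mismatch(entry))
```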

Comment 16 Michael Burman 2015-10-11 14:08:04 UTC
On 3.6, if the bootproto is changed manually, vdsm will not recognize the change (it does not read from cfg).
This means the bug can be verified.

Changing the network state from dhcp to static ip and vice versa via Setup Networks now works as expected. vdsCaps and the ifcfg-* files report as expected.

Tested and verified on - 3.6.0-0.18.el6 with vdsm-4.17.8-1.el7ev.noarch

Comment 17 Ondřej Svoboda 2015-10-11 14:18:59 UTC
Thank you, Michael!

The logic we adopted in the end doesn't watch dhclient instances, as that would probably be too intrusive for a backport. Instead, the currently active ("running") network configuration is queried and reported also for the network devices that represent a given network (most often, bridges).

There is some refactoring work to be merged later, but the behaviour will stay the same.

Inspecting the cmdlines of running dhclient instances was actually the first idea I had, more than a year and a half ago. For 4.0 it will be reconsidered.

Comment 18 Sandro Bonazzola 2015-11-04 12:58:59 UTC
oVirt 3.6.0 was released on November 4th, 2015 and should fix this issue.
If problems still persist, please open a new BZ and reference this one.

